Information vs Behaviour

I read a couple of chapters of two different books recently:

  • Ch.5 of O’Reilly’s “Beautiful Architecture”: Resource-Oriented Architectures: Being ‘In The Web’ by Brian Sletten
  • Ch.1 of Manning’s “SOA Patterns”: Solving SOA Pains With Patterns by Arnon Rotem-Gal-Oz, Eric Bruno, and Udi Dahan (from a free online Early Access Program)

In these chapters, there were points made that stood out for me and underlined the fact that the two approaches of Resource Oriented Architecture (ROA) and Service Oriented Architecture (SOA) are mostly diametrically opposed, along the axes of information and behaviour.

I’ve been saying for a while, and indeed recently, that the data in an enterprise is a key asset of any corporation, and should be treated as such. Information should be secure, available, and above all addressable. As Brian says:

“We like giving names to things because we are fundamentally name-oriented beings”

Information elements should be first class citizens on the web, not relegated to anonymous lumps of data only accessible indirectly through opaque service endpoints. Those IT departments that enable uniform, transparent, controlled and consistent access to a corporation’s data, especially across a complex system landscape, are the ones that are in line to give their business the greatest benefit.

So it was with great delight (and fervent agreement) that I read, in “Beautiful Architecture”, Brian’s eloquent description of how the IT industry uses

“the wrong abstractions internally, overemphasising our software and services and underemphasising our data”

and proceeds to describe how an information centric approach is more appropriate.

Then, only a day later, I read, in “SOA Patterns”, about the challenges of SOA, in particular:

“how do you solve the BI / SOA impedance mismatch of getting a centralised view of the data in an architectural style that encourages encapsulation and privacy?”

Impedance mismatch! Yes, my point entirely!

The behaviour-focused approach of SOA, diametrically opposed to the information-focused approach of ROA, is a natural barrier to leveraging an enterprise’s key asset: information, and in this case Business Intelligence.

Tarpipe REST connector in 5 minutes

Tarpipe implemented a REST connector a short while ago. This is something that I and others have been wanting for a while now, so it’s great news. The announcement was quite short and didn’t have much detail. I like to see things visually, and I’m guessing others do too, so I decided to write a little handler to receive a sample request from the REST connector and dump it for inspection.

As Bruno showed in the announcement, this is what the REST connector looks like:

Tarpipe REST connector

It will take whatever values it receives in the title, description and link input fields on the left hand side of the connector, and construct a piece of JSON which it then sends in an application/x-www-form-urlencoded format as a data=<JSON> name/value pair in the message body of an HTTP POST request to the resource specified in the serviceUrl field.

So if we pass “DJ’s Weblog” into the title field, “Reserving the right to be wrong” into the description field, “http://www.pipetree.com/qmacro/blog/” into the link field, and “http://example.org/bucket/” into the serviceUrl field, this HTTP request is made on the http://example.org/bucket/ resource:

POST /bucket/ HTTP/1.1
Content-Length: 218
Content-Type: application/x-www-form-urlencoded
Host: example.org
Accept: */*

data=%7B%22items%22%3A%5B%7B%22title%22%3A%22DJ%27s+Weblog%22%2C%22description
     %22%3A%22Reserving+the+right+to+be+wrong%5Cn%22%2C%22link%22%3A%22http%3A
     %5C%2F%5C%2Fwww.pipetree.com%5C%2Fqmacro%5C%2Fblog%5C%2F%22%7D%5D%7D

(whitespace added by me for readability).

When decoded and pretty-printed, that message body looks like this:

data={
    "items":[
       {
           "title":"DJ's Weblog",
           "description":"Reserving the right to be wrong\n",
           "link":"http://www.pipetree.com/qmacro/blog/"
       }
    ]
}

This is what your app gets to process.
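Getting from the raw message body to usable values takes two steps: form decoding, then JSON parsing. Here is a rough sketch of those two steps in Python. It is only an illustration, not the handler mentioned above; the function name decode_connector_body is made up for the example, and only the standard library is used.

```python
import json
from urllib.parse import parse_qs

# The sample message body from the request above, with the added whitespace removed.
SAMPLE_BODY = (
    "data=%7B%22items%22%3A%5B%7B%22title%22%3A%22DJ%27s+Weblog%22%2C%22description"
    "%22%3A%22Reserving+the+right+to+be+wrong%5Cn%22%2C%22link%22%3A%22http%3A"
    "%5C%2F%5C%2Fwww.pipetree.com%5C%2Fqmacro%5C%2Fblog%5C%2F%22%7D%5D%7D"
)


def decode_connector_body(body):
    """Return the list of items carried in a data=<JSON> form body.

    decode_connector_body is a hypothetical name used for illustration.
    """
    # Step 1: form decoding (percent escapes are resolved, '+' becomes a space).
    form = parse_qs(body)
    # Step 2: the single 'data' value is a JSON document.
    payload = json.loads(form["data"][0])
    return payload["items"]


if __name__ == "__main__":
    for item in decode_connector_body(SAMPLE_BODY):
        print(item["title"], "|", item["link"])
```

Each item comes back as a plain dictionary with title, description and link keys, ready for whatever the app wants to do with it.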

Bruno said that the format was chosen to be compatible with the Yahoo! Pipes Web Service Module, and it sure is — look at this example from the Web Service Module documentation:

data={
    "items":[
       {
           "title": "First Title",
           "link": "http://example.com/first",
           "description": "First Description"
       },
       {
           "title": "Last Title",
           "link": "http://example.com/last",
           "description": "Last Description"
       }
    ]
}

And what about those three output fields on the right hand side of the REST connector? Well, if your app returns a response with JSON in the body — this time not as a name/value pair, but as pure JSON — like this:

{
  "items":[
     {
         "title": "The response!",
         "description": "Long text description of the response",
         "link": "http://example.org/banana/"
     }
  ]
}

then the workflow can continue and you can connect those values in the corresponding title, description and link output fields as input to further connectors.
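Producing a response body in that shape is a one-liner in most languages. As a minimal Python sketch (the function name build_connector_response is invented here, and the post says nothing about which response headers Tarpipe expects, so none are shown):

```python
import json


def build_connector_response(title, description, link):
    """Serialise a single item in the {"items": [...]} shape shown above.

    build_connector_response is a hypothetical helper for illustration.
    """
    item = {"title": title, "description": description, "link": link}
    # Pure JSON in the body this time, not a data=<JSON> name/value pair.
    return json.dumps({"items": [item]})


if __name__ == "__main__":
    print(build_connector_response(
        "The response!",
        "Long text description of the response",
        "http://example.org/banana/",
    ))
```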

Happy tarpiping!

Twitter’s success

Yes yes, I know I’m late to the game, and everyone and his dog has given their angle on why Twitter is so successful, but I’d like to weigh in with a few thoughts too. The thoughts are those that came together when I was chatting to Ian Forrester (@cubicgarden), at a GeekUp event in Manchester last week.

Messaging Systems

Back in the day, I talked about, wrote about and indeed built interconnected messaging systems based around the idea of a message bus, that has human, system and bot participation. The fundamental idea was based around one or more channels, rooms or groupings of messages; messages which could be originated from any participant, and likewise filtered, consumed and acted upon by any other. I wrote a couple of articles positing that bots might be the command line of the future.

Using my favourite messaging protocol, I built such a messaging system for an enterprise client. This system was based around a series of rooms, and had a number of small-but-perfectly-formed agents that threw information onto the message bus, information such as messages resulting from monitoring systems across the network (“disk space threshold reached”, “System X is not responding”, “File received from external source”, etc) and messages from SAP systems (“Sales Order nnn received”, “Transport xxx released”, “Purchase Order yyy above value z created”, etc). It also had a complement of agents that listened to that RSS/ATOM-sourced stream of enterprise consciousness and acted upon messages they were designed to filter: sending an SMS message here, emailing there, re-messaging onto a different bus or system elsewhere.

So what does this have to do with Twitter? Well, Twitter is a messaging system too. And Twitter’s ‘timeline’ concept is similar to the above message groupings. People, systems and bots can and do (I hesitate to say ‘publish’ and ‘subscribe to’ here) create, share and consume messages very easily.

Killer Feature

But the killer feature is that Twitter espouses the guiding design principle:

Everything has a URL

and everything is available via the lingua franca of today’s interconnected systems — HTTP. Timelines (message groupings) have URLs. Message producers and consumers have URLs. Crucially, individual messages have URLs (this is why I could refer to a particular tweet at the start of this post). All the moving parts of this microblogging mechanism are first class citizens on the web. Twitter exposes message data as feeds, too.

Even Twitter’s API, while not entirely RESTful, is certainly facing in the right direction, exposing information and functionality via simple URLs and readily consumable formats (XML, JSON). The simplest thing that could possibly work usually does, enabling the “small pieces, loosely joined” approach that lets you pipeline the web, like this:

dj@giant:~$ GET http://twitter.com/users/show/qmacro.json |
              perl -MJSON -e "print from_json(<>)->{'location'},qq/\n/"
Manchester, England
dj@giant:~$

None of this opaque, heavy and expensive SOA stuff here, thank you very much.

Other Microblogging Systems and Decentralisation

And does this feature set apply only to Twitter? Of course not. Other microblogging systems, notably laconi.ca — most well known for the public instance identi.ca — follow these guiding design principles too.

What’s fascinating about laconi.ca is that just as a company that wants to keep message traffic within the enterprise can run their own mail server (SMTP) and instant messaging & presence server (Jabber/XMPP), so also can laconi.ca be used within a company for instant and flexible enterprise social messaging, especially when combined with enterprise RSS. But that’s a story for another post :-)

Analysing CV searches with Delicious

I put my CV online recently, and having the machine that serves this website (an iMac running Ubuntu Linux) sitting in the study, I can almost ‘feel’ the HTTP requests entering the house, going down the wire, and being served, like lumps travelling down a pipe in a Tom & Jerry cartoon.

So I was thinking about doing something useful with Apache’s access log, more than what I already have with the excellent Webalizer. Inspired (as ever) by Jon Udell’s “ongoing fascination with Delicious as a user-programmable database”, I decided to pipe the access log into a Perl script and pull all the Google search referrer URLs that led to /qmacro/CV.html. For every referrer URL found, I grabbed the query string that was used and split it into words, removing noise. I also made a note of the top level domain for the Google hostname - a very rough indication of where queries were coming from.
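The steps above can be sketched roughly as follows. This is Python rather than the Perl script itself, and it rests on a few assumptions that don’t come from that script: Apache’s combined log format, a q parameter carrying the search terms, and a made-up NOISE word list.

```python
import re
from urllib.parse import urlparse, parse_qs

# Request path and referrer from a combined-format access log line (assumed format).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "(?P<referrer>[^"]*)"'
)

# Hypothetical noise words; the real list is not given in the post.
NOISE = {"a", "an", "and", "for", "how", "in", "of", "the", "to"}

# An invented log line for illustration.
SAMPLE_LINE = (
    '192.0.2.1 - - [24/Apr/2009:10:06:55 +0000] "GET /qmacro/CV.html HTTP/1.1" '
    '200 5120 "http://www.google.co.uk/search?q=abap+developer+cv&hl=en" "Mozilla/5.0"'
)


def extract_search(line, target="/qmacro/CV.html"):
    """Return (top_level_domain, keywords) for a Google referral to target, else None."""
    match = LOG_LINE.search(line)
    if not match or match.group("path") != target:
        return None
    referrer = urlparse(match.group("referrer"))
    labels = (referrer.hostname or "").split(".")
    if "google" not in labels:
        return None
    # parse_qs decodes the query string; 'q' is assumed to hold the search terms.
    query = parse_qs(referrer.query).get("q", [""])[0]
    words = [w for w in re.findall(r"[a-z0-9]+", query.lower()) if w not in NOISE]
    return labels[-1], words


if __name__ == "__main__":
    print(extract_search(SAMPLE_LINE))
```

The keywords would then become the tags on the Delicious bookmark for that referrer URL.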

But rather than create a database, or even an application, to analyse the results, I just posted the information as bookmarks to Delicious (after a simple incantation of perl -MCPAN -e 'install Net::Delicious' - just what I needed, thanks!).

Delicious *is* a database, and by its very nature and purpose has a flavour that lends itself very well to loosely coupled data processing and manipulation. It’s about URLs and tags. It’s about adding data, replacing data, removing data. Basic building blocks and functions. Every item in the database has, and is keyed by, a URL, and as such, every item is recognised and treated as a first class citizen on the web. Even the metadata (tag information) is treated the same.

So what did I end up with? Well, for a start, I have a useful collection of referring CV search URLs, the collection being made via a common grouping tag ‘cvsearchkeywords’ that I assigned to each Delicious post in addition to the tags derived from the query string.

CV search keywords on Delicious

I also have a useful analysis of the search keywords, in the list of “Related Tags” - tags related to the common grouping tag. I can see right now for example that beyond the obvious ones such as “cv”, popular keywords are abap, architect and developer. What’s more, that analysis is interactive. Delicious’s UI design, and moreover its excellent URL design, means that I can drill down and across to find out what keywords were commonly used with others, for example.

That collection, and that analysis, will grow automatically as soon as I add the script to the logrotate mechanism on the server. That is, of course, assuming people remain interested in my CV!

And my favourite referrer search string so far? “How to write a CV of a DJ” :-)

SAP everywhere!

I remember back in the ’90s joking with my friend Piers:

When I see the first book on SAP hit the bookstores, it’s time to move on :-)

In those days there were no books on SAP, and I was still in shock from receiving SAP documentation properly printed and bound — in the early days we had SAP install guides on green and white striped fanfold paper from daisywheel printers, with sentences literally half in German, half in English.

How things have changed. Beyond the SAP Developer Network, which I can proudly say I had a hand in forming and nurturing, I’ve just seen a video on YouTube by Jon Reed on how to find and follow SAP people on Twitter! I’ve also just added myself to the SAP Affinity Group. A long way from SAP-R3-L!

Perhaps it’s time to rebuild Planet SAP?

An HTTP connector for Tarpipe: ‘tarbridge’

One thing that Tarpipe would really benefit from is a connector that would enable an HTTP request (I’m thinking of POST, here) to be made on an arbitrary resource (URL). This is something that other people have already mentioned — and the Tarpipe folks are certainly working on it.

I couldn’t wait, however, and thought I’d have a bit of fun building an HTTP connector. I don’t have access to Tarpipe’s sources, so I had to go a roundabout route. Tarpipe has a Mailer connector, which enables emails to be sent from within a workflow. So I built a very simple email-to-HTTP-POST mechanism ‘tarbridge’. This way, you can use the Mailer connector to send an email like this:

Recipient: tarbridge+<token>@pipetree.com
Subject: the URL to POST to and an optional content-type
Body: the payload of the HTTP POST

and an HTTP POST will be made to the URL specified. You’ll even get an email reply with the HTTP response.

Here’s an example workflow that receives an email containing something to bookmark in Delicious. It uses the Delicious connector, and also makes an HTTP POST to a little test application (running on a local devserver version of the excellent Google AppEngine, fwiw) via tarbridge.

Workflow using tarbridge

The Subject of the email contains the URL to make the HTTP POST to. By default the Content-Type will be set to application/x-www-form-urlencoded, but you can override this by specifying a different content type (here I’ve specified text/plain) as a second parameter in the Subject.

The addressee of the email is ‘tarbridge+<some token>@pipetree.com’. I’ve used this approach so I can control what goes through this tarbridge mechanism. A token is associated with an email address, to which the HTTP response is sent in reply.

The body of the email is what’s sent as the payload in the HTTP request.
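The mapping from email to HTTP request can be sketched like this. It’s a Python illustration, not the Perl handler; the function name and the returned dictionary are invented for the example, and the actual POST and the token-to-address lookup are left out.

```python
import re
from email import message_from_string

DEFAULT_CONTENT_TYPE = "application/x-www-form-urlencoded"

# A sample email in the shape the Mailer connector sends to tarbridge.
SAMPLE = (
    "To: tarbridge+token@pipetree.com\n"
    "Subject: http://www.pipetree.com:8888/feed/ text/plain\n"
    "From: tarpipe mailer <mailer@tarpipe.net>\n"
    "\n"
    "http://blog.tarpipe.com http://del.icio.us/url/95948a42d8777b46278d4da333345473\n"
)


def email_to_post(raw_email):
    """Map a tarbridge-style email onto the pieces of an HTTP POST (hypothetical helper)."""
    msg = message_from_string(raw_email)

    # Recipient: tarbridge+<token>@... ; the token decides who gets the reply.
    match = re.search(r"tarbridge\+([^@\s]+)@", msg["To"] or "")
    token = match.group(1) if match else None

    # Subject: the URL, optionally followed by a content type.
    parts = (msg["Subject"] or "").split()
    url = parts[0] if parts else None
    content_type = parts[1] if len(parts) > 1 else DEFAULT_CONTENT_TYPE

    # Body: the payload, passed through as-is.
    return {
        "token": token,
        "url": url,
        "content_type": content_type,
        "payload": msg.get_payload(),
    }


if __name__ == "__main__":
    print(email_to_post(SAMPLE))
```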

So sending this email to the Tarpipe workflow above:

From: DJ Adams <dj@pipetree.com>
To: bury69xxxx@tarpipe.net
Subject: http://blog.tarpipe.com

Tarpipe blog

results in this Delicious entry:

Tarpipe Blog URL on Delicious

and this email sent, via the Mailer connector, to the tarbridge mechanism:

To: tarbridge+token@pipetree.com
Subject: http://www.pipetree.com:8888/feed/ text/plain
From: tarpipe mailer <mailer@tarpipe.net>

http://blog.tarpipe.com http://del.icio.us/url/95948a42d8777b46278d4da333345473

which in turn results in an HTTP POST being made like this:

POST /feed/ HTTP/1.1
User-Agent: tarbridge/0.1 libwww-perl/5.812
Host: www.pipetree.com:8888
Content-Type: text/plain
[...]
http://blog.tarpipe.com http://del.icio.us/url/95948a42d8777b46278d4da333345473

The result of the HTTP POST is emailed back like this:

Subject: Re: http://www.pipetree.com:8888/feed/ text/plain
To: DJ Adams <dj.adams@pobox.com>
From: tarbridge+token@pipetree.com

HTTP/1.0 201 Created
Date: Fri, 24 Apr 2009 10:06:55 GMT
Location: http://www.pipetree.com:8888/feed/test-feed-1/agtmZWVkYnVpbGRlc[...]
[...]

So if you were really crazy you could even feed that response back into the Tarpipe loop, using a second workflow (hmm, Tarpipe could do with a string parsing connector too :-)

The tarbridge mechanism is just a little Perl script that’s triggered via Procmail. I’m running Ubuntu on pipetree.com so it was just a question of configuring Postfix to use Procmail for delivery, and writing a .procmailrc rule like this:

:0 c
| ~/handler.pl 2>> ~/tarbridge.log

If you’re interested in trying this out using my (pipetree) instance of this tarbridge, please email me and I’ll set you up with a token. Usual caveats apply. And remember, this is only in lieu of a real HTTP connector which I hope is coming soon from Tarpipe!

tarpipe.com - Programming 2.0?

Is tarpipe.com an early example of a “Programming 2.0” concept?

I first read about Tarpipe in Curt Cagle’s “Analysis 2009”. In turn, Curt points to Jeff Barr’s post which describes the concept and the implementation very well. It’s a fascinating concoction of Web 2.0 services and visual programming (in the style of Yahoo! Pipes), and in its beta infancy has that great “wow, imagine the full potential!” feel to it.

Here’s an example of what I’ve been playing around with. With my phone — and with the G1 it’s so easy — I can snap a picture of the beer I’m drinking, and email that picture to a Tarpipe workflow, along with the name of the beer in the subject line and a list of tags rating the beer in the body.

The workflow uses the existing Tarpipe connectors to:

  • post the picture on Flickr with the beer name as the title and the rating words as tags, including a statically added ‘beerrating’
  • have a short URL constructed via TinyURL for the new Flickr picture page (ok this is pre Kellan’s rev="canonical", and while Flickr already has such links the URLs are not exposed by Tarpipe’s Flickr connector)
  • dent the rating, with the short picture URL, on identi.ca (which in turn, re-dents to Twitter too)
  • reply to the original email confirming that the beer was successfully rated

All in the space of a few clicks and drags! Here’s a shot of that workflow (with a couple of connectors partially obscured — it’s a known bug in Tarpipe):

Tarpipe workflow for beer rating

But what’s more fabulous: Tarpipe has been ideal for my son Joseph to start up with programming, with me. And he finds it really interesting. Visual, direct feedback, using and connecting things and services he understands. Gone are the days of

10 PRINT "HELLO WORLD"
20 GOTO 10

on black and white low-res displays.

After explaining a few concepts, Joseph was totally up and away, building his first workflow which is pretty impressive! (I’m a biased, proud dad of course :-) And now we’re off looking at Yahoo! Pipes too, and he’s asking how we can link the two services together.

Hello, new programming world.

Old feed URLs fixed with a bit of mod_rewrite voodoo

As feeds are the new blogs (quoting myself, oh dear!) I thought it important to make sure that the feed bots that have been continuously polling my weblog’s feed and getting 404s (since 2005, I guess) are sent to the right place. My Apache access.log file was showing that 404s were being returned for /qmacro/blog/index.rdf and /qmacro/blog/index.xml, and /qmacro/xml for that matter … all old locations for the weblog feed.

The power of HTTP, and the voodoo of mod_rewrite, allow me to fix things. Inserting these lines into the relevant .htaccess files does the trick:

RewriteRule ^index\.(xml|rdf)$ /qmacro/blog/feed/atom/ [R=301,L]
RewriteRule ^xml$ /qmacro/blog/feed/atom/ [R=301,L]

Now the bots are redirected to this weblog’s shiny new feed. And I’ll try not to change the URL again :-)
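To see which old URLs those two patterns catch, here is a small Python simulation. It is a sketch under stated assumptions, not a description of how Apache works internally: it assumes the first rule lives in the .htaccess for /qmacro/blog/ and the second in the one for /qmacro/, and that in .htaccess context the pattern is matched against the path with that directory prefix removed. The names RULES and redirect_for are made up for the example.

```python
import re

TARGET = "/qmacro/blog/feed/atom/"

# (directory the .htaccess file is assumed to live in, pattern from the rule)
RULES = [
    ("/qmacro/blog/", re.compile(r"^index\.(xml|rdf)$")),
    ("/qmacro/", re.compile(r"^xml$")),
]


def redirect_for(url_path):
    """Return (301, target) if one of the rules applies to url_path, else None."""
    for directory, pattern in RULES:
        if url_path.startswith(directory):
            # Per-directory context: match against the path relative to the directory.
            relative = url_path[len(directory):]
            if pattern.search(relative):
                return 301, TARGET
    return None


if __name__ == "__main__":
    for path in ("/qmacro/blog/index.rdf", "/qmacro/blog/index.xml",
                 "/qmacro/xml", "/qmacro/blog/feed/atom/"):
        print(path, "->", redirect_for(path))
```

The new feed URL itself matches neither pattern, so there is no redirect loop.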

Back from Real Life

I’m back online, after an eXtended Away in Real Life. I don’t think my online presence will ever be what it was (I can’t believe how much I posted in the past few years) but blogging isn’t the same anyway. Even the new UK edition of Wired magazine (I’ve subscribed, btw) lists blogging under “Tired”. Now there’s microblogging (identi.ca and Twitter spring to mind immediately) and people seem to be *emailing* each other on Facebook these days! That’s the equivalent of the heinous corporate crime of using Excel for everything, like sending screenshots to each other, or writing simple lists (*shudder*).

What’s more, my son Joseph is online now too, complete with blog, identi.ca & Twitter accounts, and more!

Anyway, I’ve got myself a local copy of Wordpress, and am slowly retrieving my past with the help of The Wayback Machine. It’s a slow and not entirely painless process, but I’m getting there. I’m doing a month at a time, and am up to Jan 2003. Nothing’s properly categorised or tagged yet, nor are all the links working perfectly. There are even some posts that aren’t properly datestamped yet! More importantly, I haven’t yet put the mod_rewrite magic in place to reduce the 404s that I’m seeing in my HTTP access log.

Watch this space.

Java and Gosling’s FUD - madness or desperation?

For a while now, people have been talking about the fall from favour of Java (aka the new COBOL).

Today Obie Fernandez points to some hopelessly weak arguments against scripting / dynamic languages … from the father of Java himself, James Gosling.

Is Gosling’s post just a moment of madness, or a sign of hopeless desperation?