XMPP for web services

20080725 – 094816+0000

Rob Kaye has written up what looks like a really interesting talk, but the comments sometimes highlight how badly understood XMPP and its advantages are.

So a few, erm, meta-comments, and I apologise for really just spelling out what’s already been said.

First off, federation is key, here. It’s entirely possible to use XMPP as a purely internal message switch, but it’s probably not all that useful - at least, not unless you’re really aiming to make it external later. There are good reasons for this, not least because although you could use a JavaScript XMPP client in the browser and BOSH back to a single XMPP cluster, it’s far cheaper to just sling together a bespoke long-poll system yourself, using something like jQuery. That’s cheaper in terms of man months, infrastructure cost, and complexity.

However, if you allow users direct access to the message switch (via, as is suggested here, pubsub, or just via simple subscriptions and messages) then you’re doing “real” push, and in a way that allows user-defined access to those messages - basically, it’s like the HTTP “API”s, but on steroids. The web browser may well not be involved - and that’s okay, because it’s really not designed to do that kind of thing.

And that leads onto the second point - you *can* use HTTP to access XMPP, via BOSH, but you really don’t have to. BOSH is there when needed - and it does allow third parties to write web-based apps which can talk to your souped-up XMPP feeds - but when it’s not needed - like when you want to do a desktop app, or when you’re feeding into some other web app, or, well, almost anything you can think of - you can simply talk XMPP directly.

An XMPP Primer

20080715 – 102111+0000

@edd, as I like to call him, suggested that XMPP wasn’t terribly easy to get into, as it lacks a kind of introduction for would-be developers - there’s a steep learning curve to get through. I’m not convinced, in part because I wrote a client good enough to have a conversation over within 24 hours, but it’s fair to say that I couldn’t have written much more, and besides, I read specifications quite a bit anyway.

So these posts are, essentially, a quick primer - they describe the protocol in high-level terms, and the aim is that when you (yes, you) read the RFCs and XEPs, it won’t be quite so scary. There will be simplifications - and I’ll try to point those out when I make them - and there will be errors - because I’m not infallible - but hopefully you’ll end up with some familiarity when you read the specifications.

3,737,844,653

20080707 – 074753+0000

L. M. Orchard writes the following in an otherwise quite sensible article on queue-based architectures for broadcast messaging systems:

“Even with these delays, the system is still better at getting the word out than the original content creator would be at notifying all the others involved with an out-of-band system like IM or email.”

I can broadcast any change in my status to everyone who follows me in seconds already with XMPP - because this would (or at least should) be modeled with presence (either basic or PEP). And this can be optimized a bit further for the typical “open subscription” µblog model.
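To make “modeled with presence” a bit more concrete, here is a rough sketch of what a basic presence stanza carrying a status message looks like. The helper name build_presence is mine, purely for illustration, and a real client would hand this to an XMPP library rather than build strings by hand.

```python
import xml.etree.ElementTree as ET


def build_presence(status_text, show=None):
    """Build a basic XMPP <presence/> stanza carrying a status message.

    build_presence is a hypothetical helper, not part of any XMPP library.
    """
    presence = ET.Element("presence")
    if show is not None:
        # <show/> is optional; the defined values are away, chat, dnd and xa.
        ET.SubElement(presence, "show").text = show
    ET.SubElement(presence, "status").text = status_text
    return ET.tostring(presence, encoding="unicode")


stanza = build_presence("Writing about XMPP")
print(stanza)
```

The client sends one such stanza, and its server fans it out to everyone subscribed to that user’s presence, which is where the “in seconds” comes from.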

Spurious Keyword Of The Day: Marc Chagall.

Outline of an OpenMicroBlogging proposal

20080706 – 135448+0000

I acknowledge that nothing here is new, but I’ve tried to go from first principles. IIRC, Ralph Meijer has already had some of this stuff working.

Okay… Instead of users being referred to by a profile URI, which seems a bit sucky to me, let’s start off by assuming there’s only one microblogging site per domain. I think that’s a reasonable restriction.

This allows you to refer to a particular user globally by a tuple consisting of a username and a domain. Are we good so far?

As far as I can tell, existing microblogging sites don’t allow an @ sign in their usernames, so this allows us to have a string representation for any user, globally, of:

user@domain

Wowee, groundbreaking.
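For what it’s worth, the username/domain tuple and its string form can be sketched in a few lines. The names MicroblogUser and parse_user are made up for illustration, and lower-casing the domain is my assumption rather than anything the proposal requires.

```python
from typing import NamedTuple


class MicroblogUser(NamedTuple):
    """A globally unique user: (username, domain). Hypothetical type."""
    username: str
    domain: str

    def __str__(self):
        return f"{self.username}@{self.domain}"


def parse_user(address):
    """Split 'user@domain' into a MicroblogUser, rejecting malformed input."""
    username, sep, domain = address.partition("@")
    # Usernames contain no '@', so there must be exactly one separator.
    if not sep or not username or not domain or "@" in domain:
        raise ValueError(f"not a user@domain address: {address!r}")
    # Assumption: domains compare case-insensitively, so normalise them.
    return MicroblogUser(username, domain.lower())


user = parse_user("alice@Example.ORG")
print(user)
```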

So assuming that’s all okay so far, we need a protocol to carry data from one domain to another. Given that we want to avoid polling, we’d be best off with a continuous stream orientated protocol, rather than a request/response one. A TCP based protocol seems sensible, since presumably neither end is going to be behind a heavily restricted NAT, and so communications are going to be reasonably free.

The protocol probably needs to be extensible, which suggests to me that we want to be looking for an XML based protocol. And that rather suggests that XMPP or BEEP is a solution here. BEEP is actually lighter - as long as we strip out the over-engineered stream bits we don’t want - but there are many fewer libraries, and besides, most of these sites have an XMPP interface anyway.

So, this suggests that each microblogging update is carried in a single stanza, and that the “subscriptions” are effectively treated as a roster subscription.

So far, we have XMPP, with some kind of message (we’ve not yet decided what) whizzing between servers.

Next, let’s consider what existing facilities are present in XMPP which might do this job.

The obvious one is PEP. In PEP, each user has a number of nodes, which are individually addressable, and can effectively act as broadcast demultiplexers - the user emits a message - an XML gobbet - “aimed at” the node, and the node emits a message to each subscriber.

It’s simple, and almost exactly what we want, so if nobody objects we’ll go with this.
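The fan-out behaviour of a node can be modelled in memory with very little code. This is only a toy: PepNode, its outbox list and the node name are all invented here, and a real PEP service does this inside the XMPP server with proper access control.

```python
class PepNode:
    """Toy model of a PEP node: one publish in, one message per subscriber out."""

    def __init__(self, owner, name):
        self.owner = owner        # the user the node belongs to
        self.name = name          # node name (hypothetical in this sketch)
        self.subscribers = set()
        self.outbox = []          # (recipient, payload) pairs the node would emit

    def subscribe(self, jid):
        self.subscribers.add(jid)

    def publish(self, payload):
        """Queue one copy of the payload for every subscriber."""
        for jid in sorted(self.subscribers):
            self.outbox.append((jid, payload))
        return len(self.subscribers)


node = PepNode("alice@example.org", "microblog")
node.subscribe("bob@example.org")
node.subscribe("carol@example.net")
sent = node.publish("<update>Hello, world</update>")
print(sent, node.outbox)
```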

And there we have an almost complete OpenMicroblogging specification.

The missing bits are:

- discovering metadata about the user, which can probably be done with s to the user, or possibly XEP-0154 attributes.

- the precise formatting of the microblogging updates.

- an optimization for reducing server-server traffic loads.

The last of these can be handled as follows: where the microblogging update is open and public (i.e., has a predictable and largely uncontrolled access model), we may as well send updates conceptually between servers, such that multiple users on a remote server “share” a single update message.

The easy way to do this is for the remote server to indicate that the subscription is being proxied.
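The saving is easy to see if you group subscribers by domain before sending. This sketch assumes the remote server is willing to do the local fan-out itself; plan_deliveries is a made-up name and there is no real wire protocol here.

```python
from collections import defaultdict


def plan_deliveries(subscribers):
    """Group subscriber addresses by domain.

    With proxied subscriptions, one message per domain crosses the wire and
    the remote server delivers it to its own users. Hypothetical helper.
    """
    by_domain = defaultdict(list)
    for jid in subscribers:
        _, _, domain = jid.partition("@")
        by_domain[domain].append(jid)
    return dict(by_domain)


subscribers = ["alice@example.org", "bob@example.org", "carol@example.net"]
plan = plan_deliveries(subscribers)
# Two server-to-server messages instead of three.
print(len(plan), plan)
```

For a popular public account with thousands of followers spread over a handful of servers, the number of inter-server messages drops from the follower count to the server count.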

The thing I wonder about, though, is why one would bother having microblogging sites at all, in this circumstance - what I’ve outlined above is, after all, essentially a simple PEP service that could be put into place today with many servers, and would take minimal client development. That would mean, though, that microblogging sites would essentially act as an interface onto the XMPP service, which is an odd state of affairs, but I think overall I prefer it.

Pointless Keyword Of The Day: Rick Moranis.

Mission Accomplished

20080705 – 083828+0000

@edd: Enjoying discovering @dwd’s cranky blog at http://blog.dave.cridland.net/
about 11 hours ago

Hmmm.

But anyway, I got ten times as many views on the Git/Haskell/Ruby post yesterday as the two identi.ca/laconi.ca posts combined, thus restoring my lack of faith in human nature entirely. Turns out that posts cynically designed purely to trigger keywords and attract visitors doomed to disappointment are, in fact, more popular than reasoned, considered thoughts.

So from now on, I’m going to put my mighty and unparalleled intellect into choosing suitable keywords, instead of bothering with silly things like content. So, who is Jordan Hasay anyway?

laconi.ca’s XMPP Interface

20080704 – 080049+0000

Some random features I’m hoping to persuade someone else into writing… (Yes, I might manage to sully myself with PHP to help out)

- identi.ca is multi-user chat. So it looks, to you, like a chatroom. Apparently this is not a new idea. Joe Hildebrand (hildjj) has added some interesting ideas here, changing the as well as the depending on the last change. Of course, if the user is online via XMPP, it makes sense to expose that - with the user’s permission.

- Settings really ought to be via XEP-0050. It makes so much more sense. Particularly for IM settings, but in practice any settings can be handled here.

- It should pick up my status, sure, but it should also pick up my avatar, and quite possibly geoloc etc too.

I’m vaguely wondering whether laconi.ca - and therefore identi.ca - ought to be accepting a jid in the same way as they accept OpenID, which would (in principle) allow you to simply join the service via XMPP, with no passwords needed, and avoid the web interface entirely - it’d act, instead, just like an XMPP chatroom where you chose who’d be present. (Which, when you think of it, is actually quite cool.)

Of course, this concentration on the XMPP interface might just be because I’m getting timeouts all the time on the web interface…

It’s all about popular!

20080703 – 234350+0000

So, it seems that although identi.ca is “well cool”, and thoroughly of the moment, my Haskell/Git/Ruby post was, at the end of the day, just a tiny bit more popular (at one stage, it was twice as popular). I’d feel proud, except Kev and Remko really wrote it, in the jdev@conference.jabber.org chatroom.

My XYMPKI “exposé” still remains my top post overall (it was probably responsible for several thousand hits), which is nice. Pointless, too, because Apple and Yahoo still haven’t fixed it. Indeed, I noticed that KMail now uses this as a “feature” to get IMAP access to Yahoo on the desktop. Wikipædia still don’t mention the issue, having removed the mention very early on.

“Did they have brains or knowledge?
Don’t make me laugh!”

(That, and the title, from the song “Popular”, from the musical “Wicked”.)

Things that Git is missing: Haskell and Ruby microblogging support

20080703 – 090756+0000

How us bloggers decide what to write about:

[10:00:23] remko: dwd: i noticed that any post on reddit that has the words “Haskell”, “Git” or “Ruby” in them gets read a lot of times. So, I tried that theory out, and made a post with the word “Haskell” in it. And behold, i never got reditted, and then i did.
[10:00:54] remko: imagine what would happen if i did a useless post with ‘Git’ in it
[10:00:59] remko: like “Things that Git is missing”
[10:01:03] remko: or “Things I like about Git”
[10:01:19] Kev: “Things that Git is missing: Haskell and Ruby microblogging support”
[10:01:48] remko: killer post
[10:02:28] remko: Rails interface to Git using Haskell backend
[10:03:08] Kev: over XMPP :D
[10:03:55] remko: XMPP is not popular on reddit
[10:03:58] remko: it’s not worth it
[10:04:23] remko: i only want hip words in my titles
[10:06:05] Kev: “Rails interface to Git using Haskell backend for surgical hip replacements”

Worth a try, eh?

identi.ca

20080703 – 081630+0000

So I joined identi.ca last night - my, my, how trendy and with-it I am - and played a bit, and slept on it, and now I shall post my belated post on my rarely used blog.

Yes, it’s another pointless microblogging site. What we used to call a web talker, before people forgot what a talker was. Apologies to anyone who thought these were a recent phenomenon. Further apologies to anyone who thought phenomenon was spelt “m-e-m-e”.

Yes, it, too, has some XMPP integration. I never got on with the ones that don’t, except for when they had telnet, and that goes back a while. Moreover, the XMPP integration gives the user a much better interface, albeit with a lot fewer graphics. Some cynical folks amongst us think that’s a good thing.

Yes, it could be the next big thing. There’s a bunch of blog posting on the matter, many of which criticise the scaling - the interesting thing is that by federating the microblogosphere - and yes, I need to wash my hands after typing that - Evan Prodromou has managed to take this kind of talker a step beyond its mid-90s roots. It almost doesn’t matter if it won’t scale well: federation means internet-wide clustering, in effect, and the potential is pretty impressive.

The federation alone marks it as being highly distinct from the Twitters and Jaikus of this world - as PSA says, it probably needs a bit of thought in terms of protocol design, but the concept is indeed a powerful one.

Still, it’s got teething problems - there’s no formal support, so quite why the identi.ca XMPP bot hasn’t subscribed to my presence is an unknown to me - it means it can’t update my status based on my presence, which I thought was both a sensible and interesting concept, and one I really wanted to try.

But on balance, I think it’s headed in the right direction, and its open-source nature should mean that new features and suchlike happen quickly.

One thing I’d like to see - and might tinker with if I ever find the time - is a less dull XMPP integration than the typical one. More interesting would be to model identi.ca as a MUC service, wherein each user has control of a MUC room that is populated by the people they subscribe to. That would give a history interface, as well as a “last status” display, almost for free.
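As a rough illustration of why history and “last status” come almost for free, here is the room-per-user idea as a toy data structure. UserRoom and everything in it is hypothetical, and a real MUC service would keep this state on the server and speak XEP-0045 to clients.

```python
from collections import deque


class UserRoom:
    """Toy model of a per-user room populated by the people they follow."""

    def __init__(self, owner, history_size=20):
        self.owner = owner
        self.occupants = set()                     # people the owner subscribes to
        self.history = deque(maxlen=history_size)  # recent updates, oldest dropped
        self.last_status = {}                      # occupant -> most recent update

    def follow(self, jid):
        self.occupants.add(jid)

    def post(self, jid, text):
        """Record an update from an occupant; ignore anyone not followed."""
        if jid not in self.occupants:
            return False
        self.history.append((jid, text))
        self.last_status[jid] = text
        return True


room = UserRoom("alice@example.org", history_size=2)
room.follow("bob@example.org")
room.post("bob@example.org", "first")
room.post("bob@example.org", "second")
room.post("bob@example.org", "third")
print(list(room.history), room.last_status)
```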

Ooooh. Toy!

20080429 – 085100+0000

PEP Aggregator, announced (very) early this morning, and nothing to do with my last post which talked about precisely this kind of thing. (I talked about bookmarks, he’s doing user tune, but it’s much of a muchness).

Stephan Maka wrote it, you can contact him on the jdev list if you’re interested in helping (and can code Ruby).

I know it’s terribly trendy to do this, but if you ask me, this is Web 3.0, and the shape of things to come.