StatusNet daemon rearchitecture

October 22nd, 2009

Some of StatusNet’s awesomer features like the XMPP and Twitter bridges require running background daemons to watch event queues or keep connections to XMPP servers open.

Alas, this just isn’t going to scale to the future StatusNet 1.0 world where we’re going to be running thousands of instances for our hosted services. We see a lot of problems with memory leaks and instability in those daemons with even just a few live sites now…

I’ve worked out what I think is a feasible architecture for a more scalable queue/daemon system for big sites to use; details and some diagrams I whipped up to help me keep my head straight are up on the wiki:

[Diagrams: old-small, new-small]

I’m planning to use a single lightweight master daemon which will maintain long-running connections to the ActiveMQ queue and XMPP daemons. Actual event handling will still be done at the PHP/StatusNet level, but now as short-running processes which will handle a single event and exit.

This reduces the surface area for memory leaks and other oddities we encounter in long-running PHP scripts, and most importantly means we only have to run as many processes as there actually are events being handled!
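To make the shape of that concrete, here's a rough sketch of the master/short-lived-handler split. It's in Python purely for brevity (the real handlers would be PHP/StatusNet scripts), and everything in it is an assumption for illustration: the in-memory deque stands in for the ActiveMQ queue, and HANDLER_SOURCE, handle_event() and run_master() are hypothetical names, not anything in StatusNet. It also runs handlers one at a time, which a real master probably wouldn't.

```python
import subprocess
import sys
from collections import deque

# Stand-in for a short-running handler script (hypothetical): it receives
# one event on the command line, handles it, and exits.
HANDLER_SOURCE = "import sys; print('handled ' + sys.argv[1])"


def handle_event(event):
    """Spawn one short-lived child process for a single event."""
    result = subprocess.run(
        [sys.executable, "-c", HANDLER_SOURCE, event],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def run_master(queue):
    """Drain the queue, one child process per event.

    A real master would block on its long-running queue and XMPP
    connections instead of draining an in-memory deque.
    """
    results = []
    while queue:
        results.append(handle_event(queue.popleft()))
    return results


if __name__ == "__main__":
    events = deque(["notice:1", "notice:2"])
    for line in run_master(events):
        print(line)
```

The point of the split is that any memory a handler leaks goes away when its process exits; only the small master loop has to stay healthy over the long run.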

Please feel free to give some feedback before I jump into implementation. :)

– brion vibber (brion @ status.net)
Senior Software Architect, StatusNet
San Francisco

Adium 1.4b10 fixes for StatusNet

October 16th, 2009

Adium 1.4b10 has just been released, with fixes for StatusNet support:

[Screenshot, 2009-10-16 at 11.35.23 AM]

Menu items are now labeled as StatusNet rather than the old Laconica name, and more importantly you can now select whether to connect to the server over SSL or not — in previous Adium betas it always tried to use SSL, which worked with identi.ca but failed for many third-party sites.

PubSubHubbub and microblogging

October 15th, 2009

I’m poking about at the Realtime Web Summit, and just got out of the PubSubHubbub session.

PuSH is a relatively lightweight protocol for pushing feed updates in more or less real-time using standard web protocols (eg, HTTP!). As currently spec’d, PuSH covers the server-to-server replication space pretty well — publishers send their updates to hubs, which send them on to the callback URLs given by subscribers.
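As a toy model of that publisher → hub → subscriber flow, here's a minimal in-memory sketch in Python. It is not the PuSH wire protocol: the Hub class, its method names, and the example URLs are all made up for illustration, and a real hub would deliver each update over HTTP to the callback URL rather than returning a list.

```python
from collections import defaultdict


class Hub:
    """Toy in-memory hub: maps topic URLs to subscriber callback URLs."""

    def __init__(self):
        self.callbacks = defaultdict(set)

    def subscribe(self, topic, callback):
        # A subscriber has to supply a URL the hub can reach.
        self.callbacks[topic].add(callback)

    def publish(self, topic, entry):
        """Return the (callback, entry) deliveries the hub would make.

        A real hub would send each one over HTTP to the callback URL.
        """
        return [(cb, entry) for cb in sorted(self.callbacks[topic])]


if __name__ == "__main__":
    hub = Hub()
    hub.subscribe("http://example.org/feed", "http://a.example/callback")
    hub.subscribe("http://example.org/feed", "http://b.example/callback")
    for callback, entry in hub.publish("http://example.org/feed", "new notice"):
        print(callback, entry)
```

Note that subscribe() only works for parties that have a callback URL to hand over, which is the whole model: the hub calls you.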

For StatusNet, we’re really interested in two possible extensions to this, which would be outside the scope of the current PuSH spec:

Microblogging metadata extensions. PuSH deals with RSS and Atom feeds, but doesn’t really care what’s inside them. Microblogging and other social-type services will have various metadata — profile name & avatar, friend relations, ratings, comment id links, etc — which could be embedded into activity stream feeds, allowing different services to handle remote subscriptions interoperably.

Of course, we could just push everybody to support OMB… but the PuSH model may be more flexible, allowing subscriptions to blogs etc to be aggregated into your notice stream.

“Last mile” push to clients. There’s still no standardization for real-time communications from an aggregator or social service to end-user web, desktop, or mobile clients. PuSH as spec’d can’t handle this since it needs a URL to post updates to subscribers, which a NAT’d or mobile client obviously isn’t going to have.

A more or less standard way to attach XMPP or long-polling to pull updates from an aggregating hub to client end-points would be very nice; just like common use of the Twitter API has allowed interop between client apps and services (many Twitter clients will happily speak to identi.ca if you just change the API url to https://identi.ca/api!). That third-party ecosystem is mostly restricted to polling, though, and could really benefit from interoperable methods for pushing updates to an open client.

gettext: the agony and the ecstasy

October 13th, 2009

I’ve been poking around on StatusNet’s i18n to see if we can get localizations working with less fidgeting; all languages should “just work” when you drop in your StatusNet install.

We’re loading translations using gettext, which unfortunately can only load translations for languages which are set up as locales system-wide. This is massively problematic and has led to localization just plain not working for most people; it’s even broken on identi.ca!

The good news is that we already have a compatibility layer that doesn’t have this limitation: php-gettext, which provides a source-compatible drop-in gettext interface in pure PHP.

The bad news is that if the native gettext module *is* present, there doesn’t seem to be any way to override calls to _() — PHP has no facility for monkeypatching to replace existing functions. So, to force use of the compat functions for unsupported locales we’d need to change all calls to a new name.

Any preferences between something short and cryptic like __() or something clearer and StatusNet-y like common_msg()?

This is also a good time to think about handling localization for plugins; currently the plugins that ship with StatusNet’s source just have their localizations lumped into StatusNet’s main file — that obviously won’t do for plugins that are maintained and distributed separately.

In the gettext model it looks like the way to handle this is to use multiple “domains”; StatusNet’s core domain is “statusnet” (duh) and set as the default domain for gettext calls. A plugin can bind its own locale subdirectory to another domain (say “ldapplugin”) and instead of calling _("Some text") can call dgettext("ldapplugin", "Some text").

This could perhaps be simplified by adding helper methods onto the Plugin base class…

$this->_("blah")
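Here's roughly what such a helper could look like, sketched with Python's gettext module as a stand-in (its bindtextdomain() and dgettext() mirror the C/PHP calls of the same names). The Plugin and LdapPlugin classes and the locale path are hypothetical; this is not StatusNet's actual plugin API.

```python
import gettext


class Plugin:
    """Hypothetical plugin base class with a per-plugin gettext domain."""

    domain = "statusnet"  # core domain named in the post
    localedir = None      # None means the system default location

    def __init__(self):
        # Bind this plugin's domain to its own locale directory, if any.
        if self.localedir is not None:
            gettext.bindtextdomain(self.domain, self.localedir)

    def _(self, message):
        # Look the message up in this plugin's domain, not the default one.
        return gettext.dgettext(self.domain, message)


class LdapPlugin(Plugin):
    domain = "ldapplugin"
    localedir = "plugins/ldap/locale"  # hypothetical path


if __name__ == "__main__":
    # With no compiled catalog installed, the message comes back unchanged.
    print(LdapPlugin()._("Some text"))
```

The nice part is that plugin authors never have to type the domain name at each call site; the base class carries it.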

Thoughts?

Updated 2009-10-16: I seem to be able to work around the locale setup problem by setting a valid locale before setting the invalid one. :P That should hold us for a while before we try larger changes.


Cross-posted w/ StatusNet-Dev mailing list.

SVG in Wikipedia and Wikimedia Commons

October 4th, 2009

Slides for my talk at SVG Open are available for download as PDF or Keynote source. (I can make my test corpus available as well; let me know if interested!)

– brion

post mirrored from Wikimedia Tech Blog

Moving to StatusNet

September 28th, 2009

I’d like to share some exciting news with you all… After four awesome years working for the Wikimedia Foundation full-time, next month I’m going to be starting a new position at StatusNet, leading development on the open-source microblogging system which powers identi.ca and other sites.

I’ve been contributing to StatusNet (formerly Laconica) as a user, bug reporter, and patch submitter since 2008, and I’m really excited at the opportunity to get more involved in the project at this key time as we gear up for a 1.0 release, hosted services, and support offerings.

StatusNet was born in the same free-culture and free-software community that brought me to Wikipedia; many of you probably already know founder Evan Prodromou from his longtime work in the wiki community, launching the awesome Wikitravel and helping out with MediaWiki development on various fronts. The “big idea” driving StatusNet is rebalancing power in the modern social web — pushing data portability and open protocols to protect your autonomy from siloed proprietary services… People need the ability to control their own presence on the web instead of hoping Facebook or Twitter always treat you the way you want.

This does unfortunately mean that I’ll have less time for MediaWiki as I’ll be leaving my position as Wikimedia CTO sooner than originally anticipated, but that doesn’t mean I’m leaving the Wikimedia community or MediaWiki development!

Just as I was in the MediaWiki development community before Wikimedia hired me, you’ll all see me in the same IRC channels and on the same mailing lists… I know this is also a busy time with our fundraiser coming up and lots of cool ongoing developments, so to help ease the transition I’ve worked out a commitment to come into the WMF office one day a week through the end of December to make sure all our tech staff has a chance to pick my brain as we smooth out the code review processes and make sure things are as well documented as I like to think they are. ;)

We’ve got a great tech team here at Wikimedia, and we’ve done so much with so little over the last few years. A lot of really good work is going on now, modernizing both our infrastructure and our user interface… I have every confidence that Wikipedia and friends will continue to thrive!

I’ll start full-time at StatusNet on October 12. My key priorities until then are getting some of our key software rollouts going, supporting the Usability Initiative’s next scheduled update and getting a useful but minimally-disruptive Flagged Revisions configuration going on English Wikipedia. I’m also hoping to make further improvements to our code review process, based on my experience with our recent big updates as well as the git-based workflow we’re using at StatusNet — I’ve got a lot of great ideas for improving the CodeReview extension…

Erik Moeller will be the primary point of contact for WMF tech management issues starting October 12, until the new CTO is hired. I’ll support the hiring process as much as I can, and we’re hoping to have a candidate in the door by the end of the year.

– brion vibber (brion @ wikimedia.org)
CTO, Wikimedia Foundation
San Francisco

Update: Evan’s announcement is up on the StatusNet blog.

Flickr hates England

September 14th, 2009

Ah, search suggestions…

[Image: punishment]

Screen integration with terminals?

July 17th, 2009

As a guy who spends a lot of time in remote Linux shells from a laptop, I’m looking for better integration between my terminal emulators and screen sessions.

  • Let me use native scrollbars to access the backscroll!
  • Start me in screen by default so I don’t forget to start one.
  • If I have disconnected sessions, let me choose to reconnect or create a new session, with some reasonable menu.
  • Automatically reconnect to the server and the screen session after network disruption (switching networks, sleeping the machine overnight, etc).
  • Don’t mess up backspace. (This plagues me on Mac clients a lot. Backspace works fine in a regular terminal but becomes forward delete in a screen session. WTF?)

Linux & Mac clients both welcome… Anybody know something down this road already available?

HDTV and the video look

June 28th, 2009

I spent some time last night playing with my parents’ shiny new HDTV, which puts my 2005-vintage 26″ set to shame.

Pretty nice set; 40-something inch, 1080p, 120 Hz whatchamahooie, and you can plug in a USB stick full of JPEGs and force your family to watch your vacation photos. Nice!

It seems to be all the rage on new sets to have motion interpolation which can take 24-frame-sourced content (feature films and most US drama and sitcom TV shows) and smooth out the frame-to-frame motion, making it look more like 60-field video. Lots of higher-end sets advertise 120 Hz or even 240 Hz, which honestly seems excessive to me — the human eye can’t distinguish much more than 60 frames per second. :)

I’m a bit torn; on the one hand, the faster frame rate makes motion look much more vivid and realistic from any objective point of view. On the other hand, audiences have been trained over the last few decades to associate the video look with “cheesier” programming — soaps, reality shows, etc — while “serious” programs are shot on film at 24fps, making them feel more like a big-budget feature film… even to the point that lots of money was spent developing HD video cameras that could shoot at the slower, less realistic 24fps instead of HD’s native 60!

We stumbled into Harold & Kumar Escape from Guantanamo Bay of all things on HBO, and ran it for a while just to get a feel for the set. At first it drove me nuts seeing a movie I’d already seen on film looking distinctly like HD video, but after a half hour I got quite used to it and rather grew to like it. Of course as a former cinema-television student I’m extra-sensitized to this stuff; my wife immediately took to the more vivid display and commented on how much better it looked than when we’d seen it in the theater!

Looks like the mass audiences are happy to embrace high-motion video… I wonder if the long-standing holdover of the “film look” over the last decade was driven more by the oversensitized film geeks in the industry than any actual audience comparison…

Let’s learn a lesson here with our software development as well — those of us who’ve been nose-deep in web sites and software UI for years aren’t necessarily the most qualified to tell what our actual users are going to be most comfortable with.

Still alive!

June 17th, 2009

In the last few weeks:

  • got married to my awesome lady Marti
  • saw Fleetwood Mac concert
  • took two weeks vacation in Chicago area (still sorting and uploading gajillions of pictures)
  • dragged 11-to-13-year-old nieces & nephews all over Chicago
  • hit wacky science-fiction/fantasy cons in California and Illinois
  • sold my soul to Apple & AT&T again for a new iPhone 3G S on order (mmmm, delicious gilded cage)

Whew! Ok, now I need a vacation from my vacation, and that means… back to work!


I love Wikipedia!