October 12th, 2008
We had a database master (for German Wikipedia and several other major languages) fill up its disk, breaking replication and causing some general havoc. Tim patched things up for the time being, but we still have to fix up some of the de-synced servers once they’re done rebuilding; in the meantime, you may see old data in some places or very long reported lag times for your bot tools.
See Tim’s summary of what led to the failure and my follow-up notes about necessary fail-safe fixes and monitoring procedures.
Posted in devel, wiki | No Comments »
October 10th, 2008
Posted in apple, devel, wtf | No Comments »
October 3rd, 2008
Hey all –
I hereby invite you all to the first official regular weekly MediaWiki Bug Monday, to occur on October 6*.
Come hang out in #mediawiki on irc.freenode.net and help us verify and clean up bug reports, raise submitted patches for review and application, and generally try to break stuff.
It’ll be awesome!
* (Pick a timezone, any timezone, just show up when you can — even you Australians!)
Posted in devel, wiki | No Comments »
October 2nd, 2008
One of the problems we’ve been seeing is that our code review procedure doesn’t always scale well. We have a fairly large number of committers, and a pretty liberal policy about committing new code to trunk — but we also need things to work consistently so we can keep the production code up to date.
Traditionally, code that’s been committed to SVN gets reviewed offline by me or Tim before we push things out live. If we find problems, we fix them up or revert the code to be redone correctly later.
There are a few big problems with this:
- We can’t easily coordinate our notes; I can’t see what Tim’s reviewed and what he hasn’t. We end up either duplicating effort or missing things.
- If we’re both busy, sick, on vacation, etc., sometimes it just doesn’t get done!
- If we want more people to pitch in, coordination gets even harder.
In my spare time over the last few weeks I’ve thrown together a little CodeReview extension for MediaWiki to help with this. It pulls the SVN revision data as commits are made and presents an interface on the wiki where we can see what’s been reviewed, tag problems, and add comments for follow-up issues.
Yesterday I went ahead and put it live:

The UI’s still a little rough, and not all linking and metadata features are implemented; Aaron’s going to help polish it up.
But it’s already useful, and I’ve got some revisions flagged as fixme…
Feel free to try it out, and add notes for things which need work.
Currently comments are open to any registered user on the wiki; status changes and tagging updates are limited to the ‘coder’ group, which is viral — any coder can make another user a coder.
Posted in devel, wiki | No Comments »
September 30th, 2008
Looks like we had a problem this morning with our domain redirection configuration, which broke access to the site for at least some people for a while.
Wikimedia has a lot of non-default domains registered, which we set up as redirects to the various primary domains — for instance www.wikipedia.com redirects to www.wikipedia.org, the standard location for Wikipedia’s multilingual entry portal.
This is handled by setting up a special Apache web server virtual host configuration which accepts connections for all the domains we don’t actually host wikis on — this virtual host has a bunch of mod_rewrite settings which go through and decide which domain to send the request on to. It returns an HTTP redirect response to the browser, which then goes on to the correct site.
For efficiency, many of these responses are declared to be cacheable (“301 Moved Permanently”), since they always send on to the same spot. This means that multiple hits to the same redirected URL will make use of our Squid proxy caching layer, reducing traffic to our backend servers.
The unfortunate thing is that if the configuration gets messed up and people are sent to the *wrong* URL, that’s also cached. The redirect config file was accidentally broken during maintenance this morning, creating some redirect loops for URLs which weren’t supposed to redirect in the first place.
To fix it, we’ve been restarting the Squid proxies and clearing their caches to ensure that all bad redirects are flushed out of the system.
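The redirect host described above can be sketched roughly as follows. This is only an illustration in Python, not our actual mod_rewrite configuration: the REDIRECTS table, both function names, and the second table entry are hypothetical. It shows the two pieces that matter here: answering a non-primary domain with a 301, and checking a redirect table for loops before it goes anywhere near the caches.

```python
# Hypothetical redirect table: non-primary domain -> primary domain.
# Only the first entry comes from the post; the second is made up for illustration.
REDIRECTS = {
    "www.wikipedia.com": "www.wikipedia.org",
    "wikipedia.com": "www.wikipedia.org",
}


def redirect_response(host, path):
    """Return (status, location) for a redirected host, or None if we do not redirect it."""
    target = REDIRECTS.get(host.lower())
    if target is None:
        return None
    # 301 is cacheable, so a wrong target here would be cached as well.
    return 301, "http://" + target + path


def find_redirect_loops(table):
    """Return the sorted hosts whose redirect chain revisits a host instead of ending."""
    looping = []
    for start in table:
        seen = set()
        host = start
        while host in table:
            if host in seen:
                looping.append(start)
                break
            seen.add(host)
            host = table[host]
    return sorted(looping)
```

Running something like find_redirect_loops over a changed table before deploying it is the kind of check a staging step could automate.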
As part of our ongoing mission to create permanent fixes to known site maintenance problems, we’re moving up some improvements that were already on our list but that we hadn’t gotten to yet:
- Proper version control for the relevant config files
- Staging server for web server configuration changes — something we can test against in the live environment but which doesn’t pollute the primary web caches if it breaks while we’re testing it
Posted in devel, wiki | 3 Comments »
September 24th, 2008
We’ve tracked down today’s problems to a combination of a couple of things:
- There’ve been ongoing database locking issues with the site statistics updates — these would all block on each other, making page saves very slow at times
- … which held open database connections, causing the text storage servers to start locking out new connections …
- … which exacerbated problems with the failover behavior of recent changes to the storage and load balancing code.
The code changes have been rolled back, fixing the slow site load behavior. (Doing this correctly unfortunately was a bit painful, as we had to restore the broken code for a while in order to pick out what was going on enough to fully revert it again.)
Domas believes the main culprit on the database locking is actually an issue with our mail server: some actions (such as creation of new accounts) involve both mail and updates to the site statistics table. With the mail server overloaded, and a very simple local mail client called from MediaWiki, the outgoing mail would sometimes hang while the transaction was still open, holding the locks and causing other updates to stall.
As a temporary measure I’ve disabled the site stats updates, fixing the failures on page save. (They’ll need to be re-updated after we’ve totally resolved it.)
We’re looking at the way the mail servers are set up to see if we can ensure that internal connections don’t stall the way they were; we should also be able to rearrange the transactions so that things are committed before the mail goes out!
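The "commit before the mail goes out" idea looks roughly like this. It is a minimal sketch in Python with an in-memory SQLite database, not MediaWiki’s actual code: the table names, create_account, and the send_mail callback are all assumptions made up for the example.

```python
import sqlite3


def create_account(conn, username, send_mail):
    """Write the account and stats update, commit, and only then send mail."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO accounts (name) VALUES (?)", (username,))
        conn.execute("UPDATE site_stats SET users = users + 1")
    # The transaction is closed at this point, so a slow or hung mail
    # server can no longer hold locks on the stats row.
    send_mail(username)


def demo():
    """Run create_account once and report what the mail step observed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT)")
    conn.execute("CREATE TABLE site_stats (users INTEGER)")
    conn.execute("INSERT INTO site_stats (users) VALUES (0)")
    conn.commit()

    in_transaction_during_mail = []

    def fake_mail(name):
        # Stand-in for the mail client; records whether a transaction is open.
        in_transaction_during_mail.append(conn.in_transaction)

    create_account(conn, "Example", fake_mail)
    users = conn.execute("SELECT users FROM site_stats").fetchone()[0]
    conn.close()
    return users, in_transaction_during_mail
```

The trade-off is that a failed mail no longer rolls back the account creation, so mail errors have to be handled separately (retried or logged) rather than by the transaction.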
Posted in devel, wiki | No Comments »
September 22nd, 2008
Well, today was exciting! Wikimedia’s sites experienced two downtime events today.
The first, which lasted about 30 minutes, was due to a power problem. While Rob was performing maintenance fixing up power in rack B2, power was inadvertently shut off to an access switch serving another rack of servers, which took a chunk of our core text storage offline.
The second, which also lasted about 30 minutes, was caused by a file server failure. The file server that holds our NFS home directories and misc files and logs experienced a kernel crash, then turned up some disk errors on reboot. (Possibly two failed drives, which may hose the array.)
Ideally this wouldn’t disturb production web serving, but various debugging logs were being saved onto this server, and this caused the web servers to hang waiting for NFS to come back up.
We’ve disabled the internal debug logging for now, and the site’s back up and running while we poke at recovering or replacing the file server.
Both of these problems can be ameliorated in the future with some more failure-proof design:
- Spreading text storage clusters across multiple racks will protect against localized power or network failures
- Moving debug logs to a UDP system will have a more graceful failure mode for centralized logging than hanging NFS shares
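To show why UDP fails more gracefully here, this is a minimal fire-and-forget sketch in Python. The function name, default host, and port 8420 are hypothetical, and this is not the logging setup we will actually deploy. The point is that sending a datagram does not wait on the receiver, so a dead log server costs us log lines rather than hung web servers.

```python
import socket


def send_debug_log(message, host="127.0.0.1", port=8420):
    """Send one log line as a UDP datagram; never block on or fail for the receiver."""
    data = message.encode("utf-8")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(data, (host, port))
        return True
    except OSError:
        # Logging must never take down the caller; just drop the line.
        return False
    finally:
        sock.close()
```

The cost is that delivery is not guaranteed, which seems an acceptable trade for debug logs.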
Posted in devel, wiki | 2 Comments »
September 12th, 2008
The other day SourceForge launched their new hosted apps system, allowing SF-hosted open source projects to much more easily set up some web tools for their projects. The apps available at launch include phpBB, MediaWiki, and LimeSurvey.
While it’s been possible to run MediaWiki in your SourceForge project web space for a long time, it’s been a little tricky to set up, particularly as they’ve tightened security configurations in the last couple years.
Centralized administration, authentication, and maintenance should make it a lot easier for SourceForge project admins to get a wiki up and running for their project, and more wiki equals more fun!
Posted in wiki | 2 Comments »
September 3rd, 2008
After a previous reworking of MediaWiki’s stylesheet-handling code to allow adding handheld stylesheets, I’ve gone ahead and implemented bug 2889 adding per-site customizable MediaWiki:Print.css and MediaWiki:Handheld.css pages.
The ability to specify some handheld tweaks is needed to be able to work around issues with certain kinds of layout formatting, especially the big beautiful multi-column table layouts which are popular on portal and main pages.
While lovely on a large screen, on a small device they tend to either make the columns reaaaally tiny or push things out off screen. On English Wikipedia I’ve thrown in some quick style hacks to flatten out those tables on the main page (this was applied already by Opera Mini’s classic view, but not Opera’s other browsers in small-screen mode):
[Screenshots: English Wikipedia main page on a small screen, before and after]
There are still improvements that can be done, but it at least helps things fit on screen! MediaWiki:Handheld.css can be edited on each of our wikis to tweak things up as desired/required.
Of course it’s always best to try to use clean, scalable styles that work on small screens to begin with.
Posted in devel, mobile, wiki | 1 Comment »
August 27th, 2008
For some time, MediaWiki has provided an OpenSearch interface to allow supporting browsers to list your favorite wiki as a search provider right in your browser’s built-in search bar.
Firefox since version 2.0 supports type-ahead search suggestions as well, making it easier to reach the page you’re looking for without typing the whole thing:

Internet Explorer introduced basic OpenSearch support in IE 7, letting you add Wikipedia as a search provider, but still didn’t include the handy type-ahead search suggestions.
IE 8 beta 2 has finally added support for the search suggestions, including an extended format which allows including text extracts and images with the results. We’ve added support for this in the OpenSearchXml extension, now enabled on Wikipedia and all the Wikimedia sites:

You won’t have to change anything on your wiki to support search suggestions on IE 8, but with the extension you’ll get a little extra bling.
The description text and image extraction is still a little experimental, but does a pretty good job, and we plan to bring these capabilities into the core software where they can be used in other search and site map interfaces.
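For the curious, the basic type-ahead suggestions are just a small JSON array: the query, then parallel lists of completions, descriptions, and URLs. Here is a rough Python sketch of building one. The function name, the prefix matching, and the example.org base URL are assumptions for illustration, not MediaWiki’s implementation, and the extended XML format that IE 8 uses for text extracts and images is not covered.

```python
import json


def opensearch_suggestions(query, titles, base_url="https://example.org/wiki/"):
    """Build an OpenSearch Suggestions JSON response for a typed prefix."""
    prefix = query.lower()
    # Naive case-insensitive prefix match over a list of page titles.
    matches = [t for t in titles if t.lower().startswith(prefix)]
    # Descriptions are optional in the format; leave them empty here.
    descriptions = ["" for _ in matches]
    urls = [base_url + t.replace(" ", "_") for t in matches]
    return json.dumps([query, matches, descriptions, urls])
```

A browser that supports suggestions sends the partial query as you type and displays the second element of the array as the drop-down list.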
Posted in devel, wiki | 3 Comments »