So what’s in the job queue anyway?

Here’s what is in en.wikipedia.org’s job queue at the moment, broken down by job type:

job_cmd            count(*)
htmlCacheUpdate        31,147
refreshLinks       10,106,739
renameUser                119

Note that the current system allows duplicate entries to be put in the queue; the dupes are removed when the first one in the stack gets run. This makes the raw number of refreshLinks entries much higher than it “really” is: Talk:Union Station (Louisville) is listed 9 times, presumably once for each template edit that triggered an “update me!” job.
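To get a feel for how much duplicates inflate the raw count, here is a rough sketch that compares raw and distinct entries per job type. It runs against a toy in-memory SQLite table rather than the real database; the two-column table layout, the sample rows, and the `queue_breakdown` helper are all made up for illustration, and treating "same title" as "same job" is a simplifying assumption.

```python
import sqlite3


def queue_breakdown(conn):
    """Return {job_cmd: (raw_count, distinct_title_count)} for the toy job table."""
    rows = conn.execute(
        """
        SELECT job_cmd,
               COUNT(*)                  AS raw_count,
               COUNT(DISTINCT job_title) AS distinct_count
        FROM job
        GROUP BY job_cmd
        ORDER BY job_cmd
        """
    ).fetchall()
    return {cmd: (raw, distinct) for cmd, raw, distinct in rows}


# Toy data (assumption): one page queued 9 times, plus two single entries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (job_cmd TEXT, job_title TEXT)")
sample = (
    [("refreshLinks", "Talk:Union Station (Louisville)")] * 9
    + [("refreshLinks", "Example page"), ("htmlCacheUpdate", "Example page")]
)
conn.executemany("INSERT INTO job VALUES (?, ?)", sample)

breakdown = queue_breakdown(conn)
for cmd, (raw, distinct) in breakdown.items():
    print(f"{cmd}: {raw} raw, {distinct} distinct")
```

In this toy data the refreshLinks queue holds 10 rows but only 2 distinct pages, which is the same kind of gap the raw numbers above are hiding.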

Update: Figured out why the queues had been growing so big over the last few days: the system clock was 7 seconds slow on the database master. This made the replication lag detection misread a minimum lag of 7 seconds on every slave, so the job queue batch runners were all sitting waiting for the lag to resolve. :)
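The arithmetic behind this is simple enough to sketch. The following is only an illustration of the effect, not the actual lag-detection code: it assumes lag is estimated by comparing a slave's (correct) clock with a timestamp stamped by the master, and the 5-second `MAX_LAG` threshold and both function names are hypothetical.

```python
MAX_LAG = 5.0  # hypothetical threshold (seconds) above which runners wait


def apparent_lag(true_lag, master_clock_offset):
    """Lag as seen by a slave whose own clock is correct.

    An event written at real time t is stamped t + master_clock_offset by
    the master. The slave applies it at real time t + true_lag and compares
    its own clock with the stamp, so the offset leaks into the result.
    """
    return true_lag - master_clock_offset


def runner_may_proceed(lags, max_lag=MAX_LAG):
    """A batch runner proceeds only if every slave is within the threshold."""
    return max(lags) <= max_lag


true_lags = [0.0, 0.5, 1.0]  # made-up real replication lags, in seconds

# Master clock 7 seconds slow: every slave appears at least 7 seconds behind.
skewed = [apparent_lag(lag, -7.0) for lag in true_lags]
# Master clock in sync: the apparent lag equals the true lag.
synced = [apparent_lag(lag, 0.0) for lag in true_lags]

print(skewed, runner_may_proceed(skewed))
print(synced, runner_may_proceed(synced))
```

With the slow clock, even a slave that is fully caught up reports 7 seconds of lag, so under any threshold below 7 seconds the runners would never get to start.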

Resynced the clock (it presumably drifted during the period when some IPs were broken), and things are moving again.

One Response to “So what’s in the job queue anyway?”

  1. MaxSem Says:

    Are there any statistics about which edits cause the most queue load? We may want to protect heavily used templates to avoid putting extra load on the servers due to not-really-needed edits.
