Artbox Diagnostics

Disclaimer: Proximity Corporation provided the information in this article and it was deemed accurate as of 30 May 2007. Apple Inc. is not responsible for the article's content. This article is provided as is and may or may not be updated in the future.

There are various tools available to help diagnose problems with the artbox storeserver.

Database Deadlocks

With multiple database sessions, it is possible for two sessions to deadlock each other because both perform queries that require the database to acquire particular locks, and neither request can be satisfied. For example, one session has a lock on A and wants a lock on B, while the other session has a lock on B and wants a lock on A. Postgres will normally detect this condition after a short period and fail one query with an error, allowing the other to proceed, so in practice it's most likely that you'll just see the occasional error that reports a deadlock.
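The A/B example above can be pictured as a cycle in a "who is waiting for whom" graph. The following is a minimal sketch of that idea only; it is not how Postgres implements its detector, and the names find_deadlock, holds and wants are hypothetical.

```python
def find_deadlock(holds, wants):
    """Return the sessions that form a wait cycle, or None if there is none.

    holds: dict mapping lock name -> session that currently holds it
    wants: dict mapping session -> lock name it is blocked waiting for
    (Both structures are illustrative assumptions, not a Postgres API.)
    """
    for start in wants:
        path = []
        seen = set()
        session = start
        # Follow the chain: session -> lock it wants -> session holding that lock.
        while session in wants and session not in seen:
            seen.add(session)
            path.append(session)
            session = holds.get(wants[session])
        if session == start:
            return path  # the chain led back to where it began: a deadlock
    return None


# Session s1 holds A and wants B; session s2 holds B and wants A.
holds = {"A": "s1", "B": "s2"}
wants = {"s1": "B", "s2": "A"}
cycle = find_deadlock(holds, wants)
```

Failing the query of any one session in the returned cycle removes its entry from wants, which breaks the cycle and lets the other session proceed.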

To look for a database session deadlock, you can use:

    ps ax | grep postgres

This lists all currently running postgres processes. Each process has a short description of what it's currently doing in the ps output, which will be one of "idle" (i.e. no query pending), "idle in transaction" (i.e. a transaction has been started on this connection, but it's not currently processing a query) or the actual SQL command (e.g. SELECT, INSERT, etc.).
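If you want to summarize these states rather than read them by eye, something like the following sketch could help. It assumes each backend's process title has the form "postgres: user database host state"; the exact layout varies between Postgres versions and platforms, and the sample output, user name and database name below are made up for illustration.

```python
def postgres_states(ps_output):
    """Map PID -> state for each postgres process line in `ps ax` output.

    Assumes the title after "postgres:" looks like "user database host STATE".
    """
    states = {}
    for line in ps_output.splitlines():
        if "postgres:" not in line:
            continue  # skips the header line and the grep process itself
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue
        pid = int(fields[0])
        title = line.split("postgres:", 1)[1].strip()
        if title.endswith("idle in transaction"):
            state = "idle in transaction"
        elif title.endswith("idle"):
            state = "idle"
        else:
            parts = title.split()
            # Drop the assumed user, database and host fields.
            state = " ".join(parts[3:]) if len(parts) > 3 else title
        states[pid] = state
    return states


# Hypothetical sample of what the ps output might look like.
SAMPLE = """  PID  TT  STAT      TIME COMMAND
  412  ??  Ss     0:01.20 postgres: artbox artboxdb 127.0.0.1 idle
  413  ??  Ss     0:03.75 postgres: artbox artboxdb 127.0.0.1 idle in transaction
  414  ??  Rs     0:09.10 postgres: artbox artboxdb 127.0.0.1 SELECT
  500  s000 S+    0:00.00 grep postgres
"""
states = postgres_states(SAMPLE)
```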

If you see two connections that stay in the same SQL command state for a long period of time, that might be a deadlock. To be certain, you would need to poll ps repeatedly over quite a long time. If the deadlock doesn't resolve itself, restarting the storeserver should fix it.
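The polling described above can be reduced to comparing a series of snapshots. This sketch assumes you have already collected each snapshot as a dictionary of PID to state (the function name and data shape are hypothetical), and it reports connections that showed the same SQL command in every snapshot.

```python
IDLE_STATES = {"idle", "idle in transaction"}


def stuck_connections(snapshots):
    """Return {pid: state} for connections whose SQL command never changed.

    snapshots: list of dicts mapping PID -> state, oldest first.
    """
    if not snapshots:
        return {}
    # Start with every connection that was running a command in the first poll.
    candidates = {
        pid: state
        for pid, state in snapshots[0].items()
        if state not in IDLE_STATES
    }
    # Keep only those that show the identical state in every later poll.
    for snapshot in snapshots[1:]:
        candidates = {
            pid: state
            for pid, state in candidates.items()
            if snapshot.get(pid) == state
        }
    return candidates


polls = [
    {412: "UPDATE", 413: "UPDATE", 414: "idle"},
    {412: "UPDATE", 413: "UPDATE", 414: "SELECT"},
    {412: "UPDATE", 413: "UPDATE"},
]
suspects = stuck_connections(polls)
```

A long-running query looks exactly the same as a deadlocked one here, so treat the result as a hint rather than proof. Two or more suspects that persist across many polls are the pattern to look for.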

Internal Storeserver Deadlock

Much more common are internal storeserver deadlocks. The storeserver usually has multiple requests being serviced at once, and so has its own system for avoiding inconsistent simultaneous updates: it has its own locks, which can be a source of deadlocks. The other sources of deadlocks are the database connections and the connection to LDS.

If a particular request seems to have just stopped, a deadlock might be responsible.

You can get some idea of what is happening with ps ax | grep postgres, as before. Sometimes you can tell there's a deadlock if one or more connections are noted as "idle in transaction." This might mean that the task within the storeserver that should be sending the next request to that connection is waiting for something else. The things it might be waiting for include a database connection (which might itself be waiting for a database lock it can't get now), another internal lock, or an LDS request. However, a deadlock can happen without involving the database at all, so don't rely on ps for the whole story.

If a deadlock happens, the best way to diagnose what is actually happening is to dump the state of the storeserver, which you can do using:

This will produce a file pms_state.dump in the working directory of the storeserver executable (on a production artbox this is normally /usr/artbox/logs). The file contains a complete description of everything going on inside the storeserver at that instant: all the tasks and subtasks, what they are doing, what locks they hold, what locks they are waiting for, whether they have a database query in progress, whether they are inside a database transaction, and whether they have an LDS request in progress.

This command will now also get LDS to produce a similar dump file, which is useful for diagnosing problems where LDS seems not to be responding, or seems to be doing some things but ignoring others. The LDS dump file is called lds_state.dump and is written to the LDS working directory, which on a production artbox is also /usr/artbox/logs.

The file is laid out as an indented hierarchy, so you can see which tasks are children of other tasks. Generally speaking, a task will complete after all its children have completed, so any task that has children is likely to be waiting for one or more of its children to do something.
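Because parents generally wait on their children, the tasks with no children are usually the interesting ones. The sketch below lists those leaf tasks together with their chain of parents. The dump's exact line format is not described here, so this assumes one task per line with nesting shown by leading spaces; the function name and the task labels in the sample are hypothetical.

```python
def leaf_tasks(dump_text):
    """Return one path (root task down to leaf task) per task without children.

    Assumes one task per line, with nesting depth shown by leading spaces.
    """
    entries = []
    for raw in dump_text.splitlines():
        if raw.strip():
            indent = len(raw) - len(raw.lstrip(" "))
            entries.append((indent, raw.strip()))

    leaves = []
    stack = []  # (indent, label) for the current chain of ancestors
    for i, (indent, label) in enumerate(entries):
        # Pop back up to this line's parent.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        stack.append((indent, label))
        is_last = i + 1 == len(entries)
        # A task is a leaf if the next line is not indented more deeply.
        if is_last or entries[i + 1][0] <= indent:
            leaves.append([name for _, name in stack])
    return leaves


# Made-up sample; a real dump will contain different labels and details.
SAMPLE_DUMP = """request 1
  subtask a
    db query
  subtask b
request 2
"""
paths = leaf_tasks(SAMPLE_DUMP)
```

Reading each path from left to right shows which request a blocked task belongs to, which makes it easier to match up a task that holds a lock with the task that is waiting for it.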

To fix a deadlock, restart the storeserver. Make sure to capture the state dump before restarting, so that it is available for further troubleshooting.
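Once the dump files exist, it may be worth copying them to timestamped names before the restart, in case a later dump writes to the same file names. This is a minimal sketch: the file names and the default directory come from this article, while the function name and the naming scheme for the copies are assumptions.

```python
import shutil
import time
from pathlib import Path


def preserve_dumps(log_dir="/usr/artbox/logs",
                   names=("pms_state.dump", "lds_state.dump"),
                   stamp=None):
    """Copy any existing dump files to '<name>.<stamp>' in the same directory.

    Returns the list of copies that were created.
    """
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    saved = []
    for name in names:
        src = Path(log_dir) / name
        if src.is_file():
            dst = src.with_name(f"{name}.{stamp}")
            shutil.copy2(src, dst)  # copy2 also keeps the modification time
            saved.append(dst)
    return saved
```

Calling preserve_dumps() after producing the dump and before restarting the storeserver leaves the original files in place and adds dated copies alongside them.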

Published Date: Feb 20, 2012