Deadlocks


What are deadlocks?

A
deadlock is the situation where you have two, or more, Oracle "sessions" (well, transactional "states") competing for mutually locked resources. There are a few reasons why your application may experience deadlocks, most of which are caused by application design and coding practices. However, there are a few situations when, due to certain architectural design decisions, you may experience deadlocks simply due to the internal mechanisms of Oracle itself. The architect and developer should be aware of the situations where they may occur so that they can be considered and avoided.

Deadlocks cannot be allowed to occur since they are basically an infinite loop and cannot be resolved without intervention. There are a few different causes of deadlocks but in all cases, a session will raise an ORA-00060, e.g.
ORA-00060: deadlock detected while waiting for resource
In the situations where there is only one session, it will obviously be that session that has the ORA-00060 raised. In the situations involving two sessions (such as the ABBA deadlock, see later), Oracle automatically detects the deadlock and resolves it by causing one of the sessions (can't actually guarantee which one) to receive an ORA-00060 error. What about the other session? Well, this may be surprising to some people, but it carries on waiting, because the ORA-00060 error does not automatically cause the session to commit or rollback, and hence it retains it's previous lock(s), including those involved in the deadlock scenario. It is the responsibility of the client to commit or rollback if an ORA-00060 error is encountered.

As with all deadlock scenarios, it's about avoidance rather than remedy, e.g. for the classic ABBA deadlock situation, you have to ensure that sessions do their operations in the same "order", i.e. if a process updates table t1 then t2, that any other process does so in the same order.

Detecting and identifying deadlocks

The obvious way of detecting a deadlock is when you see the ORA-00060 error, however it may not be obvious what situation actually causes the deadlock to manifest. Fortunately, whenever a deadlock occurs, Oracle creates a trace file in the current
Application Diagnostic Repository (ADR) (see V$DIAG_INFO for details) with information which will allow you to see details about the deadlock, what the shared resource is, what the current SQL statement was, etc.

The "classic" ABBA deadlock

If session A is locking row 1, session B locks row 2, then session A attempts to lock row 2 (which will block since session B has the lock on that row), and then session B attempts to lock row 1 (which will block since session A has the lock on that row). In this situation, session A is waitingcfor session B, and session B is waiting on session A, which of course will never be resolved - hence the term ABBA deadlock.

Here's an example:
SQL> CREATE TABLE t ( a varchar2(10) );

Table created.

SQL> insert into t values ( 'x' );

1 row created.

SQL> insert into t values ( 'y' );

1 row created.

SQL> commit;

Commit complete.
Now, session A attempts to delete row 'x', but doesn't commit :
SESS1> delete from t where a = 'x';

1 row deleted.
Now, session B attempts to delete row 'y', again, not committing :
SESS2> delete from t where a = 'y';

1 row deleted.
Now, since session B has not committed the delete against row 'y', session A can still "see" it (and attempt to delete it) :
SESS1> delete from t where a = 'y';
However, this will be blocked by session B since row 'y' is locked, so session A is holding the lock on row 'x' but is waiting for row 'y'. Now, session B attempts to do it the "other way around", i.e. attempting to delete row 'x', while it retains the lock on row 'y' :
SESS2> delete from t where a = 'x';
this will be blocked by session A (and hence wait). So, the situation is tied, session B is waiting for session A and session A is waiting for session B.

The following is a deadlock trace file extract for the above situation
DEADLOCK DETECTED
[Transaction Deadlock]
Current SQL statement for this session:
delete from t where a = 'x'
The following deadlock is not an ORACLE error. It is a
deadlock due to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:
Deadlock graph:
                        ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-00070023-0000009c        18      27     X             20      38           X
TX-00060021-0000008d        20      38     X             18      27           X
session 27: DID 0001-0012-00000011  session 38: DID 0001-0014-00000005
session 38: DID 0001-0014-00000005  session 27: DID 0001-0012-00000011
Rows waited on:
Session 38: obj - rowid = 0000354E - AAADVOAAEAAAAAPAAB
    (dictionary objn - 13646, file - 4, block - 15, slot - 1)
Session 27: obj - rowid = 0000354E - AAADVOAAEAAAAAPAAA
    (dictionary objn - 13646, file - 4, block - 15, slot - 0)
Note, the fragment (dictionary objn - 13646, file - 4, block - 15, slot - 1), the objn - 13646 relates to the object_id which can be used if necessary to query DBA_OBJECTS.

Deadlocks with Autonomous Transactions

As mentioned in the previous section, all deadlocks are variations on the same theme, i.e. multiple "sessions" competing for mutually locked resources. Since an Autonomous Transaction (AT) is basically a "session within a session", it's perfectly possible for it to occur here as well. This is maybe a little less obvious, since some people think that somehow they're part of the same "parent" session, but this is not the case.

Two types of deadlock are seen with autonomous transactions, TX X X and TX X S - both named after what locks are held and are seen in the deadlock graph.

TX X X deadlock

This is the situation where the parent transaction has an exclusive (X) lock on a shared resource and the autonomous transaction tries to get an exclusive lock on the same resource, e.g.
SQL> CREATE OR REPLACE PROCEDURE p_at
  2  AS
  3  PRAGMA AUTONOMOUS_TRANSACTION;
  4  BEGIN
  5    DELETE FROM t;
  6    COMMIT;
  7  END p_at;
  8  /

Procedure created.
Now, put a single row into a table :
SQL> insert into t values ( 'x' );

1 row created.

SQL> commit;

Commit complete.
Now, delete the row in the "parent" session :
SQL> delete from t ;

1 row deleted.
Now, since the "parent" session is locking the row in "t", any attempt to delete it via an AT from within the "parent" session, will result in ORA-00060, i.e.
SQL> exec p_at;
BEGIN p_at; END;

*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "MARTIN.P_AT", line 5
ORA-06512: at line 1
So, why does the AT simply not block on the 'x' row? Well, if it did, imagine the situation, "parent" session is waiting for the AT to "finish" (like any other PL/SQL call), but it never will since it's blocked waiting for it's parent session to commit or rollback. The following trace extract is seen in the deadlock graph - noting one session entry in "Rows waited on" - a classic giveaway that ATs are involved:
session 1420: DID 0001-005E-00003E85    session 1420: DID 0001-005E-00003E85

Rows waited on:
        Session 1420: obj - rowid = 0000661C - AAA6H4AAHAAGuWxAAA
        (dictionary objn - 26140, file - 7, block - 1762737, slot - 0)    

Deadlock graph:
                                          ------------Blocker(s)------------  ------------Waiter(s)-------------
Resource Name                             process session holds waits serial  process session holds waits serial
TX-001B0006-0005D067-00000000-00000000         94    1420     X        24509       94    1420           X  24509
As you can see, a TX deadlock occurs with the blocker holder a X lock and the waiter trying to obtain an X lock - in this case, waiter and blocker are the same session due to the AT usage.

TX X S deadlock

This is the situation where the parent transaction has an exclusive (X) lock on a shared resource and the autonomous transaction tries to get a share lock on the same resource. This usually occurs with foreign key maintenance of the child table to the parent table when the parent record has not been committed. in this situation, it does not matter if there is an index on the child table - it is the share (S) lock attempt of the constraint that causes the deadlock.
SQL> CREATE TABLE mc_parent_table ( p_pk NUMBER PRIMARY KEY );

SQL> CREATE TABLE mc_child_table  ( c_pk NUMBER PRIMARY KEY, c_p_pk NUMBER, CONSTRAINT c_p_fk FOREIGN KEY (c_p_pk) REFERENCES mc_parent_table (p_pk) );
Now, create an AT that inserts into the child table for a given parent record:
CREATE OR REPLACE PROCEDURE mc_at( p_pk IN mc_parent_table.p_pk%TYPE )
IS
  PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN    
  INSERT INTO mc_child_table VALUES ( 1, p_pk );
  COMMIT;
END mc_at;
Then, insert a row into the parent table and (without committing), call the AT:
SQL> INSERT INTO mc_parent_table VALUES ( 1 );

SQL> EXEC mc_at(1);

BEGIN mc_at(1); END;

*
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at "HOU.MC_AT", line 5
ORA-06512: at line 1
The following is the resultant TX X S deadlock graph - again noting that there is only one "Rows waited on" entry and both blocker and waiter are the same Oracle session:
session 1420: DID 0001-005E-00003E85    session 1420: DID 0001-005E-00003E85

Rows waited on:
      Session 1420: obj - rowid = 0000661C - AAA6H4AAHAAGuWxAAA
      (dictionary objn - 26140, file - 7, block - 1762737, slot - 0)    

Deadlock graph:
                                          ------------Blocker(s)------------  ------------Waiter(s)-------------
Resource Name                             process session holds waits serial  process session holds waits serial
TX-001B0006-0005D067-00000000-00000000         94    1420     X        24509       94    1420           S  24509

Deadlocks with Bitmap Indexes

Bitmap indexes were designed to be used solely within data warehouses, i.e. where the vast majority of the database activity is reading data, and there's very little (or no) data modification, except for batch processes which occasionally re-populate the warehouse. Each "row" in the bitmap index contains references to potentially many different rowids, in contrast to a B*-tree index which references a single rowid. It should be obvious, therefore, that, since the transactional mechanism is the same for all database operations, that any DML on a table which impacts the bitmap index may end up locking (or attempting to lock) many different "rows" within the index. This is the key concept with deadlocks in bitmap indexes, you're not being deadlocked on the underlying table, but on the index blocks.

The ability of an INSERT statement to deadlock (which is usually impossible) is therefore explained in that the bitmap index is maintaining other rows as a result of the DML.

Fortunately, it's easy to detect these kind of deadlocks, because the trace file has a very particular format :
Deadlock graph:
                       ---------Blocker(s)--------  ---------Waiter(s)---------
Resource Name          process session holds waits  process session holds waits
TX-00080027-0000d2a1        12      37     X             15      35           S
TX-000a0016-0000d6d2        15      35     X             12      37           S
session 37: DID 0001-000C-00000002  session 35: DID 0001-000F-00000002
session 35: DID 0001-000F-00000002  session 37: DID 0001-000C-00000002
Rows waited on:
Session 35: no row
Session 37: no row
Note, the "no row" entries. This is the prime indicator that this is a bitmap index deadlock. There's very little you can do about this, apart from not use bitmap indexes in a DML environment.

Deadlocks with unindexed foreign keys

There are various attributes which can be specified when creating foreign key constraints. The ones of interest are UPDATE and DELETE CASCADE. These attributes define what happens to the child record(s) when various actions are performed on the parent record.

The upshot of the deadlock problem is that in these situations (i.e. using the UPDATE or DELETE CASCADE options of the FK constraint), without an index on the foreign key column(s), Oracle has no option but to attempt to get a table lock on the child table while it performs it's maintenance. In this situation, it's obvious that the potential for deadlocking is vastly increased for having to lock the whole child table.

Fortunately, resolving this is quite straightforward, you simply put an index on the foreign key column(s).

For more information about this (and other performance related issues regarding unindexed FKs), see
https://asktom.oracle.com/Misc/httpasktomoraclecomtkyteunindex.html.