Pros & Cons of TRUNCATE vs DELETE (PostgreSQL)
TL;DR Truncate is fast and quickly reclaims disk space, but requires exclusive access to the table. This may cause problems, so if you cannot guarantee that others will not try to access the table, it is better to use DELETE instead.
When there is a need to clean up a whole database table, some may tend to use the TRUNCATE method, due to its performance.
The difference is significant. I’ve done some tests with both DELETE and TRUNCATE methods and got the following results:
-- Created a table truncate_test with 3.026.131 rows-- Created a table delete_test also with 3.026.131 rowsTRUNCATE test_schema.truncate_test; ---> Took 337 ms
DELETE FROM test_schema.delete_test; ---> Took 4.46 s
LOCKS
A fundamental concept that must be known before talking about locks is the idea of ACID.
ACID
ACID is the acronym for Atomicity, Consistency, Isolation, and Durability. These are four fundamental properties to ensure that transactions in a database occur in an expected and secure manner.
If you want to deep dive into ACID and transactions you can find a lot about it on the internet.
Back into Locks
One of the most known tools to provide ACID properties is the famous locks. Locks are probably used on all database systems, and with PostgreSQL, this is no different.
Locks are also used as a synchronization technique in the context of multi-threaded programs, this way when different threads are trying to access a shared memory area, only one of them can access this critical zone at once, preventing the occurrence of race conditions.
PostgreSQL Locks
PostgreSQL has almost ten different kinds of locks to protect its data during concurrent transactions. And the key principle here is that some locks may conflict with others.
This means that if two concurrent transactions are requiring table locks to proceed, and the required kind of locks conflict one with the other, the transactions gonna be serialized and only one will occur at once.
It’s not always true that different transactions holding locks cannot access the same database table mutually. This restriction is only valid if the kind of lock held by each transaction conflicts with another.
The table below shows us the different kinds of locks available on a PostgreSQL database, and how each of them coexists with another.
TRUNCATE (ACCESS EXCLUSIVE LOCK)
According to PostgreSQL documentation, the access exclusive lock is the most restrictive one:
ACCESS EXCLUSIVE LOCK Conflicts with locks of all modes (
ACCESS SHARE
,ROW SHARE
,ROW EXCLUSIVE
,SHARE UPDATE EXCLUSIVE
,SHARE
,SHARE ROW EXCLUSIVE
,EXCLUSIVE
, andACCESS EXCLUSIVE
). This mode guarantees that the holder is the only transaction accessing the table in any way.Acquired by the
DROP TABLE
,TRUNCATE
,REINDEX
,CLUSTER
,VACUUM FULL
, andREFRESH MATERIALIZED VIEW
(withoutCONCURRENTLY
) commands. Many forms ofALTER INDEX
andALTER TABLE
also acquire a lock at this level. This is also the default lock mode forLOCK TABLE
statements that do not specify a mode explicitly.
So, besides the fast performance provided by the TRUNCATE, there is the exclusive lock downside. The transaction trying to delete all rows of the table must wait for all ongoing read/write operations to finish, and once started, no other operation can happen until the truncate is completely done.
VACUUM
After data removal, the disk space is not immediately reclaimed. PostgreSQL has a VACUUM functionality that acts as a garbage collector and physically reclaims disk space occupied by removing dead tuples.
It’s interesting to notice that the truncate operation is always followed by a vacuum. So, the truncate operation has the advantage of quickly reclaiming disk space. With DELETE/UPDATE operations this is not true, and at some point, the vacuum will be required.
Conclusion
TRUNCATE advantages:
- Fast
- Reclaims disk space immediately
TRUNCATE disadvantage:
- Holds an access exclusive lock
DELETE advantage:
- Does not lock the entire table
DELETE disadvantages:
- Can be slow on large tables
- Does not reclaim disk space