Tag Archives: PostgreSQL free space map FSM explained

How SELECT, INSERT, UPDATE, and DELETE Work Internally in PostgreSQL

 

How SELECT, INSERT, UPDATE, and DELETE work internally in PostgreSQL?

PostgreSQL uses MVCC (Multi-Version Concurrency Control), which means every modification creates a new version of the row instead of overwriting the old one. This is the core concept behind all DML operations.

1. Common Foundation (Applies to All Operations)

  • Heap: Data is stored in heap files (tables). Each row is a tuple.
  • Tuple Header: Every row contains:
    • xmin: Transaction ID that created the row
    • xmax: Transaction ID that deleted/updated the row (null if active)
    • Other visibility information
  • WAL (Write-Ahead Log): All changes are first written to WAL for durability before being applied to data files.
  • Shared Buffers: Data pages are read into memory before being modified.
  • Visibility Rules: A row is visible to a transaction based on its snapshot (which transactions it can see).

2. How SELECT Works

          Visibility Check: For each row, PostgreSQL checks:

    • The row’s xmin must be committed and visible to the current transaction.
    • The row’s xmax must be either null or not visible to the current transaction.

 


3. How INSERT Works

  1. Transaction Starts: A new transaction ID (xid) is assigned.
  2. Buffer Allocation: PostgreSQL finds a free slot in a heap page (or allocates a new page).
  3. Tuple Creation: A new row is created with:
    • xmin = current transaction ID
    • xmax = null
  4. WAL Logging: The change is written to the WAL buffer first.
  5. Page Modification: The new tuple is inserted into the shared buffer.
  6. Commit: On commit, the WAL is flushed to disk (fsync), making the change durable.

4. How UPDATE Works

UPDATE in PostgreSQL does not modify the existing row. It creates a new version of the row.

  1. Find the row using the same visibility rules as SELECT.
  2. Mark old row as dead: Set xmax = current transaction ID on the old tuple.
  3. Insert new row: Create a completely new tuple with:
    • xmin = current transaction ID
    • xmax = null
    • Updated column values
  4. Update indexes (if needed): New index entries point to the new tuple.
  5. WAL Logging: Both the old row’s xmax change and the new tuple are logged in WAL.
  6. Commit: Changes become visible to other transactions after commit.

Important: The old row version remains in the table until VACUUM cleans it up.


5. How DELETE Works

DELETE also uses MVCC — it doesn’t remove the row immediately.

  1. Locate the row using visibility rules.
  2. Mark as deleted: Set xmax = current transaction ID on the tuple.
  3. WAL Logging: The change is logged.
  4. Commit: The row becomes invisible to new transactions.
  5. Physical Removal: The row is not physically deleted from disk until VACUUM runs.

Caution: Your use of any information or materials on this website is entirely at your own risk. It is provided for educational purposes only. It has been tested internally, however, we do not guarantee that it will work for you. Ensure that you run it in your test environment before using.

Thank you,
Rajasekhar Amudala
Email: br8dba@gmail.com
Linkedin: https://www.linkedin.com/in/rajasekhar-amudala/