Skip to content

Latest commit

 

History

History
66 lines (52 loc) · 3.33 KB

pg-internal: mvcc.md

File metadata and controls

66 lines (52 loc) · 3.33 KB

pg-internal: MVCC

MVCC is short for Multi-version Concurrency Control. It means multiple clients can access the database concurrently. Each will see a consistent snapshot of data and read won't block write while write won't block read. Other database like Oracle, mysql also support MVCC but the implementation may be different.

MVCC implementation

For each row, postgres won't overwrite the row inplace when updating. It will mark the old row as deleted and insert new row in other place. By this way, old transaction can use old row and new transaction will use new row for same logical row at the same time.

Then a simple row (id=1, name=James) may have many physical rows in database files if it was updated multiple times. But at a specific point, only one physical row is visible for a transaction. So how to know whether a row is visible or not?

For each row (tuple), postgres added some metadata fields, such as t_ctid, t_xmin, t_xmax and other fields. t_ctid is physical index of each tuple. For example, (0, 1) means first tuple of first page. t_ctid may point to new tuple when deprecated.

The t_xmin is the ID of transaction that inserted itself. And the t_xmax is the ID of transaction that deprecated itself by deleting or updating. 0 means not deprecated.

Remember that the transaction may be ROLLBACKed. So you need to check clog to get status of the transaction. If the transaction of the t_xmin is rollbacked, then it means the inserting is rollbacked. And same to t_xmax.

So with t_xmin and t_xmax and other info like transaction status, we can decided that whether a tuple is visible to current transaction.

We can select ctid, xmin and xmax as normal fields but they are not contained in the *.

postgres@localhost:test1> select ctid,xmax,xmin, * from t1;
+--------+---------+---------+------+---------------+
| ctid   | xmax    | xmin    | id   | name          |
|--------+---------+---------+------+---------------|
| (0,1)  | 0       | 9219118 | 1    | abc           |
| (0,2)  | 9219122 | 9219121 | 2    | test rollback |
+--------+---------+---------+------+---------------+

Since it is a normal sql, it cannot see invalid tuples. We can use the extension pageinspect to get raw tuple info of page.

postgres@localhost:test1> create extension pageinspect;    -- install the extension

postgres@localhost:test1> SELECT lp,lp_off,lp_len,t_xmin,t_xmax,t_ctid,t_infomask2,t_infomask FROM heap_page_items(get_raw_page('t1', 0))
+------+----------+----------+----------+----------+----------+---------------+--------------+
| lp   | lp_off   | lp_len   | t_xmin   | t_xmax   | t_ctid   | t_infomask2   | t_infomask   |
|------+----------+----------+----------+----------+----------+---------------+--------------|
| 1    | 8160     | 32       | 9219118  | 0        | (0,1)    | 2             | 2306         |
| 2    | 8112     | 42       | 9219121  | 9219122  | (0,3)    | 16386         | 2306         |
| 3    | 8064     | 48       | 9219122  | 0        | (0,3)    | 32770         | 10754        |
+------+----------+----------+----------+----------+----------+---------------+--------------+

With this, we can see tuples that rollbacked or replaced.

references