The Internal Structure of PostgreSQL Database Tables
Posted by kuang17
A page within a table contains three kinds of data described as follows:
- heap tuple(s) – A heap tuple is the record data itself. Heap tuples are stacked in order from the bottom of the page. The internal structure of a tuple is described in Section 5.2 and Chapter 9, because knowledge of both Concurrency Control (CC) and WAL in PostgreSQL is required.
- line pointer(s) – A line pointer is 4 bytes long and holds a pointer to a heap tuple. It is also called an item pointer. Line pointers form a simple array that serves as an index to the tuples. Each entry is numbered sequentially from 1, and this number is called the offset number. When a new tuple is added to the page, a new line pointer is also appended to the array to point to it.
- header data – Header data, defined by the structure PageHeaderData, is allocated at the beginning of the page. It is 24 bytes long and contains general information about the page. The major variables of the structure are described below.
  - pd_lsn – This variable stores the LSN of the XLOG record written by the last change to this page. It is an 8-byte unsigned integer and is related to the WAL (Write-Ahead Logging) mechanism. The details are described in Chapter 9.
  - pd_checksum – This variable stores the checksum value of this page. (Note that this variable is supported in version 9.3 or later; in earlier versions, this field stored the timelineId of the page.)
  - pd_lower, pd_upper – pd_lower points to the end of the line pointers, and pd_upper points to the beginning of the newest heap tuple.
  - pd_special – This variable is for indexes. In a table page, it points to the end of the page. (In an index page, it points to the beginning of the special space, a data area held only by indexes that contains data specific to the index type, such as B-tree, GiST, GIN, etc.)
The empty space between the end of the line pointers and the beginning of the newest tuple is referred to as free space, or a hole.
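The page layout described above can be made concrete with a small sketch. The Python code below is an illustration only, not PostgreSQL code. It assumes the default 8192-byte block size and a little-endian machine. The header field order (including pd_flags, pd_pagesize_version and pd_prune_xid, which the text above does not cover) follows PageHeaderData in PostgreSQL's bufpage.h, and the bit layout of a line pointer (lp_off, lp_flags, lp_len) is the one typically seen on little-endian platforms. The helper names and the synthetic page builder are hypothetical, and tuple alignment is ignored.

```python
import struct

PAGE_SIZE = 8192                 # default block size (assumption of this sketch)
# Field order follows PageHeaderData; "<" assumes little-endian and no padding.
HEADER_FMT = "<IIHHHHHHI"
HEADER_SIZE = struct.calcsize(HEADER_FMT)   # 24 bytes, as stated in the text
LINE_POINTER_SIZE = 4


def parse_page_header(page: bytes) -> dict:
    """Decode the 24-byte header at the start of a page."""
    (xlogid, xrecoff, pd_checksum, pd_flags, pd_lower, pd_upper,
     pd_special, pd_pagesize_version, pd_prune_xid) = struct.unpack_from(
        HEADER_FMT, page, 0)
    return {
        "pd_lsn": (xlogid << 32) | xrecoff,   # 8-byte LSN stored as two halves
        "pd_checksum": pd_checksum,
        "pd_lower": pd_lower,                 # end of the line pointer array
        "pd_upper": pd_upper,                 # start of the newest heap tuple
        "pd_special": pd_special,             # end of page for table pages
    }


def line_pointer_count(header: dict) -> int:
    """Number of line pointers between the header and pd_lower."""
    return (header["pd_lower"] - HEADER_SIZE) // LINE_POINTER_SIZE


def free_space(header: dict) -> int:
    """Size of the hole between the line pointers and the newest tuple."""
    return header["pd_upper"] - header["pd_lower"]


def read_line_pointer(page: bytes, offset_number: int) -> tuple:
    """Return (lp_off, lp_len) for a 1-based offset number."""
    pos = HEADER_SIZE + (offset_number - 1) * LINE_POINTER_SIZE
    (item,) = struct.unpack_from("<I", page, pos)
    # Assumed bit layout: lp_off (15 bits), lp_flags (2 bits), lp_len (15 bits).
    return item & 0x7FFF, item >> 17


def make_demo_page(tuple_lengths) -> bytes:
    """Build a synthetic table page (hypothetical helper for this sketch)."""
    page = bytearray(PAGE_SIZE)
    lower, upper = HEADER_SIZE, PAGE_SIZE
    for length in tuple_lengths:
        upper -= length                       # tuples stack from the bottom
        item = upper | (1 << 15) | (length << 17)   # lp_flags = 1 (in use)
        struct.pack_into("<I", page, lower, item)
        lower += LINE_POINTER_SIZE            # line pointers grow from the top
    struct.pack_into(HEADER_FMT, page, 0,
                     0, 0, 0, 0, lower, upper, PAGE_SIZE, PAGE_SIZE | 4, 0)
    return bytes(page)


if __name__ == "__main__":
    header = parse_page_header(make_demo_page([32, 40]))
    print(line_pointer_count(header), free_space(header))
```

With two demo tuples of 32 and 40 bytes, pd_lower moves to 32 and pd_upper to 8120, so the sketch reports two line pointers and 8088 bytes of free space. On a real server, the pageinspect extension exposes the same header fields without manual decoding.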
To identify a tuple within the table, a tuple identifier (TID) is used internally. A TID comprises a pair of values: the block number of the page that contains the tuple, and the offset number of the line pointer that points to the tuple. A typical example of its use is in indexes. See Section 1.4.2 for more detail.
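As a rough illustration of how a TID maps to a location, the following Python sketch computes the byte position of the line pointer that a TID refers to. It assumes 8192-byte pages, a 24-byte page header, 4-byte line pointers, block numbers counted from 0, and a table stored in a single file (large tables are split into multiple segment files, which this sketch ignores). The names TID, parse_tid and line_pointer_position are hypothetical; the "(block,offset)" text form is how PostgreSQL displays the ctid system column.

```python
from typing import NamedTuple

PAGE_SIZE = 8192          # default block size (assumption)
HEADER_SIZE = 24          # size of the page header
LINE_POINTER_SIZE = 4     # size of one line pointer


class TID(NamedTuple):
    block: int    # block number of the page, counted from 0
    offset: int   # offset number of the line pointer, counted from 1


def parse_tid(text: str) -> TID:
    """Parse the text form of a TID, e.g. "(0,1)"."""
    block, offset = text.strip("() ").split(",")
    return TID(int(block), int(offset))


def line_pointer_position(tid: TID) -> int:
    """Byte position, within the table file, of the line pointer for tid."""
    if tid.block < 0 or tid.offset < 1:
        raise ValueError("block numbers start at 0, offset numbers at 1")
    page_start = tid.block * PAGE_SIZE
    return page_start + HEADER_SIZE + (tid.offset - 1) * LINE_POINTER_SIZE


if __name__ == "__main__":
    tid = parse_tid("(2,3)")
    print(tid, line_pointer_position(tid))
```

Reaching the tuple itself takes one more step: read the line pointer at that position and follow the offset stored in it to the tuple's place within the same page.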
While the HeapTupleHeaderData structure contains seven fields, only four of them are needed in the subsequent sections.
- t_xmin holds the txid of the transaction that inserted this tuple.
- t_xmax holds the txid of the transaction that deleted or updated this tuple. If this tuple has not been deleted or updated, t_xmax is set to 0, which means INVALID.
- t_cid holds the command id (cid), which is the number of SQL commands executed before this command within the current transaction, counting from 0. For example, assume that we execute three INSERT commands within a single transaction: 'BEGIN; INSERT; INSERT; INSERT; COMMIT;'. If the first command inserts this tuple, t_cid is set to 0. If the second command inserts it, t_cid is set to 1, and so on.
- t_ctid holds the tuple identifier (tid) that points to itself or a new tuple. tid, described in Section 1.3, is used to identify a tuple within a table. When this tuple is updated, the t_ctid of this tuple points to the new tuple; otherwise, the t_ctid points to itself.
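The behaviour of these four fields can be traced with a toy model. The Python sketch below is a simplified illustration, not how PostgreSQL is implemented: it keeps tuple headers in a dictionary for a single page, ignores visibility rules, commit status and vacuuming, and the class and method names (TupleHeader, ToyPage, insert, update, delete) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Tid = Tuple[int, int]   # (block number, offset number)


@dataclass
class TupleHeader:
    t_xmin: int   # txid that inserted this tuple
    t_xmax: int   # txid that deleted or updated it; 0 means INVALID
    t_cid: int    # command id within the inserting transaction, from 0
    t_ctid: Tid   # points to itself, or to the newer version after an update


class ToyPage:
    """A one-page model that tracks tuple headers only (no tuple data)."""

    def __init__(self, block: int = 0) -> None:
        self.block = block
        self.tuples: Dict[Tid, TupleHeader] = {}

    def insert(self, txid: int, cid: int) -> Tid:
        # A new tuple gets the next offset number and points to itself.
        tid = (self.block, len(self.tuples) + 1)
        self.tuples[tid] = TupleHeader(txid, 0, cid, tid)
        return tid

    def delete(self, tid: Tid, txid: int) -> None:
        # Deleting only records the deleting txid; the tuple stays in place.
        self.tuples[tid].t_xmax = txid

    def update(self, tid: Tid, txid: int, cid: int) -> Tid:
        # An update inserts a new version and links the old one to it.
        new_tid = self.insert(txid, cid)
        old = self.tuples[tid]
        old.t_xmax = txid
        old.t_ctid = new_tid
        return new_tid


if __name__ == "__main__":
    page = ToyPage()
    first = page.insert(txid=100, cid=0)    # first INSERT of transaction 100
    second = page.insert(txid=100, cid=1)   # second INSERT, so t_cid is 1
    newer = page.update(first, txid=101, cid=0)
    for tid, header in page.tuples.items():
        print(tid, header)
```

After the update, the old version at (0,1) has t_xmax set to 101 and t_ctid pointing to (0,3), while the new version at (0,3) has t_xmin 101, t_xmax 0, and a t_ctid that points to itself.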
Source: http://www.interdb.jp/pg/pgsql01.html