Skip to content

Columns and Lists

Pranjal Gupta edited this page Mar 14, 2021 · 1 revision

Columns and Lists are two primary data structures that we use to store node and rel properties as well as adjacency lists in the system.

Columns

A Column is simply an array with a value (of the same data type) at each offset. Hence, given an offset, a Column gives constant-time random access to its associated value.

Lists

Lists can be imagined as the extension to Columns where multiple values can be associated per offset. Hence, given an offset, a Lists gives constant-time random access to the associated list that contains an array of values (of the same data type). We use a 2-level CSR structure to organize data in Lists. We logically differentiate lists into the following types:

  • Small List: A list is defined as small if it can fit into one system page, that is in 4KB. For example, if elements of lists occupy 4Bytes each, then we say that a list is small is it contains less than or equal to 1024.
  • Large List: If it is not a small list.

Lists are also logically divided into chunks and all the small lists in a chunk are contiguously stored in the file. The size of the chunk is fixed in the system to be 512. Large Lists of a chunk span multiple pages and are stored outside the chunk, though contiguously.

We further go on to group offsets in Lists into chunks of 512. All the lists of a single chunk are stored contiguously in the file With each lists, we use 2 auxiliary structures that contains information about where to find the data (which is a list) in the lists data structure.

  • Headers: There is one header per offset in the lists data structure. A header contains the
  • ListsMetadata
Clone this wiki locally