⚠️ This project is currently in the planning stage. The documentation is only partially accurate and there are lots of bugs and missing features!
bitformat is a Python module for creating and parsing file formats, especially at the bit rather than byte level.
It is from the author of the widely used bitstring module.
- A bitformat is a specification of a binary format using fields that can say how to build it from supplied values, or how to parse binary data to retrieve those values.
- A wide array of data types is supported. Want to use a 13 bit integer or an 8-bit float? Fine - there are no special hoops to jump through.
- Several field types are available:
- The simplest is just a
Field
which contains a single data type, and either a single value or an array of values. These can usually be constructed from just a string. - A
Format
contains a list of other fields. These can be nested to any depth. - Fields like
Repeat
,Find
andCondition
can be used to add more logical structure.
- The simplest is just a
- The values of other fields can be used in later calculations via an f-string-like expression syntax.
- Data is always stored efficiently as a contiguous array of bits.
A quick example: the MPEG-2 video standard specifies a 'sequence_header' that could be defined in bitformat by
seq_header = Format(['sequence_header_code: hex32 = 0x000001b3',
'horizontal_size_value: u12',
'vertical_size_value: u12',
'aspect_ratio_information: u4',
'frame_rate_code: u4',
'bit_rate_value: u18',
'marker_bit: bool',
'vbv_buffer_size_value: u10',
'constrained_parameters_flag: bool',
'load_intra_quantizer_matrix: bool',
Repeat('{load_intra_quantizer_matrix}',
'intra_quantizer_matrix: [u8; 64]'),
'load_non_intra_quantizer_matrix bool',
Repeat('{load_non_intra_quantizer_matrix}',
'non_intra_quantizer_matrix: [u8; 64]'),
Find('0x000001')
], 'sequence_header')
To parse such a header you can write simply
seq_header.parse(some_bytes_object)
then you can access and modify the field values
seq_header['bit_rate_value'].value *= 2
before rebuilding the binary object
b = seq_header.build()
Copyright (c) 2024 Scott Griffiths