CoMux is short for Connection Multiplexing. It's a special file format I implemented as part of this project to specify the content of multiple socket connections in a single file.
Each comux file is formatted like so:
+---------------------------------------------------+
| MAGIC      | VERSION    | NUM_CONNS  | NUM_CHUNKS |
+------------+------------+------------+------------+
| C1_ID      | C1_LEN     | C1_SCHED   | C1_FLAGS   |
+------------+------------+------------+------------+
|                      C1_DATA                      |
+------------+------------+------------+------------+
| C2_ID      | C2_LEN     | C2_SCHED   | C2_FLAGS   |
+------------+------------+------------+------------+
|                      C2_DATA                      |
+------------+------------+------------+------------+
|                        ...                        |
+------------+------------+------------+------------+
| CN_ID      | CN_LEN     | CN_SCHED   | CN_FLAGS   |
+------------+------------+------------+------------+
|                      CN_DATA                      |
+---------------------------------------------------+
Below are notes and discussion about the layout of this file: what the fields mean, and why they're present.
The magic is a simple chunk of bytes used to identify a file as a comux file. It's a good sanity check when opening and parsing a comux file. It's comparable to the ELF bytes seen at the front of an ELF binary.
Presently, the magic is 8 bytes long: comux!!! (the word "comux" followed by three exclamation points).
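As a rough illustration of that sanity check, here is a minimal Python sketch. The function name `has_comux_magic` is made up for this example and is not part of the project; only the 8-byte magic value comes from the description above.

```python
import io

# The 8-byte magic described above: "comux" plus three exclamation points.
COMUX_MAGIC = b"comux!!!"


def has_comux_magic(stream) -> bool:
    """Return True if the next 8 bytes of a binary stream are the comux magic.

    Hypothetical helper for illustration only.
    """
    prefix = stream.read(len(COMUX_MAGIC))
    return prefix == COMUX_MAGIC


# In-memory streams stand in for real files here.
print(has_comux_magic(io.BytesIO(b"comux!!!" + b"\x00" * 12)))  # True
print(has_comux_magic(io.BytesIO(b"\x7fELF" + b"\x00" * 12)))   # False
```

A real reader would open the file in binary mode (`open(path, "rb")`) and run the same check before parsing anything else.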
The version field marks the comux version number the file is formatted in. Presently the only version number is zero, but this field was included for extensibility and backwards compatibility, should this file format be adopted and modified by others.
NUM_CONNS is an unsigned integer specifying the number of concurrent connections to be spawned when communicating with the target server. Every chunk listed in the file has a connection ID between 0 and NUM_CONNS - 1, inclusive.
NUM_CHUNKS is an unsigned integer specifying the number of chunk-header and chunk-data pairs in the file after the main header.
The main comux header defines a few fields and is immediately followed by the first chunk header.
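To make the main header concrete, here is a minimal parsing sketch in Python. This document only fixes the magic at 8 bytes, so the widths and byte order of the other fields are assumptions made purely for this example: VERSION, NUM_CONNS, and NUM_CHUNKS are each treated as 4-byte big-endian unsigned integers. The name `parse_main_header` is likewise hypothetical.

```python
import struct

COMUX_MAGIC = b"comux!!!"

# ASSUMPTION: 8-byte magic followed by three 4-byte big-endian unsigned
# integers. The real field widths and byte order may differ.
HEADER_FORMAT = ">8sIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 20 bytes under this assumption


def parse_main_header(data: bytes) -> dict:
    """Parse the main comux header from the start of `data` (illustrative)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file too short to hold a comux header")
    magic, version, num_conns, num_chunks = struct.unpack_from(HEADER_FORMAT, data, 0)
    if magic != COMUX_MAGIC:
        raise ValueError("bad magic: not a comux file")
    if version != 0:
        # Zero is the only version number described so far.
        raise ValueError(f"unsupported comux version: {version}")
    return {"version": version, "num_conns": num_conns, "num_chunks": num_chunks}


# Build a header by hand and parse it back.
example = struct.pack(HEADER_FORMAT, COMUX_MAGIC, 0, 3, 6)
print(parse_main_header(example))
# {'version': 0, 'num_conns': 3, 'num_chunks': 6}
```

Under these assumed widths, the first chunk header would begin at byte offset `HEADER_SIZE`.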
A chunk header marks the beginning of a new chunk segment in the comux file. Each chunk represents a set of bytes to be sent to one of the NUM_CONNS connections. The header contains the id, length, schedule, and flags fields, and is immediately followed by that chunk's data.
The id field is a simple unsigned integer that specifies which connection this data is assigned to. It can range from 0 to NUM_CONNS - 1 (NUM_CONNS is specified in the main comux header).
Each chunk header has a length field, which specifies the length of the data following the header. A reader of this comux file can use this field to quickly seek from one chunk header to the next.
An alternative design approach would be to place a list of chunk header offsets at the front of the file, immediately after the comux header. This would make it easy for a parser to immediately understand exactly where each chunk header begins in the file.
However, this design choice creates more overhead: the reader of a comux file will need to create some data structure to hold all of these offsets. With the chosen approach, this isn't necessary - the reader simply needs to walk through the file, parsing each chunk length as it goes. If it's truly needed, the reader could still create a table of offsets after making a single pass through the file.
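The single-pass walk described above can be sketched as follows. As before, the chunk header layout is an assumption for illustration: the four fields (id, length, schedule, flags) are each treated as 4-byte big-endian unsigned integers, and the length is taken to be a byte count. The function name `chunk_header_offsets` is hypothetical.

```python
import struct

# ASSUMPTION: id, length, schedule, flags are each 4-byte big-endian
# unsigned integers. The real widths and byte order may differ.
CHUNK_HEADER_FORMAT = ">IIII"
CHUNK_HEADER_SIZE = struct.calcsize(CHUNK_HEADER_FORMAT)  # 16 bytes


def chunk_header_offsets(data: bytes, first_offset: int, num_chunks: int) -> list:
    """Walk the chunks once and return the offset of each chunk header."""
    offsets = []
    pos = first_offset
    for _ in range(num_chunks):
        if pos + CHUNK_HEADER_SIZE > len(data):
            raise ValueError("truncated chunk header")
        _conn_id, length, _sched, _flags = struct.unpack_from(
            CHUNK_HEADER_FORMAT, data, pos
        )
        offsets.append(pos)
        # Skip over this header and its data to reach the next header.
        pos += CHUNK_HEADER_SIZE + length
        if pos > len(data):
            raise ValueError("truncated chunk data")
    return offsets


# Two chunks back to back: 5 bytes of data, then 2 bytes of data.
body = (
    struct.pack(CHUNK_HEADER_FORMAT, 0, 5, 1, 0) + b"hello"
    + struct.pack(CHUNK_HEADER_FORMAT, 1, 2, 0, 0) + b"hi"
)
print(chunk_header_offsets(body, first_offset=0, num_chunks=2))  # [0, 21]
```

The offset table is only built because this example asks for it; a reader that just needs to process chunks in file order can drop the `offsets` list and keep nothing but `pos`.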
The schedule field defines a scheduling value for each chunk. This value is used to decide when the chunk of data is sent across the wire to the target server. Chunks with lower scheduling values are sent first. Take this example:
COMUX FILE: num_conns=3, num_chunks=6
CHUNK 0: conn_id=0, sched=1
CHUNK 1: conn_id=2, sched=0
CHUNK 2: conn_id=1, sched=3
CHUNK 3: conn_id=1, sched=4
CHUNK 4: conn_id=2, sched=5
CHUNK 5: conn_id=0, sched=2
The chunks would be sent in the following order:
- Chunk 1 --> connection 2
- Chunk 0 --> connection 0
- Chunk 5 --> connection 0
- Chunk 2 --> connection 1
- Chunk 3 --> connection 1
- Chunk 4 --> connection 2
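The ordering above can be reproduced with a plain sort on the scheduling value. This is a minimal sketch, not the project's actual sender; the tuple layout is made up for the example. How ties between equal scheduling values are broken is not specified here, so the sketch simply relies on Python's stable sort, which keeps tied chunks in file order.

```python
# (chunk index, conn_id, sched) for the six chunks in the example above.
chunks = [
    (0, 0, 1),
    (1, 2, 0),
    (2, 1, 3),
    (3, 1, 4),
    (4, 2, 5),
    (5, 0, 2),
]

# Lower scheduling values are sent first. sorted() is stable, so chunks
# with equal values would keep their file order (an assumption, see above).
send_order = sorted(chunks, key=lambda chunk: chunk[2])

for index, conn_id, _sched in send_order:
    print(f"Chunk {index} --> connection {conn_id}")
# Chunk 1 --> connection 2
# Chunk 0 --> connection 0
# Chunk 5 --> connection 0
# Chunk 2 --> connection 1
# Chunk 3 --> connection 1
# Chunk 4 --> connection 2
```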
The purpose of having this scheduling field for each chunk is to allow the mutator (the other half of this project) to fuzz the order in which chunks are sent, simply by modifying a chunk's scheduling field.
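To show why this is convenient, here is a small sketch of one possible mutation: swapping the scheduling values of two chunks. This is only an illustration of the idea and not a description of what the mutator actually does; the helper name `order_of` is hypothetical.

```python
def order_of(scheds):
    """Return chunk indices in the order they would be sent (lowest sched first)."""
    return sorted(range(len(scheds)), key=lambda i: scheds[i])


# Scheduling values for chunks 0..5, taken from the example above.
scheds = [1, 0, 3, 4, 5, 2]
print(order_of(scheds))   # [1, 0, 5, 2, 3, 4]

# Hypothetical mutation: swap the scheduling values of chunks 0 and 4.
mutated = list(scheds)
mutated[0], mutated[4] = mutated[4], mutated[0]
print(order_of(mutated))  # [1, 4, 5, 2, 3, 0]
```

No chunk data or chunk positions in the file change; only two header fields are rewritten, yet the send order differs.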
The flags field is used to toggle various switches that tell the LD_PRELOAD library how to treat this connection.
It's entirely possible more flags will be implemented in the future, so this field also exists for extensibility purposes.