Produces side-by-side diffs of control flow graphs by comparing basic blocks in a way that can ignore all differences in block and value names. Only the structure of the code matters for the comparison, which uses an algorithm similar to the Needleman–Wunsch algorithm.
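To give a feel for the Needleman–Wunsch idea, here is a minimal sketch that globally aligns two linear sequences of basic blocks. It is not the tool's implementation: the real comparison works on graphs, and the `Block` representation (a tuple of opcodes with all names dropped), the `GAP` penalty, and the `similarity` scores are assumptions chosen only for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Assumption: a basic block is reduced to its opcode sequence, names dropped.
Block = Tuple[str, ...]

GAP = -1  # hypothetical penalty for a block that exists on one side only


def similarity(a: Block, b: Block) -> int:
    """Hypothetical score: +2 for identical structure, -1 otherwise."""
    return 2 if a == b else -1


def align(
    left: Sequence[Block], right: Sequence[Block]
) -> List[Tuple[Optional[Block], Optional[Block]]]:
    """Globally align two block sequences; None marks a gap."""
    n, m = len(left), len(right)
    # score[i][j] is the best score for aligning left[:i] with right[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP
    for j in range(1, m + 1):
        score[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + similarity(left[i - 1], right[j - 1]),
                score[i - 1][j] + GAP,
                score[i][j - 1] + GAP,
            )
    # Walk back from the bottom-right corner to recover the pairing.
    pairs: List[Tuple[Optional[Block], Optional[Block]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (
            i > 0
            and j > 0
            and score[i][j]
            == score[i - 1][j - 1] + similarity(left[i - 1], right[j - 1])
        ):
            pairs.append((left[i - 1], right[j - 1]))
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            pairs.append((left[i - 1], None))
            i -= 1
        else:
            pairs.append((None, right[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs
```

Each returned pair corresponds to one row of a side-by-side view: two blocks that were matched, or a block on one side facing a gap on the other.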
You will need a Rust toolchain and, to support LLVM IR diffing, LLVM 14. To build:
cargo build
The project can also be built as a Python module. To do that:
pip install maturin
cd pylib
maturin build
First, create the two files you wish to diff:
cat > foo_int.c <<EOI
int foo(int x, int y) {
    return x * y + y;
}
EOI
cat > foo_long.c <<EOI
int foo(long x, long y) {
    return (int)(x * y + y);
}
EOI
clang -emit-llvm -c foo_int.c foo_long.c
Then you can diff the two files:
ctflgrdiff -f ll-bc foo_int.bc foo_long.bc
The demo directory contains example pairs of C code.
The following formats can be selected with -f:
ll-ir: LLVM text IR; note that LLVM 15+ uses a different pointer format that will cause LLVM 14 to segfault
ll-bc: LLVM bitcode; note that LLVM 15+ uses a different pointer format that will cause LLVM 14 to segfault
arm64 (aka aarch64, armv8): 64-bit ARM code in a binary
arm32 (aka aarch32, armv7): 32-bit ARM code in a binary
avr: Atmel AVR code in a binary; note that it cannot be in a fat MachO binary
x86 (aka x86-32, x86_32, i386, i686): 32-bit Intel code in a binary
x64 (aka x86-64, x86_64): 64-bit Intel code in a binary
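The alias relationships in the list above can be summarised as a small lookup table. This is only an illustrative sketch: `FORMAT_ALIASES` and `canonical_format` are hypothetical names, not part of the tool, and the case-sensitive matching is an assumption.

```python
# Canonical format name -> accepted aliases, copied from the list above.
FORMAT_ALIASES = {
    "ll-ir": [],
    "ll-bc": [],
    "arm64": ["aarch64", "armv8"],
    "arm32": ["aarch32", "armv7"],
    "avr": [],
    "x86": ["x86-32", "x86_32", "i386", "i686"],
    "x64": ["x86-64", "x86_64"],
}


def canonical_format(name: str) -> str:
    """Map a format name or alias to its canonical name (case-sensitive)."""
    for canonical, aliases in FORMAT_ALIASES.items():
        if name == canonical or name in aliases:
            return canonical
    raise ValueError(f"unknown format: {name}")
```

For example, `canonical_format("i686")` returns `"x86"`, while an unlisted name raises `ValueError`.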
For all formats in a binary, an ELF, MachO, or PE (Windows) executable,
library, or object file can be provided. An archive (.a) file containing ELF,
MachO, or PE object files is also supported. MachO multi-architecture (aka
fat) binaries are supported and only the instruction set requested will be
used.
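As a rough illustration of how these container types differ on disk, the sketch below classifies a file by its leading magic bytes. It says nothing about how the tool itself detects formats; `container_kind` is a hypothetical helper, and it deliberately does not recognise COFF object files, which carry no leading signature.

```python
import struct

# Mach-O magic numbers as they appear when the first four bytes are read
# big-endian: thin images in either byte order, then fat headers.
MACHO_THIN = (0xFEEDFACE, 0xFEEDFACF, 0xCEFAEDFE, 0xCFFAEDFE)
MACHO_FAT = (0xCAFEBABE, 0xCAFEBABF)


def container_kind(header: bytes) -> str:
    """Guess the container type from the first bytes of a file."""
    if header.startswith(b"\x7fELF"):
        return "elf"
    if header.startswith(b"!<arch>\n"):
        return "archive"
    if header.startswith(b"MZ"):
        # DOS stub that precedes PE executables and DLLs.
        return "pe"
    if len(header) >= 4:
        magic = struct.unpack(">I", header[:4])[0]
        if magic in MACHO_THIN:
            return "macho"
        if magic in MACHO_FAT:
            # Caveat: Java class files share the 0xCAFEBABE magic.
            return "macho-fat"
    return "unknown"
```

Reading the first eight bytes of a file, for instance with `open(path, "rb").read(8)`, is enough input for this check.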
Where appropriate, function names will go through C++ and Rust symbol demangling.