Source Code Walkthrough
Typically, a Ghidra processor module is described completely by a Sleigh slaspec and other architecture files (pspec, cspec, opinion, ldefs, etc.). But hexagon.slaspec is auto-generated, and its instruction fields and tokens have no semantic information. Furthermore, hexagon.slaspec requires many context registers to be set appropriately by an accompanying analyzer plugin, HexagonPacketAnalyzer.
The auto-generation script for hexagon.slaspec is loosely adapted from binja-hexagon, which in turn manipulates auto-generated artifacts from qemu-hexagon.
First, all instructions are disassembled without any context set. This differs from the final disassembly in a few notable ways:
- All duplex subinstructions (consecutive two-byte instructions that appear at the end of a Hexagon packet) decode as one 4-byte DUPLEX instruction
- All pc-relative immediates are incorrect
- Instruction flows are therefore typically incorrect
- All new-value operands are scalars because they cannot be resolved to registers
- Hardware endloops are not identified
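To illustrate why pc-relative immediates come out wrong in this first pass, here is a minimal sketch. It assumes the Hexagon convention that a pc-relative offset is measured from the address of the first instruction in the packet, not from the branch instruction itself. The helper name branch_target and the addresses are hypothetical.

```python
def branch_target(base_addr: int, byte_offset: int) -> int:
    """Return the 32-bit target of a pc-relative branch from a given base."""
    return (base_addr + byte_offset) & 0xFFFFFFFF


# Hypothetical packet starting at 0x1000. Its third instruction, at 0x1008,
# is a jump whose decoded byte offset is +0x20.
pkt_start = 0x1000
insn_addr = 0x1008
offset = 0x20

# Without packet context, the only base available is the instruction address.
without_context = branch_target(insn_addr, offset)

# With packet context, the base is the packet start (assumed convention).
with_context = branch_target(pkt_start, offset)
```

The two results differ by the distance between the jump and the start of its packet, which is why flows are only right when the jump happens to be the first instruction in a packet.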
When a set of straight-line instructions is disassembled, HexagonPacketAnalyzer organizes the instructions into packets, then sets the corresponding context registers:
- pkt_start
- pkt_next
- subinsn: values 1-5 indicating A, L1, L2, S1, or S2 duplex subinstruction, respectively
- hasnew: whether the instruction has a new-value operand
- dotnew: the register number corresponding to R0-R31 that the dot-new operand resolves to
- endloop: values 1-3 indicating endloop0, endloop1, or endloop01, respectively
- duplex_next: if an immext instruction precedes a pair of duplex instructions, specifies the address of the second duplex instruction, which is the one that receives the extension
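The packet-grouping step can be sketched as follows. This is an illustration under stated assumptions, not the analyzer's actual code:

- Each instruction is a little-endian 32-bit word with parse bits in bits 15:14.
- Parse bits 0b11 mark the last word of a packet.
- Parse bits 0b00 mark a duplex word, which also ends a packet.
- pkt_start is taken to be the address of the packet's first instruction, and pkt_next the address just past the packet.

The names InsnContext and group_packets are hypothetical.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class InsnContext:
    address: int    # address of this instruction word
    pkt_start: int  # assumed meaning: address of the first word in the packet
    pkt_next: int   # assumed meaning: address just past the packet


def parse_bits(word: int) -> int:
    """Extract the two parse bits (bits 15:14) of an instruction word."""
    return (word >> 14) & 0b11


def group_packets(code: bytes, base: int) -> List[InsnContext]:
    """Group 32-bit words into packets and compute per-instruction context."""
    contexts: List[InsnContext] = []
    pending: List[int] = []  # addresses in the packet currently being built
    for i in range(0, len(code) - len(code) % 4, 4):
        (word,) = struct.unpack_from("<I", code, i)
        addr = base + i
        pending.append(addr)
        # 0b11 ends a packet; 0b00 is a duplex word, which also ends one.
        if parse_bits(word) in (0b11, 0b00):
            start, nxt = pending[0], addr + 4
            contexts.extend(InsnContext(a, start, nxt) for a in pending)
            pending = []
    # Words of an unterminated trailing packet are left out of the result.
    return contexts


# Two packets: three words (parse bits 01, 10, 11), then a single word (11).
code = struct.pack("<4I", 0x00004000, 0x00008000, 0x0000C000, 0x0000C000)
contexts = group_packets(code, 0x1000)
```

A real analyzer would write these values into the context registers at each instruction address before re-disassembling; here they are only collected in a list.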
The HexagonPacketAnalyzer plugin sets the context appropriately, then re-disassembles all instructions. At this point, all new-value operands are identified as registers instead of scalars, and all single DUPLEX placeholder instructions are split up into two two-byte duplex subinstructions.
Finally, HexagonPacketAnalyzer sets fallthrough for all instructions in the packet besides the last one, and cleans up any bookmarks corresponding to disassembly errors.
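A minimal sketch of that fallthrough rule, using plain integers for addresses rather than Ghidra's address objects. The helper packet_fallthroughs is hypothetical and not part of the plugin.

```python
from typing import Dict, List


def packet_fallthroughs(addresses: List[int]) -> Dict[int, int]:
    """Map every instruction except the last to the next address in its packet."""
    return dict(zip(addresses, addresses[1:]))


# Hypothetical three-instruction packet.
packet_addrs = [0x1000, 0x1004, 0x1008]
overrides = packet_fallthroughs(packet_addrs)
```

The last instruction gets no entry, so its fallthrough is left as whatever the disassembler determined.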
As with all architectures, the Hexagon ABI specifies a handful of architecture-specific ELF relocations, which are implemented in Hexagon_ElfRelocationHandler.java.
For more information on the topic, consult the Qualcomm Hexagon Application Binary Interface.
ParallelInstructionLanguageHelper is a pre-existing feature/plugin that you can specify in the pspec. But Hexagon requires some changes/improvements to how this feature works:
- Added a new "Parallel Suffix" label for "}" endpacket marker
- Added "Parallel ||" and "Parallel Suffix" labels to Function Graph by default
- SimpleBlockModel and BasicBlockModel are enlightened to respect packet boundaries
- Added hook in DecompileCallbacks.getPcodePacked to ParallelInstructionLanguageHelper to fix up Pcode for the entire packet before sending it to the decompiler
On that last point, HexagonParallelInstructionLanguageHelper needs to fix up the pcode for all instructions in a packet in order to emulate the behavior of parallel execution. This is handled in HexagonPcodeEmitPacked.
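To see why a per-packet fixup is needed, here is a small, generic model of parallel packet semantics. It is not how HexagonPcodeEmitPacked works internally (the source does not describe that); it only shows the effect being emulated: every instruction in a packet reads register values from before the packet, and all writes land together at the end. The functions run_sequential and run_parallel and the register model are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Regs = Dict[str, int]
# An instruction is modeled as (destination register, function of the registers).
Insn = Tuple[str, Callable[[Regs], int]]


def run_sequential(regs: Regs, packet: List[Insn]) -> Regs:
    """Naive model: each instruction sees the writes of the previous ones."""
    out = dict(regs)
    for dest, compute in packet:
        out[dest] = compute(out)
    return out


def run_parallel(regs: Regs, packet: List[Insn]) -> Regs:
    """Packet model: all reads use the pre-packet state, writes commit at the end."""
    snapshot = dict(regs)
    pending = {dest: compute(snapshot) for dest, compute in packet}
    out = dict(regs)
    out.update(pending)
    return out


# Packet { r0 = r1; r1 = r0 } swaps the registers under parallel semantics.
swap_packet: List[Insn] = [
    ("r0", lambda r: r["r1"]),
    ("r1", lambda r: r["r0"]),
]
start = {"r0": 1, "r1": 2}
sequential = run_sequential(start, swap_packet)
parallel = run_parallel(start, swap_packet)
```

Concatenating the per-instruction pcode unchanged corresponds to run_sequential, which loses the swap. Buffering results in temporaries, as run_parallel does, is one common way to preserve it.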