Improve register allocation (part 1) #40

janvrany · 2023-11-28T20:24:21Z

This (arguably massive) PR improves register allocation and its supporting code. It does not yet bring support for general spill / reload, but it does add support for spilling / reloading used volatile registers over calls. This effectively doubles the set of available registers on (at least) RISC-V.

In particular, this PR consists of:

* New `TRRegisterDependencies`, attached to an instruction, that describe the desired mapping of virtual registers to real registers and/or the set of registers trashed by an instruction.
* Changes in the code generator / linkages to use `TRRegisterDependencies` to specify which values should go to which real registers (as specified by the particular ABI).
* A new default register allocator based on linear scan but progressing backwards (RLSRA), which simplifies insertion of move / spill / reload instructions into the instruction stream.
* Improvements to RLSRA to spill / reload used trashed volatile registers.

The downside is that the RLSRA code now bears little resemblance to Poletto and Sarkar's original paper.

This commit fixes a bug where the return value (virtual) register of a `TRLeave` pseudo-instruction was not considered "read" by that instruction. This could have caused RA to allocate the same physical register to a different virtual register and thus cause an invalid value to be returned (interestingly, the problem did not manifest).

This commit introduces a new (default) register allocator class: `TRReverseLinearScanRegisterAllocator`. It is similar to the linear scan allocator used previously, except that it progresses "backwards": from the last instruction to the first, that is, from a register's last use towards its assignment. The advantage is that one can insert spills / reloads into the instruction stream without changing the indexes of instructions not yet "processed". However, support for spills / reloads is not yet implemented.
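To illustrate why scanning backwards keeps insertion cheap, here is a minimal Python sketch. It is not the Smalltalk implementation from this PR; the function name and the string-based instruction format are made up for illustration. Anything inserted while visiting index `i` lands at `i + 1` or later, so the indexes of instructions still to be visited do not move.

```python
def insert_after_calls(instructions, marker="reload"):
    """Walk backwards; insert `marker` after every instruction starting with 'call'.

    Returns the indexes visited, to show that insertions never shift
    instructions that have not been processed yet.
    """
    visited = []
    i = len(instructions) - 1
    while i >= 0:
        visited.append(i)
        if instructions[i].startswith("call"):
            # The insertion lands at i + 1, i.e. in the already-processed tail.
            instructions.insert(i + 1, marker)
        i -= 1
    return visited


code = ["li a0, 1", "call f", "add a1, a0, a0", "call g", "ret"]
visited = insert_after_calls(code)
```

A forward scan doing the same insertions would have to adjust its position (and any precomputed interval end points) after every insert.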

This commit removes the original support for constraints on virtual register allocation. Not only were they unused, they also could not represent the fact that some instructions kill registers they do not use: for example, a call kills (trashes) all volatile registers. A subsequent commit will introduce per-instruction register dependencies, just like in grown-up Testarossa.

...by instructions that require certain values to be in specific registers. The most common example is a call, where (some) parameters are passed in specific registers and the return value is returned in specific register(s). Moreover, register dependencies may be used to express that an instruction trashes some registers: for example, a call to a function trashes all volatile registers. Some architectures have further constraints; for example, on x86, `mul` places its result in `eax` / `rax`. Again, this can (and should) be expressed as register dependencies. This commit only adds the necessary classes and infrastructure but does not use them; that is left for later commits.
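As a rough illustration of what such per-instruction dependencies might carry, here is a Python sketch. The class and field names (`pre`, `post`, `trashed`) are assumptions made for this example and are not the API of `TRRegisterDependencies`; the register names are an illustrative subset of RISC-V argument and temporary registers.

```python
from dataclasses import dataclass, field


@dataclass
class RegisterDependencies:
    # virtual register -> real register it must occupy BEFORE the instruction
    pre: dict = field(default_factory=dict)
    # virtual register -> real register it occupies AFTER the instruction
    post: dict = field(default_factory=dict)
    # real registers whose contents the instruction destroys
    trashed: set = field(default_factory=set)


def call_dependencies(args, result, arg_regs, ret_reg, volatile):
    """Build dependencies for a call under a simple register-based convention."""
    if len(args) > len(arg_regs):
        raise ValueError("sketch handles register-passed arguments only")
    deps = RegisterDependencies()
    for vreg, real in zip(args, arg_regs):
        deps.pre[vreg] = real
    deps.post[result] = ret_reg
    deps.trashed = set(volatile)
    return deps


deps = call_dependencies(
    args=["v1", "v2"], result="v3",
    arg_regs=["a0", "a1", "a2"], ret_reg="a0",
    volatile={"a0", "a1", "a2", "t0", "t1"},
)
```

The same shape could describe a non-call instruction with a fixed result register by leaving `pre` empty and filling in `post` only.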

...just like `?call` is delegated. As before, (new) `TRLinkage >> #generateReturn:` is supposed to generate a leave instruction (see `TRLeave`), *NOT* a complete epilogue. This is in preparation for using register dependencies to coerce virtual registers to specific physical registers.

Previously, satisfying (virtual) register dependencies was hard-coded in the linkage, i.e., the linkage generated code to move values to the appropriate parameter registers before the call and to move the return value from the return register to the desired (virtual) register. This solution has two problems:

* It works only for parameters / return values of a call, not for any instruction. This is especially a problem for x86, which is famous for having peculiar restrictions on which registers can be used with which instructions.
* Moreover, it effectively prevents the register allocator from allocating values directly into parameter registers or reading a value directly from the return register, and thus increases register pressure.

This commit addresses these two problems by moving the responsibility for satisfying register dependencies to the register allocator. As of now, it does not do anything smart w.r.t. allocation but simply inserts register moves. This will be improved later.
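The "simply inserts register moves" strategy can be sketched as follows in Python. This is an illustration under stated assumptions, not the code from this PR: the function name and the dictionary-based inputs are hypothetical, and the sketch deliberately refuses cases where one move would overwrite a value another move still needs (ordering and cycle breaking are out of scope).

```python
def moves_for_pre_dependencies(allocation, pre):
    """Return (dst, src) moves that place each dependent virtual register
    in the real register an instruction requires.

    allocation: virtual register -> real register chosen by the allocator
    pre:        virtual register -> real register required by the instruction
    """
    moves = []
    sources = {allocation[v] for v in pre}
    for vreg, required in pre.items():
        current = allocation[vreg]
        if current == required:
            continue  # already in place, no move needed
        if required in sources:
            # Conservative check: the target currently holds a dependent value.
            raise NotImplementedError("overlapping moves are not handled")
        moves.append((required, current))
    return moves


# v1 was allocated to s1 but the call wants it in a0; v2 is already in a1.
moves = moves_for_pre_dependencies(
    allocation={"v1": "s1", "v2": "a1"},
    pre={"v1": "a0", "v2": "a1"},
)
```

When the allocator happens to pick the required register, as for `v2` here, no move is emitted at all, which is the register-pressure benefit the second bullet above refers to.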

This commit simplifies the implementation of both the (forward) linear scan allocator and the reverse linear scan allocator so that they no longer use odd interval start points and even end points (see the `i * 2` and `(i * 2) + 1` used when collecting live intervals). The same effect can be achieved by carefully implementing interval expiration, which is what this commit does.
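A Python sketch of the idea, assuming that in the old encoding a use at instruction `i` got the even point `i * 2` and a definition got the odd point `(i * 2) + 1`. With plain indexes, the same non-overlap is obtained by expiring intervals whose end is less than *or equal to* the new start. The function below is a simplified forward linear scan written for this example (no spilling), not the allocator from this PR.

```python
def linear_scan(intervals, registers):
    """intervals: dict vreg -> (start, end) in plain instruction indexes,
    where start is the defining instruction and end is the last use.
    Returns dict vreg -> register. Spilling is not sketched."""
    free = list(registers)
    active = []  # list of (end, vreg), kept sorted by end
    assignment = {}
    for vreg, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals whose last use is at or before this definition.
        # Using <= (not <) is what replaces the 2*i / 2*i + 1 encoding.
        while active and active[0][0] <= start:
            _, expired = active.pop(0)
            free.append(assignment[expired])
        if not free:
            raise RuntimeError("out of registers (spilling not sketched)")
        assignment[vreg] = free.pop(0)
        active.append((end, vreg))
        active.sort()
    return assignment


# v1 is last used by instruction 2, which also defines v3.
intervals = {"v1": (0, 2), "v2": (1, 3), "v3": (2, 4)}
assignment = linear_scan(intervals, ["r0", "r1"])
```

Here `v3` reuses `r0` because `v1` expires at the very instruction that defines `v3`; with a strict `<` comparison the two-register budget would be exceeded.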

This commit refactors the reverse linear scan allocator so that it processes one instruction at a time, allocating registers and ensuring dependencies are met. This is a preparation for more clever allocation that handles spills and reloads as well.

This commit creates the register fixup map manually rather than depending on the fact that a virtual register converts to a concrete bitvector when a real register is allocated. This is not only easier to understand but also makes it easier to debug RA, because inserted spills / reloads / moves refer to virtual registers, so it is clear which (virtual) register is spilled.

This commit adds support for spilling / reloading trashed registers that contain live values. Mostly, this is the case of volatile (and argument) registers being (potentially) trashed by calls. This is only supported by the reverse linear scan allocator. Implementation note: for now, a spill slot is allocated as an unnamed automatic on the stack for each (spilled) interval. For the actual spilling, it uses `TRCodeGenerator >> registerStore:to:` and `#registerLoad:from:`. This may not work in all cases with all linkages: some linkages may prescribe where to spill registers (PPC64 ELF v2 ABI?, Microsoft x64?).
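The effect on the instruction stream can be sketched in Python as below. This is only an illustration: the function name, the textual `store` / `load` pseudo-instructions and the numbered slots are assumptions for this example, and the set of live registers is passed in rather than computed by the allocator.

```python
def spill_around_call(call_index, instructions, live_across, trashed):
    """Insert stores before and loads after the call at `call_index` for every
    register that is both live across the call and trashed by it.
    Returns a dict register -> spill slot number."""
    slots = {}
    for reg in sorted(live_across & trashed):
        slots[reg] = len(slots)  # one fresh slot per spilled value
    # Insert reloads first (after the call), then spills (before it), so that
    # call_index stays valid while the reloads are being inserted.
    for offset, (reg, slot) in enumerate(slots.items()):
        instructions.insert(call_index + 1 + offset, f"load {reg}, slot{slot}")
    for offset, (reg, slot) in enumerate(slots.items()):
        instructions.insert(call_index + offset, f"store {reg}, slot{slot}")
    return slots


code = ["add t0, a0, a1", "call f", "add a0, a0, t0"]
slots = spill_around_call(
    1, code, live_across={"t0", "s1"}, trashed={"a0", "t0", "t1"}
)
```

The preserved register `s1` is live across the call but not trashed by it, so it is left alone; only `t0` gets a slot.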

…gister allocators

This commit removes the `TRLinearScanRegisterAllocator` and `TRNaiveConstraintSolvingRegisterAllocator` register allocators. The main reason is that commit "Let the register allocator to handle (virtual) register dependencies" moves the responsibility for moving values to argument registers and from return register(s) to the register allocator, based on register dependencies. Neither `TRLinearScanRegisterAllocator` nor `TRNaiveConstraintSolvingRegisterAllocator` can be easily updated to work with this change, so this commit removes them.

janvrany · 2023-11-28T20:31:12Z

FYI: @melkyades

janvrany added 28 commits November 28, 2023 18:09

RISC-V: avoid storing / reloading link register for leaf functions

1426248

RISC-V: prefer volatile registers for leaf methods

8311627

...over preserved registers. This saves us the need to save / reload (preserved) registers in prologue / epilogue.

Rename #virtualRegistersModifiedBy... to `#virtualRegistersAssigned…

e0e102e

…By...` as the word 'assigned' is more commonly used in literature.

Rename TRLinearScanRegisterAllocator >> #expireOldRanges: to `#expi…

ca3a88b

…reOldIntervals:` ...to use the names from original paper [1]. [1]: MASSIMILIANO POLETTO, VIVEK SARKAR: Linear Scan Register Allocation

Add (virtual) register to TRRegisterLiveRange

f5c8f2a

This commit adds a new instvar - `register` - to `TRRegisterLiveRange`, making the use of associations in linear scan allocators unnecessary. This feels like a cleaner solution.

Add TRCodeGenerator >> #cursor: (and #cursor)

2796102

...for setting (and reading) the current instruction generation cursor. Useful for injecting register moves / spills and reloads after RA. See `AcDSLAssembler >> #cursor:` (and `#cursor`).

Add TRCodeGenerator >> #registerCopyFrom:to:

2e3a542

...to move register contents from one register to another. Useful (for example) for injecting register moves after RA to satisfy register dependencies.

RISC-V: Implement TRCodeGenerator >> #registerCopyFrom:to:

81f89e4

RISC-V: use register dependencies to express register constraints

ae54ae1

This commit uses `TRRegisterDependencies` to express constraints on registers upon function call and return, so that arguments are passed in specific registers and the return value is returned through a specific register.

POWER: Implement TRCodeGenerator >> #registerCopyFrom:to:

5de102e

POWER: use register dependencies to express register constraints

7c918d0

This commit uses `TRRegisterDependencies` to express constraints on registers upon function call and return, so that arguments are passed in specific registers and the return value is returned through a specific register.

Treat pre-dependent virtual registers of an instruction as "read"

d43edf9

This commit treats any dependent virtual register in an instruction's register pre-dependencies as "read" by that instruction. This makes sure the register is live at that point.

Treat post-dependent virtual registers of an instruction as "assigned"

d5d185d

Similar to the previous commit, this commit treats any dependent virtual register in an instruction's register post-dependencies as "assigned" by that instruction. This makes sure the register is live at that point.
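A small Python sketch of how pre- and post-dependencies could feed into live range computation. The dictionary-based instruction format and the function name are hypothetical and chosen only for this example; a range is simply the first and last instruction index at which a virtual register is mentioned.

```python
def live_ranges(instructions):
    """instructions: list of dicts with optional keys 'reads', 'assigns',
    'pre' and 'post' (each an iterable of virtual register names).
    Returns dict vreg -> (first index, last index) where it is mentioned."""
    ranges = {}
    for index, insn in enumerate(instructions):
        # Pre-dependencies count as reads, post-dependencies as assignments.
        reads = set(insn.get("reads", ())) | set(insn.get("pre", ()))
        assigns = set(insn.get("assigns", ())) | set(insn.get("post", ()))
        for vreg in reads | assigns:
            first, _ = ranges.get(vreg, (index, index))
            ranges[vreg] = (first, index)
    return ranges


program = [
    {"assigns": ["v1"]},               # v1 := ...
    {"assigns": ["v2"]},               # v2 := ...
    {"pre": ["v1"], "post": ["v3"]},   # call: v1 is an argument, v3 the result
    {"reads": ["v2", "v3"]},           # ... := v2 + v3
]
ranges = live_ranges(program)
```

Without the `pre` / `post` handling, `v1` would appear dead after instruction 0 and `v3` would appear to be used without ever being assigned.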

Refactor reverse linear scan allocator (part ii)

62335c2

This commit further refactors reverse linear scan allocator to facilitate further improvements.

Keep reference to code generator in TRVirtualRegister

73e3c0b

Add new compilation option: stressRA

4481da3

This commit adds a new boolean option - `stressRA`. When set to `true`, it drastically reduces the set of registers available for allocation in order to stress-test the register allocator.
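The idea behind such an option can be sketched in a few lines of Python. The function name, the `stress_limit` parameter and its value of 2 are assumptions for illustration; the commit message does not say how many registers remain available.

```python
def allocatable_registers(registers, stress_ra=False, stress_limit=2):
    """Return the registers the allocator may use.

    With stress_ra set, only the first `stress_limit` registers are kept, so
    even small methods run out of registers and exercise spill / reload paths.
    """
    registers = list(registers)
    return registers[:stress_limit] if stress_ra else registers


normal = allocatable_registers(["t0", "t1", "t2", "a0", "a1"])
stressed = allocatable_registers(["t0", "t1", "t2", "a0", "a1"], stress_ra=True)
```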

RISC-V: always prefer volatile registers when stress-testing the regi…

1d75e60

…ster allocator ...as it forces the allocator to spill / reload live volatile registers on each call.

janvrany requested a review from shingarov November 28, 2023 20:30

shingarov merged commit e0d782e into master Nov 28, 2023
2 checks passed
