Improve register allocation (part 1) #40

Merged
merged 28 commits into from
Nov 28, 2023


janvrany
Owner

This (arguably massive) PR improves register allocation and supporting code though it does not bring support for general spill / reload. It does however bring in support for spilling / reloading used volatile registers over calls. This effectively doubles the set of available registers on (at least) RISC-V.

In particular, this PR consists of:

  • New TRRegisterDependencies attached to an instruction that describe the desired mapping of virtual registers to real registers and/or the set of registers trashed by an instruction.
  • Changes in the code generator / linkages to use TRRegisterDependencies to specify which values should go to given real registers (as specified by the particular ABI).
  • New default register allocator based on linear scan but progressing backwards (RLSRA), which simplifies insertion of move / spill / reload instructions into the instruction stream.
  • Improvements to RLSRA to spill / reload used trashed volatile registers.

The downside is that the RLSRA code now bears little resemblance to Poletto and Sarkar's original paper.

...over preserved registers. This saves us the need to save / reload
(preserved) registers in prologue / epilogue.
This commit fixes a bug where the return value (virtual) register of a
`TRLeave` pseudo-instruction was not considered as "read" by that instruction,
which in turn could have caused RA to allocate the same physical register
to a different virtual register and thus cause an invalid value to be returned.

This commit fixes this problem (which, interestingly, did not manifest)
…By...`

as the word 'assigned' is more commonly used in the literature.
This commit introduces a new (default) register allocator class:
`TRReverseLinearScanRegisterAllocator`. It is similar to the linear scan
allocator used previously except that it progresses "backwards", that is, from
the last instruction to the first, from the last use of a register towards its assignment.

The advantage of this is that one can insert spills / reloads into
an instruction stream without changing indexes of instructions not yet
"processed". However, support for spills / reloads is not yet implemented.
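The index-stability argument above can be illustrated with a minimal sketch. It is not the allocator's code: the function name, the instruction strings, and the `reload_after` map are hypothetical and exist only to show that inserting while walking backwards leaves the indexes of not-yet-processed instructions untouched.

```python
def insert_reloads_backwards(instructions, reload_after):
    """Walk the stream from last to first, inserting reloads where requested.

    `reload_after` (hypothetical) maps an ORIGINAL instruction index to the
    text of a reload that should be placed right after that instruction.
    """
    stream = list(instructions)
    for i in range(len(stream) - 1, -1, -1):
        # Insertions so far happened only at positions > i, so index `i`
        # still refers to the original, not-yet-processed instruction.
        if i in reload_after:
            stream.insert(i + 1, reload_after[i])
    return stream


code = ["call f", "add a, b", "call g", "ret"]
result = insert_reloads_backwards(code, {0: "reload t0", 2: "reload t1"})
```

A forward walk would need to adjust every later index after each insertion; the backward walk avoids that bookkeeping.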
This commit removes the original support for constraints on virtual register
allocation. Not only were they unused, they also could not represent
the fact that some instructions kill registers they do not use -
for example, a call kills (trashes) all volatile registers.

A subsequent commit will introduce per-instruction register
dependencies, just like in grown-up Testarossa.
…reOldIntervals:`

...to use the names from original paper [1].

[1]: Massimiliano Poletto, Vivek Sarkar: Linear Scan Register Allocation
...by instructions that require certain values to be in specific registers.
The most common example is a call, where (some) parameters are passed in
specific registers and the return value is passed in specific register(s).

Moreover, register dependencies may be used to express the situation where
an instruction trashes some registers - for example, a call to a function
trashes all volatile registers.

Also, some architectures have other constraints; for example, on x86,
`mul` places its result in `eax` / `rax`. Again, this can be (and should be)
expressed using register dependencies as well.

This commit only adds the necessary classes and infrastructure but
does not use it. This is left for later commits.
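As a rough illustration of what such dependencies might carry for a call, here is a small sketch. The class and function names are hypothetical and do not mirror `TRRegisterDependencies`; the register lists follow the standard RISC-V integer calling convention (arguments in `a0`-`a7`, result in `a0`, with `ra`, `t0`-`t6` and `a0`-`a7` caller-saved).

```python
from dataclasses import dataclass, field

# Assumption: standard RISC-V integer calling convention.
RISCV_ARG_REGS = ["a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7"]
RISCV_VOLATILE = {"ra"} | {f"t{i}" for i in range(7)} | set(RISCV_ARG_REGS)


@dataclass
class RegisterDependencies:
    # virtual register -> real register it must be in BEFORE the instruction
    pre: dict = field(default_factory=dict)
    # virtual register -> real register holding its value AFTER the instruction
    post: dict = field(default_factory=dict)
    # real registers whose contents the instruction destroys
    trashed: set = field(default_factory=set)


def call_dependencies(arg_vregs, result_vreg):
    """Build dependencies for a call taking only register-passed arguments."""
    if len(arg_vregs) > len(RISCV_ARG_REGS):
        raise ValueError("stack-passed arguments are not sketched here")
    deps = RegisterDependencies()
    for vreg, real in zip(arg_vregs, RISCV_ARG_REGS):
        deps.pre[vreg] = real
    deps.post[result_vreg] = "a0"
    deps.trashed = set(RISCV_VOLATILE)
    return deps


deps = call_dependencies(["v1", "v2"], "v3")
```

Keeping the trashed set separate from the pre/post mappings lets an instruction declare that it clobbers registers it neither reads nor writes, which is the case the removed constraint mechanism could not express.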
This commit adds a new instvar - `register` - to `TRRegisterLiveRange`, making
the use of associations in linear scan allocators unnecessary. This feels like a
cleaner solution.
...just like `?call` is delegated. As before, (new) `TRLinkage >> #generateReturn:`
is supposed to generate a leave instruction (see `TRLeave`), *NOT* a complete
epilogue.

This is in preparation for using register dependencies to coerce virtual
registers to specific physical registers.
...for setting (and reading) current instruction generation cursor.
Useful for injecting register moves / spills and reloads after RA.

See `AcDSLAssembler >> #cursor:` (and `#cursor`)
...to move register contents from one register to another. Useful
(for example) for injecting register moves after RA to satisfy register
dependencies.
This commit uses `TRRegisterDependencies` to express constraints on
registers upon function call and return, such that arguments are passed
in specific registers and the return value is returned through a
register.
This commit treats any dependent virtual register in an instruction's
register pre-dependencies as "read" by that instruction. This makes sure
the register is live at that point.
Similar to the previous commit, this commit treats any dependent virtual
register in an instruction's register post-dependencies as "assigned" by
that instruction. This makes sure the register is live at that point.
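The effect of counting dependencies as reads and assignments can be sketched with a toy live-range computation over a straight-line instruction stream. The dictionary-based instruction encoding and the `live_ranges` function are assumptions made for this example only, not the project's data model.

```python
def live_ranges(instructions):
    """Return {vreg: (first_index, last_index)} for a straight-line stream.

    Each instruction is a dict with optional keys 'reads', 'assigns',
    'pre' and 'post'; 'pre' / 'post' map virtual registers to real ones.
    """
    ranges = {}
    for i, insn in enumerate(instructions):
        touched = set(insn.get("reads", ())) | set(insn.get("assigns", ()))
        touched |= set(insn.get("pre", {}))    # pre-dependencies count as read
        touched |= set(insn.get("post", {}))   # post-dependencies count as assigned
        for vreg in touched:
            start, _ = ranges.get(vreg, (i, i))
            ranges[vreg] = (start, i)
    return ranges


stream = [
    {"assigns": ["v1"]},
    {"assigns": ["v2"]},
    {"pre": {"v1": "a0"}, "post": {"v3": "a0"}},   # call
    {"reads": ["v2", "v3"], "assigns": ["v4"]},
    {"pre": {"v4": "a0"}},                          # leave
]
ranges = live_ranges(stream)
```

Without the two `touched |=` lines, `v1` would appear dead before the call and `v4` dead before the leave, so their registers could be handed to other values too early.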
Previously, satisfying (virtual) register dependencies was hard-coded in
linkage, i.e., linkage generated code to move values to appropriate
parameter registers before the call and move return value from return
register to desired (virtual) register.

This solution has two problems:

 * It works only for parameter / return values of a call, not for any
   instruction. This is especially a problem for x86, which is famous for
   having peculiar restrictions on what registers can be used with what
   instructions.
 * Moreover, it effectively prevents the register allocator from allocating
   values directly into parameter registers or reading a value directly from
   the return register, and thus increases register pressure.

This commit addresses these two problems by moving the responsibility for
satisfying register dependencies to the register allocator.

As of now, it does not do anything smart w.r.t. allocation but simply
inserts register moves. This will be improved later.
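A minimal sketch of the "simply insert register moves" strategy might look as follows. The function name, its arguments, and the textual instructions are hypothetical; `mv rd, rs` is the RISC-V register-move pseudo-instruction.

```python
def satisfy_dependencies(insn, pre, post, assignment):
    """Wrap `insn` with moves so that its register dependencies hold.

    `pre` / `post` map virtual registers to required real registers;
    `assignment` maps virtual registers to the real registers chosen by RA.
    """
    # Move each value into its required register before the instruction...
    before = [f"mv {real}, {assignment[v]}"
              for v, real in pre.items() if assignment[v] != real]
    # ...and out of the fixed result register afterwards.
    after = [f"mv {assignment[v]}, {real}"
             for v, real in post.items() if assignment[v] != real]
    return before + [insn] + after


emitted = satisfy_dependencies(
    "call f",
    pre={"v1": "a0", "v2": "a1"},
    post={"v3": "a0"},
    assignment={"v1": "s1", "v2": "a1", "v3": "s2"},
)
```

This naive version emits moves one by one and would produce wrong code when the moves form a cycle (for example, two arguments that need to swap registers); a real implementation has to order the moves or use a temporary.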
This commit simplifies the implementation of both the (forward) linear scan
allocator and the reverse linear scan allocator so they no longer use odd
interval start points and even end points (see `i * 2` and
`(i * 2) + 1` when collecting live intervals).

Obviously the same effect can be achieved by carefully implementing
interval expiration (which is what this commit does).
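One way to get the same effect with plain instruction indexes is to expire an interval when its end is less than *or equal to* the start of the interval being allocated. The following forward linear scan is a sketch under that assumption; it is not the code from this commit and it omits spilling entirely.

```python
def linear_scan(intervals, registers):
    """Assign registers to {vreg: (start, end)} intervals (plain indexes)."""
    free = list(registers)
    active = []          # (end, vreg) pairs currently holding a register
    assignment = {}
    for vreg, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire with `<=`: a value last read by instruction `start` may share
        # its register with the value assigned by that same instruction.
        for old_end, old in sorted(active):
            if old_end <= start:
                active.remove((old_end, old))
                free.append(assignment[old])
        if not free:
            raise RuntimeError("out of registers (spilling is not sketched)")
        assignment[vreg] = free.pop(0)
        active.append((end, vreg))
    return assignment


# Instruction 2 reads v1 and v2 and assigns v3 (think `add v3, v1, v2`).
allocation = linear_scan({"v1": (0, 2), "v2": (1, 2), "v3": (2, 3)}, ["t0", "t1"])
```

With a strict `<` comparison the same input would run out of registers at `v3`, which is the situation the doubled start / end points were originally working around.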
This commit refactors the reverse linear scan allocator so that it processes
one instruction at a time, allocating registers and ensuring dependencies
are met.

This is a preparation for more clever allocation that handles spills and
reloads as well.
This commit further refactors reverse linear scan allocator to
facilitate further improvements.
This commit adds a new boolean option - `stressRA`. When set to `true`, it
drastically reduces the set of registers available for allocation to
stress-test the register allocator.
…ster allocator

...as it forces the allocator to spill / reload live volatile registers on each
call.
This commit creates the register fixup map manually rather than depending on
the fact that a virtual register converts to a concrete bitvector when a real register
is allocated.

This is not only easier to understand but also makes it easier to debug
RA, because inserted spills / reloads / moves refer to virtual registers,
so it is clear which (virtual) register is spilled.
This commit adds support for spilling / reloading trashed registers that
contain live values. Mostly, this is the case of volatile (and argument)
registers being (potentially) trashed by calls.

This is only supported by reverse linear scan allocator.

Implementation note: for now, a spill slot is allocated as an unnamed automatic
on the stack for each (spilled) interval. For the actual spilling, it uses
`TRCodeGenerator >> registerStore:to:` and `#registerLoad:from:`. This may
not work in all cases with all linkages - some linkages may prescribe
where to spill registers (PPC64 ELF v2 ABI?, Microsoft x64?).
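The idea of spilling live values out of trashed registers around a call can be sketched as below. Everything here is hypothetical (the function name, the tuple encoding of instructions, the slot numbering); it only shows which values need a store before the call and a load after it.

```python
def spill_around_call(call, live_across, assignment, volatile):
    """Wrap `call` with stores / loads for live values in volatile registers.

    `live_across` is the set of virtual registers live over the call,
    `assignment` maps them to real registers, and `volatile` is the set of
    real registers the call may trash.
    """
    slots = {}
    spills, reloads = [], []
    for vreg in sorted(live_across):
        real = assignment[vreg]
        if real in volatile:
            # One (hypothetical) stack slot per spilled value.
            slot = slots.setdefault(vreg, len(slots))
            spills.append(("store", real, slot, vreg))
            reloads.append(("load", real, slot, vreg))
    return spills + [call] + reloads


wrapped = spill_around_call(
    ("call", "f"),
    live_across={"v1", "v2", "v3"},
    assignment={"v1": "t0", "v2": "s1", "v3": "a1"},
    volatile={"t0", "t1", "a0", "a1"},
)
```

Values held in preserved registers (`v2` in `s1` above) need no spill code, which is why being able to keep live values in volatile registers across calls enlarges the usable register set.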
…gister allocators

This commit removes `TRLinearScanRegisterAllocator` and `TRNaiveConstraintSolvingRegisterAllocator`
register allocators.

The main reason is that commit "Let the register allocator to handle
(virtual) register dependencies" moves the responsibility to move values
to argument registers and from return register(s) to register allocator
based on register dependencies.

Neither `TRLinearScanRegisterAllocator` nor `TRNaiveConstraintSolvingRegisterAllocator`
can be easily updated to work with this change, so this commit removes
them.
@janvrany
Owner Author

FYI: @melkyades

@shingarov shingarov merged commit e0d782e into master Nov 28, 2023
2 checks passed
2 participants