Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for inline constants in lsu instructions #15

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

d1duarte
Copy link
Contributor

No description provided.

@zwabbit
Copy link
Contributor

zwabbit commented May 30, 2016

Wait, what instruction ever does this?

@d1duarte
Copy link
Contributor Author

For instance when a tbuffer_load set's the soffset to 0x80 (inline constant 0).
I'm attaching a screenshot of this happening. The code is generated by AMD's CodeXL from the MatrixMultiplication example.
snapshot1
In the right hand column you can see the instruction binary, on the tbuffer_loads all of them use the inline constant 0. Without this correction it will read register s0.

@zwabbit
Copy link
Contributor

zwabbit commented May 30, 2016

Ah okay, that situation. The problem actually is a bit more complicated, in that we never properly implemented support for ANY of the constant resources that are supposed to be accessible via scalar register addresses. Let me check through which documentation actually describes them and I'll get back to you. Or if you know which one it is, feel free to point me to it.

@zwabbit
Copy link
Contributor

zwabbit commented May 31, 2016

NVM, found it, table 6.1 in the SI ISA document.

@d1duarte
Copy link
Contributor Author

Looking at the reg_field_encoder.v file (in the decode_core), it's easy to extract the integer constants from the address (as i proposed in this change). Extracting the floating point constants requires the use of the 'fp_constant' port but it's not required for LSU, since I don't think a floating point offset will ever be used

@zwabbit
Copy link
Contributor

zwabbit commented Jun 6, 2016

Oh sorry, github for some reason didn't send me a notification of your response so I didn't realize you had answered. Need to check these manually more often...

I've talked this over a bit with my prof and we're both of the opinion that the fix of this should take into account (or at least does not make difficult future work to support) the other possible inline constants as specified in table 6.1 in the ISA doc. I have a few ideas on how to approach it, which mostly boil down to having an input sorter module somewhere and feeding into the modules that actually perform the operation. The catch here is that it is not just memory instructions that might have inline constants, but VOP2 arithmetic instructions can as well. And also the need to keep the entire thing synchronized through the pipeline. Thoughts?

@d1duarte
Copy link
Contributor Author

d1duarte commented Jun 7, 2016

That's a good point. I would suggest one of these two approaches:

Since the constant 'address' will always have bit number 7 on (which means they start at 128). And, there only being 104 sgprs (bit number 7 off) you can either:

  1. simply check for this bit in the decode stage, create a new flag 'constant_needed' which will later be used to call some new module that performs the conversion between 'address' and constant value and provides this value to the execution unit (either simf, simd, salu or lsu).

  2. provide the complete address to the SGPR module and use a set of 'memory mapped' registers to provide this value. The procedure would not require great changes in the execution units (no new data_in inputs or any additional control).

I personally favor option 2 as this would require the least changes to the system and the access would be seamlessly to most units, providing more modularity. In decode you would only need to decide if the addressed source is a special purpose register (s0, exec, vcc, etc), a vector register (bit 8 = '1') or a 'sgpr' (bit 8 = '0', bit 7 = '1', and not a special purpose register).

Note: for Vector op's you would need to verify that bit number 8 is off as well.

@zwabbit
Copy link
Contributor

zwabbit commented Jun 8, 2016

Doing option 2 would work for the actual constant values, but the main issue right now is that the VCC and EXEC values are not actually maintained in the SGPR, I believe they are maintained in the wavepool. On the other hand they are routed as a matter of course into the SIMD/F units and at least for the latter also into the LSU automatically as part of the execution, so what we probably want is a combination of both options, option 2 for the actual constants that can be just address mapped in the SGPR and option 1 for the dynamic values maintained in other parts of the compute unit.

Do you think you'd be interested in working that out? I'm stuck working on some infrastructure related pieces and won't have time in the short term to work on MIAOW's internals.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants