Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for cmo extension #137

Closed
wants to merge 1 commit into from

Conversation

liweiwei90
Copy link

add support for cmo extension:

  • add parameter for cache block size
  • add cbo.zero to zero the cache block
  • add cbo.inval/flush/clean to do envcfg csr and memory access check
  • add prefetch* to only do the address-translation

Copy link
Collaborator

@jrtc27 jrtc27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Various initial feedback; generally only commented on the first instance of something even when it applies multiple times, so please check for other instances of things I've said, especially when it comes to code formatting

Makefile Outdated Show resolved Hide resolved
c_emulator/riscv_platform_impl.c Outdated Show resolved Hide resolved
model/riscv_insts_cmoext.sail Outdated Show resolved Hide resolved
model/riscv_insts_cmoext.sail Outdated Show resolved Hide resolved
model/riscv_insts_cmoext.sail Outdated Show resolved Hide resolved
model/riscv_sys_regs.sail Outdated Show resolved Hide resolved
model/riscv_types.sail Outdated Show resolved Hide resolved
ocaml_emulator/platform_impl.ml Show resolved Hide resolved
c_emulator/riscv_sim.c Outdated Show resolved Hide resolved
model/riscv_insts_cmoext.sail Outdated Show resolved Hide resolved
@liweiwei90
Copy link
Author

Various initial feedback; generally only commented on the first instance of something even when it applies multiple times, so please check for other instances of things I've said, especially when it comes to code formatting

OK, I'll try to seperate the three extensions.

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 18, 2021 via email

@allenjbaum
Copy link
Collaborator

allenjbaum commented Dec 18, 2021 via email

@jrtc27
Copy link
Collaborator

jrtc27 commented Dec 18, 2021

Thanks for an incredibly quick and thorough review. But, at one point you say there shouldn't be random spaces, but the guidelines said you can add them to align things vertically to make them easier to understand. Which would be fine if in this case he added a couple more extra spaces to make the expressions actually line up. Do I have that correct?

Yes, that's right; I took an educated guess based on the low number of additional spaces that the intent was not to line things up

@dkruckemyer-ventana
Copy link

What are the next steps to take here?

model/riscv_insts_zicbom.sail Outdated Show resolved Hide resolved
model/riscv_insts_zicboz.sail Outdated Show resolved Hide resolved
model/riscv_insts_zicbom.sail Outdated Show resolved Hide resolved
model/riscv_insts_zicbom.sail Show resolved Hide resolved
model/riscv_insts_zicboz.sail Outdated Show resolved Hide resolved
model/riscv_insts_zicbom.sail Outdated Show resolved Hide resolved
model/riscv_insts_zicbom.sail Outdated Show resolved Hide resolved
model/riscv_insts_zicbom.sail Outdated Show resolved Hide resolved
model/riscv_insts_zicbom.sail Outdated Show resolved Hide resolved
model/riscv_insts_zicboz.sail Outdated Show resolved Hide resolved
c_emulator/riscv_sim.c Outdated Show resolved Hide resolved
@arichardson
Copy link
Collaborator

The description of this PR says prefetch is supported, but I don't see any Zicbop support in the code? Maybe that got lost while splitting up the files?

@ptomsich
Copy link
Collaborator

What is the status on this PR?
I see plenty of unresolved review comments but no recent code changes (last update is 8 months old)... who is driving this?

Given that the Zicbo[mpz] extensions are comparatively self-contained (and we don't have to implement any real cache-management functionality/prefetchers), I'd like to see this pushed over the finish line sooner than later.

@liweiwei90
Copy link
Author

liweiwei90 commented May 25, 2023

The description of this PR says prefetch is supported, but I don't see any Zicbop support in the code? Maybe that got lost while splitting up the files?

Yeah. the file with the support for Zicbop is deleted in previous updates. It will be dead code since prefetch instructions reuse the encoding of existed instruction.

@liweiwei90
Copy link
Author

What is the status on this PR? I see plenty of unresolved review comments but no recent code changes (last update is 8 months old)... who is driving this?

Sorry, I didn't maintain this PR after I replied the comments and update the code to fix the problems several month ago. I think most of the questions have been resolved or replied(may need further discussion).

Given that the Zicbo[mpz] extensions are comparatively self-contained (and we don't have to implement any real cache-management functionality/prefetchers), I'd like to see this pushed over the finish line sooner than later.

Yeah. I agree. I'll try to re-maintain and update it sooner.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 26, 2023

At a glance I don't think the extension hooks comments have been properly addressed

@liweiwei90
Copy link
Author

At a glance I don't think the extension hooks comments have been properly addressed

Sorry. Do you mean ext_data_get_addr problem?

@jrtc27
Copy link
Collaborator

jrtc27 commented May 26, 2023

At a glance I don't think the extension hooks comments have been properly addressed

Sorry. Do you mean ext_data_get_addr problem?

Yes. You give it a register index for the base address register used, an offset from that register's value and the size of the memory operation being performed. All of those must be correct. Currently you pass zeros() as the register index, the virtual address as the offset and BYTE as the width. This works in the upstream Sail model because all it does is return Ext_DataAddr_OK(X(base) + offset), but that relies on the specifics of the implementation; your use does not conform to the interface. This would break our work downstream which uses that hook for something actually useful, and needs the register index and access width to be correct.

@liweiwei90
Copy link
Author

liweiwei90 commented May 26, 2023

At a glance I don't think the extension hooks comments have been properly addressed

Sorry. Do you mean ext_data_get_addr problem?

Yes. You give it a register index for the base address register used, an offset from that register's value and the size of the memory operation being performed. All of those must be correct. Currently you pass zeros() as the register index, the virtual address as the offset and BYTE as the width. This works in the upstream Sail model because all it does is return Ext_DataAddr_OK(X(base) + offset), but that relies on the specifics of the implementation; your use does not conform to the interface. This would break our work downstream which uses that hook for something actually useful, and needs the register index and access width to be correct.

Yeah. I don't find the original meaning of this function as I commented before. So I used it as an address calculation progress.
For the rs1 and offset, I can adjust it to conform the interface. However, I think cache block size is right size for the memory access. However it seems not a correct value for it. So Byte is used currently.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 26, 2023

At a glance I don't think the extension hooks comments have been properly addressed

Sorry. Do you mean ext_data_get_addr problem?

Yes. You give it a register index for the base address register used, an offset from that register's value and the size of the memory operation being performed. All of those must be correct. Currently you pass zeros() as the register index, the virtual address as the offset and BYTE as the width. This works in the upstream Sail model because all it does is return Ext_DataAddr_OK(X(base) + offset), but that relies on the specifics of the implementation; your use does not conform to the interface. This would break our work downstream which uses that hook for something actually useful, and needs the register index and access width to be correct.

Yeah. I don't find the original meaning of this function as I commented before. So I used it as an address calculation progress. For the rs1 and offset, I can adjust it to conform the interface. However, I think cache block size is right size for the memory access. However it seems not a correct value for it. So Byte is used currently.

We can change it to take the actual number of bytes rather than one of the width enum values.

@liweiwei90
Copy link
Author

liweiwei90 commented May 26, 2023

At a glance I don't think the extension hooks comments have been properly addressed

Sorry. Do you mean ext_data_get_addr problem?

Yes. You give it a register index for the base address register used, an offset from that register's value and the size of the memory operation being performed. All of those must be correct. Currently you pass zeros() as the register index, the virtual address as the offset and BYTE as the width. This works in the upstream Sail model because all it does is return Ext_DataAddr_OK(X(base) + offset), but that relies on the specifics of the implementation; your use does not conform to the interface. This would break our work downstream which uses that hook for something actually useful, and needs the register index and access width to be correct.

Yeah. I don't find the original meaning of this function as I commented before. So I used it as an address calculation progress. For the rs1 and offset, I can adjust it to conform the interface. However, I think cache block size is right size for the memory access. However it seems not a correct value for it. So Byte is used currently.

We can change it to take the actual number of bytes rather than one of the width enum values.

OK. I'll change it.

@github-actions
Copy link

github-actions bot commented May 29, 2023

Unit Test Results

712 tests  ±0   712 ✔️ ±0   0s ⏱️ ±0s
    6 suites ±0       0 💤 ±0 
    1 files   ±0       0 ±0 

Results for commit ab52bc3. ± Comparison against base commit 3fca4c8.

♻️ This comment has been updated with latest results.

@ptomsich ptomsich added the tgmm-agenda Tagged for the next Golden Model meeting agenda. label May 29, 2023
@ptomsich
Copy link
Collaborator

Action items from today:

  • @liweiwei90 Update PR w/ new patch set and provide information on testing (e.g., link to the cbo.zero ACT PR)
  • @billmcspadden-riscv Find a subject matter expert (from the CMO TG?) to review the updated test set.

@ptomsich ptomsich removed the tgmm-agenda Tagged for the next Golden Model meeting agenda. label May 30, 2023
@liweiwei90
Copy link
Author

Action items from today:

* @liweiwei90 Update PR w/ new patch set and provide information on testing (e.g., link to the cbo.zero ACT PR)

OK. I'll check the above comments again and mark them as resolved or update the patch if needed. The ACT PR can be found in riscv-non-isa/riscv-arch-test#226

@liweiwei90
Copy link
Author

Two problems may need further discussion:

  1. What is the reasonable size range for cache block size? Whether cache block size can be multiple pages?
  2. For pmpcheck, whether xlenbits type can be converted to atom type? How ?

@ptomsich
Copy link
Collaborator

ptomsich commented Jun 1, 2023

Two problems may need further discussion:

  1. What is the reasonable size range for cache block size? Whether cache block size can be multiple pages?

The specification does not disallow it, so it must be supported by our implementations.

The only requirements (from the specification) are:

  • naturally aligned
  • power-of-two

(They will be overlapped by ORI instruciton).
@@ -127,7 +127,7 @@ function clause execute(LOADRES(aq, rl, rs1, width, rd)) = {
/* Get the address, X(rs1) (no offset).
* Extensions might perform additional checks on address validity.
*/
match ext_data_get_addr(rs1, zeros(), Read(Data), width) {
match ext_data_get_addr(rs1, zeros(), Read(Data), EXTZ(size_bits(width))) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these size_bits be word_width_bytes everywhere?

Also when you rebase you'll have to upload a load of the vector code that calls ext_data_get_addr(). Though you can be lazy and just wait if you want, because I am rebasing/updating this...

@@ -116,7 +116,7 @@ type ext_data_addr_error = unit

/* Default data addr is just base register + immediate offset (may be zero).
Extensions might override and add additional checks. */
function ext_data_get_addr(base : regidx, offset : xlenbits, acc : AccessType(ext_access_type), width : word_width)
function ext_data_get_addr(base : regidx, offset : xlenbits, acc : AccessType(ext_access_type), width : xlenbits)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is xlenbits the right type? Maybe nat would be better (or range(1, 1024) for performance reasons).

@Timmmm
Copy link
Collaborator

Timmmm commented Nov 22, 2023

I'm working on this. There are a number of things I've noticed/changed - I'll list them here just so they don't get lost in case of unexpected bus accidents. But if you can wait a few weeks I will upload a PR.

  • Restrict cache block size to a page size (4096). Technically the spec doesn't disallow larger sizes, but it also describe any behaviour with block sizes larger than a page, and there are lots of questions as to how that would work. I guess they didn't specify it because nobody is going to have a cache block that large anyway, so restricting it to that makes sense IMO.
  • Change the block size parameter to be the exponent instead of actual size. This avoids the possibility of invalid values, and makes the code a bit simpler (e.g. making masks). Not 100% sure about this change tbh. --cache-block-size 64 is a bit easier to follow than --cache-block-size-exp 6. I may revert this and just do that transformation in C. Or maybe not because then I also have to figure out how to do it in OCaml.
  • Use long CLI option instead of -B.
  • Change it from writing 1 byte at a time to just do the whole write in one go. This is faster, simpler and produces more sensible traces.
  • To do that I had to increase max_mem_access to 4096, but that's fine. It generates exactly the same code.
  • For flush/inval, do PMA/PMP check for read or write access, rather than actually performing a read. This means if only write access is allowed it will work correctly, and also it will produce the correct traces (since no read was actually performed). This depends on our PMA implementation, which I hope to upstream at some point.
  • General simplifications, refactoring, minor fixes, comments etc.
  • Still no tests in this repo, but we will test it privately using STING.

@davidharrishmc
Copy link

As mentioned back in May, cbo.zero support in Sail is gating the ability to verify the cbo.zero tests that were recently added to riscv-arch-test. Looking forward to this PR being completed.

riscv-non-isa/riscv-arch-test#226

@Timmmm
Copy link
Collaborator

Timmmm commented Dec 21, 2023

I've prepared a PR for this, but the people that need to approve me uploading it are on holiday. And also it depends on #350 and an HPM counter PR that I also need to get approval for too. Hopefully not too much longer. #350 is ready to merge I think.

@davidharrishmc
Copy link

Nudge.

@Timmmm
Copy link
Collaborator

Timmmm commented Mar 6, 2024

Coincidentally I prepared a PR today. Just getting approval to send it (shouldn't take long).

@Timmmm
Copy link
Collaborator

Timmmm commented Apr 22, 2024

@davidharrishmc See #455

@Timmmm
Copy link
Collaborator

Timmmm commented Aug 30, 2024

#455 was merged so I'll close this one. Thanks for the initial code @liweiwei90!

@Timmmm Timmmm closed this Aug 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants