-
Notifications
You must be signed in to change notification settings - Fork 857
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support for CMO v1.0 #891
add support for CMO v1.0 #891
Conversation
I'd like someone from the CMO TG to perform a code review. @dkruckemyer-ventana can you nominate someone? |
I was also making RISC-V CMO instruction but my implementation lacked actual cache invalidation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is definitely better than my test implementation (not PR-ed).
Still, my work told me that there's a few changes are required.
disasm/disasm.cc
Outdated
DISASM_INSN("prefetch.i", prefetch_i, 0, {&store_address}); | ||
DISASM_INSN("prefetch.r", prefetch_i, 0, {&store_address}); | ||
DISASM_INSN("prefetch.w", prefetch_i, 0, {&store_address}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disassembly support for prefetch.[irw]
instructions must be placed before ori
instruction because prefetch hint instructions are special encodings of ori
instruction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can test this file (spike --isa=rv64gc_zicbom_zicbop_zicboz -l test-zicbo.bare.elf
): test-zicbo.bare.zip
This is built with GNU Binutils + my CBO support patchset and my toy builder (will be open-sourced later).
.globl _start
.section .entry, "ax", %progbits
.attribute arch, "rv64gc_zicbom_zicbop_zicboz"
_start:
prefetch.i 0(x0)
prefetch.r 32(x1)
prefetch.w 64(x2)
cbo.clean x1
cbo.clean x30
cbo.inval x1
cbo.inval x30
cbo.flush x1
cbo.flush x30
cbo.zero x1
cbo.zero x30
ret
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. I didn't realize this question before jrtc27 comment in riscv/sail-riscv#137. It's easy to fix the problem in disasm. However, It seems hard to fix the function code of prefetch.* . Is there any suggestion?
riscv/encoding.h
Outdated
#define ENVCFG_CBIE 0x30 | ||
#define ENVCFG_CBCFE 0x40 | ||
#define ENVCFG_CBZE 0x80 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My changes including CBIE
, CBCFE
and CBZE
bits are merged into riscv-opcodes
.
riscv/riscv-opcodes#94
Because riscv/encoding.h
is intended to be auto-generated by riscv-opcodes
, managing our bits (especially with our names) is not a good idea.
Even if we change riscv/encoding.h
manually, we should use following changes instead (as upstreamed into master):
#define MENVCFG_CBIE 0x00000030
#define MENVCFG_CBCFE 0x00000040
#define MENVCFG_CBZE 0x00000080
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I'll update encoding.h.
riscv/processor.cc
Outdated
@@ -546,6 +552,10 @@ void state_t::reset(processor_t* const proc, reg_t max_isa) | |||
csrmap[CSR_MIMPID] = std::make_shared<const_csr_t>(proc, CSR_MIMPID, 0); | |||
csrmap[CSR_MVENDORID] = std::make_shared<const_csr_t>(proc, CSR_MVENDORID, 0); | |||
csrmap[CSR_MHARTID] = std::make_shared<const_csr_t>(proc, CSR_MHARTID, proc->get_id()); | |||
const reg_t envcfg_mask = ENVCFG_CBIE | ENVCFG_CBCFE | ENVCFG_CBZE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extension masking is required.
envcfg_mask = (proc->extension_enabled(EXT_ZICBOM) ? MENVCFG_CBCFE | MENVCFG_CBIE: 0)
| (proc->extension_enabled(EXT_ZICBOZ) ? MENVCFG_CBZE: 0);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I'll update this.
riscv/processor.cc
Outdated
csrmap[CSR_MENVCFG] = menvcfg = std::make_shared<masked_csr_t>(proc, CSR_MENVCFG, envcfg_mask, 0); | ||
csrmap[CSR_SENVCFG] = senvcfg = std::make_shared<masked_csr_t>(proc, CSR_SENVCFG, envcfg_mask, 0); | ||
csrmap[CSR_HENVCFG] = henvcfg = std::make_shared<masked_csr_t>(proc, CSR_HENVCFG, envcfg_mask, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, those CSRs are not that simple.
I'm working on the fix as a part of other privileged spec 1.12 CSR changes and will support merging this simple implementation for now (I just let you know there is a todo-list entry here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
riscv/insns/prefetch_i.h
Outdated
@@ -0,0 +1,2 @@ | |||
require_extension(EXT_ZICBOP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this line since this instruction is a hint and must not generate illegal instruction trap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can optionally stop prefetching if EXT_ZICBOP
is not enabled. Even so, we must not generate any trap here (even prefetch.i 0(zero)
must be safe).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I'll fix this.
riscv/insns/prefetch_r.h
Outdated
@@ -0,0 +1,2 @@ | |||
require_extension(EXT_ZICBOP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this line since this instruction is a hint and must not generate illegal instruction trap.
riscv/insns/prefetch_w.h
Outdated
@@ -0,0 +1,2 @@ | |||
require_extension(EXT_ZICBOP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Delete this line since this instruction is a hint and must not generate illegal instruction trap.
disasm/disasm.cc
Outdated
DISASM_INSN("cbo.inval", cbo_inval, 0, {&xrs1}); | ||
DISASM_INSN("cbo.zero", cbo_zero, 0, {&xrs1}); | ||
DISASM_INSN("prefetch.i", prefetch_i, 0, {&store_address}); | ||
DISASM_INSN("prefetch.r", prefetch_i, 0, {&store_address}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DISASM_INSN("prefetch.r", prefetch_r, 0, {&store_address});
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. I forgot to update the second argument after copy.
disasm/disasm.cc
Outdated
DISASM_INSN("cbo.zero", cbo_zero, 0, {&xrs1}); | ||
DISASM_INSN("prefetch.i", prefetch_i, 0, {&store_address}); | ||
DISASM_INSN("prefetch.r", prefetch_i, 0, {&store_address}); | ||
DISASM_INSN("prefetch.w", prefetch_i, 0, {&store_address}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DISASM_INSN("prefetch.w", prefetch_w, 0, {&store_address});
Thanks for your comments. |
e789232
to
cb040d5
Compare
I looked into the current implementation and... I concluded that
There will be a small performance degradation, but nothing more. Yes, it's not very intuitive behavior but not very urgent to fix either. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I first didn't notice that disasm part is resolved (because cbo.*
instructions are moved too). Anyway, here's my suggestions.
spike_main/spike.cc
Outdated
@@ -241,6 +242,7 @@ int main(int argc, char** argv) | |||
uint16_t rbb_port = 0; | |||
bool use_rbb = false; | |||
unsigned dmi_rti = 0; | |||
reg_t blocksz = 16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with riscv/sail-riscv#137 (comment) and default block size would be better to be 64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'll fix this. There is still some problem unsolved in the review of sail-riscv, this may also affect the implementation of spike.
spike_main/spike.cc
Outdated
@@ -376,6 +378,7 @@ int main(int argc, char** argv) | |||
exit(-1); | |||
} | |||
}); | |||
parser.option(0, "blocksz", 1, [&](const char* s){blocksz = strtoull(s, 0, 0);}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about checking whether the block size is a power of 2 somewhere? See processor_t::check_pow2
for argument checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have sent email to David to discuss about the unsolved problems, including the reasonable range for cache block size. There is no reply currently. Anyway, I agree it should be power of 2. I'll update it later.
riscv/riscv.mk.in
Outdated
@@ -1260,6 +1260,15 @@ riscv_insn_svinval = \ | |||
hinval_vvma \ | |||
hinval_gvma \ | |||
|
|||
riscv_insn_cmo = \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not sail-riscv but actual extension name would be better here, too. How about zicbo
(for Zicbo[mpz]
) instead of cmo
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name in riscv_insn_list are all use the total name of extensions(k, b, v,...) or specific extension name(svinval). I think cmo is more like the total name of zicbo[mpz] than zicbo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. If the maintainer agrees, that's okay.
Thanks for your explanation. I missed the case that the instruction will not moved to the front if it have the same match value with its previous or later instruction. |
Sorry, I don't understand the problem for moving cbo.* instructions before. |
I just felt that it wasn't necessary. |
OK. I just want to put the cmo related instructions together. |
The prefetch instructions should be deleted from this PR. Executing them as no-ops is a fully legitimate implementation of the extension. The proposed approach adds complexity but doesn't provide meaningful benefit. |
riscv/memtracer.h
Outdated
@@ -39,6 +40,11 @@ class memtracer_list_t : public memtracer_t | |||
for (std::vector<memtracer_t*>::iterator it = list.begin(); it != list.end(); ++it) | |||
(*it)->trace(addr, bytes, type); | |||
} | |||
void clean_invalidate(uint64_t addr, size_t bytes, bool clean, bool inval) | |||
{ | |||
for (std::vector<memtracer_t*>::iterator it = list.begin(); it != list.end(); ++it) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use modern C++ loop:
for (auto it: list)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The prefetch instructions should be deleted from this PR. Executing them as no-ops is a fully legitimate implementation of the extension. The proposed approach adds complexity but doesn't provide meaningful benefit.
OK. The prefetchs have been deleted
riscv/mmu.cc
Outdated
void mmu_t::cbo_zero(reg_t addr) { | ||
addr &= ~(blocksz - 1); | ||
reg_t vpn = addr >> PGSHIFT; | ||
if (likely(tlb_store_tag[vpn % TLB_ENTRIES] == vpn)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a lot of logic here (and in clean_inval() too) to perform what is basically a simple store, no? There ought to be a better way than to have to deal with all these possibilities (TLB hit/miss, tracing, triggers) over and over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is similar to simple store, However the store length is longer. So the effect of TLB hit/miss, tracing, triggers is a little different from simple store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I cannot get your idea of a better way. Current MACRO for load/store cannot fit into block size operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aswaterman any ideas? cbo_zero() and clean_inval() ought to be much simpler for what they do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scottj97 I agree.
cbo.zero
was defined in such a way that breaking it into a sequence of smaller stores is legal. IMO, we should take advantage of that property here. Could delete this routine and replace the instruction implementation with something like
auto base = RS1 & -blocksz;
for (size_t offset = 0; offset < blocksz; offset += 4)
MMU.store_uint32(base + offset, 0);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better yet, increment "offset" by 1 and simply store bytes, since that's the arch definition (and avoids assumptions about the atomicity of the individual operations).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that would get the point across better. They should be equivalent as far as Spike's concerned, though, since the difference in atomicity isn't observable--Spike's single-threaded, and 4 bytes is the smallest protection granule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I still have one problem: How about the tracer for cache_misses and write_accesses counter? Do we need to take it into acount?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this approach is OK with respect to those counters. They're part of the Spike implementation, rather than governed by the ISA, and so we can just say that the current behavior is correct by definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
Please squash the fixes back into the original commits. I.e. everything starting with d819da1 |
OK. |
56c55e5
to
0bceee0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much improved, but clean_inval() still smells funny.
riscv/mmu.cc
Outdated
void mmu_t::clean_inval(reg_t addr, bool clean, bool inval) | ||
{ | ||
reg_t paddr = addr; | ||
bool page = false, guest = false, access = false; | ||
bool has_gva; | ||
reg_t gpa; | ||
access_type type = LOAD; | ||
bool fault = true; | ||
try { | ||
paddr = translate(addr, 1, LOAD, 0); | ||
type = LOAD; | ||
fault = false; | ||
} catch (trap_load_access_fault& t) { | ||
access = true; | ||
has_gva = t.has_gva(); | ||
} catch (trap_load_page_fault& t) { | ||
page = true; | ||
has_gva = t.has_gva(); | ||
} catch (trap_load_guest_page_fault& t) { | ||
guest = true; | ||
gpa = t.get_tval2(); | ||
} | ||
|
||
if (fault) { | ||
try { | ||
paddr = translate(addr, 1, STORE, 0); | ||
type = STORE; | ||
fault = false; | ||
} catch (trap_store_access_fault& t) { | ||
access = true; | ||
has_gva = t.has_gva(); | ||
} catch (trap_store_page_fault& t) { | ||
page = true; | ||
has_gva = t.has_gva(); | ||
} catch (trap_store_guest_page_fault& t) { | ||
guest = true; | ||
gpa = t.get_tval2(); | ||
} | ||
} | ||
|
||
if (fault) { | ||
try { | ||
paddr = translate(addr, 1, FETCH, 0); | ||
type = FETCH; | ||
} catch (trap_instruction_access_fault& t) { | ||
access = true; | ||
has_gva = t.has_gva(); | ||
} catch (trap_instruction_page_fault& t) { | ||
page = true; | ||
has_gva = t.has_gva(); | ||
} catch (trap_instruction_guest_page_fault& t) { | ||
guest = true; | ||
gpa = t.get_tval2(); | ||
} | ||
} | ||
|
||
if (access) { | ||
trap_store_access_fault(has_gva, addr, 0, 0); | ||
} else if (guest) { | ||
trap_store_guest_page_fault(addr, gpa, 0); | ||
} else if (page) { | ||
trap_store_page_fault(has_gva, addr, 0, 0); | ||
} else { | ||
if (auto host_addr = sim->addr_to_mem(paddr)) { | ||
if (tracer.interested_in_range(paddr, paddr + PGSIZE, type)) | ||
tracer.clean_invalidate(paddr, blocksz, clean, inval); | ||
else | ||
refill_tlb(addr, paddr, host_addr, type); | ||
} else { | ||
throw trap_store_access_fault((proc) ? proc->state.v : false, addr, 0, 0); | ||
} | ||
|
||
if (!matched_trigger) { | ||
matched_trigger = trigger_exception(type == LOAD? OPERATION_LOAD : | ||
type == STORE? OPERATION_STORE : OPERATION_EXECUTE, | ||
addr, 0); | ||
if (matched_trigger) | ||
throw *matched_trigger; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems excessive, especially all the special handling of exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the spec, " A cache-block management instruction is permitted to access the specified cache block whenever a load
instruction, store instruction, or instruction fetch is permitted to access the corresponding physical addresses." That means it may do LOAD STORE or FETCH check. However, current translate progress can only pass one type at a time. So in above implementation it divide into three steps to check each access type:
- No matter which access type pass the translation progress, cbo instruction will execute successfully.
- If all of them trigger exceptions. it will combine the final exception result according which exception is triggered at last (birefly access fault> guest page fault > page fault)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since stores don't succeed where loads fail, it should be possible to only check for load or fetch permissions.
Furthermore, the interaction between CBOs and triggers is implementation-defined, so we're free to skip that part.
We're also under no obligation to refill the TLB. For these instructions, simple code is
And finally, we don't need to be quite so precise about what type
gets passed to tracer.interested_in_range
. Between all of those, I think this can be simplified substantially. Something like this:
void mmu_t::clean_inval(reg_t addr, bool clean, bool inval)
{
reg_t paddr;
try {
paddr = translate(addr, 1, LOAD, 0);
} catch (...) {
// Try again with fetch permissions
try {
paddr = translate(addr, 1, FETCH, 0);
} catch (trap_instruction_access_fault& t) {
trap_store_access_fault(t.has_gva(), addr, 0, 0);
} catch (trap_instruction_page_fault& t) {
trap_store_page_fault(t.has_gva(), addr, 0, 0);
} catch (trap_instruction_guest_page_fault& t) {
trap_store_guest_page_fault(addr, t.get_tval2(), 0);
} catch (...) {
abort();
}
}
if (auto host_addr = sim->addr_to_mem(paddr)) {
if (tracer.interested_in_range(paddr, paddr + PGSIZE, LOAD))
tracer.clean_invalidate(paddr, blocksz, clean, inval);
} else {
throw trap_store_access_fault((proc) ? proc->state.v : false, addr, 0, 0);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make any sense to modify translate()
so that we can pass a list of access types? So we could do:
paddr = translate(addr, 1, LOAD | FETCH | STORE, 0);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd need to see it both ways to judge. I'm worried that doing so would make the already complex page-table-walking code messier. This approach is certainly ugly, but at least it is contained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the spec, " A cache-block management instruction is permitted to access the specified cache block whenever a load instruction, store instruction, or instruction fetch is permitted to access the corresponding physical addresses." That means it may do LOAD STORE or FETCH check. However, current translate progress can only pass one type at a time. So in above implementation it divide into three steps to check each access type:
- No matter which access type pass the translation progress, cbo instruction will execute successfully.
- If all of them trigger exceptions. it will combine the final exception result according which exception is triggered at last (birefly access fault> guest page fault > page fault)
How does this work with MPRV=1 (which does not affect fetch)? Perhaps the spec is too vague here. As of 83fdb70, clean_inval()
tries a load access type, and if that detects a fault, it tries a fetch access type. Those could be two completely different paddrs if MPRV=1. That doesn't seem desirable. The spec refers only to "the corresponding physical address", i.e. only a single possible physical address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You had me stumped there for a moment, but I think everything is OK in the spec....
The translation is based on the effective address of the CBO, which, like other explicit load and store accesses, is affected by MPRV. Once the translation is performed, the permissions are checked against the RWX bits in the various mechanisms. So the translation and permission check is not performed as if the access were an instruction fetch.
But this likely highlights an issue with performing a FETCH translation by itself.... (I was wrong in thinking that was OK.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps a way to approach this is to perform two LOAD accesses, one with MXR=0 and one with MXR=1? (Though I haven't quite thought through whether that explodes with virtualization....)
Edit: or just force MXR=1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MXR doesn't affect PMP, and so that wouldn't suffice to allow CBOs to regions that have a R=0 X=1 PMP configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have modified translation progress to support check for cbo ops If virt is not a problem.
8c6acf7
to
e9ffda5
Compare
riscv/mmu.cc
Outdated
} | ||
} | ||
|
||
if (access) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two problems I see in this logic:
- If LOAD detects an access fault, it tries again with FETCH. But
access
will still be true, so this will take a store access fault, when it should successfully perform the CBO if the fetch succeeds. - This does not seem to check for STORE permissions at all? If the page is read-only, the operation should not be allowed (right?). Edit: apparently cache ops are allowed to read-only pages, so never mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. this is a logic error. I should take all of them to false if FETCH succeeds.
e9ffda5
to
099f268
Compare
riscv/mmu.cc
Outdated
@@ -258,7 +275,7 @@ reg_t mmu_t::pmp_homogeneous(reg_t addr, reg_t len) | |||
return true; | |||
} | |||
|
|||
reg_t mmu_t::s2xlate(reg_t gva, reg_t gpa, access_type type, access_type trap_type, bool virt, bool hlvx) | |||
reg_t mmu_t::s2xlate(reg_t gva, reg_t gpa, int& type, int trap_type, bool virt, bool hlvx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use a reference for type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, I see why. This smells bad but I can't think of a better way. :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to lead to a lot of confusion. Return an aggregate (e.g. struct { reg_t addr; int type; };
) instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I'll try this way.
@dkruckemyer-ventana tells me that the perms checking for CBOs might be relaxed so that merely checking for load permissions is a correct implementation. I'll let him confirm or deny. That would greatly simplify things: we could leave the translation functions as-is, and simply check for load perms, catch load exceptions, and rethrow as store exceptions. |
I think that change is highly likely. Let's see what things look like with Andrew's suggestions. |
099f268
to
020e955
Compare
Sorry. do you mean the spec will change as this way?
|
Yes, barring any last minute objections, load (read) permission is the only requirement for management CBOs. There's a PR with the updated text: |
020e955
to
79d5177
Compare
OK. I have reverted the modification for translate function and update the clean_inval function. |
de8b0dc
to
0923f4e
Compare
riscv/cachesim.cc
Outdated
if (likely(hit_way != NULL)) | ||
{ | ||
if (*hit_way & DIRTY) | ||
writebacks++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what "writebacks" counts, but an actual writeback is only required if a clean is required. For inval only, a proper writeback of data would not need to occur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I'll try to fix this.
0923f4e
to
f8e8359
Compare
f8e8359
to
6ec31f6
Compare
6ec31f6
to
2a85517
Compare
prefetch.* are hints and share the encoding of ORI with rd = 0. so it can share the implementation of ORI and execute as no-ops
2a85517
to
0531acb
Compare
Thank you for your patience with all my nitpicking. |
Thanks for both your effort and your patience, @liweiwei90! |
@@ -72,6 +72,7 @@ static void help(int exit_code = 1) | |||
fprintf(stderr, " --dm-no-abstract-csr Debug module won't support abstract to authenticate\n"); | |||
fprintf(stderr, " --dm-no-halt-groups Debug module won't support halt groups\n"); | |||
fprintf(stderr, " --dm-no-impebreak Debug module won't support implicit ebreak in program buffer\n"); | |||
fprintf(stderr, " --blocksz=<size> Cache block size (B) for CMO operations(powers of 2) [default 16]\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default 16 or 64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest a default of 64B - based on that being the most common block size in the industry today (in x86 and ARMv8 architectures).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, @ingallsj please make a PR to fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add support for CMO v1.0: