Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Traps support #29

Closed
wants to merge 47 commits into from
Closed

Traps support #29

wants to merge 47 commits into from

Conversation

fesqvw
Copy link
Collaborator

@fesqvw fesqvw commented Mar 11, 2024

No description provided.

Copy link
Owner

@CharlyCst CharlyCst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote a few comments, I think the main problem so far is that you are mixing physical and virtual CSRs, in particular you are saving physical CSRs into the virtual CSRs when trying to save the context, but you don't need to do that because we already virtualize all writes to virtual registers, so they contain the last value written. For the CSR who are updated by the hardware, you do need to write some emulation code yourself though, and you already started doing that so that part looks all good :)

src/main.rs Outdated
cause @ _ => {
// Save csr registers
ctx.csr.mcause = Arch::read_mcause_raw();
ctx.csr.mepc = Arch::read_mepc();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to save mepc, it is already saved in ctx.pc when the payload exits. What you probably want to do here is to save the PC at which the trap occured into mepc, so basically ctx.csr.mepc = ctx.pc.

Think about what would happen here if a trap occurred in Mirage itself since the trap from the payload, for instance a timer tick once we activate them. Then the physical mepc would point somewhere in Mirage, and you would configure the payload to execute some Mirage code!

)
}

unsafe fn write_mepc(mepc: usize) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You won't need to write mepc manually, it is written when entering the firmware (the assembly trampoline loads the value from ctx.pc into mepc.

src/main.rs Outdated
ctx.csr.mepc = Arch::read_mepc();
ctx.csr.mtval = Arch::read_mtval();
ctx.csr.mstatus = Arch::read_mstatus();
ctx.csr.mtinst = Arch::read_mtinst();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful for those three registers (mtval, mstatus, and mtinst), you are saving Mirage's registers into the payload context, that is probably not what you intended to do ^^

When you do Arch::read_xxx you are reading physical registers, but the payload operates on virtual registers. We already saved the virtual value of mtval when the payload configured it for instance, so there is nothing more to do for mtval.
It is true that you might need to update some of the virtual registers, like mstatusI suppose, but you will only need to update a few bits I assume. To perform a proper emulation you should try to figure out which bits should be updated (Neelu might be able to help :) ) and perform those updates manually. It's possible some updates will require reading stuff from the physical registers, but only some bits.

src/main.rs Outdated
@@ -95,7 +95,16 @@ fn handle_trap(ctx: &mut VirtContext, max_exit: Option<usize>) {
// Skip to next instruction
ctx.pc += 4;
}
_ => (), // Continue
cause @ _ => {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you will catch all mcause, and a lot of them we don't know how to handle right now.
For this kind of projects we need to have a philosophy of correctness over completeness, which means that should strive to have a correct implementation at all time, event if it is not complete. So we should narrow down the scope, for instance let's handle only breakpoint traps, and be sure that we do so correctly.

Ed will tell you the story at some point, but the original VMWare hypervisor was definitely not complete, there were multiple hundreds of TODOs throughout the code, event at like version 5. But, everything that was implemented was correct, and that's why it worked so well.

The real problem here is that bugs will become really hard to debug if we start allowing incorrect emulation, and we might loose track of what is implemented and what is not. Bottom line is that we should panic on everything we did not intentionally handle properly :)

In that specific case, match a single trap and we will focus on making that one work before allowing more.

src/main.rs Show resolved Hide resolved
Copy link
Owner

@CharlyCst CharlyCst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the good work!
Now that you spent some time exploring and that I had time to think a bit about it, I think we should think about doing a bit of refactoring to better handle those traps. In particular we should probably save the info about the exit just after the exit happens, and then using those we can properly emulate the trapping behavior on virtual registers.

We can discuss that a bit if you want :)

src/virt.rs Outdated
Csr::Dscratch0 => todo!(),
Csr::Dscratch1 => todo!(),
Csr::Mepc => {
if value > usize::MAX {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usize::MAX doesn't return the maximum valid address, just the maximum number we can fit in a usize. If you need to check the maximum supported address, we should add it in the Platform, because that will depend on the specific board.

src/main.rs Outdated
_ => todo!("Instruction not yet implemented: {:?}", instr),
}
}

fn payload_trap_handler(ctx: &mut VirtContext) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose that we pass the MCause as argument here, that way we won't have to read it from the hardware registers (because the hardware registers might contain an other fault at that point).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking about it, I am thinking that we should have more info that just the. MCause, we would also need the mtval and mstatus at the time of exit.
So maybe we should create a new type like TrapInfo that contains all the values we need, and return it from the enter_firmware function. That way you don't need to read hardware registers anymore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be a solution to the problem. Once we have the values, emulation should be much easier.

I'll take a look on how to do that as soon as I can.

src/main.rs Outdated

// TODO : this should be done in the context switch assembly : it's ok for now because those registers are not modified by any code before
//mcause
ctx.csr.mcause = Arch::read_mcause_raw();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the fault we should inject might not be the cause in the physical register.
If we take the cause as argument we could use that here.

src/main.rs Outdated
//mcause
ctx.csr.mcause = Arch::read_mcause_raw();
//mstatus
ctx.csr.mstatus = Arch::read_mstatus();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you want to update from mstatus? The mstatus of Mirage will probably contain different values than what we expect in the payload's mstatus.

src/main.rs Outdated
//mtval
ctx.csr.mtval = Arch::read_mtval();
//mip
ctx.csr.mip = Arch::read_mip();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to think about how we emulate interrupts. For now maybe let's assume we have no interrupts at all (so no interrupt pending) and write all 0.

src/main.rs Outdated
//mstatus
ctx.csr.mstatus = Arch::read_mstatus();
//mtval
ctx.csr.mtval = Arch::read_mtval();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mtval depends on the exception, right? We will probably also need to configure that manually by matching on the trap cause.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but like mcause and other csrs it is written by the HW (at least Neelu said so).
It should go into TrapInfo given by the enter_firmware function

payloads/breakpoint/src/main.rs Show resolved Hide resolved
@@ -332,8 +376,18 @@ _raw_trap_handler:
sd x30,(8+8*30)(x31)
csrr x30, mscratch // Restore x31 into x30 from mscratch
sd x30,(8+8*31)(x31) // Save x31 (whose value is stored in x30)
csrr x30, mepc // Read payload PC
sd x30, (8+8*32)(x31) // Save the PC
Copy link
Owner

@CharlyCst CharlyCst Mar 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed the assembly to save the PC, but we still restore the context from the PC when jumping back into the firmware. Isn't that a bug, or maybe I am missing something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pc is set later on when we are sure the trap is coming from the payload and not mirage. If the trap was from mirage we would not want to put that mepc into the virtual context.

src/virt.rs Outdated

impl TrapInfo {
/// Whether the trap comes from M mode
pub fn from_mmode(self) -> bool {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you want to use &self instead of self.

  • If you use self the struct TrapInfo is passed by value to the function, that means that either the struct must be moved, or it must be copied (and you need the Copy trait indeed).
  • If you use &self the value is passed by reference to the method, so from_mmode will simply receive a pointer instead of a copy of the whole struct. This is what we want here, we simply inspect a few fields.

In general, you can use self for small sturcts, for instance when the structs are simply one or two integers, on in the special case when you want to move the struct (that can be useful in some case, such as with the builder pattern)
For all other cases, use either &self or &mut self depending if you need to modify the struct or not.

Same comment for the other method :)

@CharlyCst CharlyCst mentioned this pull request Mar 21, 2024
@CharlyCst
Copy link
Owner

Rebased and merged as part of #32.

@CharlyCst CharlyCst closed this Mar 22, 2024
@CharlyCst CharlyCst deleted the traps branch April 1, 2024 11:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants