OccupyMars2025/xv6-labs-2023

2023/11/9 - 12/2: round 1, failure

2024/2/2 - 4/18: round 2

===============================

TODO: study the test code of each lab, study the Makefile of each lab, complete the "Optional Challenge" part of each lab, re-study the "net" lab

  • (4 days)(2024/2/2 7:30 - 2/6 14:10) sep 6: Introduction and examples
  • (3 hours)(2024/2/6 14:10 - 17:05) sep 11: C in xv6, slides, and examples
  • (7 days)(2024/2/6 17:05 - 2/13 13:58) sep 13: OS design
    • (3 days)(2024/2/6 17:05 - 2/9 18:00) sep 13: OS design, chapter 2
    • (2 days)(2024/2/9 18:00 - 2/11 15:40) sep 13: OS design, lecture
    • (2024/2/11 15:40 - ) sep 13: OS design, Assignment: Lab syscall: System calls
      • (3h 23min)Using gdb (easy): 2024/2/12 13:00 - 16:23
      • (2h 10min)System call tracing (moderate): 2024/2/13 8:30 - 10:40
      • (48 min)Sysinfo (moderate): 2/13 13:10 - 13:58
  • (4 days)(2/13 13:58 - 2/17 10:10) sep 18: page tables
  • (3 days)(2/17 10:10 - 2/20 15:20) sep 20: GDB Calling conventions
    • (2 days)(2/17 10:10 - 2/19 10:30) complete reading material
    • (1 day)(2/19 10:30 - 2/20 15:20) Lab pgtbl: Page tables
      • (2h 35min)(2/20 6:25 - 9:00) Speed up system calls (easy)
      • (1h 35min)(2/20 9:00 - 10:35) Print a page table (easy)
      • (2h 40min)(2/20 12:40 - 15:20) Detect which pages have been accessed (hard)
  • (2 days 25min)(2024/2/20 15:20 - 2/22 15:45) sep 25: LEC 6: System call entry/exit
  • (2 days 2h)(2024/2/22 15:45 - 2/24 17:30) sep 27: LEC 7: Page faults
    • (9h)(2024/2/22 15:45 - 2/23 0:24) Read Section 4.6, 4.7
    • (7h 30min)(2/23 7:20 - 14:50) slides
    • (2h)(2/23 14:50 - 16:50) https://pdos.csail.mit.edu/6.1810/2023/lec/l-pgfaults.txt
    • (1 day)(2/23 18:00 - 2/24 17:30) Lab: traps
      • (3h)(2/23 19:00 - 22:00) RISC-V assembly (easy)
      • (3h 55min)(2/24 7:40 - 11:20, 11:44 - 11:59) Backtrace (moderate)
      • (4h 20min)(2/24 13:10 - 17:30) Alarm (hard)
  • (23h)(2024/2/24 17:30 - 2/25 16:30) oct 2: LEC 8: Q&A labs (slides)
    • (4h 50min)(2/24 20:00 - 22:20, 2/25 9:30 - 12:00) slides
    • (3h 10min)(2/25 13:20 - 16:30) LEC 8: Q&A labs
  • (2024/2/25 16:30 - 3/4 15:20) oct 4: LEC 9: Device drivers
    • (1 day 8h)(2/26 8:30 - 2/27 16:30) Read Chapter 5
    • (2/27 19:00 - 2/29 15:50) Lecture 9: Device drivers
    • (2/29 15:50 - 3/4 15:20) Lab: Copy-on-Write Fork for xv6
      • (2/29 22:40) bug: #9
      • (3/4 7:00 - 15:20) complete this lab successfully

========== why so slow ==================

  • (3/4 15:20 - 3/16 7:30) oct 11: LEC 10: Locking
    • (3/4 17:00 - 3/15 18:00) read chapter 6: Locking
    • (3/15 18:00 - 3/16 7:30) lecture, Locking

============================

  • (3/16 7:30 - 3/20 9:40) oct 16, LEC 11 (rtm): Scheduling 1
    • (3/16 7:30 - 3/19 18:00) Read "Chapter 7: Scheduling"
    • (3/19 18:00 - 3/20 9:40) lecture 11 (rtm): Scheduling 1
  • (3/20 9:40 - 3/21 20:40) LEC 12 (rtm): Coordination, code
    • (3/20 9:40 - 19:40) Lecture 12: Coordination (sleep&wakeup)
    • (3/20 19:40 - 3/21 20:40) Lab: Multithreading
      • (3/20 19:40 - 3/21 15:40) Uthread: switching between threads (moderate)
      • (3/21 15:40 - 19:40) Using threads (moderate)
      • (3/21 19:40 - 20:40) Barrier (moderate)

============================

Don't worry about the details at first; just get a feel for how the manual is structured so you can find things later. The E1000 has many advanced features, most of which you can ignore. Only a small set of basic features is needed to complete this lab.

Never read a manual like a textbook. That is not how a manual is meant to be used.

  • (3/21 20:40 - 3/24 17:20) oct 25, Assignment: Lab net: Network driver
    • (3/21 20:40 - 3/23 18:20) read Sections 1, 2, and 3.2 of the manual like a textbook
    • (3/23 18:20 - 3/24 17:20) read the source code, complete the lab

Reading the source code is much more enjoyable than reading the manual

============================

  • (3/24 17:20 - 3/30 7:30) nov 1, LEC 13: File systems
    • (3/24 17:20 - 3/29 18:00) read Chapter 8: File system
    • (3/29 18:00 - 3/30 7:30) lecture
  • (3/30 7:30 - 4/2 18:50) nov 6, LEC 14: Crash recovery
  • (4/2 18:50 - 4/12 14:20) nov 8, LEC 15: File system performance and fast crash recovery
    • (4/2 18:50 - 4/4 9:10) read the paper
    • (4/4 9:10 - 16:20) lecture
    • (4/4 16:20 - 4/12 14:20) Lab lock: Parallelism/locking
      • (4/4 16:20 - 4/5 23:50) Memory allocator (moderate)
      • (4/6 7:30 - 4/12 14:20) Buffer cache (hard): spent most of the time getting familiar with the file system source code, one day writing the code, and half a day debugging

(4/6 7:30 - 4/8 11:10) By stepping through bcachetest --> test0() --> sys_mkdir() --> create() in the debugger, I finally got familiar with the file system after the 3rd round of debugging and built a reasonably accurate model of it in my head. It felt like a sudden flash of inspiration after many repetitions. Whoo! I can't believe how much work it took just to understand the source code of the file system. (3/24 - 4/8)

(4/8 11:10 - 4/10 18:00) After debugging bcachetest:test0() step by step, I can now read the file system source code smoothly, without confusion. I'd call it a feat! When I started studying the file system source code, I was so frustrated that I didn't think I could make it. Genius is 1% inspiration and 99% perspiration. Without the 99% perspiration you won't get the 1% inspiration, because that is simply how the human brain works.

  • (4/12 14:20 - 4/17 11:30) nov 13, LEC 16: Virtual memory for applications
  • (9/3 21:50 - ) nov 15, LEC 17 (rtm): OS Organization
    • (9/3 21:50 - ) Preparation: Read The Performance of micro-Kernel-Based Systems (1997)

===========================

The operating system must make sure to provide adequate support for what the hardware is capable of.

Q: How to read the source code ?

A: Go to each branch, read the commits by OccupyMars2025

=====================

It is easier to look at traps with gdb if you tell qemu to use only one CPU, which you can do by running make CPUS=1 qemu-gdb. Then, in gdb, write a new value to $sstatus with the SIE bit (bit 1) cleared. This disables interrupts, especially the timer interrupt (refer to intr_off() in kernel/riscv.h).

step 1: make clean

step 2: make CPUS=1 qemu-gdb

step 3: disable all interrupts by clearing the SIE bit (bit 1) of sstatus

(gdb) p/x $sstatus
(gdb) set $sstatus = $sstatus & ~0x2

step 4: run (gdb) info threads and check that only one thread is listed
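
The "new value" in step 3 is just the old sstatus with one bit cleared. Here is a minimal sketch of that arithmetic in plain C, assuming SIE is bit 1 of sstatus as in the RISC-V privileged spec (this mirrors what intr_off() does in xv6; the function names sstatus_intr_off and sstatus_intr_get are made up for this note):

```c
#include <assert.h>
#include <stdint.h>

// Supervisor Interrupt Enable bit of sstatus (bit 1).
#define SSTATUS_SIE ((uint64_t)1 << 1)

static_assert(SSTATUS_SIE == 0x2, "SIE is expected to be bit 1 of sstatus");

// Hypothetical helper: the value to write back to sstatus so that
// supervisor-mode interrupts are disabled. All other bits are kept.
uint64_t sstatus_intr_off(uint64_t sstatus)
{
  return sstatus & ~SSTATUS_SIE;
}

// Hypothetical helper: 1 if the given sstatus value has SIE set, else 0.
int sstatus_intr_get(uint64_t sstatus)
{
  return (sstatus & SSTATUS_SIE) != 0;
}
```

So if p/x $sstatus prints a value ending in 0x22, the value to write back ends in 0x20. In gdb the same thing can probably be done in one step with set $sstatus = $sstatus & ~0x2.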

You will need a RISC-V "newlib" tool chain from
https://github.com/riscv/riscv-gnu-toolchain, and qemu compiled for
riscv64-softmmu.  Once they are installed, and in your shell
search path, you can run "make qemu".
This issue may be helpful:
https://github.com/OccupyMars2025/xv6-labs-2023/issues/10

===================

In each lab, don't edit the file "README". It is needed by commands such as "trace 32 grep hello README". If you want to keep notes, add a separate file such as "README.md" instead.

BUILDING AND RUNNING XV6


Ctrl-a c : "Rotate between the frontends", so just use "Ctrl-a c" twice to switch back to the original frontend

useful commands:

xv6 has no ps command, but, if you type Ctrl-p, the kernel will print information about each process. If you try it now, you'll see two lines: one for init, and one for sh.

To quit qemu type: Ctrl-a x (press Ctrl and a at the same time, followed by x).

riscv64-linux-gnu-objdump -S user/_cat
  same as user/cat.asm
0x0: cat
  what if we run two cat programs at the same time?
  see pgtbl lecture
0x8e: _main
  user.ld:
    entry: _main
what is _main?
  defined in ulib.c, which calls main() and exit(0)
where is data memory? (e.g., buf)
  in data/bss segment
  must be setup by kernel
but we know address where buf should be
  riscv64-linux-gnu-nm -n user/_cat

Use GDB to help you debug! I know that using GDB is really annoying in the beginning but it is super super helpful in the later labs and we want you all to know the basic commands to make debugging less painful in the future.

It is worth your time to revisit the following tutorials when tracking down kernel bugs.

=======================

https://pdos.csail.mit.edu/6.1810/2023/labs/traps.html

It will be easier to look at traps with gdb if you tell qemu to use only one CPU, which you can do by running
    make CPUS=1 qemu-gdb

=======================

add the following functions to help debug:

gdb has a built-in "backtrace" command, so you don't need to add one for use inside gdb

Once your backtrace is working, call it from panic in kernel/printf.c so that you see the kernel's backtrace when it panics.

    backtrace:
    0x0000000080002cda
    0x0000000080002bb6
    0x0000000080002898
  
After bttest, exit qemu. In a terminal window, run addr2line -e kernel/kernel (or riscv64-unknown-elf-addr2line -e kernel/kernel) and cut-and-paste the addresses from your backtrace, like this:
    $ addr2line -e kernel/kernel
    0x0000000080002de2
    0x0000000080002f4a
    0x0000000080002bfc
    Ctrl-D
  
You should see something like this:
    kernel/sysproc.c:74
    kernel/syscall.c:224
    kernel/trap.c:85
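
The list of addresses comes from walking the saved frame pointers on the kernel stack. Below is a user-space sketch of that walk, assuming the frame layout described in the traps lab hints (return address at fp-8, caller's frame pointer at fp-16, all frames inside one stack page). It is not the lab solution: walk_frames, push_frame, demo_backtrace, and the fake one-page stack are all made up for illustration, and the three return addresses are simply copied from the example output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096
#define PGROUNDDOWN(a) ((a) & ~(uintptr_t)(PGSIZE - 1))

// Follow saved frame pointers starting at fp and store up to max
// return addresses in out. Stops when fp leaves the starting page.
size_t walk_frames(uintptr_t fp, uint64_t *out, size_t max)
{
  assert(out != NULL);
  uintptr_t page = PGROUNDDOWN(fp);  // assumed: the whole stack is one page
  size_t n = 0;
  while (n < max && PGROUNDDOWN(fp) == page && fp - page >= 16) {
    out[n++] = *(const uint64_t *)(fp - 8);        // saved return address
    fp = (uintptr_t)*(const uint64_t *)(fp - 16);  // caller's frame pointer
  }
  return n;
}

// A fake, page-aligned "kernel stack" so the walk can run on any host.
static _Alignas(PGSIZE) uint64_t fake_stack[PGSIZE / sizeof(uint64_t)];

// Write one fake frame: return address and caller's fp just below fp.
static void push_frame(uintptr_t fp, uint64_t ra, uintptr_t caller_fp)
{
  *(uint64_t *)(fp - 8) = ra;
  *(uint64_t *)(fp - 16) = (uint64_t)caller_fp;
}

// Build three nested frames and walk them from the innermost one.
size_t demo_backtrace(uint64_t *out, size_t max)
{
  uintptr_t top = (uintptr_t)fake_stack + PGSIZE;  // outermost fp: top of page
  push_frame(top - 32, 0x0000000080002898, top);
  push_frame(top - 64, 0x0000000080002bb6, top - 32);
  push_frame(top - 96, 0x0000000080002cda, top - 64);
  return walk_frames(top - 96, out, max);
}
```

The walk ends when the frame pointer reaches the top of the page, because rounding it down no longer gives the page the walk started in. In the kernel the starting fp would come from the s0 register rather than from a fake array.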

=======================

when you enter "make qemu-gdb", you can see the following information about the file system:

nmeta 46 (boot, super, log blocks 30 inode blocks 13, bitmap blocks 1) blocks 1954 total 2000
balloc: first 915 blocks have been allocated
balloc: write bitmap block at sector 45
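
The numbers in this output can be checked by hand: 1 boot + 1 super + 30 log + 13 inode + 1 bitmap = 46 metadata blocks, 2000 - 46 = 1954 data blocks, and the bitmap block sits right after the inode blocks at 2 + 30 + 13 = 45. Here is a small sketch of that arithmetic, assuming 1024-byte blocks, a 64-byte on-disk inode, and file system size 2000, log size 30, and 200 inodes. Those values are assumptions chosen because they reproduce the printed numbers, so check kernel/param.h and mkfs/mkfs.c in your branch. struct fs_layout and compute_layout are made-up names.

```c
#include <assert.h>
#include <stdint.h>

#define BSIZE 1024                 // assumed bytes per block
#define DINODE_SIZE 64             // assumed size of one on-disk inode
#define IPB (BSIZE / DINODE_SIZE)  // inodes per block
#define BPB (BSIZE * 8)            // bitmap bits per block

// Hypothetical summary of where each on-disk region starts and how big it is.
struct fs_layout {
  uint32_t nlog, ninodeblocks, nbitmap;
  uint32_t nmeta;                          // boot + super + log + inode + bitmap
  uint32_t ndata;                          // blocks left for file data
  uint32_t logstart, inodestart, bmapstart;
};

struct fs_layout compute_layout(uint32_t fssize, uint32_t logsize, uint32_t ninodes)
{
  struct fs_layout l;
  l.nlog = logsize;
  l.ninodeblocks = ninodes / IPB + 1;
  l.nbitmap = fssize / BPB + 1;
  l.nmeta = 2 + l.nlog + l.ninodeblocks + l.nbitmap;  // 2 = boot block + superblock
  assert(l.nmeta <= fssize);
  l.ndata = fssize - l.nmeta;
  l.logstart = 2;                                     // right after boot and super
  l.inodestart = l.logstart + l.nlog;
  l.bmapstart = l.inodestart + l.ninodeblocks;
  return l;
}
```

With compute_layout(2000, 30, 200) the fields come out as nmeta 46, ndata 1954, and bmapstart 45, which matches the three lines above. The "first 915 blocks" figure also depends on how many data blocks the files in fs.img use, so it cannot be derived from these constants alone.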

Caution: if the interrupts (particularly the timer interrupt) are NOT disabled, then when you debug, the "next" command may take you to some strange location

very useful: use "(gdb) watch cons" and "(gdb) watch uart_tx_buf" to see the changes in "cons" and "uart_tx_buf", so you can see how your keystrokes, the console, and the UART interact with each other. The call path that reads the characters typed in the console is "user/sh.c" -> main() -> getcmd() -> gets() -> read() -> sys_read() -> fileread() -> consoleread().

https://pdos.csail.mit.edu/6.1810/2023/lec/l-internal.txt
use gdb to check system call entry/exit

$ make clean
$ make qemu-gdb
(you can see "-gdb tcp::26000" in the last line of the output)

(now open a new terminal)
$ riscv64-unknown-elf-gdb
(gdb) target remote localhost:26000
(gdb) file user/_sh
(gdb) break getcmd
(gdb) c
(gdb) info breakpoints
(gdb) layout split 
(gdb) b write
(gdb) x/3i 0xe10
(gdb) p $a0
(gdb) p (char*)$a1
$11 = 0x1310 "$ "
(gdb) x/2c $a1
0x1310:	36 '$'	32 ' '
(gdb) p $a2

(use "si" to step over "ecall", and you can see that you land in "uservec")

(gdb) file kernel/kernel
(gdb) break usertrap
(gdb) c

can we tell that we're in supervisor mode?
  I don't know a way to find the mode directly.
but once you execute "ecall" and get to "uservec", you can see the following,
but actually at this time "satp" still points to the user page table
(gdb) p (char*)$a1
$11 = 0x1310 <error: Cannot access memory at address 0x1310>
(gdb) x/2c $a1
0x1310:	Cannot access memory at address 0x1310
(gdb) 


building xv6
  % make 
  gcc on each kernel/*.c, .o files, linker, kernel/kernel
  % ls -l kernel/kernel
  % more kernel/kernel.asm
  and produces a disk image containing file system
  % ls -l fs.img

qemu
  % make qemu
  qemu, loads kernel binary into "memory", simulates a disk with fs.img
  jumps to kernel's first instruction
  qemu maintains mock hardware registers and RAM, interprets instructions

I'll walk through xv6 booting up, to first process making first system call

% make CPUS=1 qemu-gdb
% riscv64-unknown-elf-gdb
(gdb) b *0x80000000
(gdb) c
kernel is loaded at 0x80000000 b/c that's where RAM starts
  lower addresses are device hardware
% vi kernel/entry.S
"m mode"
set up stack for C function calls
jump to start, which is C code

% vi start.c
  sets up hardware for interrupts &c
  changes to supervisor mode
  jumps to main

(gdb) b main
(gdb) c
(gdb) tui enable

main()
  core 0 sets up a lot of software / hardware
  other cores wait
  "next" through first kernel printfs

let's glance at an example of initialization -- kernel memory allocator
(gdb) step -- into kinit()
(gdb) step -- into freerange()
(gdb) step -- into kfree()
% vi kernel/kalloc.c
kinit/freerange find all pages of physical RAM
  make a list from them
  threaded through the first 8 bytes (64 bits) of each page
  [diagram]
  struct run
  the cast in kfree()
  and the list insert
  a simple allocator, only 4096-byte units, for e.g. user memory
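
The free list described above can be tried out in user space. This is only a sketch of the idea (struct run stored inside each free page, the cast in kfree, the insert at the head of the list), not a copy of kernel/kalloc.c: there is no locking and no junk fill, a small static arena stands in for physical RAM, and the names ending in _sketch plus count_free are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096

// A free page stores the link to the next free page in its own first bytes.
struct run {
  struct run *next;
};

// One page of the fake "physical RAM"; the union lets a page be viewed
// either as raw bytes or as a list node.
union page {
  struct run run;
  unsigned char bytes[PGSIZE];
};

static _Alignas(PGSIZE) union page arena[4];
static struct run *freelist;

// Put a page on the free list (the cast plus the list insert).
void kfree_sketch(void *pa)
{
  assert((uintptr_t)pa % PGSIZE == 0);  // only whole, aligned pages
  struct run *r = (struct run *)pa;
  r->next = freelist;
  freelist = r;
}

// Take one page off the free list, or return NULL if none is left.
void *kalloc_sketch(void)
{
  struct run *r = freelist;
  if (r)
    freelist = r->next;
  return r;
}

// Hand every page of the arena to the allocator, like kinit/freerange.
void freerange_sketch(void)
{
  freelist = NULL;
  for (size_t i = 0; i < sizeof(arena) / sizeof(arena[0]); i++)
    kfree_sketch(&arena[i]);
}

// Count the pages currently on the free list.
size_t count_free(void)
{
  size_t n = 0;
  for (struct run *r = freelist; r; r = r->next)
    n++;
  return n;
}
```

Because the list is a stack, the page that was freed last is the one handed out next, and the allocator needs no memory of its own beyond the freelist pointer.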

how to get processes going?
  our goal is to get the first C user-level program running
    called init (see user/init.c)
    init starts up everything else (just console sh on xv6)
  need:
    struct proc
    user memory
    instruction bytes in user memory
    user registers, at least sp and epc
  main() does this by calling userinit()

(gdb) b userinit
(gdb) continue

% vi kernel/proc.c
allocproc()
  struct proc
  p->pagetable

back to userinit()

% vi user/initcode.S
exec("/init", ...)
ecall
a7, SYS_exec
% vi kernel/syscall.h
note SYS_exec is number 7

back to userinit()

epc -- where process will start in *user* space
and sp
p->state = RUNNABLE

(gdb) b *0x0
(gdb) c
(gdb) tui disable
(gdb) x/10i 0

what's the effect of ecall?
(gdb) b syscall
(gdb) c
back in the kernel
(gdb) tui enable
(gdb) n
(gdb) n
(gdb) n
(gdb) print num
      from saved user register a7
(gdb) print syscalls[7]
(gdb) b exec
(gdb) c
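
What "print num" and "print syscalls[7]" show is a table lookup: the number saved from user register a7 indexes an array of function pointers, and the result goes back to the user in a0. Here is a minimal stand-alone sketch of that dispatch. Only SYS_exec = 7 is taken from the notes above. The cut-down struct trapframe_sketch, the stub sys_exec_stub with its return value 123, and the name syscall_sketch are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYS_exec 7  // from kernel/syscall.h, as noted above
#define NELEM(x) (sizeof(x) / sizeof((x)[0]))

// Hypothetical cut-down trapframe: only the two registers used here.
struct trapframe_sketch {
  uint64_t a0;  // first argument in, return value out
  uint64_t a7;  // system call number
};

// Hypothetical stand-in for the real sys_exec; returns a marker value.
static uint64_t sys_exec_stub(void)
{
  return 123;
}

// Table mapping system call numbers to handlers; unused slots are NULL.
static uint64_t (*syscalls[])(void) = {
  [SYS_exec] = sys_exec_stub,
};

// Look up the handler for the number in a7 and store its result in a0.
void syscall_sketch(struct trapframe_sketch *tf)
{
  assert(tf != NULL);
  uint64_t num = tf->a7;
  if (num > 0 && num < NELEM(syscalls) && syscalls[num])
    tf->a0 = syscalls[num]();
  else
    tf->a0 = (uint64_t)-1;  // unknown system call
}
```

The range check matters because a7 is set by user code, so the kernel cannot trust it to be a valid index.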

% vi kernel/exec.c
  a complex system call
  read file from disk
  "ELF" format
  text, data
  defensive, lots of checks
  don't be tricked into overwriting kernel memory!
  allocate stack
  write arguments onto stack
  epc = 
  sp = 

(gdb) c

% vi user/init.c
  top-level process
  console file descriptors, 0 and 1
  sh

If you want to see what assembly code the compiler generates for the xv6 kernel or find out what instruction is at a particular kernel address, see the file kernel/kernel.asm, which the Makefile produces when it compiles the kernel. (The Makefile also produces .asm for all user programs.)

The user-space "stubs" that route system calls into the kernel are in user/usys.S, which is generated by user/usys.pl when you run make. 
Declarations are in user/user.h
The kernel-space code that routes a system call to the kernel function that implements it is in kernel/syscall.c and kernel/syscall.h.
Process-related code is kernel/proc.h and kernel/proc.c.

trap: system call, exception, interrupt

A trap may occur while executing in user space if the user program makes a system call (ecall instruction), or does something illegal, or if a device interrupts. The high-level path of a trap from user space is uservec (kernel/trampoline.S:21), then usertrap (kernel/trap.c:37); and when returning, usertrapret (kernel/trap.c:90) and then userret (kernel/trampoline.S:101).
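
The three kinds of trap can be told apart from the scause register. Here is a rough sketch of that classification, assuming the RISC-V encoding in which the top bit of scause marks an interrupt and cause 8 is an ecall from user mode. The enum and classify_trap are made-up names, and the real usertrap() is structured differently (it checks for cause 8 first and then asks devintr() about device interrupts).

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical labels for the three kinds of trap.
enum trap_kind { TRAP_SYSCALL, TRAP_INTERRUPT, TRAP_EXCEPTION };

#define SCAUSE_INTERRUPT ((uint64_t)1 << 63)  // top bit set: interrupt
#define SCAUSE_ECALL_U 8                      // environment call from U-mode

static_assert(SCAUSE_INTERRUPT == 0x8000000000000000ULL,
              "interrupt flag is expected to be the top bit of a 64-bit scause");

// Classify a trap from user space by its scause value.
enum trap_kind classify_trap(uint64_t scause)
{
  if (scause & SCAUSE_INTERRUPT)
    return TRAP_INTERRUPT;   // e.g. a device or timer interrupt
  if (scause == SCAUSE_ECALL_U)
    return TRAP_SYSCALL;     // user program executed ecall
  return TRAP_EXCEPTION;     // e.g. a page fault or illegal instruction
}
```

For example, scause 13 (load page fault) and 15 (store page fault) both land in the exception case, which is where the page-fault labs hook in.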

=======================

TODO:

study the source code of Makefile, grade-lab-* and other auxiliary files in each lab

all "Optional challenge exercises" of the labs are NOT done

all exercises on book-xv6-riscv-rev3.pdf are NOT done

Key idea: Combination of page faults and updating page table is powerful!

Can you hack xv6-riscv ?

Instead of gcc and gdb, can you use LLVM tools to compile and debug xv6-riscv ?

The RISC-V Reader: An Open Architecture Atlas

In "oct 2:LEC 8: Q&A labs (slides)", there is some discussion about how Linux implements the features that you implement in xv6.

=======================

hint: