6.1810 Lab - Traps

Posted on 2025-06-18 in Operating Systems

RISC-V assembly

Which registers contain arguments to functions? For example, which register holds 13 in main's call to printf?

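Under the RVG calling convention, RISC-V passes as many arguments as possible in registers: a0 - a7 for integer arguments and fa0 - fa7 for floating-point arguments. Any remaining arguments are pushed onto the stack.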

In printf("%d %d\n", f(8) + 1, 13), the value 13 is held in register a2.

The RVG calling convention can be verified in Compiler Explorer.

Where is the call to function f in the assembly code for main? Where is the call to g? (Hint: the compiler may inline functions.)

The call to f in main has been inlined by the compiler: f(8) + 1 -> g(8) + 1 -> 8 + 3 + 1 = 12, so the compiler simply loads the immediate 12 into register a1.

The call to g has been inlined as well: there is no jump to g; instead the return value x + 3 is written directly into register a0.

At what address is the function printf located?

printf is located at 0x6e2.

What value is in the register ra just after the jalr to printf in main?

My compiler emitted a jal instruction rather than jalr, but the reasoning is the same.

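The instruction 0x6aa000ef has opcode 0x6f, which is jal; its format is jal rd, offset. Here rd is 00001, i.e. x1, the ra register, so the return address is written to ra. The instruction sits at 0x38, so after the jal executes ra should hold 0x38 + 4 = 0x3c, the address of the next instruction.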

GDB confirms this guess: after the jump into printf, ra holds 0x3c.

Run the following code.
unsigned int i = 0x00646c72;
printf("H%x Wo%s", 57616, (char *) &i);
What is the output?

If the RISC-V were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?

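The output is He110 World. %x prints 57616 in hexadecimal, which is e110 (0xE110; %x uses lowercase digits). %s prints rld, i.e. "rld\0", the bytes 0x72 0x6c 0x64 0x00. The least significant byte of i sits at the lowest address, which is little-endian storage. On a big-endian machine i would have to be set to 0x726c6400, but 57616 would not need to change, because byte order does not affect the result of %x.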

In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?

printf("x=%d y=%d", 3);

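On my machine this fails at compile time; the compiler is presumably new enough to check the number of variadic arguments against the format string. If it did compile, my guess is that the value depends on the implementation of va_arg: xv6's printf passes the result of va_arg(ap, int) straight to printint, and va_arg has no built-in failure mechanism (it cannot detect a read past the arguments that were actually supplied), so this is undefined behavior. On RISC-V the third argument would normally be passed in a2, so in practice the value printed is most likely whatever a2 happened to hold at the call.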

Backtrace

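This task is to print the kernel call stack of the sleep system call. The lab simplifies this: printing the return address of each frame on the call stack is enough.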

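The two 8-byte slots at the top of each frame, $fp - 8 and $fp - 16, hold the current frame's return address and the previous frame's frame pointer (i.e. where the previous frame starts). The saved frame pointers therefore chain the frames into a linked list, so we only need to walk the frames and print the return address of each one.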

The walk has to stop after the last frame. All frames on an xv6 kernel stack live in a single page, which fixes the valid range of frame start addresses. That range serves as the termination condition.

Stack
  (higher addresses)
                   .
                   .
      +->          .
      |   +-----------------+   |
      |   | return address  |   |
      |   |   previous fp ------+
      |   | saved registers |
      |   | local variables |
      |   |       ...       | <-+
      |   +-----------------+   |
      |   | return address  |   |
      +------ previous fp   |   |
          | saved registers |   |
          | local variables |   |
      +-> |       ...       |   |
      |   +-----------------+   |
      |   | return address  |   |
      |   |   previous fp ------+
      |   | saved registers |
      |   | local variables |
      |   |       ...       | <-+
      |   +-----------------+   |
      |   | return address  |   |
      +------ previous fp   |   |
          | saved registers |   |
          | local variables |   |
  $fp --> |       ...       |   |
          +-----------------+   |
          | return address  |   |
          |   previous fp ------+
          | saved registers |
  $sp --> | local variables |
          +-----------------+
  (lower addresses)

Add the following code to kernel/printf.c:

kernel/printf.c

static inline uint64
r_fp()
{
  uint64 x;
  asm volatile("mv %0, s0" : "=r" (x) );
  return x;
}

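void
backtrace(void)
{
  uint64 fp = r_fp();
  // The kernel stack is one page, so valid frame pointers lie within it.
  uint64 stack_lower = PGROUNDDOWN(fp);
  uint64 stack_upper = stack_lower + PGSIZE;
  do {
    // The return address is stored at fp - 8.
    printf("%p\n", (uint64*)(*(uint64*)(fp - 8)));
    // The previous frame's frame pointer is stored at fp - 16.
    fp = *(uint64*)(fp - 16);
  } while (fp > stack_lower && fp < stack_upper);
}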

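We can run bttest, which calls sleep and therefore the backtrace function we just implemented, and then use riscv64-unknown-elf-addr2line -e kernel/kernel to find the source lines these return addresses belong to.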

Alarm

This task requires implementing two system calls:

sigalarm(int ticks, void (*handler)()) registers a function to be run periodically
sigreturn() ends the handler function

The handler is run once every ticks ticks. Specifically:

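When a timer interrupt arrives, the kernel records how long each process has been running; if the handler's trigger condition is met, the handler function is run.
Because the handler is defined in the user program, running it requires switching from kernel mode to user mode. The current kernel context has to be saved, and the user context from before the interrupt restored, before the handler can run.
Because execution must return to the interrupted user code once the handler finishes, a plain return is not enough; sigreturn is needed, an operation that traps into the kernel and restores the pre-interrupt context.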

Ticks

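The first question that came to mind when I read this description was how ticks should be counted: the ticks consumed by the current process, or the ticks elapsed since the current process started. The difference is that the latter includes ticks that pass after the process yields, which the process itself did not use. The lab text says "This might be useful for compute-bound processes that want to limit how much CPU time they chew up, or for processes that want to compute but also want to take some periodic action.", so it must be the former.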

Interrupt handling

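As covered in 6.1810 Lab - System Calls, when an interrupt arrives the CPU jumps to the trampoline and then traps into the kernel. The jump to the trampoline is arranged by setting the $stvec register, which happens in usertrapret: when fork returns, usertrapret is called and initializes the trap vector register $stvec.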

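After saving the context and switching to kernel state, the trampoline jumps to usertrap, which handles system calls and timer interrupts. When if (which_dev == 2) holds, the trap is a timer interrupt. The original design is for the running process to give up the CPU so that other user programs get a chance to be scheduled, i.e. yield. We need to process ticks before that yield.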

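There are two approaches:

Approach one: before yield, switch to user mode and run the periodic task, then have sigreturn switch back to the kernel and continue with yield.
Approach two: before yield, do not run the user's periodic task immediately. Instead, modify the PC in the trapframe and yield first. The next time the process is scheduled it starts executing from the preset PC (the periodic function); sigreturn then switches back to the kernel and restores the pre-interrupt user context.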

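Approach one needs something similar to usertrapret that supports changing the PC, and when sigreturn traps into the kernel it has to get back to the point just before yield, which is comparatively complex to implement. Approach two is the neater choice because it reuses the existing user / kernel switching code. We can add a handle_time_passes() function to process ticks.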

if (which_dev == 2) {
  handle_time_passes();
  yield();
}

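So we implement approach two. Once a handler has been registered with sigalarm, the flow is: timer interrupt arrives -> process ticks -> back up the trapframe and set the PC -> suspend the current process and wait to be rescheduled -> rescheduled, run the handler -> sigreturn returns to the kernel -> update the PCB and restore the pre-interrupt user context -> usertrapret returns to the pre-interrupt user state.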

Context restoration

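After trapping into the kernel, the trapframe holds the context needed to restore the user state. Changing the PC in the trapframe loses the original user PC, and the rest of the user context is lost as well once the handler runs. We therefore need to back up the user context from before the timer interrupt, i.e. back up the trapframe before changing the PC.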

PCB

At this point the PCB needs some new fields to back up the trapframe and to implement the periodic task.

// Per-process state
struct proc {
  //...

  //
  // for sigalarm and sigreturn
  //

  // Indicates that the timing function is being executed to prevent reentry.
  int invoking;

  // Ticks passed since last execution.
  int ticks_passed;

  // Interval.
  int alarm_interval;

  // Periodic function.
  void (*alarm_handler)();

  // For backing up the trapframe.
  struct trapframe *trapframe_bk;
};

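These fields have to be initialized when a new process is created, i.e. in allocproc. trapframe_bk is initialized the same way as trapframe, by allocating a page. That page also has to be released in freeproc when the process ends.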

Ticks handling

The logic for processing ticks is as follows.

void
handle_time_passes(void)
{
  struct proc *p = myproc();

  // If an application calls sigalarm(0, 0),
  // the kernel should stop generating periodic alarm calls.
  if (p->alarm_interval == 0 && p->alarm_handler == 0) {
    return;
  }
  
  // Prevent re-entrant calls to the handler.
  if (p->invoking) {
    return;
  }

  // Increment the ticks tracked by PCB.
  p->ticks_passed++;

  // Only invoke the alarm function if the process has a timer outstanding.
  if (p->ticks_passed < p->alarm_interval) {
    return;
  }

  // Back the trapframe up.
  memmove(p->trapframe_bk, p->trapframe, PGSIZE);

  // Set the invoking flag, 
  // to prevent re-entrant calls to the handler.
  p->invoking = 1;

  // Set PC to jump to the handler function.
  p->trapframe->epc = (uint64)p->alarm_handler;

  return;
}

System calls

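Below are the implementations of sigalarm and sigreturn. sigreturn has to restore the a0 register to its pre-interrupt value, so it returns the a0 saved in the trapframe, because a system call's return value is written into a0. An ordinary system call does not need this step: there a0 carries an argument and does not have to be restored. Here the interruption came from a timer interrupt, which must not change a0, so the register has to be restored.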

uint64
sys_sigalarm(void)
{
  struct proc *p = myproc();
  argint(0, &(p->alarm_interval));
  argaddr(1, (uint64*)&(p->alarm_handler));
  return 0;
}

uint64
sys_sigreturn(void)
{ 
  struct proc *p = myproc();
  p->invoking = 0;
  p->ticks_passed = 0;
  memmove(p->trapframe, p->trapframe_bk, PGSIZE);  
  return p->trapframe->a0; 
}

With that, the periodic alarm is implemented.