![Page 1: Building Computer Systems From Scratch Around AbstractMachineics.nju.edu.cn/~jyy/teach/am-talk.pdf · systest klib (student’s own OS based on Nanos labs) Abstract Machine 11. AbstractMachine](https://reader033.vdocuments.us/reader033/viewer/2022042803/5f47cc76e7b22767672f5be7/html5/thumbnails/1.jpg)
Building Computer Systems From Scratch Around AbstractMachine
The Third China Systems Education Workshop (CSEW’18)
Yanyan Jiang (SPAR Group, Inst. of Computer Software, Nanjing University)
Zihao Yu (Inst. of Computing Technology, Chinese Academy of Sciences)
July 6, 2018
Background and Motivation
Students Build Systems at NJU
• The Project-N
• 3rd semester: NEMU (NJU Emulator), an x86 full system emulator, on Linux native
• 4th semester: Nanos (NJU OS), an operating system kernel, on x86 qemu
• 5th semester: NPC (NJU Processor), a MIPS32 SoC, on FPGA
• 6th semester: NCC (NJU C Compiler), emitting MIPS32 assembly, on the SPIM emulator
But... Students Have Trouble Building (x86-Based) Systems!
[Figure: handling an interrupt on x86. Steps shown: interrupt, task switching (HW), interrupt handling (HW), context save, C context preparation, context switch, context restoration, interrupt return]
• Required knowledge: GDT, IDT, TSS, virtual memory, ..., interrupt behavior, ABI
• The result: fragile & tricky code
• The most effective approach: RTFM! RTFSC!!
Students' Trouble: There Is a Gap Between Low-Level and High-Level Mechanisms
• OS course: Operating System Kernel, built on (high-level) mechanisms to implement OS kernels (C, context switch, virtual memory)
• gap
• ARCH course: Processor Implementation (x86, MIPS32, ...), providing (low-level) mechanisms of computer hardware (instructions, interrupts, MMU), designed for flexibility, efficiency, compactness, …
Our Approach
• Operating System Kernel is just a C program
• Mechanisms to Implement Teaching OS Kernels (Build This Layer!)
• Low-Level Mechanism (x86), Low-Level Mechanism (MIPS32)
Introducing the AbstractMachine
Design Goals
• An “abstract” architecture that
• provides sufficient support for modern (teaching) system software
• can be implemented on (perhaps overly) simplified processors
• possibly at the cost of efficiency and/or flexibility
• Facilitate portable and less painful bare-metal software development
[Figure: Operating System Kernel on top of an Abstraction Layer, which sits on the x86 ISA and the MIPS32 ISA]
The AbstractMachine (Nexus-AM Project)
• To highlight the high-level mechanisms provided by computer hardware for implementing system software
• a bare-metal C runtime (no libraries, statically linked)
• bootstrap stack
• a flat heap (usable physical memory)
• a putc function for debugging
• a series of optional C APIs
• input and output
• interrupt management and handling
• virtual memory
• multiprocessing
By Adding the Abstraction Layer...
[Figure: AbstractMachine as the common layer between lab artifacts. Below it: Lab Artifact: NEMU (3rd semester), Linux Native, QEMU, Lab Artifact: NPC (5th semester). Above it: Lab Artifact: Nanos (4th semester), Benchmarks, Testing Workloads, ...]
The AM Ecosystem: Overview
• Software on top of AbstractMachine: Teaching Operating Systems (student’s own OS based on Nanos labs) and System Software (microbench, systest, klib)
• AbstractMachine APIs: Turing Machine (a minimal bare-metal C runtime environment), Input/Output APIs, Interrupt/Trap APIs, Memory Protection APIs, Multiprocessing APIs
• Platforms: SoC (mips32-npc), VMs/Emulators (x86-qemu, x86-nemu, mips32-qemu), Virtual Environments (Linux native), mips32-minimal
• [Table: per-platform support for the five API groups, one row of ✓/✗ marks per platform: ✓✓✓✗✗, ✓✓✓✓✓, ✓✓✓✓✗, ✓✓✓✓✗, ✓✓✗✗✗, ✓✓✗✗✗]
AbstractMachine Details
The Turing Machine (TRM)
• A minimal C bare-metal runtime environment
• system memory (code, data, bss, ...)
• bootstrap stack
• an area of physical memory: Area heap;
• transfer control to main()
• two APIs: int halt(int code); void putc(char ch);
[Figure: A Turing Machine (image source: Wikipedia)]
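A minimal sketch of what a program can build from TRM alone, assuming `Area` is a pair of `start`/`end` pointers (the talk does not show its definition). `heap_storage`, `host_putc`, and the `out` buffer are host-side stand-ins so the sketch compiles with a hosted C toolchain; on AM the runtime would supply `heap` and `putc`. `put_str` and `bump_alloc` are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: an Area is a [start, end) range of memory. */
typedef struct { void *start, *end; } Area;

/* Host stand-ins for what the TRM would provide. */
static unsigned char heap_storage[4096];
static Area heap = { heap_storage, heap_storage + sizeof(heap_storage) };

static char out[64];
static size_t out_len;

/* Named host_putc to avoid clashing with putc from <stdio.h>. */
static void host_putc(char ch) {
  if (out_len + 1 < sizeof(out)) out[out_len++] = ch;
}

/* Print a string using nothing but the one-character output API. */
static void put_str(const char *s) {
  for (; *s; s++) host_putc(*s);
}

/* Bump allocator over the flat heap: hands out memory front to back
   and never frees it. */
static uintptr_t brk_ptr;

static void *bump_alloc(size_t size) {
  if (brk_ptr == 0) brk_ptr = (uintptr_t)heap.start;
  uintptr_t end = (uintptr_t)heap.end;
  uintptr_t p = (brk_ptr + 7) & ~(uintptr_t)7;    /* 8-byte alignment */
  if (p > end || size > end - p) return NULL;     /* out of memory */
  brk_ptr = p + size;
  return (void *)p;
}
```

With only these two pieces a bare-metal program can already print diagnostics and allocate working memory, which is roughly what a minimal runtime library needs from the machine.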
TRM Implementations
• TRM can be implemented on a minimal/incomplete system
• qemu-system-i386 (full system emulation)
• student’s single-cycle processor (block RAM, no bus)
• student’s incomplete system emulator (basic instructions + a debug instruction)
• AM provides: building scripts (Makefile); loading and initialization (system memory (code, data, bss, ...), bootstrap stack, an area of physical memory (Area heap;), transfer control to main()); klib (a minimal runtime library)
• On top runs a bare-metal C program (-std=c99)
With TRM...
• We can run almost any algorithm for testing/benchmarking
• coremark, dhrystone
• MicroBench (small footprint non-trivial programs)
#define BENCHMARK_LIST(def) \
def(qsort, "qsort", QSORT_SM, QSORT_LG, "Quick sort") \
def(queen, "queen", QUEEN_SM, QUEEN_LG, "Queen placement") \
def( bf, "bf", BF_SM, BF_LG, "Brainf**k interpreter") \
def( fib, "fib", FIB_SM, FIB_LG, "Fibonacci number") \
def(sieve, "sieve", SIEVE_SM, SIEVE_LG, "Eratosthenes sieve") \
def( 15pz, "15pz", PZ15_SM, PZ15_LG, "A* 15-puzzle search") \
def(dinic, "dinic", DINIC_SM, DINIC_LG, "Dinic's maxflow algorithm") \
def( lzip, "lzip", LZIP_SM, LZIP_LG, "Lzip compression") \
def(ssort, "ssort", SSORT_SM, SSORT_LG, "Suffix sort") \
def( md5, "md5", MD5_SM, MD5_LG, "MD5 digest")
makes labs more
interesting (challenging)
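The benchmark list above is an X-macro: the caller passes a `def` macro and the list expands once per benchmark. A minimal sketch of two such expansions, using a shortened three-entry list; the problem sizes, the `Benchmark` struct, and the names `AS_ROW`, `AS_ENUM`, and `find_benchmark` are assumptions for illustration, not MicroBench's actual definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical problem sizes; the talk does not give the real values. */
#define QSORT_SM 100
#define QSORT_LG 100000
#define FIB_SM   10
#define FIB_LG   40
#define MD5_SM   64
#define MD5_LG   65536

#define BENCHMARK_LIST(def) \
  def(qsort, "qsort", QSORT_SM, QSORT_LG, "Quick sort") \
  def(  fib, "fib",   FIB_SM,   FIB_LG,   "Fibonacci number") \
  def(  md5, "md5",   MD5_SM,   MD5_LG,   "MD5 digest")

typedef struct {
  const char *name;
  long small, large;   /* small and large input sizes */
  const char *desc;
} Benchmark;

/* Expansion 1: one table row per list entry. */
#define AS_ROW(id, name, sm, lg, desc) { name, sm, lg, desc },
static const Benchmark benchmarks[] = { BENCHMARK_LIST(AS_ROW) };

/* Expansion 2: one enum constant per list entry, plus a count. */
#define AS_ENUM(id, name, sm, lg, desc) BENCH_##id,
enum { BENCHMARK_LIST(AS_ENUM) NR_BENCH };

/* Look a benchmark up by name; returns NULL if it is not in the list. */
static const Benchmark *find_benchmark(const char *name) {
  for (size_t i = 0; i < (size_t)NR_BENCH; i++)
    if (strcmp(benchmarks[i].name, name) == 0) return &benchmarks[i];
  return NULL;
}
```

Adding a benchmark then means adding one line to the list; the table and the enum stay in sync automatically.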
Example: Student’s x86 Emulator
[Figure: MicroBench runs on AbstractMachine, implemented by Lab Artifact: NEMU (x86 → RTL → optimization → interpret); NEMU in turn runs on Linux native or on AbstractMachine over a MIPS32 processor]
• run x86 programs on a student's MIPS32 processor
![Page 17: Building Computer Systems From Scratch Around AbstractMachineics.nju.edu.cn/~jyy/teach/am-talk.pdf · systest klib (student’s own OS based on Nanos labs) Abstract Machine 11. AbstractMachine](https://reader033.vdocuments.us/reader033/viewer/2022042803/5f47cc76e7b22767672f5be7/html5/thumbnails/17.jpg)
I/O Extension (IOE) APIs
• An I/O device = read registers + write registers
[Figure: a canonical I/O device (image source: "Operating Systems: Three Easy Pieces") next to an AM I/O device, which is a set of registers]
typedef struct Device {
  uint32_t id;
  const char *name;
  size_t (*read)(uintptr_t reg, void *buf, size_t size);
  size_t (*write)(uintptr_t reg, void *buf, size_t size);
} Device;
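A minimal sketch of a device behind this interface, assuming a keyboard with a single readable key-code register. The register number `KBD_REG_KEYCODE`, the device id, the `pending_keys` queue, and the helper `read_key` are all hypothetical; only the `Device` struct comes from the slide.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct Device {
  uint32_t id;
  const char *name;
  size_t (*read)(uintptr_t reg, void *buf, size_t size);
  size_t (*write)(uintptr_t reg, void *buf, size_t size);
} Device;

/* Hypothetical register number and "no key" code. */
enum { KBD_REG_KEYCODE = 1, KEY_NONE = 0 };

/* Stand-in for hardware state: key codes waiting to be read. */
static int pending_keys[8];
static size_t nr_pending;

/* Non-blocking read: delivers KEY_NONE when no key is waiting.
   Returns the number of bytes written to buf. */
static size_t kbd_read(uintptr_t reg, void *buf, size_t size) {
  if (reg != KBD_REG_KEYCODE || size < sizeof(int)) return 0;
  int code = KEY_NONE;
  if (nr_pending > 0) {
    code = pending_keys[0];
    nr_pending--;
    memmove(pending_keys, pending_keys + 1, nr_pending * sizeof(int));
  }
  memcpy(buf, &code, sizeof(code));
  return sizeof(code);
}

/* This keyboard has no writable registers. */
static size_t kbd_write(uintptr_t reg, void *buf, size_t size) {
  (void)reg; (void)buf; (void)size;
  return 0;
}

/* The id value is made up for this sketch. */
static Device keyboard = { 0x0000ac01, "fake keyboard", kbd_read, kbd_write };

/* Driver-side helper: poll one key code through the generic interface. */
static int read_key(Device *dev) {
  int code = KEY_NONE;
  dev->read(KBD_REG_KEYCODE, &code, sizeof(code));
  return code;
}
```

Because callers only see `read` and `write` on numbered registers, the same game or kernel code can run unchanged whether the keyboard is emulated or sits on a student's SoC.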
I/O Device Examples
• ⌨ AM keyboard
• non-blocking read of key code
• AM frame buffer
• draw a texture (an array of pixels) to a W*H rectangle
• can be accelerated by DMA and/or a student's graphics card!
• PCI configuration space
• memory-mapped I/O, x86 only
With IOE…
• We can run almost any single-threaded kernel, e.g., games
• SoC = SoC uncore (provided) + Lab Artifact: single-cycle processor core (w. bus connection)
• a lot more fun!
[Photo: Student's OSLab0 (Game) on another student’s MIPS32 SoC]
Asynchronous Extension (ASYE) APIs
• Interrupt and processor context management
• interrupt handling callback
• interrupt enable/disable
• self-trapping
• context creation
int asye_init(RegSet *(*handler)(Event ev, RegSet *regs));
int intr_read();
void intr_write(int enable);
void yield();
RegSet *make(Area kstack, void (*entry)(void *), void *arg);
Our experience: students have
no problem understanding the
semantics of these APIs
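A minimal sketch of how a handler with the `asye_init` callback shape can implement round-robin scheduling: it receives the interrupted context and returns the context to resume. The layouts of `Event` and `RegSet`, the event codes, and the name `schedule` are assumptions; the talk names the types but does not define them. On AM the contexts would come from `make()`; here they are plain array elements so the logic can be exercised on a host.

```c
#include <stddef.h>

/* Assumed shapes for illustration only. */
typedef struct { int cause; } Event;
typedef struct { int id; } RegSet;   /* stands in for a saved register file */
enum { EVENT_YIELD = 1, EVENT_TIMER = 2, EVENT_ERROR = 3 };  /* hypothetical */

#define NR_THREADS 3
static RegSet contexts[NR_THREADS] = { {0}, {1}, {2} };

/* Last saved context of each thread. */
static RegSet *saved[NR_THREADS] = {
  &contexts[0], &contexts[1], &contexts[2]
};
static int current;

/* Remember where the interrupted thread stopped, pick the next thread on
   a yield or timer event, and return the context to resume. Returning a
   different context than the one passed in is the context switch. */
static RegSet *schedule(Event ev, RegSet *regs) {
  saved[current] = regs;
  if (ev.cause == EVENT_YIELD || ev.cause == EVENT_TIMER)
    current = (current + 1) % NR_THREADS;
  return saved[current];
}
```

In a kernel, `schedule` would be registered once with `asye_init(schedule)`, and a thread would give up the processor by calling `yield()`.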
Protection Extension (PTE) APIs
• (Very) high-level mechanisms for implementing virtual memory: VM is a dict-like data structure
• creation/teardown of a protected address space
• mapping/unmapping a page
• a mapping connects a protected portion of the address space to a page in the kernel heap
• segmented protection: page size = protected AS size
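A minimal sketch of the "VM is a dict-like data structure" view: an address space is a table from virtual page addresses to physical page addresses, with insert, delete, and lookup. The names `AddrSpace`, `as_map`, `as_unmap`, and `as_translate`, the 4 KiB page size, and the fixed table size are assumptions for illustration; the talk does not list the PTE function signatures.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE    4096u
#define MAX_MAPPINGS 16

typedef struct { uintptr_t va, pa; int used; } Mapping;
typedef struct { Mapping slots[MAX_MAPPINGS]; } AddrSpace;

/* Find the slot for a page-aligned virtual address, or NULL. */
static Mapping *find(AddrSpace *as, uintptr_t va) {
  for (size_t i = 0; i < MAX_MAPPINGS; i++)
    if (as->slots[i].used && as->slots[i].va == va) return &as->slots[i];
  return NULL;
}

/* dict[va] = pa. Both addresses must be page-aligned.
   Returns 0 on success, -1 on misalignment or a full table. */
static int as_map(AddrSpace *as, uintptr_t va, uintptr_t pa) {
  if (va % PAGE_SIZE || pa % PAGE_SIZE) return -1;
  Mapping *m = find(as, va);
  for (size_t i = 0; i < MAX_MAPPINGS && !m; i++)
    if (!as->slots[i].used) m = &as->slots[i];
  if (!m) return -1;
  m->va = va;
  m->pa = pa;
  m->used = 1;
  return 0;
}

/* del dict[va]; does nothing if the page is not mapped. */
static void as_unmap(AddrSpace *as, uintptr_t va) {
  Mapping *m = find(as, va - va % PAGE_SIZE);
  if (m) m->used = 0;
}

/* Translate any address inside a mapped page. Returns 0 if unmapped,
   so physical page 0 is treated as invalid in this sketch. */
static uintptr_t as_translate(AddrSpace *as, uintptr_t va) {
  uintptr_t off = va % PAGE_SIZE;
  Mapping *m = find(as, va - off);
  return m ? m->pa + off : 0;
}
```

A real implementation would back the table with page tables or a TLB refill handler, but kernel code written against the dictionary view does not need to know which.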
With ASYE and PTE…
• We can run almost anything!
• applications (and libc) on operating system
• Playable PAL (仙剑奇侠传, "The Legend of Sword and Fairy") with newlib (libc) on a student’s Nanos and MIPS32 SoC (with AXI bus and DDR)
• 2nd place in 2017 Loongson Contest
• opens /dev/fb for graphics and /dev/inputs for timer/keyboard events
Multiprocessing Extension (MPE) APIs
• Simple management of parallel bare-metal (shared-memory) C runtimes
• boot multiprocessors
• get multicore information
• atomic operation (with cache coherence)
int mpe_init(void (*entry)());
int ncpu();
int cpu();
intptr_t atomic_xchg(volatile intptr_t *addr, intptr_t newval);
[Figure: TRM, TRM, TRM on top of shared memory]
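A minimal sketch of a spinlock built only on `atomic_xchg`, the one atomic primitive listed above. The `atomic_xchg` body here is a host stand-in using the GCC/Clang `__atomic_exchange_n` builtin (on AM the platform would supply it); `SpinLock`, `spin_trylock`, `spin_lock`, and `spin_unlock` are hypothetical names.

```c
#include <stdint.h>

/* Host stand-in with the signature from the slide: atomically store
   newval at addr and return the previous value. */
static intptr_t atomic_xchg(volatile intptr_t *addr, intptr_t newval) {
  return __atomic_exchange_n(addr, newval, __ATOMIC_SEQ_CST);
}

typedef struct { volatile intptr_t locked; } SpinLock;

/* Try once: we own the lock only if the old value was 0. */
static int spin_trylock(SpinLock *lk) {
  return atomic_xchg(&lk->locked, 1) == 0;
}

/* Busy-wait until the exchange observes an unlocked lock. */
static void spin_lock(SpinLock *lk) {
  while (!spin_trylock(lk)) { /* spin */ }
}

/* Release by atomically writing 0 back. */
static void spin_unlock(SpinLock *lk) {
  atomic_xchg(&lk->locked, 0);
}
```

Each processor started through `mpe_init` could guard shared kernel data with such a lock, for example around accesses indexed by `cpu()`.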
Case Study: AM in the Research Project
• In the development of Labeled RISC-V
• led by Zihao Yu at ICT-CAS
• Booting a full runtime environment is too costly in simulation
• alternatively, implementing AM APIs on riscv64-rocket is very easy
• we can run kernels (tests) without booting an OS

| MemPerf on | Build Time | Run Time |
|---|---|---|
| FPGA | ~30 minutes (synthesis) | ~1 minute |
| Simulator (Linux) | ~1 minute | ~1 hour |
| Simulator (AM) | ~1 minute | ~3 minutes |
AM on Labeled RISC-V: Case #2
• Testing low-level memory virtualization and cache coherence with multi-core parallel memory I/O stress workloads
• only requires TRM, ASYE (for exceptions), and MPE (for multicore) ← another configuration of AM
• 100X parallel (random) regression testing
• 60,000 test runs in 10 hours at 1 minute per test (100 parallel runs × 600 minutes)
• with record & replay debugging
• Found and fixed an L1 DCache concurrency bug
• a similar bug may exist upstream (confirming)
Discussions
AM: Some Obvious Benefits
• Fewer f**king manuals to read (and thus easier to teach)
• many concepts no longer exist (GDT, LDT, TSS, ...)
• a much simpler interrupt/exception model
• much easier to debug
• Motivates a student to maintain his/her own system
• strive to make the entire system stack work (compiler → application → operating system kernel → SoC on FPGA)
• get hands dirty debugging the full system
AM: Some Less Obvious Benefits
• AM threads the computer system labs
• ARCH labs: TRM (a minimal processor) → IOE (buses and memory, SoC) → ASYE (interrupts and exceptions) → PTE (MMU)
• OS labs: TRM + MPE (a base system to play with) → ASYE (kernel multithreading) → IOE (file system) → PTE (processes and system calls)
• Providing a layer of abstraction for systematic testing and/or verification
• forced separation of machine-dependent (AM APIs) and machine-independent code
AM: Limitations
• Less fun hacking the systems
• OS ninja students are really addicted to this!
• this is the trade-off we take (let them implement AM APIs)
• Less low-level control
• cannot take full advantage of hardware support (e.g., page directory COW, access/dirty bits, ...)
• solution: make it work on AM, make it better on a particular architecture (e.g., x86-qemu)
Project-N Brewed Projects (Ongoing)
• Navy application framework
• newlib (libc), libbmp, NWM (window manager), NTerm (terminal emulator), ... (with Zihao Yu and students)
• Needle plagiarism detector [1]
• measures pairwise similarity between programs (with Prof. Chang Xu)
• Nuts random program generator
• random kernels for fuzzing a student's processor with control- and data-flow diversity (with Xianfei Ou), now a research project to fuzz compilers (found previously unknown bugs in GCC)
[1] Yanyan Jiang and Chang Xu, “Needle: Detecting code plagiarism on student submissions”, in Proceedings of the ACM Turing Award China Conference (SIGCSE China), 2018.
... And a Lot of Future Work!
• Enhancing existing infrastructures
• testing/debugging/grading/... tools
• libraries, applications (busybox), ...
• Porting more interesting stuff into the AM ecosystem
• xv6 (then it runs on MIPS!), ...
The AbstractMachine: Conclusion
• An abstraction layer of high-level instruction-set architecture mechanisms
• A bare-metal C runtime (and APIs)
• A trade-off for teaching computer systems
• The glue layer in Project-N system labs
• currently adopted in ICS, OS, and ARCH labs at NJU
[Figure: layers from bottom to top: computer hardware, ISA, AbstractMachine, bare-metal software]