‘Dynamic’ kernel patching
How you could add your own system-calls to Linux without
editing and recompiling the kernel
System calls
• System Calls are the basic OS mechanism for providing privileged kernel services to application programs (e.g., fork(), clone(), execve(), read(), write(), signal(), getpid(), waitpid(), gettimeofday(), setitimer(), etc.)
• Linux implements almost 300 system calls
• To understand how system calls work, we can try creating one of our own design
‘Open Source’ philosophy
• Linux source-code is publicly available
• In principle, anyone could edit the sources to add their own new functions into Linux
• In practice, it is inconvenient to do this
• The steps needed involve reconfiguring, recompiling, and reinstalling your kernel
• For the novice these steps are arduous!
• Any error risks data-loss and down-time
Alternative to edit/recompile
• Linux modules offer an alternative method for modifying the OS kernel’s functionality
• It’s safer, and vastly more convenient, since error-recovery only needs a reboot, and minimal system knowledge suffices
• The main hurdle to be overcome concerns the issue of ‘linking’ module code to some non-exported Linux kernel data-structures
Invoking kernel services
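[Diagram: an application program in user-mode (restricted privileges) uses ‘call’ and ‘ret’ to reach the standard runtime libraries; the libraries enter the Linux kernel in kernel-mode (unrestricted privileges) with ‘int 0x80’ and come back with ‘iret’; the kernel in turn uses ‘call’ and ‘ret’ to reach an installable module]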
The system-call jump-table
• There are approximately 300 system-calls
• Any specific system-call is selected by its ID-number (it’s placed into register %eax)
• It would be inefficient to use if-else tests or even a switch-statement to transfer to the service-routine’s entry-point
• Instead an array of function-pointers is directly accessed (using the ID-number)
• This array is named ‘sys_call_table[]’
Assembly language (.data)
    .section .data
sys_call_table:
    .long sys_restart_syscall
    .long sys_exit
    .long sys_fork
    .long sys_read
    .long sys_write
    // …etc (from ‘arch/i386/kernel/entry.S’)
The ‘jump-table’ idea
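[Diagram: ‘sys_call_table’ drawn as an array with numbered entries (0, 1, 2, 3, …), each one pointing to a service-routine in ‘.section .text’: 0 → sys_restart_syscall, 1 → sys_exit, 2 → sys_fork, 3 → sys_read, 4 → sys_write, 5 → sys_open, 6 → sys_close, …etc…]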
Assembly language (.text)
    .section .text
system_call:
    // copy parameters from registers onto stack…
    call *sys_call_table(,%eax,4)   // indirect call through the table entry
    jmp ret_from_sys_call

ret_from_sys_call:
    // perform rescheduling and signal-handling…
    iret    // return to caller (in user-mode)
Changing the jump-table
• To install our own system-call function, we just need to change an entry in the Linux ‘sys_call_table[]’ array, so it points to our own module function, but save the former entry somewhere (so we can restore it if we remove our module from the kernel)
• But we first need to find ‘sys_call_table[]’ -- and there are two easy ways to do that
Which entry can we change?
• We would not want to risk disrupting the normal Linux behavior through unintended alterations of some vital system-service
• But a few entries in ‘sys_call_table[]’ are no longer being used by the newer kernels
• If documented as being ‘obsolete’ it would be reasonably safe for us to ‘reuse’ an array-entry for our own purposes
• For example: system-call 17 is ‘obsolete’
Finding the jump-table
• Older versions of Linux (prior to 2.4.18) used to ‘export’ the ‘sys_call_table[]’ as a global symbol, but current versions keep this table’s address private (for security)
• But often during kernel-installation there is a ‘System.map’ file that gets put into the ‘/boot’ directory and – assuming it matches your compiled kernel – it holds the kernel address for the ‘sys_call_table[]’ array
Using ‘uname’ and ‘grep’
• You can use the ‘uname’ command to find out which kernel-version is running:
$ uname -r
• Then you can use the ‘grep’ command to find ‘sys_call_table’ in your System.map file, like this:
$ grep sys_call_table /boot/System.map-2.4.26
The ‘vmlinux’ file
• Your compiled kernel (uncompressed) is left in the ‘/usr/src/linux’ directory
• It is an ELF-format (executable) file
• It contains .text and .data sections
• You can examine your ‘vmlinux’ kernel with the ‘objdump’ system-utility
• You can pipe the output through the ‘grep’ utility to locate the ‘sys_call_table’ symbol
Executable versus Linkable
[Diagram: two ELF file layouts shown side by side]
Linkable File: ELF Header, Program-Header Table (optional), Section 1 Data, Section 2 Data, Section 3 Data, … Section n Data, Section-Header Table
Executable File: ELF Header, Program-Header Table, Segment 1 Data, Segment 2 Data, Segment 3 Data, … Segment n Data, Section-Header Table (optional)
Where is ‘sys_call_table[ ]’?
• This is how you use ‘objdump’ and ‘grep’ to find the ‘sys_call_table[]’ address:
$ cd /usr/src/linux
$ objdump -t vmlinux | grep sys_call_table
Exporting ‘sys_call_table’
• Once you know the address of your kernel’s ‘sys_call_table[]’, you can write a module to export that address to other modules, e.g.:
// declare global variable
unsigned long *sys_call_table;

int init_module( void )
{
    sys_call_table = (unsigned long *)0xC027A540;
    return 0;
}
Avoid hard-coded constant
• You probably don’t want to ‘hard code’ the sys_call_table’s value in your module – if you ever recompile your kernel, or use a differently configured kernel, you’d have to remember to edit your module and then recompile it – or risk a corrupted system!
• There’s a way to supply the required value as a module-parameter during ‘insmod’
Module parameters
char *svctable; // declare global variable
MODULE_PARM( svctable, "s" );
// Then you install your module like this:
$ /sbin/insmod myexport.o svctable=c027a540
// Linux will assign the address of your input string "c027a540" to the ‘svctable’ pointer
simple_strtoul()
• There is a kernel function you can use, in your ‘init_module()’ function, that will convert a string of hexadecimal digits into an ‘unsigned long’:
int init_module( void )
{
    unsigned long myval;
    myval = simple_strtoul( svctable, NULL, 16 );
    sys_call_table = (unsigned long *)myval;
    return 0;
}
Shell scripts
• It’s inconvenient – and risks typing errors – if you must manually search ‘vmlinux’ and then type in the sys_call_table[]’s address every time you want to install your module
• Fortunately this sequence of steps can be readily automated – by using a shell-script
• We have created an example: ‘myscript’
shell-script format
• First line: #!/bin/sh
• Some assignment-statements:
version=$(uname -r)
mapfile=/boot/System.map-$version
• Some commands (useful while debugging)
echo $version
echo $mapfile
The ‘cut’ command
• You can use the ‘cut’ operation on a line of text to remove the parts you don’t want
• An output-line from the ‘grep’ program can be piped in as an input-line to ‘cut’
• You supply a command-line argument to the ‘cut’ program, to tell it which parts of the character-array you wish to retain:
– For example: cut -c1-8
– Only characters 1 through 8 will be retained (‘cut’ numbers characters starting from 1)
Finishing up
• Our ‘myscript’ concludes by executing the command which installs our ‘myexport.o’ module into the kernel, and automatically supplies the required module-parameter
• If your ‘/boot’ directory doesn’t happen to have the ‘System.map’ file in it, you can extract the ‘sys_call_table[]’ address from the uncompressed ‘vmlinux’ kernel-binary
The ‘objdump’ program
• The ‘vmlinux’ file contains a Symbol-Table section that includes ‘sys_call_table’
• You can display that Symbol-Table using the ‘objdump’ command with the -t flag:
$ objdump -t /usr/src/linux/vmlinux
• You can pipe the output into ‘grep’ to find the ‘sys_call_table’ symbol-value
• You can use ‘cut’ to isolate the address
‘newcall.c’
• We created this module to demonstrate the ‘dynamic kernel patching’ technique
• It installs a function for system-call 17
• This function increments the value stored in a variable of type ‘int’ whose address is supplied as a function-argument
• We wrote the ‘try17.cpp’ demo to test it!
In-class exercise #1
• Write a kernel module (named ‘unused.c’) which will create a pseudo-file that reports how many ‘unimplemented’ system-calls are still available. The total number of locations in the ‘sys_call_table[]’ array is given by a defined constant, NR_syscalls, so you can just search the array and count how many entries match ‘sys_ni_syscall’ (it’s the value found initially in location 17)
In-class exercise #2
• Another important “hidden” kernel object is the Interrupt Descriptor Table
• It’s an array of ‘struct desc_struct’ entries and it’s named ‘idt_table[]’
• Try modifying our ‘myexport.c’ demo so it also exports the address for ‘idt_table[]’