Skip to content

Latest commit

 

History

History
482 lines (357 loc) · 16.1 KB

11.Linux kernelspace.md

File metadata and controls

482 lines (357 loc) · 16.1 KB

Linux Kernelspace Development

Effective kernel development necessitates a solid grasp of hardware functionalities and efficient, direct interaction with device registers. Most of the work to be done in an OS is writing device drivers:

  • A microprocessor requires external memories and I/O to operate
  • A microcontroller is an integrated circuit combining CPU, memories and peripherals. Targeted at embedded and real-time systems.

When dealing with device drivers, it's all about interrupt handling:

  1. Compute correct registers memory addresses.
  2. Accessing Peripheral Registers:
    • Reserve necessary I/O port regions and register the interrupt.
  3. Handling interrupts:
    • Write the interrupt handler to perform the desired operations, such as counting received bytes.
  4. Initialize and Cleanup:
    • Request the IRQ during module initialization and free it during cleanup.

Functions are often declared as static in kernel development to limit their scope and visibility, which is useful for encapsulation and avoiding name collisions in larger projects like the Linux kernel.

Accessing Peripheral Registers

Aspect Memory-Mapped I/O Port I/O
Usage Access memory-mapped device registers Access hardware ports
Functions ioread8, iowrite8, ioremap inb, outb, inl, outl
Requesting Region request_mem_region request_region
Releasing Region release_mem_region release_region
Memory-Mapped I/O
// Read/write 8-bit value
u8 ioread8(void __iomem *addr);
void iowrite8(u8 value, void __iomem *addr);

// Read/write 16-bit value
u16 ioread16(void __iomem *addr);
void iowrite16(u16 value, void __iomem *addr);

// Read/write 32-bit value
u32 ioread32(void __iomem *addr);
void iowrite32(u32 value, void __iomem *addr);

// Map/unmap memory
void __iomem *ioremap(phys_addr_t phys_addr, unsigned long size);
void iounmap(void __iomem *addr);

// Request/release memory region
// Typically used during module init/cleanup.
struct resource *request_mem_region(phys_addr_t start, unsigned long n, const char *name);
void release_mem_region(phys_addr_t start, unsigned long n);
Port I/O
//read/write 8-bit value
unsigned char inb(unsigned int addr);
void outb(unsigned char val, unsigned int addr);

// read/write 32-bit value
unsigned int inl(unsigned int addr);
void outl(unsigned int val, unsigned int addr);


unsigned short inw(unsigned int addr);
void outw(unsigned short val, unsigned int addr);

//request/release *port* region
//Typically used during module init/cleanup.
struct resource *request_region(unsigned base_addr, unsigned size, const char *name);
void release_region(unsigned base_addr, unsigned size);
  • request_region(start, length, name);: Allocates a specific range of I/O ports for exclusive use by a device driver, ensuring no port collisions occur when new devices are added to the system.
    • start: The beginning of the IO port range you want to control.
    • length: The number of IO ports in the range.
    • name: A name for the IO region, usually the name of your driver, which helps in debugging and tracking resource usage.

This function is used to request control of a range of IO ports for your device driver, ensuring that no other driver can access the same IO space while you control it. This helps prevent conflicts between drivers trying to access the same hardware resources.

Bit Manipulation for peripheral registers

To interact with hardware peripherals, we need to access peripheral registers. Peripheral registers are memory locations mapped to specific addresses in the processor address space. Their addresses are always physical, also in microprocessors. Most common way used by hardware peripherals to expose their functionality to the software:

  1. Data-sheets provide information for each register.
  2. Name, address in memory, meaning of all the bits and access permissions.

Most often, we will need to interact with individual bits inside registers. Remember, in C/C++, a single bit cannot be directly represented as a type. Instead, use an unsigned int to represent the whole register, and use bit manipulation to access individual bits.

Setting a Bit to 1:

To set a specific bit to 1, use the bitwise OR operator (|) with a mask where the target bit is 1 and all others are 0.

variable |= 1 << n;  // Sets the nth bit of variable to 1
  • 1 << n shifts 1 to the left by n positions, creating a mask with a 1 at the nth bit.
  • |= modifies variable directly, setting its nth bit.

Clearing a Bit to 0:

To clear a specific bit to 0, use the bitwise AND operator (&) with a mask where the target bit is 0 and all others are 1.

variable &= ~(1 << n);  // Clears the nth bit of variable to 0
  • 1 << n creates a mask with a 1 at the nth bit.
  • ~ inverts the mask, turning the nth bit to 0 and all others to 1.
  • &= modifies variable directly, clearing its nth bit.

Toggling a Bit:

To toggle a bit (change 1 to 0 or 0 to 1), use the bitwise XOR operator (^) with a mask where the target bit is 1 and all others are 0.

variable ^= 1 << n;  // Toggles the nth bit of variable
  • 1 << n creates a mask with a 1 at the nth bit.
  • ^= modifies variable directly, toggling its nth bit.

Checking a Bit's Value:

To check if a specific bit is set (1) or not, use the bitwise AND operator (&) with a mask where the target bit is 1. If the result is nonzero, the bit is set.

if (variable & (1 << n)) {
    // The nth bit of variable is set
} else {
    // The nth bit of variable is not set
}
  • 1 << n creates a mask with a 1 at the nth bit.
  • & checks the nth bit of variable; the result is non-zero if the bit is 1.

Remember that when you perform operations like ((1 << 15) | 5), the numbers involved are already in binary, and the operations are carried out on their binary representations:

  • 5: The decimal number 5 is represented in binary as 101.
  • 1 << 15: shifts the binary representation of 1 and it left by 15 positions.
1000 0000 0000 0000
OR 0000 0000 0000 0101
   -------------------
   1000 0000 0000 0101

Handling Interrupts

Request and Free IRQs

int request_irq(unsigned int irq, irq_handler_t handler, unsigned long flags, const char *name, void *dev);
void free_irq(unsigned int irq, void *dev_id);
  • Purpose: Register and unregister interrupt handlers.
  • Usage: Request IRQ during module initialization and free it during cleanup.

Example Usage

static irqreturn_t uart_interrupt_handler(int irq, void *dev_id) {
    // Interrupt handler code
    return IRQ_HANDLED;
}

static int __init my_module_init(void) {
    if (request_irq(UART_IRQ, uart_interrupt_handler, 0, "my_uart_driver", NULL)) {
        printk(KERN_ERR "Failed to request IRQ\n");
        return -EBUSY;
    }
    // Other initialization code
    return 0;
}

static void __exit my_module_exit(void) {
    free_irq(UART_IRQ, NULL);
    // Other cleanup code
}

Handling Concurrency in the Linux Kernel: kthreads and mutexes

The Linux kernel employs its own API for concurrency, as POSIX standards are not supported within the kernel space. Creating and managing kernel threads:

task_struct *kthread_run(int (*threadfn)(void *), void *arg, const char *name);
void kthread_stop(task_struct *t);
int kthread_should_stop();

Ensure mutual exclusion and protection of shared data in kernel threads:

struct mutex my_mutex;
void mutex_init(struct mutex *m);
void mutex_lock(struct mutex *m);
void mutex_unlock(struct mutex *m);

Example:

static struct mutex my_mutex;

static int __init my_module_init(void) {
    mutex_init(&my_mutex);
    // Other initialization code
    return 0;
}

static void my_thread_function(void) {
    mutex_lock(&my_mutex);
    // Critical section
    mutex_unlock(&my_mutex);
}

static void __exit my_module_exit(void) {
    // Cleanup code
}

Interrupt Concurrency: Spinlocks

Spinlocks are still used in the Linux kernel due to their simplicity: their minimal overhead in short-duration scenarios (as they avoid the context switch overhead associated with sleeping locks) is especially useful in interrupt contexts within the kernel.

spinlock_t my_spinlock;
void spin_lock_init(spinlock_t *spinlock);
void spin_lock(spinlock_t *spinlock);
void spin_unlock(spinlock_t *spinlock);
void spin_lock_irq(spinlock_t *spinlock);
void spin_unlock_irq(spinlock_t *spinlock);
  • Purpose: Ensure mutual exclusion in interrupt handlers and other critical sections.
  • Usage: Used in both interrupt handlers and normal kernel code.

Example Usage

static spinlock_t my_spinlock;

static int __init my_module_init(void) {
    spin_lock_init(&my_spinlock);
    // Other initialization code
    return 0;
}

static void my_interrupt_handler(void) {
    spin_lock(&my_spinlock);
    // Critical section
    spin_unlock(&my_spinlock);
}

static void __exit my_module_exit(void) {
    // Cleanup code
}

Waiting for Interrupts: Wait Queues

To synchronize normal code execution with interrupt handling, the Linux kernel uses wait queues:

wait_queue_head_t my_wait_queue;
void init_waitqueue_head(wait_queue_head_t *waiting);
void wait_event_interruptible(wait_queue_head_t *waiting, condition);
void wake_up(wait_queue_head_t *waiting);
  • Purpose: Block a process until a condition becomes true.
  • Usage: Used in scenarios where a process needs to wait for an interrupt or other events.

Example Usage

static DECLARE_WAIT_QUEUE_HEAD(my_wait_queue);

static int condition = 0;

static void my_interrupt_handler(void) {
    condition = 1;
    wake_up(&my_wait_queue);
}

static int my_function(void) {
    wait_event_interruptible(my_wait_queue, condition != 0);
    // Condition is met, proceed
    return 0;
}

Filesystem Operations

Device drivers in Linux are typically exposed in the /dev directory, with file operations managed through a set of function pointers in struct file_operations:

static ssize_t my_driver_write(struct file *f, const char __user *buf, size_t size, loff_t *o) {
    // Write data to the device
}

static ssize_t my_driver_read(struct file *f, char __user *buf, size_t size, loff_t *o) {
    // Read data from the device
}

struct file_operations my_driver_fops = {
  .owner = THIS_MODULE,
  .write = my_driver_write,
  .read = my_driver_read,
};
int major = register_chrdev(0, "name", &my_driver_fops);
unregister_chrdev(major, "name");

Explicit data transfer between user space and kernel space is required, utilizing the following functions:

unsigned copy_from_user(void *to, const void __user *from, unsigned n);
unsigned copy_to_user(void __user *to, const void *from, unsigned n);

copy_to_user and copy_from_user ensure that the pointers provided by user space are valid and that the copying process doesn't lead to buffer overflows or access violations.

Registering and Unregistering Character Devices

struct file_operations my_driver_fops = {
    .owner = THIS_MODULE,
    .write = my_driver_write,
    .read = my_driver_read,
};

int major = register_chrdev(0, "my_driver", &my_driver_fops);
unregister_chrdev(major, "my_driver");
  • Purpose: Register and unregister character devices.
  • Usage: Set up and tear down device file operations for your driver.

Example Usage:

static ssize_t my_driver_write(struct file *f, const char __user *buf, size_t size, loff_t *o) {
    char kernel_buf[1024];
    if (size > sizeof(kernel_buf))
        size = sizeof(kernel_buf);

    if (copy_from_user(kernel_buf, buf, size))
        return -EFAULT;

    // Use kernel_buf

    return size;
}

static ssize_t my_driver_read(struct file *f, char __user *buf, size_t size, loff_t *o) {
    char kernel_buf[1024];
    int data_size = 1024; // Assume we have 1024 bytes to copy

    if (size > data_size)
        size = data_size;

    if (copy_to_user(buf, kernel_buf, size))
        return -EFAULT;

    return size;
}

Miosix kernelspace

Miosix is a real-time operating system optimized for microcontrollers:

  • Supports C and C++ standard libraries and POSIX.
  • Used as a platform for distributed real-time systems research.
  • Userspace is optional, allowing development directly in kernelspace.
  • Developed at Politecnico di Milano.

Linker Script

The linker script provides instructions to the linker about how to place various sections of code and data in memory.

  • start address of the stack
  • maps the program sections into memory regions

In a typical microcontroller setup, memory is divided into several sections, including:

  • flash memory for code
  • RAM for data and stack, and peripheral registers.

The memory map varies from one microcontroller to another. \Example of possible addresses:

0x00000000: System memory
0x08000000: Flash memory (code)
0x080FFFFF: Flash memory end
0x20000000: RAM (data, stack)
0x2001FFFF: RAM end
0x40000000: Peripheral registers
0xFFFFFFFF: Peripheral registers end

The initial part of the kernel needs to be written in assembler to satisfy C/C++ language requirements.

Common sections and symbols:

  • .isr_vector: the Interrupt Service Routine (ISR) vectors are essentially pointers to the functions that should be executed in response to various system interrupts. It is crucial for handling of hardware interrupts.
  • .text: Executable instructions.
  • .data: Initialized global and static variables.
  • .bss: Uninitialized global and static variables, zeroed at startup.
  • .rodata: Read-only data (e.g., constants).
  • _start, ENTRY: Entry point of the program.
  • _stack: Stack top address (for stack initialization).

Common linker script commands:

  • ENTRY(symbol): Sets the entry point.
  • MEMORY: Defines named memory regions with attributes (e.g., rx for read-execute).
  • ORIGIN, LENGTH: Define the start address and size of a memory region.
  • SECTIONS: Begins section definitions.
  • >: Assigns a section to a memory region.
  • AT>: Places the section at a specific address in memory but locates it in a different region in the output file.
  • KEEP: Prevents garbage collection of sections.
  • ALIGN: Ensures that data structures are aligned to memory boundaries.
  • . (dot): Represents the current address in the memory region being filled by the linker as it progresses through the sections. When you assign . to a symbol (e.g., _data = .;), you're capturing the current address at that point in the linking process, which can be used as a reference by the program at runtime.

Example of a linker script:

ENTRY(Reset_Handler)

MEMORY
{
  flash(rx) : ORIGIN = 0x08000000, LENGTH = 1M  // Flash memory for code and constants
  ram(wx)   : ORIGIN = 0x20000000, LENGTH = 128K  // RAM for data and stack
}

_stack_top = 0x2001FFFF;  // Adjusted to point to the end of RAM

SECTIONS
{
  . = 0;
  .text :  {
    KEEP(*(.isr_vector))  // Ensure ISR vectors remain included by the linker
    *(.text)
    . = ALIGN(4);
    *(.rodata)
  } > flash  // Place code and constants in flash

  .data :  {
    _data = .;
    *(.data)
    . = ALIGN(8);
    _edata = .;
  } > ram AT > flash  // Initialize data in flash, copy to RAM at startup

  _etext = LOADADDR(.data);
  _bss_start = .;
  .bss :  {
    *(.bss)
    . = ALIGN(8);
  } > ram  // Allocate uninitialized data in RAM

  _bss_end = .;
  _end = .;  // Mark end of used memory for heap allocation
}

Combining the .text and .rodata sections simplifies the memory layout by placing all read-only sections together. Since both .text (code) and .rodata (read-only data) are immutable, they can be grouped in a contiguous region of flash memory.

You can move the _etext initialization before the .data section and write _etext = .; to indicate the end of the .text section directly and not use the LOADADDR instruction.