Category Archives: Linux Kernel Internals

Pradeep’s Dive into the internals of Linux Kernel

Linux Kernel Module

In the Linux kernel, a driver can be included in two ways. One is to build it into the kernel itself, so that it becomes part of the vmlinux image. The other is to build the driver separately and plug it into the running kernel dynamically. A driver that is loaded dynamically in this way is known as a kernel module. Modules are very handy during the development phase, because a change can be tested without rebuilding the whole kernel or rebooting.

Writing a Simple Kernel Module

Before writing a module, you need to understand kernel C. You might be wondering: do I need to learn one more language to code in the kernel? Don’t worry, kernel C is plain C with GNU extensions. What does plain C mean here? It means C without access to any user space libraries such as glibc. The kernel is self-contained: the library-style routines it needs are implemented by kernel developers as part of the kernel source, largely under <kernel_source>/lib.
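To make "C with GNU extensions" a little more concrete, here is a small sketch that builds with gcc in user space. The macro max_of() and the two helper functions are hypothetical names invented for this illustration; they are not taken from the kernel source. The macro relies on two GNU extensions, a statement expression and __typeof__, a style that kernel code also uses.

```c
/*
 * max_of() is a hypothetical macro written for this illustration.
 * It uses two GNU C extensions: a statement expression ({ ... })
 * and __typeof__. Each argument is copied into a temporary, so it
 * is evaluated exactly once, unlike the classic ((a) > (b) ? (a) : (b)).
 */
#define max_of(a, b)                    \
	({                              \
		__typeof__(a) _a = (a); \
		__typeof__(b) _b = (b); \
		_a > _b ? _a : _b;      \
	})

/* Returns the larger of two ints using the macro above. */
int larger(int a, int b)
{
	return max_of(a, b);
}

/* Returns how many times the first argument of max_of() was evaluated. */
int evaluations_of_first_argument(void)
{
	int n = 0;

	(void)max_of(n++, -1);
	return n;
}
```

Because the argument is evaluated only once, evaluations_of_first_argument() returns 1; with the classic ternary macro the increment would run twice.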

One of the beautiful things about the kernel is that, although it is written in C, it follows object-oriented concepts. This is evident from the very first module we will try. Below is a simple kernel module:

Figure 1: Simple Kernel Module

As can be seen, every module has a constructor and a destructor function. skm_init is the constructor and skm_exit is the destructor. In object-oriented programming, the constructor is invoked when an object is instantiated; similarly, here the constructor is invoked when the module is dynamically loaded into the kernel. So when is the destructor invoked? When the module is unloaded from the kernel. The macros module_init() and module_exit() are used to specify the constructor and destructor of a module.

The kernel equivalent of printf() is printk().

The header file ‘kernel.h’ is a kernel space header that includes the prototype for printk and other commonly used functions. module.h includes the module-related data structures and APIs; the macros module_init() and module_exit() are defined here. The file version.h contains the version of the kernel the module is built against. It is included so that the module can be tied to the version of the kernel into which it will be loaded.

Apart from this, there are macros beginning with MODULE_. These specify module-related information, such as the license, author and description, and form the module’s signature.
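A minimal sketch of such a module, consistent with the description above, might look like the following. The names skm_init and skm_exit and the three headers come from the text; the log messages, the MODULE_ values and the build hint in the leading comment are assumptions made for illustration. This is kernel code, so it builds only through the kernel build system, not with a plain gcc command.

```c
/*
 * skm.c: sketch of a simple kernel module.
 *
 * Assumed build setup (not taken from the original figure): a Makefile
 * next to this file containing the line
 *     obj-m := skm.o
 * and an invocation such as
 *     make -C /usr/src/linux M=$PWD modules
 */
#include <linux/module.h>   /* module_init(), module_exit(), MODULE_* */
#include <linux/kernel.h>   /* printk() */
#include <linux/version.h>  /* kernel version the module is built against */

/* Constructor: runs when the module is loaded into the kernel. */
static int __init skm_init(void)
{
	printk(KERN_INFO "skm: module loaded\n");
	return 0; /* 0 means the load succeeded */
}

/* Destructor: runs when the module is unloaded from the kernel. */
static void __exit skm_exit(void)
{
	printk(KERN_INFO "skm: module unloaded\n");
}

module_init(skm_init);
module_exit(skm_exit);

/* Placeholder values; replace with your own details. */
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("Simple Kernel Module");
```

Messages written with printk() go to the kernel log, which can be read with the dmesg command.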

Building a Kernel Module

To build a kernel module, you need the kernel source code, which is usually found at /usr/src/linux. If you do not have the full kernel source, you need at least the kernel headers. Building a kernel module is different from building an application. Normally, applications are compiled using the gcc command and, by default, gcc picks up the libraries in /usr/lib. But, as discussed earlier, kernel code is self-contained and doesn’t use libraries from user space. So, we need to give gcc command-line options telling it not to use the standard libraries. Not only that, since the module is going to be plugged into a running kernel, it has to be compiled with the same flags the kernel was compiled with. To take care of these things, we invoke the kernel makefile to compile our module.

Below is the makefile for compiling the module:

Figure 2: Makefile For Kernel Module

Here, it is assumed that the kernel source is placed at /usr/src/linux/. If it is at any other location, update the KERNEL_SOURCE variable in this Makefile accordingly.

To build the module, invoke make as below:

$ make

The output of make is skm.ko.

Dynamically Loading/Unloading a Kernel Module

We use the insmod command to load the kernel module, as below. This command must be run with root privileges.

$ insmod skm.ko

To list the loaded modules, run lsmod as below. The output will include skm.

$ lsmod

To remove (unload) the module, run the rmmod command as below:

$ rmmod skm

Note that while unloading, we use the module name (skm), not the file name (skm.ko).

Conclusion

So, now we are comfortable with writing and building a kernel module. This is the basic building block of Linux kernel development. In the following articles, we will dive deeper into Linux kernel programming. So, stay tuned!


Introduction to Linux Kernel

What is Kernel?

In the dictionary, kernel means core. In computing, as we know, the kernel refers to the heart of the operating system. So, what is the kernel the core of? It is the core of the overall system. The kernel is the system manager: it manages the system's resources. So, what resources does a system have? They are described below:

CPU: This is one of the most important resources in the system. It is the brain of the system, so it is important to use it optimally. How is the CPU managed? One mechanism for using the CPU optimally is multitasking. Literally, multitasking means doing more than one thing at a time. In a uniprocessor system, however, there is a single CPU, so only one program executes at any instant. To give the impression of multitasking, the kernel switches among the processes, and the switching is so fast that the processor seems to be executing all the processes at once. The kernel subsystem that achieves this multitasking is called the scheduler. The task of managing the CPU is called Process Management or Process Scheduling.

Memory: Another important resource in the system is memory. Memory here refers to the RAM. Since all processes need memory to execute, it has to be shared among them in such a way that one process cannot interfere with another process’s memory. There is also the concept of virtual addressing, which gives the illusion of having more memory than the available RAM. The kernel subsystem that does this is the memory manager, and the task of managing memory is known as Memory Management.

Input/Output (I/O): A system has various I/O devices, such as keyboards, mice, speakers and so on. The task of managing these devices is called I/O Management.

Storage: This resource is used for storing data on non-volatile media such as a hard disk. We usually store data in the form of files. The kernel subsystem that deals with the creation, deletion and management of files is the filesystem. This task of the kernel is called Storage Management.

Network: To communicate across systems, we need a network. So, what needs to be managed in the network? There is the networking protocol stack and the network interface card (NIC). The task of managing the network is called Network Management.

So, to summarize, the kernel performs five main tasks: Process Management, Memory Management, I/O Management, Storage Management and Network Management.
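The multitasking idea described under CPU, where a single processor gives each process a short turn in rotation, can be sketched in user space C as a toy round-robin loop. This is only an illustration of time slicing, not how the Linux scheduler is implemented; struct task, round_robin() and the notion of abstract "time units" are hypothetical names and simplifications introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task descriptor: only the remaining work is tracked. */
struct task {
	int remaining; /* time units of work still to do */
};

/*
 * Give each unfinished task one time slice (at most `quantum` units)
 * in turn, until all tasks are done or `max_slices` slices have run.
 * The index of the task run in each slice is stored in order[].
 * Returns the number of slices that were run.
 */
size_t round_robin(struct task *tasks, size_t ntasks, int quantum,
		   size_t *order, size_t max_slices)
{
	size_t slices = 0;
	size_t pending = 0;

	assert(quantum > 0);

	/* Count the tasks that still have work to do. */
	for (size_t i = 0; i < ntasks; i++)
		if (tasks[i].remaining > 0)
			pending++;

	while (pending > 0 && slices < max_slices) {
		for (size_t i = 0; i < ntasks && slices < max_slices; i++) {
			if (tasks[i].remaining <= 0)
				continue; /* finished tasks are skipped */

			/* Run for one slice, or less if the task finishes early. */
			int run = tasks[i].remaining < quantum
					  ? tasks[i].remaining
					  : quantum;

			tasks[i].remaining -= run;
			order[slices++] = i;

			if (tasks[i].remaining == 0)
				pending--;
		}
	}
	return slices;
}
```

For example, three tasks needing 3, 1 and 2 time units, scheduled with a slice of 2 units, run in the order 0, 1, 2, 0: every task gets a turn before the first one is resumed, which is what creates the impression that they all run at once.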

Linux Kernel Source Organization

As you might be aware, Linux was started by Linus Torvalds as a personal project while he was a student. The source tree is organized with a directory for each area of functionality. Figure 1 below shows the kernel source code organization at a high level.

 


 Figure 1: Linux Kernel Source Organization

As seen, there are directories corresponding to all five areas of functionality. Process management includes two things: the processor-related code and scheduling. All the CPU-related code falls in the arch directory, which contains a directory for each supported processor architecture. For scheduling, there are files starting with sched in the kernel directory (in recent kernels, a sched subdirectory under kernel). For memory management, there is a directory called ‘mm’. The code for sharing memory among processes, shared memory and dynamic memory allocation lies here. Similarly, for most of the Input/Output management, there is a directory called drivers. For storage management, there is a directory called fs. It contains the code for various filesystems such as ext2, ext3, FAT32 and so on. For networking, there is a directory called net, which contains the code for the protocol stack, and within drivers there is another directory called net for interfacing with the network interface card.

Apart from this, there are the kernel and lib directories, which contain the architecture-independent code of the kernel. Similarly, the linux directory, under include, contains the headers for the architecture-independent code. The init directory contains the code that is used during kernel boot-up.
