Author Archives: Pradeep D Tewani

About Pradeep D Tewani

The author used to work at Intel, Bangalore. He is a Linux enthusiast interested in Linux porting, Linux kernel internals and Linux device drivers. He shares his learnings on Linux & embedded systems through his workshops & trainings. Learn more about his experiments at http://sysplay.in.

Waiting / Blocking in Linux Driver Part – 4

In the last article, we discussed the usage of wait queues in the Linux kernel and saw the variants of wait_event(). For completeness, let us look at how that waiting is implemented internally. The first step is the creation and initialization of a wait queue entry. This is done as below:

DEFINE_WAIT(wait_entry);

The next step is to add the wait queue entry to the queue and set the process state. Both of these are done by a single function, declared below:

void prepare_to_wait(wait_queue_head_t *wq_head, wait_queue_t *wq, int state);

After this, we schedule out the process by invoking the schedule() API. Once we are done with waiting, the next step is to clean up with the API below:

void finish_wait(wait_queue_head_t *wq_head, wait_queue_t *wq);

All the above are available from:

#include <linux/wait.h>

One more point worth noting is that the condition has to be tested manually as was done in our earlier article.
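
Putting these pieces together, the usual pattern is a loop: queue the entry and set the state, test the condition, call schedule() only if the condition is still false, and call finish_wait() once it holds. The sketch below is a single-threaded user-space toy, not kernel code. Every toy_* name is hypothetical and only mimics the role of the corresponding kernel API, and toy_schedule() stands in for "a writer ran and woke us up while we slept".

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the task states; not the kernel's values. */
enum toy_state { TOY_RUNNING, TOY_INTERRUPTIBLE };

struct toy_task { enum toy_state state; };
struct toy_wait_queue { struct toy_task *waiter; }; /* holds at most one waiter */

static char flag = 'n'; /* the condition: we are done once flag == 'y' */

/* Mimics prepare_to_wait(): queue the entry and set the task state. */
static void toy_prepare_to_wait(struct toy_wait_queue *q, struct toy_task *t,
                                enum toy_state state)
{
    q->waiter = t;
    t->state = state;
}

/* Mimics finish_wait(): back to running and off the queue. */
static void toy_finish_wait(struct toy_wait_queue *q, struct toy_task *t)
{
    t->state = TOY_RUNNING;
    if (q->waiter == t)
        q->waiter = NULL;
}

/* Mimics wake_up(): make the queued task runnable again. */
static void toy_wake_up(struct toy_wait_queue *q)
{
    if (q->waiter)
        q->waiter->state = TOY_RUNNING;
}

/*
 * Stand-in for schedule(): while we "sleep", a pretend writer stores the
 * next byte of `writes` into flag and wakes the queue.
 * Returns 0 if no write is left, i.e. the task would sleep forever.
 */
static int toy_schedule(struct toy_wait_queue *q, const char *writes, size_t *next)
{
    if (writes[*next] == '\0')
        return 0;
    flag = writes[(*next)++];
    toy_wake_up(q);
    return 1;
}

/* Waits until flag == 'y'; returns how often it slept, or -1 if never woken. */
int toy_wait_for_y(const char *writes)
{
    struct toy_wait_queue wq = { NULL };
    struct toy_task me = { TOY_RUNNING };
    size_t next = 0;
    int sleeps = 0;

    for (;;) {
        toy_prepare_to_wait(&wq, &me, TOY_INTERRUPTIBLE);
        if (flag == 'y')                 /* the condition is tested by hand */
            break;
        if (!toy_schedule(&wq, writes, &next)) {
            toy_finish_wait(&wq, &me);
            return -1;
        }
        assert(me.state == TOY_RUNNING); /* we only get here after a wake-up */
        sleeps++;
    }
    toy_finish_wait(&wq, &me);
    flag = 'n';                          /* consume the event, as the driver does */
    return sleeps;
}
```

Calling toy_wait_for_y("1y") sleeps twice: the first wake-up finds flag set to '1', so the loop goes back to sleep, and only the second wake-up satisfies the condition.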

Waiting / Blocking in Linux Driver Part – 3

The last article in this series focused on implementing the basic wait mechanism. It was a manual wait, where everything, from putting the process to sleep to checking for the wake-up event, was done by the driver writer. But such manual waiting is error prone and may at times result in synchronization bugs. So, does the kernel provide a more robust wait mechanism? No points for guessing the right answer: yes, it does. So, read on to explore more on the wait mechanisms in the kernel.

Wait Queues

A wait queue is a mechanism provided by the kernel to implement waiting. As the name suggests, a wait queue is a list of processes waiting for an event. Below are the data structures for wait queues:

#include <linux/wait.h>
// Data structure: wait_queue_head_t
// Created statically 
DECLARE_WAIT_QUEUE_HEAD(wait_queue_name);
// Created dynamically
wait_queue_head_t my_queue;
init_waitqueue_head(&my_queue);

As seen above, wait queues can be defined and initialized statically as well as dynamically. Once the wait queue is initialized, the next step is to add our process to the wait queue. Below are the variants for this:

// APIs for Waiting
wait_event(queue, condition);
wait_event_interruptible(queue, condition);
wait_event_timeout(queue, condition, timeout);
wait_event_interruptible_timeout(queue, condition, timeout);

As seen, there are two basic variants, wait_event() and wait_event_timeout(), each of which also comes in an interruptible form that can additionally be woken up by a signal. The former waits until the condition becomes true, while the latter gives up once the timeout, specified in jiffies, has elapsed. This is useful when, say, the requirement is to wait for an event for at most 5 milliseconds and then time out.
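
Because the timeout argument is expressed in jiffies (timer ticks) rather than milliseconds, a 5 ms wait has to be converted first, typically with the kernel's msecs_to_jiffies() helper. The user-space sketch below only models the rounding idea. ms_to_ticks() and its hz parameter are hypothetical names, and the real helper also deals with overflow and other corner cases.

```c
#include <assert.h>

/*
 * Hypothetical model of converting milliseconds to timer ticks.
 * hz is the tick rate (ticks per second). The result is rounded up so
 * that the wait is never shorter than requested.
 */
unsigned long ms_to_ticks(unsigned long ms, unsigned long hz)
{
    assert(hz > 0);
    return (ms * hz + 999) / 1000;
}
```

With a 250 Hz tick, ms_to_ticks(5, 250) yields 2 ticks (8 ms), so the effective granularity of a short timeout depends on the tick rate. In a driver, the call would look like wait_event_timeout(wq, flag == 'y', msecs_to_jiffies(5)), which returns 0 if the timeout elapsed with the condition still false.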

So, this was about waiting. The other part of the story is waking up. For this, we have the wake_up() family of APIs as shown below:

// Wakes up all the processes waiting on the queue
wake_up(wait_queue_head_t *);
// Wakes up only the processes performing the interruptible sleep
wake_up_interruptible(wait_queue_head_t *);

Below is modified code from the last article where we use wait queues:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/errno.h>
#include <asm/uaccess.h>
#include <linux/wait.h>
#include <linux/sched.h>
#include <linux/delay.h>

#define FIRST_MINOR 0
#define MINOR_CNT 1

static char flag = 'n';
static dev_t dev;
static struct cdev c_dev;
static struct class *cl;
static DECLARE_WAIT_QUEUE_HEAD(wq);

int open(struct inode *inode, struct file *filp)
{
	printk(KERN_INFO "Inside open\n");
	return 0;
}

int release(struct inode *inode, struct file *filp) 
{
	printk (KERN_INFO "Inside close\n");
	return 0;
}

ssize_t read(struct file *filp, char __user *buff, size_t count, loff_t *offp)
{
	printk(KERN_INFO "Inside read\n");
	printk(KERN_INFO "Scheduling out\n");
	/* Sleeps until flag == 'y'; returns non-zero if a signal arrived first */
	if (wait_event_interruptible(wq, flag == 'y'))
		return -ERESTARTSYS;
	flag = 'n';
	printk(KERN_INFO "Woken up\n");
	return 0;
}

ssize_t write(struct file *filp, const char *buff, size_t count, loff_t *offp) 
{   
	printk(KERN_INFO "Inside write\n");
	if (copy_from_user(&flag, buff, 1))
	{
		return -EFAULT;
	}
	printk(KERN_INFO "%c\n", flag);
	wake_up_interruptible(&wq);
	return count;
}

struct file_operations pra_fops = {
	.read    = read,
	.write   = write,
	.open    = open,
	.release = release
};

int wq_init (void)
{
	int ret;
	struct device *dev_ret;

	if ((ret = alloc_chrdev_region(&dev, FIRST_MINOR, MINOR_CNT, "SCD")) < 0)
	{
		return ret;
	}
	printk("Major Nr: %d\n", MAJOR(dev));

	cdev_init(&c_dev, &pra_fops);

	if ((ret = cdev_add(&c_dev, dev, MINOR_CNT)) < 0)
	{
		unregister_chrdev_region(dev, MINOR_CNT);
		return ret;
	}

	if (IS_ERR(cl = class_create(THIS_MODULE, "chardrv")))
	{
		cdev_del(&c_dev);
		unregister_chrdev_region(dev, MINOR_CNT);
		return PTR_ERR(cl);
	}
	if (IS_ERR(dev_ret = device_create(cl, NULL, dev, NULL, "mychar%d", 0)))
	{
		class_destroy(cl);
		cdev_del(&c_dev);
		unregister_chrdev_region(dev, MINOR_CNT);
		return PTR_ERR(dev_ret);
	}
	return 0;
}

void wq_cleanup(void)
{
	printk(KERN_INFO "Inside cleanup_module\n");
	device_destroy(cl, dev);
	class_destroy(cl);
	cdev_del(&c_dev);
	unregister_chrdev_region(dev, MINOR_CNT);
}

module_init(wq_init);
module_exit(wq_cleanup);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Pradeep");
MODULE_DESCRIPTION("Waiting Process Demo");

As seen, the earlier manual waiting has been replaced by a single wait_event_interruptible() statement, which is more robust.

Below is the sample run of the above program, assuming that the module is compiled as wait.ko:

$ insmod wait.ko
Major Nr: 250
$ cat /dev/mychar0
Inside open
Inside read
Scheduling out

This gets our process blocked. Open another shell to wake up the process:

$ echo 'y' > /dev/mychar0
Inside open
Inside write
y
Inside close
Woken up
Inside close

As seen above, this will wake up the process, since the condition of flag being ‘y’ is satisfied.

Waiting / Blocking in Linux Driver Part – 2

In the last article, we managed to get our process blocked. As stated then, the code had a couple of problems. One of them was unblocking the process: there was no one to wake our process up, and a process that sleeps forever is of no use. Another flaw was that our process was sleeping unconditionally, whereas in real-life scenarios a process never goes to sleep unconditionally. Read on to get a further understanding of wait mechanisms in the kernel.

Waking up the Process

We have the wake_up_process() API, shown below, for waking up the process.

int wake_up_process(struct task_struct *ts);
ts - pointer to the task_struct of the waiting process

As our process would be blocked, we need some other process to invoke this API. Below is a code snippet which demonstrates the usage of this API.

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h> 
#include <linux/cdev.h>
#include <linux/device.h>
#include <asm/uaccess.h>
#include <linux/wait.h>
#include <linux/sched.h>
#include <linux/delay.h>

#define FIRST_MINOR 0
#define MINOR_CNT 1

static dev_t dev;
static struct cdev c_dev;
static struct class *cl;
static struct task_struct *sleeping_task;

int open(struct inode *inode, struct file *filp)
{
	printk(KERN_INFO "Inside open\n");
	return 0;
}

int release(struct inode *inode, struct file *filp)
{
	printk(KERN_INFO "Inside close\n");
	return 0;
}

ssize_t read(struct file *filp, char *buff, size_t count, loff_t *offp)
{
	printk(KERN_INFO "Inside read\n");
	printk(KERN_INFO "Scheduling out\n");
	sleeping_task = current;
	set_current_state(TASK_INTERRUPTIBLE);
	schedule();
	printk(KERN_INFO "Woken up\n");
	return 0;
}

ssize_t write(struct file *filp, const char *buff, size_t count, loff_t *offp)
{
	printk(KERN_INFO "Inside write\n");
	/* Nothing to wake up if read() has not been called yet */
	if (sleeping_task)
		wake_up_process(sleeping_task);
	return count;
}

struct file_operations fops =
{
	.read = read,
	.write = write,
	.open = open,
	.release = release
};

int schd_init (void) 
{
	int ret;
	struct device *dev_ret;

	if ((ret = alloc_chrdev_region(&dev, FIRST_MINOR, MINOR_CNT, "wqd")) < 0)
	{
		return ret;
	}
	printk(KERN_INFO "Major Nr: %d\n", MAJOR(dev));

	cdev_init(&c_dev, &fops);

	if ((ret = cdev_add(&c_dev, dev, MINOR_CNT)) < 0)
	{
		unregister_chrdev_region(dev, MINOR_CNT);
		return ret;
	}

	if (IS_ERR(cl = class_create(THIS_MODULE, "chardrv")))
	{
		cdev_del(&c_dev);
		unregister_chrdev_region(dev, MINOR_CNT);
		return PTR_ERR(cl);
	}
	if (IS_ERR(dev_ret = device_create(cl, NULL, dev, NULL, "mychar%d", 0)))
	{
		class_destroy(cl);
		cdev_del(&c_dev);
		unregister_chrdev_region(dev, MINOR_CNT);
		return PTR_ERR(dev_ret);
	}
	return 0;
}

void schd_cleanup(void) 
{
	printk(KERN_INFO "Inside cleanup_module\n");
	device_destroy(cl, dev);
	class_destroy(cl);
	cdev_del(&c_dev);
	unregister_chrdev_region(dev, MINOR_CNT);
}

module_init(schd_init);
module_exit(schd_cleanup);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Pradeep Tewani");
MODULE_DESCRIPTION("Waiting Process Demo");

In the above example, we are using the global variable sleeping_task to hold the task_struct of the sleeping process. This variable is updated in read() function. In write() function, we use the sleeping_task as a parameter to the wake_up_process() API.

Below is the sample run for the above example, assuming that the module is compiled as sched.ko:

$ insmod sched.ko
Major Nr: 250
$ cat /dev/mychar0
Inside open
Inside read
Scheduling out

The above output is the same as that of the example from the last article. Now comes the interesting part: waking up the process. For this, open another shell and execute the command below:

$ echo 1 > /dev/mychar0
Inside open
Inside write
Woken up
Inside close
Inside close

When we execute the echo command, write operation gets invoked, which invokes the wake_up_process() to wake up the blocked process.

Waiting on an event

What we saw in the above example was the basic mechanism to block and unblock a process. However, as discussed earlier, a process always waits on some event. The event can be the passing of a specified amount of time, the availability of some resource, or the arrival of some data. Below is the modified version of the above program, which waits for an event.

static char flag = 'n';

ssize_t read(struct file *filp, char __user *buff, size_t count, loff_t *offp)
{
	printk(KERN_INFO "Inside read\n");
	printk(KERN_INFO "Scheduling out\n");
	sleeping_task = current;
	/* Go back to sleep until the writer has set flag to 'y' */
	while (flag != 'y')
	{
		set_current_state(TASK_INTERRUPTIBLE);
		schedule();
		/* Woken up, but the event we are waiting for has not happened */
		if (flag != 'y')
			printk(KERN_INFO "Interrupted by signal\n");
	}
	flag = 'n';
	printk(KERN_INFO "Woken up\n");
	return 0;
}

ssize_t write(struct file *filp, const char __user *buff, size_t count, loff_t *offp)
{
	printk(KERN_INFO "Inside write\n");
	/* Copy the first byte written by the user into flag */
	if (get_user(flag, buff))
		return -EFAULT;
	printk(KERN_INFO "%c\n", flag);
	/* Nothing to wake up if read() has not been called yet */
	if (sleeping_task)
		wake_up_process(sleeping_task);
	return count;
}

Here, we use the global variable flag to signal the condition, and the event for waking up is the flag being set to ‘y’. This flag is updated in the write() function as per the data from user space. Below is the sample run of the above program, assuming that the module is compiled as sched.ko:

$ insmod sched.ko
Major Nr: 250
$ cat /dev/mychar0
Inside open
Inside read
Scheduling out

This gets our process blocked. Open another shell to wake up the process:

$ echo 1 > /dev/mychar0
Inside open
Inside write
Interrupted by signal
Inside close

Unlike the earlier program, this doesn’t unblock the process. The process wakes up and goes back to sleep, since the condition for waking up is not satisfied. The process will wake up for good only if the flag is set to ‘y’. Let’s execute the echo as below:

$ echo 'y' > /dev/mychar0
Inside open
Inside write
Woken up
Inside close
Inside close

As seen above, this will wake up the process, since the condition of flag being ‘y’ is satisfied.
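
This hand-written wait has a subtle weakness, the lost wake-up. The read() above tests flag first and only then calls set_current_state(); if write() runs in between, its wake-up arrives while the reader is still TASK_RUNNING and is then overwritten. The sketch below replays both orderings step by step in a single-threaded user-space toy. All names are hypothetical, and it models only the state variable, not the real scheduler.

```c
#include <assert.h>

enum toy_state { TOY_RUNNING, TOY_INTERRUPTIBLE };

struct toy_task {
    enum toy_state state;
    char flag;
};

/* What write() does: publish the event, then wake the reader. */
static void toy_write_y(struct toy_task *t)
{
    t->flag = 'y';
    t->state = TOY_RUNNING;        /* models wake_up_process() */
}

/* schedule() only puts the task to sleep if it is not runnable. */
static int toy_would_sleep(const struct toy_task *t)
{
    return t->state != TOY_RUNNING;
}

/* Check first, set the state later: the writer slips in between. */
int check_then_set_state(void)
{
    struct toy_task t = { TOY_RUNNING, 'n' };

    if (t.flag == 'y')             /* reader: the condition is false... */
        return 0;
    toy_write_y(&t);               /* ...the writer runs right here... */
    assert(t.flag == 'y');
    t.state = TOY_INTERRUPTIBLE;   /* ...and its wake-up is overwritten */
    return toy_would_sleep(&t);    /* asleep although flag == 'y' */
}

/* Set the state first, check afterwards: the same interleaving is harmless. */
int set_state_then_check(void)
{
    struct toy_task t = { TOY_RUNNING, 'n' };

    t.state = TOY_INTERRUPTIBLE;   /* reader: announce the intent to sleep */
    if (t.flag == 'y') {           /* the condition is still false */
        t.state = TOY_RUNNING;
        return 0;
    }
    toy_write_y(&t);               /* the writer runs at the same point */
    return toy_would_sleep(&t);    /* runnable again, so no sleep */
}
```

In the first ordering the reader ends up asleep even though the event has already happened; in the second, the wake-up is preserved. The second ordering is the one the kernel's wait queue helpers follow, which is one reason to prefer them over a hand-written sleep.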

Conclusion

In this article, we implemented the basic wait mechanism in the driver. This was a manual wait, where everything needs to be taken care of by the driver writer, and as such it is prone to synchronization issues. So, this kind of manual waiting is rarely used. However, the kernel does provide a more robust mechanism to implement waiting. So, stay tuned to my next article to learn more about waiting in Linux drivers.

Waiting / Blocking in Linux Driver

Continuing our journey with Linux kernel internals, the next few articles in this series will focus on wait mechanisms in the kernel. Now, you might be wondering: why do we need to wait in a Linux driver? Well, there can be quite a lot of reasons. Let’s say you are interfacing with hardware such as an LCD, which requires you to wait for 5 ms before sending a subsequent command. Another example: you want to read data from a disk, and since the disk is a slower device, you may have to wait until valid data is available. In these scenarios, we have no option but to wait. One of the simplest ways to implement the wait is a busy loop, but it burns CPU cycles that other processes could use. So, does the kernel provide any efficient mechanisms to wait? Yes, of course, the kernel provides a variety of mechanisms for waiting. Read on to get the crux of waiting in the Linux kernel.

Process States in Linux

Before moving on to the wait mechanisms, it is worthwhile to understand the process states in Linux. At any point in time, a process can be in any of the states mentioned below:

  • TASK_RUNNING :- Process is in run queue or is running
  • TASK_STOPPED :- Process execution has been stopped, e.g. by a signal such as SIGSTOP
  • TASK_INTERRUPTIBLE :- Process is waiting for some event, but can be woken up by signal
  • TASK_UNINTERRUPTIBLE :- Similar to TASK_INTERRUPTIBLE, but can’t be woken up by signal
  • TASK_ZOMBIE :- Process is terminated, but not cleaned up yet

For a process to be scheduled, it needs to be in the TASK_RUNNING state, while the TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE states correspond to a waiting process.

Wait Mechanism in Linux Kernel

The schedule() API provides the basic wait mechanism in the Linux kernel. Invoking this API yields the processor and invokes the scheduler to schedule any other process in the run queue. Below is a programming example for the same:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <asm/uaccess.h>
#include <linux/wait.h>
#include <linux/sched.h>
#include <linux/delay.h>

#define FIRST_MINOR 0
#define MINOR_CNT 1

static dev_t dev;
static struct cdev c_dev;
static struct class *cl;

int open(struct inode *inode, struct file *filp)
{
	printk(KERN_INFO "Inside open\n");
	return 0;
}

int release(struct inode *inode, struct file *filp)
{
	printk(KERN_INFO "Inside close\n");
	return 0;
}

ssize_t read(struct file *filp, char *buff, size_t count, loff_t *offp)
{
	printk(KERN_INFO "Inside read\n");
	printk(KERN_INFO "Scheduling out\n");
	schedule();
	printk(KERN_INFO "Woken up\n");
	return 0;
}

ssize_t write(struct file *filp, const char *buff, size_t count, loff_t *offp)
{
	printk(KERN_INFO "Inside write\n");
	/* Report all bytes as consumed so that the caller does not retry */
	return count;
}

struct file_operations fops =
{
	.read = read,
	.write = write,
	.open = open,
	.release = release
};

int schd_init (void)
{
	int ret;
	struct device *dev_ret;

	if ((ret = alloc_chrdev_region(&dev, FIRST_MINOR, MINOR_CNT, "wqd")) < 0)
	{
		return ret;
	}
	printk("Major Nr: %d\n", MAJOR(dev));

	cdev_init(&c_dev, &fops);

	if ((ret = cdev_add(&c_dev, dev, MINOR_CNT)) < 0)
	{
		unregister_chrdev_region(dev, MINOR_CNT);
		return ret;
	}

	if (IS_ERR(cl = class_create(THIS_MODULE, "chardrv")))
	{
		cdev_del(&c_dev);
		unregister_chrdev_region(dev, MINOR_CNT);
		return PTR_ERR(cl);
	}
	if (IS_ERR(dev_ret = device_create(cl, NULL, dev, NULL, "mychar%d", 0)))
	{
		class_destroy(cl);
		cdev_del(&c_dev);
		unregister_chrdev_region(dev, MINOR_CNT);
		return PTR_ERR(dev_ret);
	}
	return 0;
}

void schd_cleanup(void)
{
	printk(KERN_INFO " Inside cleanup_module\n");
	device_destroy(cl, dev);
	class_destroy(cl);
	cdev_del(&c_dev);
	unregister_chrdev_region(dev, MINOR_CNT);
}

module_init(schd_init);
module_exit(schd_cleanup);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Pradeep Tewani");
MODULE_DESCRIPTION("Waiting Process Demo");

The example above is a simple character driver demonstrating the use of the schedule() API. In the read() function, we invoke schedule() to yield the processor. Below is the sample run, assuming that the above program is compiled as sched.ko:

$ insmod sched.ko
Major Nr: 244
$ cat /dev/mychar0
Inside open
Inside read
Scheduling out
Woken up

So, what do we get? Does the usage of the schedule() API serve the purpose of waiting? Not really. Why is it so? Well, if you recall the definition of schedule(), the process invoking this API voluntarily yields the processor, but only yielding the processor is not enough. The process is still in the run queue, and as long as it is in the run queue, it will be scheduled to run again. This is exactly what happens in the above example, which makes our process come out of the wait almost immediately. So, the precondition for performing a wait with schedule() is to first move the process out of the run queue. How do we achieve this? For this, we have an API called set_current_state(). Below is the modified code snippet for the read() function:

ssize_t read(struct file *filp, char *buff, size_t count, loff_t *offp)
{
	printk(KERN_INFO "Inside read\n");
	printk(KERN_INFO "Scheduling out\n");
	set_current_state(TASK_INTERRUPTIBLE);
	schedule();
	printk(KERN_INFO "Woken up\n");
	return 0;
}

In the above example, before invoking the schedule() API, we set the state of the process to TASK_INTERRUPTIBLE. With this state set, schedule() takes the process off the run queue, and hence it won’t be scheduled to run again until it is woken up. Below is the sample run for the modified example:

$ insmod sched.ko
Major Nr: 250
$ cat /dev/mychar0
Inside open
Inside read
Scheduling out

Conclusion

So, finally we are able to get the process blocked. But do you see the problem with this code? The process is blocked indefinitely. Who will wake this process up, and when? Another thing worth noting is that in real-life scenarios, a process always waits on some event, but our process is put into an unconditional sleep. How do we make a process wait on an event? To find out the answers to these questions, stay tuned to my next article. Till then, Happy Waiting!

Synchronization without Locking

We have covered the various synchronization mechanisms in the previous articles. One thing common to most of them is that they make the caller wait, typically by putting the process to sleep, if the lock is not available. Also, all of them are prone to deadlock, if not used carefully. Sometimes, however, all we need to protect is a simple variable such as an integer. It can be as simple as setting a flag. Using a semaphore or spinlock to protect such a variable may be an unnecessary overhead. So, does the kernel provide any synchronization mechanism without locking? Read on to explore more on this.

Atomic Operations

Atomic operations are indivisible and uninterruptible. Each of these compiles into a single machine instruction as far as possible and is guaranteed to be atomic. The kernel provides an atomic integer type, atomic_t, for atomic operations. Below are the operations:

#include <asm/atomic.h>

void atomic_set(atomic_t *a, int i); // Set the atomic variable a to integer value i
int atomic_read(atomic_t *a); // Return the value of atomic variable a
void atomic_add(int i, atomic_t *a); // Add i to atomic variable a
void atomic_sub(int i, atomic_t *a); // Subtract i from atomic variable a
void atomic_inc(atomic_t *a); // Increment operation
void atomic_dec(atomic_t *a); // Decrement operation
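
These calls only compile inside the kernel. To experiment with the same idea from user space, C11's <stdatomic.h> offers a comparable atomic integer type. The sketch below is an analogy under that assumption, not the kernel API: the counter_* wrappers are hypothetical names chosen to mirror the operations listed above.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int counter;   /* plays the role of an atomic_t */

void counter_set(int i) { atomic_store(&counter, i); }      /* like atomic_set()  */
int  counter_read(void) { return atomic_load(&counter); }   /* like atomic_read() */
void counter_add(int i) { atomic_fetch_add(&counter, i); }  /* like atomic_add()  */
void counter_sub(int i) { atomic_fetch_sub(&counter, i); }  /* like atomic_sub()  */
void counter_inc(void)  { atomic_fetch_add(&counter, 1); }  /* like atomic_inc()  */
void counter_dec(void)  { atomic_fetch_sub(&counter, 1); }  /* like atomic_dec()  */

/* Runs each operation once and returns the final value. */
int counter_demo(void)
{
    counter_set(5);
    assert(counter_read() == 5);
    counter_add(3);   /* 8 */
    counter_sub(1);   /* 7 */
    counter_inc();    /* 8 */
    counter_dec();    /* 7 */
    return counter_read();
}
```

No lock is taken anywhere: each wrapper is a single atomic read-modify-write, which is the property that makes such counters cheap to share.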

Atomic Bit Operations

Often, the requirement is just to flag some condition, for which a single bit serves the purpose well. However, a variable of type atomic_t is not well suited to manipulating individual bits. For this, the kernel provides a set of bit operations, as listed below:

#include <asm/bitops.h>

void set_bit(int nr, volatile unsigned long *a); // Set the bit number nr in value pointed by a
void clear_bit(int nr, volatile unsigned long *a); // Clear the bit number nr in value pointed by a
void change_bit(int nr, volatile unsigned long *a); // Toggle the bit at position nr
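
As a user-space analogy, the same three operations can be sketched with C11 atomics, where an atomic OR, AND and XOR play the roles of set, clear and toggle. The toy_* names below are hypothetical look-alikes, not the kernel functions, and toy_test_bit() is added only so that the result can be inspected.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space look-alikes of set_bit(), clear_bit() and change_bit(). */
void toy_set_bit(int nr, atomic_ulong *a)    { atomic_fetch_or(a, 1UL << nr); }
void toy_clear_bit(int nr, atomic_ulong *a)  { atomic_fetch_and(a, ~(1UL << nr)); }
void toy_change_bit(int nr, atomic_ulong *a) { atomic_fetch_xor(a, 1UL << nr); }

/* Reads one bit, in the spirit of the kernel's test_bit(). */
int toy_test_bit(int nr, atomic_ulong *a)
{
    return (int)((atomic_load(a) >> nr) & 1UL);
}

/* Applies a few operations to an empty word and returns the result. */
unsigned long toy_bits_demo(void)
{
    atomic_ulong flags = 0;

    toy_set_bit(0, &flags);      /* 0x1 */
    toy_set_bit(3, &flags);      /* 0x9 */
    toy_change_bit(1, &flags);   /* 0xb */
    toy_clear_bit(0, &flags);    /* 0xa */
    assert(toy_test_bit(3, &flags) == 1);
    return atomic_load(&flags);
}
```

Each call touches only the addressed bit, so several flags can share one word without a lock around it.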

Conclusion

So, these are simple, yet powerful mechanisms to provide synchronization without locking. They are quite useful when dealing with integer and bit operations, respectively, and involve far less overhead than the usual synchronization mechanisms such as semaphores and mutexes. However, they cannot protect a critical section that spans more than one such operation.

With this, we are now familiar with most of the synchronization mechanisms provided by the kernel. As you can see, each synchronization mechanism comes with its own pros and cons, and we need to be careful while selecting the right one.

Concurrency Management Part – 3

In the last two articles, we discussed some of the commonly used synchronization mechanisms in the kernel. It was observed that these mechanisms restrict access to the resource irrespective of the operation which the thread/process wants to perform on it. This, in turn, means that even though one thread has acquired the resource only for reading, another thread can’t access the same resource for reading. In most cases, it is quite desirable to let two or more threads read the resource at the same time, as long as they are not modifying it. This results in improved system performance. Read on to find out the mechanism provided by the kernel to achieve this.

Reader / Writer Semaphore

This is a type of semaphore which grants access depending on the operation which the thread/process wants to perform on the data structure. With this, multiple readers can access the resource at the same time, while only one writer gets access at a time. So, will a reader be allowed in if a write operation is in progress? Definitely not. At any time, there can be either read or write operations in progress, but there can be multiple read operations. So, let’s look at the data structures associated with the reader / writer semaphores:

#include <linux/rwsem.h>

// Data Structure
struct rw_semaphore rw_sem;

// Initialization
init_rwsem(&rw_sem);

// Operations for reader
void down_read(struct rw_semaphore *sem);
void up_read(struct rw_semaphore *sem);

// Operations for writer
void down_write(struct rw_semaphore *sem);
void up_write(struct rw_semaphore *sem);

As seen above, the initialization is similar to what we do with a regular semaphore, but the key difference lies in the fact that we have separate operations for readers and writers.

Below is an example usage of reader / writer semaphore:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/fs.h> 
#include <linux/cdev.h>
#include <linux/device.h>
#include <asm/uaccess.h>
#include <linux/rwsem.h>
#include <linux/sched.h>
#include <linux/delay.h>

#define FIRST_MINOR 0
#define MINOR_CNT 1

static dev_t dev;
static struct cdev c_dev;
static struct class *cl;
static struct task_struct *task;
static struct rw_semaphore rwsem;

int open(struct inode *inode, struct file *filp)
{
    printk(KERN_INFO "Inside open\n");
    task = current;
    return 0;
}

int release(struct inode *inode, struct file *filp)
{
    printk(KERN_INFO "Inside close\n");
    return 0;
}

ssize_t read(struct file *filp, char *buff, size_t count, loff_t *offp)
{
    printk("Inside read\n");
    down_read(&rwsem);
    printk(KERN_INFO "Got the Semaphore in Read\n");
    printk("Going to Sleep\n");
    ssleep(30);
    up_read(&rwsem);
    return 0;
}

ssize_t write(struct file *filp, const char *buff, size_t count, loff_t *offp)
{
    printk(KERN_INFO "Inside write. Waiting for Semaphore...\n");
    down_write(&rwsem);
    printk(KERN_INFO "Got the Semaphore in Write\n");
    up_write(&rwsem);
    return count;
}

struct file_operations fops =
{
    .read    = read,
    .write   = write,
    .open    = open,
    .release = release
};

int rw_sem_init(void)
{
    int ret;
    struct device *dev_ret;

    if ((ret = alloc_chrdev_region(&dev, FIRST_MINOR, MINOR_CNT, "rws")) < 0)
    {
        return ret;
    }
    printk("Major Nr: %d\n", MAJOR(dev));

    cdev_init(&c_dev, &fops);

    if ((ret = cdev_add(&c_dev, dev, MINOR_CNT)) < 0)
    {
        unregister_chrdev_region(dev, MINOR_CNT);
        return ret;
    }

    if (IS_ERR(cl = class_create(THIS_MODULE, "chardrv")))
    {
        cdev_del(&c_dev);
        unregister_chrdev_region(dev, MINOR_CNT);
        return PTR_ERR(cl);
    }
    if (IS_ERR(dev_ret = device_create(cl, NULL, dev, NULL, "mychar%d", 0)))
    {
        class_destroy(cl);
        cdev_del(&c_dev);
        unregister_chrdev_region(dev, MINOR_CNT);
        return PTR_ERR(dev_ret);
    }

    init_rwsem(&rwsem);

    return 0;
}

void rw_sem_cleanup(void)
{
    printk(KERN_INFO "Inside cleanup_module\n");
    device_destroy(cl, dev);
    class_destroy(cl);
    cdev_del(&c_dev);
    unregister_chrdev_region(dev, MINOR_CNT);
}

module_init(rw_sem_init);
module_exit(rw_sem_cleanup);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("SysPlay Workshops <workshop@sysplay.in>");
MODULE_DESCRIPTION("Reader Writer Semaphore Demo");

Below is the sample run:

cat /dev/mychar0
Inside open
Inside read
Got the Semaphore in Read
Going to Sleep

cat /dev/mychar0 (In different shell)
Inside open
Inside read
Got the Semaphore in Read
Going to Sleep

echo 1 > /dev/mychar0 (In different shell)
Inside write. Waiting for Semaphore...

As seen above, multiple reader processes are able to access the resource simultaneously. However, the writer process gets blocked while the readers are accessing the resource.
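
The admission rule behind this behaviour can be written down in a few lines. The sketch below is a user-space toy, not the kernel's implementation: struct toy_rwsem and the toy_* functions are hypothetical, and instead of sleeping they simply report whether the caller would have been let in, loosely in the spirit of the kernel's down_read_trylock() and down_write_trylock(), which return 1 on success and 0 on contention.

```c
#include <assert.h>

struct toy_rwsem {
    int readers;   /* number of readers currently inside */
    int writer;    /* 1 while a writer is inside */
};

/* A reader gets in unless a writer holds the semaphore. */
int toy_down_read_trylock(struct toy_rwsem *s)
{
    if (s->writer)
        return 0;
    s->readers++;
    return 1;
}

void toy_up_read(struct toy_rwsem *s)
{
    assert(s->readers > 0);
    s->readers--;
}

/* A writer gets in only when nobody else is inside. */
int toy_down_write_trylock(struct toy_rwsem *s)
{
    if (s->writer || s->readers > 0)
        return 0;
    s->writer = 1;
    return 1;
}

void toy_up_write(struct toy_rwsem *s)
{
    assert(s->writer);
    s->writer = 0;
}

/*
 * Replays the sample run: two readers, then a writer. Returns 1 if both
 * readers got in together, the writer was kept out while they were inside,
 * and the writer got in (keeping readers out) once they had left.
 */
int toy_rwsem_demo(void)
{
    struct toy_rwsem s = { 0, 0 };
    int r1, r2, writer_kept_out, writer_in, reader_kept_out;

    r1 = toy_down_read_trylock(&s);                 /* first cat */
    r2 = toy_down_read_trylock(&s);                 /* second cat */
    writer_kept_out = !toy_down_write_trylock(&s);  /* echo has to wait */
    toy_up_read(&s);
    toy_up_read(&s);
    writer_in = toy_down_write_trylock(&s);         /* now the writer gets in */
    reader_kept_out = !toy_down_read_trylock(&s);   /* and readers are kept out */
    toy_up_write(&s);
    return r1 && r2 && writer_kept_out && writer_in && reader_kept_out;
}
```

The real rw_semaphore is more involved. For example, it also has to decide what happens to newly arriving readers while a writer is queued, which this toy ignores.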

Conclusion

With this, we have covered most of the commonly used synchronization mechanisms in the kernel. Apart from these, the kernel provides atomic operations, which execute indivisibly, without interruption. These are useful when we need to perform simple operations on integers and bits.

Concurrency Management Part – 2

In the previous article, we discussed the basic synchronization mechanisms such as mutex and semaphore. As a part of that, a couple of questions came up. If a binary semaphore can achieve the synchronization provided by a mutex, then why do we need the mutex at all? Another question was: can we use a semaphore / mutex in interrupt handlers? To find the answers to these questions, read on.

Mutex and Binary Semaphore

Below is the simple example using the binary semaphore:

#include <linux/module.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/errno.h>
#include <asm/uaccess.h>
#include <linux/semaphore.h>

#define FIRST_MINOR 0
#define MINOR_CNT 1

static dev_t dev;
static struct cdev c_dev;
static struct class *cl;

static int my_open(struct inode *i, struct file *f)
{
    return 0;
}
static int my_close(struct inode *i, struct file *f)
{
    return 0;
}

static char c = 'A';
static struct semaphore my_sem;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    // Acquire the Semaphore
    if (down_interruptible(&my_sem))
    {
        // Interrupted by a signal before the semaphore became available
        printk(KERN_INFO "Unable to acquire Semaphore\n");
        return -ERESTARTSYS;
    }
    return 0;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len,
        loff_t *off)
{
    // Release the semaphore
    up(&my_sem);
    if (copy_from_user(&c, buf + len - 1, 1))
    {
        return -EFAULT;
    }
    return len;
}

static struct file_operations driver_fops =
{
    .owner = THIS_MODULE,
    .open = my_open,
    .release = my_close,
    .read = my_read,
    .write = my_write
};

static int __init sem_init(void)
{
    int ret;
    struct device *dev_ret;

    if ((ret = alloc_chrdev_region(&dev, FIRST_MINOR, MINOR_CNT, "my_sem")) < 0)
    {
        return ret;
    }

    cdev_init(&c_dev, &driver_fops);

    if ((ret = cdev_add(&c_dev, dev, MINOR_CNT)) < 0)
    {
        unregister_chrdev_region(dev, MINOR_CNT);
        return ret;
    }

    if (IS_ERR(cl = class_create(THIS_MODULE, "char")))
    {
        cdev_del(&c_dev);
        unregister_chrdev_region(dev, MINOR_CNT);
        return PTR_ERR(cl);
    }

    if (IS_ERR(dev_ret = device_create(cl, NULL, dev, NULL, "mysem%d", FIRST_MINOR)))
    {
        class_destroy(cl);
        cdev_del(&c_dev);
        unregister_chrdev_region(dev, MINOR_CNT);
        return PTR_ERR(dev_ret);
    }

    sema_init(&my_sem, 0);
    return 0;
}

static void __exit sem_exit(void)
{
    device_destroy(cl, dev);
    class_destroy(cl);
    cdev_del(&c_dev);
    unregister_chrdev_region(dev, MINOR_CNT);
}

module_init(sem_init);
module_exit(sem_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Pradeep");
MODULE_DESCRIPTION("Binary Semaphore Demonstration");

In the above example, we initialize the semaphore with the value of 0 using sema_init(). In my_read(), we decrement the semaphore, and in my_write(), we increment it. Since the initial value is 0, the read blocks until a write increments the semaphore. Below is the sample run:

insmod sem.ko
cat /dev/mysem0 - This will block
echo 1 > /dev/mysem0 - Will unblock the cat process

Now, let’s try achieving the same with mutex. Below is the example for the same.

#include <linux/module.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/device.h>
#include <linux/errno.h>
#include <asm/uaccess.h>
#include <linux/mutex.h>

#define FIRST_MINOR 0
#define MINOR_CNT 1

DEFINE_MUTEX(my_mutex);

static dev_t dev;
static struct cdev c_dev;
static struct class *cl;

static int my_open(struct inode *i, struct file *f)
{
    return 0;
}
static int my_close(struct inode *i, struct file *f)
{
    return 0;
}

static char c = 'A';

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
    if (mutex_lock_interruptible(&my_mutex))
    {
        // Interrupted by a signal before the mutex became available
        printk(KERN_INFO "Unable to acquire Mutex\n");
        return -ERESTARTSYS;
    }
    return 0;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len,
        loff_t *off)
{
    mutex_unlock(&my_mutex);
    if (copy_from_user(&c, buf + len - 1, 1))
    {
        return -EFAULT;
    }
    return len;
}

static struct file_operations driver_fops =
{
    .owner = THIS_MODULE,
    .open = my_open,
    .release = my_close,
    .read = my_read,
    .write = my_write
};

static int __init init_mutex(void)
{
    int ret;
    struct device *dev_ret;

    if ((ret = alloc_chrdev_region(&dev, FIRST_MINOR, MINOR_CNT, "my_mutex")) < 0)
    {
        return ret;
    }

    cdev_init(&c_dev, &driver_fops);

    if ((ret = cdev_add(&c_dev, dev, MINOR_CNT)) < 0)
    {
        unregister_chrdev_region(dev, MINOR_CNT);
        return ret;
    }

    if (IS_ERR(cl = class_create(THIS_MODULE, "char")))
    {
        cdev_del(&c_dev);
        unregister_chrdev_region(dev, MINOR_CNT);
        return PTR_ERR(cl);
    }

    if (IS_ERR(dev_ret = device_create(cl, NULL, dev, NULL, "mymutex%d",
        FIRST_MINOR)))
    {
        class_destroy(cl);
        cdev_del(&c_dev);
        unregister_chrdev_region(dev, MINOR_CNT);
        return PTR_ERR(dev_ret);
    }

    return 0;
}

static void __exit exit_mutex(void)
{
    device_destroy(cl, dev);
    class_destroy(cl);
    cdev_del(&c_dev);
    unregister_chrdev_region(dev, MINOR_CNT);
}

module_init(init_mutex);
module_exit(exit_mutex);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Pradeep");
MODULE_DESCRIPTION("Mutex Demonstration");

In the above example, I have replaced the semaphore with mutex. Below is the sample run:

cat /dev/mymutex0 - This will acquire the mutex
cat /dev/mymutex0 - This will block
echo 1 > /dev/mymutex0

So, what do you get after executing the echo command? I get the warning as below:

DEBUG_LOCKS_WARN_ON(lock->owner != current)

So, what does this warning mean? It warns that the process trying to unlock the mutex is not its owner. But the same thing worked without any warning with the semaphore. This brings us to an important difference between the mutex and the semaphore: a mutex has ownership associated with it. The process that acquires the lock is the one that must unlock it. No such ownership exists with the semaphore, so any process may call up(), and it’s completely up to the user to ensure that down and up are called in pairs. With a mutex, lock and unlock must be called in pairs by the same owner, and a kernel built with lock debugging reports a violation, as seen above.
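
The ownership rule can also be observed from user space. Below is a minimal sketch, assuming POSIX threads and an error-checking mutex (PTHREAD_MUTEX_ERRORCHECK) as a stand-in for the kernel mutex. It is only an analogy, not the kernel implementation, and the names try_unlock() and unlock_from_other_thread() are made up for illustration. A thread that never locked the mutex tries to unlock it and gets an error back instead of succeeding.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t lock;

/* Runs in a second thread: tries to unlock a mutex it never locked. */
static void *try_unlock(void *result)
{
    *(int *)result = pthread_mutex_unlock(&lock);
    return NULL;
}

/* Locks the mutex in the calling thread, lets another thread attempt the
 * unlock, and returns the error code which that other thread received. */
int unlock_from_other_thread(void)
{
    pthread_mutexattr_t attr;
    pthread_t other;
    int result = 0;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    /* An error-checking mutex verifies the owner on every unlock */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&lock);      /* the calling thread is now the owner */
    rc = pthread_create(&other, NULL, try_unlock, &result);
    assert(rc == 0);
    pthread_join(other, NULL);

    pthread_mutex_unlock(&lock);    /* the owner itself may unlock */
    pthread_mutex_destroy(&lock);
    return result;
}
```

Calling unlock_from_other_thread() returns EPERM, which is the user-space counterpart of the ownership warning. A POSIX semaphore has no such check, so sem_post() from any thread succeeds.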

Spinlock

Now, let’s come to the second question: can we use the semaphore / mutex in interrupt handlers? The answer is yes and no. With a semaphore, you can call up() (and the non-blocking down_trylock()) from an interrupt handler, but not down(), as it is a blocking call which puts the process to sleep, and we are not supposed to sleep in interrupt handlers. A mutex, because of its ownership, is not meant to be used in interrupt context at all. So, what if I want to achieve synchronization in interrupt handlers? For this, there is a mechanism called spinlock. A spinlock is a lock which never yields. Similar to the mutex, it has two operations: lock and unlock. If the lock is available, the process acquires it, continues into the critical section and unlocks it once it’s done. This is pretty much similar to the mutex. But what if the lock is not available? Here comes the interesting difference. With a mutex, the process sleeps until the lock is available. With a spinlock, it goes into a tight loop, where it continuously checks the lock until it becomes available. This is the spinning part of the spinlock. It was designed for multiprocessor systems. But with a preemptible kernel, even a uniprocessor system behaves like an SMP system. Below are the data structures associated with the spinlock:

#include <linux/spinlock.h>

// Data structure
spinlock_t my_slock;

// Initialization
spin_lock_init(&my_slock);

// Operations
spin_lock(&my_slock);
spin_unlock(&my_slock);

Now, let’s try to understand the complications associated with the spinlock. Let’s say thread T1 acquires the spinlock and enters the critical section. Meanwhile, some high priority thread T2 becomes runnable and preempts T1. Now, T2 also tries to acquire the spinlock and, since the lock is not available, T2 spins. Since T2 has the higher priority, T1 never gets to run and release the lock, and this results in a deadlock. So, how do we avoid such scenarios? The spinlock code is designed in such a way that any time kernel code holds a spinlock, preemption is disabled on the local processor. Therefore, it’s very important to hold a spinlock for the minimum possible time and never to sleep while holding it. What if the spinlock is shared between the thread T1 and an interrupt handler? For this, there are variants of the spinlock, such as spin_lock_irqsave() with its counterpart spin_unlock_irqrestore(), which also disable the interrupts on the local processor.
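
To make the spinning part concrete, below is a minimal user-space sketch using C11 atomics and POSIX threads. It is not the kernel’s spinlock implementation (which is architecture specific and also disables preemption); it only illustrates the tight loop described above. The names my_spin_lock(), my_spin_unlock() and run_two_spinning_threads() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define ITERATIONS 100000

static atomic_flag slock = ATOMIC_FLAG_INIT;
static long shared_counter;

static void my_spin_lock(atomic_flag *l)
{
    /* Busy-wait: retry until the flag was clear before we set it */
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;
}

static void my_spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

static void *worker(void *unused)
{
    (void)unused;
    for (int i = 0; i < ITERATIONS; i++) {
        my_spin_lock(&slock);
        shared_counter++;       /* critical section, kept very short */
        my_spin_unlock(&slock);
    }
    return NULL;
}

/* Runs two contending threads and returns the final counter value. */
long run_two_spinning_threads(void)
{
    pthread_t t1, t2;
    int rc;

    shared_counter = 0;
    rc = pthread_create(&t1, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, worker, NULL);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

Since every increment happens with the lock held, run_two_spinning_threads() returns 2 * ITERATIONS. Notice that a waiting thread burns CPU cycles instead of sleeping, which is why the critical section has to be short.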

Conclusion

One common thing which we observed with the mutex and the semaphore is that they block the process irrespective of the operation it wants to perform on the data structure. As you understand, there are two different operations a process can perform on a data structure: read and write. In most cases, it is harmless to allow multiple readers at a time, as long as they don’t modify the data structure. Such parallelism would improve the performance. So, how do we achieve this? To find the answer, stay tuned to my next article. Till then, good bye!

Concurrency Management in Linux Kernel

In the previous article, we discussed kernel threads and their various aspects such as creation, stopping, signalling and so on. Threads provide one of the ways to achieve multitasking in the kernel. While multitasking brings a definite improvement in system performance, it comes with its own side effects. So, what are the side effects of multitasking? How can we overcome these? Read on to get the answers to all these questions.

Concurrency Management

In order to achieve optimized system performance, the kernel provides multitasking, where multiple threads can execute in parallel, thereby utilizing the CPU in an optimum way. Though useful, multitasking, if not implemented cautiously, can lead to concurrency issues, which can be very difficult to handle. So, let’s take an example to understand the concurrency issues. Let’s say there are two threads, T1 and T2, and a resource called A shared between them. Both the threads execute the code below:

#include <stdio.h>

int A = 0;   /* shared between T1 and T2 */

void function(void)
{
    A++;
    printf("Value of A is %d\n", A);
}

Just imagine that thread T1 was preempted in the middle of modifying the variable, and thread T2 started to execute and tried to modify the variable A as well. So, what will be the result? An inconsistent value of the variable A. Such a scenario, where multiple threads contend for the same resource and the result depends on their timing, is called a race condition. These bugs are easy to create, but difficult to debug.
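
To see why the value becomes inconsistent, it helps to remember that A++ is usually not a single step: the value is loaded into a register, incremented there and stored back. The sketch below replays one unlucky interleaving of these steps by hand, in a single thread, so that the outcome is deterministic. It is a simulation for illustration only, and struct cpu_regs and the helper names are made up.

```c
#include <assert.h>

static int A;   /* the shared variable from the example above */

/* Each thread works on its own register copy of A. */
struct cpu_regs {
    int reg;
};

static void load(struct cpu_regs *t)        { t->reg = A; }
static void add_one(struct cpu_regs *t)     { t->reg++; }
static void store(const struct cpu_regs *t) { A = t->reg; }

/* T1 is preempted right after its load, T2 runs completely, T1 resumes. */
int interleaved_increments(void)
{
    struct cpu_regs t1, t2;

    A = 0;
    load(&t1);          /* T1 reads 0 and is preempted */
    load(&t2);
    add_one(&t2);
    store(&t2);         /* T2 runs to completion */
    assert(A == 1);
    add_one(&t1);
    store(&t1);         /* T1 resumes with its stale copy */
    return A;
}

/* The same two increments when each one completes before the next starts. */
int serialized_increments(void)
{
    struct cpu_regs t1, t2;

    A = 0;
    load(&t1);
    add_one(&t1);
    store(&t1);
    load(&t2);
    add_one(&t2);
    store(&t2);
    return A;
}
```

interleaved_increments() returns 1 even though two increments were performed, so one update is lost, while serialized_increments() returns 2. Making the three steps execute as one unit is the job of the synchronization mechanisms discussed next.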

So, what’s the best way to avoid concurrency issues? One way is to avoid global variables. But it’s not always possible to do so. As you know, hardware resources are globally shared by nature. So, in order to deal with such scenarios, the kernel provides us various synchronization mechanisms such as mutexes, semaphores and so on.

Mutex

Mutex stands for MUTual EXclusion. It’s like a compartment with a single key. Whoever enters the compartment locks it and takes the key with him. Meanwhile, if someone else tries to enter the compartment, he will have to wait. It’s only when the first person comes out and hands over the key that the other person is able to enter. Similar is the thing with the mutex. If one thread of execution acquires the mutex lock, other threads trying to acquire the same lock are blocked. It’s only when the first thread releases the mutex lock that another thread is able to acquire it. Below are the data structures for the mutex:

#include <linux/mutex.h>

struct mutex /* Mutex data structure */

// Mutex Initialization
// Statically
DEFINE_MUTEX(my_mutex);
// Dynamically
struct mutex my_mutex;
mutex_init(&my_mutex);

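// Operations
void mutex_lock(struct mutex *lock);
void mutex_unlock(struct mutex *lock);
int mutex_lock_interruptible(struct mutex *lock); // 0 on success, -EINTR if interrupted by a signal
int mutex_trylock(struct mutex *lock);            // 1 if acquired, 0 on contention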

Here, there are two versions of lock: interruptible and uninterruptible. mutex_lock_interruptible() puts the current process in the TASK_INTERRUPTIBLE state, so the process sleeps until its state is changed to TASK_RUNNING. For a process in TASK_INTERRUPTIBLE, two events may change the state to TASK_RUNNING: the mutex becoming available, or a signal being delivered to the process. In the latter case, mutex_lock_interruptible() returns a non-zero value without acquiring the mutex, so its return value must always be checked. But if the process is put into the TASK_UNINTERRUPTIBLE state, which is the case when we invoke mutex_lock(), the only event which can wake up the process is the availability of the mutex. In almost all the scenarios, we use mutex_lock_interruptible().

Below is the simple example to demonstrate the usage of mutex.

static int counter = 0;

static int thread_fn(void *unused)
{
    unsigned int i;

    while (!kthread_should_stop())
    {
        counter++;
        printk(KERN_INFO "Job %d started\n", counter);
        get_random_bytes(&i, sizeof(i));
        ssleep(i % 5);
        printk(KERN_INFO "Job %d finished\n", counter);
    }
    printk(KERN_INFO "Thread Stopping\n");
    return 0;
}

Here, we have a global variable counter, which is shared between two threads. Each thread increments the counter, prints the value and then sleeps for a random number of seconds. This is an obvious candidate for a race condition and can result in the corruption of the variable counter. So, in order to protect the variable, we use the mutex as below:

#include <linux/kthread.h>
#include <linux/delay.h>
#include <linux/random.h>
#include <linux/mutex.h>

DEFINE_MUTEX(my_mutex);
static int counter = 0;

static int thread_fn(void *unused)
{
    unsigned int i;

    while (!kthread_should_stop())
    {
        // Only one thread at a time gets past this point
        mutex_lock(&my_mutex);
        counter++;
        printk(KERN_INFO "Job %d started\n", counter);
        get_random_bytes(&i, sizeof(i));
        ssleep(i % 5);
        printk(KERN_INFO "Job %d finished\n", counter);
        mutex_unlock(&my_mutex);
    }
    printk(KERN_INFO "Thread Stopping\n");
    return 0;
}

As seen in the above code, we declare a variable my_mutex of type struct mutex which protects the global variable counter. Thread will be able to access the variable only if it is not in use by another thread. In this way, mutex synchronizes the access to the global variable counter.

Semaphore

Semaphore is a counter. It is mostly used when we need to maintain a count of resources. Let’s say we want to implement a memory manager, where we need to maintain the count of available memory pages. To start with, say we have 10 pages, so the initial value of the semaphore will be 10. If thread 1 comes and asks for 5 pages, we decrement the value of the semaphore to 5 (10 – 5). Likewise, say thread 2 asks for 5 pages, which further decrements the semaphore value to 0 (5 – 5). At this point, if another thread, say thread 3, asks for 3 pages, it will have to wait, since we don’t have any pages left. Meanwhile, if thread 1 is done with its pages, it releases 5 pages, which increments the semaphore value to 5 (0 + 5). This, in turn, unblocks thread 3, which decrements the semaphore value to 2 (5 – 3). So, as you understand, there are two possible operations with the semaphore: increment and decrement. Accordingly, we have two APIs: up and down. Note that in the kernel, each call to down() decrements the count by one and each call to up() increments it by one. Below are the data structures and APIs for the semaphore.

#include <linux/semaphore.h>

struct semaphore /* Semaphore data structure */

// Initialization
// Statically
DEFINE_SEMAPHORE(my_sem);
// Dynamically
struct semaphore my_sem;
sema_init(&my_sem, val);

// Operations
void down(struct semaphore *sem);
int down_interruptible(struct semaphore *sem);
int down_trylock(struct semaphore *sem);
void up(struct semaphore *sem);
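
The page counting described above can be tried out in user space. Below is a minimal sketch, assuming POSIX unnamed semaphores (sem_init(), sem_trywait(), sem_post()) in place of the kernel APIs. The helpers pool_init(), take_pages(), release_pages() and pages_left() are hypothetical names. Since a semaphore changes by one per operation, asking for n pages is modelled as n decrements, and the non-blocking sem_trywait() is used instead of a blocking wait so that the "no pages left" case can be observed from a single thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <semaphore.h>

static sem_t free_pages;

/* Start with the given number of free pages. */
void pool_init(int pages)
{
    int rc = sem_init(&free_pages, 0, (unsigned int)pages);
    assert(rc == 0);
}

/* Take n pages without blocking.
 * Returns 0 on success, -1 if fewer than n pages are free. */
int take_pages(int n)
{
    for (int taken = 0; taken < n; taken++) {
        if (sem_trywait(&free_pages) != 0) {
            /* Not enough pages: give back what we already took */
            while (taken-- > 0)
                sem_post(&free_pages);
            return -1;
        }
    }
    return 0;
}

/* Give n pages back to the pool. */
void release_pages(int n)
{
    while (n-- > 0)
        sem_post(&free_pages);
}

/* Current value of the semaphore, i.e. the number of free pages. */
int pages_left(void)
{
    int value = 0;

    sem_getvalue(&free_pages, &value);
    return value;
}
```

Following the numbers from the text: after pool_init(10), two calls to take_pages(5) succeed and leave 0 pages, take_pages(3) fails, and after release_pages(5) the same request succeeds and leaves 2 pages.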

As with the mutex, we have two versions of down: interruptible and uninterruptible. The initialization function sema_init() takes two arguments: a pointer to the struct semaphore and the initial value of the semaphore. If the initial value is greater than 1, we call it a counting semaphore. If the value is restricted to 1, it operates in a way similar to a mutex and is called a binary semaphore. Below is an example where the mutex is replaced with a semaphore:

#include <linux/semaphore.h>

static struct semaphore my_sem;
static int counter = 0;

static int thread_fn(void *unused)
{
    unsigned int i;

    while (!kthread_should_stop())
    {
        // A non-zero return means the wait was interrupted by a signal
        if (down_interruptible(&my_sem))
            break;
        counter++;
        printk(KERN_INFO "Job %d started\n", counter);
        get_random_bytes(&i, sizeof(i));
        ssleep(i % 5);
        printk(KERN_INFO "Job %d finished\n", counter);
        up(&my_sem);
    }
    printk(KERN_INFO "Thread Stopping\n");
    return 0;
}

// Module initialization, registered with module_init()
static int __init init_sem(void)
{
    // Binary semaphore
    sema_init(&my_sem, 1);
    return 0;
}

The code is same as with mutex. It provides the synchronized access to the global variable counter.

Conclusion

In this article, we discussed the two most commonly used synchronization mechanisms. As seen from the above examples, the synchronization achieved with the mutex can be achieved with the binary semaphore as well. Apart from this, the semaphore can also operate in counting mode. So, why do we need the mutex at all? Also, can we use the semaphore in an interrupt handler? To find the answers to these questions, stay tuned to my next article on concurrency management. Till then, good bye.

Kernel Threads Continued

In the previous article, we learned the basics of kernel threads, such as creating the thread, running the thread and so on. In this article, we will dive a bit more into kernel threads, where we will see things such as stopping the thread, signalling the thread and so on. So, let’s begin…

Continuing with the previous article, we were observing a crash while removing the kernel module with rmmod. So, were you able to find the reason for the crash? If yes, that’s very well done. The reason for the crash was … Let us first cover this article and hopefully, as a part of that, you would be able to discover the reason by yourself.

Stopping the Kernel Thread

If you are familiar with the pthreads in user space, you might have come across the call pthread_cancel(). With this call, one thread can send the cancellation request to the other. Pretty similar to this, there exists a call called kthread_stop() in kernel space. Below is the prototype for the same:

#include <linux/kthread.h>
int kthread_stop(struct task_struct *k);

Parameters:
k – pointer to the task structure of the thread to be stopped

Returns: The result of the function executed by the thread, or -EINTR if wake_up_process() was never called.

Below is the code snippet which uses kthread_stop():

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
#include <linux/delay.h>

static struct task_struct *thread_st;
// Function executed by kernel thread
static int thread_fn(void *unused)
{
    while (1)
    {
        printk(KERN_INFO "Thread Running\n");
        ssleep(5);
    }
    printk(KERN_INFO "Thread Stopping\n");
    do_exit(0);
    return 0;
}
// Module Initialization
static int __init init_thread(void)
{
    printk(KERN_INFO "Creating Thread\n");
    //Create the kernel thread with name 'mythread'
    thread_st = kthread_run(thread_fn, NULL, "mythread");
    // kthread_run() returns an ERR_PTR() value on failure, never NULL
    if (IS_ERR(thread_st))
    {
        printk(KERN_ERR "Thread creation failed\n");
        return PTR_ERR(thread_st);
    }
    printk(KERN_INFO "Thread Created successfully\n");
    return 0;
}
// Module Exit
static void __exit cleanup_thread(void)
{
    printk(KERN_INFO "Cleaning Up\n");
    if (thread_st)
    {
        kthread_stop(thread_st);
        printk(KERN_INFO "Thread stopped\n");
    }
}
MODULE_LICENSE("GPL");
module_init(init_thread);
module_exit(cleanup_thread);

Compile the code and insert the module with insmod. Now, try removing the module with rmmod. What do you see? “Dude … where is my command prompt? rmmod seems to have got stuck.” Relax guys! I forgot to mention that kthread_stop() is a blocking call. It waits for the thread to exit, and since our thread is in while (1), it will never exit and, unfortunately, our rmmod will never return. So, what does this mean? What we can infer is that kthread_stop() is just a request, not a command. Calling kthread_stop() doesn’t give you a license to kill the thread; instead, it just sets a flag associated with the thread, wakes the thread up and waits for it to exit. It’s totally up to the thread to decide when it would like to exit. So, why is it designed this way? Well, just think of a scenario where the kernel thread has allocated some memory and would free it on exit. Had it been killed in the middle, the thread would never get a chance to free the memory, which would result in a memory leak. This is one of the simplest scenarios I could think of. Coming back to our problem, how do we get back the command prompt? Let’s try one more thing. In user space, you might have used the kill command to send a signal to a process. One of the most powerful signals, which a process can’t mask, is SIGKILL. So, let’s use the same on the kernel thread as well. Find the id of the running kernel thread with the ps command and then use the following command:

kill -9 <thread_id>

So, what’s the result? “Dude … this thread is invincible!” True, by default, a kernel thread ignores all signals. The reason behind this is the same as explained above: the kernel thread has full control over when it can be killed. So, the only way to get out of this problem is to reboot the system. This program has a bug, so read on to fix it.

So, now the question is how to let the kernel thread know that somebody wants to stop it. For this, there is a call named kthread_should_stop(). This function returns a non-zero value if there is an outstanding ‘stop’ request. The thread should invoke this call periodically and, if it returns true, do the required cleanup and exit. Below is the code snippet using this mechanism:

static struct task_struct *thread_st;
// Function executed by kernel thread
static int thread_fn(void *unused)
{
    while (!kthread_should_stop())
    {
        printk(KERN_INFO "Thread Running\n");
        ssleep(5);
    }
    printk(KERN_INFO "Thread Stopping\n");
    // Returning ends the thread; kthread_stop() receives this value
    return 0;
}

Here, the thread periodically invokes kthread_should_stop() and exits if this function returns a non-zero value. In the module exit function cleanup_thread(), we call kthread_stop() to notify the thread, as earlier.
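
The same cooperative stop pattern can be tried in user space. Below is a minimal sketch with POSIX threads and a C11 atomic flag. It is only an analogy for kthread_stop() and kthread_should_stop(), not how the kernel implements them, and the names worker_start(), worker_stop(), worker_should_stop() and worker_cleaned_up() are hypothetical. The stop call sets a flag, waits for the thread to exit and hands back the thread’s return value, while the thread itself decides when to leave its loop and does its cleanup first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

static atomic_bool stop_requested;
static pthread_t worker;
static int cleaned_up;

/* Counterpart of kthread_should_stop(): has a stop been requested? */
static int worker_should_stop(void)
{
    return atomic_load(&stop_requested);
}

static void *worker_fn(void *unused)
{
    struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */

    (void)unused;
    while (!worker_should_stop())
        nanosleep(&pause, NULL);    /* stands in for the periodic work */

    cleaned_up = 1;                 /* release resources before exiting */
    return (void *)(intptr_t)0;     /* the thread's result */
}

void worker_start(void)
{
    int rc;

    atomic_store(&stop_requested, false);
    cleaned_up = 0;
    rc = pthread_create(&worker, NULL, worker_fn, NULL);
    assert(rc == 0);
}

/* Counterpart of kthread_stop(): request the stop, block until the
 * thread has exited and return the thread function's result. */
int worker_stop(void)
{
    void *result = NULL;

    atomic_store(&stop_requested, true);
    pthread_join(worker, &result);
    return (int)(intptr_t)result;
}

/* Non-zero once the worker has run its cleanup code. */
int worker_cleaned_up(void)
{
    return cleaned_up;
}
```

After worker_start(), a call to worker_stop() returns 0, the value returned by the thread function, and worker_cleaned_up() is non-zero. If the loop were while (1), worker_stop() would block forever, exactly like rmmod did above.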

Signalling the Kernel Thread

As we have already seen, by default, kernel thread ignores all the signals. So, how do we send the signal to the kernel thread, if at all it’s required in some scenarios? Again, we have some set of calls to support this. First call is allow_signal(). Below is the prototype for the same:

void allow_signal(int sig_num);

Parameters:
sig_num – signal number

Unlike user space, there are no asynchronous signal handlers in kernel threads. So, thread should periodically invoke signal_pending() call to check if there is any pending signal and should act accordingly. Below is the prototype for the same:

int signal_pending(struct task_struct *p);

Parameters:
p – pointer to the task structure of the current thread

Returns:  Non-zero value, if signal is pending

Below is the code snippet for handling the signals:

static struct task_struct *thread_st;
// Function executed by kernel thread
static int thread_fn(void *unused)
{
    // Allow the SIGKILL signal
    allow_signal(SIGKILL);
    while (!kthread_should_stop())
    {
        printk(KERN_INFO "Thread Running\n");
        ssleep(5);
        // Check if a signal is pending for this thread
        if (signal_pending(current))
            break;
    }
    printk(KERN_INFO "Thread Stopping\n");
    return 0;
}

Compile the code and insert the module with insmod. Now, find the thread id using ps and execute the below command:

kill -9 <thread_id>

With this, you will see that thread exits, once it detects the SIGKILL signal. Now, just try removing the module with rmmod. What do you get? rmmod comes out gracefully without blocking.

Conclusion

So, with this, I am done with kernel threads. Aah! I missed out one thing from the last article. Why was there a crash in the code from the last article? As you might have observed, when I call kthread_stop() in the module exit function, the thread terminates after kthread_should_stop() returns true, and we don’t see a crash. So, does it mean that kthread_stop() prevents the crash? In a way yes, but we need to understand the fundamental reason behind the crash. As you know, like any other code, the thread function needs to reside in memory to execute. So, where does this memory come from? No points for guessing the right answer: it’s the module’s memory, since thread_fn() is part of the module. So, when you unload the module, that memory is freed and is no longer valid. Our poor chap, still running, tries to execute from it and is destined to crash.

So, that’s about the kernel threads. In the next article, we will touch upon the concurrency management in the kernel. So, stay tuned …

Kernel Threads

In the previous article, we learned to write a simple kernel module. This was needed to kick-start our journey into the kernel internals, as all the code we are going to discuss has to be pushed into the kernel as a module. In this article, we will discuss kernel threads. So, let’s begin…

Understanding Threads

Threads, also known as light weight processes, are the basic unit of CPU utilization. So, why do we call them light weight processes? One of the reasons is that a context switch between threads takes much less time than one between processes. This results from the fact that all the threads within a process share the same address space, so there is no need to switch the address space. In user space, threads are created using the POSIX APIs and are known as pthreads. Another advantage of threads is that, since all the threads within a process share the same address space, communication between threads is far easier and less time consuming than between processes, and is usually done through global variables. This approach has one disadvantage though: it leads to several concurrency issues and requires synchronization mechanisms to handle them.

Having said that, why do we require threads? Need for the multiple threads arises, when we need to achieve the parallelism within the process. To give you the simple example, while working on the word processor, if we enable the spell and grammar check, as we key in the words, you will see red/green lines appear, if we type something syntactically/grammatically incorrect. This can most probably be implemented as threads.

Kernel Threads

Now, what are kernel threads? They are the same as user space threads in many aspects, but one of the biggest differences is that they exist in kernel space, execute in privileged mode and have full access to the kernel data structures. They are basically used to implement background tasks inside the kernel. Such a task can be the handling of asynchronous events or waiting for an event to occur. Device drivers utilize the services of kernel threads to handle such tasks. For example, the ksoftirqd/0 thread is used to implement soft IRQs in the kernel. The khubd kernel thread monitors the USB hubs and helps in configuring USB devices during hot-plugging.

APIs for creating the Kernel thread

Below is the API for creating the thread:

#include <linux/kthread.h>
struct task_struct *kthread_create(int (*function)(void *data), void *data, const char name[], ...)

Parameters:
function – The function that the thread has to execute
data – The ‘data’ to be passed to the function
name – The name by which the process will be recognized in the kernel

Returns: Pointer to a structure of type task_struct, or an ERR_PTR() encoded error on failure

Below is an example code which creates a kernel thread:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
#include <linux/delay.h>

static struct task_struct *thread_st;
// Function executed by kernel thread
static int thread_fn(void *unused)
{
    while (1)
    {
        printk(KERN_INFO "Thread Running\n");
        ssleep(5);
    }
    printk(KERN_INFO "Thread Stopping\n");
    do_exit(0);
    return 0;
}
// Module Initialization
static int __init init_thread(void)
{
    printk(KERN_INFO "Creating Thread\n");
    //Create the kernel thread with name 'mythread'
    thread_st = kthread_create(thread_fn, NULL, "mythread");
    // kthread_create() returns an ERR_PTR() value on failure, never NULL
    if (IS_ERR(thread_st))
    {
        printk(KERN_ERR "Thread creation failed\n");
        return PTR_ERR(thread_st);
    }
    printk(KERN_INFO "Thread Created successfully\n");
    return 0;
}
// Module Exit
static void __exit cleanup_thread(void)
{
    printk(KERN_INFO "Cleaning Up\n");
}
MODULE_LICENSE("GPL");
module_init(init_thread);
module_exit(cleanup_thread);

In the above code, thread is created in the init_thread(). Created thread executes the function thread_fn(). Compile the above code and insert the module with insmod. Below is the output, you get:

Thread Created successfully

That’s all we get as output. Now, you might be wondering why thread_fn() is not executing. The reason is that kthread_create() creates the thread in a sleeping state, and thus nothing is executed. So, how do we wake up the thread? We have an API, wake_up_process(), for this. Below is the modified code which uses this API.

// Module Initialization
static int __init init_thread(void)
{
    printk(KERN_INFO "Creating Thread\n");
    //Create the kernel thread with name 'mythread'
    thread_st = kthread_create(thread_fn, NULL, "mythread");
    if (IS_ERR(thread_st))
    {
        printk(KERN_ERR "Thread creation failed\n");
        return PTR_ERR(thread_st);
    }
    printk(KERN_INFO "Thread Created successfully\n");
    // The new thread sleeps until it is woken up explicitly
    wake_up_process(thread_st);
    return 0;
}

As you might notice, wake_up_process() takes pointer to task_struct as an argument, which in turn is returned from kthread_create(). Below is the output:

Thread Created successfully
Thread Running
Thread Running
...

As seen, running a thread is a two step process – First create a thread and wake it up using wake_up_process(). However, kernel provides an API, which performs both these steps in one go as shown below:

#include <linux/kthread.h>
struct task_struct *kthread_run(int (*function)(void *data), void *data, const char name[], ...)

Parameters:
function – The function that the thread has to execute
data – The ‘data’ to be passed to the function
name – The name by which the process will be recognized in the kernel

Returns: Pointer to a structure of type task_struct

So, just replace the kthread_create() and wake_up_process() calls in above code with kthread_run and you will notice that thread starts running immediately.
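
The difference between the two-step and the one-step approach can be mimicked in user space. Below is a minimal sketch with POSIX threads, where a condition variable plays the role of the sleeping state. It is only an analogy and says nothing about how the kernel implements kthread_create() or kthread_run(). struct sleeper and all the sleeper_*() names are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct sleeper {
    pthread_t tid;
    pthread_mutex_t lock;
    pthread_cond_t wake;
    bool woken;
    int runs;       /* how many times the thread body has executed */
};

static void *sleeper_fn(void *arg)
{
    struct sleeper *s = arg;

    /* Park until somebody wakes us, like a freshly created kthread */
    pthread_mutex_lock(&s->lock);
    while (!s->woken)
        pthread_cond_wait(&s->wake, &s->lock);
    s->runs++;      /* the real work starts only after the wake-up */
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Step 1, like kthread_create(): the thread exists but does no work yet. */
void sleeper_create(struct sleeper *s)
{
    int rc;

    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->wake, NULL);
    s->woken = false;
    s->runs = 0;
    rc = pthread_create(&s->tid, NULL, sleeper_fn, s);
    assert(rc == 0);
}

/* Step 2, like wake_up_process(): let the thread start working. */
void sleeper_wake(struct sleeper *s)
{
    pthread_mutex_lock(&s->lock);
    s->woken = true;
    pthread_cond_signal(&s->wake);
    pthread_mutex_unlock(&s->lock);
}

/* Both steps in one go, like kthread_run(). */
void sleeper_run(struct sleeper *s)
{
    sleeper_create(s);
    sleeper_wake(s);
}

/* Number of times the body has run so far. */
int sleeper_runs(struct sleeper *s)
{
    int runs;

    pthread_mutex_lock(&s->lock);
    runs = s->runs;
    pthread_mutex_unlock(&s->lock);
    return runs;
}

/* Wait for the thread to finish and return how often its body ran. */
int sleeper_join(struct sleeper *s)
{
    pthread_join(s->tid, NULL);
    pthread_cond_destroy(&s->wake);
    pthread_mutex_destroy(&s->lock);
    return s->runs;
}
```

After sleeper_create(), sleeper_runs() stays at 0 until sleeper_wake() is called, which matches the "Thread Created successfully" only output seen earlier. sleeper_run() simply combines both calls.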

Conclusion

So, now we are comfortable with creating the threads, let us remove the module with rmmod.  What do you get? Oops…isn’t it? To understand the reason for the crash, stay tuned to my next article on kernel threads. Till then, good bye.
