Tag Archives: Character Device File Operations

I/O Control in Linux

This ninth article, which is part of the series on Linux device drivers, talks about the typical ioctl() implementation and usage in Linux.

<< Eighth Article

“Get me a laptop and tell me about the experiments on the x86-specific hardware interfacing conducted in yesterday’s Linux device drivers’ laboratory session, and also about what’s planned for the next session”, cried Shweta, exasperated at being confined to bed and not being able to attend the classes. “Calm down!!! Don’t worry about that. We’ll help you make up for your classes. But first tell us what happened to you, so suddenly”, asked one of her friends, who had come to visit her in the hospital. “It’s all the fault of those chaats, I had in Rohan’s birthday party. I had such a painful food poisoning that led me here”, blamed Shweta. “How are you feeling now?”, asked Rohan sheepishly. “I’ll be all fine – just tell me all about the fun with hardware, you guys had. I had been waiting to attend that session and all this had to happen, right then”.

Rohan sat down besides Shweta and summarized the session to her, hoping to soothe her. That excited her more and she starting forcing them to tell her about the upcoming sessions, as well. They knew that those would be to do something with hardware, but were unaware of the details. Meanwhile, the doctor comes in and requests everybody to wait outside. That was an opportunity to plan and prepare. And they decided to talk about the most common hardware controlling operation: the ioctl(). Here is how it went.

Introducing an ioctl()

Input-output control (ioctl, in short) is a common operation or system call available with most of the driver categories. It is a “one bill fits all” kind of system call. If there is no other system call, which meets the requirement, then definitely ioctl() is the one to use. Practical examples include volume control for an audio device, display configuration for a video device, reading device registers, … – basically anything to do with any device input / output, or for that matter any device specific operations. In fact, it is even more versatile – need not be tied to any device specific things but any kind of operation. An example includes debugging a driver, say by querying of driver data structures.

Question is – how could all these variety be achieved by a single function prototype. The trick is using its two key parameters: the command and the command’s argument. The command is just some number, representing some operation, defined as per the requirement. The argument is the corresponding parameter for the operation. And then the ioctl() function implementation does a “switch … case” over the command implementing the corresponding functionalities. The following had been its prototype in Linux kernel, for quite some time:

int ioctl(struct inode *i, struct file *f, unsigned int cmd, unsigned long arg);

Though, recently from kernel 2.6.35, it has changed to the following:

long ioctl(struct file *f, unsigned int cmd, unsigned long arg);

If there is a need for more arguments, all of them are put in a structure and a pointer to the structure becomes the ‘one’ command argument. Whether integer or pointer, the argument is taken up as a long integer in kernel space and accordingly type cast and processed.

ioctl() is typically implemented as part of the corresponding driver and then an appropriate function pointer initialized with it, exactly as with other system calls open(), read(), … For example, in character drivers, it is the ioctl or unlocked_ioctl (since kernel 2.6.35) function pointer field in the struct file_operations, which is to be initialized.

Again like other system calls, it can be equivalently invoked from the user space using the ioctl() system call, prototyped in <sys/ioctl.h> as:

int ioctl(int fd, int cmd, ...);

Here, cmd is the same as implemented in the driver’s ioctl() and the variable argument construct (…) is a hack to be able to pass any type of argument (though only one) to the driver’s ioctl(). Other parameters will be ignored.

Note that both the command and command argument type definitions need to be shared across the driver (in kernel space) and the application (in user space). So, these definitions are commonly put into header files for each space.

Querying the driver internal variables

To better understand the boring theory explained above, here’s the code set for the “debugging a driver” example mentioned above. This driver has 3 static global variables status, dignity, ego, which need to be queried and possibly operated from an application. query_ioctl.h defines the corresponding commands and command argument type. Listing follows:

#ifndef QUERY_IOCTL_H

#define QUERY_IOCTL_H

#include <linux/ioctl.h>

typedef struct
{
	int status, dignity, ego;
} query_arg_t;

#define QUERY_GET_VARIABLES _IOR('q', 1, query_arg_t *)
#define QUERY_CLR_VARIABLES _IO('q', 2)

#endif

Using these, the driver’s ioctl() implementation in query_ioctl.c would be:

static int status = 1, dignity = 3, ego = 5;

#if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35))
static int my_ioctl(struct inode *i, struct file *f, unsigned int cmd,
	unsigned long arg)
#else
static long my_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
#endif
{
	query_arg_t q;

	switch (cmd)
	{
		case QUERY_GET_VARIABLES:
			q.status = status;
			q.dignity = dignity;
			q.ego = ego;
			if (copy_to_user((query_arg_t *)arg, &q,
				sizeof(query_arg_t)))
			{
				return -EACCES;
			}
			break;
		case QUERY_CLR_VARIABLES:
			status = 0;
			dignity = 0;
			ego = 0;
			break;
		default:
			return -EINVAL;
	}

	return 0;
}

And finally the corresponding invocation functions from the application query_app.c would be as follows:

#include <stdio.h>
#include <sys/ioctl.h>

#include "query_ioctl.h"

void get_vars(int fd)
{
	query_arg_t q;

	if (ioctl(fd, QUERY_GET_VARIABLES, &q) == -1)
	{
		perror("query_apps ioctl get");
	}
	else
	{
		printf("Status : %d\n", q.status);
		printf("Dignity: %d\n", q.dignity);
		printf("Ego	: %d\n", q.ego);
	}
}
void clr_vars(int fd)
{
	if (ioctl(fd, QUERY_CLR_VARIABLES) == -1)
	{
		perror("query_apps ioctl clr");
	}
}

Complete code of the above mentioned three files is included in the folder QueryIoctl, where the required Makefile is also present. You may download its tar-bzipped file as query_ioctl_code.tar.bz2, untar it and then, do the following to try out:

  • Build the ‘query_ioctl’ driver (query_ioctl.ko file) and the application (query_app file) by running make using the provided Makefile.
  • Load the driver using insmod query_ioctl.ko.
  • With appropriate privileges and command-line arguments, run the application query_app:
    • ./query_app # To display the driver variables
    • ./query_app -c # To clear the driver variables
    • ./query_app -g # To display the driver variables
    • ./query_app -s # To set the driver variables (Not mentioned above)
  • Unload the driver using rmmod query_ioctl.

Defining the ioctl() commands

“Visiting time is over”, came calling the security guard. And all of Shweta’s visitors packed up to leave. Stopping them, Shweta said, “Hey!! Thanks a lot for all this help. I could understand most of this code, including the need for copy_to_user(), as we have learnt earlier. But just a question, what are these _IOR, _IO, etc used in defining the commands in query_ioctl.h. You said we could just use numbers for the same. But you are using all these weird things”. Actually, they are usual numbers only. Just that, now additionally, some useful command related information is also encoded as part of these numbers using these various macros, as per the Portable Operating System Interface (POSIX) standard for ioctl. The standard talks about the 32-bit command numbers being formed of four components embedded into the [31:0] bits:

  1. Direction of command operation [bits 31:30] – read, write, both, or none – filled by the corresponding macro (_IOR, _IOW, _IOWR, _IO)
  2. Size of the command argument [bits 29:16] – computed using sizeof() with the command argument’s type – the third argument to these macros
  3. 8-bit magic number [bits 15:8] – to render the commands unique enough – typically an ASCII character (the first argument to these macros)
  4. Original command number [bits 7:0] – the actual command number (1, 2, 3, …), defined as per our requirement – the second argument to these macros

“Check out the header <asm-generic/ioctl.h> for implementation details”, concluded Rohan while hurrying out of the room with a sigh of relief.

Tenth Article >>

Notes:

  1. The intention behind the POSIX standard of encoding the command is to be able to verify the parameters, direction, etc related to the command, even before it is passed to the driver, say by VFS. It is just that Linux has not yet implemented the verification part.
   Send article as PDF   

Decoding the character device file operations

This sixth article, which is part of the series on Linux device drivers, is continuation of the various concepts of character drivers and their implementation, dealt with in the previous two articles.

<< Fifth Article

So, what was your guess on how would Shweta crack the nut? Obviously, using the nut cracker named Pugs. Wasn’t it obvious? <Smile> In our previous article, we saw how Shweta was puzzled with reading no data, even after writing into the /dev/mynull character device file. Suddenly, a bell rang – not inside her head, a real one at the door. And for sure, there was the avatar of Pugs.

“How come you’re here?”, exclaimed Shweta. “After reading your tweet, what else? Cool that you cracked your first character driver all on your own. That’s amazing. So, what are you up to now?”, said Pugs. “I’ll tell you on the condition that you do not become a spoil sport”, replied Shweta. “Okay yaar, I’ll only give you pointers”. “And that also, only if I ask for”. “Okie”. “I am trying to decode the working of character device file operations”. “I have an idea. Why don’t you decode and explain me your understanding?”. “Not a bad idea”. With that, Shweta tailed the dmesg log to observe the printk‘s output from her driver. Alongside, she opened her null driver code on her console, specifically observing the device file operations my_open, my_close, my_read, and my_write.

static int my_open(struct inode *i, struct file *f)
{
	printk(KERN_INFO "Driver: open()\n");
	return 0;
}
static int my_close(struct inode *i, struct file *f)
{
	printk(KERN_INFO "Driver: close()\n");
	return 0;
}
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: read()\n");
	return 0;
}
static ssize_t my_write(
		struct file *f, const char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: write()\n");
	return len;
}

Based on the earlier understanding of return value of the functions in kernel, my_open() and my_close() are trivial. Their return types being int and both of them returning zero, meaning success. However, the return types of both my_read() and my_write() are not int, but ssize_t. On further digging through kernel headers, that turns out to be signed word. So, returning a negative number would be a usual error. But a non-negative return value would have an additional meaning. For read it would be number of bytes read, and for write it would be number of bytes written.

Reading the device file

For understanding this in detail, the complete flow has to be re-looked at. Let’s take read first. So, when the user does a read onto the device file /dev/mynull, that system call comes to the virtual file system (VFS) layer in the kernel. VFS decodes the <major, minor> tuple & figures out that it need to redirect it to the driver’s function my_read(), registered with it. So from that angle, my_read() is invoked as a request to read, from us – the device driver writers. And hence, its return value would indicate to the requester – the user, as to how many bytes is he getting from the read request. In our null driver example, we returned zero – meaning no bytes available or in other words end of file. And hence, when the device file is being read, the result is always nothing, independent of what is written into it.

“Hmmm!!! So, if I change it to 1, would it start giving me some data?”, Pugs asked in his verifying style. Shweta paused for a while – looked at the parameters of the function my_read() and confirmed with a but – data would be sent but it would be some junk data, as the my_read() function is not really populating the data into the buf (second parameter of my_read()), provided by the user. In fact, my_read() should write data into buf, according to len (third parameter of my_read()), the count in bytes requested by the user.

To be more specific, write less than or equal to len bytes of data into buf, and the same number be used as the return value. It is not a typo – in read, we ‘write’ into buf – that’s correct. We read the data from (possibly) an underlying device and then write that data into the user buffer, so that the user gets it, i.e. reads it. “That’s really smart of you”, expressed Pugs with sarcasm.

Writing into the device file

Similarly, the write is just the reverse procedure. User provides len (third parameter of my_write()) bytes of data to be written, into buf (second parameter of my_write()). my_write() would read that data and possibly write into an underlying device, and accordingly return the number of bytes, it has been able to write successfully. “Aha!! That’s why all my writes into /dev/mynull have been successful, without being actually doing any read or write”, exclaimed Shweta filled with happiness of understanding the complete flow of device file operations.

Preserving the last character

That was enough – Shweta not giving any chance to Pugs to add, correct or even speak. So, Pugs came up with a challenge. “Okay. Seems like you are thoroughly clear with the read/write funda. Then, here’s a question for you. Can you modify these my_read() and my_write() functions such that whenever I read /dev/mynull, I get the last character written into /dev/mynull?”

Confident enough, Shweta took the challenge and modified the my_read() and my_write() functions as follows, along with an addition of a static global character:

static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: read()\n");
	buf[0] = c;
	return 1;
}
static ssize_t my_write(
		struct file *f, const char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: write()\n");
	c = buf[len – 1];
	return len;
}

“Almost there, but what if the user has provided an invalid buffer, or what if the user buffer is swapped out. Wouldn’t this direct access of user space buf just crash and oops the kernel”, pounced Pugs. Shweta not giving up the challenge, dives into her collated material and figures out that there are two APIs just to ensure that the user space buffers are safe to access and then update them, as well. With the complete understanding of the APIs, she re-wrote the above code snippet along with including the corresponding header <asm/uaccess.h>, as follows, leaving no chance for Pugs to comment:

#include <asm/uaccess.h>

static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: read()\n");
	if (copy_to_user(buf, &c, 1) != 0)
		return -EFAULT;
	else
		return 1;
}
static ssize_t my_write(
		struct file *f, const char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: write()\n");
	if (copy_from_user(&c, buf + len – 1, 1) != 0)
		return -EFAULT;
	else
		return len;
}

Then, Shweta repeated the usual build and test steps as follows:

  • Build the modified null driver (.ko file) by running make.
  • Load the driver using insmod.
  • Write into /dev/mynull, say using echo -n “Pugs” > /dev/mynull
  • Read from /dev/mynull using cat /dev/mynull (Stop using Ctrl+C)
  • Unload the driver using rmmod.

Summing up

On cat‘ing /dev/mynull, the output was a non-stop infinite sequence of ‘s’, as my_read() gives the last one character forever. So, Pugs intervenes and presses Ctrl+C to stop the infinite read, and tries to explain, “If this is to be changed to ‘the last character only once’, my_read() needs to return 1 the first time and zero from second time onwards. This can be achieved using the off (fourth parameter of my_read())”. Shweta nods her head to support Pugs’ ego.

Seventh Article >>

Add-on

And here’s the modified read using the off:

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: read()\n");
	if (*off == 0)
	{
		if (copy_to_user(buf, &c, 1) != 0)
			return -EFAULT;
		else
		{
			(*off)++;
			return 1;
		}
	}
	else
		return 0;
}
   Send article as PDF