Tag Archives: Linux

I/O Control in Linux

This ninth article, which is part of the series on Linux device drivers, talks about the typical ioctl() implementation and usage in Linux.

<< Eighth Article

“Get me a laptop and tell me about the experiments on x86-specific hardware interfacing conducted in yesterday’s Linux device drivers laboratory session, and also about what’s planned for the next session”, cried Shweta, exasperated at being confined to bed and unable to attend classes. “Calm down! Don’t worry about that. We’ll help you make up for your classes. But first tell us what happened to you so suddenly”, said one of her friends, who had come to visit her in the hospital. “It’s all the fault of those chaats I had at Rohan’s birthday party. I got such painful food poisoning that it landed me here”, blamed Shweta. “How are you feeling now?”, asked Rohan sheepishly. “I’ll be fine. Just tell me all about the fun you guys had with the hardware. I had been waiting to attend that session, and all this had to happen right then”.

Rohan sat down beside Shweta and summarized the session for her, hoping to soothe her. That excited her even more, and she started pressing them to tell her about the upcoming sessions as well. They knew that those would have something to do with hardware, but were unaware of the details. Meanwhile, the doctor came in and requested everybody to wait outside. That was an opportunity to plan and prepare. And they decided to talk about the most common hardware-controlling operation: ioctl(). Here is how it went.

Introducing an ioctl()

Input-output control (ioctl, in short) is a common operation or system call available with most driver categories. It is a “one size fits all” kind of system call. If no other system call meets the requirement, then ioctl() is the one to use. Practical examples include volume control for an audio device, display configuration for a video device, reading device registers, and so on: basically anything to do with device input/output, or for that matter any device-specific operation. In fact, it is even more versatile. It need not be tied to anything device-specific and can carry any kind of operation. One example is debugging a driver, say by querying the driver’s data structures.

The question is: how can all this variety be achieved by a single function prototype? The trick is in its two key parameters: the command and the command’s argument. The command is just a number representing some operation, defined as per the requirement. The argument is the corresponding parameter for the operation. The ioctl() function implementation then does a “switch … case” over the command, implementing the corresponding functionality. The following had been its prototype in the Linux kernel for quite some time:

int ioctl(struct inode *i, struct file *f, unsigned int cmd, unsigned long arg);

However, from kernel 2.6.35 onwards, it has changed to the following:

long ioctl(struct file *f, unsigned int cmd, unsigned long arg);

If there is a need for more arguments, all of them are put in a structure and a pointer to the structure becomes the ‘one’ command argument. Whether integer or pointer, the argument is taken up as a long integer in kernel space and accordingly type cast and processed.

ioctl() is typically implemented as part of the corresponding driver, and an appropriate function pointer is then initialized with it, exactly as with the other system calls open(), read(), and so on. For example, in character drivers, it is the ioctl or unlocked_ioctl (since kernel 2.6.35) function pointer field in struct file_operations that is to be initialized.

Again like other system calls, it can be equivalently invoked from the user space using the ioctl() system call, prototyped in <sys/ioctl.h> as:

int ioctl(int fd, int cmd, ...);

Here, cmd is the same as implemented in the driver’s ioctl() and the variable argument construct (…) is a hack to be able to pass any type of argument (though only one) to the driver’s ioctl(). Other parameters will be ignored.

Note that both the command and command argument type definitions need to be shared across the driver (in kernel space) and the application (in user space). So, these definitions are commonly put into header files for each space.

Querying the driver internal variables

To better understand the boring theory explained above, here’s the code set for the “debugging a driver” example mentioned above. This driver has 3 static global variables status, dignity, ego, which need to be queried and possibly operated from an application. query_ioctl.h defines the corresponding commands and command argument type. Listing follows:

#ifndef QUERY_IOCTL_H
#define QUERY_IOCTL_H

#include <linux/ioctl.h>

typedef struct
{
	int status, dignity, ego;
} query_arg_t;

/* The third macro argument is the argument's type; its sizeof() gets encoded */
#define QUERY_GET_VARIABLES _IOR('q', 1, query_arg_t)
#define QUERY_CLR_VARIABLES _IO('q', 2)

#endif

Using these, the driver’s ioctl() implementation in query_ioctl.c would be:

static int status = 1, dignity = 3, ego = 5;

#if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35))
static int my_ioctl(struct inode *i, struct file *f, unsigned int cmd,
	unsigned long arg)
#else
static long my_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
#endif
{
	query_arg_t q;

	switch (cmd)
	{
		case QUERY_GET_VARIABLES:
			q.status = status;
			q.dignity = dignity;
			q.ego = ego;
			if (copy_to_user((query_arg_t __user *)arg, &q,
				sizeof(query_arg_t)))
			{
				return -EFAULT; /* Bad user-space address */
			}
			break;
		case QUERY_CLR_VARIABLES:
			status = 0;
			dignity = 0;
			ego = 0;
			break;
		default:
			return -EINVAL;
	}

	return 0;
}

And finally the corresponding invocation functions from the application query_app.c would be as follows:

#include <stdio.h>
#include <sys/ioctl.h>

#include "query_ioctl.h"

void get_vars(int fd)
{
	query_arg_t q;

	if (ioctl(fd, QUERY_GET_VARIABLES, &q) == -1)
	{
		perror("query_apps ioctl get");
	}
	else
	{
		printf("Status : %d\n", q.status);
		printf("Dignity: %d\n", q.dignity);
		printf("Ego	: %d\n", q.ego);
	}
}

void clr_vars(int fd)
{
	if (ioctl(fd, QUERY_CLR_VARIABLES) == -1)
	{
		perror("query_apps ioctl clr");
	}
}

Complete code of the above mentioned three files is included in the folder QueryIoctl, where the required Makefile is also present. You may download its tar-bzipped file as query_ioctl_code.tar.bz2, untar it and then, do the following to try out:

  • Build the ‘query_ioctl’ driver (query_ioctl.ko file) and the application (query_app file) by running make using the provided Makefile.
  • Load the driver using insmod query_ioctl.ko.
  • With appropriate privileges and command-line arguments, run the application query_app:
    • ./query_app # To display the driver variables
    • ./query_app -c # To clear the driver variables
    • ./query_app -g # To display the driver variables
    • ./query_app -s # To set the driver variables (Not mentioned above)
  • Unload the driver using rmmod query_ioctl.
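The -s option in the list above is not covered by the listings shown earlier. As a hedged illustration only, here is how a matching user-space helper could look. The command name QUERY_SET_VARIABLES, its number 3 and the helper set_vars() are assumptions made for this sketch, not taken from the listings above, and the driver would need a corresponding case that uses copy_from_user().

```c
#include <stdio.h>
#include <sys/ioctl.h>

typedef struct
{
	int status, dignity, ego;
} query_arg_t; /* Same layout as in query_ioctl.h */

/* Assumed definition: the name and the number 3 are guesses for this sketch */
#define QUERY_SET_VARIABLES _IOW('q', 3, query_arg_t)

/* Sends three new values to the driver; returns 0 on success, -1 on failure */
int set_vars(int fd, int status, int dignity, int ego)
{
	query_arg_t q = { status, dignity, ego };

	if (ioctl(fd, QUERY_SET_VARIABLES, &q) == -1)
	{
		perror("query_app ioctl set");
		return -1;
	}
	return 0;
}
```

_IOW is used here because, seen from the application, the data is written to the driver, in contrast to the _IOR used for reading the variables back.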

Defining the ioctl() commands

“Visiting time is over”, called out the security guard. And all of Shweta’s visitors packed up to leave. Stopping them, Shweta said, “Hey! Thanks a lot for all this help. I could understand most of this code, including the need for copy_to_user(), as we have learnt earlier. But just one question: what are these _IOR, _IO, etc. used in defining the commands in query_ioctl.h? You said we could just use numbers for the same. But you are using all these weird things”. Actually, they are just ordinary numbers. It is only that some useful command-related information is additionally encoded into these numbers using these macros, as per the Portable Operating System Interface (POSIX) standard for ioctl. The standard talks about the 32-bit command numbers being formed of four components embedded into bits [31:0]:

  1. Direction of command operation [bits 31:30] – read, write, both, or none – filled by the corresponding macro (_IOR, _IOW, _IOWR, _IO)
  2. Size of the command argument [bits 29:16] – computed using sizeof() with the command argument’s type – the third argument to these macros
  3. 8-bit magic number [bits 15:8] – to render the commands unique enough – typically an ASCII character (the first argument to these macros)
  4. Original command number [bits 7:0] – the actual command number (1, 2, 3, …), defined as per our requirement – the second argument to these macros

“Check out the header <asm-generic/ioctl.h> for implementation details”, concluded Rohan while hurrying out of the room with a sigh of relief.
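The four fields listed above can be pulled back out of a command number with the decoding macros that <linux/ioctl.h> provides alongside _IOR and friends. The sketch below is a user-space illustration: DEMO_GET_VARIABLES and print_cmd_fields() are names made up here, and the macros are used instead of hand-written shifts because the exact bit positions can differ on some architectures.

```c
#include <stdio.h>
#include <linux/ioctl.h>

typedef struct
{
	int status, dignity, ego;
} query_arg_t;

/* Illustrative command; the third macro argument is the argument's type */
#define DEMO_GET_VARIABLES _IOR('q', 1, query_arg_t)

/* Prints the four fields packed into an ioctl command number */
void print_cmd_fields(unsigned int cmd)
{
	printf("direction: %u\n", (unsigned int)_IOC_DIR(cmd));
	printf("size     : %u\n", (unsigned int)_IOC_SIZE(cmd));
	printf("magic    : '%c'\n", (char)_IOC_TYPE(cmd));
	printf("number   : %u\n", (unsigned int)_IOC_NR(cmd));
}
```

Calling print_cmd_fields(DEMO_GET_VARIABLES) shows the direction value for a read, the size of query_arg_t, the magic character q and the command number 1.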

Tenth Article >>

Notes:

  1. The intention behind the POSIX standard of encoding the command is to be able to verify the parameters, direction, etc related to the command, even before it is passed to the driver, say by VFS. It is just that Linux has not yet implemented the verification part.

Get Set with Polynomials in Octave

This ninth article of the mathematical journey through open source, deals with polynomial mathematics in octave.

<< Eighth Article

Let’s first solve the earlier puzzles. And then we shall discuss the polynomial power of octave.

Number Puzzle

Find three numbers, product of which is 60; sum of their squares is 50; and their sum is 12. Let the X vector elements X(1), X(2), X(3) be the three numbers. Then, here goes the solution:

$ octave -qf
octave:1> function Y = F(X)
> Y(1) = X(1) * X(2) * X(3) - 60; 
> Y(2) = X(1)^2 + X(2)^2 + X(3)^2 - 50; 
> Y(3) = X(1) + X(2) + X(3) - 12; 
> endfunction
octave:2> [Y, Fval, info] = fsolve(@F, [3; 3; 3]) 
warning: matrix singular to machine precision, rcond = 4.32582e-35
warning: attempting to find minimum norm solution
warning: dgelsd: rank deficient 3x3 matrix, rank = 1 
Y =

   5.0000
   3.0000
   4.0000

Fval =

  -3.2345e-07   1.0351e-07   0.0000e+00

info = 1 
octave:3>

So, the 3 numbers are 5, 3, 4.

Flower Puzzle

A sage came to a temple with some flowers and dipped them into the first pond of the temple to get them squared. Then, he offered some flowers in the temple and dipped the remaining flowers into the second pond to get them doubled. Then, he again offered the same number of flowers as earlier, and dipped the remaining flowers into the third pond to get them tripled and took them back with him as prasadam, which was the same number as in each of his offerings. Now, if he took back thrice the number of flowers he brought, how many did he bring in with him?

Let the x vector elements x(1) and x(2) be, respectively, the number of flowers the sage came with and the number of flowers he offered each time. So, here goes the solution:

octave:1> function y = f(x)
> y(1) = ((x(1) * x(1) - x(2)) * 2 - x(2)) * 3 - x(2);
> y(2) = x(2) - 3 * x(1);
> endfunction 
octave:2> [x fval info] = fsolve(@f, [10; 10])
x =

    5.0000
   15.0000

fval =

  -2.8791e-06  -1.7764e-15

info =  1
octave:3>

So, the number of flowers the sage came with is 5 and each of his offerings is of 15 flowers.

Note that in all these solutions, the trick is to choose an initial guess close to the actual solution, through some approximation work. At times, that might be tricky. So, in case we just have polynomial equations, and those too in one variable, they can be solved in an easier way using the polynomial features of octave. In contrast to the earlier method, here we also get all of the multiple solutions of the polynomial.

Playing with Polynomials

Let’s consider the polynomial equation 2x^3 + 3x^2 + 2x + 1 = 0. Then its octave representation and the computation of its solutions, aka roots, would be as follows:

octave:1> P = [2; 3; 2; 1];
octave:2> roots(P)
ans =

  -1.00000 + 0.00000i
  -0.25000 + 0.66144i
  -0.25000 - 0.66144i

octave:3>

So, it being a cubic equation, it has three roots as expected. First one is the real number -1, and the other two are complex conjugates (-1 + sqrt(-7))/4 & (-1 – sqrt(-7))/4. And you may verify the solutions using the function polyval() as follows:

octave:1> P = [2; 3; 2; 1];
octave:2> sols = [-1; (-1 + sqrt(-7)) / 4; (-1 - sqrt(-7)) / 4]
sols =

  -1.00000 + 0.00000i
  -0.25000 + 0.66144i
  -0.25000 - 0.66144i

octave:3> polyval(P, sols)
ans =

   0
   0
   0

octave:4>

This shows that the value of the polynomial P evaluated at each of the 3 solutions is 0. Hence, confirming that they indeed are the solutions.
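For readers curious about what such an evaluation involves outside octave, here is a small C sketch that evaluates the same polynomial at a complex point using Horner's rule, one standard way of evaluating polynomials. The type cplx and the functions cmul() and poly_eval() are names invented for this sketch, and the coefficients are passed highest power first, as in the vector P above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct
{
	double re, im;
} cplx;

/* Complex product a * b */
static cplx cmul(cplx a, cplx b)
{
	cplx p = { a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re };
	return p;
}

/* Evaluates coef[0]*x^(n-1) + ... + coef[n-1] at x, using Horner's rule */
cplx poly_eval(const double *coef, size_t n, cplx x)
{
	cplx acc = { 0.0, 0.0 };
	size_t i;

	assert(coef != NULL && n > 0);
	for (i = 0; i < n; i++)
	{
		acc = cmul(acc, x); /* Shift the partial result up by one power */
		acc.re += coef[i];  /* Add in the next (real) coefficient */
	}
	return acc;
}
```

With the coefficients {2, 3, 2, 1}, evaluating at -1 and at -0.25 ± 0.66144i should give results whose real and imaginary parts are zero up to floating-point rounding.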

All set with polynomial basics in octave, let’s solve some puzzles.

Geometry Solving

Last time we found an intersection point of a straight line and a circle. Yes, we just calculated one point – though typically there would be two. It would be one only in case of the straight line being tangent or just touching the circle. And yes it would be zero, if the straight line is not even intersecting it. So now, let’s try these different cases, with the one variable polynomial power.

Let us have the following circle C with radius 5 and centered at the origin (0, 0), defined in the Cartesian coordinate system, i.e. the x-y system: x^2 + y^2 = 25

And, let us consider the following 3 lines for intersection with the above circle, one by one:

  • L1: 4x + 3y = 24
  • L2: x + y = 5√2
  • L3: 6x + y = 36

To be able to solve for the intersection points of each of these 3 lines with the circle C using roots, the first step is to get polynomials in one variable. For that, we can substitute the value of y in the equation of the circle, in terms of x from each of the line equations, as follows:
For L1
x^2 + y^2 = 25 ⇒ 9x^2 + 9y^2 = 9*25 ⇒ 9x^2 + (24 - 4x)^2 = 225 ⇒ 25x^2 - 192x + 351 = 0
For L2
x^2 + y^2 = 25 ⇒ x^2 + (5√2 - x)^2 = 25 ⇒ 2x^2 - (10√2)x + 25 = 0
For L3
x^2 + y^2 = 25 ⇒ x^2 + (36 - 6x)^2 = 25 ⇒ 37x^2 - 432x + 1271 = 0

Now, we get the roots of each to get the x co-ordinate of the intersection point.

octave:1> C1 = [25; -192; 351];
octave:2> C2 = [2; -10*sqrt(2); 25];
octave:3> C3 = [37; -432; 1271];
octave:4> roots(C1)
ans =

   4.6800
   3.0000

octave:5> roots(C2)
ans =

   3.5355
   3.5355

octave:6> roots(C3)
ans =

   5.8378 + 0.5206i
   5.8378 - 0.5206i

octave:7>

And the corresponding y co-ordinate could be obtained by substituting the value of x into the corresponding line equations.

For L1, there are 2 different roots 4.68 and 3, implying two intersecting points (4.68, 1.76) and (3, 4).
For L2, there are 2 identical roots of 3.5355 i.e 5/√2, implying just one intersecting point (5/√2, 5/√2).
For L3, the roots are complex, implying that there is no intersecting point in the real world.
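The three outcomes above (two, one, or no real intersection points) follow from the sign of the discriminant b^2 - 4ac of each quadratic. As a small cross-check outside octave, the following C sketch classifies a quadratic that way. The function count_intersections() and its tolerance eps are assumptions of this sketch; the tolerance is an arbitrary choice to absorb rounding in values such as 10√2.

```c
#include <assert.h>

/* Number of distinct real roots of a*x^2 + b*x + c = 0 */
int count_intersections(double a, double b, double c)
{
	const double eps = 1e-9; /* Rounding tolerance, an arbitrary choice */
	double disc = b * b - 4.0 * a * c;

	assert(a != 0.0);
	if (disc > eps)
	{
		return 2; /* Line cuts the circle at two points */
	}
	if (disc < -eps)
	{
		return 0; /* Complex roots: no real intersection */
	}
	return 1; /* Repeated root: the line is a tangent */
}
```

For C1 the discriminant is 192^2 - 4*25*351 = 1764, for C2 it is 200 - 200 = 0, and for C3 it is 432^2 - 4*37*1271 = -1484, matching the three cases.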

Solve it

And finally, here’s one for your brain. Find out the two square roots and the three cube roots of the imaginary number i.

If you think you have got the octave code for solving the above, post your solution in the comments below. And as we move on, we will have more fun with polynomials.

Tenth Article >>


Accessing x86-specific I/O mapped hardware in Linux

This eighth article, which is part of the series on Linux device drivers, continues on talking about accessing hardware in Linux.

<< Seventh Article

The second day in the Linux device drivers laboratory was expected to be quite different from the typical software-oriented classes. Apart from accessing and programming the architecture-specific I/O mapped hardware in x86, it had a lot to offer first-timers in reading hardware device manuals (commonly referred to as data-sheets) and understanding them well enough to write device drivers.

Contrast this with the previous laboratory session, which taught about the generic architecture-transparent hardware interfacing. It was all about mapping and accessing memory-mapped devices in Linux, without any device specific detail.

x86-specific hardware interfacing

Unlike most other architectures, x86 has an additional hardware access mechanism through direct I/O mapping. It is a direct 16-bit addressing scheme and does not need a mapping to virtual addresses for access. These addresses are referred to as port addresses, or ports in short. As this is an additional access mechanism, it calls for an additional set of x86 (assembly/machine code) instructions. There are the input instructions inb, inw, and inl for reading an 8-bit byte, a 16-bit word, and a 32-bit long word, respectively, from I/O mapped devices through the ports. The corresponding output instructions are outb, outw, and outl. The equivalent C functions/macros are as follows (available through the header <asm/io.h>):

u8 inb(unsigned long port);
u16 inw(unsigned long port);
u32 inl(unsigned long port);
void outb(u8 value, unsigned long port);
void outw(u16 value, unsigned long port);
void outl(u32 value, unsigned long port);

The basic question may arise as to which devices are I/O mapped and what the port addresses of these devices are. The answer is pretty simple. Being x86-specific, all these devices & their mappings are x86 standard and hence predefined. Figure 13 shows a snippet of these mappings through the kernel window /proc/ioports. The listing includes predefined DMA, timer, RTC, serial, parallel, and PCI bus interfaces, to name a few.

Figure 13: x86-specific I/O ports


Simplest: the serial port on the x86 platform

For example, the first serial port is always I/O mapped from 0x3F8 to 0x3FF. But what does this mapping mean? What do we do with this? How does it help us to use the serial port?
That is where the data-sheet of the device controlling the corresponding port needs to be looked up. The serial port is controlled by the serial controller device, commonly known as a UART (Universal Asynchronous Receiver/Transmitter) or at times a USART (Universal Synchronous/Asynchronous Receiver/Transmitter). On PCs, the typical UART used is the PC16550D. Its data-sheet (uart_pc16550d.pdf) has also been included in the self-extracting LDDK-Package.sh, used for the Linux device driver kit. Figure 14 shows the relevant portion of it.

In general, from where & how do we get these device data-sheets? Typically, an on-line search with the corresponding device number should yield their data-sheet links. And how does one get the device number? Simple, by having a look at the device. If it is inside a desktop, open it up and check it out. Yes, this is the least you may have to do to get going with the hardware for writing device drivers. Assuming all this hacking has been done, it is time to peep into the data-sheet of UART PC16550D.

For a device driver writer, the usual sections of interest in a data-sheet are the ones related to the registers of the device. Why? Because it is these registers that a device driver writer needs to read from and/or write to in order to finally use the device. Page 14 of the data-sheet (also shown in Figure 14) shows the complete table of all the twelve 8-bit registers present in the UART PC16550D. Each of the 8 rows corresponds to the respective bit of the registers. Also, note that the register addresses start from 0 and go up to 7. The interesting thing to note is that a data-sheet typically gives the register offsets, which then need to be added to the base address of the device to get the actual register addresses. Who decides the base address, and where is it obtained from? Base addresses are typically board/platform specific, unless they are dynamically configurable as in the case of PCI devices. In the case here, i.e. the serial device on x86, it is dictated by the x86 architecture, and that is precisely the starting serial port address mentioned above: 0x3F8. The eight register offsets 0 to 7 map exactly to the eight port addresses 0x3F8 to 0x3FF. So, these are the actual addresses to be read or written for reading or writing the corresponding serial registers, to achieve the desired serial operations as per the register descriptions.

Figure 14: Registers of UART PC16550D


All the serial register offsets and the register bit masks are defined in the header <linux/serial_reg.h>. So, rather than hard coding these values from the data-sheet, the corresponding macros could be used instead. All the following code uses these macros along with the following:

#define SERIAL_PORT_BASE 0x3F8

Operating on the device registers

To summarize all this decoding of the UART PC16550D data-sheet, here are a few examples of how to read and write the serial registers and their bits.

Reading and writing the “Line Control Register (LCR)”:

u8 val;

val = inb(SERIAL_PORT_BASE + UART_LCR /* 3 */);
outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

Setting and clearing the “Divisor Latch Access Bit (DLAB)” in LCR:

u8 val;

val = inb(SERIAL_PORT_BASE + UART_LCR /* 3 */);

/* Setting DLAB */
val |= UART_LCR_DLAB /* 0x80 */;
outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

/* Clearing DLAB */
val &= ~UART_LCR_DLAB /* 0x80 */;
outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

Reading and writing the “Divisor Latch”:

u8 dlab;
u16 val;

dlab = inb(SERIAL_PORT_BASE + UART_LCR);
dlab |= UART_LCR_DLAB; // Setting DLAB to access Divisor Latch
outb(dlab, SERIAL_PORT_BASE + UART_LCR);

val = inw(SERIAL_PORT_BASE + UART_DLL /* 0 */);
outw(val, SERIAL_PORT_BASE + UART_DLL /* 0 */);
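The read-modify-write pattern used in the snippets above can be tried without touching real hardware. The following user-space sketch replaces the ports with an 8-byte array. The functions sim_inb(), sim_outb(), set_dlab() and clear_dlab() and the DEMO_ constants are hypothetical stand-ins defined here, using the offset 3 and mask 0x80 shown in the comments above; they are not the kernel's inb()/outb().

```c
#include <assert.h>
#include <stdint.h>

#define SERIAL_PORT_BASE 0x3F8
#define DEMO_LCR_OFFSET 3   /* LCR offset, as in the comments above */
#define DEMO_LCR_DLAB 0x80  /* DLAB mask, as in the comments above */

static uint8_t sim_regs[8]; /* Pretend registers at offsets 0 to 7 */

/* Reads a simulated register: actual address = base + offset */
uint8_t sim_inb(unsigned long port)
{
	assert(port >= SERIAL_PORT_BASE && port < SERIAL_PORT_BASE + 8);
	return sim_regs[port - SERIAL_PORT_BASE];
}

/* Writes a simulated register */
void sim_outb(uint8_t value, unsigned long port)
{
	assert(port >= SERIAL_PORT_BASE && port < SERIAL_PORT_BASE + 8);
	sim_regs[port - SERIAL_PORT_BASE] = value;
}

/* Sets only the DLAB bit, leaving the other LCR bits untouched */
void set_dlab(void)
{
	uint8_t val = sim_inb(SERIAL_PORT_BASE + DEMO_LCR_OFFSET);

	val |= DEMO_LCR_DLAB;
	sim_outb(val, SERIAL_PORT_BASE + DEMO_LCR_OFFSET);
}

/* Clears only the DLAB bit, leaving the other LCR bits untouched */
void clear_dlab(void)
{
	uint8_t val = sim_inb(SERIAL_PORT_BASE + DEMO_LCR_OFFSET);

	val &= (uint8_t)~DEMO_LCR_DLAB;
	sim_outb(val, SERIAL_PORT_BASE + DEMO_LCR_OFFSET);
}
```

Starting from an LCR value of 0x03, set_dlab() leaves 0x83 at port 0x3FB and clear_dlab() restores 0x03, which shows why the register is read first rather than overwritten.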

Blinking an LED

To get a real experience of the low-level hardware access and Linux device drivers, the best way would be to play with the Linux device driver kit (LDDK). However, just for the feel of low-level hardware access, a blinking light emitting diode (LED) may be tried as follows:

  • Connect a light emitting diode (LED) with a 330 ohm resistor in series across the pin 3 (Tx) & pin 5 (Gnd) of the DB9 connector of your PC.
  • Pull up & down the transmit (Tx) line with a 500 ms delay, by loading the blink_led driver using insmod blink_led.ko, and then unloading the driver using rmmod blink_led, before reloading.

Below is the blink_led.c, to be compiled into the blink_led.ko driver, by running make using the usual driver Makefile:

#include <linux/module.h>
#include <linux/version.h>
#include <linux/types.h>
#include <linux/delay.h>
#include <asm/io.h>

#include <linux/serial_reg.h>

#define SERIAL_PORT_BASE 0x3F8

int __init init_module(void)
{
	int i;
	u8 data;

	data = inb(SERIAL_PORT_BASE + UART_LCR);
	for (i = 0; i < 5; i++)
	{
		/* Pulling the Tx line low */
		data |= UART_LCR_SBC;
		outb(data, SERIAL_PORT_BASE + UART_LCR);
		msleep(500);
		/* Defaulting the Tx line high */
		data &= ~UART_LCR_SBC;
		outb(data, SERIAL_PORT_BASE + UART_LCR);
		msleep(500);
	}
	return 0;
}

void __exit cleanup_module(void)
{
}

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
MODULE_DESCRIPTION("Blinking LED Hack");

Summing up

Are you wondering where Shweta has gone today? She has bunked all the classes. Watch out for the next article to find out why.

Ninth Article >>

Notes

  1. The above example demonstrates how bare-bones low-level access can get. However, to do it properly, one should use the APIs request_region() and release_region(), respectively before and after accessing the I/O port addresses, to acquire and release the range of I/O port addresses being accessed.
  2. Also, you might have observed that there is no module_init() & module_exit() in the above driver. Nonetheless, insmod & rmmod do work. How is that? That is because init_module() & cleanup_module() are the predefined names for the constructor & the destructor, respectively. Hence, you do not need module_init() & module_exit() to translate your other function names to these predefined ones. Caution: From kernel 2.6 onwards, if you are building the driver into the kernel, you should define your own function names & use module_init() & module_exit().

Generic Hardware Access in Linux

This seventh article, which is part of the series on Linux device drivers, talks about accessing hardware in Linux.

<< Sixth Article

Shweta was all jubilant about her character driver achievements as she entered the Linux device drivers laboratory on the second floor of her college. Why not? Many of her classmates had already read her blog & commented on her expertise. And today was a chance to show off at another level. Till now, it was all software. Today’s lab was on accessing hardware in Linux. Students were expected to “learn by experimentation” to access various kinds of hardware in Linux on various architectures over multiple lab sessions here.

As usual, the lab staff were a bit skeptical about letting the students directly get onto the hardware without any background. So, to build their background, they had prepared some slide presentations, which can be accessed from SysPlay’s website.

Generic hardware interfacing

As everyone settled in the laboratory, lab expert Priti started with the introduction to hardware interfacing in Linux. Skipping the theoretical details, the first interesting slide was about the generic architecture-transparent hardware interfacing. See Figure 11.

Figure 11: Hardware mapping


The basic assumption is that the architecture is 32-bit. For others, the memory map would change accordingly. For a 32-bit address bus, the address/memory map ranges from 0 (0x00000000) to 2^32 - 1 (0xFFFFFFFF). And an architecture-independent layout of this memory map would be as shown in Figure 11: memory (RAM) and device regions (registers & memories of devices) mapped in an interleaved fashion. The architecture-dependent thing is what these addresses actually are. For example, in an x86 architecture, the initial 3GB (0x00000000 to 0xBFFFFFFF) is typically for RAM and the later 1GB (0xC0000000 to 0xFFFFFFFF) for device maps. However, if the RAM is less, say 2GB, device maps could start from 2GB (0x80000000).

Type in cat /proc/iomem to list the memory map on your system. cat /proc/meminfo would give you an approximate RAM size on your system. Refer to Figure 12 for a snapshot.

Figure 12: Physical & bus addresses on an x86 system


Irrespective of the actual values, the addresses referring to RAM are termed as physical addresses. And the addresses referring to device maps are termed as bus addresses, as these devices are always mapped through some architecture-specific bus. For example, PCI bus in x86 architecture, AMBA bus in ARM architectures, SuperHyway bus in SuperH (or SH) architectures, GX bus on PowerPC (or PPC), etc.

All the architecture-dependent values of these physical and bus addresses are either dynamically configurable or are to be obtained from the datasheets (i.e. hardware manuals) of the corresponding architecture processors/controllers. But the interesting part is that in Linux, none of these are directly accessible; they are to be mapped to virtual addresses and then accessed through those. This makes the RAM and device accesses generic enough, except for the step of mapping them to virtual addresses. And the corresponding APIs for mapping & unmapping the device bus addresses to virtual addresses are:

#include <asm/io.h>

void *ioremap(unsigned long device_bus_address, unsigned long device_region_size);
void iounmap(void *virt_addr);

These are prototyped in <asm/io.h>. Once mapped to virtual addresses, it boils down to the device datasheet, as to which set of device registers and/or device memory to read from or write into, by adding their offsets to the virtual address returned by ioremap(). For that, the following are the APIs (prototyped in the same header file <asm/io.h>):

#include <asm/io.h>

unsigned int ioread8(void *virt_addr);
unsigned int ioread16(void *virt_addr);
unsigned int ioread32(void *virt_addr);
void iowrite8(u8 value, void *virt_addr);
void iowrite16(u16 value, void *virt_addr);
void iowrite32(u32 value, void *virt_addr);

Accessing the video RAM of “DOS” days

After this first set of information, students were directed for the live experiments. They were suggested to do an initial experiment with the video RAM of “DOS” days to understand the usage of the above APIs. Shweta got onto the system – displayed the /proc/iomem window – one very similar to as shown in Figure 12. From there, she got the video RAM address ranging from 0x000A0000 to 0x000BFFFF. And with that she added the above APIs with appropriate parameters into the constructor and destructor of her already written null driver to convert it into a vram driver. Then, she added the user access to the video RAM through read & write calls of the vram driver. Here’s what she coded in the new file video_ram.c:

#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <linux/cdev.h>
#include <linux/uaccess.h>
#include <asm/io.h>

#define VRAM_BASE 0x000A0000
#define VRAM_SIZE 0x00020000

static void __iomem *vram;
static dev_t first;
static struct cdev c_dev;
static struct class *cl;

static int my_open(struct inode *i, struct file *f)
{
	return 0;
}
static int my_close(struct inode *i, struct file *f)
{
	return 0;
}
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
	int i;
	u8 byte;

	if (*off >= VRAM_SIZE)
	{
		return 0;
	}
	if (*off + len > VRAM_SIZE)
	{
		len = VRAM_SIZE - *off;
	}
	for (i = 0; i < len; i++)
	{
		byte = ioread8((u8 *)vram + *off + i);
		if (copy_to_user(buf + i, &byte, 1))
		{
			return -EFAULT;
		}
	}
	*off += len;

	return len;
}
static ssize_t my_write(
		struct file *f, const char __user *buf, size_t len, loff_t *off)
{
	int i;
	u8 byte;

	if (*off >= VRAM_SIZE)
	{
		return 0;
	}
	if (*off + len > VRAM_SIZE)
	{
		len = VRAM_SIZE - *off;
	}
	for (i = 0; i < len; i++)
	{
		if (copy_from_user(&byte, buf + i, 1))
		{
			return -EFAULT;
		}
		iowrite8(byte, (u8 *)vram + *off + i);
	}
	*off += len;

	return len;
}

static struct file_operations vram_fops =
{
	.owner = THIS_MODULE,
	.open = my_open,
	.release = my_close,
	.read = my_read,
	.write = my_write
};

static int __init vram_init(void) /* Constructor */
{
	int ret;
	struct device *dev_ret;

	if ((vram = ioremap(VRAM_BASE, VRAM_SIZE)) == NULL)
	{
		printk(KERN_ERR "Mapping video RAM failed\n");
		return -ENOMEM;
	}
	if ((ret = alloc_chrdev_region(&first, 0, 1, "vram")) < 0)
	{
		iounmap(vram); /* Undo the mapping on every failure path */
		return ret;
	}
	if (IS_ERR(cl = class_create(THIS_MODULE, "chardrv")))
	{
		unregister_chrdev_region(first, 1);
		iounmap(vram);
		return PTR_ERR(cl);
	}
	if (IS_ERR(dev_ret = device_create(cl, NULL, first, NULL, "vram")))
	{
		class_destroy(cl);
		unregister_chrdev_region(first, 1);
		iounmap(vram);
		return PTR_ERR(dev_ret);
	}

	cdev_init(&c_dev, &vram_fops);
	if ((ret = cdev_add(&c_dev, first, 1)) < 0)
	{
		device_destroy(cl, first);
		class_destroy(cl);
		unregister_chrdev_region(first, 1);
		iounmap(vram);
		return ret;
	}
	return 0;
}

static void __exit vram_exit(void) /* Destructor */
{
	cdev_del(&c_dev);
	device_destroy(cl, first);
	class_destroy(cl);
	unregister_chrdev_region(first, 1);
	iounmap(vram);
}

module_init(vram_init);
module_exit(vram_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
MODULE_DESCRIPTION("Video RAM Driver");
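The bounds handling at the top of my_read() and my_write() above can be looked at in isolation. The following user-space sketch reproduces that clamping in a helper named clamp_len(), a name made up for this illustration. Unlike the driver code, it compares len against the remaining space rather than computing offset plus length, which is a small variation chosen here to sidestep overflow for very large lengths.

```c
#include <assert.h>
#include <stddef.h>

#define VRAM_SIZE 0x00020000

/* Returns how many bytes may be transferred starting at offset off */
size_t clamp_len(long long off, size_t len)
{
	assert(off >= 0);
	if (off >= VRAM_SIZE)
	{
		return 0; /* At or beyond the end: nothing to transfer */
	}
	if (len > (size_t)(VRAM_SIZE - off))
	{
		return (size_t)(VRAM_SIZE - off); /* Trim to the remaining space */
	}
	return len;
}
```

For instance, a 16-byte request at offset 0x1FFF8 is trimmed to 8 bytes, and any request at offset 0x20000 yields 0.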

Summing up

Then, Shweta repeated the following steps:

  • Build the vram driver (video_ram.ko file) by running make with the same Makefile changed to build this driver.
  • Load the driver as usual, using insmod video_ram.ko.
  • Write into /dev/vram as usual, say using echo -n "0123456789" > /dev/vram.
  • Read the /dev/vram contents using xxd /dev/vram | less. The usual cat /dev/vram can also be used, but that would show all the binary content. xxd shows the contents as hexadecimal in the centre with the corresponding ASCII along the right side.
  • Unload the driver as usual, using rmmod video_ram.
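The write and read steps in the list above can also be done from a small C program instead of echo and xxd. This is only a sketch: write_and_dump() is a helper name invented here, it handles at most 64 bytes, and it assumes that the file it is pointed at accepts positional reads and writes through pread() and pwrite(), which is an assumption about the device file rather than something shown above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Writes len bytes at offset 0 of path, reads them back and prints them
 * in hexadecimal. Returns the number of bytes read back, or -1 on error.
 */
ssize_t write_and_dump(const char *path, const void *data, size_t len)
{
	unsigned char buf[64];
	ssize_t n, i;
	int fd;

	assert(len <= sizeof(buf));
	if ((fd = open(path, O_RDWR)) == -1)
	{
		perror("open");
		return -1;
	}
	if (pwrite(fd, data, len, 0) != (ssize_t)len)
	{
		perror("pwrite");
		close(fd);
		return -1;
	}
	if ((n = pread(fd, buf, len, 0)) == -1)
	{
		perror("pread");
		close(fd);
		return -1;
	}
	for (i = 0; i < n; i++)
	{
		printf("%02x ", buf[i]);
	}
	printf("\n");
	close(fd);
	return n;
}
```

With the driver loaded and appropriate privileges, a call such as write_and_dump("/dev/vram", "0123456789", 10) would mirror the echo and xxd steps; the same function can be tried on an ordinary file first.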

Note 1: Today’s systems typically use separate video cards having their own video RAM. So, the video RAM used in the “DOS days”, i.e. the one mentioned in this article, is unused and often not even present. Hence, playing around with it is safe, without any effect on the system or the display.

Note 2: Moreover, if the video RAM is absent, the reads/writes may not actually be reading/writing anything, but just sending/receiving signals into thin air. In such a case, writes would not change anything, and reads would keep returning the same value, with xxd thus showing the same values.

There was still half an hour left before the practical class ended for the lunch break. So, Shweta decided to walk around and possibly help somebody with their experiments.

Eighth Article >>

Notes:

  1. When a pointer is tagged with __iomem, it is marked as pointing to I/O mapped memory. That lets static checking tools (such as sparse) flag code which dereferences it directly, instead of going through the I/O accessor functions.


Decoding the character device file operations

This sixth article, which is part of the series on Linux device drivers, is a continuation of the various concepts of character drivers and their implementation, dealt with in the previous two articles.

<< Fifth Article

So, what was your guess on how Shweta would crack the nut? Obviously, using the nut cracker named Pugs. Wasn’t it obvious? <Smile> In our previous article, we saw how Shweta was puzzled at reading no data, even after writing into the /dev/mynull character device file. Suddenly, a bell rang – not inside her head, a real one at the door. And sure enough, there was the avatar of Pugs.

“How come you’re here?”, exclaimed Shweta. “After reading your tweet, what else? Cool that you cracked your first character driver all on your own. That’s amazing. So, what are you up to now?”, said Pugs. “I’ll tell you on the condition that you do not become a spoilsport”, replied Shweta. “Okay yaar, I’ll only give you pointers”. “And that too, only if I ask for them”. “Okie”. “I am trying to decode the working of character device file operations”. “I have an idea. Why don’t you decode and explain your understanding to me?”. “Not a bad idea”. With that, Shweta tailed the dmesg log to observe the printk output from her driver. Alongside, she opened her null driver code on her console, specifically observing the device file operations my_open, my_close, my_read, and my_write.

static int my_open(struct inode *i, struct file *f)
{
	printk(KERN_INFO "Driver: open()\n");
	return 0;
}
static int my_close(struct inode *i, struct file *f)
{
	printk(KERN_INFO "Driver: close()\n");
	return 0;
}
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: read()\n");
	return 0;
}
static ssize_t my_write(
		struct file *f, const char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: write()\n");
	return len;
}

Based on the earlier understanding of the return value of functions in the kernel, my_open() and my_close() are trivial: their return type is int, and both of them return zero, meaning success. However, the return type of both my_read() and my_write() is not int, but ssize_t. On further digging through the kernel headers, that turns out to be a signed word. So, returning a negative number would be the usual error. But a non-negative return value would have an additional meaning. For read, it would be the number of bytes read, and for write, it would be the number of bytes written.

Reading the device file

To understand this in detail, the complete flow has to be looked at again. Let’s take read first. When the user does a read on the device file /dev/mynull, that system call comes to the virtual file system (VFS) layer in the kernel. VFS decodes the <major, minor> tuple and figures out that it needs to redirect the call to the driver’s function my_read(), registered with it. So, from that angle, my_read() is invoked as a request to read from us – the device driver writers. And hence, its return value indicates to the requester, the user, how many bytes they are getting from the read request. In our null driver example, we returned zero, meaning no bytes available, or in other words, end of file. And hence, when the device file is read, the result is always nothing, independent of what is written into it.

“Hmmm!!! So, if I change it to 1, would it start giving me some data?”, Pugs asked in his verifying style. Shweta paused for a while – looked at the parameters of the function my_read() and confirmed with a but – data would be sent but it would be some junk data, as the my_read() function is not really populating the data into the buf (second parameter of my_read()), provided by the user. In fact, my_read() should write data into buf, according to len (third parameter of my_read()), the count in bytes requested by the user.

To be more specific, write less than or equal to len bytes of data into buf, and use that same number as the return value. It is not a typo – in read, we ‘write’ into buf – that’s correct. We read the data from (possibly) an underlying device and then write that data into the user buffer, so that the user gets it, i.e. reads it. “That’s really smart of you”, expressed Pugs with sarcasm.

Writing into the device file

Similarly, write is just the reverse procedure. The user provides len (third parameter of my_write()) bytes of data to be written, in buf (second parameter of my_write()). my_write() would read that data and possibly write it into an underlying device, and accordingly return the number of bytes it has been able to write successfully. “Aha!! That’s why all my writes into /dev/mynull have been successful, without actually doing any read or write”, exclaimed Shweta, filled with the happiness of understanding the complete flow of device file operations.

Preserving the last character

That was enough – Shweta not giving any chance to Pugs to add, correct or even speak. So, Pugs came up with a challenge. “Okay. Seems like you are thoroughly clear with the read/write funda. Then, here’s a question for you. Can you modify these my_read() and my_write() functions such that whenever I read /dev/mynull, I get the last character written into /dev/mynull?”

Confident enough, Shweta took the challenge and modified the my_read() and my_write() functions as follows, along with an addition of a static global character:

static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: read()\n");
	buf[0] = c;
	return 1;
}
static ssize_t my_write(
		struct file *f, const char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: write()\n");
	c = buf[len - 1];
	return len;
}

“Almost there, but what if the user has provided an invalid buffer, or what if the user buffer is swapped out? Wouldn’t this direct access of the user space buf just crash and oops the kernel?”, pounced Pugs. Shweta, not giving up the challenge, dived into her collated material and figured out that there are two APIs just to ensure that the user space buffers are safe to access, and then to update them as well. With a complete understanding of the APIs, she re-wrote the above code snippet, including the corresponding header <asm/uaccess.h>, as follows, leaving no chance for Pugs to comment:

#include <asm/uaccess.h>

static char c;

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: read()\n");
	if (copy_to_user(buf, &c, 1) != 0)
		return -EFAULT;
	else
		return 1;
}
static ssize_t my_write(
		struct file *f, const char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: write()\n");
	if (copy_from_user(&c, buf + len - 1, 1) != 0)
		return -EFAULT;
	else
		return len;
}

Then, Shweta repeated the usual build and test steps as follows:

  • Build the modified null driver (.ko file) by running make.
  • Load the driver using insmod.
  • Write into /dev/mynull, say using echo -n "Pugs" > /dev/mynull
  • Read from /dev/mynull using cat /dev/mynull (Stop using Ctrl+C)
  • Unload the driver using rmmod.

Summing up

On cat’ing /dev/mynull, the output was a non-stop infinite sequence of ‘s’, as my_read() gives the last character forever. So, Pugs intervened and pressed Ctrl+C to stop the infinite read, and tried to explain, “If this is to be changed to ‘the last character only once’, my_read() needs to return 1 the first time and zero from the second time onwards. This can be achieved using off (fourth parameter of my_read())”. Shweta nodded her head to support Pugs’ ego.

Seventh Article >>

Add-on

And here’s the modified read using the off:

static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: read()\n");
	if (*off == 0)
	{
		if (copy_to_user(buf, &c, 1) != 0)
			return -EFAULT;
		else
		{
			(*off)++;
			return 1;
		}
	}
	else
		return 0;
}
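
The read-once logic can also be tried out without loading a module. What follows is only a user-space sketch, not kernel code: model_write(), model_read() and model_cat() are made-up stand-ins for my_write(), my_read() and the cat command, plain pointers take the place of the __user buffers, and a NULL check stands in for the -EFAULT that copy_to_user() / copy_from_user() would report.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical stand-in for the driver's static global 'c' */
static char model_c;

/* Mirrors my_write(): remember the last byte, report all bytes as written */
ssize_t model_write(const char *buf, size_t len)
{
	if (buf == NULL)
		return -EFAULT;	/* the driver would get this from copy_from_user() */
	if (len == 0)
		return 0;	/* nothing to remember */
	model_c = buf[len - 1];
	return (ssize_t)len;
}

/* Mirrors the add-on my_read(): one byte at offset 0, end of file afterwards */
ssize_t model_read(char *buf, size_t len, long *off)
{
	if (buf == NULL || off == NULL)
		return -EFAULT;
	if (len == 0 || *off != 0)
		return 0;
	buf[0] = model_c;
	(*off)++;
	return 1;
}

/* Mimics cat: start at offset 0 and keep reading until read returns 0 */
const char *model_cat(void)
{
	static char out[8];
	long off = 0;
	size_t total = 0;
	ssize_t n;

	while (total < sizeof(out) - 1 &&
		(n = model_read(out + total, sizeof(out) - 1 - total, &off)) > 0)
		total += (size_t)n;
	out[total] = '\0';
	return out;
}
```

Calling model_write("Pugs", 4) followed by model_cat() yields a one-character string, because the second model_read() sees a non-zero offset and reports end of file.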

Character device files: Creation & Operations

This fifth article, which is part of the series on Linux device drivers, is a continuation of the various concepts of character drivers and their implementation, dealt with in the previous article.

<< Fourth Article

In our previous article, we noted that even with the registration for <major, minor> device range, the device files were not created under the /dev, rather Shweta had to create them by hand using mknod. However, on further study, Shweta figured out a way for the automatic creation of the device files using the udev daemon. She also learnt the second step for connecting the device file with the device driver – “Linking the device file operations to the device driver functions”. Here are her learnings.

Automatic creation of device files

Earlier, in kernel 2.4, the automatic creation of device files was done by the kernel itself, by calling the appropriate APIs of devfs. However, as the kernel evolved, kernel developers realized that device files are more of a user space thing and hence, as a policy, only the users should deal with them, not the kernel. With this idea, the kernel now only populates the appropriate device class & device info into the /sys window for the device under consideration. And then, the user space needs to interpret it and take an appropriate action. In most Linux desktop systems, the udev daemon picks up that information and accordingly creates the device files.

udev can be further configured using its configuration files to tune the device file names, their permissions, their types, etc. So, as far as the driver is concerned, the appropriate /sys entries need to be populated using the Linux device model APIs declared in <linux/device.h>, and the rest would be handled by udev. The device class is created as follows:

struct class *cl = class_create(THIS_MODULE, "<device class name>");

and then the device info (<major, minor>) under this class is populated by:

device_create(cl, NULL, first, NULL, "<device name format>", …);

where first is the dev_t with the corresponding <major, minor>.

The corresponding complementary or the inverse calls, which should be called in chronologically reverse order, are as follows:

device_destroy(cl, first);
class_destroy(cl);

Refer to Figure 9, for the /sys entries created using “chardrv” as the <device class name> and “mynull” as the <device name format>. That also shows the device file, created by udev, based on the <major>:<minor> entry in the dev file.

Figure 9: Automatic device file creation

In case of multiple minors, the device_create() and device_destroy() APIs may be put in a for-loop, and the <device name format> string could be useful. For example, the device_create() call in a for-loop indexed by ‘i’ could be as follows:

device_create(cl, NULL, MKDEV(MAJOR(first), MINOR(first) + i), NULL, "mynull%d", i);

File operations

Whatever system calls, or more commonly file operations, we talk of over a regular file are applicable to device files as well. That’s why we say a file is a file, and in Linux almost everything is a file from the user space perspective. The difference lies in the kernel space, where the virtual file system (VFS) decodes the file type and transfers the file operations to the appropriate channel, like the file system module in case of a regular file or directory, or the corresponding device driver in case of a device file. Our discussion of interest is the second case.

Now, for VFS to pass the device file operations on to the driver, it should have been told about them. And yes, that is what is called registering the file operations by the driver with the VFS. This involves two steps. (The parenthesised text below refers to the ‘null driver’ code following it.) The first step is to fill in a file operations structure (struct file_operations pugs_fops) with the desired file operations (my_open, my_close, my_read, my_write, …) and to initialize the character device structure (struct cdev c_dev) with that, using cdev_init(). The second step is to hand this structure to the VFS using the call cdev_add(). Both cdev_init() and cdev_add() are declared in <linux/cdev.h>. Obviously, the actual file operations (my_open, my_close, my_read, my_write) also had to be coded by Shweta. So, to start with, Shweta kept them as simple as possible, so as to say, as easy as the “null driver”.
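
To get a feel for what this registration buys, here is a rough user-space model of the idea, under loud assumptions: struct demo_fops, demo_register(), demo_vfs_read() and demo_vfs_write() are invented for illustration, the table is indexed by major number only, and the real kernel lookup is more elaborate than a plain array. It only shows the principle that a table of function pointers, filled in by the driver, lets a generic layer forward calls without knowing the driver.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

#define DEMO_MAX_MAJOR 256	/* hypothetical table size for this sketch */

/* Cut-down, made-up stand-in for struct file_operations */
struct demo_fops
{
	ssize_t (*read)(char *buf, size_t len);
	ssize_t (*write)(const char *buf, size_t len);
};

static const struct demo_fops *demo_table[DEMO_MAX_MAJOR];

/* Plays the role of cdev_init() + cdev_add(): tie a number to the operations */
int demo_register(unsigned int major, const struct demo_fops *fops)
{
	if (major >= DEMO_MAX_MAJOR || fops == NULL)
		return -EINVAL;
	if (demo_table[major] != NULL)
		return -EBUSY;
	demo_table[major] = fops;
	return 0;
}

/* Plays the role of the VFS: look up the driver by number, forward the call */
ssize_t demo_vfs_read(unsigned int major, char *buf, size_t len)
{
	if (major >= DEMO_MAX_MAJOR || demo_table[major] == NULL ||
		demo_table[major]->read == NULL)
		return -ENODEV;
	return demo_table[major]->read(buf, len);
}

ssize_t demo_vfs_write(unsigned int major, const char *buf, size_t len)
{
	if (major >= DEMO_MAX_MAJOR || demo_table[major] == NULL ||
		demo_table[major]->write == NULL)
		return -ENODEV;
	return demo_table[major]->write(buf, len);
}

/* The "null driver" behaviour: read gives end of file, write accepts all */
static ssize_t null_read(char *buf, size_t len)
{
	(void)buf;
	(void)len;
	return 0;
}
static ssize_t null_write(const char *buf, size_t len)
{
	(void)buf;
	return (ssize_t)len;
}

const struct demo_fops demo_null_fops =
{
	.read = null_read,
	.write = null_write
};
```

After demo_register(250, &demo_null_fops), a demo_vfs_write() on 250 reports every byte as written and a demo_vfs_read() reports zero bytes, while any unregistered number gets -ENODEV.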

The null driver

Following these steps, Shweta put all the pieces together to attempt her first character device driver. Let’s see what was the outcome. Here’s the complete code:

#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <linux/cdev.h>

static dev_t first; // Global variable for the first device number
static struct cdev c_dev; // Global variable for the character device structure
static struct class *cl; // Global variable for the device class

static int my_open(struct inode *i, struct file *f)
{
	printk(KERN_INFO "Driver: open()\n");
	return 0;
}
static int my_close(struct inode *i, struct file *f)
{
	printk(KERN_INFO "Driver: close()\n");
	return 0;
}
static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
{
	printk(KERN_INFO "Driver: read()\n");
	return 0;
}
static ssize_t my_write(struct file *f, const char __user *buf, size_t len,
	loff_t *off)
{
	printk(KERN_INFO "Driver: write()\n");
	return len;
}

static struct file_operations pugs_fops =
{
	.owner = THIS_MODULE,
	.open = my_open,
	.release = my_close,
	.read = my_read,
	.write = my_write
};

static int __init ofcd_init(void) /* Constructor */
{
	int ret;
	struct device *dev_ret;

	printk(KERN_INFO "Namaskar: ofcd registered\n");
	if ((ret = alloc_chrdev_region(&first, 0, 1, "Shweta")) < 0)
	{
		return ret;
	}
	if (IS_ERR(cl = class_create(THIS_MODULE, "chardrv")))
	{
		unregister_chrdev_region(first, 1);
		return PTR_ERR(cl);
	}
	if (IS_ERR(dev_ret = device_create(cl, NULL, first, NULL, "mynull")))
	{
		class_destroy(cl);
		unregister_chrdev_region(first, 1);
		return PTR_ERR(dev_ret);
	}

	cdev_init(&c_dev, &pugs_fops);
	if ((ret = cdev_add(&c_dev, first, 1)) < 0)
	{
		device_destroy(cl, first);
		class_destroy(cl);
		unregister_chrdev_region(first, 1);
		return ret;
	}
	return 0;
}

static void __exit ofcd_exit(void) /* Destructor */
{
	cdev_del(&c_dev);
	device_destroy(cl, first);
	class_destroy(cl);
	unregister_chrdev_region(first, 1);
	printk(KERN_INFO "Alvida: ofcd unregistered\n");
}

module_init(ofcd_init);
module_exit(ofcd_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
MODULE_DESCRIPTION("Our First Character Driver");

Then, Shweta repeated the usual build steps, along with new test steps, as follows:

  • Build the driver (.ko file) by running make.
  • Load the driver using insmod.
  • List the loaded modules using lsmod.
  • List the major number allocated using cat /proc/devices.
  • “null driver” specific experiments (Refer to Figure 10 for details).
  • Unload the driver using rmmod.

Figure 10: “null driver” experiments

Summing up

Shweta was surely happy, as all on her own she had got a character driver written, which works the same as the driver for the standard device file /dev/null. To understand what that means, check for yourself the <major, minor> tuple for /dev/null, and similarly also try out the echo and cat commands with it.

But one thing started bothering Shweta. She had got her own calls (my_open, my_close, my_read, my_write) in her driver, but why were they working so unusually, unlike any regular file system calls? What’s so unusual? Whatever I write, I get nothing when I read – isn’t that unusual, at least from the perspective of regular file operations? Any guesses on how she would crack this nut? Watch out for the next article.

Sixth Article >>

Notes:

  1. For using a fixed major number, you may use register_chrdev_region() instead of alloc_chrdev_region().
  2. Use kernel version >= 2.6.3x for the class_create() and the device_create() APIs to compile properly and work as explained, as before that version they had been rapidly evolving and changing.
  3. Kernel APIs (like class_create(), device_create()) which return pointers should be checked using the IS_ERR macro instead of comparing with NULL, as NULL is zero (i.e. success and not an error). On error, these APIs return a pointer that encodes a negative error code, which can be extracted using PTR_ERR. See the usage in the above example.
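
How a pointer can carry an error code may be easier to see in a small user-space imitation. This is a sketch only: demo_err_ptr(), demo_ptr_err() and demo_is_err() are hypothetical names, written on the assumption that error codes are packed into the topmost 4095 addresses, which no valid object is expected to occupy.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_ERRNO 4095	/* largest error number the encoding reserves */

/* Pack a negative error code into a pointer value */
void *demo_err_ptr(long error)
{
	return (void *)error;
}

/* Recover the negative error code from such a pointer */
long demo_ptr_err(const void *ptr)
{
	return (long)ptr;
}

/* True only for pointers in the last DEMO_MAX_ERRNO addresses */
int demo_is_err(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-DEMO_MAX_ERRNO;
}
```

Note that demo_is_err(NULL) is 0, which is exactly why a comparison with NULL would miss the error case.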


Linux Character Drivers

This fourth article, which is part of the series on Linux device drivers, deals with the various concepts of character drivers and their implementation.

<< Third Article

Shweta, in her hostel room in front of her PC, was all set to explore the characters of Linux character drivers, before they were taught in class. She recalled the following lines from professor Gopi’s class: “… today’s first driver would be the template to any driver you write in Linux. Writing any specialized advanced driver is just a matter of what gets filled into its constructor & destructor. …”. With that, she took out the first driver code, and pulled out various reference books to start writing a character driver on her own. She also downloaded the on-line “Linux Device Drivers” book by Jonathan Corbet, Alessandro Rubini, Greg Kroah-Hartman from http://lwn.net/Kernel/LDD3/. Here follows the summary from her various collations.

W’s of character drivers

We already know what drivers are and why we need them. Then, what is so special about character drivers? If we write drivers for byte-oriented operations, or in C lingo, character-oriented operations, we refer to them as character drivers. And as the majority of devices are byte-oriented, the majority of device drivers are character device drivers. Take for example serial drivers, audio drivers, video drivers, camera drivers, basic I/O drivers, …. In fact, all device drivers which are neither storage nor network device drivers are one form or another of character drivers. Let’s look into the commonalities of these character drivers and how Shweta wrote one of them.

Figure 7: Character driver overview

The complete connection

As shown in Figure 7, for any application (user space) to operate on a byte-oriented device (hardware space), it should use the corresponding character device driver (kernel space). And the character driver is used through the corresponding character device file(s), linked to it through the virtual file system (VFS). What this means is that an application does the usual file operations on the character device file, those operations are translated by the VFS to the corresponding functions in the linked character device driver, and those functions then do the final low-level access to the actual devices to achieve the desired results. Note that though the application does the usual file operations, their outcome may not be the usual one. Rather, it would be as driven by the corresponding functions in the device driver. For example, a read following a write may not fetch what has been written, unlike in the case of regular files. Note that this is the usual expected behaviour for device files. Let’s take an audio device file as an example. What we write into it is the audio data we want to play back, say through a speaker. However, a read would get us the audio data we are recording, say through a microphone. And the recorded data need not be the played back data.

In this complete connection from application to the device, there are four major entities involved:

  1. Application
  2. Character device file
  3. Character device driver
  4. Character device

And the interesting thing is that all of these can exist independently on a system, without the others being there. So, the mere existence of these on a system doesn’t mean they are linked to form the complete connection. Rather, they need to be explicitly connected. An application gets connected to a device file by invoking the open system call on the device file. Device file(s) are linked to the device driver by specific registrations by the driver. And the device driver is linked to a device by its device-specific low-level operations, thus forming the complete connection. With this, note that the character device file is not the actual device but just a placeholder for the actual device.

Major & minor number

The connection between the application and the device file is based on the name of the device file. However, the connection between the device file and the device driver is based on the number of the device file, not the name. This allows a user-space application to have any name for the device file, and enables the kernel-space to have a trivial index-based linkage between the device file & the device driver. This device file number is more commonly referred to as the <major, minor> pair, or the major & minor numbers of the device file. Earlier (till kernel 2.4), one major number was for one driver, and the minor number used to represent the sub-functionalities of the driver. With kernel 2.6, this distinction is no longer mandatory – there could be multiple drivers under the same major number, but obviously with different minor number ranges. However, this is more common with the non-reserved major numbers, and standard major numbers are typically preserved for single drivers. For example, 4 for serial interfaces, 13 for mice, 14 for audio devices, …. The following command would list the various character device files on your system:

$ ls -l /dev/ | grep "^c"
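
The same information that ls -l prints can be fetched from a program. The following is a small user-space sketch, assuming a Linux system with glibc (for the major() and minor() helpers in <sys/sysmacros.h>); the function names is_char_device(), char_dev_major() and char_dev_minor() are made up for this example.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>	/* major(), minor() on glibc */
#include <sys/types.h>

/* 1 for a character device file, 0 for anything else, -1 if stat() fails */
int is_char_device(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return S_ISCHR(st.st_mode) ? 1 : 0;
}

/* Major number of a character device file, or -1 if path is not one */
long char_dev_major(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0 || !S_ISCHR(st.st_mode))
		return -1;
	return (long)major(st.st_rdev);	/* st_rdev holds the device file number */
}

/* Minor number of a character device file, or -1 if path is not one */
long char_dev_minor(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0 || !S_ISCHR(st.st_mode))
		return -1;
	return (long)minor(st.st_rdev);
}
```

For instance, is_char_device("/dev/null") should report 1 on a typical Linux system, whereas a directory such as "/" gives 0.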

<major, minor> related support in kernel 2.6

Type: (defined in kernel header <linux/types.h>)

dev_t // contains both major & minor numbers

Macros: (defined in kernel header <linux/kdev_t.h>)

MAJOR(dev_t dev) // extracts the major number from dev
MINOR(dev_t dev) // extracts the minor number from dev
MKDEV(int major, int minor) // creates the dev from major & minor
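
To see what these macros do with the bits, here is a user-space imitation, assuming the layout used by 2.6 kernels, where the device number is a 32-bit value with the minor in the low 20 bits and the major in the bits above. All the DEMO_ / demo_ names are hypothetical; in a driver, use the real macros from <linux/kdev_t.h>.

```c
typedef unsigned int demo_dev_t;	/* stand-in for the kernel's dev_t */

#define DEMO_MINORBITS 20
#define DEMO_MINORMASK ((1U << DEMO_MINORBITS) - 1)

/* Upper bits hold the major number, lower DEMO_MINORBITS hold the minor */
#define DEMO_MAJOR(dev) ((unsigned int)((dev) >> DEMO_MINORBITS))
#define DEMO_MINOR(dev) ((unsigned int)((dev) & DEMO_MINORMASK))
#define DEMO_MKDEV(ma, mi) ((demo_dev_t)(((ma) << DEMO_MINORBITS) | (mi)))

/* Device number of the i-th device file counted from 'first' */
demo_dev_t demo_nth_minor(demo_dev_t first, unsigned int i)
{
	return DEMO_MKDEV(DEMO_MAJOR(first), DEMO_MINOR(first) + i);
}
```

With this layout, consecutive minors under one major are simply consecutive numbers, which is why a range can be described by a first device number and a count.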

Connecting the device file with the device driver involves two steps:

  1. Registering for the <major, minor> range of device files
  2. Linking the device file operations to the device driver functions

First step is achieved using either of the following two APIs: (defined in kernel header <linux/fs.h>)

int register_chrdev_region(dev_t first, unsigned int cnt, char *name);
int alloc_chrdev_region(
	dev_t *first, unsigned int firstminor, unsigned int cnt, char *name);

First API registers the cnt number of device file numbers starting from first, with the name. Second API dynamically figures out a free major number and registers the cnt number of device file numbers starting from <the free major, firstminor>, with the name. In either case, the /proc/devices kernel window lists the name with the registered major number. With this information, Shweta added the following into the first driver code.

#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>

static dev_t first; // Global variable for the first device number

In the constructor, she added:

int ret;

if ((ret = alloc_chrdev_region(&first, 0, 3, "Shweta")) < 0)
{
	return ret;
}
printk(KERN_INFO "<Major, Minor>: <%d, %d>\n", MAJOR(first), MINOR(first));

In the destructor, she added:

unregister_chrdev_region(first, 3);

Putting it all together, it becomes:

#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>

static dev_t first; // Global variable for the first device number

static int __init ofcd_init(void) /* Constructor */
{
	int ret;

	printk(KERN_INFO "Namaskar: ofcd registered\n");
	if ((ret = alloc_chrdev_region(&first, 0, 3, "Shweta")) < 0)
	{
		return ret;
	}
	printk(KERN_INFO "<Major, Minor>: <%d, %d>\n", MAJOR(first), MINOR(first));
	return 0;
}

static void __exit ofcd_exit(void) /* Destructor */
{
	unregister_chrdev_region(first, 3);
	printk(KERN_INFO "Alvida: ofcd unregistered\n");
}

module_init(ofcd_init);
module_exit(ofcd_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
MODULE_DESCRIPTION("Our First Character Driver");

Then, Shweta repeated the usual steps she had learnt for the first driver:

  • Build the driver (.ko file) by typing make
  • Load the driver using insmod
  • List the loaded modules using lsmod
  • Unload the driver using rmmod

Summing up

Additionally, before unloading the driver, she peeped into the kernel window /proc/devices to look for the registered major number with the name “Shweta”, using cat /proc/devices. It was right there. But she couldn’t find any device file created under /dev with the same major number. So, she created them by hand using mknod, and then tried reading & writing those. Figure 8 shows all these. Please note that the major number “250” may vary from system to system, based on availability. Figure 8 also shows the results Shweta got from reading & writing one of the device files. That reminded her that the second step for connecting the device file with the device driver, “Linking the device file operations to the device driver functions”, was not yet done. She realized that she needed to dig up further information to complete this step, and also to figure out the reason for the missing device files under /dev. We shall continue in our next article, to figure out what more Shweta learns and how she goes ahead with her first character driver.

Figure 8: Character device file experiments

Fifth Article >>


Kernel C Extras in a Linux Driver

This third article in the series on Linux device drivers deals with the kernel’s message logging and kernel-specific GCC extensions.

<< Second Article

Enthused by how Pugs had impressed professor Gopi in the last class, Shweta decided to do something similar. And there was already an opportunity – finding out where the output of printk had gone. So, as soon as she entered the lab, she got hold of the best located system, logged into it, and took charge. Knowing her professor pretty well, she knew that there would be a hint for the finding from the class itself. So, she recalled all that the professor had taught, and suddenly remembered the error output demonstration from “insmod vfat.ko” – dmesg | tail. She immediately tried that and, sure enough, found the printk output there. But how did it come here? A tap on her shoulder brought her out of the thought. “Shall we go for a coffee?”, proposed Pugs. “But I need to …”. “I know what you are thinking about.”, interrupted Pugs. “Let’s go, yaar. I’ll explain all about dmesg to you”.

Kernel’s message logging

On the coffee table, Pugs began:

As far as parameters are concerned, printf & printk are the same, except that when programming for the kernel we don’t bother about the float formats of %f, %lf & the like. However, unlike printf, printk is not destined to dump its output on some console. In fact, it cannot do so, as the kernel is something which is in the background, and executes like a library, only when triggered either from the hardware space or the user space. So, then where does printk print? All the printk calls just put their contents into the (log) ring buffer of the kernel. Then, the syslog daemon running in the user space picks them up for final processing & redirection to various devices, as configured in its configuration file /etc/syslog.conf.

You must have observed the out of place macro KERN_INFO in the printk calls in the previous article. That actually is a constant string, which gets concatenated with the format string after it, making it a single string. Note that there is no comma (,) between them – they are not two separate arguments. There are eight such macros defined in <linux/kernel.h> under the kernel source, namely:

#define KERN_EMERG	"<0>" /* system is unusable			*/
#define KERN_ALERT	"<1>" /* action must be taken immediately	*/
#define KERN_CRIT	"<2>" /* critical conditions			*/
#define KERN_ERR	"<3>" /* error conditions			*/
#define KERN_WARNING	"<4>" /* warning conditions			*/
#define KERN_NOTICE	"<5>" /* normal but significant condition	*/
#define KERN_INFO	"<6>" /* informational				*/
#define KERN_DEBUG	"<7>" /* debug-level messages			*/
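
The “no comma” point is plain C and can be checked in user space. In the sketch below, DEMO_KERN_INFO, demo_msg and demo_log_level() are made-up names, and the "<6>" prefix text is taken from the listing above; the helper merely illustrates how a reader of the message could recover the level from those first 3 characters.

```c
#include <stddef.h>

#define DEMO_KERN_INFO "<6>"	/* same text as listed for KERN_INFO above */

/* Adjacent string literals are merged by the compiler: this is ONE string */
const char demo_msg[] = DEMO_KERN_INFO "Driver: open()\n";

/* Log level (0-7) encoded in a leading "<n>", or -1 if there is no prefix */
int demo_log_level(const char *msg)
{
	if (msg != NULL && msg[0] == '<' &&
		msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
		return msg[1] - '0';
	return -1;
}
```

Here demo_log_level(demo_msg) gives 6, and the actual message text starts at demo_msg + 3.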

Depending on these log levels (i.e. the first 3 characters in the format string), the syslog daemon in the user space redirects the corresponding messages to their configured locations – a typical one being the log file /var/log/messages, for all the log levels. Hence, all the printk outputs are by default in that file. Though, they can be configured differently, say to a serial port (/dev/ttyS0) or to all consoles, like what typically happens for KERN_EMERG. Now, /var/log/messages is buffered & contains messages not only from the kernel but also from various daemons running in the user space. Moreover, /var/log/messages is most often not readable by a normal user, and hence a user-space utility ‘dmesg’ is provided to directly parse the kernel ring buffer and dump it on the standard output. Figure 6 shows the snippets from the two.

Figure 6: Kernel’s message logging

Kernel-specific GCC extensions

With all this, Shweta got frustrated, as she had wanted to find all this out on her own, and then make an impression in the next class – but it all flopped. Pissed off, she said, “So as you have explained all about printing in kernel, why don’t you tell about the weird C in the driver as well – the special keywords __init, __exit, etc.”

These are not any special keywords. Kernel C is not any weird C, but just the standard C with some additional extensions from the C compiler gcc. Macros __init and __exit are just two of these extensions. However, these do not have any relevance when we use them for a dynamically loadable driver, but only when the same code gets built into the kernel. All the functions marked with __init get placed inside the init section of the kernel image, and all functions marked with __exit are placed inside the exit section of the kernel image, automatically by gcc, during kernel compilation. What is the benefit? All functions with __init are supposed to be executed only once during boot-up, till the next boot-up. So, once they are executed during boot-up, the kernel frees up RAM by freeing up the init section. Similarly, all functions in the exit section are supposed to be called during system shutdown. Now, if the system is shutting down anyway, why do you need to do any cleanups? Hence, the exit section is not even built into the kernel – another cool optimization.

This is a beautiful example of how the kernel & gcc go hand-in-hand to achieve a lot of optimizations and many other tricks – we could see others, as we go along. And that is why the Linux kernel can be compiled only using gcc-based compilers – a close-knit bond.

Kernel function’s return guidelines

While returning from coffee, Pugs started all praises for the OSS & its community. Do you know why different individuals are able to come together and contribute excellently without any conflicts – moreover in a project as huge as Linux? There are many reasons. But definitely, one of the strong reasons is, following & abiding by the inherent coding guidelines. Take for example the guideline for returning values from a function in kernel programming.

Any kernel function needing error handling typically returns an integer-like type, and the return value again follows a guideline. For an error, we return a negative number: a minus sign prefixed to a macro included through the kernel header <linux/errno.h>, which includes the various error number headers under the kernel sources, namely <asm/errno.h>, <asm-generic/errno.h>, <asm-generic/errno-base.h>. For success, zero is the most common return value, unless there is some additional information to be provided. In that case, a positive value is returned, the value indicating the information, like the number of bytes transferred.
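The guideline can be tried out in user space as well. The sketch below is only an illustration: demo_copy() is a hypothetical helper, and it uses the C library's <errno.h>, which carries the same kind of error macros as the kernel headers named above.

```c
#include <errno.h>   /* user-space counterpart of the kernel's error numbers */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper following the kernel return guideline:
 *   negative value   -> error, a minus sign prefixed to an errno macro
 *   zero or positive -> success, here the number of bytes copied
 */
long demo_copy(char *dst, size_t dst_size, const char *src, size_t len)
{
	if (dst == NULL || src == NULL)
		return -EINVAL; /* invalid argument */
	if (len > dst_size)
		return -ENOSPC; /* destination too small */
	memcpy(dst, src, len);
	return (long)len;   /* additional information: bytes transferred */
}
```

A caller then needs just one check, whether the result is negative, to tell an error from a success, and on success the same value says how much was transferred.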

Kernel C = Pure C

Once back in the lab, Shweta remembered their professor mentioning that no /usr/include headers can be used for kernel programming. But Pugs had said that kernel C is just standard C with some gcc extensions. Why this conflict? Actually, this is not a conflict. Standard C is just pure C, that is, just the language. The headers are not part of it. Those are part of the standard libraries built in C for C programmers, based on the concept of re-using code. Does that mean that all standard libraries, and hence all ANSI standard functions, are not part of ‘pure’ C? Yes. Then, wouldn’t coding the kernel have been really tough? Not for this reason. In reality, kernel developers have developed their own needed set of functions, and they are all part of the kernel code. printk is just one of them. Similarly, many string functions, memory functions, … are all part of the kernel source under various directories like kernel, ipc, lib, … and the corresponding headers are under the include/linux directory.
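As a small illustration of ‘pure’ C, the sketch below uses no header at all. demo_strlen() is a hypothetical stand-in written for this example; the kernel carries its own strlen() and friends in its sources in much the same spirit, instead of borrowing them from the C library.

```c
/* No #include at all: only the C language itself is used here. */

/* Hypothetical stand-in for a string-length routine. */
unsigned long demo_strlen(const char *s)
{
	const char *p = s;

	while (*p != '\0') /* walk up to the terminating NUL */
		p++;
	return (unsigned long)(p - s);
}
```

The function compiles without any library header, which is the situation kernel code is always in.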

“O ya! That is why we need to have the kernel source for building a driver”, affirmed Shweta. “If not the complete source, at least the headers are a must. And that is why we have separate packages to install the complete kernel source or just the kernel headers”, added Pugs. “In the lab, all the sources are set up. But if I want to try out drivers on my Linux system in my hostel room, how do I go about it?” asked Shweta. “Our lab has Fedora, where the kernel sources typically get installed under /usr/src/kernels/<kernel_version>, unlike the standard place /usr/src/linux. The lab administrators must have installed it using the command line ‘yum install kernel-devel’. I use Mandriva and installed the kernel sources using ‘urpmi kernel-source’”, replied Pugs. “But I have Ubuntu”. “Okay!! For that just use apt-get install, possibly ‘apt-get install linux-source’”.

Summing up

Lab timings were just getting over. Suddenly, Shweta voiced her curiosity: “Hey Pugs! What is the next topic we are going to learn in our Linux device drivers class?”. “Hmmm!! Most probably character drivers”. With this information, Shweta hurriedly packed up her bag & headed towards her room to set up the kernel sources and try out the next driver on her own. “In case you get stuck, just give me a call. I’ll be there”, called out Pugs from behind with a smile.

Fourth Article >>

Notes:

  1. The default syslog file /var/log/messages may vary from distro to distro. For example, in the latest Ubuntu distros, it is /var/log/syslog.

Other References:

  1. Another possible pointer to the missing /var/log/messages in Ubuntu

Writing your First Linux driver in the Classroom

This second article, which is part of the series on Linux device drivers, deals with the concept of dynamically loading drivers, and then with writing a first Linux driver, building it, and loading it.

<< First Article

As Shweta and Pugs reached their classroom, they were already late. Their professor was already in there. They looked at each other and Shweta sheepishly asked, “May we come in, sir?” “C’mon!!! You guys are late again”, called out professor Gopi. “And what is your excuse today?” “Sir, we were discussing your topic only. I was explaining device drivers in Linux to her”, was a hurried reply from Pugs. “Good one!! So then, explain dynamic loading in Linux to me. You get it right and you two are excused”, the professor emphasized. Pugs was more than happy. And he very well knew how to make his professor happy: criticize Windows. So, this is what he said.

As we know, a typical driver installation on Windows needs a reboot for it to get activated. That is really not acceptable if we need to do it, say, on a server. That’s where Linux wins the race. In Linux, we can load (/ install) or unload (/ uninstall) a driver on the fly. And it is active for use instantly after load. Also, it is disabled instantly with unload. This is referred to as dynamic loading & unloading of drivers in Linux.

As expected, he impressed the professor. “Okay! Take your seats. But make sure you are not late again”. With this, the professor continued with the class, “So, you now already know what dynamic loading & unloading of drivers into & out of the (Linux) kernel is. I shall teach you how to do it. And then, we shall get into writing our first Linux driver today”.

Dynamically loading drivers

These dynamically loadable drivers are more commonly referred to as modules and are built into individual files with the .ko (kernel object) extension. Every Linux system has a standard place under the root file system (/) for all the pre-built modules. They are organized similar to the kernel source tree structure under /lib/modules/<kernel_version>/kernel, where <kernel_version> would be the output of the command “uname -r” on the system. The professor demonstrated this to the class, as shown in Figure 4.

Figure 4: Linux pre-built modules
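The directory of pre-built modules can also be worked out from a program. The following is a minimal user-space sketch, assuming a Linux system; demo_module_dir() is a hypothetical name, while uname() and the release field of struct utsname are the standard interface behind the “uname -r” command.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Fills buf with "/lib/modules/<kernel_version>/kernel", where
 * <kernel_version> is what "uname -r" prints.
 * Returns 0 on success and -1 on failure.
 */
int demo_module_dir(char *buf, size_t size)
{
	struct utsname u;
	int n;

	if (uname(&u) != 0)
		return -1;
	n = snprintf(buf, size, "/lib/modules/%s/kernel", u.release);
	if (n < 0 || (size_t)n >= size)
		return -1; /* output error, or the path did not fit */
	return 0;
}
```

The path is only constructed here, not checked for existence; on a system without installed modules the directory may well be missing.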


Now, let us take one of the pre-built modules and understand the various operations with it.

Here’s a list of the various (shell) commands relevant to the dynamic operations:

  • lsmod – List the currently loaded modules
  • insmod <module_file> – Insert/Load the module specified by <module_file>
  • modprobe <module> – Insert/Load the <module> along with its dependencies
  • rmmod <module> – Remove/Unload the <module>

These reside under the /sbin directory and are to be executed with root privileges. Let us take the FAT file system related drivers for our experimentation. The various module files would be fat.ko, vfat.ko, etc. under the directories fat (& vfat for older kernels) under /lib/modules/`uname -r`/kernel/fs. In case they are in compressed .gz format, they need to be uncompressed using gunzip before using them with insmod. The vfat module depends on the fat module. So, fat.ko needs to be loaded before vfat.ko. To do all these steps (decompression & dependency loading) automatically, modprobe can be used instead. Observe that there is no .ko in the module name given to modprobe. rmmod is used to unload the modules. Figure 5 demonstrates this complete experimentation.

Figure 5: Linux module operations


Our first Linux driver

With that understood, let’s now write our first driver. But just before that, some concepts need to be set right. A driver never runs by itself. It is similar to a library that gets loaded for its functions to be invoked by the “running” applications. And hence, though written in C, it lacks the main() function. Moreover, it would get loaded / linked with the kernel. Hence, it needs to be compiled in similar ways as the kernel. Even the header files to be used can be picked only from the kernel sources, not from the standard /usr/include.

One interesting fact about the kernel is that it is an object-oriented implementation in C. And it is so profound that we would observe the same even with our first driver. Any Linux driver consists of a constructor and a destructor. The constructor of a module gets called whenever insmod succeeds in loading the module into the kernel. And the destructor of the module gets called whenever rmmod succeeds in unloading the module out of the kernel. These two are like normal functions in the driver, except that they are specified as the init & exit functions, respectively, by the macros module_init() & module_exit() included through the kernel header module.h.

/* ofd.c – Our First Driver code */

#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>

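static int __init ofd_init(void) /* Constructor */
{
	/* The trailing newline terminates the log record */
	printk(KERN_INFO "Namaskar: ofd registered\n");

	return 0; /* Zero reports a successful load */
}

static void __exit ofd_exit(void) /* Destructor */
{
	printk(KERN_INFO "Alvida: ofd unregistered\n");
}

module_init(ofd_init); /* Called on insmod */
module_exit(ofd_exit); /* Called on rmmod */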

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
MODULE_DESCRIPTION("Our First Driver");

Above is the complete code for our first driver, say ofd.c. Note that there is no stdio.h (a user space header); instead there is an analogous kernel.h (a kernel space header), with printk() being the analogue of printf(). Additionally, version.h is included for version compatibility of the module with the kernel into which it is going to get loaded. Also, the MODULE_* macros populate the module related information, which acts like the module’s signature.

Building our first Linux driver

Once we have the C code, it is time to compile it and create the module file ofd.ko. And for that we need to build it in a way similar to the kernel. So, we shall use the kernel build system to do the same. Here follows our first driver’s Makefile, which would invoke the kernel’s build system from the kernel source. The kernel’s Makefile would in turn invoke our first driver’s Makefile to build our first driver. The kernel source is assumed to be installed at /usr/src/linux. In case it is at any other location, the KERNEL_SOURCE variable has to be updated appropriately.

# Makefile – makefile of our first driver

# if KERNELRELEASE is not defined, we've been called directly from the command line.
# Invoke the kernel build system.
ifeq (${KERNELRELEASE},)
	KERNEL_SOURCE := /usr/src/linux
	PWD := $(shell pwd)
default:
	${MAKE} -C ${KERNEL_SOURCE} M=${PWD} modules

clean:
	${MAKE} -C ${KERNEL_SOURCE} M=${PWD} clean

# Otherwise KERNELRELEASE is defined; we've been invoked from the
# kernel build system and can use its language.
else
	obj-m := ofd.o
endif

Note 1: Makefiles are very space-sensitive. The lines not starting at the first column have a tab and not spaces.

Note 2: For building a Linux driver, you need to have the kernel source (or at the least the kernel headers) installed on your system.

With the C code (ofd.c) and Makefile ready, all we need to do is put them in a (new) directory of their own, and then invoke make in that directory to build our first driver (ofd.ko).

$ make
make -C /usr/src/linux M=... modules
make[1]: Entering directory `/usr/src/linux'
CC [M] .../ofd.o
Building modules, stage 2.
MODPOST 1 modules
CC .../ofd.mod.o
LD [M] .../ofd.ko
make[1]: Leaving directory `/usr/src/linux'

Summing up

Once we have the ofd.ko file, do the usual steps as root, or with sudo.

$ su
# insmod ofd.ko
# lsmod | head -10

lsmod should show you the ofd driver loaded.

While the students were trying their first module, the bell rang, marking the end of this session of the class. And professor Gopi concluded, saying, “Currently, you may not be able to observe anything, other than the ‘lsmod’ listing showing our first driver loaded. Where’s the printk output gone? Find that out for yourself in the lab session and update me with your findings. Moreover, today’s first driver would be the template for any driver you write in Linux. Writing any specialized advanced driver is just a matter of what gets filled into its constructor & destructor. So, from here onwards, our learnings shall be in enhancing this driver to achieve our specific driver functionalities.”

Notes

  1. In most of today’s distros, one may safely have KERNEL_SOURCE set to /lib/modules/$(shell uname -r)/build, instead of /usr/src/linux i.e. KERNEL_SOURCE := /lib/modules/$(shell uname -r)/build in the Makefile.

Third Article >>


Linux Device Drivers for your Girl Friend

This is the first article of the series on Linux device drivers, which aims to present the usually technical topic in a way that is more interesting to a wider cross-section of readers.

“After a week of hard work, we finally got our driver working”, was the first line as Pugs met his girl friend Shweta.

“Why? What was your driver up to? Was he sick? And what hard work did you do?”, came a series of questions from Shweta with a naughty smile.

Pugs was confused as to what Shweta was talking about. “Which driver are you talking about?”, he exclaimed.

“Why are you asking me? You should tell me, which of your drivers, you are talking about?”, replied Shweta.

Pugs clicked, “Ah C’mon! Not my car drivers. I am talking about my device driver written on my computer.”

“I know a car driver, a bus driver, a pilot, a screw driver. But what is this device driver?”, queried Shweta.

And that was all that was needed to trigger Pugs’ passion to explain the concept of device drivers to a newbie. In particular, the Linux device drivers, which he had been working on for many years.

Of drivers and buses

A driver is one who drives – manages, controls, directs, monitors – the entity under his command. So a bus driver does that with a bus. Similarly, a device driver does that with a device. A device could be any peripheral connected to a computer, for example mouse, keyboard, screen / monitor, hard disk, camera, clock, … – you name it.

A pilot could be a person or automatic systems, possibly monitored by a person. Similarly, a device driver could be a piece of software or another peripheral / device, possibly driven by software. However, if it is another peripheral / device, it is referred to as a device controller in the common parlance. And by driver, we only mean the software driver. A device controller is a device itself and hence many a time it also needs a driver, commonly referred to as a bus driver.

General examples of device controllers include hard disk controllers, display controllers, and audio controllers for the corresponding devices. More technical examples would be the controllers for the hardware protocols, such as an IDE controller, PCI controller, USB controller, SPI controller, I2C controller, etc. Pictorially, this whole concept can be depicted as in figure 1.

Figure 1: Device & driver interaction


Device controllers are typically connected to the CPU through their respectively named buses (collection of physical lines), for example pci bus, ide bus, etc. In today’s embedded world, we more often come across microcontrollers than CPUs, which are nothing but CPU + various device controllers built onto a single chip. This effective embedding of device controllers primarily reduces cost & space, making it suitable for embedded systems. In such cases, the buses are integrated into the chip itself. Does this change anything on the drivers or more generically software front?

Not much except that the bus drivers corresponding to the embedded device controllers, are now developed under the architecture-specific umbrella.

Drivers have two parts

Bus drivers provide a hardware-specific interface for the corresponding hardware protocols, and are the bottom-most horizontal software layers of an operating system (OS). Over these sit the actual devices’ drivers. These operate on the underlying devices using the horizontal layer interfaces, and hence are device-specific. However, the whole idea of writing these drivers is to provide an abstraction to the user. And so on the other end, these do provide an interface to the user. This interface varies from OS to OS. In short, a device driver has two parts: i) Device-specific, and ii) OS-specific. Refer to figure 2.

Figure 2: Linux device driver partition


The device-specific portion of a device driver remains the same across all operating systems, and is more about understanding and decoding the device data sheets than about software programming. A data sheet for a device is a document with technical details of the device, including its operation, performance, programming, etc. Later, I shall show some examples of decoding data sheets as well. However, the OS-specific portion is the one which is tightly coupled with the OS mechanisms of user interfaces. This is the one which differentiates a Linux device driver from a Windows device driver from a Mac device driver.
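To make the two parts a little more concrete, here is a toy sketch in plain user-space C. Everything in it is assumed for illustration: the device, its one-bit “ready” status register, and all the demo_ names are hypothetical and do not come from any real data sheet or kernel interface.

```c
#include <stddef.h>
#include <stdint.h>

/* ---- Device-specific part: what a (hypothetical) data sheet dictates ---- */
#define DEMO_STATUS_READY 0x01u /* bit 0: a data byte is available */

struct demo_device {
	uint8_t status; /* stands in for a status register */
	uint8_t data;   /* stands in for a data register */
};

int demo_hw_ready(const struct demo_device *dev)
{
	return (dev->status & DEMO_STATUS_READY) != 0;
}

uint8_t demo_hw_fetch(struct demo_device *dev)
{
	dev->status &= (uint8_t)~DEMO_STATUS_READY; /* fetching clears "ready" */
	return dev->data;
}

/* ---- OS-specific part: the read()-like interface offered upwards ---- */
long demo_read(struct demo_device *dev, uint8_t *buf, size_t len)
{
	if (len == 0 || !demo_hw_ready(dev))
		return 0; /* nothing transferred */
	buf[0] = demo_hw_fetch(dev);
	return 1;     /* one byte transferred */
}
```

Porting such a driver to another OS would, in principle, mean rewriting only demo_read() against that OS's interface, while the two demo_hw_ functions stay as they are.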

Verticals

In Linux, a device driver provides a system call interface to the user. And, this is the boundary line between the so-called kernel space and user space of Linux, as shown in figure 2. Figure 3 elaborates on further classification.

Based on the OS-specific interface of a driver, in Linux a driver is broadly classified into 3 verticals:

  • Packet-oriented or Network vertical
  • Block-oriented or Storage vertical
  • Byte-oriented or Character vertical

Figure 3: Linux kernel overview

The other two verticals, loosely the CPU vertical and memory vertical put together with the other three verticals give the complete overview of the Linux kernel, like any text book definition of an OS: “An OS does 5 managements namely: CPU/process, memory, network, storage, device/io”. Though these 2 could be classified as device drivers, where CPU & memory are the respective devices, these two are treated differently for many reasons.

These are the core functionalities of any OS, be it a micro or a monolithic kernel. More often than not, adding code in these areas is mainly a Linux porting effort, typically for a new CPU or architecture. Moreover, the code in these two verticals cannot be loaded or unloaded on the fly, unlike the other three verticals. And henceforth, when we talk about Linux device drivers, we would mean only the latter three verticals in figure 3.

Let’s get a little deeper into these three verticals. Network consists of 2 parts: i) Network protocol stack, and ii) Network interface card (NIC) or simply network device drivers, which could be for ethernet, wifi, or any other network horizontals. Storage again consists of 2 parts: i) File system drivers for decoding the various formats on various partitions, and ii) Block device drivers for various storage (hardware) protocols, that is the horizontals like IDE, SCSI, MTD, etc.

With this you may wonder whether that is the only set of devices for which you need drivers, or for which Linux has drivers. Just hold on. You definitely need drivers for the whole lot of devices interfacing with a system, and Linux does have drivers for them. However, their byte-oriented accessibility puts all of them under the character vertical. Yes, I mean it: it is the majority bucket. In fact, because of this vastness, character drivers have got further sub-classified. So, you have tty drivers, input drivers, console drivers, framebuffer drivers, sound drivers, etc. And the typical horizontals here would be RS232, PS/2, VGA, I2C, I2S, SPI, etc.

Multiple-vertical interfacing drivers

On a final note on the complete picture of placement of all the drivers in the Linux driver ecosystem, the horizontals like USB, PCI, etc span below multiple verticals. Why? As we have a USB wifi dongle, a USB pen drive, as well as a USB to serial converter – all USB but three different verticals.

In Linux, bus drivers, or the horizontals, are often split into two parts, or even two drivers: i) Device controller specific, and ii) An abstraction layer over that for the verticals to interface with, commonly called cores. A classical example would be the USB controller drivers ohci, ehci, etc. and the USB abstraction usbcore.
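This split can be pictured with a table of function pointers, which is also the usual way of doing object-oriented design in C. The sketch below is a toy user-space model under that assumption; it is not the usbcore interface, and all the demo_ names are hypothetical.

```c
#include <stddef.h>

/* Operations every controller-specific part hands over to the core. */
struct demo_controller_ops {
	const char *name;
	int (*transfer)(const void *data, size_t len);
};

#define DEMO_MAX_CONTROLLERS 4

static const struct demo_controller_ops *demo_registered[DEMO_MAX_CONTROLLERS];
static size_t demo_count;

/* Core: controller drivers register themselves here. */
int demo_core_register(const struct demo_controller_ops *ops)
{
	if (ops == NULL || ops->transfer == NULL)
		return -1;
	if (demo_count == DEMO_MAX_CONTROLLERS)
		return -1; /* table full */
	demo_registered[demo_count++] = ops;
	return 0;
}

/* Core: the verticals call this without knowing which controller is below. */
int demo_core_transfer(size_t index, const void *data, size_t len)
{
	if (index >= demo_count)
		return -1; /* no such controller */
	return demo_registered[index]->transfer(data, len);
}

/* One controller-specific part, standing in for a driver such as ohci. */
static int demo_a_transfer(const void *data, size_t len)
{
	(void)data;
	return (int)len; /* pretend every byte was sent */
}

const struct demo_controller_ops demo_controller_a = {
	"controller_a", demo_a_transfer
};
```

A second controller would only add another ops table and register it; the code calling demo_core_transfer() would not have to change, which is the point of having a core.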

Summing up

So, to conclude, a device driver is a piece of software which drives a device, though there are so many classifications. And in case it drives only another piece of software, we call it just a driver. Examples are file system drivers, usbcore, etc. Hence, all device drivers are drivers, but not all drivers are device drivers.

“Hey Pugs! Just hold on. We are getting late for our class. And you know what kind of trouble, we can get into. Let’s continue from here, tomorrow.”, exclaimed Shweta.

With that, Pugs wrapped up saying, “Okay. This is mostly what the device driver theory is. If you are interested, later I shall show you the code & all that we have been doing for the various kinds of drivers”. And they hurried towards their classroom.

Second Article >>
