Toolchain over a Cup of Coffee

This is the second article in the series dedicated to the world of Embedded Systems.

“Wow! You are already there”, Shweta teased Pugs.

“I have to be. As an obedient student, how can I keep my guru waiting for me?”, was the naughty reply from Pugs.

Okay, okay. So, what were we discussing?

Two categories of Embedded Systems.

Oh yes! Baremetal Software (BS) based, and Operating System (OS) based. In BS based embedded systems, the firmware is the only software component. That has its own pros & cons. Being minimal software, it typically yields better control of timing & performance. But the downside is that it is a development from scratch. Anything and everything needed has to be designed and developed; no software logic can be assumed to be there.

Wow! That would be amazing, right? One would get a feel for the days when the first software was written.

Similar to that, but not exactly. The first software was written directly using CPU instructions, aka machine code.

Yes. Yes. Nowadays, we write in C, or other higher-level languages.

Yes. But even then, it ultimately needs to get converted into machine code. So, for that we use a compiler.

Yes, that I know. But what is a cross compiler?

See, in general, you run a compiler on your x86 desktop to generate machine code for that same x86 system, i.e. you compile the code for your native system itself. However, embedded systems typically have non-x86 architectures like ARM, PPC, MIPS, etc. But we still wish to compile the code for them on our desktops, instead of compiling it on the embedded system.

Why is that so?

Because of the usual embedded system restrictions, such as limited memory, storage and processing power, and the ease of development on our familiar desktops. In such a scenario, we need a compiler which runs on an x86 system (typically referred to as the host system), but generates machine code for a non-x86 system (typically referred to as the target system). Such compilers are called cross compilers.

So, a compiler where the host and target are not the same is called a cross compiler.

Exactly.

So, say I am somehow able to do compilation for my ARM based target system, on the target system itself. Then, wouldn’t it be a cross compiler?

No, it wouldn’t be. It would be just another native compiler, though ARM based, as your compiler would then run on an ARM system and generate code for the same system. And why somehow? When you have a complete Linux distribution running on an embedded system, you can easily have an ARM based native compiler in there.

Got it. Then, what is this so-called (cross) toolchain all about?

When we talk about a (cross) compiler, we are referring to a single piece of software (a tool) that converts, say, C into machine code. But typically, many other related tools are used to do that, like a (cross) assembler, a (cross) linker, etc. Moreover, there are many other related utilities which are useful in development, like (cross) objdump, (cross) nm, etc, grouped together as binary (lowest level output) manipulating utilities, aka (cross) binutils. This whole set or chain of tools, put together, is called a (cross) toolchain.

Oh! So that’s all there is to the name toolchain.

No, that’s not all. A toolchain is a topic in itself. Developing a specific toolchain involves many things, and multiple iterations. Just to give you an idea: a toolchain is, after all, software. So, what do you think it is written in?

C, I guess.

Nowadays, yes. So, we need to have a toolchain, to compile a toolchain.

That’s okay. We may use another already built toolchain, right?

It is not that straightforward. There may be incompatibilities with that, as a toolchain also includes a set of libraries, OS dependent headers, etc, which may differ from one version to another.

So, what do we do then?

We first build a minimal, version-independent toolchain using an existing toolchain, and then use that to build the complete toolchain.

Interesting.

And involved.

But then, how would the first toolchain have been built?

Possibly using a C compiler written not in C, but in assembly.

And that would have been assembled using an assembler. And then continuing down further, the first assembler would have been written in machine code itself.

Excellent.

So that’s why, rather than building a toolchain ourselves, we typically get one ready-made from some website, right?

Yes, but they are not just websites; rather, they are full-fledged companies built around toolchains. For example, Linaro.

Makes sense.

However, there are also full-fledged build systems to build complete Linux distributions, like Buildroot, Yocto, etc, which in turn build their own toolchains in an automated fashion, taking care of all the complications involved.

Wonderful. Based on our discussion, what I understand is that irrespective of the embedded system type (BS or OS based), we would need a toolchain for its software development. And it is up to us whether we download a pre-built one, or build it ourselves, manually or preferably using a build system.

Yes. But depending on the type, the toolchain may vary. And note that the build systems may be able to generate the toolchain only for an OS based embedded system.

“Yes, I can see there are broadly two types of toolchains on the Linaro website”, quipped Pugs, browsing it. “But why do they need to be different? In the end, they are going to generate machine code for the same architecture, aren’t they?”

I need to go now. Some other time on that.

When?

I’ll message you.

Anil Kumar Pugalia

The author is a hobbyist in open source hardware and software, with a passion for mathematics, and a philosopher in thought. He is a gold medallist from the Indian Institute of Science; Linux, mathematics and knowledge sharing are a few of his passions. He experiments with Linux and embedded systems to share his learnings through his weekend workshops. Learn more about him and his experiments at https://sysplay.in.
