Fun with gcc plugins

| categories: fedora

One piece of infrastructure that's come in as part of the Kernel Self Protection Project (KSPP) is support for gcc plugins. I touched on this briefly in my DevConf talk but I wanted to discuss a few more of the 'practicalities' of dealing with compiler plugins.

At an incredibly abstract level, a compiler transforms a program from some form A to another form A'. Your A might be C or C++ and you expect A' to be a binary file you can run. Modern compilers like gcc produce the final result by transforming your program through several passes, so you end up with A to A' to A'' to A''' etc. The gcc plugin architecture allows you to hook in at various points to make changes to the intermediate state of the program. gcc has a number of internal representations, so depending on where you hook in you may need to work with a different representation.
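To make the "series of passes with hook points" idea concrete, here's a toy sketch in C. None of this is gcc's real API: `toy_ir`, `toy_pipeline`, `toy_insert_pass` and the pass names are all made up for illustration. It only shows the shape of the idea, where a plugin adds one more transformation in the middle of an existing sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the compiler's intermediate state. Hypothetical:
 * gcc's real representations are far richer than this. */
struct toy_ir {
    int value;
    char log[64];   /* records which passes ran, in order */
};

typedef void (*toy_pass_fn)(struct toy_ir *ir);

#define TOY_MAX_PASSES 8

struct toy_pipeline {
    toy_pass_fn passes[TOY_MAX_PASSES];
    size_t count;
};

/* The "built-in" passes: A -> A' -> A'' -> A''' */
static void pass_parse(struct toy_ir *ir)    { ir->value += 1; strcat(ir->log, "parse;"); }
static void pass_optimize(struct toy_ir *ir) { ir->value *= 2; strcat(ir->log, "optimize;"); }
static void pass_emit(struct toy_ir *ir)     { strcat(ir->log, "emit;"); }

/* The extra pass a "plugin" wants to run. */
static void plugin_pass(struct toy_ir *ir)   { ir->value += 10; strcat(ir->log, "plugin;"); }

/* Insert fn at index pos, shifting later passes down. Returns 0 on success. */
static int toy_insert_pass(struct toy_pipeline *p, size_t pos, toy_pass_fn fn)
{
    if (p->count >= TOY_MAX_PASSES || pos > p->count)
        return -1;
    for (size_t i = p->count; i > pos; i--)
        p->passes[i] = p->passes[i - 1];
    p->passes[pos] = fn;
    p->count++;
    return 0;
}

/* Run every pass in order over the same intermediate state. */
static void toy_run(const struct toy_pipeline *p, struct toy_ir *ir)
{
    for (size_t i = 0; i < p->count; i++)
        p->passes[i](ir);
}

/* Build the default pipeline, optionally hooking the plugin in after
 * the optimize pass, then run it. */
static struct toy_ir toy_compile(int with_plugin)
{
    struct toy_pipeline p = { { pass_parse, pass_optimize, pass_emit }, 3 };
    struct toy_ir ir = { 0, "" };

    if (with_plugin) {
        int rc = toy_insert_pass(&p, 2, plugin_pass);
        assert(rc == 0);
        (void)rc;
    }
    toy_run(&p, &ir);
    return ir;
}
```

Calling `toy_compile(1)` runs the plugin pass between optimize and emit, so the plugin sees whatever the earlier passes produced. In the real thing, that "whatever the earlier passes produced" is one of gcc's internal representations, which is why the hook point determines what you have to work with.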

Kernel development has a (not undeserved) reputation for being poorly documented and difficult to get into. Writing even a self-contained kernel module requires some knowledge about the rest of the code base, and things get much easier once you have some familiarity with it. I've found compiler plugins to be similarly difficult. I'm not working with the gcc code base on a regular basis, so figuring out how to do something practical with the internal structures feels like an uphill battle. I played around with writing a toy plugin to look at the representation and it took me forever to figure out how to get the root of the tree so I could do something as simple as call walk_tree. Once I figured that out, I spent more time figuring out how to actually do a switch on the node to see what type it was. Basically, I'm a beginner in an unfamiliar code base so it takes me a while to do anything.
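For anyone who hasn't seen the pattern, here's roughly what "walk the tree and switch on the node type" looks like, as a toy in plain C. This is not gcc code: `toy_node`, `toy_walk` and the `TOY_*` codes are hypothetical stand-ins I made up. gcc's own walk_tree also takes a callback, but its node type and signature are different from this sketch.

```c
#include <stddef.h>

/* Hypothetical node kinds, loosely named after the sort of thing a
 * compiler tree contains. */
enum toy_code { TOY_INTEGER_CST, TOY_VAR_DECL, TOY_PLUS_EXPR, TOY_MULT_EXPR };

struct toy_node {
    enum toy_code code;
    struct toy_node *left;
    struct toy_node *right;
};

struct toy_counts {
    int constants;
    int variables;
    int operators;
};

typedef void (*toy_visit_fn)(const struct toy_node *node, void *data);

/* Pre-order walk: visit the node, then its children. */
static void toy_walk(const struct toy_node *node, toy_visit_fn fn, void *data)
{
    if (node == NULL)
        return;
    fn(node, data);
    toy_walk(node->left, fn, data);
    toy_walk(node->right, fn, data);
}

/* The callback: switch on the node's code to decide what it is. */
static void count_by_code(const struct toy_node *node, void *data)
{
    struct toy_counts *counts = data;

    switch (node->code) {
    case TOY_INTEGER_CST:
        counts->constants++;
        break;
    case TOY_VAR_DECL:
        counts->variables++;
        break;
    case TOY_PLUS_EXPR:
    case TOY_MULT_EXPR:
        counts->operators++;
        break;
    }
}

/* Build the tree for (x + 2) * 3 by hand and count what's in it. */
static struct toy_counts toy_count_example(void)
{
    struct toy_node x       = { TOY_VAR_DECL,    NULL, NULL };
    struct toy_node two     = { TOY_INTEGER_CST, NULL, NULL };
    struct toy_node three   = { TOY_INTEGER_CST, NULL, NULL };
    struct toy_node sum     = { TOY_PLUS_EXPR,   &x,   &two };
    struct toy_node product = { TOY_MULT_EXPR,   &sum, &three };
    struct toy_counts counts = { 0, 0, 0 };

    toy_walk(&product, count_by_code, &counts);
    return counts;
}
```

The toy makes it look trivial because I built the tree myself and so I already know where the root is. In a real plugin, finding that root and learning which codes can show up are exactly the parts that took me the longest.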

Continuing the parallels between kernels and compilers, the internal ABI of gcc may change between versions, similar to how the kernel provides no stable internal ABI. If you want to support multiple compiler versions in your plugin, this results in an explosion of #if VERSION >= BLAH all throughout the code. External kernel modules have the same problem, but I'd argue it is slightly worse for compiler plugins. Kernel modules can be built and shipped for particular kernel versions but it's harder to require specific compiler versions.
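Here's a small sketch of what that version gating tends to look like. The `TOY_*` macros and the "old"/"new" split at 5.0 are invented for illustration. One simplification to be loud about: this checks the compiler that is building the file, via the predefined `__GNUC__` and `__GNUC_MINOR__` macros, whereas a real plugin cares about the version of gcc whose headers it is built against.

```c
#include <string.h>

/* Fold major and minor into one comparable number, e.g. 4.9 -> 4009. */
#define TOY_VERSION(major, minor) ((major) * 1000 + (minor))

/* Assumption: use the building compiler's predefined macros as a
 * stand-in for "the gcc version we are targeting". */
#if defined(__GNUC__)
#define TOY_COMPILER_VERSION TOY_VERSION(__GNUC__, __GNUC_MINOR__)
#else
#define TOY_COMPILER_VERSION 0
#endif

/* Hypothetical split: pretend an internal interface changed in 5.0. */
#if TOY_COMPILER_VERSION >= TOY_VERSION(5, 0)
#define TOY_API_NAME "new"
#else
#define TOY_API_NAME "old"
#endif

/* Report which code path was selected at compile time. */
static const char *toy_api_name(void)
{
    return TOY_API_NAME;
}
```

One gate like this is fine. The pain comes from needing one at every place an internal interface changed, for every version you want to support. Gathering the gates in a single compatibility header helps keep the rest of the plugin readable.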

With all this talk about how hard it is to use compiler plugins, there might be some questions about whether it's really worth it to support them at all. My useless answer is "it depends" and "isn't that the ultimate question of any feature". If you have a plugin that can eliminate bug classes, is it worth the maintenance burden? I say yes. One long-term option is to get features merged into the main gcc trunk so they don't have to be carried as plugins. Some of the tweaks are kernel specific though, so we're probably stuck carrying the more useful plugins. There is interest in new compiler flags and features so we'll have to see what happens in the future.