Kernel community management

| categories: fedora

I was at Open Source Summit last week (full trip report forthcoming) and, as always, one of the keynotes was Linus being interviewed by Dirk Hohndel. The topic of the kernel community and community management came up, along with whether Linus thought the kernel needed to do anything more to grow. Paraphrasing, his response was that the success of the kernel community shows it's generally doing fine. I disagree with some aspects of this and have actually thought a lot about what community management would mean for the kernel.

Community manager is a job that many modern open source projects above a certain size seem to have. If you google "open source community manager" you'll find lots of different descriptions of what the job entails. Lots of people who actually have experience with this (i.e. not me) have written and spoken about the work. The big thing for me is that community management is a deliberate choice to shape the community: you have to make the choice to build the community you want to see. Even if you don't have a community manager, developers are still doing community management every time they interact, because ultimately the developers are the community.

A better question than "does the kernel need a community manager" is "does the kernel need community management", to which I give an emphatic yes. The kernel has certainly been a successful project, but people have pointed out some issues. Again, community management is about making choices to actively build a community. You can't have a steady stream of new maintainers unless you actively work to make sure people are coming in. The kernel community is great at attracting other people who want to work on the kernel, but that may not be enough. The kernel is way behind in terms of continuous integration and other tools most people expect from open source projects these days. One area we need to grow is people who work on tools to support the kernel, and that pool may need to come from outside the traditional kernel development community.

The role of the TAB in community management is an interesting one. If you look at the description on that page, "The Technical Advisory Board provides the Linux kernel community a direct voice into The Linux Foundation’s activities and fosters bi-directional interaction with application developers, end users, and Linux companies." I know there are some unfavorable opinions (and conspiracy theories) out there about the Linux Foundation. What the Linux Foundation does well is help guide corporations in doing open source, which is very different from grassroots free software. A large number of companies have become very active members of the kernel community thanks to guidance and support from developers like those who are on the TAB. Enabling companies to contribute successfully is, practically speaking, a form of community building; companies have different needs and requirements than individuals. I do believe the members of the TAB deeply care about the kernel community, including the parts of it that aren't attached to any corporate entity. Figuring out how to set that direction may be less obvious though.

Anyone who says they have the magic solution to community management is lying, and I certainly don't have one. I do believe you have to shape your community with intentionality; just focusing on the code will not achieve that.

Flock 2018

| categories: fedora

Last week was Flock 2018. Highlights:

  • I gave a talk on the relationship between the Fedora and RHEL kernels. The short summary is that the two kernels are not closely related, despite the fact that they are supposed to be. I've been working with some people inside Red Hat to figure out ways to improve this situation. The goal is to have more Red Hat kernel developers participating in Fedora to make the Fedora kernel more beneficial for future RHEL work. I talked about some of the upcoming work such as syncing up core kernel configuration and packaging. This all seemed fairly well received.

  • RHEL + Fedora was a theme throughout many presentations. Josh Boyer and Brendan Conoboy gave a talk about aligning Fedora and RHEL across the entire OS. Some of this was about what you would expect (more testing etc.) but one of the more controversial points was suggesting redefining what makes up the system vs. applications. RPMs are nominally the smallest unit of a distribution but this doesn't quite mesh with the modern concepts of self-contained applications. You want to be able to update applications independently of the underlying system and vice versa. The talk was fairly high level about what to actually do about this problem but it generated some discussion.

  • Kevin Fenzi gave a talk about rawhide. As a relative newcomer to the project, I enjoyed hearing the history of how rawhide came about and what's being done to keep it moving forward. I'll echo the sentiment that rawhide is typically fairly usable, so give it a shot!

  • Dusty Mabe and Benjamin Gilbert gave a talk about Fedora CoreOS. I've always thought the CoreOS concept was a great idea and I'm pleased to see it continue on. Some of the talk was a bit of a retrospective about what worked and didn't work for CoreOS. Certain parts are going to be re-written. I enjoyed hearing the upcoming plans as well as getting to meet the CoreOS team.

  • Peter Robinson ran an IoT BOF. IoT is now an official Fedora objective and has a regular release. Part of the goal of the BoF was to talk about what it currently supports and what people want to do. Several people had great plans for utilizing some older hardware and I look forward to seeing more projects.

  • Peter Robinson and Spot gave a talk on the Raspberry Pi. Support for this device has come a long way and there are always new things happening. If you have a Raspberry Pi, give it a shot!

  • There was a session on Fedora in Google Summer of Code and Outreachy. Fedora was extremely successful with its interns this past summer and it was great to hear from the interns and their mentors. There is another round of Outreachy happening soon as well.

Once again, a great time. Thanks to the organizers for putting on a fantastic conference.

The cabbage patch for linker scripts

| categories: fedora

Quick quiz: what package provides ld? If you said binutils and not gcc, you are a winner! That's not actually the story; I just tend to forget which package to look at when digging into problems. This is actually a story about binutils, linker scripts, and toolchains.

Usually by -rc4 the kernel is fairly stable, so I was a bit surprised when the kernel was failing to build on arm64:

ld: cannot open linker script file ldscripts/aarch64elf.xr: No such file or directory

There weren't many changes to arm64, so it was pretty easy to narrow down the problem to a seemingly harmless change. If you are running a toolchain on a standard system such as Fedora, you will probably expect it to "just work". And it should, if everything goes to plan! binutils is a very powerful library though, and can be configured to emulate a bunch of less standard linkers. If you run ld -V you can see what's available:

$ ld -V
GNU ld version 2.29.1-23.fc28
  Supported emulations:

This is what's on my Fedora system. Depending on how your toolchain is compiled, the output may be different. A common variant is the 'bare metal' toolchain: (generally) a toolchain designed to compile binaries that run directly on the hardware without an OS. The kernel technically meets this definition and provides all of its own linker scripts, so in theory you should be able to compile the kernel with a properly configured bare metal toolchain. What the harmless-looking change did was switch the emulation mode from the Linux one to one that works with bare metal toolchains.

So why wasn't it working? Looking across the system, I found no trace of the file aarch64elf.xr, yet clearly the linker was expecting it. Because this seemed to be something internal to the toolchain, I decided to try another one. Linaro helpfully provides toolchains for compiling arm targets, and it turns out the Linaro toolchain worked. strace helpfully showed where it was picking up the file [1]:

lstat("/opt/gcc-linaro-7.1.1-2017.08-x86_64_aarch64-linux-gnu/aarch64-linux-gnu/lib/ldscripts/aarch64elf.xr", {st_mode=S_IFREG|0644, st_size=5299, ...}) = 0

So clearly the file was supposed to be included. Looking at the build log for Fedora's binutils, I could definitely see the scripts being installed. Further down the build log, there was also a nice rm -rf removing the directory where these scripts were installed to. This very deliberately exists in the spec file for building binutils with a comment about gcc. The history doesn't make it completely clear, but I suspect this was either intended to avoid conflicts with something gcc generated or it was 'borrowed' from gcc to remove files Fedora didn't care about. Linaro, on the other hand, chose to package the files with their toolchain. Given Linaro has a strong embedded background, it would make sense for them to care about emulation modes that might be used on more traditional embedded hardware.

For one last piece of the puzzle: if all the linker scripts are rm -rf'd, why does the linker work at all? Shouldn't it complain? The binutils source has the answer. If you trace through the source tree, you can find a folder with all the emulation options, along with the template used for generating the structure representation. There's a nice check for $COMPILE_IN to actually build a linker script into the binary. The same machinery is responsible for generating all the linker scripts and will compile in the default script. This makes sense, since you want the default case to be fast and not hit the file system.
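You can see this compiled-in default for yourself on any system with binutils: ld --verbose dumps the internal linker script, no external ldscripts/ directory required (this checks whatever ld is on your PATH):

```shell
# Dump the linker script compiled into ld itself. Even with the external
# ldscripts/ directory removed at package build time, this internal copy
# survives, which is why the default emulation keeps working.
ld --verbose | sed -n '/^SECTIONS/,/^}/p' | head -n 8
```

Only the non-default emulations (like aarch64elf) need to go hunting for scripts on disk, which is exactly why the seemingly harmless emulation switch broke the build.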

I ended up submitting a revert of the patch since this was a regression, but it turns out Debian suffers from a similar problem. The real takeaway here is that toolchains are tricky. Choose yours carefully.

  1. You also know a file is a bit archaic when it has a comment about the Solaris linker.

What's a kernel devel package anyway

| categories: fedora

One of the first concepts you learn when building open source software is the existence of -devel packages. You have package foo to provide some functionality and foo-devel for building other programs with the foo functionality. The kernel follows this pattern in its own special kernel way for building external modules.

First, a little bit about how a module is built. A module is really just a fancy ELF file compiled and linked with the right options. It has .text, .data, and other kernel specific sections. Some parts of the build environment also get embedded in modules. Modules are also just a socially acceptable way to run arbitrary code in kernel mode. Modules are loaded via a system call (either by fd or an mmapped address), and the individual sections (.text, .data etc.) get placed based on the ELF header. The kernel does some basic checks on the ELF header to make sure it's not complete crap (loading for an incorrect arch etc.) but can also do some more complicated verification. Each module gets a version magic string embedded in the ELF file. This needs to match the running kernel but can be overridden with a force option. There's also CONFIG_MODVERSIONS, which generates a CRC over functions and exported symbols to make sure they match the kernel that was built. If the CRCs in the module and kernel don't match, loading the module will fail.

Now consider an out of tree module. The upstream Linux kernel doesn't provide an ABI guarantee. In order to build an external module, you need to use the same tree that was used to build the kernel. You might be able to get away with using a different base but it's not guaranteed to work. These requirements are well documented. Actually packaging the entire build tree would be large and unnecessary. Fedora ends up packaging a subset of the build tree:

  • Kconfigs and Makefiles
  • header files, both generic and architecture specific
  • Some userspace binaries built at make modules_prepare time
  • The kernel symbol map
  • Module.symvers
  • A few linker files for some arches

Annoyingly, because each distribution does something different, all of this has to be done manually. This also means we find bugs when there are new dependencies that need to be packaged. I really wish we could just get away with building the module dependencies at runtime, but that doesn't work with the requirements above.
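For reference, this packaged subset is what the canonical out-of-tree build consumes. A minimal kbuild Makefile for an external module looks roughly like this (a sketch: demo is a placeholder module name, and the path assumes the usual layout where the distribution's devel package is reachable via /lib/modules/$(uname -r)/build):

```make
# obj-m tells kbuild to build demo.ko from demo.c in this directory.
obj-m := demo.o

all:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(CURDIR) modules

clean:
	$(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(CURDIR) clean
```

The -C hop into the devel tree is what pulls in the packaged Kconfigs, Makefiles, headers, and Module.symvers listed above; if any of those is missing from the package, this is the build that breaks.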

More kbuild for reproducible builds

| categories: fedora

I'm still working on patches to deal with build ids for the kernel. One issue I spent way too long figuring out was that if you just do a basic make of the kernel, some local environment information gets picked up on each build. This means the build id will not be the same between builds of the same source tree, because the SHA-1 sum over the contents is going to be different. Funnily enough, this means the problem of unique build ids is actually solved for vmlinux itself, but still not for modules or the vDSO.

Among the list of common commands you learn for Linux is uname. If you run uname -a you'll see something like

Linux localhost.localdomain 4.17.0-0.rc3.git4.1.fc29.x86_64 #1 SMP
Fri May 4 19:41:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

What's most interesting for this discussion is the subset shown by uname -v:

#1 SMP Fri May 4 19:41:58 UTC 2018

This is some version information about when this kernel was built. All of this can technically be namespaced, but by default these values come from defines generated at compile time, specifically UTS_VERSION. You can see how this gets generated in scripts/mkcompile_h.

The timestamp is fairly obvious, and the Kbuild infrastructure provides an easy override to set it to a fixed value (KBUILD_BUILD_TIMESTAMP, set to some string that can be passed to date -d). A bit more obtuse (at least for me) was the #1. This is a value stored in a file called .version, which gets updated every time scripts/ is run. It is, in fact, designed to be a release number to differentiate between builds. After too many hours of debugging, it also ends up feeling like some sort of achievement from a video game ("You have managed to compile the kernel .version times while working on this particular issue.") This can also be set with KBUILD_BUILD_VERSION.
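A rough sketch of what scripts/mkcompile_h does with these two inputs (simplified and paraphrased here, not the actual script) shows how they combine into the UTS_VERSION define:

```shell
# Paraphrased sketch of the UTS_VERSION assembly: the release counter
# comes from .version (treated as 1 when absent) and the timestamp from
# the environment override or the current date.
version=$(cat .version 2>/dev/null || echo 1)
timestamp=${KBUILD_BUILD_TIMESTAMP:-$(date -u)}
echo "#define UTS_VERSION \"#${version} SMP ${timestamp}\""
```

Both pieces default to values that change between builds, which is exactly why an untouched `make` never produces the same build id twice.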

The short and sweet summary is that if I actually want to verify things with build ids I can set KBUILD_BUILD_TIMESTAMP and KBUILD_BUILD_VERSION to fixed values to get a consistent build id across compiles. It's worth noting that modules can end up with a consistent build id without setting anything extra because they (typically) don't use UTS_VERSION anywhere. Now all I need to do is finish cleaning up some patches.
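Concretely, that pinning is just two environment variables before the build (the values here are arbitrary fixed strings; any date -d parseable timestamp works):

```shell
# Pin both sources of per-build variance; with these fixed, repeated
# builds of the same tree embed an identical UTS_VERSION string, so the
# vmlinux build id repeats as well.
export KBUILD_BUILD_TIMESTAMP='Fri May 4 19:41:58 UTC 2018'
export KBUILD_BUILD_VERSION=1
# make -j"$(nproc)"   # the actual kernel build, omitted here
echo "#${KBUILD_BUILD_VERSION} SMP ${KBUILD_BUILD_TIMESTAMP}"
```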
