When tools break the kernel

| categories: fedora

The kernel is remarkably self-contained. This makes it great for trying experiments and breaking things. It also means that most bugs are going to be self-contained. I say most because the kernel still has dependencies on other core system packages, and when those change, the kernel can break as well.

All the low-level packages on your system are usually so well maintained you don't even realize they are present[1]. binutils provides tools for working with binary files. Its assembler picks up support for things like new instruction set extensions, and changes like that can break the kernel unexpectedly. glibc is another package whose updates regularly break the kernel. The word 'break' here does not mean the changes from glibc/binutils were incorrect. The kernel makes a lot of assumptions about what's provided by external packages, and things are bound to get out of sync occasionally. This is a big part of the purpose of rawhide: to find dependency problems and get them fixed as soon as possible.

Updates to the compiler can be more ambiguous about whether or not a change is a regression. Compiler optimizations are designed to improve code but may also change behavior in unexpected ways. A good example is some recent optimizations related to constants. For those who haven't studied compilers, constant folding involves identifying expressions that can be evaluated to a constant at compile time. gcc provides a builtin function, __builtin_constant_p, to let code behave differently depending on whether an expression can be evaluated to a constant at compile time. This sounds fairly simple for cases such as __builtin_constant_p(0x1234), but it turns out to be much murkier for real code once deeper compiler analysis gets involved. The end result is that a new compiler optimization broke some assumptions about how the kernel was using __builtin_constant_p. One of the risks of using compiler builtin functions is that the behavior is only defined to a degree. Developers may argue that the compiler is doing something incorrect, but it usually turns out to be easier just to fix the kernel.
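
To make that concrete, here is a minimal userspace sketch (not the kernel's actual code) of how __builtin_constant_p is typically used: take a cheaper path when the compiler can prove the argument is a compile-time constant. The subtle part is the second call in main(), where the answer depends on inlining and optimization level rather than on the source alone, which is exactly the kind of assumption a new optimization can upset.

    #include <stdio.h>

    /*
     * Minimal sketch: take a "fast" path when the compiler can prove the
     * mask is a compile-time constant, otherwise fall back to the generic
     * path.
     */
    static unsigned int set_bits(unsigned int word, unsigned int mask)
    {
        if (__builtin_constant_p(mask) && mask == 0x1234)
            return word | 0x1234;   /* path the author expects for constants */
        return word | mask;         /* generic path */
    }

    int main(void)
    {
        unsigned int m = 0x1234;

        /* A literal is trivially constant... */
        printf("%x\n", set_bits(0, 0x1234));
        /* ...but whether 'm' counts as constant here depends on inlining
         * and optimization level, not just on what the source says. */
        printf("%x\n", set_bits(0, m));
        return 0;
    }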

Sometimes the compiler is just plain wrong. New optimizations may eliminate critical portions of code. Identifying such bugs is a special level of debugging. Typically, you end up staring at the code wondering how it could possibly end up in such a situation. Then you get the idea that staring at the assembly will somehow be less painful, at which point you notice that a critical code block is missing. This may be followed by yelling. For kernel builds, comparing what gets pulled into the buildroot of working and non-working builds can be a nice hint that something outside the kernel has gone awry.

As a kernel developer, I am appreciative of the fantastic maintainers of the packages the kernel depends on. Every time I've reported an issue in Fedora, the maintainers have been patient and have helped me figure out how to gather the right debugging information to determine whether a problem is in gcc/binutils/glibc or the kernel. The kernel may be self-contained but it still needs other packages to work.


  1. Until you remove them with the --force option; then you really miss them. 


Boring rpm tricks

| categories: fedora

Several of my tasks over the past month or so have involved working with the monstrosity that is the kernel.spec file. The kernel.spec file is about 2000 lines of functions and macros to produce everything kernel related. There have been proposals to split the kernel.spec up into multiple spec files to make it easier to manage. This is difficult to accomplish since everything is generated from the same source packages, so for now we are stuck with the status quo, which is roughly macros all the way down. The wiki has a good overview of everything that goes into the kernel.spec file. I'm still learning about how RPM and spec files work, but I've gotten better at figuring out how to debug problems. These are some miscellaneous tips that are not actually novel but were new to me.

Most .spec files override a set of default macros. The default macros are defined at @RPMCONFIGDIR@/macros, which typically expands to /usr/lib/rpm/macros. More usefully, you can put %dump anywhere in your spec file and it will dump out the current set of macros that are defined. While we're talking about macros, be very careful about the difference between a macro being undefined and being defined to 0. This is a common mistake in general, but I seem to get bitten by it more in spec files than anywhere else.

Sometimes you just want to see what the spec file looks like when it's expanded. rpmspec -P <spec file> is a fantastic way to do this. You can use the -D option to override various macros. This is a cheap way to see what a spec file might look like on other architectures (Is it the best way to see what a spec file looks like for another arch? I'll update this with a note if someone tells me another way).

One of my projects has been looking at debuginfo generation for the kernel. The kernel invokes many of the scripts directly for historical reasons. Putting bash -x before a script to make it print out the commands makes it much easier to see what's going on.

Like I said, none of these are particularly new to experienced packagers but my day gets better when I have some idea of how to debug a problem.


Single images and page sizes

| categories: fedora

"The year of Linux on the desktop" is an old running joke. This has resulted in many "The year of X on the Y" spin off jokes. One of these that's close to my heart is "The year of the arm64 server". ARM has long dominated the embedded space and the next market they intend to capture is the server space. As some people will be more than happy to tell you, moving from the embedded space to the enterprise class server space has involved some growing pains (and the occasional meme). Most of the bickering^Wdiscussion comes from the fact that the embedded world has different requirements than the server world. Trying to support all requirements in a single tree often means making a choice for one versus the other.

The goal with a distribution like Fedora is to support many devices with as few images as possible. Producing a separate image means more code to maintain, more QA, and generally more work. These days we take it for granted that multiple ARM devices can be booted on the same kernel image. This was not always the case. Prior to 2012 or so, the platform support that lived under arch/arm/ was not designed to work in a unified fashion. Each vendor had a mach-foo directory which contained code that (usually) assumed only mach-foo devices would exist in the image. A good example of this is header files. Many devices would have header files under arch/arm/mach-foo/include/mach/blah.h. The way the include path was structured, you could not also compile a device with arch/arm/mach-bar/include/mach/blah.h since there would be two headers with the same name. Many of the important parts of the platform definition (e.g. PHYS_OFFSET) were #defines which meant that platforms with different needs could not be compiled together. Driven by a combination of a move towards devicetree and the realization that none of this was sustainable, the ARM community decided to work towards a single kernel image. Fast forward to today, and single image booting is standard thanks to a bunch of hard work.
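
As a simplified illustration of the header problem (mach-foo and mach-bar are the hypothetical names from above, and the addresses are invented), each platform shipped its own copy of headers like mach/memory.h:

    /* arch/arm/mach-foo/include/mach/memory.h */
    #define PHYS_OFFSET 0x20000000UL    /* RAM starts here on foo boards */

    /* arch/arm/mach-bar/include/mach/memory.h */
    #define PHYS_OFFSET 0x80000000UL    /* RAM starts somewhere else on bar */

    /*
     * Only one mach/ include directory could be on the include path, and
     * PHYS_OFFSET was baked in at compile time, so one binary could not
     * serve both platforms. The multi-platform work replaced compile-time
     * constants like this with values discovered at runtime.
     */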

arm64 learned from the lessons of arm32 and has always mandated a single image. You can see this reflected in the existence of a single defconfig file under arch/arm64/configs/defconfig. This is designed to be a set of options that are reasonable for most platforms. It is not designed to be a production-ready, fully optimized configuration file. This gets brought up occasionally on the mailing list when people try to submit changes to the defconfig file for optimization purposes.

Fedora is a production system and it does need to be optimized. There's been fantastic work recently to support more single-board computers like the Raspberry Pi in Fedora. Thanks to single image efforts, the same kernel can boot on both a Raspberry Pi and an enterprise-class ARM server. Booting isn't the same as working well, though. Single-board computers can come with as little as 512MB of RAM. Enterprise servers have significantly more.

Consider the choice of PAGE_SIZE for Fedora. The page size is the smallest unit of physical memory that can be mapped by a page table entry. aarch64 has several options here, 4K being the most common and 64K giving better TLB performance[1]. A larger page size also means more wasted space. Many allocations need to be rounded up to PAGE_SIZE for one reason or another, even if they use nowhere near that much space. This can quickly add up to megabytes of wasted memory. A server with several gigabytes of memory probably won't notice the impact, but a system with 512MB will start to perform poorly due to lack of RAM. Choosing one page size over the other is going to be detrimental to one type of machine.
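
A back-of-the-envelope sketch of that rounding effect; the allocation count and size here are invented for illustration, not measured from a real system:

    #include <stdio.h>

    /* Round an allocation up to PAGE_SIZE and report the total slack. */
    static unsigned long waste(unsigned long page_size,
                               unsigned long alloc_size, int count)
    {
        unsigned long rounded = (alloc_size + page_size - 1) & ~(page_size - 1);
        return (rounded - alloc_size) * count;
    }

    int main(void)
    {
        /* 1000 allocations of 5KB each, e.g. small per-device buffers */
        printf("4K pages:  %lu KB wasted\n", waste(4096, 5 * 1024, 1000) / 1024);
        printf("64K pages: %lu KB wasted\n", waste(65536, 5 * 1024, 1000) / 1024);
        return 0;
    }

With 4K pages the slack is around 3MB; with 64K pages the same workload wastes closer to 60MB, which matters a lot on a 512MB machine.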

For a more degenerate case of PAGE_SIZE problems, we have to look at CMA (Contiguous Memory Allocator). CMA allows the kernel to get relatively large (think 8MB or more) physically contiguous allocations. Systems that use CMA will set up one or more designated CMA regions. The memory in a CMA region can be used by the system as normal with a few restrictions. When a driver wants to allocate contiguous memory from a CMA region, the kernel will use underlying page migration/compaction[2] to allocate the block of memory. To help ensure the migration can succeed, CMA regions have a minimum size. When PAGE_SIZE is larger, the minimum size goes up as well. The particular combination of options Fedora uses makes the minimum size go up to 512MB when a larger PAGE_SIZE is used on arm64. Given other requirements for CMA, this essentially means CMA can't be used on smaller memory systems if a larger page size is used since the alignment requirements are too strict. And thus we get people making choices about what gets supported.
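
Here's a hedged sketch of the arithmetic behind that 512MB figure. CMA regions are aligned to the largest buddy-allocator block, roughly PAGE_SIZE shifted by MAX_ORDER - 1; the MAX_ORDER values below (11 for 4K pages, 14 for 64K pages with transparent hugepages) are my assumption about the configs involved, not something stated above:

    #include <stdio.h>

    /* Rough minimum CMA granularity: PAGE_SIZE << (MAX_ORDER - 1). */
    static unsigned long cma_min_bytes(unsigned long page_size, int max_order)
    {
        return page_size << (max_order - 1);
    }

    int main(void)
    {
        printf("4K pages:  %lu MB minimum\n", cma_min_bytes(4096, 11) >> 20);
        printf("64K pages: %lu MB minimum\n", cma_min_bytes(65536, 14) >> 20);
        return 0;
    }

That works out to roughly 4MB with 4K pages and 512MB with 64K pages, which is why the larger page size effectively rules out CMA on small-memory boards.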

One way to avoid the need to make multi-platform trade-offs is to make more options runtime selectable. This is popular with many debug features that can be built in with the appropriate CONFIG_FOO option but only actually run when an argument is passed on the kernel command line. This doesn't work for anything that needs to be determined at compile time though. PAGE_SIZE almost certainly falls into this category, as do many other constants in the kernel. The end result is that you will never be able to find one true build configuration that's optimal for all situations. The best you can hope to do is foist the problem off on someone else and let them make the trade-offs so you don't have to. Or evaluate what your requirements actually are and go from there. Either works.
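
A simplified kernel-style sketch of that runtime-gated debug pattern; CONFIG_FOO_DEBUG and the foo_debug parameter are made-up names (riffing on the CONFIG_FOO above), not a real kernel feature:

    /*
     * Compiled in when CONFIG_FOO_DEBUG is set, but inert unless the machine
     * is booted with "foo_debug" on the kernel command line.
     */
    #ifdef CONFIG_FOO_DEBUG
    #include <linux/init.h>
    #include <linux/printk.h>
    #include <linux/types.h>

    static bool foo_debug_enabled;

    static int __init foo_debug_setup(char *str)
    {
        foo_debug_enabled = true;
        return 1;
    }
    __setup("foo_debug", foo_debug_setup);

    void foo_debug_check(void)
    {
        if (!foo_debug_enabled)
            return;    /* built in, but off unless requested at boot */
        pr_info("foo: running extra debug checks\n");
    }
    #endif /* CONFIG_FOO_DEBUG */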


  1. If you don't believe page size will actually make a difference, try talking to a hardware engineer and asking for some graphs. Better yet, ask them for some hardware optimized for a larger page size and then use a smaller page size. 

  2. LWN has some older articles about the technologies for the interested. I should also write more about CMA some time. 


Complaining about the kingdom of kernel

| categories: complaining, fedora

Jonathan Corbet of LWN gave a keynote at Linaro Connect about The kernel's limits to growth. The general summary was that the kernel had scaling problems in the late 90's (a single "B"DFL does not scale) but the developers figured out a method that was more sustainable. There's a growing concern that we're about to hit another scaling problem with insufficient maintainers. Solving this has gotten some attention of late. I have a lot of thoughts about maintainership and growing in the kernel (many of which can be summarized as "well nobody has told me to stop yet") but this is not that blog post. The talk mentioned that kernel development can be described as "a bunch of little fiefdoms". This is a superb metaphor for so many things in Linux kernel land.

The terrible secret of Linux kernel development is that there really isn't a single kernel project. Sending stuff to just LKML is unlikely to go anywhere. Using get_maintainer.pl will tell you who the maintainers are and what mailing lists to use[1], but it won't tell you how each maintainer actually maintains or what their preferences are. There are some common documented guidelines for getting things in, but there always seems to be an exception. The networking stack has a long list of the ways it is different. Some subsystems use patchwork as a method for tracking and acking patches. The ARM32 maintainer has his own separate system for tracking patches. DRM is embracing a group maintainer model.

The end result is that sending patches to different subsystems means figuring out a different set of procedures. This problem is certainly not unique to the kernel. The hardest part of open source is always going to be the social aspect and dealing with how others want to handle a project. No one tool is ever going to solve this problem. The kernel seems to be particularly in love with the idea of letting everyone do their own thing so long as it doesn't make anyone else too mad. I'm sure this worked great when all the kernel developers could fit in one room but these days having one set of procedures for the entire kernel would make things run much smoother.

If the kernel community is made up of fiefdoms, then the kernel community itself is a strange archaic kingdom[2]. Many of Ye Olde kernel developers love to talk about why e-mail is the only acceptable method for kernel development. I'm going to pick on this talk for a bit. I can't deny that many of the other options aren't great. I refuse to believe that github having pull requests separate from the mailing list is actually worse than each subsystem having a completely separate mailing list though. Good luck if someone forgets to Cc LKML or if your mailing list[3] doesn't have patchwork. Having everything go to a mailing list also doesn't guarantee anyone will actually review it or learn from it. The way to learn from an open source community is to make deliberate time to read and review what's being submitted. People can learn whatever tool is available to make this happen if they want to be engaged with the community. Maybe this is e-mail, maybe this is github. Whatever. The harder part is making sure people want to use the preferred communication method to review what's going on in the community.

Once again, I seem to have come around to the point of community building, something the Linux kernel community still seems to struggle with. The kernel community's problems are well documented at this point and I don't feel like enumerating them again. The scaling problems of the kernel are only going to get worse if nobody actually wants to stick around long enough to become a maintainer.


  1. Among my list of petty grievances is that mailing lists can be hosted on a variety of servers so there isn't always a unified place to look at archives. RIP GMANE. 

  2. Insert Monty Python and the Holy Grail joke here 

  3. I love you linux-mm but either your patchwork is incredibly well hidden from me or it doesn't exist, both of which make me sad. 


Maintainerati

| categories: fedora

I spent last Wednesday hanging out in San Francisco for the first annual maintainerati event. The premise was that there are a lot of open source maintainers out there, but events are usually separated by technology areas: Javascript framework maintainers may never meet programming language maintainers even if their problems are similar. The idea was to give open source maintainers a chance to vent and problem solve with each other.

The event was structured as an 'unconference'. I describe it as a slightly more structured hallway track. We started the morning doing 'dot voting' on topics people wanted to talk about and then broke into groups to discuss the topics that got the most votes. I chose to join the discussion about recruiting newcomers and maintainers. We started with some discussion about what counts as a contribution, and the pros and cons of structuring the contribution process and, eventually, committer rights. There's no hard and fast rule about when people can/should get commit rights and it mostly comes down to relationships; you need to build relationships with existing maintainers, and existing maintainers need to build relationships with and mentor new committers. This led to quite a bit of discussion about free vs. paid and company vs. non-company contributors. It's a lot easier to build relationships if you can set up a meeting with a maintainer in your company, but that doesn't work for outside contributors. There's also the question of trying to recruit volunteers for your sponsored project. Doing work 'for exposure' is problematic and exploitative, yet open source has this idea of doing work for the inherent joy of open source and learning. Promoting unpaid contributions needs to be done very carefully if it is done at all. We ended up running out of time and I think the discussion could certainly have gone longer.

There was a second session in the afternoon about problematic communities. This one is unfortunately near and dear to my heart. We started out defining what makes a community toxic. A big point was that bad behavior prevents the community from making progress. Many of the discussion points applied not just to open source but to other communities that tend to overlap with it. Codes of conduct are a necessity to make dealing with toxic behavior possible. There was some discussion about how specific these guidelines should be, and interestingly it was pointed out that having slightly less specific guidelines (but not too vague) may help keep people from purposely hanging out at the edge of acceptable behavior. If your larger community is problematic, it can be helpful to work on making a smaller subset welcoming and let that influence the larger group. I appreciated everyone who took the time to contribute to the discussion.

Outside the structured conversations, I spent time talking about empathy. Several attendees either were or had been in first-line customer support positions. To succeed in that kind of work, you need to have (or quickly build) empathy skills to keep customers satisfied. Developers are not well known for their empathy skills. I'm guilty of this myself; empathizing without emotionally draining myself is something I'm constantly working on. Figuring out how to teach empathy to others is a challenge. One of the ideas that came up was the need to get outside your comfort bubble. Travel and moving were common ways people cited to force themselves into new experiences. The traditional developer mindset also tends to be very black and white (hi, guilty here too). Most important was the desire to keep improving this skill and not write it off as unnecessary.

There were plenty of other conversations I'm sure I've forgotten about. Notes are available on github and will be added as people get around to it. I really hope to see this conference happen again. It's filling a space for important conversations about non-technical topics that tend to get sidelined elsewhere. I met so many cool people and left with a lot to think about. My biggest thanks to the organizers.
