There will always be hardware bugs

| categories: hottakes, fedora

By now everyone has seen the latest exploits, Meltdown and Spectre, complete with logos and full academic papers. The gist is that side-channel attacks on CPUs are now actually practical instead of mostly theoretical. LWN (subscribe!) has a good collection of posts about the technical details and mitigations. Because this involves hardware and not just software, the fixes get more complicated.

In my previous job, I worked on kernels for mobile phones, which meant working with new hardware. I love working with hardware, but one thing you learn pretty quickly is that hardware will have bugs. Sometimes the hardware team has already found them and will give you a workaround. Other times you spend weeks chasing weird crashes and going back and forth with the hardware team. One of the challenges of working across teams in any area is communicating your own domain expertise while listening to others'. There can be a lot of "well, how about we just..." and talking past each other.

Once upon a time, some hardware was not working the way we expected and we were talking to the hardware team. They were having trouble reproducing the behavior seen on our complex Android stack, so we ran a series of experiments on our setups. Much of the actual work was figuring out how to translate the requests from the hardware team into something reasonable for the kernel (e.g. where exactly does "after each TLB flush" apply?). Sometimes the experiments weren't feasible at all given how the kernel was written.

If you are lucky (or unlucky, depending on your view), you may find a hardware bug. The question then becomes what to do about it. There may, again, be back and forth about what's actually an acceptable workaround. "Just run this sequence of code sometimes" may sound simple to the hardware team but can be impractical to actually implement in the kernel. The performance penalties can be high if part of the microarchitecture needs to be turned off. Sometimes the answer turns out to be "pretty please don't run this sequence of code, which should never be generated by a reasonable compiler". Obviously, if an issue has security implications you may need to just take the performance hit, but not implementing a workaround can be a valid decision.

Part of the discussion around all this has been a call for more open source hardware, which is absolutely a worthwhile goal. Most processors support adjusting various microarchitectural features. This is mostly for verification purposes, but it's also useful if there's a need to disable a feature such as a prefetcher or branch predictor. The microarchitecture is usually considered proprietary, so it's next to impossible to figure out how to make changes like that without consulting the hardware team. An open source hardware design would allow better insight into the microarchitecture. What most people miss about open hardware, though, is that you still have all the problems of hardware: unless you're running on an FPGA, you can't just drop in a new hardware revision immediately. You're still going to have to implement software workarounds. The value of open hardware comes from freedom of licensing, not freedom from bugs.

Calling all this an "Intelocolypse" is deeply unfair, as basically all modern processors from multiple vendors are affected; it's a fundamental flaw in most implementations. Each vendor/architecture could certainly provide its own workaround, but because of the severity, there are proposals to fix this in generic kernel code. As has been mentioned though, many of the fixes are still under review, so we'll have to see what happens. A big shout out to all the hardware and software developers who spent time coming up with these proposals.


Build ids and the Fedora kernel

| categories: fedora

One of the overlooked aspects of packaging is how much can be handled automatically for relatively simple packages. Debugging symbols are a good example. For many packages that carry debugging information (compiled with -g), the rpm build process can automatically separate the debugging symbols from the binaries with no extra work. The rpm team has put in a lot of work over the years to make this happen.
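Under the hood, the separation step looks roughly like this sketch (find-debuginfo.sh uses the elfutils tools; the file name foo is a placeholder):

# Split the debug info into a detached file, leaving a
# .gnu_debuglink section in the binary pointing at it.
$ eu-strip --remove-comment -f foo.debug foo
# The link back to the detached debug info shows up in the section list.
$ eu-readelf -S foo | grep debuglink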

The kernel is unfortunately not a simple package. It has accumulated a bunch of custom macros and functions to handle its debuginfo generation, even as rpm itself has improved. I did some work on cleaning this up earlier this year, with review and feedback from Mark Wielaard. One of the changes for Fedora 27 was parallel debuginfo, a feature that lets you have multiple versions of debugging symbols installed at once. Given that you can have multiple kernel versions installed at once, this is something that would clearly be valuable for the kernel.

One of the links between a binary and its debugging information is a Build ID. To borrow from the link, "But I'd like to specify it explicitly as being a unique identifier good only for matching, not any kind of checksum that can be verified against the contents". By default, passing --build-id to the linker will produce a sha1 sum of parts of the binary that gets put in an ELF note. You can see this with readelf -n:

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size   Description
  GNU                  0x00000014   NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: bbe4ba9f6ebc37ba8764904290077ec7e78ec8a9
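As a quick end-to-end sketch (foo is a placeholder), you can generate and inspect a build id yourself:

$ echo 'int main(void) { return 0; }' > foo.c
# Ask the linker for a sha1-based build id (Fedora's gcc passes
# --build-id by default, so the flag here is just being explicit).
$ gcc -Wl,--build-id=sha1 -o foo foo.c
$ readelf -n foo | grep 'Build ID'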

Part of the trick with the sha1 sum is that it makes the build id reproducible: building with the same environment will produce the same binaries and therefore the same build id. Consider the case of a minor version bump to a package with no change in source code or buildroot. Depending on the package, this may very well produce the same binary, which will have the same sha1 build id. If the build id is used as part of the file structure of the debuginfo, this can lead to package conflicts. Part of the work for parallel debuginfo was making the build id unique. As described at the link, fixing this involved changing debugedit to take the N-V-R as a hash seed. This gets run via find-debuginfo.sh [1] to fix up the build id and other debug paths.
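As a rough sketch of what that ends up looking like (the exact invocation lives in the rpm scripts; the flags, paths, and the example N-V-R here are from memory, so treat them as assumptions and check find-debuginfo.sh for the real thing):

# Recompute the build id (-i), mixing the package N-V-R in as a seed
# so an otherwise identical rebuild still gets a distinct build id.
$ debugedit -b /builddir/build/BUILD -d /usr/src/debug \
    -i --build-id-seed=kernel-4.14.11-300.fc27 ./foo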

Now enter the kernel. The kernel has the vDSO which gets loaded automatically with each program. The vDSO is encoded in the kernel as a shared object. As a shared object, it also has its own build id. When I was doing the work earlier in the year, Mark Wielaard gave a quick way to show this:

# Build id of the vDSO mapped into the currently running shell ($$)
$ eu-unstrip -n -p $$ | grep vdso | cut -d ' ' -f 2
# Build id recorded in the installed vDSO debuginfo file
$ eu-readelf -n /usr/lib/debug/lib/modules/`uname -r`/vdso/vdso64.so.debug | grep "Build ID"

debugedit doesn't know about the vDSO image embedded in the kernel, but it will happily update the build id of the standalone vdso.so binary. This ends up breaking the debug link for the vDSO if we make the build id unique, since the debuginfo's build id no longer matches that of the in-kernel vDSO.

So the end result of this story is that the kernel can't completely handle parallel debuginfo yet. The build id of the vDSO embedded in the kernel would need to be made unique as well, and there isn't a good solution for that yet. The rpm developers are aware of this problem, but all of them are of course busy with other tasks (they're always very helpful with questions though!). I have some ideas about how to approach this, so ideally, if I get some time, I can propose something for review.


  1. The default invocation of find-debuginfo.sh can be found in macros.in


Fun with Le Potato

| categories: fedora

At Linux Plumbers, I ended up with a Le Potato SBC. I hadn't really had time to actually boot it up until now. The vendor supports a couple of distributions, which seem to work fine if you flash them on. I mostly like SBCs for having actual hardware to test on, so my interest tends to be in how easily I can get my own kernel running.

Most of the support is not upstream right now but it's headed there. The good folks at BayLibre have been working on getting the kernel support upstream and have a tree available for use until then.

The bootloader situation is currently less than ideal. All the images run the vendor-provided U-Boot, which is a few years out of date and carries a bunch of out-of-tree patches. This is unfortunately common for many boards. There wasn't much information about U-Boot, so I asked on the forums. I got a very prompt and helpful response that U-Boot upstreaming is also in progress. The first series looks like it's been reviewed and comes with a very detailed README on how to actually build and install. This is important because you have to do some work to actually pick up the vendor firmware ('libre').

So here's roughly what I did to get my own code running. I'll note that this is just enough to get output on serial. Make sure you have an SD card handy:

  • Download mainline U-Boot
  • Apply the base series
  • Follow the instructions in the README for compiling the base U-Boot (the "u-boot compilation" section), roughly as sketched below. I should note that I didn't feel like grabbing a bare-metal toolchain, so I just used the package Fedora provides for cross compilation (CROSS_COMPILE=aarch64-linux-gnu-). YMMV.
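Putting those steps together, the mainline build was roughly the following (the defconfig name is from the patch series as I recall it, so double-check it against the README):

$ make libretech-cc_defconfig
$ make CROSS_COMPILE=aarch64-linux-gnu- -j$(nproc)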

The "Image creation" steps have a few gotchas, which I'll summarize:

  • wget the 4.8 toolchains. Before I asked on the forums about U-Boot, I experimented with compiling the U-Boot from the BSP with a newer toolchain. This was a bit of a nightmare, so I just went ahead and used their suggested toolchain.
  • The toolchains are 32-bit binaries, so you need to install 32-bit libs (dnf install glibc.i686 libstdc++.i686 zlib.i686)
  • The vendor U-Boot expects the toolchains to be in your path, so set your PATH accordingly.
  • Clone the vendor U-Boot
  • Compile the vendor U-Boot (make gxl_p212_v1_defconfig && make)
  • Go back to your mainline U-Boot.
  • Run all the commands up to the dd commands (I put them in a shell script). Note that the line with acs_tool.pyc needs to be prefixed with python.
  • Run the dd commands, setting the device as appropriate (sketched below).
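For reference, the dd step in the README looks roughly like this (u-boot.bin.sd.bin and /dev/sdX are placeholders from memory; triple-check the device name before writing anything):

# Write the image starting at sector 1 so the MBR partition table
# in sector 0 isn't clobbered...
$ dd if=u-boot.bin.sd.bin of=/dev/sdX conv=fsync,notrunc bs=512 skip=1 seek=1
# ...then write the first 444 bytes of boot code, stopping short
# of the partition table.
$ dd if=u-boot.bin.sd.bin of=/dev/sdX conv=fsync,notrunc bs=1 count=444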

You now have an SD card with U-Boot and firmware on it. Of course, you still need a kernel.

  • Clone the tree
  • make ARCH=arm64 defconfig
  • For a rootfs, I set CONFIG_INITRAMFS_SOURCE to point to the buildroot environment I use with QEMU.
  • make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
  • Based on comments on the forums, I converted the kernel to a uImage, which U-Boot understands:

    path/to/u-boot/tools/mkimage -A arm64 -O linux -C none -T kernel -a 0x01080000 -e 0x01080000 -d path/to/kernel/arch/arm64/boot/Image uImage

Fedora does provide mkimage in the uboot-tools package, but given we're compiling U-Boot anyway, I went ahead and used the binary from that build.

  • Insert the SD card with U-Boot into your computer and mount it (the first partition should be FAT16)
  • Copy the uImage and arch/arm64/boot/dts/amlogic/meson-gxl-s905x-libretech-cc.dtb to the SD card (a sketch follows below)
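A minimal sketch of those two steps, assuming the card shows up as /dev/sdX and mounting at /mnt:

$ sudo mount /dev/sdX1 /mnt
$ cp uImage /mnt/
$ cp arch/arm64/boot/dts/amlogic/meson-gxl-s905x-libretech-cc.dtb /mnt/
$ sudo umount /mnt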

Your SD card should now have the kernel and devicetree on it. If all has gone well, you should be able to insert it into the board and get to the U-Boot prompt. Based on the comments on the forums, I ran the following at the U-Boot prompt:

    loadaddr=1080000
    fdt_high=0x20000000
    fatload mmc 1:1 ${loadaddr} uImage
    fatload mmc 1:1 ${fdt_high} meson-gxl-s905x-libretech-cc.dtb
    bootm ${loadaddr} - ${fdt_high}

And it worked. Obviously this is a pretty simple setup, but it shows that you can get something custom going if you want. I might try throwing a version of Fedora on there to experiment with the multimedia hardware. I doubt this board will get official support unless the U-Boot firmware situation improves.


Some reading

| categories: fedora

Like all good developers, I do not know everything and will happily admit this. I've spent some time recently reading a couple of books to help fill in some gaps in my knowledge.

I've complained previously about disliking benchmarking. More generally, I'm not really a fan of performance analysis. I always feel like I get stuck going from "it's running slower, why?" to anything beyond the basics. I watched a video of Brendan Gregg's talk from Kernel Recipes and ended up going down the black hole [1] of reading his well-written blog. He does a fantastic job of explaining performance analysis concepts as well as the practical tools to do the analysis. He wrote a book several years ago and I happily ordered it. The book explains how to apply the USE method to performance problems across the system. This was helpful to me because it provides a way to generate a list of things to check and how to check them, which addresses the "stuck" feeling I get when dealing with performance problems. The book also provides a good high-level overview of operating systems concepts. I'm always looking for references for people who are interested in kernels but don't know where to start, and I think this book could fill that niche. Even though the book has been out for several years now, I was very excited to discover it.

I consider networking the biggest black hole of mystery in the kernel. I've never been a network admin or sysadmin for anything except my own Linux machines, and most of my networking debugging involves googling for the correct command to type. I ended up buying a copy of Volume I of TCP/IP Illustrated. This is the canonical text and it's quite dense, but for my learning style it's been helpful for grasping concepts. I have a better idea of exactly how packets flow and what various networking functions (e.g. VPNs) actually do. It's not a source of much practical experience though, so I want to find some tasks to apply what I've learned. Maybe I'll write more if I find something interesting.


  1. I suffer from https://xkcd.com/214/ syndrome for all internet content. 


OSS/Ksummit 2017

| categories: fedora

Last week was kernel summit in Prague. Based on feedback from Linus and other people, kernel summit was restructured as a two-day open technical forum along with a half-day "maintainer summit". Open Source Summit Europe was happening at the same time and I attended some things there as well.

Darren Hart gave a talk about x86 platform drivers. Darren is the current maintainer of the x86 platform drivers. He gave a nice overview of what a platform driver actually is (a bunch of glue) and some history about how big or small drivers can be. One of the sticking points with drivers in general is that most hardware vendors really only focus on Windows, and the driver philosophy there is different from Linux. This results in Linux needing to play catch-up and work around firmware that was only tested on Windows (see, for example, the classic "To be filled by O.E.M." vendor string). Hardware vendors can make this easier by using standard interfaces and also open sourcing firmware, something the Fedora community cares deeply about.

Laurent Pinchart ran a session called "Bash the Kernel Maintainers". This was designed to be a feedback session for attendees to express opinions about the kernel process. Most of the feedback was things I've heard before (and even expressed myself). Submitting patches as a new contributor is still intimidating. There was some discussion about making it easier for users to access the 0-day bot without having to submit something publicly. The topic of bots reminded me of some of the themes I heard when I was at maintainerati. The complete lack of consistency among maintainers was a big theme: there is no one rule about where to send a patch, when to ping a maintainer for a review, or even how to get status on a patch. This is still one of my pet peeves as a full-time kernel developer. Laurent took great notes and gave a readout at the kernel maintainer's summit.

For the first day of kernel summit, Steven Rostedt and Mathieu Desnoyers talked about the tracing ABI. The kernel has a (reasonably) consistent rule that userspace is an ABI and you do not break userspace. This makes sense for things like traditionally compiled userspace programs and syscalls. The tracing infrastructure in the tree has grown over the years, which has made debugging much easier. If that tracing infrastructure gets exposed to userspace, though, it might end up looking like an ABI, which means that if tracepoints get changed or removed, tools that depend on them might break. The presenters argued that even if it is an ABI, tools developers are perfectly willing to recompile against each kernel version. Linus disagreed. LWN did a much more complete writeup of the topic.

Peter Robinson and Jim Perrin ran a BOF on the state of ARM on Fedora. This was mostly a brief status update (Fedora rocks) with time to ask questions. People were mostly interested in device support so there was some discussion and explanation about what is required to support these things in Fedora (reasonable mainline graphics support). Great session.

The kernel Outreachy interns each gave presentations on what they worked on. There were six different projects across the kernel, from documentation to IIO. I always enjoy hearing Outreachy interns talk about what they accomplished. For many of them, this was their first contribution to open source or even kernel programming. Outreachy is a fantastic program and it shows what can be accomplished with a supportive mentor.

Thorsten Leemhuis gave a presentation on kernel regression tracking. Nobody had been keeping track of regressions in the kernel for years until Thorsten picked up the task earlier this year. It's a very valuable but thankless job. Thorsten talked about some of the difficulties, including getting people to actually send him regressions and keeping track of what he did find. There was talk of creating a regressions@ mailing list, which should hopefully make reporting easier. The topic of bugzilla came up once again and it sounded like there was agreement to improve the landing page to make bug reporting easier. As a Fedora maintainer, regressions are near and dear to my heart so I plan on keeping an eye on this.

Konstantin Ryabitsev, a sysadmin for kernel.org, gave a presentation on security hygiene. The intended audience was kernel developers, but everything applies to developers in general. PGP is still the most widely used mechanism out there and the kernel community relies on it for trust: Linus signs all releases, as does Greg KH for the stable releases. There was some discussion about the trust in git pull requests and how much signing should actually be happening. Konstantin is a big promoter of hardware tokens for storing your subkeys. I have a yubikey but haven't made much use of it (and apparently need to update my keys thanks to the latest flaw). I really enjoyed this talk, mostly because the security suggestions were very practical, even with the acknowledgment that some problems, like video conferencing, are still insecure. I'll be reimaging my laptop soon, so I'll hopefully be able to implement some of the suggestions.

The maintainer's summit was a half-day event much closer to what the kernel summit used to be. This was the first time the new format was used and overall it went well. There were only about 30 of us, which made discussions much easier, especially on topics like regression tracking and maintainer bashing. All the discussion felt productive and I think we made useful progress. I expect LWN to have a more complete writeup in the near future.

Once again, a great conference in a great city. I love Prague.

