Running Linux on a Zynq without Vivado madness

Intro

The nice folks at Trenz Electronic recently gave me a Zynqberry to play with, so I finally picked up the TODO to take a closer look at that chip.

The Xilinx Zynq is an SoC built around a dual-core ARM Cortex-A9. I recommend you take a look at the Technical Reference Manual (all references in here are to v1.10 of this document); it’s very detailed and clearly written, better than the documentation for most of the Cortex-A SoCs I’ve seen. While the peripherals are a little on the light side (the fanciest two are the USB host/device/OTG and 1 Gig Ethernet MAC), its main selling point is that it integrates a 7 Series FPGA (Artix on the smaller chips, Kintex on the larger ones), called the PL (Programmable Logic). This FPGA is connected to the SoC (called the Processing System, or PS) via the SoC interconnect, which provides two ordinary AXI master and two slave ports that allow you to create devices which integrate seamlessly into the SoC. Additionally, there is one cache-coherent master port (ACP) and four high-performance AXI masters which can only access the on-chip memory and the DDR3 controller, which is shared with the rest of the system (see the Interconnect Block Diagram in Figure 5-1 of the TRM).

This is a very cool thing. While there are open soft cores available, they take up precious logic space, even more so if they’re supposed to run an actual operating system. The Zynq offers a solution to this problem.

When I tried to get started with the Zynqberry, it was still very fresh and undocumented, so I borrowed the Zybo of my partner in crime and started fiddling around with that instead. However, a schematic of the Zynqberry has recently popped up on the Trenz website, so I’ll look into that again soon.

As the Zynq is more an SoC with an FPGA attached than the other way around, the configuration of the PL is driven from the PS (you can externally configure the PL via JTAG, but you probably wouldn’t do that in practice), so booting something on the SoC is a good first step.

Alas, like with all things FPGA, when looking at the official sources, you’re presented with tools and wizards which magically create some blob which you can put on the device and which then does something. Or not. Who knows.

The rest of this post will not contain any deep insights, but will rather serve to point out that it is indeed possible to boot a Zynq without any blobs or vendorized code. Hopefully, this will be helpful for some poor soul who might find their way here via Google.

The Zynq boot process

The Zynq contains a boot ROM which loads some initial code from one of the various boot sources (see Chapter 6 of the TRM); I use an SD card. The boot ROM loads this code to the on-chip memory and jumps to it. (The boot ROM also has extensive secure-boot capabilities which, unlike those of most other vendors, are publicly documented in the TRM, but I won’t go into details here.)

In Xilinx’s world order, this piece of code is the FSBL (first-stage bootloader), a blob generated by Vivado. It can do a myriad of things, but its main purpose is to initialize the memory controller, load the next-stage payload and continue execution there. The FSBL basically consists of two parts: an FSBL template and the ps7_init.c file generated by Vivado, which contains some byte-code which is interpreted by the FSBL to (mainly) configure the clock tree and the memory controller.
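To make the "byte-code interpreted by the FSBL" idea a bit more tangible, here is a minimal sketch of such an interpreter. It is not the real ps7_init.c format: the opcode names, the tuple layout, the register addresses and the run_init_table function are all made up for illustration, and the registers are just a Python dict instead of memory-mapped hardware.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical instruction layout: (opcode, address, mask, value).
Instruction = Tuple[str, int, int, int]

WORD = 0xFFFFFFFF  # registers are modelled as 32-bit words


def run_init_table(table: Iterable[Instruction],
                   regs: Dict[int, int],
                   max_polls: int = 1000) -> Dict[int, int]:
    """Interpret a table of register operations against a simulated register file."""
    for opcode, addr, mask, value in table:
        if opcode == "EXIT":
            # End of table: ignore anything that follows.
            break
        elif opcode == "WRITE":
            # Plain write of the whole register.
            regs[addr] = value & WORD
        elif opcode == "MASKWRITE":
            # Read-modify-write: only the bits selected by mask change.
            old = regs.get(addr, 0)
            regs[addr] = ((old & ~mask) | (value & mask)) & WORD
        elif opcode == "MASKPOLL":
            # Wait until any bit selected by mask is set (e.g. a lock flag).
            for _ in range(max_polls):
                if regs.get(addr, 0) & mask:
                    break
            else:
                raise TimeoutError(f"poll on 0x{addr:08x} timed out")
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return regs


if __name__ == "__main__":
    # Made-up addresses and values, purely for demonstration.
    registers = {0x100: 0xFFFF0000, 0x108: 0x1}
    program = [
        ("MASKWRITE", 0x100, 0x000000FF, 0x12345678),
        ("WRITE", 0x104, 0, 0xDEADBEEF),
        ("MASKPOLL", 0x108, 0x1, 0),
        ("EXIT", 0, 0, 0),
    ]
    for address, word in sorted(run_init_table(program, registers).items()):
        print(f"0x{address:08x} = 0x{word:08x}")
```

The point of a table-driven design like this is that the generated file only needs to carry data, while the small interpreter loop can be shared between different boards.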

If you’re familiar with the U-Boot bootloader, which is commonly used on embedded systems, you’ll probably have noticed that this sounds an awful lot like what the U-Boot secondary program loader (SPL) does. And you’re right: U-Boot SPL can completely replace the Xilinx-provided FSBL template nowadays (I use the current master, bbca7108db79076d3a9a9c112792d7c4608a665c, because uEnv support in the Zynq default config hasn’t landed in a release yet, but support for SPL booting has been present since v2014.04).

While generating the ps7_init.c file with Vivado is still a necessary evil [1], this only has to be done once for each hardware design, and if you’re using a premade board, the file is probably contained in one of the vendor’s examples. Furthermore, this file does not contain much magic and is close to human-readable. Note that U-Boot uses the ps7_init_gpl.c file, which is equivalent to the other, but has a different license header.

U-Boot already comes with a configuration for the Zybo and also contains an appropriate ps7_init file, so just running make zynq_zybo_defconfig all should have you covered. (I use Buildroot instead, which makes cross-compiling whole Linux systems much less painful; see below.) Right now, the U-Boot build system doesn’t yet create a BOOT.BIN from the SPL, but Buildroot includes zynq-boot-bin.py in its build process, which does just that.

As I didn’t want to patch U-Boot for my boot process customizations, I used the following uEnv.txt file:

ipaddr=10.0.0.123
serverip=10.0.0.4
uenvcmd=tftpboot 0x2000000 uEnv.txt;env import -t 0x2000000 ${filesize}

To save me from constantly swapping SD cards around, this file delegates the whole boot process to a second uEnv.txt file loaded via TFTP:

fdt_addr=0x30000000
rd_addr=0x25000000
loadfdt=tftpboot ${fdt_addr} zynq-zybo.dtb; fdt addr ${fdt_addr}
load_linux=tftpboot ${load_addr} uImage
load_rd=tftpboot ${rd_addr} rootfs.cpio.uboot
load_bitstream=tftpboot ${load_addr} bit/top.bin; fpga load 0 ${load_addr} ${filesize}
bootcmd=run loadfdt; run load_linux; run load_rd; setenv bootargs console=tty0 console=ttyPS0,115200; bootm ${load_addr} ${rd_addr} ${fdt_addr}

There are a few things worth noting:

  • I pulled the load addresses from thin air. If your images are bigger than mine, make sure that parts loaded later do not overwrite parts loaded earlier.
  • U-Boot can also load a bitstream to the PL. I only used this for some early testing and in my setup, the command isn’t active for a regular boot, but this could be useful if your system depends on the PL to boot for some reason (a SATA core, for example).
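Since the load addresses are arbitrary, a quick sanity check against your actual image sizes can't hurt. The following is a minimal sketch of such a check; find_overlaps is my own helper name, the image sizes are made-up placeholders, and the value used for load_addr is an assumption (it isn't set in the file above, so substitute whatever your U-Boot environment defines).

```python
from typing import Dict, List, Tuple


def find_overlaps(regions: Dict[str, Tuple[int, int]]) -> List[Tuple[str, str]]:
    """Return pairs of names whose [start, start + size) ranges intersect."""
    # Sort by start address so that only later regions need to be checked.
    ordered = sorted(regions.items(), key=lambda item: item[1][0])
    overlaps = []
    for i, (name_a, (start_a, size_a)) in enumerate(ordered):
        end_a = start_a + size_a
        for name_b, (start_b, _size_b) in ordered[i + 1:]:
            # start_b >= start_a here, so they collide iff b starts before a ends.
            if start_b < end_a:
                overlaps.append((name_a, name_b))
    return overlaps


if __name__ == "__main__":
    MiB = 1024 * 1024
    layout = {
        # name: (load address, image size in bytes)
        "uImage": (0x02000000, 4 * MiB),              # load_addr: assumed value
        "rootfs.cpio.uboot": (0x25000000, 16 * MiB),  # rd_addr from uEnv.txt
        "zynq-zybo.dtb": (0x30000000, 64 * 1024),     # fdt_addr from uEnv.txt
    }
    clashes = find_overlaps(layout)
    if clashes:
        for first, second in clashes:
            print(f"{first} overlaps {second}")
    else:
        print("no overlaps")
```

In practice you would fill in the sizes from the files in output/images/ (for example with os.path.getsize) rather than hard-coding them.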

The Linux kernel

There’s upstream support for the Zynq in the vanilla Linux kernel, as well as a kernel tree maintained by Xilinx themselves. As usual in the ARM world, the latter has some additional features which either haven’t yet found their way into the vanilla kernel or maybe never will (the devcfg interface mentioned below will probably never be merged). As the Xilinx tree seems to be quite well maintained, I’ll stick with that for now (I’ve tried a vanilla 4.4.3 as well and it works, too). I used the xilinx_zynq_defconfig, but additionally enabled debugfs and switched from devcfg to fpga_manager (see below).

Getting a bitstream loaded from within Linux is the next step. Depending on whether you’re running the Xilinx kernel tree or not, you have two options:

devcfg

The Xilinx tree includes a char driver (drivers/char/xilinx_devcfg.c) which creates a device node /dev/xdevcfg where you can simply pipe in your bitstream and be done with it. It also exposes more information and some of the fancier features of the configuration interface at /sys/bus/platform/devices/f8007000.devcfg.
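On the target, a plain cat into the device node is all you need. If you'd rather do it from a script, the same thing can be sketched in Python; load_bitstream is my own helper name, and the sketch simply assumes that the file is already in whatever format the driver accepts.

```python
import sys


def load_bitstream(bitstream_path: str, device_path: str = "/dev/xdevcfg") -> int:
    """Stream a bitstream file into the devcfg device node; return bytes written."""
    written = 0
    with open(bitstream_path, "rb") as src, open(device_path, "wb") as dst:
        while True:
            # Copy in chunks so large bitstreams are not held in memory at once.
            chunk = src.read(64 * 1024)
            if not chunk:
                break
            dst.write(chunk)
            written += len(chunk)
    return written


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: load_bitstream.py <bitstream file>")
    count = load_bitstream(sys.argv[1])
    print(f"wrote {count} bytes")
```

Passing a different device_path makes it possible to try the function out on a development machine against an ordinary file.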

It also contains support for configuring the fclks, which are the clocks provided to the PL by the PS. Sadly, doing this is none of its business and it doesn’t play nice with the generic clock tree implementation for the Zynq (it doesn’t seem to synchronize the state between those two subsystems in any way).

fpga_manager

fpga_manager is the generic mainline Linux mechanism for programming FPGAs attached to a Linux system. Both mainline Linux and the Xilinx tree contain code for doing this on the Zynq. However, both the fpga_manager Zynq driver and the devcfg driver bind to the device tree compatible string xlnx,zynq-devcfg-1.0, so the two are mutually exclusive when building a kernel from the Xilinx tree.

Alas, the fpga_manager subsystem doesn’t have a user space interface as convenient as the one devcfg offers. Instead, it exports two in-kernel functions for loading a bitstream: fpga_mgr_buf_load, which loads it from a memory buffer, and fpga_mgr_firmware_load, which accomplishes the same but uses the kernel’s firmware loading facilities to get hold of the bitstream.

While this is less convenient than the user space interface, it is arguably cleaner, as you’re probably going to have to write a kernel module for your peripheral anyway. The question remains, though, whose responsibility it would be to load the bitstream if you have multiple custom peripherals which are serviced by different kernel modules. Maybe simply loading the bitstream from within the bootloader would be the right approach here.

Tying it all together: Buildroot

Buildroot is a build system which enables you to create customized (embedded) Linux images. It takes care of building or downloading a cross compiler, U-Boot, kernel and userland and wrapping them into a set of files you can deploy on your system. Of the embedded build systems I’ve checked out so far, it seems to get the most things right. I recommend you take a look at their manual here.

For the Zybo, I built a custom set of configs based on the Zedboard (yet another Zynq board) config already present within Buildroot; you can find it here. At the time of this writing, I used Buildroot 2016.02, but check the README in that repository for more up-to-date information.

Building an image for the Zybo should be as easy as

$ cd buildroot-2016.02
$ make BR2_EXTERNAL=/path/to/checkout/of/zynq-buildroot zybo_defconfig
$ make all

Note that you shouldn’t use the -j option with the top-level Buildroot makefile. Buildroot will autodiscover the number of cores available and scale the package build processes accordingly (this seems to be broken when running Buildroot on my main Arch Linux system, but works in a Debian jessie container).

For the above “netboot” setup, put the BOOT.BIN and u-boot-dtb.img files from output/images/ next to the first uEnv.txt file above on an SD card with a FAT filesystem, and put the remaining files (the second uEnv.txt, the kernel image, the device tree blob and the root filesystem image) in the TFTP root directory. This should result in a booting system.

Generating a bitstream with PS<->PL connections without having to use the Vivado GUI (using migen and their build system instead) is a topic for another post.

Footnotes

  1. There’s the ezynq project by the people at Elphel. While the build process is questionable (an ad-hoc shell script downloads and patches a U-Boot git version and contains hard-coded references to Elphel boards), the repo also contains code for generating system initialization data similar to ps7_init.c, which could be used to build a drop-in generator.