Cross compiling Baserock - a rough analysis

Tristan Van Berkom tristan.vanberkom at codethink.co.uk
Thu Dec 24 07:46:20 GMT 2015


Hi all,

Earlier this week I started looking into the possibility of cross
compiling systems with Baserock, just to get an idea of what kind of
workload it would represent to get it up and running and to maintain
such a beast.

To this end, I took some time to refresh my memory with a buildroot
build and, as I was curious to compare, I also installed a Raspbian
system in QEMU to see if I could build the GNOME system easily enough
using virtualization.

What I have written below is a rough assessment of the work which needs
to be done to cross compile a system in general, and a rough draft of
the kinds of changes which would be appropriate for us to do it well in
Baserock.

This is still only a rough analysis, but I wanted to share it on the
list so others could interject. If there are other approaches to this
which make more sense, or if things are in fact simpler (or more
difficult) than I suspect, it would be great to hear your feedback in
the interest of accurately assessing what it would take to evolve
Baserock into a cross-build system.

Take your time, enjoy the holiday season :)

Cheers,
    -Tristan


What has to be done to cross compile a system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  o Build/Have a cross compiler.

    This is pretty much solved with Baserock: we can easily build a
    cross compiler in the early build phase.

  o Build all of the required host tooling

    We also need to build, for the host, every tool which can be
    invoked by a makefile; this starts with your regular binutils and
    fileutils packages, and extends to the interpreters for whatever
    scripting languages need to run during the course of compilation.

    This host vs. target game unfortunately occurs all the way up the
    stack. Whenever a module which is high up in the stack provides
    some compiled program to help compile itself, or to assist in
    compiling modules which depend on it, then that module and its
    dependencies must also be built for the host arch and staged for
    further cross compilation.

    Examples of programs which are required during the compilation
    phase:

      o pkg-config (and its predecessors)

        pkg-config is a later incarnation of an earlier construct,
        where a dependency installs a binary for discovering itself
        and reporting how one should compile against that specific
        installation of itself, sometimes coupled with a convenience
        m4 macro to be used by dependent packages' makefiles.

        While many of the previous incarnations of this have been
        phased out in favor of pkg-config, some remain and also need
        to be compiled for the host (icu-config, for instance, is
        still in use). A sketch of how pkg-config itself is typically
        pointed at the target follows this list.

      o various tools from glib need to be compiled for the host, such
        as glib-mkenums, glib-genmarshal, glib-compile-resources and
        glib-compile-schemas

      o Tools for manipulating translations, like msgfmt and gettext
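
    As an illustration of the pkg-config case, here is a minimal
    sketch of the environment one might export so that a host-built
    pkg-config reports the target's metadata rather than the host's;
    the /opt/target staging location is an assumption, matching the
    layout proposed later in this mail:

      # Only consider .pc files from the staged target artifacts
      PKG_CONFIG_LIBDIR=/opt/target/usr/lib/pkgconfig
      # Prepend the staging location to the -I/-L paths they report
      PKG_CONFIG_SYSROOT_DIR=/opt/target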


  o Once all of the host tooling and its dependencies are built, we
    can start to consider cross compiling the set of modules we want.

    At this point the question arises of how to approach staging,
    paths and the build environment setup; there are various possible
    approaches.

    Taking buildroot as an example, they do not use any chroot and
    stage all of the host builds in one location before building
    everything into the target location.

    Cross compiling any given module is a delicate dance of setting up
    the build environment correctly (a minimal sketch follows this
    list) so that:

      o Host tooling is prioritized in $PATH so that build scripts in
        the target correctly use the tools from the host tools staging
        location
        
      o Host tools link against host libraries in the host staging
        location when they are run; this is perhaps done by setting
        $LD_LIBRARY_PATH

      o When compiling anything, it is the target assets which are
        found when assembling the product. That is to say, even though
        our host build tools link against host libraries while they
        run, the built assets are assembled and linked against target
        libraries.
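
    A minimal sketch of what such an environment could look like,
    assuming (purely for illustration) that the host tools are staged
    under /opt/host and the target artifacts under /opt/target:

      # Host tools are found before anything else
      PATH=/opt/host/usr/bin:/opt/host/bin:$PATH

      # Host tools resolve their shared libraries from the host staging area
      LD_LIBRARY_PATH=/opt/host/usr/lib:/opt/host/lib

      # Anything we compile is assembled against the target staging area
      CFLAGS="-I /opt/target/usr/include"
      LDFLAGS="-L /opt/target/usr/lib -L /opt/target/lib"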


  o Contending with module-local build tooling

    Some modules compile programs which are code generators required
    during the course of their regular compilation. If these are also
    installed binaries, it is good practice to avoid using the
    system-installed generator, as it is most probably out of date and
    lacks a feature which is only implemented in the checked-out
    source tree.

    Programs like glib-genmarshal and glib-mkenums are examples of such
    tooling which are built in-tree and then used later on to compile
    gio from the same source tree.

    This otherwise correct practice of preferring the in-tree binary
    works against us, however, if we explicitly built one which can
    run on the host in the hope that glib would use that one for its
    own code generation.

    Cases such as this can require custom hacks, and sometimes patches
    against the upstream module, to force it to select the already
    compiled host-compatible tool.

    Buildroot works around this particular case for glib 2.46 by
    setting the following in the environment:

      ac_cv_path_GLIB_GENMARSHAL=$(HOST_DIR)/usr/bin/glib-genmarshal
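
    In a Baserock chunk we could presumably do the same thing by
    passing the cache variable on the configure command line in the
    chunk's configure commands; a hypothetical sketch (the triplet
    variable and the host tool's path are assumptions):

      # Hypothetical: force glib's configure to pick the host-runnable tool
      ./configure --host=${target_triplet} --prefix=/usr \
          ac_cv_path_GLIB_GENMARSHAL=/usr/bin/glib-genmarshal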


  o Correcting and stripping rpaths

    This may or may not be an issue (it is an edge case at best): with
    current Baserock we build in a chroot and link against libraries
    found at a path which will also be the correct path on the final
    target. So in the case where a build script forces an encoded
    -rpath to link somewhere under /usr/lib, then /usr/lib will be
    found in the resulting target.

    When cross-compiling, however, we necessarily link against
    libraries found in /opt/target/usr/lib, so there may be some
    additional legwork in stripping binaries of their hard-coded link
    paths and setting up the environment in the final target so that
    binaries still find their expected libraries.
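
    If it does become an issue, existing tools can probably handle
    most of the legwork; a hedged sketch using chrpath and patchelf
    (the binary path is purely illustrative):

      # Drop an unwanted hard coded rpath entirely
      chrpath -d /package.inst/usr/bin/some-program

      # Or rewrite it to the path the libraries will have on the target
      patchelf --set-rpath /usr/lib /package.inst/usr/bin/some-program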


Plausible approach to cross compiling in Baserock
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Given the above preliminary assessment, here follows a rough outline of
how we could potentially approach this in the Baserock build system.


Part 1 - The cross compiler
~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is essentially covered but could be improved: what we currently
have is a way to build a host compiler from scratch and then use that
host compiler to build a cross compiler for armv7lhf.

What could be improved upon in this picture is that it would be nice
if we could build whichever cross compiler we want for the arch we
intend to build for.

Then we could have a single cross-build-essential with the host
compiler and the cross compiler required for the $TARGET we are
currently building for.
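
For reference, a minimal sketch of how binutils and gcc are configured
for a given target upstream, assuming a triplet along the lines of
armv7lhf-baserock-linux-gnueabi and a sysroot staged under /opt/target
(both are assumptions, and a real bootstrap involves more steps, such
as building the C library in between):

  ../binutils/configure --target=armv7lhf-baserock-linux-gnueabi \
      --prefix=/usr --with-sysroot=/opt/target --disable-nls

  ../gcc/configure --target=armv7lhf-baserock-linux-gnueabi \
      --prefix=/usr --with-sysroot=/opt/target \
      --enable-languages=c,c++ --disable-multilib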


Part 2 - Host and Target artifacts and staging
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Baserock would need to be set up in such a way as to understand that
there are two sets of artifacts, one for the host tooling and one for
target assets.

Of course, for a given system, we need not build the host version of
the great majority of the chunks required in the target, and where we
do, we can usually get away with a minimalist variant of any given
requirement (you don't need a glib with pcre just to run glib-mkenums).

When building a host artifact, the build process implemented by
Baserock and YBD would remain unchanged; we only need to stage the
artifacts in '/' and then do the regular thing.

When building a target artifact, we need to stage both the host
artifacts and the target artifacts; the staging for a target build
would look something like this:

  /               - Host artifacts staged in '/' as usual
  /opt/target     - Target artifacts we depend on staged here
  /package.build  - The checkout of the chunk we intend to build
  /package.inst   - The DESTDIR where we will pick up the result
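
A rough sketch of how staging for a target chunk build might proceed,
assuming artifacts are unpacked from tarballs into a staging directory
(the artifact names, $STAGING and $REPO are purely illustrative):

  # Host build-dependencies go into the root of the staging area
  tar -xf artifacts/host/glib-minimal.tar.gz -C "$STAGING"

  # Target build-dependencies are staged under /opt/target
  mkdir -p "$STAGING/opt/target"
  tar -xf artifacts/target/glib.tar.gz -C "$STAGING/opt/target"

  # The chunk's source checkout and an empty DESTDIR for the result
  git clone "$REPO" "$STAGING/package.build"
  mkdir -p "$STAGING/package.inst"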


Part 3 - Actually building the target artifacts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here we have a slight advantage over buildroot, which is that we've
staged our host tooling into the '/' of a chroot; this should slightly
reduce the legwork involved in setting up the paths we need to use,
maybe.

Here we will want to set up an environment that will use the cross
compilers and appropriate paths by default, reducing the verbosity of
chunk definitions which need to be cross compiled.

We will want something roughly like:

  # Tools...
  CC=/usr/bin/${arch}-gcc
  CXX=/usr/bin/${arch}-g++
  ... all cross compilers ...

  # Paths...
  PKG_CONFIG_PATH=/opt/target/usr/lib/pkgconfig
  CFLAGS="-I /opt/target/usr/include"
  LDFLAGS="-L /opt/target/lib -L /opt/target/usr/lib"
  ACLOCAL_FLAGS="-I /opt/target/usr/share/aclocal"

Then, chunk by chunk we will discover problems which can ideally be
solved by tweaking the default cross environment appropriately, and
sometimes by encoding some brute force into the chunk recipe itself,
ensuring the right paths are used.
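
With that environment in place, building a typical autotools chunk
from /package.build would then look roughly like this (the triplet is
again an assumed name; --prefix stays /usr because that is where the
files will live on the final target, while DESTDIR redirects the
install into the pickup location):

  ./configure --host=armv7lhf-baserock-linux-gnueabi --prefix=/usr
  make
  make DESTDIR=/package.inst install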


Part 4 - Assembling and deploying the system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here is an interesting problem.

Of course, to assemble the final target image we would simply need to
stage all of the cross-built artifacts in '/' instead of in
'/opt/target', but at this stage, what do we do for the system
integration commands?

Perhaps we have a very special setup for this which stages the whole
thing backwards so that:

   /          - Contains the assembled cross-built target
   /opt/host  - Contains the staged host tooling

And we have a sort of floating chroot which only runs tooling from
/opt/host, with a working shell, in which system integration commands
could be run from the host tooling and affect the root target?
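
A very rough sketch of what running one such integration command might
look like under that layout, invoking the host dynamic linker
explicitly so that the host binary runs against host libraries (all
paths here are assumptions, and this glosses over plenty of detail):

   chroot "$STAGING" \
       /opt/host/lib/ld-linux-x86-64.so.2 \
       --library-path /opt/host/lib:/opt/host/usr/lib \
       /opt/host/usr/bin/glib-compile-schemas /usr/share/glib-2.0/schemas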

Or, we stage it in a regular way with host tools in '/' and the target
in /opt/target, but we ensure that all tools we use for system
integration can handle the relocation, possibly by patching the
upstreams providing those tools.

And things are probably more complicated than just this when it comes
to tooling which generates some binary format; it's unclear to me
whether a host-runnable gtk-update-icon-cache will generate an icon
cache which is readable on the target system, for instance.


Summary
~~~~~~~
In the above picture I have skimmed over some details, such as build
dependency enhancements; when cross compiling it would become important
to specify which "host" chunks a given target chunk depends on.

There is also a running theory that one could simply build-depend on
the cross compiler and use that to cross-compile chunks without
significantly modifying Baserock functionality and the definitions
format.

It's a possibility, but we would still have to ensure that we have our
host assets in '/' and our cross-built assets in a separate location
for each and every chunk build (and we would still have the same
system integration issues).

Cross compiling is a continuous uphill battle primarily because it is
difficult to justify the importance of supporting cross-builds to
upstreams when a self-hosting compilation works.

The current release of buildroot contains a total of 1469 downstream
patches in its package directories and is still building GTK+ 3.14.x.



