Cross compiling Baserock - a rough analysis
Tristan Van Berkom
tristan.vanberkom at codethink.co.uk
Thu Dec 24 07:46:20 GMT 2015
Hi all,
Earlier this week I started looking into the possibility of cross
compiling systems with Baserock, to get an idea of how much work it
would take to get it up and running and to maintain such a beast.
To this end, I took some time to refresh my memory with a buildroot
build and, as I was curious to compare, I also installed a raspbian
system in QEMU to see if I could build the GNOME system easily enough
using virtualization.
What I have written below is a rough assessment of the work which needs
to be done to cross compile a system in general, and a rough draft of
the kinds of changes which would be appropriate for us to do it well in
Baserock.
This is still only a rough analysis, but I wanted to share it on the
list so others could interject. If there are other approaches which
make more sense, or if things are in fact simpler (or more difficult)
than I suspect, it would be great to hear your feedback in the interest
of accurately assessing what it would take to evolve Baserock into a
cross-build system.
Take your time, enjoy the holiday season :)
Cheers,
-Tristan
What has to be done to cross compile a system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
o Build/Have a cross compiler.
This is pretty much solved with Baserock: we can easily build a
cross compiler in the early build phase.
o Build all of the required host tooling
We also need every tool which can be used by a makefile, which
starts with your regular binutils and fileutils packages and then
your interpreters for various scripting languages you may need to
run during the course of your compilation.
This host vs. target game unfortunately occurs all the way up the
stack. Whenever a module which is high up in the stack provides
some compiled program to help compile itself, or to assist in
compiling modules which depend on it, that module and its
dependencies must also be built for the host arch and staged for
further cross compilation.
Examples of programs which are required during the compilation
phase:
o pkg-config
o pkg-config is a later incarnation of an earlier construct,
where a dependency will install a binary for discovering itself
and reporting how one should compile against that specific
installation of itself, sometimes coupled with a convenience m4
macro to be used by dependent package makefiles.
While many of the previous incarnations of this have been
phased out in favor of pkg-config, some remain and need to be
compiled for the host (icu-config for instance is still in
use).
o various tools from glib need to be compiled for the host, such
as glib-mkenums, glib-genmarshal, glib-compile-resources and
glib-compile-schemas
o Tools for manipulating translations, like msgfmt and gettext
o Once all of the host tooling and its dependencies are built, we
can start to consider cross compiling the set of modules we want.
At this point the question of how to approach staging and paths and
setting up the build environment arises, to which there are various
possible approaches.
Taking buildroot as an example, they do not use any chroot and
stage all of the host builds in one location before building
everything into the target location.
Cross compiling any given module is a delicate dance of setting up
the build environment correctly so that:
o Host tooling is prioritized in $PATH so that build scripts in
the target correctly use the tools from the host tools staging
location
o Host tools link to host libraries in the host staging location
when they are used, this is perhaps done by setting
$LD_LIBRARY_PATH
o When compiling anything, the target assets are found to
assemble any product. That is to say that even though we link
our host build tools to host libraries while they run, the
built assets are assembled and linked against target libraries.
o Combat with module-local build tooling
Some modules compile programs which are code generators required
during the course of their regular compilation. If these are also
installed binaries, it is good practice to avoid using the system
installed generator, as it is most probably out of date and lacks a
feature which is only implemented in the checked out source tree.
Programs like glib-genmarshal and glib-mkenums are examples of such
tooling which are built in-tree and then used later on to compile
gio from the same source tree.
This normally correct practice of preferring the in-tree binary is
however wrong when cross compiling: the in-tree binary is built for
the target, whereas we explicitly built one which can run on the
host in the hope that glib would use that one for its own code
generation.
In cases such as this, it can require custom hacks and sometimes
patches against the upstream module to force it to select the
already compiled host compatible tool.
Buildroot works around this particular case for glib 2.46 by
setting the following in the environment:
ac_cv_path_GLIB_GENMARSHAL=$(HOST_DIR)/usr/bin/glib-genmarshal
o Correcting and stripping rpaths
This may or may not be an issue (it is an edge case at best). With
current Baserock we build in a chroot and link against libraries
found in a path which will be the correct path on the final target.
So in the case where a build script forces an encoded -rpath to
link somewhere under /usr/lib, that same /usr/lib will be found in
the resulting target.
When cross-compiling however, we necessarily link against libraries
found in /opt/target/usr/lib, so there may be some additional
legwork in stripping binaries of their hard coded link paths and
setting up the environment in the final target so that binaries
still find their expected libraries.
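As a rough sketch of the stripping step described above, assuming
patchelf is available among the host tools (an assumption, this analysis
does not name a tool), something like the following could walk a DESTDIR
and remove the rpath from every ELF file it finds. The names is_elf,
strip_rpaths and STRIP_CMD are made up for illustration, and by default
the commands are only printed rather than run.

```shell
#!/bin/sh
# Sketch only: is_elf, strip_rpaths and STRIP_CMD are hypothetical names.

is_elf() {
    # An ELF file starts with the four bytes 0x7f 'E' 'L' 'F'.
    magic=$(head -c 4 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

strip_rpaths() {
    destdir=$1
    find "$destdir" -type f | while IFS= read -r f; do
        if is_elf "$f"; then
            # Dry run by default: print the command instead of running it.
            ${STRIP_CMD:-echo patchelf --remove-rpath} "$f"
        fi
    done
}
```

Setting STRIP_CMD='patchelf --remove-rpath' would run it for real.
Whether stripping or rewriting the rpath is the right fix would still
have to be decided chunk by chunk.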
Plausible approach to cross compiling in Baserock
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Given the above preliminary assessment, here follows a rough outline of
how we could potentially approach this in the baserock build system.
Part 1 - The cross compiler
~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is essentially covered but could be improved. What we currently
have is a way to build a host compiler from scratch and then use that
host compiler to build a cross compiler for armv7lhf.
What could be improved upon in this picture is that it would be nice
if we could build the cross compiler for whichever arch we intend to
build for.
Then we could have a single cross-build-essential with the host
compiler and the cross compiler required for the $TARGET we are
currently building for.
Part 2 - Host and Target artifacts and staging
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Baserock would need to be set up in such a way as to understand that
there are 2 sets of artifacts, one for the host tooling and one for
target assets.
Of course, for a given system, we need not build the host version for
the great majority of the chunks required in the target, and even then
we can usually get away with a minimalist variant of any given
requirement (you don't need a glib with pcre just to run glib-mkenums).
When building a host artifact, the build process implemented by
Baserock and YBD would remain unchanged: we only need to stage the
artifacts in '/' and then do the regular thing.
When building a target artifact, we need to stage both the host
artifacts and the target artifacts, the staging for a target build
would look something like this:
/ - Host artifacts staged in '/' as usual
/opt/target - Target artifacts we depend on staged here
/package.build - The checkout of the chunk we intend to build
/package.inst - The DESTDIR where we will pickup the result
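To make the layout above a little more concrete, here is a minimal
sketch of how artifacts might be unpacked into such a staging area. It
assumes artifacts are plain tarballs, and the helper names staging_dir
and stage_artifact are invented for this example. It is not how Baserock
or YBD currently stage things.

```shell
#!/bin/sh
# Sketch only: staging_dir and stage_artifact are hypothetical helpers.

staging_dir() {
    # Print the directory an artifact of the given kind unpacks into.
    case $1 in
        host)   printf '%s\n' "$2" ;;
        target) printf '%s\n' "$2/opt/target" ;;
        *)      echo "unknown artifact kind: $1" >&2; return 1 ;;
    esac
}

stage_artifact() {
    kind=$1
    tarball=$2
    root=$3      # directory that will become '/' of the build chroot
    dest=$(staging_dir "$kind" "$root") || return 1
    # Also create the checkout and DESTDIR locations from the layout.
    mkdir -p "$dest" "$root/package.build" "$root/package.inst"
    tar -xf "$tarball" -C "$dest"
}
```

A host chunk would be staged with "stage_artifact host foo.tar $root"
and a target dependency with "stage_artifact target foo.tar $root".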
Part 3 - Actually building the target artifacts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here we have a slight advantage over buildroot which is that we've
staged our host tooling into the / of a chroot, which should slightly
reduce the legwork involved in setting up the paths we need to use,
maybe.
Here we will want to set up an environment that will use the cross
compilers and appropriate paths by default, reducing the verbosity of
chunk definitions which need to be cross compiled.
We will want something roughly like:
# Tools...
CC=/usr/bin/${arch}-gcc
CXX=/usr/bin/${arch}-g++
# ... all cross compilers ...
# Paths...
PKG_CONFIG_PATH=/opt/target/lib/pkgconfig
CFLAGS="-I /opt/target/usr/include"
LDFLAGS="-L /opt/target/lib -L /opt/target/usr/lib"
ACLOCAL_FLAGS="-I /opt/target/share/aclocal"
Then, chunk by chunk we will discover problems which can ideally be
solved by tweaking the default cross environment appropriately, and
sometimes by encoding some brute force into the chunk recipe itself,
ensuring the right paths are used.
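One possible shape for such a default environment, as a sketch under a
few assumptions: the helper names cross_env and glib_overrides are
invented here, the exact directories are guesses, and I have used
pkg-config's PKG_CONFIG_LIBDIR and PKG_CONFIG_SYSROOT_DIR variables
rather than PKG_CONFIG_PATH so that only the target's .pc files are
searched and the reported paths get prefixed with the staging location.
The second function shows the "brute force in the chunk recipe" case,
reusing the autoconf cache variable from the buildroot workaround
mentioned earlier.

```shell
#!/bin/sh
# Sketch only: cross_env and glib_overrides are hypothetical helpers.

cross_env() {
    arch=$1                      # prefix of the cross tools
    sysroot=${2:-/opt/target}    # where target artifacts are staged

    export CC="/usr/bin/${arch}-gcc"
    export CXX="/usr/bin/${arch}-g++"

    # Search only the target's .pc files, and have pkg-config prefix
    # the -I/-L paths it reports with the staging location.
    export PKG_CONFIG_LIBDIR="$sysroot/usr/lib/pkgconfig:$sysroot/usr/share/pkgconfig"
    export PKG_CONFIG_SYSROOT_DIR="$sysroot"

    export CFLAGS="-I$sysroot/usr/include"
    export LDFLAGS="-L$sysroot/lib -L$sysroot/usr/lib"
    export ACLOCAL_FLAGS="-I $sysroot/usr/share/aclocal"
}

glib_overrides() {
    # Pre-seed configure's cache so the host-runnable generator staged
    # in '/' is picked instead of the one built in-tree for the target.
    export ac_cv_path_GLIB_GENMARSHAL=/usr/bin/glib-genmarshal
}
```

A chunk that behaves would only need "cross_env $arch" before its
configure step, while glib would additionally call glib_overrides.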
Part 4 - Assembling and deploying the system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here is an interesting problem.
Of course, to assemble the final target image we would simply need to
stage all of the cross-built artifacts in '/' instead of in
'/opt/target', but at this stage what do we do for the system
integration commands?
Perhaps we have a very special setup for this which stages the whole
thing backwards so that:
/ - Contains the assembled cross-built target
/opt/host - Contains the staged host tooling
And we have a sort of floating chroot which only runs tooling from
/opt/host with a working shell in which system integration commands
could be run from the host tooling and affect the root target?
Or, we stage it in a regular way with host tools in '/' and the target
in /opt/target, but we ensure that all tools we use for system
integration can handle the relocation, possibly by patching the
upstreams providing those tools.
And things are probably more complicated than just this when it comes
to tooling which generates some binary format. It's unclear to me
whether the host runnable gtk-update-icon-cache will generate an icon
cache which will be readable on the target system, for instance.
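For the second option (host tools in '/', target under /opt/target),
the tools which already take the directory to operate on as an argument
can simply be pointed at the relocated tree. The sketch below assumes
that this is enough for glib-compile-schemas and gtk-update-icon-cache.
The names integrate_target and RUN are invented, and whether the
generated files are actually usable on the target is exactly the open
question above. By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch only: integrate_target and RUN are hypothetical names.

integrate_target() {
    target=${1:-/opt/target}
    run=${RUN:-echo}    # dry run unless RUN names a real launcher

    # Both tools accept the directory to process on the command line,
    # so no chroot into the (foreign arch) target is needed.
    $run glib-compile-schemas "$target/usr/share/glib-2.0/schemas"
    $run gtk-update-icon-cache "$target/usr/share/icons/hicolor"
}
```

Running it with RUN=env would execute the host tools for real. Tools
with paths baked in at build time would still need patching or a
different approach.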
Summary
~~~~~~~
In the above picture I have skimmed over some details, such as build
dependency enhancements; when cross compiling it would become important
to specify which "host" chunks a given target chunk depends on.
There is also a running theory that one could simply build depend on
the cross compiler and use that to cross-compile chunks without
significantly modifying baserock functionality and definitions format.
It's a possibility but we would still have to ensure that we have our
host assets in '/' and our cross built assets in a separate location
for each and every chunk build (also we still have the same system
integration issues).
Cross compiling is a continuous uphill battle primarily because it is
difficult to justify the importance of supporting cross-builds to
upstreams when a self-hosting compilation works.
The current release of buildroot contains a total of 1469 downstream
patches in its package directories and is still building GTK+ 3.14.x.