Hi Sam!
On Thu, 2017-04-06 at 13:23 +0100, Sam Thursfield wrote:
Hi Tristan,
In general what I've seen of BuildStream looks really promising. I've
not got time to be involved with Baserock at present but am in favour
of adopting it. I think it'd be a good step forward.
Good to hear.
Here are some general thoughts and concerns.
* Is there a need for a 3-stage compiler bootstrap in the gnu-toolchain
build? The original reason for doing that "cross-compile" to
$cpu-bootstrap-linux and then back to $cpu-baserock-linux was to make
sure we didn't link against any libraries from the host build in
stage2. We modelled this on the Linux From Scratch compiler bootstrap.
But BuildStream builds everything in a chroot starting from a
precompiled toolchain, so I can't see any reason to build a C compiler
more than once during a BuildStream build. (In the case that the
precompiled C compiler is new enough and the target system doesn't
need to contain a toolchain, there's no need to build a C compiler at
all.)
We've had this conversation before :)
It is just as necessary, although I don't believe stage 3 has ever
really been necessary except as proof that what you've built in stage 2
actually works; this needs looking into.
It's necessary for two reasons:
* First off, just because we build from a deterministic precompiled
toolchain does not mean that we no longer need to completely seal the
build off from that original toolchain; this is still a bootstrap.
I'm not sure how else to say this to get the point across: this
is exactly the same as it was when using host tools. We
drop the original tools after bootstrapping the compiler, so
nothing from the original import is present to link against
beyond that bootstrapping process.
The only difference here is that the tools you are bootstrapping
from are not the ones installed on your host.
NOTE: This is _not_ the same as compiling an alternative gcc in
one shot on your host that you plan to use on that same
host; the tools we import are thrown away directly
after building the toolchain. This is in fact what makes it
a bootstrap.
* Second, we should be able to refine this bootstrap so that
we can use it to cross-bootstrap for any target arch we support.
For this reason I hope to nuke stage three.
All we really want in the bootstrapping of the toolchain is:
1) Build a cross compiler for ${target}, even where target is
the same arch as ${host}, we create this cross for the sake
of isolation and bootstrapping (same as Baserock currently
does)
2) Use that cross compiler to build a native compiler for
${target} (this is essentially a "canadian cross")
If we remodel build-essential (now 'gnu-toolchain') to work this
way, and never actually *use* the tooling built in stage2 during
that build, we can then take this binary output and stage it
in a slightly different build sandbox: one which not
only uses bubblewrap for sandboxing, but additionally uses qemu
user mode to continue building for the target from that point on
under emulation.
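To make the two stages above concrete, here is a minimal Python sketch
of the (build, host, target) triples each stage would be configured
with. The triple strings, the Stage class and the helper names are
illustrative assumptions, not anything taken from Baserock or
BuildStream; only the --build/--host/--target options are the standard
GNU configure ones.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stage:
    name: str
    build: str       # machine the compiler is compiled on
    host: str        # machine the resulting compiler runs on
    target: str      # machine the resulting compiler emits code for
    built_with: str  # which toolchain does the compiling


def bootstrap_plan(build_triple: str, target_triple: str) -> List[Stage]:
    """Return the two bootstrap stages described above (no stage 3)."""
    # Stage 1: a cross compiler. It runs on the build machine and emits
    # code for the target. It is built with the imported toolchain.
    stage1 = Stage("stage1-cross", build_triple, build_triple,
                   target_triple, "imported toolchain")
    # Stage 2: a native compiler for the target, built with stage 1.
    # It is never executed during the bootstrap itself.
    stage2 = Stage("stage2-native", build_triple, target_triple,
                   target_triple, "stage1-cross")
    return [stage1, stage2]


def configure_args(stage: Stage) -> List[str]:
    """Standard GNU configure options selecting the three machines."""
    return [f"--build={stage.build}",
            f"--host={stage.host}",
            f"--target={stage.target}"]


if __name__ == "__main__":
    # Hypothetical triples, chosen only for illustration.
    for stage in bootstrap_plan("x86_64-pc-linux-gnu",
                                "aarch64-baserock-linux-gnu"):
        print(stage.name, "(built with", stage.built_with + "):",
              " ".join(configure_args(stage)))
```

When the target is the same arch as the host, giving the target triple
a different vendor field (as in $cpu-bootstrap-linux above) is what
keeps stage 1 a cross compiler.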
* How will we host the precompiled bootstrap binaries? I guess that
hosting an OSTree repo on baserock.org infrastructure makes the most
sense; that requires a bit of work to organise but should be fairly
straightforward. These binaries would effectively be the successor to
the "devel" reference systems that we provide at
http://download.baserock.org/baserock and that were mandated for using
Morph.
Yes, I think hosting an OSTree repo alongside the trove is the best
solution for this. OSTree is interesting because of the way it stores
revisions of filesystem trees: when things change in this base, you
save space compared to other methods of storage.
However, there is nothing preventing you from using tarballs or
even git for this purpose; we've tried something similar before using
git and it works, it's just not an optimal solution.
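As a rough illustration of where the space saving comes from, here is
a toy content-addressed store in Python. It is only a sketch of the
general idea and not OSTree's actual implementation (real OSTree object
checksums also cover file metadata, for instance); the ObjectStore
class and the file contents are made up for the example.

```python
import hashlib


class ObjectStore:
    """Toy store: each distinct file content is kept exactly once."""

    def __init__(self):
        self.objects = {}  # sha256 hex digest -> file content
        self.commits = {}  # revision name -> {path: digest}

    def commit(self, name, tree):
        """Record a revision; 'tree' maps paths to bytes."""
        manifest = {}
        for path, content in tree.items():
            digest = hashlib.sha256(content).hexdigest()
            # Unchanged files hash to an existing object, so they cost
            # nothing extra to store in a new revision.
            self.objects.setdefault(digest, content)
            manifest[path] = digest
        self.commits[name] = manifest

    def checkout(self, name):
        """Rebuild the full tree for a revision from shared objects."""
        return {path: self.objects[digest]
                for path, digest in self.commits[name].items()}

    def stored_bytes(self):
        return sum(len(content) for content in self.objects.values())


if __name__ == "__main__":
    store = ObjectStore()
    store.commit("base-1", {"usr/bin/gcc": b"gcc build one",
                            "usr/lib/libc.so": b"libc build one"})
    # Only gcc changes in the second revision of the base.
    store.commit("base-2", {"usr/bin/gcc": b"gcc build two",
                            "usr/lib/libc.so": b"libc build one"})
    print(len(store.objects), "objects stored for 4 file entries")
```

A pair of tarballs would store the unchanged libc twice; here the
second revision only adds the one file that changed.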
* How will we provide the bootstrap binaries? Right now we're getting
them from the GNOME project's Flatpak SDK, which is a reasonable base
apart from being enormous. I wonder if we could continue to
"outsource" this aspect and join forces with a minimal binary distro
like Alpine Linux? (I've not looked at Alpine, just know that it's
small.) But it depends on how easy it is to cross-bootstrap our base
to a new architecture. Using the 'gnu-toolchain' itself as a base also
has a nicer symmetry than basing ourselves off another distro; I just
think collaboration is good where possible.
Yes, the base is unnecessarily enormous; in fact the
org.freedesktop.BaseSdk from that same repository should be much, much
smaller and also sufficient for bootstrapping.
With that said, for the Baserock project I'm much more interested in
hosting the gnu-toolchain binaries we build ourselves, and using those
(or alternatively a binary of the converted 'foundation.morph').
We've again had this conversation before :)
I think it will be interesting to make this circular and build only
from what we've built before, for two reasons:
1) It proves that what we build, can be built with what we've built.
2) It makes Baserock entirely independent of any other system out
in the wild.
For (2), it is particularly important to have this freedom moving
forward, otherwise we are simply trusting that either:
a) We will never require a newer compiler than say, gcc5, in order
to compile whatever version of gcc will exist in 10 years from
now.
But we know this is sort of untrue: parts of gcc are now written
in C++, maybe gcc version 12 will be written in Rust, or whatever,
nobody knows.
b) That somebody else will always be around to bootstrap a base
system for *us* to bootstrap from, for any host arch we care
to build on.
I don't like either (a) or (b) above, and there is no reason why
Baserock's build instructions themselves cannot be entirely responsible
for creating everything they need.
The only trick is that we need some x86 binary sysroot the first time
around to start bootstrapping and cross-bootstrapping from an x86 host;
once we have built it once, there's no reason to depend on anything
else at all.
* What's the plan for versioning the BuildStream format? Initially we
didn't have a version for the Baserock definitions format, and then as
the format evolved users would try to upgrade Morph and get random
crashes, as it expected new-format definitions and their definitions
were in an old format. This created an attitude of "never upgrade
Morph" among downstream users I knew, since they weren't able to keep
up with the definitions format changes as we made them (and we didn't
usually communicate them particularly effectively). Later we added a
simple VERSION marker to the repo, which at least allowed build tools
to give a useful error. We also added migration scripts, which worked
well in some cases although they added to the effort of changing the
definitions format. YBD took the approach of trying to support all
definitions, which appears to have succeeded thus far, but there have
been relatively few changes to the definitions format in that time
(despite our original goal of wanting to encourage frequent
incremental improvements to the definitions format).
This is a little tricky to respond to but I'll try.
Bear in mind that, in truth, we don't need to version anything until
the first time we break compatibility.
So, the tricky part with versioning in BuildStream is that BuildStream
is not one monolithic tool but rather a system for integration of
various elements. So instead of thinking "BuildStream format 1.0",
think "autotools element format 1.0". This is true for the most part,
except for some fundamental things in the BuildStream loader
(dependency expression, variants, variables etc).
For first class citizen elements of BuildStream (i.e. element plugins
which we maintain inside the actual BuildStream repository), the stance
is that these elements should always support every format which has
ever existed, and that any project needs to declare the format version
they intend to use, both for the overall format and for any element
specific versions.
With that said, breaking format compatibility is something we should
avoid. I would aim for a cut-off date for an ultimately "stable"
release of BuildStream this summer around GUADEC, and from that point
attempt to never break it. If, and only if, we need to introduce a
feature that requires a format change, BuildStream must absolutely
continue to support previous versions, and projects which want to use
the newer format need to declare it in their metadata.
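A minimal sketch of what such a check could look like, in Python. This
is not BuildStream's actual loader; the component names, the version
numbers and the function name are all hypothetical. The point is only
that every older version stays loadable, and that a project declaring
something newer than the tool understands gets a useful error rather
than a random crash.

```python
class FormatVersionError(Exception):
    """Raised when a project declares a format the tool cannot load."""


# Hypothetical table: the newest format version this tool understands,
# for the core loader and for each element plugin. Versions start at 1.
SUPPORTED_FORMATS = {"core": 2, "autotools": 1}


def check_declared_formats(declared, supported=SUPPORTED_FORMATS):
    """Validate the format versions a project declares in its metadata.

    Every version from 1 up to the newest supported one is accepted,
    which models "always support every format which has ever existed".
    """
    for component, version in declared.items():
        if component not in supported:
            raise FormatVersionError(
                f"project declares a format for unknown component "
                f"'{component}'")
        newest = supported[component]
        if version < 1 or version > newest:
            raise FormatVersionError(
                f"'{component}' format {version} is not supported; "
                f"this tool supports versions 1 to {newest}")


if __name__ == "__main__":
    # An old project keeps working after the tool is upgraded.
    check_declared_formats({"core": 1, "autotools": 1})
    # A project newer than the tool fails early with a clear message.
    try:
        check_declared_formats({"core": 3})
    except FormatVersionError as err:
        print(f"error: {err}")
```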
* Would be nice to have a 'hyperlinked' way of browsing a BuildStream
repo like we have on git.baserock.org now for Baserock definitions:
http://git.baserock.org/cgit/baserock/baserock/definitions.git/tree/strata/bsp-armv7-versatile.morph
:-)
I'm also curious about the plan for non-x86 platforms.
This should be mostly outlined in the roadmap here:
https://wiki.gnome.org/Projects/BuildStream/Roadmap/CrossBuilding
Our initial idea in Baserock was to native-compile everything and, for
platforms where building like that is slow, to distribute builds
across multiple machines. This is why Morph has its 'distbuild' plugin,
and I think some people are still using that to native build on an ARM
server. That approach has major flaws, including the fact that
maintaining 16 build workers is a pain in the ass and the fact that
distributing builds at a per-component level doesn't solve anything
for massive components like WebKit that take 5 hours to build from
source on ARM. Since then I don't know if anyone has focused on
improving things; Morph was abandoned years ago and I think YBD's
story for non-x86 builds is still "try and source fast enough build
machines".
Is there BuildStream work planned that will help such users? Tristan,
I know you have a bunch of good ideas for solving this; I'm more
interested in how far away these are from becoming reality, i.e.
whether Codethink is already committing time to having realistic
support for non-x86 targets. (By "realistic support" I guess I mean
"build times comparable with x86" and without a high barrier to entry
such as expensive hardware or a complex distributed setup.)
Regarding time and commitment, I think we're going to have a clearer
answer to that in around 2 or 3 weeks. Some meetings are scheduled with
an interest in solving the cross-building problem, especially with
an eye towards never depending on a build for that target arch having
existed before, i.e. highly custom arches for which you
may need to add kernel support yourself.
On my part, I have made a huge commitment to make this fly and solve
problems in GNOME, so I don't see myself having time to work on cross
building until some time after GUADEC. That said, there is a huge
chance that resources will be allocated to that much, much sooner, and I
very much hope it will be YOU who will work on it :D
Cheers,
-Tristan
Bootstrapping new architectures is also a concern but I know there's
already discussion about that coming up soon, so I guess we'll find out
about it here in due course.
Sam