On Fri, 2016-10-07 at 16:11 +0100, Daniel Firth wrote:
In the last couple of weeks I've been working on an improved
definitions format in order to address a few key problems we've had
integrating system images. I will detail a list of what I've done,
where, in what state, and some short stories to support the changes.
This work is partially to mostly complete - The morphology and
assemblage manipulation is quite usable, but I do invite everybody to
try and break it as hard as possible, provide patches or even
cannibalize all of it.
* The V10 schema change (No ontology changes yet):
* Defslib (Will attempt to build naively using `sudo ./quick-check.sh`)
I'm trying to run this now, although I'm not sure how to set it up to
use my existing gits directory or whether it is compatible with YBD's
gits directory, so it may take some time to build :)
However I am at least building binutils while writing this mail... let's
see how it runs...
* Defslib pypi page: https://pypi.python.org/pypi/defslib
* V10 visualisation of base-system-x86-64-generic:
* Example of pre-migrated definitions:
Suppose we develop a system foo-x86_64.morph, using a particular
toolchain stratum that was delivered. We'll call this
"foo-toolchain.morph". A large amount of strata rest atop, with
"foo-toolchain.morph" as a stratum build-dependency. After developing
the definitions for the entire system and strata stack, we then wish to
swap out the foo-toolchain and test the system building against
build-essential. How much do we need to change in order to try this?
The answer to this is "everything". The entire strata stack
ultimately depends on foo-toolchain, necessitating a duplication of the
entire strata stack, which is not something we want to have to do just
to try a provisional compiler.
The answer for this came from observing why it is we don't suffer the
same problem in swapping out an individual chunk in a stratum, and that
is that the build-depends are ultimately the jurisdiction of the
enclosing stratum to declare and modify. V9 tries to homogenise this
slightly by moving control of the build dependencies out of the stratum
files themselves up to the system level, and mirroring the same mechanism
that strata use to manage build-dependencies of their component chunks.
Testing a different toolchain now only involves modifying the reference
to the stratum included in the system.
This is indeed interesting, and is one approach to solve the problem of,
say, multiple strata which provide the same API surface with a different
implementation. As you mention, the toolchain is a good example of this:
one might want a full gcc with a glibc runtime, or one might want musl
or another alternative libc.
At first sight, I felt a little uncomfortable to see, however, that there
is no explicit dependency on a given API surface whose implementation
might be interchangeable. I.e. a stratum which requires a C library
cannot explicitly state that it does; instead the burden of how a
system must be assembled falls on the one putting together the system.
That said, such extra syntax might just be bloat, or could be explored
at a later time; since there would still be a need at the system level
to decide what flavor of C library implementation (for example) should
be chosen when assembling the system, this implementation solves that
problem.
I would however like to converge towards a future where we might regain
all lost dependency information and in doing so, be able to validate
whether a build could possibly succeed before attempting to run the
build, i.e. a world where every chunk explicitly states its direct
dependencies on other chunks. This would require some symbolic naming
allowing one chunk to require "a C library" and allow both musl and
glibc to provide "a C library".
My question regarding the above is only: after making this move to V10,
do you think the above could possibly still be in the cards?
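To sketch the kind of symbolic naming I have in mind - none of these
fields exist in V10, and 'provides'/'requires' below are purely
hypothetical:

    # Hypothetical sketch: 'provides' and 'requires' are not part of any
    # Baserock schema; they only illustrate naming an API surface
    # symbolically while the concrete choice stays at the system level.
    chunks = {
        'musl':    {'provides': ['c-library']},
        'glibc':   {'provides': ['c-library']},
        'ncurses': {'requires': ['c-library']},
    }

    def resolve(chunk, choices):
        """Map each symbolic requirement of `chunk` to the concrete
        provider the system integrator selected in `choices`."""
        resolved = []
        for need in chunks[chunk].get('requires', []):
            provider = choices.get(need)
            if provider is None or need not in chunks[provider]['provides']:
                raise KeyError('no selected chunk provides %r' % need)
            resolved.append(provider)
        return resolved

    # The system still explicitly decides the C library flavour:
    print(resolve('ncurses', {'c-library': 'musl'}))   # -> ['musl']

With every chunk carrying such declarations, tooling could verify that
some selected chunk satisfies every requirement before attempting the
build at all.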
Suppose we have several systems all relying on the same collection of
strata for functional reasons. Suppose foo-x86_32, foo-x86_64 and
foo-armv7 include as strata [graphics-stack-banana,
graphics-stack-banana-core, graphics-stack-banana-plugins]. Suppose
we want to switch to a single new graphics-stack-trampoline in all
our systems. We want to keep both collections of strata around, as a
system may genuinely need the graphics-stack-banana collection. But
since all systems must explicitly list all of their strata, we must
update every system to cope with the new change. Perhaps what we would
like is a mechanism by which all systems can include a certain
collection of functional components [build-essential, core,
graphics-stack-something], which can abstract out a particular
graphics-stack provider across all systems that utilise the same
collection.
In a more extreme case we may want to replace the functionality of
several strata with a single chunk; however, this will require putting
the chunk in its own stratum, which is bloatful. Other times we may want
to do the reverse, and swap out a single chunk for a collection of
components already defined in a stratum; this requires copying or
duplicating existing definitions.
V10 attempts to answer this by replacing system and stratum with a
syntactic type called 'assemblage'. Assemblages are objects which
contain a contents list, which is a heterogeneous list of either
assemblage or chunk. In type theory parlance, where previously systems
held a field strata: List<Stratum>, and strata held a field chunks:
List<Chunk>, assemblages hold contents: List<Either Assemblage Chunk>.
Chunks and assemblages within a contents list can be made to depend on
each other. The visualisation of base-system in the links above
indicates one such factorisation - where previously base-system
duplicated the strata in minimal-system, here it need only include
minimal-system as an element of its contents and have foundation depend
on it. One further factorisation would be to collate the non-bsp strata
of minimal into, say, "minimal-stack", putting only the architecture
specific bsps at the system level, dependent on minimal-stack, enabling
reuse of "minimal-stack".
Since the type of contents: is uniform across the board, assemblages can
be programmatically sorted, manipulated, and edited to form new
topologies. In the defslib example, the Actuator.build_assemblage()
function works by flattening the assemblage recursively into a list of
chunks, and sorting them topologically so that iterating over the
resulting list is also a sound build order.
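As a minimal sketch of this flatten-and-sort idea (illustrative only -
this is not defslib's actual code, the dicts merely mimic parsed YAML,
and Actuator.build_assemblage() may differ in detail):

    def flatten(assemblage):
        """Recursively expand the heterogeneous contents list into a
        flat list of chunks."""
        chunks = []
        for item in assemblage['contents']:
            if item['kind'] == 'assemblage':
                chunks.extend(flatten(item))
            else:
                chunks.append(item)
        return chunks

    def toposort(chunks):
        """Order chunks so each appears after its build-depends, making
        iteration over the result a sound build order.  (Cycle
        detection is omitted for brevity.)"""
        by_name = {c['name']: c for c in chunks}
        ordered, seen = [], set()
        def visit(name):
            if name in seen:
                return
            seen.add(name)
            for dep in by_name[name].get('build-depends', []):
                visit(dep)
            ordered.append(by_name[name])
        for name in by_name:
            visit(name)
        return ordered

    minimal = {'kind': 'assemblage', 'name': 'minimal-system', 'contents': [
        {'kind': 'chunk', 'name': 'stage1-binutils'},
    ]}
    base = {'kind': 'assemblage', 'name': 'base-system', 'contents': [
        minimal,
        {'kind': 'chunk', 'name': 'foundation',
         'build-depends': ['stage1-binutils']},
    ]}
    print([c['name'] for c in toposort(flatten(base))])
    # -> ['stage1-binutils', 'foundation']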
System and stratum will still form part of the ontology of baserock,
with a dependent meaning. A stratum is an assemblage that exists as a
component in another assemblage. A system is an assemblage that boots.
V10 also does not preclude continuing to define everything as has been
done in multiple systems, should the user prefer.
So now we can treat chunks and strata equally as dependencies of
each other, if I understand correctly; or at least a stratum can depend
on chunks and other strata (or assemblages). This is something I had
been hoping for.
I'm a bit unclear about this, though; you mention that:
"Chunks and assemblages within a contents list can be made to depend
on each other"
Does the above mean that a Chunk can itself indeed depend on an
Assemblage? You also say:
"A stratum is an assemblage that exists as a component in
another assemblage. A system is an assemblage that boots."
If a chunk can also depend on assemblages as well as other chunks, is
the Chunk itself not in a sense also an assemblage? Or at least one
could say that the chunk and assemblage share some common properties,
probably the same common properties as a stratum shares with a system?
Whatever the answer is about the Chunk's relation to an Assemblage
above, I tend to agree with Sam that we should simply drop the concept
of "stratum" entirely, if the only distinction is from where the
assemblage is included from.
Extra note: some important details, I think, were left out of this email.
As far as I understand, the new data model adds some more features,
specifically the include-mode: this feature allows us to define the
scope of included chunks/assemblages and whether they impact the
product of the assemblage being built.
In simple terms: it allows us to include busybox while building an
assemblage that provides coreutils, in such a way that busybox will not
be forced upon the consumers of the coreutils assemblage.
This is noteworthy because our "from scratch" build topology
necessitates rebuilds of various chunks along the way, using tooling
like busybox which might only be required during a given build phase -
this necessity sets us apart from other self-hosting build systems, and
as such this is a very important feature that comes with V9 or V10.
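To make the busybox example above concrete - I am guessing at both the
field name and its values, so treat this purely as an illustration of
the idea:

    # Hypothetical 'include-mode' field: build-time-only contents take
    # part in the build order but are filtered out of the artifact.
    coreutils_stack = {'kind': 'assemblage', 'name': 'coreutils-stack',
                       'contents': [
        {'kind': 'chunk', 'name': 'busybox',
         'include-mode': 'build-only'},          # bootstrap tooling only
        {'kind': 'chunk', 'name': 'coreutils',
         'build-depends': ['busybox']},
    ]}

    def artifact_contents(assemblage):
        """Consumers see only the non-build-only contents, so busybox
        is not forced upon consumers of the coreutils assemblage."""
        return [c['name'] for c in assemblage['contents']
                if c.get('include-mode') != 'build-only']

    print(artifact_contents(coreutils_stack))   # -> ['coreutils']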
That said, I'm generally very happy with the new features and changes,
and am a bit more concerned at this point with shelling out a sane and
documented API for defslib itself.
I think we should be examining the requirements of YBD and forming a
defslib API which caters to the needs of YBD while refactoring YBD to
use defslib for the heavy lifting, instead of trying to write a
separate tool alongside YBD.
Analysis of the existing json schema indicated that it didn't handle
type checking files with any level of rigor. Due to the fact that
"morph:" fields allow for self-updating dictionaries, very few fields
can actually be guaranteed to exist in, say, a file containing chunk
build instructions. This, as documented in the defslib README, makes
checking individual files pointless, and also led to ybd being unable to
use the schema itself to type check incoming data, instead relying on
its own field validation. Defslib will currently parse V10 using the
MorphologyResolver, which is able to validate *fully resolved
assemblages*, that is, assemblages that have had their morph: references
inserted and the morph: field popped out. This allows the entire
assemblage to be type checked, giving some assurance that the resulting
structure is something that can be understood by a build process, or
other logic.
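As a rough illustration of the resolve-then-validate flow (this is not
MorphologyResolver's actual API, and the schema fragment is invented):

    # Invented sketch of "resolve, then validate": only a fully
    # resolved tree can promise that any given field exists.
    import copy
    import jsonschema

    SCHEMA = {'type': 'object',          # stand-in, not the V10 schema
              'required': ['name', 'kind']}

    def resolve(node, load_morphology):
        """Inline each 'morph:' reference (popping the field) so the
        assemblage tree is fully concrete before validation."""
        node = copy.deepcopy(node)
        if 'morph' in node:
            node.update(load_morphology(node.pop('morph')))
        if 'contents' in node:
            node['contents'] = [resolve(c, load_morphology)
                                for c in node['contents']]
        return node

    # Raises jsonschema.ValidationError on a malformed resolved tree:
    jsonschema.validate(
        resolve({'kind': 'assemblage', 'name': 'base',
                 'contents': [{'morph': 'strata/core.morph'}]},
                lambda path: {'kind': 'chunk', 'name': path}),
        SCHEMA)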
Defslib will attempt to build with sandboxlib chroot using "sudo
./quick-check.sh". I have not let this run very far yet, so more to
come.