On Fri, Nov 20, 2015 at 04:15:17PM +0000, Tiago Gomes wrote:
On 19/11/15 16:32, Richard Maw wrote:
>On Tue, Nov 17, 2015 at 02:54:24PM +0000, Tiago Gomes wrote:
>>Hi,
>>
>>After pondering about this a bit more, I suggest a new layout:
>
>Sorry it took so long for me to reply.
No worries. Thanks for providing feedback.
>
>>⋮
>>chunks:
>>- name: foo
>> morph: strata/base/foo.morph
>> repo: upstream:foo
>> ref: cafef00d…
>> extra-sources:
>
>I pondered this approach when I put together the proposal.
>
>My justification for not doing it this way is that I wanted to be able to have
>no top-level source, or multiple top-level sources.
>
>Allowing no top-level sources would be handy for builds which are for combining
>the results of installed components, which would be a way to handle
>gdk-pixbuf-loaders or ld.so.cache file generation.
>
>Allowing multiple top-level sources would be handy for building incestuously
>linked projects, or projects that are difficult to bootstrap independently,
>such as gcc and its support libraries, or clang and llvm.
I am not a fan of having multiple top-levels. That would involve
putting all the chunk repo and ref fields in a list, despite most of
the chunks only having a single source.
It's one extra line and a small amount of indentation,
but you could keep the old syntax when you have just 1 source,
and decide how to parse it based on the presence of a top-level sources key.
I'd just rather avoid stuffing more stuff into the top-level format,
when it might make more sense to take a step back.
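
To illustrate deciding how to parse based on the presence of a sources key, here is a minimal sketch. It is not morph's actual code: the helper name `normalise_sources`, the plain-dict layout, and the use of '.' as the checkout path for the old single-source syntax are all assumptions made for the example.

```python
def normalise_sources(chunk):
    """Return a list of source dicts for one chunk definition.

    Hypothetical helper: accepts either the old single-source syntax
    (repo/ref directly on the chunk) or a 'sources' list.
    """
    if 'sources' in chunk:
        # New syntax: every source is listed explicitly with its path.
        return [dict(source) for source in chunk['sources']]
    # Old syntax: one implicit source. Checking it out at the build
    # root ('.') is an assumption of this sketch.
    source = {'repo': chunk['repo'], 'path': '.'}
    if 'ref' in chunk:
        source['ref'] = chunk['ref']
    return [source]
```

With this shape, the rest of the tool would only ever see the list form, whichever syntax the definitions used.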
Also, if there were multiple top-level sources, from which directory
would the build commands run if there isn't a parent source
directory?
You'd still run from /foo.build, but for multiple top-level sources you'd
specify a path for all of them, and have a tree looking like you do below.
I don't know about gdk-pixbuf-loaders, but gcc supports specifying
the location of the support libraries at configure time, so you
could clone them to the gcc dir and use them from there.
Clang build instructions also have it building inside a parent
source directory (llvm) [1].
That assumes that the build system deals nicely with extra content being
present. You could have one which decides that it's not a "clean" build and
so tags it differently, and then people get concerned that there may be
local patches which aren't tracked.
I have independently seen projects put an unclean marker in their version
string if there is additional content in the repository,
and I have seen people get concerned about the marker,
so it's not a theoretical concern.
That said, if there are really projects which require a set of
sources cloned side-by-side, a way of dealing with this would be
supporting a `path` field for the parent source as well. So that
chunks:
- name: foo
  repo: delta:foo
  path: foo
  extra-sources:
  - repo: delta:bar
    path: bar
  - repo: delta:qux
    path: qux
I'd assert that this looks more consistent and is the same length:
chunks:
- name: foo
  sources:
  - repo: delta:foo
    path: foo
  - repo: delta:bar
    path: bar
  - repo: delta:qux
    path: qux
would produce the following directory structure in staging area:
foo.build/
├── bar/
├── foo/
└── qux/
This looks good to me.
<snip>
>The justification for it being mandatory was to reduce the number of
>operations that you need to have the chunks' source trees available for, so
>that we can rid ourselves of the code required to fetch the sources in.
I am not sure which code you are talking about. AFAIK, if there is
a configured morph-cache-server, you only need the sources if you
need to build them.
Or if the morph-cache-server is unreachable.
If morph-cache-server API grows to support getting the submodule
commit given a certain path, there would be no need to fetch the
sources.
This still requires a round-trip to the morph-cache-server and back.
Latency can be more important than the amount of data transferred,
and if you can't know in advance whether you need to check with the remote
server, then you have to check every time, which makes things slow
and makes it impossible to work offline.
>>This avoids the hassle of requiring the user to do a `git ls-tree`
>>by themselves when writing the definitions, and makes it possible to
>>override the submodule commit used without needing the former
>>'submodule-commit' field described by Richard.
>
>The purpose of having both the ref and the submodule-commit to specify both the
>check and the override, was so that when you update the ref of the parent
>repository and that happens to update the submodule, the tool forces you to
>verify whether you still want to override the submodule.
I understood that. With my suggestion there is no need for that
extra field, as if the user specified a ref for the submodule, that
will be the ref used independent of the parent ref given.
I can see that,
but I still think the consistency check has value,
and an extra field is required if you have the consistency check.
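
One possible reading of that consistency check, as a sketch only: 'submodule-commit' holds the commit the parent was last known to record for the submodule, and 'ref' is the override. The function name, the exception, and the idea that the caller has already looked up `recorded_commit` (for example with `git ls-tree` on the parent ref) are assumptions for illustration, not the proposal's exact semantics.

```python
class SubmoduleMismatch(Exception):
    """The parent repo no longer records the commit the definitions expect."""


def resolve_submodule_ref(recorded_commit, source):
    """Pick the commit to use for a submodule source.

    recorded_commit: what the parent tree records for the submodule path
                     (assumed to be looked up by the caller).
    source:          dict with 'path', optional 'ref' (the override) and
                     optional 'submodule-commit' (the check).
    """
    override = source.get('ref')
    if override is None:
        # No override requested: use whatever the parent records.
        return recorded_commit
    expected = source.get('submodule-commit')
    if expected != recorded_commit:
        # The parent ref moved the submodule since the override was
        # written, so make the user confirm the override is still wanted.
        raise SubmoduleMismatch(
            '%s: parent records %s but definitions expect %s'
            % (source['path'], recorded_commit, expected))
    return override
```

Without the extra field there is nothing to compare `recorded_commit` against, which is why the check and the override need separate keys.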
>>'ref' will be a mandatory field if the extra source is not a
>>submodule of the parent repo.
>>
>>Recursive submodules will be handled by specifying recursive extra-sources.
>
>Isn't this redundant when you are already providing the path?
The idea of using recursive sources is to make it explicit which one
is the parent repo for getting the submodule commit id. The
alternative would be some kind of path detection.
chunks:
- name: foo
  repo: delta:foo
  extra-sources:
  - repo: bar
    path: bar
  - repo: qux
    path: bar/apath/qux
For example, let's suppose that qux is a submodule of bar. I could
perhaps find the parent repo for qux by checking which extra-source
has a path that shares the most initial path components.
For what it's worth,
I'd prefer it to use the path rather than nesting the data structures.
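
As a rough sketch of the path-based detection described above (sharing the most initial path components), assuming each extra source is a dict with a relative, '/'-separated 'path'. The function name `find_parent` is hypothetical and this is not taken from morph.

```python
import posixpath


def find_parent(source, sources):
    """Return the source whose path is the longest proper ancestor of
    source's path, or None if no listed source contains it."""
    parts = posixpath.normpath(source['path']).split('/')
    best, best_len = None, 0
    for other in sources:
        if other is source:
            continue
        other_parts = posixpath.normpath(other['path']).split('/')
        # A parent must be strictly shallower and match component-wise,
        # so 'bar' contains 'bar/apath/qux' but not 'barn/qux'.
        if (len(other_parts) < len(parts)
                and parts[:len(other_parts)] == other_parts
                and len(other_parts) > best_len):
            best, best_len = other, len(other_parts)
    return best
```

A result of None would mean the submodule commit has to come from the chunk's top-level repo rather than from another extra source.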
>You can determine the order in which sources need to be extracted by which
>paths contain each other.
>
>I considered a nested sources structure when I wrote my proposal,
>but decided since I needed to put some form of path in anyway,
>and the dependency could be derived from that,
>that I'd prefer to keep the structure flat.
>
>Of course, since you're encoding the sources in a list,
>you could require that sources are made available in that order.
That's how I thought it would work and why it is a list. The user
would be responsible for providing a correct order.
I'm struggling to think of a case where topologically sorting based on
the path would be wrong and a manual ordering right, but that doesn't
mean there isn't one.
Potentially you could make source preparation faster if you could unpack
in topology order rather than in listed order, as there is some level
of parallelism possible.
I think it unlikely that we'd need that level of optimisation,
but I'm wary of selecting an implementation that would prevent it.
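
To make the parallelism point concrete, a sketch of ordering by path containment: sources are grouped into batches, where everything in one batch could be unpacked concurrently once the earlier batches are done. It assumes distinct, relative, '/'-separated paths with no source at the root; `unpack_batches` is a made-up name.

```python
import posixpath


def unpack_batches(sources):
    """Group sources so that each batch depends only on earlier batches."""
    paths = [posixpath.normpath(s['path']) for s in sources]

    def nesting(i):
        # Number of other listed sources whose path contains this one.
        return sum(1 for j, p in enumerate(paths)
                   if j != i and paths[i].startswith(p + '/'))

    batches = {}
    for i, source in enumerate(sources):
        batches.setdefault(nesting(i), []).append(source)
    # Every ancestor of a source has a strictly smaller nesting count,
    # so emitting batches in increasing order is a topological order.
    return [batches[level] for level in sorted(batches)]
```

Flattening the batches gives a valid sequential order, so a list-order implementation could adopt this later without changing the definitions format.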
>>Thoughts?
<snip>
>It looks like you are the one who intends to implement it,
>and I'm glad of that.
>
>If you think your way is better that's fine,
>I just wanted to make sure that you were aware of why my proposal differed.
There is a WIP implementation at [2]. I am happy to abandon the
branch if my changes are considered problematic. But I would like
to know what other people think.
T.
[1]: http://clang.llvm.org/get_started.html
[2]: http://git.baserock.org/cgi-bin/cgit.cgi/baserock/baserock/morph.git/log/...
I'll make a note to look at it when I have time.