Unpacked chunk directory corruption: how to detect
by Lars Wirzenius
Moving discussion from a private list to public.
Someone was having problems with their cache directory containing an
unpacked chunk that was corrupted. I don't know if the problem was
that some files were missing, or that they had been modified.
Quoting:
> so the answer for this strange bug turned out to be that somehow
> something in /src/tmp/cache/alsa-lib* had been corrupted, which meant
> that the alsa-lib artifacts being created in /src/cache/artifacts were
> broken, hence alsa-utils did not build.
>
> It seems to me that we need some checksum/validation for getting from
> /src/tmp/cache/* to artifacts in this process - there's no way a normal
> user could figure this out in a reasonable amount of time afaict.
I agree it would be good to do this. An easy implementation would be
like this:
* when we unpack a chunk into FOO.d, we also write a FOO.manifest,
which lists the name and cryptographic checksum of each regular file
in FOO.d
* when hardlinking, we read in the manifest, and verify the checksums
before hardlinking into the staging area
The problem here is that the hardlinking is there to make things fast,
and checksumming everything requires reading all the files, and that's
exactly what we're trying to avoid.
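For comparison, the checksum-based manifest could be sketched roughly as follows. This is only an illustration: the mail says "cryptographic checksum" without naming one, so SHA-256 and the function name checksum_manifest are assumptions. It also makes the cost visible, since every byte of every regular file is read.

```python
import hashlib
import os


def checksum_manifest(start_dirname):
    """Map each regular file's relative path to its SHA-256 hex digest.

    SHA-256 is an assumption; any cryptographic checksum would do.
    """
    result = {}
    for dirname, subdirs, basenames in os.walk(start_dirname):
        for basename in basenames:
            pathname = os.path.join(dirname, basename)
            if os.path.islink(pathname) or not os.path.isfile(pathname):
                continue  # only regular files are checksummed
            digest = hashlib.sha256()
            with open(pathname, 'rb') as f:
                # Read in blocks so large files do not need to fit in memory.
                for block in iter(lambda: f.read(65536), b''):
                    digest.update(block)
            relative = os.path.relpath(pathname, start_dirname)
            result[relative] = digest.hexdigest()
    return result
```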
So instead I'll suggest a different manifest:
* store the pathnames (within FOO.d) for each file in the manifest,
along with basic stat information (type, mtime, size); none of that
is meant to change in the unpacked chunk, ever
* before hardlinking into the staging area, go through the manifest
and verify that everything still has the same basic stat information
This should catch most corruption (especially missing files), without
costing too much, though I admit I haven't done any benchmarks, since
my laptop's doing a heavy build right now. It won't catch the case of
files being modified in a way that preserved their mtime, but that
should be rare: it'll either be quite a stupid program, or filesystem
corruption.
We could, additionally, have a Morph subcommand that verifies all the
unpacked chunks, but that may be overkill.
Opinions?
PS. The code to gather the manifest data would be something along the
lines of this:
    import os

    def manifest(start_dirname):
        # Map each path (relative to start_dirname) to its basic stat info.
        def add(obj, pathname):
            st = os.lstat(pathname)  # lstat: do not follow symlinks
            relative = os.path.relpath(pathname, start_dirname)
            obj[relative] = {
                'st_mtime': st.st_mtime,
                'st_size': st.st_size,
            }

        obj = {}
        for dirname, subdirs, basenames in os.walk(start_dirname):
            add(obj, dirname)
            for basename in basenames:
                add(obj, os.path.join(dirname, basename))
        return obj
The return value should be easy to store as JSON, then read back,
and then compared with another return value. Any difference would
indicate corruption, which can be handled either by giving an error
message and terminating, or giving a warning, deleting the corrupted
FOO.d, and unpacking it again.
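A minimal sketch of that store-and-compare step might look like this. The names save_manifest and find_corruption, and the idea of keeping the manifest file next to FOO.d, are assumptions made for illustration; the manifest function is repeated in compact form so the example is self-contained.

```python
import json
import os


def manifest(start_dirname):
    """Collect mtime and size for every directory and file under start_dirname."""
    obj = {}
    for dirname, subdirs, basenames in os.walk(start_dirname):
        pathnames = [dirname] + [os.path.join(dirname, b) for b in basenames]
        for pathname in pathnames:
            st = os.lstat(pathname)
            obj[os.path.relpath(pathname, start_dirname)] = {
                'st_mtime': st.st_mtime,
                'st_size': st.st_size,
            }
    return obj


def save_manifest(unpacked_dir, manifest_path):
    """Write the manifest as JSON; call this right after unpacking."""
    with open(manifest_path, 'w') as f:
        json.dump(manifest(unpacked_dir), f)


def find_corruption(unpacked_dir, manifest_path):
    """Return the sorted relative paths that are missing, added, or changed."""
    with open(manifest_path) as f:
        expected = json.load(f)
    actual = manifest(unpacked_dir)
    return sorted(
        path for path in set(expected) | set(actual)
        if expected.get(path) != actual.get(path))
```

An empty list from find_corruption means the unpacked chunk still matches its manifest; anything else is a candidate for the error or the delete-and-unpack-again handling described above.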
--
http://www.codethink.co.uk/ http://wiki.baserock.org/ http://www.baserock.com/
9 years, 6 months
[PATCH 0/9] Morph: Fix test suite so "./check --full" works
by Lars Wirzenius
repo: git://git.baserock.org/baserock/baserock/morph
branch: liw/fix-check-2-rebase
commit: 2c4752d71d35ec31221dcdab1289d2084936ccae
land: master
card: 10663
This patch series makes Morph's test suite, "./check --full", work
again. I haven't tracked down when it actually broke, but there are
several problems being fixed by these patches, and some of them are
ancient.
* We now need to set PYTHONPATH to point at a checkout of cliapp
with a current version of the baserock/morph branch. For this to
work, the morph getting run by the various tests must inherit it
from what the user sets when invoking ./check.
Without these patches, setting PYTHONPATH doesn't work for yarns,
because a) yarn cleans up the environment for running tests (for
reproducibility) and b) the shell library we have for Morph yarns
overrides the PYTHONPATH it gets.
These patches use the yarn --env option to pass in PYTHONPATH to
yarns, so that yarn allows it in the test environment, and change
the shell library to add to, rather than override, PYTHONPATH when
it runs morph from the source tree. This way, when the user sets
PYTHONPATH when running ./check, the morph that gets invoked by yarn
steps includes the user's PYTHONPATH (with additions).
* PYTHONPATH can't contain pathnames that contain colons: it uses
colon as a separator and there is no escape mechanism. Morph uses
colons in repo aliases, and it uses repo aliases as directory names
when setting up the workspace. As a result, the PYTHONPATH, as
augmented by the yarn shell library, won't actually work as
intended, when you run ./check in a checked out system branch in a
workspace. Worse, if you "morph edit" to get the right version of
cliapp, you can't set PYTHONPATH to point at that, either.
As a workaround, we can rename (mv) any directories in the workspace
with colons in them. Morph will work fine with that, since it
doesn't assume names for checked out git repositories, but scans the
workspace instead.
Most of the patches in this series change Morph so that it avoids
colons in repo aliases, and repo URLs, when determining paths in the
workspace. This will change the repository layout: where we used to
have master/baserock:baserock/definitions, we now have
master/baserock/baserock/definitions.
* The tests.deploy/deploy-cluster.script cmdtest test always fails,
since it produces output and the corresponding .stdout file is
missing from git. I have disabled the test, since I didn't want to
try to figure out what the actual right output is (as opposed to
what the script now outputs). It should also be converted into yarn,
so I figured I'd be lazy and disable it now, and help convert it to
yarn later.
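To illustrate the PYTHONPATH points above, here is a small sketch. None of it is taken from the patches: the helper names are hypothetical, the colon separator is the POSIX one, and the example paths are made up. It shows why a directory name containing a colon cannot survive in PYTHONPATH, how a repo alias can be turned into a colon-free path, and what "add to, rather than override" means for the shell library.

```python
SEPARATOR = ':'  # PYTHONPATH separator on POSIX; there is no escape mechanism


def split_search_path(value):
    """Split a PYTHONPATH-style string into its entries."""
    return value.split(SEPARATOR)


def workspace_dirname(repo_alias):
    """Hypothetical helper: turn a repo alias into a colon-free relative path."""
    return repo_alias.replace(':', '/')


def extend_pythonpath(srcdir, existing):
    """Put srcdir in front of the user's PYTHONPATH instead of replacing it."""
    if existing:
        return srcdir + SEPARATOR + existing
    return srcdir


# A workspace path with a colon in it is silently split into two entries:
print(split_search_path('/src/ws/master/baserock:baserock/definitions'))
# The alias from the mail, converted to the new layout:
print(workspace_dirname('baserock:baserock/definitions'))
```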
While I was poking around in the test suite, I found additional things
to fix. These aren't strictly necessary to get ./check to work, but we
should fix them anyway.
* Our shell library no longer needs to set SRCDIR explicitly, since yarn
now does it for us. This removes some code that is effectively dead.
* There's an assertTrue that should be assertEqual in a unit test. An
easy, but unfortunate mistake. This fixes the test, making it
actually do something useful.
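The assertTrue/assertEqual mix-up is easy to reproduce. The following is a generic illustration, not the actual test from Morph: assertTrue treats its second argument as the failure message, so a call that was meant to compare two values passes whenever the first one is truthy.

```python
import unittest


class AssertExample(unittest.TestCase):

    def test_assert_true_hides_mismatch(self):
        # The second argument is the failure message, not an expected value,
        # so this passes even though the two strings differ.
        self.assertTrue('actual', 'expected')

    def test_assert_equal_compares(self):
        # assertEqual really compares, so the same mismatch is reported.
        with self.assertRaises(AssertionError):
            self.assertEqual('actual', 'expected')
```

Saved to a file, this can be run with "python -m unittest"; both tests pass, which is exactly the problem with the first one.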
With these patches, and a checkout of the cliapp baserock/morph
branch, I can successfully run the following, on a Baserock 13
development system:
PYTHONPATH=/home/root/cliapp ./check --full
We're currently in a development cycle where it's OK to break
backwards compatibility, so it may be necessary to point out that the
fixes above are necessary even if you run the test suite on the
current Baserock master.
I further note that when you submit a patch for merge, you should do
this before sending (note the --full argument to ./check):
git checkout master
git merge --no-ff --no-commit YOUR/BRANCH
./check --full # add PYTHONPATH as necessary
If that works successfully, you then do "git reset --hard" to revert
the merge-in-progress, and send the patches.
When you merge, you do the same sequence, and only commit and push the
merge if the test suite succeeds.
The full test suite takes a while to run (about 8 minutes in a VM on
my laptop). We should fix that, rather than use it as an excuse to not
run it.
The last patch in this series is huge, about a thousand lines, but it
is very straightforward, I hope.
Lars Wirzenius (9):
Remove setting of SRCDIR in morph.shell-lib
Pass in user's PYTHONPATH to morph when run from yarns
Disable test so that "./check --full" passes
Fix assert in unit test
Fix system branch dirname generation to avoid colons
Fix pathname (colon to slash) in test implementations
Fix directory names for chunks to use slashes, not colons
Convert colons to slashes for chunk name
Fix paths for chunk directories in cmdtests
check | 15 +++-
morphlib/sysbranchdir.py | 10 ++-
morphlib/sysbranchdir_tests.py | 6 +-
morphlib/workspace_tests.py | 4 +-
tests.as-root/branch-from-image-works.script | 2 +-
.../build-handles-stratum-build-depends.script | 4 +-
tests.as-root/build-with-external-strata.script | 4 +-
...iple-times-doesnt-generate-new-artifacts.script | 4 +-
...system-branch-picks-up-committed-removes.script | 12 +--
...stem-branch-picks-up-uncommitted-changes.script | 12 +--
.../building-a-system-branch-works-anywhere.script | 6 +-
.../building-creates-correct-temporary-refs.script | 10 +--
...hology-contents-do-not-change-cache-keys.script | 6 +-
tests.branching/add-then-edit.script | 6 +-
tests.branching/ambiguous-refs.script | 6 +-
...reates-new-system-branch-not-from-master.script | 6 +-
...reates-new-system-branch-not-from-master.stdout | 11 +--
.../branch-creates-new-system-branch.script | 6 +-
.../branch-creates-new-system-branch.stdout | 9 ++-
tests.branching/branch-works-anywhere.script | 4 +-
tests.branching/branch-works-anywhere.stdout | 90 ++++++++++++----------
tests.branching/checkout-existing-branch.script | 4 +-
tests.branching/checkout-existing-branch.stdout | 9 ++-
tests.branching/checkout-works-anywhere.script | 4 +-
tests.branching/checkout-works-anywhere.stdout | 27 ++++---
.../edit-checkouts-existing-chunk.script | 4 +-
tests.branching/edit-handles-submodules.script | 4 +-
tests.branching/edit-updates-stratum.script | 4 +-
...repository-stored-in-cloned-repositories.script | 8 +-
tests.branching/petrify-no-double-petrify.script | 4 +-
tests.branching/petrify.script | 6 +-
tests.branching/status-in-dirty-branch.script | 4 +-
tests.branching/tag-tag-works-as-expected.script | 4 +-
.../workflow-separate-stratum-repos.script | 16 ++--
tests.branching/workflow.script | 6 +-
tests.deploy/deploy-cluster.script | 5 ++
tests.deploy/deploy-rawdisk.script | 4 +-
tests.deploy/setup-build | 6 +-
yarns/branches-workspaces.yarn | 10 +--
yarns/implementations.yarn | 50 ++++++------
yarns/morph.shell-lib | 29 ++++---
41 files changed, 246 insertions(+), 195 deletions(-)
--
1.8.4
9 years, 6 months
[PATCH 0/2] Two branch and merge fixes
by Sam Thursfield
Repository: git://git.baserock.org/baserock/baserock/morph
Ref: sam/branching-fixes
Sha1: fbce0142fb9ff9c9187cdcb67bc3a2981e78ff69
Sam Thursfield (2):
Raise correct error on `morph checkout|branch` of repo with no morphs
Fix `morph petrify` in cases where root repo URL has a trailing /
morphlib/gitdir.py | 3 ++-
morphlib/gitdir_tests.py | 8 ++++++--
morphlib/plugins/branch_and_merge_new_plugin.py | 4 ++--
3 files changed, 10 insertions(+), 5 deletions(-)
--
1.8.5.3
9 years, 6 months
Adding system users and groups
by Sam Thursfield
Hi!
This mail is a followup to
http://vlists.pepperfish.net/pipermail/baserock-dev-baserock.org/2014-Mar...
... I think it merits its own discussion.
We have a set of system users and groups in the fhs-dirs chunk:
http://git.baserock.org/cgi-bin/cgit.cgi/baserock/baserock/fhs-dirs.git/t...
http://git.baserock.org/cgi-bin/cgit.cgi/baserock/baserock/fhs-dirs.git/t...
This is a pretty simplistic approach. We want the next release of
Baserock to be upgradable, so we need to make sure we're covered in some
manner now.
As I see it, we have the following requirements.
1. The ability to add users and groups at runtime, and have them persist
across upgrades with no manual steps required.
- for trove-early-setup, which currently creates the necessary
Trove daemon users on first-boot
- for multiuser Baserock
- these must persist across upgrades without manual fixups
2. The ability to add new system users in a new version of Baserock, and
upgrade current systems to that new version with no manual steps required.
- e.g. if we add a new service to Trove that needs a new user.
Problems with the current approach:
- changing the set of system users and groups causes a rebuild of
everything
- systems end up with users they don't need. Trove has groups and
users for Pulseaudio, for example.
- it's impossible to automate merges of /etc/passwd and /etc/group
if we allow both arbitrary user changes and arbitrary system changes.
If we do nothing, we are actually OK. The current upgrades mechanism
makes it *possible* for us to run arbitrary code on upgrades, and these
hooks can be conditional on, say, a certain tag of the 'definitions'
repo. So we can bodge future changes into working that way. I'd prefer
not to have to, however.
Here are a couple of ideas to make the situation better:
1. Store the system users and groups separately from the site-specific
ones. This removes the need to merge /etc/passwd and /etc/group at all.
There is existing code to do this already, which is in use in OSTree
systems. I've verified in a gnome-ostree VM that you can add a
user, upgrade, and the user still exists.
https://github.com/aperezdc/nss-altfiles
https://people.gnome.org/~walters/ostree/doc/lib-passwd.html
As I understand it, to implement this we'd need to add an 'nss-altfiles'
chunk in 'core' and alter the 'fhs-dirs' chunk to install /lib/passwd,
with /etc/passwd being empty. It would take perhaps a day of work, given
all the rebuilding and testing that would be necessary, but I think it
would save us a whole week of trying to write post-upgrade hooks later on.
2. Don't add chunk-specific system users in 'fhs-dirs'.
It would be nice if Morph could handle this for us, either by allowing
chunks / strata to specify what users and groups they need, or (as a
more generic fix) by allowing chunks / strata to append lines to
existing files in the system. That's quite a bit of work.
A quicker solution would be to add the extra system users and groups in
a configure extension, which would at least solve the problem of the
massive rebuild and of Trove having a 'pulse' user.
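The first idea (keeping system users apart from site-specific ones) can be modelled in a few lines. This is only a toy model of the lookup, not how nss-altfiles is implemented: real lookups go through the C library's NSS machinery, and the order in which the two files are consulted is an assumption here. The function names and the uid values in the usage example are made up.

```python
def parse_passwd(text):
    """Parse passwd-format text into a dict of user name -> uid."""
    users = {}
    for line in text.splitlines():
        if not line or line.startswith('#'):
            continue
        fields = line.split(':')
        users[fields[0]] = int(fields[2])
    return users


def lookup_user(name, site_text, system_text):
    """Look in the site-specific file first, then fall back to the system one."""
    for text in (site_text, system_text):
        users = parse_passwd(text)
        if name in users:
            return users[name]
    return None


# Illustrative contents: /etc/passwd holds only site users, /lib/passwd
# holds the system users shipped with the release.
site = 'alice:x:1000:1000::/home/alice:/bin/sh\n'
system = 'root:x:0:0::/root:/bin/sh\npulse:x:110:110::/:/bin/false\n'
print(lookup_user('alice', site, system))
print(lookup_user('pulse', site, system))
```

Because an upgrade only replaces the system file, nothing needs merging: users added at runtime stay in the site file untouched.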
I don't think this needs solving now, but I think we should add a note
on the wiki or in the 'fhs-dirs' chunk linked to this thread.
Thanks!
Sam
--
Sam Thursfield, Codethink Ltd.
Office telephone: +44 161 236 5575
9 years, 6 months
Baserock edit-test-debug cycles
by Sam Thursfield
Hi all
Lars' "Fix test suite so "./check --full" works" patch highlights a
couple of things in our workflows that slow me down quite a lot.
I'm bringing them up here largely so they don't get forgotten.
1. Paths in workspace directories
---------------------------------
It's very time-consuming to navigate directory trees with 'cd' when
they look like this:
/src/ws/baserock/samthursfield/fix-this-thing/baserock/baserock/definitions
For private Baserock-based projects, this becomes even worse:
/src/ws/git@git.$company.co.uk/git/$project/samthursfield/fix-this-thing/$company/$project/definitions
Is there any thinking already on how we solve this?
2. Yarn
-------
I find Yarn quite frustrating when it is part of my edit-run-debug cycle
because of the lengthy commandline I have to remember: something like
yarn -s yarns/morph.shell-lib yarns/implementations.yarn \
    yarns/thing-I-currently-care-about.yarn
To actually debug test failures there is even more required:
rm -rf t.tmp && mkdir t.tmp && yarn -s yarns/morph.shell-lib \
    yarns/implementations.yarn yarns/thing-I-currently-care-about.yarn \
    --snapshot --tempdir=t.tmp
I also don't seem to be able to run a single test without actually
deleting all of the other tests from the .yarn file. Since the tests for
e.g. deploy can take minutes, that's important.
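As a stopgap for the long command line, a small wrapper could build it. This is a sketch under assumptions: the function name yarn_command is made up, and it uses only the files and options quoted above. It does not solve the single-test problem.

```python
def yarn_command(yarn_name, tempdir=None):
    """Build the yarn argv quoted above for one .yarn file under yarns/."""
    argv = [
        'yarn', '-s', 'yarns/morph.shell-lib',
        'yarns/implementations.yarn',
        'yarns/%s.yarn' % yarn_name,
    ]
    if tempdir is not None:
        # Keep the test's working files around for debugging.
        argv += ['--snapshot', '--tempdir=%s' % tempdir]
    return argv
```

The resulting list can be passed to subprocess.check_call from the top of the Morph source tree.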
How should we approach fixing this? My ideal is to be able to run Yarn
something like this:
yarn --keyword="what i am currently developing"
Does that seem feasible? Will we ever get there?
Sam
--
Sam Thursfield, Codethink Ltd.
Office telephone: +44 161 236 5575
9 years, 6 months
Thoughts on recursive deployment
by Richard Maw
We've come up with a few use-cases for needing recursive deployments:
1. Including the base target system in an SDK so that we can
cross-compile binaries against it.
2. Baserock installer media, where we put one system inside another,
and the outer system is responsible for installing the inner one.
3. Deploying an NFS server with systems pre-installed, so you can
re-deploy the host and all the
4. Deploying a system with many sandboxed applications, each built as
a Baserock system.
e.g. Deploy a VM host, and VMs at the same time, deploying a system
containing docker containers, or deploying Baserock-in-a-box, which
produces a USB installer image, which creates a VM host, with a few
emergency backup Baserock VMs plus a Trove which contains nfs roots
for a distbuild cluster.
I've attached what I think future cluster morphologies could look like
to support this.
I've used yaml anchors, references and merges to keep it from getting
too wide. Talk around the office suggested that we may not want to tie
ourselves to YAML this way, so we may want another way of referencing
other deployments.
9 years, 6 months
[PATCH 0/2] Create help for write and configure extensions.
by mark.doffman@codethink.co.uk
From: Mark Doffman <mark.doffman(a)codethink.co.uk>
This patch set adds the ability to provide help for write and
configuration extensions. Since write and configuration extensions
are scripts, help can be farmed off to them.
When 'morph --help <extension>' is called it will in turn
call '<extension> --help' and show the result.
'morph help-extensions' will list all of the available extensions.
The advantage of this is that help for the extensions is tied
to the extension script rather than having to create a separate
text file for each one.
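The delegation described above amounts to something like the following sketch. The function name extension_help is hypothetical and is not taken from morphlib/extensions.py; it assumes the extension is an executable file that prints its help on stdout when given --help.

```python
import subprocess


def extension_help(extension_path):
    """Run '<extension> --help' and return whatever it prints on stdout."""
    output = subprocess.check_output([extension_path, '--help'])
    return output.decode('utf-8')
```

Note that this runs the extension directly on the host, which is exactly the situation the chroot question below is about.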
Questions:
Should the extensions be run in a chroot? I don't want
the extension scripts modifying the system if badly written.
Repo: ssh://git@git.baserock.org/baserock/baserock/morph.git
Branch: baserock/markdoffman/S10382/add-help-option-v3
SHA: 249592e84e875a89
Kanban Card: S10617
Mark Doffman (2):
Add utilities for listing and finding extensions.
Add write and configuration extensions to help.
morphlib/__init__.py | 1 +
morphlib/app.py | 77 +++++++++++---------
morphlib/extensions.py | 143 ++++++++++++++++++++++++++++++++++++++
morphlib/plugins/deploy_plugin.py | 49 +++----------
without-test-modules | 1 +
5 files changed, 200 insertions(+), 71 deletions(-)
create mode 100644 morphlib/extensions.py
--
1.8.4
9 years, 6 months
[PATCH 0/3] [definitions] Add write extension to create a SDK installer
by Richard Maw
Repo: git://git.baserock.org/baserock/baserock/definitions.git
Ref: baserock/richardmaw/S10618/sdk-installer
SHA1: f88ad5f8fa954b18a3e0d13e0a0aa94d89805124
Land: origin/master
This makes a binary blob that acts as an installer, like shar or makeself.
The installer unpacks the tarball, writes an environment file to set
variables like PATH, and uses sed and patchelf to make tools in the SDK
directory only use components from inside.
This adds a stratum for patchelf, as it's needed for the installer.
Patchelf is not required in the system that runs the installer, due to
clever trickery.
Richard Maw (3):
Add patchelf to toolchain systems
Add write extension for constructng sdk installer blobs
Add morphology for deploying the SDK.
...lhf-cross-toolchain-system-x86_32-generic.morph | 1 +
...lhf-cross-toolchain-system-x86_64-generic.morph | 1 +
cross-tools.morph | 10 +
sdk.morph | 12 ++
sdk.write | 208 +++++++++++++++++++++
5 files changed, 232 insertions(+)
create mode 100644 cross-tools.morph
create mode 100644 sdk.morph
create mode 100755 sdk.write
--
1.8.5.rc2
9 years, 6 months
[PATCHv5 0/7] Add script to manage parallel Baserock versions
by Pedro Alvarez
Repo: ssh://git@trove-baserock-org/baserock/baserock/tbdiff.git
Ref: baserock/pedroalvarez/trove-upgrades-rebase2
Sha1: 87b15c34fa2012dda414499246ed0819d7bfdb50
Card: S10393
This patch series also includes some fixes to baserock-system-config-sync
that we needed for implementing upgrades in Baserock.
Changes from v4
Suggestions applied.
The v4 of this patch series is:
[PATCHv4 0/7] Add script to manage parallel Baserock versions.
Pedro Alvarez (6):
Fix behaviour in bscs-merge when vUser and v2 don't have a file of v1
Fix error in the baserock-system-config-sync behaviour table.
Modify 'baserock-system-config-sync' to get two arguments using
'merge'
baserock-system-config-sync: Add some logging to the standard output.
baserock-system-config-sync: Force copy /etc/passwd and /etc/group
Add script to modify the bootloader and manage different parallel OS.
Sam Thursfield (1):
system-version-manager: Allow specifying custom path for
baserock-system-config-sync
Makefile.am | 3 +-
.../baserock-system-config-sync | 39 ++-
configure.ac | 3 +-
system-version-manager/Makefile.am | 20 ++
system-version-manager/system-version-manager | 310 ++++++++++++++++++++
.../upgrades.out/systems/version2/run/etc/file1 | 2 -
tests/run_tests.sh | 4 +-
7 files changed, 366 insertions(+), 15 deletions(-)
create mode 100644 system-version-manager/Makefile.am
create mode 100755 system-version-manager/system-version-manager
delete mode 100644 tests/bscs-merge.pass/upgrades.out/systems/version2/run/etc/file1
--
1.7.10.4
9 years, 6 months