TL;DR: The Ick controller can be changed in a straightforward manner
to use Muck for persistent data storage, except for log files. It
doesn't seem worthwhile to support an in-place conversion. Instead,
the few people affected can re-create their projects and pipelines on
a new, fresh demo Ick instance, which will replace the current one.
TS;RF (too short; rambling follows):
Currently, the Ick controller stores various resources directly in its
local filesystem. I wish to change the controller to use Muck instead.
The main motivation for this is to have better access control: users
of the same controller shouldn't see or be able to change projects or
pipelines for other users.
Muck is a JSON data store with access control. Part of the access
control is that every JSON object ("resource") is owned by a specific
user. Every user can only access their own objects, for now (although
this will become more flexible later).
Where the controller currently creates a resource for, say, the
project foo by storing it as YAML in the file
"/var/lib/ick/state/projects/foo", I plan to change it to create a
JSON resource in Muck like this:

    {
        "_type": "project",
        "_name": "liw/foo",
        ... # all other fields as in the current YAML file
    }
Muck invents a unique identifier for each object, and guarantees no
other object has that identifier. The controller will not use this,
and will instead use a search on the "_type" and "_name" fields to
find the right object. This is so that users may refer to projects and
pipelines using more humane names: "ick.liw.fi" instead of
"48053f4f-71d9-42a1-b3ca-8574cbb788aa" for example.
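The name-based lookup could look something like this sketch, where the
resource list stands in for the result of a Muck search and the helper
name is made up for illustration:

```python
def find_resource(resources, rtype, name):
    """Return the single resource matching _type and _name.

    `resources` stands in for what a Muck search would return;
    the _type and _name fields follow the scheme described below.
    """
    matches = [
        r for r in resources
        if r.get("_type") == rtype and r.get("_name") == name
    ]
    if len(matches) > 1:
        # Duplicate names are treated as an error by the controller.
        raise RuntimeError("duplicate name: %s" % name)
    return matches[0] if matches else None


# A project is found by its humane name; the Muck-invented
# identifier (_id) is carried along but not used for lookup.
resources = [
    {"_id": "48053f4f-71d9-42a1-b3ca-8574cbb788aa",
     "_type": "project", "_name": "liw/ick.liw.fi"},
]
project = find_resource(resources, "project", "liw/ick.liw.fi")
```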
Muck allows arbitrary JSON objects to be stored. For the Ick
controller, the following approach seems like a reasonable first
step:
* Each object will have a "_type" field, which specifies the type of
the object: project, pipeline, build, log, or worker. This is needed
so that one can search for "project foo", as opposed to "pipeline
foo".
* Each object will have a user-assigned "_name" field, which the
controller makes sure is unique and prefixed with the object owner's
username.
Muck does not have transactions that span multiple HTTP requests.
The controller will do its best to ensure a name is unique, but it
can't guarantee that. However, if it notices a name clash later, it
will treat that as an error. For example, if there are two projects
named "liw/ick.liw.fi", the controller will refuse to trigger
either. (This is a limitation in Muck and will be fixed later,
possibly by teaching Muck about user-defined names for resources,
and having it make sure they're unique. But that will have to wait
for a later version of Muck.)
* Depending on the type of the object, it may contain other fields as
well, see below.
* The controller creates objects in Muck based on API calls to the
controller, and passes on the access token it gets. Muck uses the
access token's "sub" field to assign ownership of the object; some
access tokens do not have such a field, and objects created with
them can thus not be accessed by any user; this will be the case for
workers.
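To make the ownership and naming rules concrete, here is a sketch of
how the controller might derive the "_name" prefix from the access
token's "sub" field. The JWT handling is simplified (no signature
verification, which would happen elsewhere) and the helper names are
made up:

```python
import base64
import json


def token_subject(access_token):
    """Extract the "sub" claim from a JWT-style access token.

    Signature verification is deliberately omitted; this only shows
    where object ownership comes from. Tokens without a "sub" claim
    yield None, and their objects belong to no user.
    """
    payload = access_token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore base64 padding
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims.get("sub")


def project_object(sub, short_name, fields):
    """Build a Muck object for a project, prefixing _name with the
    owner's name to keep names unique per user."""
    obj = {"_type": "project", "_name": "%s/%s" % (sub, short_name)}
    obj.update(fields)
    return obj
```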
The following object types will be supported initially:
* project - projects defined by the user
* pipeline - pipelines defined by the user
* build - a build that's been triggered, is running, or is finished
* log - a build log
* worker - a worker
The contents of these object types are as they are in the controller
now. Switching to Muck does not change that, except for logs, which
need special handling to avoid very bad performance.
Muck stores all incoming resources in a "change log". A build log may
get thousands of updates: each line of output may become an update,
and each update would store a new, nearly identical copy of the whole
log object in the change log, differing from the previous version
only by one line of new text. When a build produces a thousand lines
of output, Muck would store a thousand copies of the log object in
its change log, which is quite wasteful.
To avoid this waste, the controller will be changed to store the build
logs as follows:
* The worker-manager will send each build log snippet as a separate
update, as before, but it will also add a sequence number to the
update.
* The controller will create a separate log object in Muck for each
update. The object will contain only the new log snippet, plus a
reference to the build it is part of, and the sequence number.
* The controller will reconstruct the whole build log when a complete
log is requested, by fetching all log objects that refer to the
specified build, and concatenating the log snippets in order of the
sequence number.
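A minimal in-memory sketch of this scheme; the list stands in for
Muck, and the field names other than "_type" are assumptions:

```python
def store_snippet(muck, build_id, sequence, text):
    """Store one build-log snippet as its own object, carrying a
    reference to its build and the worker-manager's sequence number."""
    muck.append({
        "_type": "log",
        "build": build_id,
        "sequence": sequence,
        "text": text,
    })


def whole_log(muck, build_id):
    """Reconstruct the complete log: fetch every log object for the
    build and concatenate the snippets in sequence order."""
    snippets = [
        o for o in muck
        if o["_type"] == "log" and o["build"] == build_id
    ]
    snippets.sort(key=lambda o: o["sequence"])
    return "".join(o["text"] for o in snippets)
```

Each new line of output becomes one small new object, instead of one
more full copy of the whole log in the change log.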
I think this will work, but I haven't written any code yet. Comments?
I plan on starting work on this next week, when I get back from
gallivanting.
I don't plan on converting existing projects and pipelines on the demo Ick
instance. That seems like work that would be useful, but it's also work
that's maybe not worthwhile yet, given the demo Ick instance only has a few
users. I'm lazy, sorry.
I added the above page to the Ick website. It matches the emails I
sent before. Feedback is still welcome!
I want to build worthwhile things that might last. --joeyh
(Also posted to the Ick blog.)
Netsurf (http://www.netsurf-browser.org/) is a lightweight and
portable web browser. It targets several older operating systems, for
which it is often the only graphical web browser.
Netsurf is cross-built for many older operating systems which can not
support native build. It is natively built on a number of esoteric
systems for which cross-building is not possible. For some more modern
operating systems, NetSurf is built for several different toolkits and
target kinds, including some pure-test-related builds whose outputs
are coverage reports, test summaries, etc., rather than executables.
NS currently uses Jenkins, but finds it problematic. A particular
problem is that current versions of Jenkins want a version of Java
that is newer than what is supported on some of the systems NS
targets. Jenkins has also proven to be inflexible to use. NS is
interested in replacing Jenkins with Ick, eventually.
The NetSurf project has sunk a significant amount of effort into
making its older Jenkins fit for use, so while replacement is
desirable, there will need to be an easy pathway to transition to
Ick, or to have Ick run alongside Jenkins while jobs are migrated
piecemeal.
Another significant issue with Jenkins for the NetSurf project is that
of configuration management. Jenkins' ability to be configured without
use of its web-UI is, while not limited as such, unpleasant to work
with. Significant security issues surround the JNLP API and, as such,
given the age of the Jenkins in use, NetSurf are unable to automate
configuration management. A feature of Ick which attracts the project
is the possibility of git-based configuration of the CI system.
However, the project does find the web-UI useful from time to time in
order to make small tweaks to configuration (for example when
debugging a CI job). Ick will need to support some configuration
changes through a web-UI (ideally making and pushing git commits
behind the scenes).
NS is divided into many components (sub-projects), each of which builds
something like a library used by other components. The various
components thus have build dependencies on each other, and if a
dependency changes, any components that build-depend on it need to be
built and tested as well.
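The trigger logic this implies can be sketched as a walk over reverse
build-dependencies; the component graph below is only an illustrative
slice, not NetSurf's real dependency graph:

```python
def rebuild_set(revdeps, changed):
    """Given reverse build-dependencies (component -> components that
    build-depend on it), return everything that must be rebuilt and
    retested when `changed` changes."""
    todo = [changed]
    result = set()
    while todo:
        component = todo.pop()
        for dependent in revdeps.get(component, []):
            if dependent not in result:
                result.add(dependent)
                todo.append(dependent)
    return result


# Hypothetical slice of the component graph, for illustration only.
revdeps = {
    "libwapcaplet": ["libcss", "libdom"],
    "libcss": ["netsurf"],
    "libdom": ["netsurf"],
}
```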
Some components are host-based tools which are consumed by many
downstream variants of component builds. For example, the binding
generator is built for each host (e.g. `x86_64` Linux vs. `i686`
Haiku) and is then consumed by any downstream build running on that
host, regardless of its target (e.g. on `x86_64` Linux, a build of
NetSurf targeting AmigaOS).
The NS project hosts its own builders, and there is at minimum ssh
access to them. They may need to have each builder run multiple
workers, to use hardware efficiently. The builders are not all
co-located in the same physical location: being frugal with bandwidth
use is a concern.
All of the builders are linked to a virtual LAN (single broadcast
domain across multiple sites). This means that in situations where
bandwidth is a concern, non-encrypted protocols are considered just as
secure as encrypted ones. While SSH is available to all hosts, some
are perhaps less reliable than others. It would behoove Ick to not
rely on communication purely via SSH if at all possible. Sadly Python
is not necessarily available on all hosts either. A small C-based
agent may be required for maximum flexibility and minimum risk of
TTY/SSH related issues.
NetSurf has CI notifications routed to a number of targets including
the project IRC channel, a mailing list, and an RSS feed from the
Jenkins. In order for these routes to be useful, NetSurf would find
the ability to construct rules in order to filter and route
notifications very useful indeed. For example, not every successful
or failed build should be notified to IRC; instead, notify only when
a job changes state from success to failure or back. Ensuring that
notifications carry useful semantic information and links will also
be important.
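Such a routing rule could be as simple as this sketch, which
suppresses repeated statuses and only reports transitions; the
message format and job names are made up:

```python
def irc_notifications(builds):
    """Given a sequence of (job, status) build results, yield an IRC
    message only when a job's status changes (success -> failure or
    back), not for every build."""
    last = {}
    for job, status in builds:
        if job in last and last[job] != status:
            yield "%s: %s -> %s" % (job, last[job], status)
        last[job] = status
```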
NetSurf makes use of Jenkins' history of builds, allowing the project
to trace non-fatal issues through builds. These link back to the git
repositories at the specific commits built, and this is considered a
critical feature of the UI.
NetSurf would very much like it if the CI could be easily configured
to deploy branch-based builds in a way which will not confuse artifact
handling, and will allow developers to build a branch across many
repositories in order to simplify the testing of the development of a
new feature which crosses libraries. Such a concept is often referred
to as "system branches".
It would be best if these builds were not created by default for any
branch, but rather that there was a simple way for a developer to
activate a set of jobs for a branch. Perhaps by adding the branch name
to a configuration in Ick via either the web-UI or via the git
configuration. Or perhaps by pattern of branch name.
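The pattern-based variant could be a small matching rule in the
git-based configuration; the patterns here are made up:

```python
import fnmatch


def branch_activates(branch, patterns):
    """Return True if a branch name matches any configured activation
    pattern. The pattern list would live in Ick's (hypothetical)
    git-based configuration."""
    return any(fnmatch.fnmatch(branch, p) for p in patterns)


patterns = ["feature/*", "system/*"]
```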
I'm thinking the Ick project is getting to the stage where it makes
sense to write down the stakeholders and requirements for Ick the
software. These will go on the website as living documents, not
fixed into an immutable state. They don't need to be formal or
grandiose, but they'll help us think and talk about the software.
Here's a start on stakeholders:
* USERS: Those who develop projects, and need a CI or CD system for
helping them do that. They will use an Ick hosted by a HOSTER (which
might be themselves, wearing another hat).
* HOSTERS: Those who host Ick for others to use.
* CONTRIBUTORS: Those who contribute to Ick itself. This includes
those who write code for Ick, or documentation, translate the
software or documentation, those who support Ick users, etc.
Am I missing some group?
Here's a start on requirements:
* Ick should be free software.
* Ick should be "hostable": those interested in doing so, should be
able to provide an Ick instance for other people to use, without
having to trust those other people. Hosters should be able to charge
money for hosting Ick.
* Users should be able to pick any Ick instance they trust, and
migrate between instances. Ick users should feel safe that their
projects, builds, and build artifacts are unaffected and secure from
prying by other users.
* Ick should provide users with a secure way to store "secrets", such
as SSH keys, PGP keys, and API access tokens, and use them securely
in their projects, without the secrets leaking to other users.
* Ick should run builds reasonably fast, without causing much overhead.
* Ick should be able to build different projects concurrently, whether
for the same user or for several users.
* Ick should be able to build parts of a project concurrently, when
the parts don't depend on each other.
I'm sure there's other requirements, and the above can be made more
clear and precise.