[lowrisc-dev] lowRISC SoC structure / communication between application cores and minions

Alex Bradbury asb at asbradbury.org
Tue Dec 30 11:45:47 GMT 2014

On 23 December 2014 at 20:12, Reinoud Zandijk <reinoud at netbsd.org> wrote:
> Why do you think `trap the hypervisor' would be *that* expensive? I don't see
> why it would be that different from calling a system call. Furthermore I don't
> envision the OS passing character-by-character through a Hypervisor call but
> rather command blocks, much like SCSI. As an analog, define a HCALL with as
> arguments (registers just hand picked here) :

Hi Reinoud, sorry for the delayed response - Christmas got in the way!

I don't mean to come across as overly worried about overheads roughly
equivalent to a syscall. I'm just very aware of recent interest in
systems research in bypassing the kernel for network or disk traffic
for lower overhead and latency. See e.g. IX and Arrakis.
This functionality is also already deployed in the marketplace, with
e.g. Intel's SR-IOV.
There may be perfectly valid reasons we don't want to worry about
those use cases right now, or possibly other approaches can make the
overhead of trapping low enough not to worry about this. It's just
something I think needs thinking about at this stage. I freely admit
that, although I try to follow work in this area, it's not my field
of expertise.

You might also want to use the FIFO for unprivileged code for
frequent, lightweight messages. e.g. a virtual machine passing
information on observed types to a minion core which is collecting
stats or even performing the full JIT compile off the main thread.
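
As a rough illustration of that last use case, here is a minimal sketch
in Python, used purely as a behavioural model. Everything in it is an
assumption made for illustration: the names MessageFifo, try_push and
minion_drain, the (site, type) message format, and the choice to drop
messages when the FIFO is full rather than stall the sender.

```python
from collections import Counter, deque


class MessageFifo:
    """Bounded queue standing in for a hardware FIFO (hypothetical model)."""

    def __init__(self, depth):
        self.depth = depth
        self.slots = deque()
        self.dropped = 0

    def try_push(self, msg):
        """Non-blocking send: returns False and counts a drop when full."""
        if len(self.slots) >= self.depth:
            self.dropped += 1
            return False
        self.slots.append(msg)
        return True

    def pop(self):
        """Return the oldest message, or None when the FIFO is empty."""
        if not self.slots:
            return None
        return self.slots.popleft()


def minion_drain(fifo, stats):
    """Minion side: fold every pending (site, type) message into stats."""
    handled = 0
    while True:
        msg = fifo.pop()
        if msg is None:
            return handled
        stats[msg] += 1
        handled += 1
```

Dropping on overflow is tolerable here only because type statistics
are advisory; a FIFO carrying commands would need back-pressure
instead.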

> This way the HAL is only called once every command, maybe resulting in a few
> cycles lost on the switch just like it hit a branch and the almost obligatory
> saving and restoring of some registers. It is also atomically writing commands
> this way when a pipe-lock is used during the actual transfer; which might also
> just be a register :)

This all sounds very sensible.
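
To make the quoted scheme concrete, here is a minimal sketch in
Python, again only as a behavioural model. The CommandBlock fields are
hypothetical (the register assignment from the original proposal is
not reproduced in this message), as are the names CommandPipe and
hcall_submit. A threading.Lock stands in for the pipe-lock.

```python
import threading
from collections import deque
from dataclasses import dataclass

WORDS_PER_BLOCK = 6  # hypothetical fixed command size


@dataclass(frozen=True)
class CommandBlock:
    """One SCSI-like command; every field name here is an assumption."""
    opcode: int
    device: int
    in_addr: int
    in_len: int
    out_addr: int
    out_len: int

    def words(self):
        return [self.opcode, self.device, self.in_addr,
                self.in_len, self.out_addr, self.out_len]


class CommandPipe:
    """Models a FIFO to a minion, guarded by a pipe-lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.fifo = deque()
        self.traps = 0  # number of hypervisor entries

    def hcall_submit(self, block):
        """One trap per command block, not one per word or character."""
        with self._lock:
            self.traps += 1
            # The lock is held for the whole transfer, so the words of
            # two concurrent commands can never interleave in the FIFO.
            for word in block.words():
                self.fifo.append(word)
```

With this shape, the trap cost is paid once per command regardless of
how much data the command describes.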

> Either the Hypervisor, or if one runs a bare OS without hypervisor, both can
> pump the command block through the FIFO of choice and then wait for the results
> to pass back immediately or fetch its results later when interrupted.
> A continuous stream is also possible if the virtual device uses the specified
> output region as a circular buffer to DMA in.
> As for the choice between a Hypervisor and running a bare OS, I'd go for the
> Hypervisor solution. This way multiple OSs can easily run in parallel
> including nested OSs. Even with one OS it abstracts away specific
> implementation details that we might want to change in later versions.

With tagged memory and a base hypervisor layer, it's starting to sound
like an IBM mainframe-on-a-chip! Is there an existing system you would
propose to model this on? L4? Akaros?

>> > (**) DMA channel for each minion, programmable only by the Minion side
>> > since Application CPUs don't know where memory is in the Minion nor know
>> > if its space is free.
>> I like this.
> This also prevents (most) DMA based attacks since no user code will ever run
> on the Minions. Unless of course their firmware upgrade is explicitly allowed
> by the minion master. This minion firmware could be considered Trusted and
> is normally bound to the specific board and its hardware interfaces.

Yes, replacing a minion's code should certainly be a privileged
operation and in some deployments you'd want it to be fixed as part of
the secure boot sequence.

> IF we go for supporting tags in the Minions, well... what to do with them? We
> can't just save them to disc for that needs knowledge not present there. We
> can't transfer them over a serial line unless we define a protocol for that. I
> think that's the curse of wanting to support something that no devices cater
> for. We're too early :)

A minion could be employed as a smart I/O device where it reads in
something over e.g. Bluetooth, builds a data structure in its
scratchpad and applies appropriate tags; the result then gets DMAed back.
There may be other uses for tags that can be constructed for the

> As for an intermediate solution, the DMA engine could be instructed to only
> accept tags of a given type or give an error; say only accept data marked
> encrypted and/or store data marked a certain type.

Possibly, though this is going against the idea that tagged memory is
a general purpose reconfigurable mechanism by fixing behaviour.
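
One hypothetical way to reconcile the two positions is to keep the
filter but make the accepted set programmable, so that no tag value
has a meaning fixed in hardware. A minimal sketch in Python, as a
behavioural model only; TagFilterError, dma_transfer and the example
tag numbers are all invented for illustration:

```python
class TagFilterError(Exception):
    """Raised when a word carries a tag outside the programmed set."""


def dma_transfer(words, accepted_tags):
    """Model a DMA transfer of (tag, value) words with a tag filter.

    accepted_tags is supplied by privileged software rather than being
    hard-wired, so the meaning of each tag stays reconfigurable.
    """
    # Check the whole buffer first so a rejected transfer moves nothing.
    for index, (tag, _value) in enumerate(words):
        if tag not in accepted_tags:
            raise TagFilterError(
                "word %d has tag %d, which is not accepted" % (index, tag))
    return list(words)


# Example policy: suppose software has chosen tag 2 to mean "encrypted".
ENCRYPTED = 2
outgoing = dma_transfer([(ENCRYPTED, 0xDEAD), (ENCRYPTED, 0xBEEF)],
                        accepted_tags={ENCRYPTED})
```

Whether such a filter belongs in the DMA engine at all is exactly the
open question above; the sketch only shows that filtering need not
imply fixed tag semantics.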

Thanks, as always, for your thoughts!

