[lowrisc-dev] lowRISC SoC structure / communication between
application cores and minions
reinoud at NetBSD.org
Tue Dec 23 20:12:07 GMT 2014
Hi Alex, hi folks,
On Mon, Dec 22, 2014 at 04:27:23PM +0000, Alex Bradbury wrote:
> On 19 December 2014 at 14:55, Reinoud Zandijk <reinoud at netbsd.org> wrote:
> > (*) Each minion has a separate two way FIFO communication channel to the
> > Application CPUs. It's the Hypervisor's task to prevent simultaneous access.
> One FIFO shared between multiple application CPUs? Having to trap to the
> hypervisor for any send on the FIFO seems unfortunate. An individual guest
> VM or even user process with appropriate permissions having the ability to
> send directly would seem ideal, even if it does make the network slightly
> more complex.
I've thought about this, yes. A kind of virtual-to-physical device renumbering
scheme might be possible, with virtual device numbers pointing to actual or
emulated devices; it's going to take some hardware and might need to be
reprogrammed on each OS switch. Not an ideal situation IMHO.
Why do you think a trap to the hypervisor would be *that* expensive? I don't
see why it would be that different from making a system call. Furthermore I
don't envision the OS passing data character-by-character through a Hypervisor
call but rather command blocks, much like SCSI. As an analogy, define an HCALL
with the following arguments (registers just hand picked here):
  a0 : hcall number, say 0x10 here for writing to pipe :)
  a1 : virtual device ID, say virtual device 0x105 -> mapped to
       (say) pipe 2, sub device 3 by the Hypervisor
  a2 : input block address
  a3 : input block length
  a4 : output block address
  a5 : output block length
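To make the renumbering idea concrete, here is a minimal sketch of how a Hypervisor might dispatch such a call. It is only an illustration in Python: the table `VDEV_TABLE`, the function `hcall` and the status strings are names made up for this example, and only the numbers (0x10, 0x105, pipe 2, sub device 3) come from the list above.

```python
# Hypothetical sketch, not a proposed ABI: all names here are made up.
HCALL_PIPE_WRITE = 0x10  # hcall number used in the example above

# Per-guest table kept by the Hypervisor:
# virtual device ID -> (pipe number, sub device number)
VDEV_TABLE = {0x105: (2, 3)}


def hcall(a0, a1, a2, a3, a4, a5):
    """Dispatch one hypervisor call and return (status, pipe, subdev).

    a0 = hcall number, a1 = virtual device ID, a2/a3 = input block
    address/length, a4/a5 = output block address/length.
    """
    if a0 != HCALL_PIPE_WRITE:
        return ("unknown hcall", None, None)
    target = VDEV_TABLE.get(a1)
    if target is None:
        # The guest has no mapping for this virtual device.
        return ("no access", None, None)
    pipe, subdev = target
    # A real Hypervisor would now push a3 bytes starting at a2 into the
    # FIFO of `pipe`; that part is left out of this sketch.
    return ("ok", pipe, subdev)
```

The guest only ever sees 0x105; which pipe and sub device sit behind it can change between OS switches by swapping the table, without extra hardware.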
The input block could be laid out like this (offset : field):
   0 : length of command block
   1 : subdevice, like 3rd I2C bus
   2 : some operation code, like device discovery or write blob
   3 : command version, to allow for possible incompatible enhancements
   4 : command options, like IRQ back, wait for result etc.
   8 : some tag for when called back later
  16 : optional small blob of data or pointer(s) to the DMA-able stuff
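As a rough sketch of what building such a block could look like, the Python below packs the header with the `struct` module. It assumes the numbers on the left are byte offsets, that multi-byte fields are little-endian, and that the length byte counts the whole block including the header; none of that is settled, and `pack_command` is just a name invented for this example.

```python
import struct

# Assumed layout (byte offsets, little-endian, no padding):
#   0: length (u8)  1: subdevice (u8)  2: opcode (u8)  3: version (u8)
#   4: options (u32)  8: tag (u64)  16: optional payload
HEADER = struct.Struct("<BBBBIQ")  # 16 bytes in total


def pack_command(subdev, opcode, version, options, tag, payload=b""):
    """Return one command block: 16-byte header followed by the payload."""
    length = HEADER.size + len(payload)
    if length > 0xFF:
        # With a one-byte length field, larger data has to be passed
        # by pointer to a DMA-able buffer instead of inline.
        raise ValueError("command block too long for a one-byte length")
    return HEADER.pack(length, subdev, opcode, version, options, tag) + payload


# Example: opcode 2 on subdevice 3, version 1, options 0x1, tag 0xABCD.
block = pack_command(3, 2, 1, 0x1, 0xABCD, b"hi")
```

The address and length of `block` are what would go into a2 and a3 of the HCALL.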
Likewise the output block could have a fixed header, preferably the same one,
with the data read appended: device type, device version, access rights or
whatever when it's a device discovery, or just the header with a result code
for the operation such as OK, queued, invalid offset, no access etc.
This way the HAL is only called once per command, costing maybe a few cycles
on the switch, just as if it had hit a branch, plus the almost obligatory
saving and restoring of some registers. Commands are also written atomically
this way if a pipe lock is held during the actual transfer; that lock might
also just be a register :)
Either the Hypervisor or, if one runs a bare OS without a hypervisor, the OS
itself can pump the command block through the FIFO of choice and then either
wait for the results to come back immediately or fetch them later when
interrupted.
A continuous stream is also possible if the virtual device uses the specified
output region as a circular buffer to DMA into.
As for the choice between a Hypervisor and running a bare OS, I'd go for the
Hypervisor solution. This way multiple OSs can easily run in parallel,
including nested OSs. Even with one OS it abstracts away specific
implementation details that we might want to change in later versions.
> As the FIFO is shown separately to the memory-based DMA requests my
> understanding is you are proposing app->minion communication is done via
True; using memory mapped communication creates issues like what (base)
address to take, compatibility issues on memory layout, virtual registers,
hardware support in the VM system etc.
Sure, a virtual device could be implemented in a fixed block in physical
memory, but then the virtual device needs to be emulated and somehow needs to
be interrupted if a certain bit is toggled etc. That is quite some
communication and interrupting for a simple transaction, let alone the bus
noise.
Efficiency-wise I'd say no to this. The only advantage I can think of is
emulating existing hardware devices, but that road is dangerous and hairy; it's
not easy to emulate one correctly, hard to extend let alone implement
In combination with a Hypervisor layer, the OS can run guest OS instances or
other OSs next to it that don't need to be ported nor have explicit support
for it, and still have them run at (or almost at) the original speed. Very
handy if you want to, say, implement device drivers in a kernel: just run the
new instance as a guest and let it run on the actual hardware you want to
cater for, while all other devices are catered for by the host OS.
> > (**) DMA channel for each minion, programmable only by the Minion side
> > since Application CPUs don't know where memory is in the Minion nor know
> > if its space is free.
> I like this.
This also prevents (most) DMA based attacks since no user code will ever run
on the Minions, unless of course a firmware upgrade is explicitly allowed
by the minion master. The minion firmware could be considered Trusted and
is normally bound to the specific board and its hardware interfaces.
> > The `Minions' don't have tagged memory and are not tag aware and will
> > write all tags as the default and/or insecure data; this to ensure that no
> > tricks can be played with it. They also don't need to have virtual memory
> > support nor be coherent with anything.
> Minions not supporting tagged memory doesn't seem to be a necessary
> consequence of this (though given the lack of coherency, it may be easier to
> reason about to just disable it).
I agree; it doesn't follow as a logical conclusion, no. What I meant to say is
that they don't have to be tagged-memory aware since they are only dealing with
data I/O streams. All data read by the minions through DMA loses its tags and
all data written by the minions is tagged plain `data' until they are
It would thus be impossible to execute data that comes in through the DMA
engine. If it is a program that is loaded, the OS binary loader and/or
Hypervisor is responsible for setting up the tags. The OS could provide a
protocol over a serial line or a disc appendage for conveying tags, but those
are modifiable and thus hackable and I wouldn't trust them.
IF we go for supporting tags in the Minions, well... what to do with them? We
can't just save them to disc for that needs knowledge not present there. We
can't transfer them over a serial line unless we define a protocol for that. I
think that's the curse of wanting to support something that no devices cater
for. We're too early :)
As for an intermediate solution, the DMA engine could be instructed to only
accept tags of a given type or give an error; say only accept data marked
encrypted and/or store data marked a certain type.
On the issue of swapping in/out pages, we have to realise it's very OS
dependent. To cater for as easy a transition as possible, the Hypervisor could
be instructed to use a reserved stretch of `drum'/swap space as its storage
space. This can then be encrypted and forwarded to the device of choice; be it
a physical device or a virtual device catered by a host OS.
A 4k page then has 512 tags of, say, 4 bits each, totalling 4352 bytes (4096
bytes of data plus 256 bytes of tags). If this is then encrypted and padded to,
say, a multiple of the sector size, the result can be written out directly to
the drum. Even if some OS operation is then able to mess with the drum, which
the OS shouldn't allow in the first place (!!), the encryption will prevent
tampering without detection. Reading/writing could still be bypassed by a
malicious hack, so the paranoid might want to block all read/write requests to
that space from outside the Hypervisor.
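The arithmetic behind those numbers, as a small Python sketch. It assumes one tag per 64-bit word (which is what gives 512 tags for a 4k page) and 512-byte sectors; both are assumptions for the example, as is the name `swapped_page_size`.

```python
PAGE_BYTES = 4096    # one 4k page
WORD_BYTES = 8       # assumption: one tag per 64-bit word
TAG_BITS = 4         # "say 4 bits" per tag
SECTOR_BYTES = 512   # assumption: classic 512-byte sectors


def swapped_page_size(page_bytes=PAGE_BYTES, word_bytes=WORD_BYTES,
                      tag_bits=TAG_BITS, sector_bytes=SECTOR_BYTES):
    """Return (tag count, data+tags in bytes, size rounded up to sectors)."""
    tags = page_bytes // word_bytes
    tag_bytes = (tags * tag_bits + 7) // 8    # round up to whole bytes
    raw = page_bytes + tag_bytes
    sectors = -(-raw // sector_bytes)         # ceiling division
    return tags, raw, sectors * sector_bytes


print(swapped_page_size())  # (512, 4352, 4608)
```

So under these assumptions a swapped-out page plus its tags takes 9 sectors on the drum instead of 8, before any overhead the encryption itself adds.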