[lowrisc-dev] lowRISC SoC structure / communication between application cores and minions

Reinoud Zandijk reinoud at NetBSD.org
Fri Dec 19 14:55:35 GMT 2014

Dear folks,

I'd like to propose a novel way of structuring the SoC. I'll first try to put
it into ASCII art :)

(see http://pastebin.com/Q9Rt8fm0 for a fixed width copy)


       /----------- FIFOs (*) ----------\
       |                                |
   Appl.CPU0 <=>||             ||<=> Minion CPU --   (
       |        ||             ||      |
   Appl.CPU1 <=>||             ||<=> Minion CPU -- S   soft
       |        ||             ||      |
      ...    <=>||<==> DMA <==>||<=> Minion CPU -- H 
       |        ||    (**)     ||      |
      ...    <=>||             ||<=> Minion CPU -- I   hardware
                ||             ||      |
                ||             ||<=> ....       -- M  )
                ||             ||      |
                ||             ||<=> ....       ---
                ||             ||      |
                ||             ||<=> Minion CPU - USB + Ethernet + whatever
                ||             ||      |
                ||             ||<=> Minion CPU (Power, control)
                ||             ||      |    +-- Flash bootrom
                ||             ||      |    \-- Power and freq contr.
                ||             ||      |
                ||             ||<=> FPGA interface if wanted
                ||             ||      |
             L2 CACHE          ||<=> Minion CPU (***)
                ||                     |  |
            TAG CACHE                  |  | (private FIFO)
                ||                     |  |
                ||<==> DMA2 <=========> GPU (***)
     DRAM (DDR3, DDR4 or GDDR5)

(*) Each minion has a separate two-way FIFO communication channel to the
Application CPUs. It's the Hypervisor's task to prevent simultaneous access.

(**) DMA channel for each minion, programmable only from the Minion side, since
the Application CPUs don't know where memory is in the Minion, nor whether its
space is free.

(***) Where the GPU should be positioned is open for debate later on, but this
is the most logical place IMHO. The Minion can provide basic abstract settings
like mode parameters and can pass commands/code to the GPU.

Each minion has its own *private* SDRAM that holds its code and its data
buffers. The size of this is of course not yet determined.

One minion is the coordinator and controls the power and frequency and all
other internal coordination. It also boots from, say, an external (serial?)
FlashROM, initialises the other Minions as directed and does general startup.

Other than the coordinator Minion, the Minions depicted here are not
necessarily separate CPUs; one Minion could serve multiple pieces.

The entire `Minion side' is basically acting as a HAL to a Hypervisor (or bare
OS) running on the `Application side' CPUs. A bare OS is of course troublesome
with virtualisation; a Hypervisor is better suited.

All communication needed between the Application CPUs themselves is done
using the standard RISC-V IPI mechanisms through the Hypervisor, so as not to
complicate things like virtualisation. Hopefully the new RISC-V system docs
will also propose this.

Communication with the HAL, i.e. the Minions, is done by requesting the
Hypervisor to send command blocks to the desired Minion over the designated
FIFO. These can either be waited on or be fire-and-forget, and you'll receive
an interrupt when the result is retrievable.
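
A minimal sketch of the command-block exchange described above, as a plain
software model. Everything named here (`CommandBlock`, `MinionChannel`, the
field layout, the `handler` standing in for Minion firmware) is a hypothetical
illustration; the proposal does not define a command format or an API.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CommandBlock:
    # Hypothetical layout; the proposal does not specify one.
    cmd_id: int
    opcode: str
    payload: bytes = b""


class MinionChannel:
    """Two-way FIFO pair between the Hypervisor and one Minion (model only)."""

    def __init__(self, handler: Callable[[CommandBlock], bytes]):
        self.to_minion = deque()    # command FIFO
        self.from_minion = deque()  # result FIFO
        self.handler = handler      # stands in for the Minion firmware
        self.on_interrupt: Optional[Callable[[int], None]] = None

    def send(self, block: CommandBlock) -> None:
        """Fire-and-forget: queue the command and return immediately."""
        self.to_minion.append(block)

    def minion_step(self) -> bool:
        """Minion side: process one queued command, if there is one."""
        if not self.to_minion:
            return False
        block = self.to_minion.popleft()
        self.from_minion.append((block.cmd_id, self.handler(block)))
        if self.on_interrupt is not None:
            self.on_interrupt(block.cmd_id)  # "result is retrievable"
        return True

    def take_result(self, cmd_id: int) -> bytes:
        """Remove and return the result for cmd_id from the result FIFO."""
        for i, (cid, result) in enumerate(self.from_minion):
            if cid == cmd_id:
                del self.from_minion[i]
                return result
        raise KeyError(cmd_id)

    def call(self, block: CommandBlock) -> bytes:
        """Waited-on variant: send, let the Minion run, fetch the result."""
        self.send(block)
        while self.minion_step():
            pass
        return self.take_result(block.cmd_id)
```

In this model a caller either uses `call()` and blocks, or uses `send()` and
picks the result up with `take_result()` once `on_interrupt` has fired.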

Data transfers are initiated by the Minions using their DMA channel to
read/write data from/to main memory at their convenience; the locations and
sizes are given by the caller. Having the minion continuously fill a given
circular buffer is of course also possible.
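
The circular-buffer case could look roughly like the sketch below. It models
main memory as a `bytearray` and the Minion's DMA channel as a byte-wise
writer; `RingWriter`, `base`, `size` and `head` are assumed names for
illustration only, and a real DMA engine would of course move bursts rather
than single bytes.

```python
class RingWriter:
    """Minion-side DMA writer into a caller-supplied circular buffer."""

    def __init__(self, memory: bytearray, base: int, size: int):
        # The caller (Application side) supplies location and size.
        if size <= 0 or base < 0 or base + size > len(memory):
            raise ValueError("buffer lies outside main memory")
        self.memory = memory
        self.base = base
        self.size = size
        self.head = 0  # next write offset within the buffer

    def dma_write(self, data: bytes) -> None:
        """Write data into the buffer, wrapping around at the end."""
        for byte in data:
            self.memory[self.base + self.head] = byte
            self.head = (self.head + 1) % self.size


main_memory = bytearray(16)
writer = RingWriter(main_memory, base=4, size=4)
writer.dma_write(b"abcdef")  # six bytes into a four-byte ring: wraps once
```

Only the Minion advances `head`; the Application side never programs the
transfer itself, it just hands over the buffer location and size.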

The `Minions' don't have tagged memory and are not tag aware, and will write
all tags as the default and/or insecure data; this is to ensure that no tricks
can be played with it. They also don't need virtual memory support, nor do
they need to be coherent with anything.
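
As a rough model of that tag rule: any store arriving from a Minion's DMA
channel resets the tag of the written word to the default, whatever tag an
Application CPU had put there before. The names `TaggedMemory`, `cpu_store`,
`minion_dma_store` and the value `DEFAULT_TAG = 0` are assumptions made for
this sketch, not part of the proposal.

```python
DEFAULT_TAG = 0  # assumed encoding for "default / insecure data"


class TaggedMemory:
    """Word-addressed main memory with one tag per word (model only)."""

    def __init__(self, words: int):
        self.data = [0] * words
        self.tags = [DEFAULT_TAG] * words

    def cpu_store(self, addr: int, value: int, tag: int = DEFAULT_TAG) -> None:
        """Application CPUs are tag aware and may set any tag."""
        self.data[addr] = value
        self.tags[addr] = tag

    def minion_dma_store(self, addr: int, values: list) -> None:
        """Minion DMA writes always leave the default tag behind."""
        for offset, value in enumerate(values):
            self.data[addr + offset] = value
            self.tags[addr + offset] = DEFAULT_TAG
```

With this rule a Minion has no way to forge a non-default tag, because the
DMA path never takes a tag as input.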

The Application CPUs OTOH have a complete implementation of tagged memory
support, have virtual memory and can be completely OoO and beefed up as much
as wanted. They only need to be coherent with each other and with the DMA
engine, which acts like just another reading/writing CPU.

The very high speed Application CPU memory bus to the DRAM is very short and
is only connected to the CPUs, the DMA engine and the L2 cache. No need to
distribute it all over the SoC. Each Application CPU's frequency can be slowed
down as much as wanted, from say 2 GHz to 2 kHz, or even full-stop or powered
down.

The HAL/Minion bus can run at a much lower speed if desired and can be scaled
independently from the Application CPU memory bus. Since there is no coherency
between individual Minion CPUs, they can be put into full sleep until a
command or event comes by. They can also individually be powered down.

All booting, memory configuration, frequency and power control and other misc.
tasks are done by a designated Minion; no need to expose all of this to an
OS/Hypervisor, with the risk of frying the SoC(!).

With regards,
Reinoud Zandijk
