[lowrisc-dev] Untethered lowRISC: run bbl using Verilator

Wei Song ws327 at cam.ac.uk
Thu Jul 20 16:06:14 BST 2017


Hello Grady,

As I have mentioned, we have been trying to remove the host interface 
from the SoC in order to make it standalone. The functions of the host 
interface are now very much reduced.
This means there is no longer a host side, and software should not rely 
on it for I/O or for any other function (except for exiting the 
simulation).

So where does this BBL come from? We do not support the original BBL 
from UCB or SiFive. The BBL in the lowRISC repository should have been 
revised accordingly.
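
For context, here is a rough sketch of how an unmodified upstream BBL would 
end up writing 0x8cb0 to mtohost. It assumes the TOHOST_CMD layout used by 
the upstream riscv-pk of that era (device in bits 63:56, command in bits 
55:48, payload in the low 48 bits), which matches the range checks in the 
mcall_dev_req() code quoted below. The function names are hypothetical and 
are not taken from the lowRISC sources.

```c
#include <stdint.h>

/* Assumed upstream layout of a tohost request word:
 *   bits 63:56  device id
 *   bits 55:48  command id
 *   bits 47:0   payload (for a syscall request, a pointer to magic_mem)
 */
static inline uint64_t tohost_cmd(uint64_t dev, uint64_t cmd, uint64_t payload)
{
    return (dev << 56) | (cmd << 48) | payload;
}

/* Hypothetical helpers that unpack the same word again. */
static inline unsigned tohost_dev(uint64_t v)
{
    return (unsigned)(v >> 56);
}

static inline unsigned tohost_cmd_id(uint64_t v)
{
    return (unsigned)((v >> 48) & 0xFFu);
}

static inline uint64_t tohost_payload(uint64_t v)
{
    return v & 0x0000FFFFFFFFFFFFULL;
}
```

With device 0 and command 0, tohost_cmd(0, 0, 0x8cb0) is simply 0x8cb0, that 
is, the address of magic_mem in the objdump output. The minimal host 
interface would see that value as an unknown exit code rather than as a 
syscall request.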

Is this host interface call related to multicore, and have you 
configured the system to use multicore?
I am afraid v0.2 of lowRISC does not support multicore.

Thanks for the clarification,
Wei


On 20/07/2017 15:54, Grady Chen wrote:
> Hi Wei,
>
> The address 0x8cb0 written by csrrw is reasonable.
> grady at riscv:~/lowrisc-chip/riscv-tools/riscv-pk/build$ riscv64-unknown-elf-objdump 
> -D bbl | less
> 0000000000008ca8 <num_harts_booted>:
> 8ca8: 00010x1
> ...
> Disassembly of section .bss:
> 0000000000008cb0 <magic_mem.2009>:
> ...
> 0000000000008cf0 <fds>:
>
> BBL uses csrrw (the host interface) to issue syscalls to the front-end 
> server on the host side.
>
> 0000000000008cb0 <magic_mem.2009>:
>
> ...
>
> pk/mtrap.c:
>
> static uintptr_t mcall_dev_req(sbi_device_message *m)
> {
>   //printm("req %d %p\n", HLS()->device_request_queue_size, m);
>   if (!supervisor_paddr_valid(m, sizeof(*m))
>       && EXTRACT_FIELD(read_csr(mstatus), MSTATUS_PRV1) != PRV_M)
>     return -EFAULT;
>
>   if ((m->dev > 0xFFU) | (m->cmd > 0xFFU) | (m->data > 0x0000FFFFFFFFFFFFU))
>     return -EINVAL;
>
>   //flush_dcache();
>   while (swap_csr(mtohost, TOHOST_CMD(m->dev, m->cmd, m->data)) != 0)
>     ;
>
>   m->sbi_private_data = (uintptr_t)HLS()->device_request_queue_head;
>   HLS()->device_request_queue_head = m;
>   HLS()->device_request_queue_size++;
>
>   return 0;
> }
>
>
> On Thu, Jul 20, 2017 at 10:13 PM, Wei Song <ws327 at cam.ac.uk 
> <mailto:ws327 at cam.ac.uk>> wrote:
>
>     Hello Grady,
>
>     You are running on a really old version of lowRISC. I am afraid I
>     cannot remember all the details and tell why this happened.
>
>     The host interface is mostly removed. A program should use the host
>     interface only to deliberately exit from execution, or to report an
>     exit code if something goes wrong. Here 0x8cb0 is the lower 32 bits
>     of a5. It is not a valid exit code for the minimal host interface,
>     which is why it is reported as unsolved.
>
>     So the real question is probably why a5 is assigned 0x8cb0.
>     Where is this code segment in the BBL source code?
>
>     I would think the program might have already run into some
>     exceptions far before here. You probably need to trace backwards
>     to find out.
>
>     Best regards,
>     Wei
>
>
>
>     On 20/07/2017 14:43, Grady Chen wrote:
>>     Hi Wei,
>>
>>     Thank you.
>>     By referring to dram.c, I wrote a workaround dcache_flush()
>>     function that writes through a large memory region.
>>     Let me go back to run BBL using Verilator:
>>
>>     grady at riscv:~/lowrisc-chip/riscv-tools/riscv-pk$ elf2hex 16 8192
>>     build/bbl > ../../vsim/bbl.hex
>>     grady at riscv:~/lowrisc-chip/vsim$ ./DefaultConfig-sim-debug +vcd
>>     +vcd_name=bbl.vcd +max-cycles=10000 +load=bbl.hex | spike-dasm >
>>     bbl.log
>>     Core 0 get unsolved tohost code 8cb0
>>
>>     grady at riscv:~/lowrisc-chip/vsim$ tail bbl.log
>>     C0: 3195 [0] pc=[0000002a7c] W[r 0=0000000000008cb0][0]
>>     R[r15=0000000000008cb0] R[r12=0000000000008cb0] inst=[00c7e7b3]
>>     or a5, a5, a2
>>     C0: 3196 [0] pc=[0000002a7c] W[r 0=0000000000008cb0][0]
>>     R[r15=0000000000008cb0] R[r12=0000000000008cb0] inst=[00c7e7b3]
>>     or a5, a5, a2
>>     C0: 3197 [0] pc=[0000002a7c] W[r 0=0000000000008cb0][0]
>>     R[r15=0000000000008cb0] R[r12=0000000000008cb0] inst=[00c7e7b3]
>>     or a5, a5, a2
>>     C0: 3198 [1] pc=[0000002a80] W[r13=0000000000000000][1]
>>     R[r15=0000000000008cb0] R[r 0=0000000000000000] inst=[780796f3]
>>     csrrw a3, mtohost, a5
>>     C0: 3199 [0] pc=[0000002a80] W[r 0=0000000000000000][0]
>>     R[r15=0000000000008cb0] R[r 0=0000000000000000] inst=[780796f3]
>>     csrrw a3, mtohost, a5
>>
>>     It should continue to run after the csrrw instruction, but it
>>     stops there. The next instruction to be run is "lui a3,0x1".
>>     How can I make it continue after the csrrw instruction?
>>     It seems I have to modify
>>     lowrisc-chip/src/test/verilog/host_behav.sv. Is that right?
>>
>>     --
>>     Thanks,
>>     Grady Chen
>>
>>
>>     On Fri, Jul 14, 2017 at 5:52 PM, Wei Song <ws327 at cam.ac.uk
>>     <mailto:ws327 at cam.ac.uk>> wrote:
>>
>>         I believe the debug-v0.3 lowRISC would be easier to use
>>         than untether-v0.2.
>>
>>         We provided a dram bare-metal test
>>         https://github.com/lowRISC/lowrisc-fpga/blob/debug-v0.3/bare_metal/examples/dram.c
>>         <https://github.com/lowRISC/lowrisc-fpga/blob/debug-v0.3/bare_metal/examples/dram.c>
>>         This test writes through a large memory region to force the
>>         LLC to write back cache lines due to capacity misses. The test
>>         is small and should run quickly in simulation. If you do not
>>         have a UART yet, comment out all printf-related lines.
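
The write-through idea described above can be sketched roughly as follows. 
This is not the code from dram.c. The function name, the 64-byte line size 
and the one-store-per-line stride are assumptions, and should be checked 
against the actual cache configuration. The region must be larger than the 
LLC for earlier dirty lines to be evicted and written back.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes; adjust to the real configuration. */
#define LINE_BYTES 64u

/* Store one 64-bit word per cache line across `bytes` bytes of `region`.
 * Each store dirties one line. Once the region exceeds the cache capacity,
 * older dirty lines have to be evicted, which produces memory writes.
 * Returns the number of cache lines touched. */
static size_t write_through_region(volatile uint64_t *region, size_t bytes,
                                   uint64_t pattern)
{
    size_t lines = 0;
    for (size_t off = 0; off + sizeof(uint64_t) <= bytes; off += LINE_BYTES) {
        region[off / sizeof(uint64_t)] = pattern ^ (uint64_t)off;
        lines++;
    }
    return lines;
}
```

This only forces writebacks as a side effect of capacity misses. It is not 
an architectural cache flush.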
>>
>>         -Wei
>>
>>
>>         On 14/07/2017 04:43, Grady Chen wrote:
>>>         Hi Wei,
>>>
>>>         Thank you for the information.
>>>         My use case is to run bbl and Linux on a Veloce2 emulator. I
>>>         have compiled untether-v0.2 for it.
>>>         I don't have Boot RAM, DDR RAM, UART or SD for it, so I am
>>>         using the behavioural DRAM to boot Linux.
>>>
>>>         Before that, I want to make sure that I can run bbl and Linux
>>>         using Verilator.
>>>
>>>         --
>>>         Thanks,
>>>         Grady Chen
>>>
>>>
>>>
>>>         On Thu, Jul 13, 2017 at 6:27 PM, Wei Song <ws327 at cam.ac.uk
>>>         <mailto:ws327 at cam.ac.uk>> wrote:
>>>
>>>             Hello Grady,
>>>
>>>             Right now there is no instruction to deliberately flush
>>>             the data cache.
>>>             Some discussion from the RISC-V mailing list can be found here:
>>>             https://groups.google.com/a/groups.riscv.org/forum/#!msg/isa-dev/XD_QkBH7HEk/Ag18X7IlCAAJ
>>>             <https://groups.google.com/a/groups.riscv.org/forum/#%21msg/isa-dev/XD_QkBH7HEk/Ag18X7IlCAAJ>
>>>             https://groups.google.com/a/groups.riscv.org/forum/#!msg/isa-dev/Bo0nb26fguM/dhBQOaMBBAAJ
>>>             <https://groups.google.com/a/groups.riscv.org/forum/#%21msg/isa-dev/Bo0nb26fguM/dhBQOaMBBAAJ>
>>>             https://groups.google.com/a/groups.riscv.org/forum/#!msg/isa-dev/EYAG7yQRnaQ/hc5uEOwUBQAJ
>>>             <https://groups.google.com/a/groups.riscv.org/forum/#%21msg/isa-dev/EYAG7yQRnaQ/hc5uEOwUBQAJ>
>>>
>>>             In untether-v0.2 of lowRISC, we did support
>>>             bypassing the whole cache hierarchy by mapping the
>>>             memory to I/O space at run-time.
>>>             However, this behaviour is no longer supported and not
>>>             recommended.
>>>             We did so because RISC-V GCC had insufficient support
>>>             for program relocation at that time.
>>>             As a result, the bootloader and the kernel were located
>>>             in the same physical address space.
>>>
>>>             As I said, I strongly recommend you not to go for the
>>>             cache bypass direction.
>>>
>>>             The Verilog you pointed out controls the address mapping
>>>             of the DDR memory after LLC.
>>>             It has little to do with the run-time cache bypassing as
>>>             Verilog parameters are a compile time mechanism.
>>>
>>>             If we can understand more of your use case, we might
>>>             provide some suggestions.
>>>
>>>             I guess if what you want is a processor to access a
>>>             memory without any cache, you can configure the latest
>>>             Rocket-chip from the freechipsproject with a scratch pad
>>>             replacing the L1 cache. In this case, everything is
>>>             controlled by software.
>>>             However, that is not something we support right now.
>>>
>>>             Best regards,
>>>             Wei
>>>
>>>
>>>             On 13/07/2017 05:23, Grady Chen wrote:
>>>>             Hi Wei,
>>>>
>>>>             Thank you for your explanation.
>>>>             Is there a way to flush the data cache? I mean by adding
>>>>             some code to BBL.
>>>>             Or how can I bypass the data cache for the behavioural DRAM?
>>>>             Presumably I will need to modify the following
>>>>             parameters in chip_top.sv to bypass the data cache, right?
>>>>             // crossbar to merge memory and IO to the behaviour dram
>>>>             nasti_crossbar
>>>>               #(
>>>>                 .N_INPUT    ( 2  ),
>>>>                 .N_OUTPUT   ( 1  ),
>>>>                 .IB_DEPTH   ( 3  ),
>>>>                 .OB_DEPTH   ( 3  ),
>>>>                 .W_MAX      ( 4  ),
>>>>                 .R_MAX      ( 4  ),
>>>>                 .ID_WIDTH   ( `MEM_TAG_WIDTH + 1 ),
>>>>                 .ADDR_WIDTH ( `PADDR_WIDTH   ),
>>>>                 .DATA_WIDTH ( `MEM_DAT_WIDTH   ),
>>>>                 .BASE0      ( 0  ),
>>>>                 .MASK0      ( 32'hffffffff   )
>>>>                 )
>>>>             mem_crossbar
>>>>               (
>>>>                .*,
>>>>                .s ( mem_io_nasti  ),
>>>>                .m ( ram_nasti   )
>>>>                );
>>>>
>>>>             --
>>>>             Thanks,
>>>>             Grady Chen
>>>>
>>>>             On Wed, Jul 12, 2017 at 5:44 PM, Wei Song
>>>>             <ws327 at cam.ac.uk <mailto:ws327 at cam.ac.uk>> wrote:
>>>>
>>>>                 Hello Grady,
>>>>
>>>>                 When there is a write-allocate data cache in the
>>>>                 system, a store operation usually causes a read
>>>>                 transaction to memory rather than a write
>>>>                 transaction. The write is handled in the cache. If
>>>>                 the cache line misses in the cache, it is fetched
>>>>                 from memory, which is why you see the read
>>>>                 transaction.
>>>>
>>>>                 Write transactions happen when a replacement occurs
>>>>                 in the last level cache and the dirty cache line is
>>>>                 written back to memory before a new one can be
>>>>                 fetched.
>>>>
>>>>                 That is to say, you need more memory operations to
>>>>                 trigger a writeback before you see any write
>>>>                 transactions to memory.
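
The behaviour described above can be illustrated with a toy model of a 
single write-back, write-allocate cache that counts memory transactions. 
This is only a sketch: the direct-mapped organisation, the four-line 
capacity and all names are assumptions made for illustration, and the real 
lowRISC hierarchy (L1 plus LLC) is considerably more involved.

```c
#include <stdint.h>
#include <string.h>

#define NUM_LINES  4u   /* hypothetical tiny cache, direct-mapped */
#define LINE_BYTES 64u  /* assumed cache line size */

struct line {
    int      valid;
    int      dirty;
    uint64_t tag;
};

struct cache {
    struct line lines[NUM_LINES];
    unsigned    mem_reads;   /* line fetches from memory */
    unsigned    mem_writes;  /* dirty line writebacks to memory */
};

static void cache_init(struct cache *c)
{
    memset(c, 0, sizeof(*c));
}

/* Model a store to byte address `addr`. */
static void cache_store(struct cache *c, uint64_t addr)
{
    uint64_t     block = addr / LINE_BYTES;
    struct line *l     = &c->lines[block % NUM_LINES];
    uint64_t     tag   = block / NUM_LINES;

    if (!l->valid || l->tag != tag) {
        if (l->valid && l->dirty)
            c->mem_writes++;  /* replacement: write back the dirty victim */
        c->mem_reads++;       /* write-allocate: fetch the missing line */
        l->valid = 1;
        l->tag   = tag;
    }
    l->dirty = 1;             /* the store itself stays in the cache */
}
```

Storing to four distinct lines gives four memory reads and no writes. The 
first memory write only appears once a fifth line maps onto an occupied, 
dirty slot.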
>>>>
>>>>                 BTW, the lowRISC version you are using is old. For
>>>>                 debug-v0.3 and the latest minion-v0.4, +load=
>>>>                 accepts an elf executable and the elf2hex step is
>>>>                 no longer necessary.
>>>>
>>>>                 Best regards,
>>>>                 Wei
>>>>
>>>>
>>>>                 On 12/07/2017 08:27, Grady Chen wrote:
>>>>
>>>>                     Hi All,
>>>>
>>>>                     For some reason, I am running bbl using Verilator.
>>>>                     The following are my steps:
>>>>
>>>>                     grady at riscv:~/lowrisc-chip/riscv-tools/riscv-pk$
>>>>                     elf2hex 16 8192 build/bbl > ../../vsim/bbl.hex
>>>>
>>>>                     grady at riscv:~/lowrisc-chip/vsim$
>>>>                     ./DefaultConfig-sim-debug +vcd
>>>>                     +vcd_name=bbl.vcd +max-cycles=100000000
>>>>                     +load=bbl.hex | spike-dasm > bbl.log
>>>>
>>>>                     Core 0 get unsolved tohost code 8cb0 *# This is
>>>>                     what I expected.*
>>>>
>>>>                     grady at riscv:~/untether/lowrisc-chip/vsim$ grep
>>>>                     request bbl.log
>>>>
>>>>                     memory read request: 1 @ 200
>>>>                     memory read request: 1 @ 240
>>>>                     memory read request: 1 @ 280
>>>>                     memory read request: 1 @ 2c80
>>>>                     memory read request: 1 @ be80
>>>>                     memory read request: 2 @ bfc0
>>>>                     memory read request: 1 @ 2cc0
>>>>                     memory read request: 1 @ 2d00
>>>>                     ......
>>>>
>>>>
>>>>                     There are only memory read requests but no
>>>>                     memory write requests. This does not seem right.
>>>>                     Does anyone know how to make an SD assembly
>>>>                     instruction lead to a memory write transaction?
>>>>
>>>>                     --
>>>>                     Thanks,
>>>>                     Grady Chen
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>


