[lowrisc-dev] Untethered lowRISC: run bbl using Verilator

Grady Chen gchen at viosoft.com
Thu Jul 20 14:43:59 BST 2017


Hi Wei,

Thank you.
By referring to dram.c, I was able to write a workaround dcache_flush()
function that simply writes through a large memory region.
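
Roughly, the workaround looks like the sketch below (the constants are
placeholders rather than values taken from dram.c; the scratch base, span
and line size have to match the real configuration):

#include <stdint.h>

#define SCRATCH_BASE 0x81000000UL  /* assumed unused, cacheable DRAM region */
#define FLUSH_SPAN   (1UL << 20)   /* assumed >= total cache capacity */
#define LINE_BYTES   64UL          /* assumed cache line size */

/* Write one word per cache line across the whole span.  Each store
 * allocates a fresh line, and the resulting capacity misses force older
 * dirty lines to be written back to memory. */
static void dcache_flush(void)
{
    volatile uint64_t *p = (volatile uint64_t *)SCRATCH_BASE;
    uintptr_t off;

    for (off = 0; off < FLUSH_SPAN; off += LINE_BYTES)
        p[off / sizeof(uint64_t)] = off;
}
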
Now let me go back to running BBL using Verilator:

grady at riscv:~/lowrisc-chip/riscv-tools/riscv-pk$ elf2hex 16 8192 build/bbl > ../../vsim/bbl.hex
grady at riscv:~/lowrisc-chip/vsim$ ./DefaultConfig-sim-debug +vcd +vcd_name=bbl.vcd +max-cycles=10000 +load=bbl.hex | spike-dasm > bbl.log
Core 0 get unsolved tohost code 8cb0

grady at riscv:~/lowrisc-chip/vsim$ tail bbl.log
C0:       3195 [0] pc=[0000002a7c] W[r 0=0000000000008cb0][0] R[r15=0000000000008cb0] R[r12=0000000000008cb0] inst=[00c7e7b3] or      a5, a5, a2
C0:       3196 [0] pc=[0000002a7c] W[r 0=0000000000008cb0][0] R[r15=0000000000008cb0] R[r12=0000000000008cb0] inst=[00c7e7b3] or      a5, a5, a2
C0:       3197 [0] pc=[0000002a7c] W[r 0=0000000000008cb0][0] R[r15=0000000000008cb0] R[r12=0000000000008cb0] inst=[00c7e7b3] or      a5, a5, a2
C0:       3198 [1] pc=[0000002a80] W[r13=0000000000000000][1] R[r15=0000000000008cb0] R[r 0=0000000000000000] inst=[780796f3] csrrw   a3, mtohost, a5
C0:       3199 [0] pc=[0000002a80] W[r 0=0000000000000000][0] R[r15=0000000000008cb0] R[r 0=0000000000000000] inst=[780796f3] csrrw   a3, mtohost, a5

It should continue to run after the csrrw instruction, but it stops right
there. The next instruction to be executed would be "lui a3,0x1".
How can I make it continue past the csrrw instruction?
It seems I have to modify lowrisc-chip/src/test/verilog/host_behav.sv. Is
that right?
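
For reference, my understanding of the request bbl is making at this point
is roughly the sketch below (assuming the old mtohost/mfromhost CSR pair at
0x780/0x781, which matches the csrrw to mtohost in the log; the actual bbl
code may differ):

#include <stdint.h>

/* Sketch of the HTIF-style request I believe bbl performs here.
 * Assumption: mtohost is CSR 0x780 and mfromhost is CSR 0x781; the real
 * riscv-pk code may use a different sequence. */
static inline void htif_request(uint64_t cmd)
{
    uint64_t reply = 0;

    /* Post the command; in simulation, host_behav.sv acts as the host
     * that has to consume this value and answer it. */
    asm volatile ("csrw 0x780, %0" :: "r"(cmd));

    /* Wait for the host to post a reply on mfromhost. */
    do {
        asm volatile ("csrr %0, 0x781" : "=r"(reply));
    } while (reply == 0);

    /* Clear mfromhost to acknowledge the reply. */
    asm volatile ("csrw 0x781, zero");
}

If host_behav.sv only logs the unknown tohost code without replying (or
ends the simulation), that would explain why nothing executes after the
csrrw.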

--
Thanks,
Grady Chen


On Fri, Jul 14, 2017 at 5:52 PM, Wei Song <ws327 at cam.ac.uk> wrote:

> I believe the debug-v0.3 lowRISC would be easier to use than
> untether-v0.2.
>
> We provided a DRAM bare-metal test:
>     https://github.com/lowRISC/lowrisc-fpga/blob/debug-v0.3/bare_metal/examples/dram.c
> This test writes through a big memory space to force the LLC to write back
> cache lines due to capacity misses. The test is small and should be able to
> run in simulation very fast. If you do not have a UART yet, comment out all
> printf-related lines.
>
> -Wei
>
>
> On 14/07/2017 04:43, Grady Chen wrote:
>
> Hi Wei,
>
> Thank you for the information.
> My use case is to run bbl & Linux on a Veloce2 emulator. I have compiled
> untether-v0.2 for it.
> I don't have Boot RAM, DDR RAM, a UART, or an SD card on it, so I am using
> the behavioural DRAM to boot Linux.
>
> Before that, I want to make sure I can run bbl & Linux using Verilator.
>
> --
> Thanks,
> Grady Chen
>
>
>
> On Thu, Jul 13, 2017 at 6:27 PM, Wei Song <ws327 at cam.ac.uk> wrote:
>
>> Hello Grady,
>>
>> Right now there is no instruction to deliberately flush the data cache.
>> Some discussion from the RISC-V mailing list can be found here:
>> https://groups.google.com/a/groups.riscv.org/forum/#!msg/isa-dev/XD_QkBH7HEk/Ag18X7IlCAAJ
>> https://groups.google.com/a/groups.riscv.org/forum/#!msg/isa-dev/Bo0nb26fguM/dhBQOaMBBAAJ
>> https://groups.google.com/a/groups.riscv.org/forum/#!msg/isa-dev/EYAG7yQRnaQ/hc5uEOwUBQAJ
>>
>> In untether-v0.2 of lowRISC, we did support bypassing the whole
>> cache hierarchy by mapping the memory to I/O space at run-time.
>> However, this behaviour is no longer supported and not recommended.
>> The reason we did so was that the RISC-V GCC had insufficient support
>> for program relocation at that time.
>> As a result, the bootloader and the kernel were located in the same
>> physical address space.
>>
>> As I said, I strongly recommend against going in the cache-bypass
>> direction.
>>
>> The Verilog you pointed out controls the address mapping of the DDR
>> memory after the LLC.
>> It has little to do with run-time cache bypassing, as Verilog
>> parameters are a compile-time mechanism.
>>
>> If we can understand more of your use case, we might provide some
>> suggestions.
>>
>> I guess if what you want is for the processor to access memory without any
>> cache, you can configure the latest Rocket-chip from the freechipsproject
>> with a scratchpad replacing the L1 cache. In that case, everything is
>> controlled by software.
>> However, that is not something we support right now.
>>
>> Best regards,
>> Wei
>>
>> On 13/07/2017 05:23, Grady Chen wrote:
>>
>> Hi Wei,
>>
>> Thank you for your explanation.
>> Is there a way to flush the data cache? I mean adding some code in BBL.
>> Or how can I bypass the data cache for the behavioural DRAM?
>> Presumably, I would need to modify the following parameters in chip_top.sv
>> to bypass the data cache, right?
>>    // crossbar to merge memory and IO to the behaviour dram
>>    nasti_crossbar
>>      #(
>>        .N_INPUT    ( 2                  ),
>>        .N_OUTPUT   ( 1                  ),
>>        .IB_DEPTH   ( 3                  ),
>>        .OB_DEPTH   ( 3                  ),
>>        .W_MAX      ( 4                  ),
>>        .R_MAX      ( 4                  ),
>>        .ID_WIDTH   ( `MEM_TAG_WIDTH + 1 ),
>>        .ADDR_WIDTH ( `PADDR_WIDTH       ),
>>        .DATA_WIDTH ( `MEM_DAT_WIDTH     ),
>>        .BASE0      ( 0                  ),
>>        .MASK0      ( 32'hffffffff       )
>>        )
>>    mem_crossbar
>>      (
>>       .*,
>>       .s ( mem_io_nasti  ),
>>       .m ( ram_nasti     )
>>       );
>>
>> --
>> Thanks,
>> Grady Chen
>>
>> On Wed, Jul 12, 2017 at 5:44 PM, Wei Song <ws327 at cam.ac.uk> wrote:
>>
>>> Hello Grady,
>>>
>>> When there is a write-allocate data cache in the system, a store
>>> operation usually does not cause a write transaction to memory but a
>>> read transaction. The write is handled in the write-allocate cache. If the
>>> cache line misses in the cache, it is fetched from memory; that is why you
>>> see the read transactions.
>>>
>>> Write transactions happen when a replacement occurs in the
>>> last-level cache and the dirty cache line is written back to memory before
>>> a new one can be fetched.
>>>
>>> That is to say, you need more memory operations to trigger a writeback
>>> before you will see any write transactions to memory.
>>>
>>> BTW, the lowRISC version you are using is old. For debug-v0.3 and the
>>> latest minion-v0.4, +load= accepts an elf executable and the elf2hex step
>>> is no longer necessary.
>>>
>>> Best regards,
>>> Wei
>>>
>>>
>>> On 12/07/2017 08:27, Grady Chen wrote:
>>>
>>>> Hi All,
>>>>
>>>> For some reason, I am running bbl using Verilator.
>>>> The following are my steps:
>>>>
>>>> grady at riscv:~/lowrisc-chip/riscv-tools/riscv-pk$ elf2hex 16 8192 build/bbl > ../../vsim/bbl.hex
>>>>
>>>> grady at riscv:~/lowrisc-chip/vsim$ ./DefaultConfig-sim-debug +vcd +vcd_name=bbl.vcd +max-cycles=100000000 +load=bbl.hex | spike-dasm > bbl.log
>>>>
>>>> Core 0 get unsolved tohost code 8cb0 *# This is what I expected.*
>>>>
>>>> grady at riscv:~/untether/lowrisc-chip/vsim$ grep request bbl.log
>>>>
>>>> memory read request: 1 @ 200
>>>> memory read request: 1 @ 240
>>>> memory read request: 1 @ 280
>>>> memory read request: 1 @ 2c80
>>>> memory read request: 1 @ be80
>>>> memory read request: 2 @ bfc0
>>>> memory read request: 1 @ 2cc0
>>>> memory read request: 1 @ 2d00
>>>> ......
>>>>
>>>> There are only memory read requests but no memory write requests, which
>>>> seems wrong.
>>>> Does anyone know how to make the SD (store doubleword) instruction lead
>>>> to a memory write transaction?
>>>>
>>>> --
>>>> Thanks,
>>>> Grady Chen
>>>>
>>>
>>>
>>
>>
>
>

