[lowrisc-dev] Untethered lowRISC: run bbl using Verilator

Grady Chen gchen at viosoft.com
Thu Jul 20 15:54:37 BST 2017


Hi Wei,

The 0x8cb0 value written by csrrw is reasonable: it is the address of
magic_mem in bbl.
grady at riscv:~/lowrisc-chip/riscv-tools/riscv-pk/build$ riscv64-unknown-elf-objdump -D bbl | less
0000000000008ca8 <num_harts_booted>:
    8ca8:       0001                    0x1
        ...
Disassembly of section .bss:
0000000000008cb0 <magic_mem.2009>:
        ...
0000000000008cf0 <fds>:

BBL uses csrrw on mtohost (the host interface) to invoke syscalls of the
front-end server on the host side.

0000000000008cb0 <magic_mem.2009>:

        ...

pk/mtrap.c:

static uintptr_t mcall_dev_req(sbi_device_message *m)
{
  //printm("req %d %p\n", HLS()->device_request_queue_size, m);
  if (!supervisor_paddr_valid(m, sizeof(*m))
      && EXTRACT_FIELD(read_csr(mstatus), MSTATUS_PRV1) != PRV_M)
    return -EFAULT;

  if ((m->dev > 0xFFU) | (m->cmd > 0xFFU) | (m->data > 0x0000FFFFFFFFFFFFU))
    return -EINVAL;

  //flush_dcache();
  while (swap_csr(mtohost, TOHOST_CMD(m->dev, m->cmd, m->data)) != 0)
    ;

  m->sbi_private_data = (uintptr_t)HLS()->device_request_queue_head;
  HLS()->device_request_queue_head = m;
  HLS()->device_request_queue_size++;

  return 0;
}
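
The range checks in mcall_dev_req suggest how the 64-bit mtohost word is
split: 8 bits of device, 8 bits of command and 48 bits of payload. Below is
a minimal sketch of such a packing. The bit positions (device in the top
byte, command in the next byte) are an assumption inferred from those
checks, and tohost_pack/tohost_unpack are hypothetical helper names, not
pk's own macros.

```c
#include <stdint.h>

/* Hypothetical helpers. The bit positions are an assumption inferred from
 * the 0xFF / 0xFF / 48-bit range checks in mcall_dev_req. */
typedef struct {
    uint8_t  dev;   /* assumed bits 63..56 */
    uint8_t  cmd;   /* assumed bits 55..48 */
    uint64_t data;  /* assumed bits 47..0  */
} tohost_fields;

#define TOHOST_DATA_MASK 0x0000FFFFFFFFFFFFULL

/* Combine device, command and payload into one 64-bit word. */
static uint64_t tohost_pack(uint8_t dev, uint8_t cmd, uint64_t data)
{
    return ((uint64_t)dev << 56) | ((uint64_t)cmd << 48) |
           (data & TOHOST_DATA_MASK);
}

/* Split a 64-bit word back into its three fields. */
static tohost_fields tohost_unpack(uint64_t word)
{
    tohost_fields f;
    f.dev  = (uint8_t)(word >> 56);
    f.cmd  = (uint8_t)((word >> 48) & 0xFF);
    f.data = word & TOHOST_DATA_MASK;
    return f;
}
```

Under this layout, device 0 and command 0 leave the word equal to the
payload, so passing the address of magic_mem (0x8cb0) would put exactly
8cb0 on mtohost, which is the value the simulator reports.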

On Thu, Jul 20, 2017 at 10:13 PM, Wei Song <ws327 at cam.ac.uk> wrote:

> Hello Grady,
>
> You are running on a really old version of lowRISC. I am afraid I cannot
> remember all the details and tell why this happened.
>
> The host interface is mostly removed. A program should use the host interface
> only to exit from execution deliberately. It can also be used to report the
> exit code if something is wrong. Here 0x8cb0 is the lower 32 bits of a5.
> It is not a valid exit code for the minimal host interface, which is why
> it is reported as unsolved.
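
A minimal sketch of how a tohost value could be classified as an exit
request or not. It assumes the convention used by riscv-tests, where bit 0
set means "exit" and the remaining bits carry the exit code. Whether the
lowRISC host_behav.sv decodes the value exactly this way is an assumption,
and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed convention (as in riscv-tests): bit 0 set means "exit" and the
 * upper bits hold the exit code. Not taken from host_behav.sv. */
static bool tohost_is_exit(uint64_t word)
{
    return (word & 1u) != 0;
}

/* Extract the exit code from a word that tohost_is_exit() accepted. */
static uint64_t tohost_exit_code(uint64_t word)
{
    return word >> 1;
}

/* Build the word a program would write to request exit with `code`. */
static uint64_t tohost_make_exit(uint64_t code)
{
    return (code << 1) | 1u;
}
```

Under this assumption 0x8cb0 has bit 0 clear, so it would not be treated as
an exit request, whereas a clean exit with code 0 would be written as 1.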
>
> So the real question is probably why a5 is set to 0x8cb0. Where is
> this code segment in the BBL source code?
>
> I think the program might have already run into some exceptions long
> before this point. You probably need to trace backwards to find out.
>
> Best regards,
> Wei
>
>
>
> On 20/07/2017 14:43, Grady Chen wrote:
>
> Hi Wei,
>
> Thank you.
> By referring to dram.c, I can write a workaround dcache_flush() function
> that writes through a large memory space.
> Let me go back to run BBL using Verilator:
>
> grady at riscv:~/lowrisc-chip/riscv-tools/riscv-pk$ elf2hex 16 8192 build/bbl > ../../vsim/bbl.hex
> grady at riscv:~/lowrisc-chip/vsim$ ./DefaultConfig-sim-debug +vcd +vcd_name=bbl.vcd +max-cycles=10000 +load=bbl.hex | spike-dasm > bbl.log
> Core 0 get unsolved tohost code 8cb0
>
> grady at riscv:~/lowrisc-chip/vsim$ tail bbl.log
> C0:       3195 [0] pc=[0000002a7c] W[r 0=0000000000008cb0][0] R[r15=0000000000008cb0] R[r12=0000000000008cb0] inst=[00c7e7b3] or      a5, a5, a2
> C0:       3196 [0] pc=[0000002a7c] W[r 0=0000000000008cb0][0] R[r15=0000000000008cb0] R[r12=0000000000008cb0] inst=[00c7e7b3] or      a5, a5, a2
> C0:       3197 [0] pc=[0000002a7c] W[r 0=0000000000008cb0][0] R[r15=0000000000008cb0] R[r12=0000000000008cb0] inst=[00c7e7b3] or      a5, a5, a2
> C0:       3198 [1] pc=[0000002a80] W[r13=0000000000000000][1] R[r15=0000000000008cb0] R[r 0=0000000000000000] inst=[780796f3] csrrw   a3, mtohost, a5
> C0:       3199 [0] pc=[0000002a80] W[r 0=0000000000000000][0] R[r15=0000000000008cb0] R[r 0=0000000000000000] inst=[780796f3] csrrw   a3, mtohost, a5
>
> It should continue to run after the csrrw instruction, but it stops there.
> The next instruction to be run is "lui a3,0x1".
> How can I make it continue after the csrrw instruction?
> It seems I have to modify lowrisc-chip/src/test/verilog/host_behav.sv. Is
> that right?
>
> --
> Thanks,
> Grady Chen
>
>
> On Fri, Jul 14, 2017 at 5:52 PM, Wei Song <ws327 at cam.ac.uk> wrote:
>
>> I believe the debug-v0.3 lowRISC would be easier to use than
>> untether-v0.2.
>>
>> We provide a DRAM bare-metal test:
>>     https://github.com/lowRISC/lowrisc-fpga/blob/debug-v0.3/bare_metal/examples/dram.c
>> This test writes through a large memory space to force the LLC to write
>> back cache lines due to capacity misses. The test is small and should run
>> quickly in simulation. If you do not have a UART yet, comment out all
>> printf-related lines.
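
A minimal sketch of the "write through a big memory space" idea, not the
actual dram.c code. It touches one word in every cache line of a buffer, so
that once the buffer is larger than the last-level cache, older dirty lines
are evicted and written back. The function name sweep_region, the line size
and the buffer size are all assumptions to be adapted to the real
configuration.

```c
#include <stddef.h>
#include <stdint.h>

/* Write one 64-bit word into every cache line of buf and return the number
 * of lines written. line_bytes is an assumed parameter, not a lowRISC
 * constant; the buffer must be larger than the LLC to force write-backs. */
static size_t sweep_region(volatile uint64_t *buf, size_t bytes,
                           size_t line_bytes, uint64_t pattern)
{
    size_t step  = line_bytes / sizeof(uint64_t);
    size_t words = bytes / sizeof(uint64_t);
    size_t lines = 0;

    if (step == 0)
        step = 1;                 /* guard against line_bytes < 8 */
    for (size_t i = 0; i < words; i += step) {
        buf[i] = pattern + i;     /* dirty this cache line */
        lines++;
    }
    return lines;
}
```

The buffer is declared volatile so the compiler does not optimise the
stores away. Writing one word per line is enough to dirty the line, which
keeps the loop short in simulation.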
>>
>> -Wei
>>
>>
>> On 14/07/2017 04:43, Grady Chen wrote:
>>
>> Hi Wei,
>>
>> Thank you for the information.
>> My use case is to run bbl and Linux on a Veloce2 emulator. I have compiled
>> untether-v0.2 for it.
>> I don't have Boot RAM, DDR RAM, UART or SD for it, so I am using the
>> behavioural DRAM to boot Linux.
>>
>> Before that, I want to make sure that I can run bbl and Linux using Verilator.
>>
>> --
>> Thanks,
>> Grady Chen
>>
>>
>>
>> On Thu, Jul 13, 2017 at 6:27 PM, Wei Song <ws327 at cam.ac.uk> wrote:
>>
>>> Hello Grady,
>>>
>>> Right now there is no instruction to deliberately flush the data cache.
>>> Some discussion from the RISC-V mailing list can be found here:
>>> https://groups.google.com/a/groups.riscv.org/forum/#!msg/isa-dev/XD_QkBH7HEk/Ag18X7IlCAAJ
>>> https://groups.google.com/a/groups.riscv.org/forum/#!msg/isa-dev/Bo0nb26fguM/dhBQOaMBBAAJ
>>> https://groups.google.com/a/groups.riscv.org/forum/#!msg/isa-dev/EYAG7yQRnaQ/hc5uEOwUBQAJ
>>>
>>> In untether-v0.2 of lowRISC, we did support bypassing the whole
>>> cache hierarchy by mapping the memory to I/O space at run-time.
>>> However, this behaviour is no longer supported and is not recommended.
>>> We did so because the RISC-V GCC had insufficient support
>>> for program relocation at that time.
>>> As a result, the bootloader and the kernel were located in the same
>>> physical address space.
>>>
>>> As I said, I strongly recommend that you do not take the cache-bypass
>>> route.
>>>
>>> The Verilog you pointed out controls the address mapping of the DDR
>>> memory after LLC.
>>> It has little to do with the run-time cache bypassing as Verilog
>>> parameters are a compile time mechanism.
>>>
>>> If we can understand more of your use case, we might provide some
>>> suggestions.
>>>
>>> I guess if what you want is a processor to access a memory without any
>>> cache, you can configure the latest Rocket-chip from the freechipsproject
>>> with a scratch pad replacing the L1 cache. In this case, everything is
>>> controlled by software.
>>> However, that is not something we support right now.
>>>
>>> Best regards,
>>> Wei
>>>
>>> On 13/07/2017 05:23, Grady Chen wrote:
>>>
>>> Hi Wei,
>>>
>>> Thank you for your explanation.
>>> Is there a way to flush the data cache? I mean by adding some code to BBL.
>>> Or how can I bypass the data cache for the behaviour DRAM?
>>> I suppose I will need to modify the following parameters in
>>> chip_top.sv to bypass the data cache, right?
>>>    // crossbar to merge memory and IO to the behaviour dram
>>>    nasti_crossbar
>>>      #(
>>>        .N_INPUT    ( 2                  ),
>>>        .N_OUTPUT   ( 1                  ),
>>>        .IB_DEPTH   ( 3                  ),
>>>        .OB_DEPTH   ( 3                  ),
>>>        .W_MAX      ( 4                  ),
>>>        .R_MAX      ( 4                  ),
>>>        .ID_WIDTH   ( `MEM_TAG_WIDTH + 1 ),
>>>        .ADDR_WIDTH ( `PADDR_WIDTH       ),
>>>        .DATA_WIDTH ( `MEM_DAT_WIDTH     ),
>>>        .BASE0      ( 0                  ),
>>>        .MASK0      ( 32'hffffffff       )
>>>        )
>>>    mem_crossbar
>>>      (
>>>       .*,
>>>       .s ( mem_io_nasti  ),
>>>       .m ( ram_nasti     )
>>>       );
>>>
>>> --
>>> Thanks,
>>> Grady Chen
>>>
>>> On Wed, Jul 12, 2017 at 5:44 PM, Wei Song <ws327 at cam.ac.uk> wrote:
>>>
>>>> Hello Grady,
>>>>
>>>> When there is a write-allocate data cache in the system, a store
>>>> operation does not cause a write transaction to memory but usually a
>>>> read transaction. The write is handled in the write-allocate cache. If the
>>>> cache line misses in the cache, it is fetched from memory, which is why you
>>>> see the read transaction.
>>>>
>>>> Write transactions happen when a replacement occurs in the
>>>> last-level cache and the dirty cache line is written back to memory before
>>>> a new one can be fetched.
>>>>
>>>> That is to say, you need more memory operations to trigger a writeback
>>>> before you see any write transactions to memory.
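
The behaviour described above can be sketched with a toy model of a
direct-mapped, write-back, write-allocate cache that only counts memory
transactions. This is an illustration under stated assumptions, not the
Rocket or lowRISC cache: the line size, the four-line capacity and all
names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LINE_BYTES 64u   /* assumed line size */
#define NUM_LINES  4u    /* tiny on purpose so evictions are easy to trigger */

typedef struct {
    bool     valid[NUM_LINES];
    bool     dirty[NUM_LINES];
    uint64_t tag[NUM_LINES];
    unsigned mem_reads;   /* line fills from memory */
    unsigned mem_writes;  /* dirty write-backs to memory */
} cache_model;

static void cache_init(cache_model *c)
{
    memset(c, 0, sizeof *c);
}

/* Model one store to addr: a miss fetches the line (read), and a write to
 * memory happens only when a dirty line has to be evicted first. */
static void cache_store(cache_model *c, uint64_t addr)
{
    uint64_t line = addr / LINE_BYTES;
    unsigned idx  = (unsigned)(line % NUM_LINES);
    uint64_t tag  = line / NUM_LINES;

    if (!c->valid[idx] || c->tag[idx] != tag) {
        if (c->valid[idx] && c->dirty[idx])
            c->mem_writes++;      /* evict: write the dirty line back */
        c->mem_reads++;           /* allocate: fetch the missing line */
        c->valid[idx] = true;
        c->tag[idx]   = tag;
    }
    c->dirty[idx] = true;         /* the store itself stays in the cache */
}
```

In this model, stores to 0x200, 0x240 and 0x280 produce three memory reads
and no writes. The first write appears only when a later store, for example
to 0x300, maps to a slot that already holds a dirty line.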
>>>>
>>>> BTW, the lowRISC version you are using is old. For debug-v0.3 and the
>>>> latest minion-v0.4, +load= accepts an elf executable and the elf2hex step
>>>> is no longer necessary.
>>>>
>>>> Best regards,
>>>> Wei
>>>>
>>>>
>>>> On 12/07/2017 08:27, Grady Chen wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> For some reason, I am running bbl using Verilator.
>>>>> The following are my steps:
>>>>>
>>>>> grady at riscv:~/lowrisc-chip/riscv-tools/riscv-pk$ elf2hex 16 8192 build/bbl > ../../vsim/bbl.hex
>>>>> grady at riscv:~/lowrisc-chip/vsim$ ./DefaultConfig-sim-debug +vcd +vcd_name=bbl.vcd +max-cycles=100000000 +load=bbl.hex | spike-dasm > bbl.log
>>>>>
>>>>> Core 0 get unsolved tohost code 8cb0 *# This is what I expected.*
>>>>>
>>>>> grady at riscv:~/untether/lowrisc-chip/vsim$ grep request bbl.log
>>>>> memory read request: 1 @ 200
>>>>> memory read request: 1 @ 240
>>>>> memory read request: 1 @ 280
>>>>> memory read request: 1 @ 2c80
>>>>> memory read request: 1 @ be80
>>>>> memory read request: 2 @ bfc0
>>>>> memory read request: 1 @ 2cc0
>>>>> memory read request: 1 @ 2d00
>>>>> ......
>>>>>
>>>>>
>>>>> There are only memory read requests and no memory write requests, which
>>>>> does not seem right.
>>>>> Does anyone know how to make an SD instruction lead to a memory write
>>>>> transaction?
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Grady Chen
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>

