Dear Aleksandar,
The majority of the processor code is written in Chisel, a hardware
description language built as a shallow embedding in the functional
language Scala. This means you don't get any hardware until you run the
Chisel scripts. In principle it is possible to generate a subset of the
Verilog by choosing a different Chisel target, but I would only
recommend this if you are a functional programming aficionado. However,
once you have generated the Verilog for the entire design as normal (it
will appear in the generated-src subdirectory of vsim if you choose a
simulation task), it is quite easy to write a testbench that targets
just the FPU section. The generated Verilog is not very understandable,
though; I'm afraid this is largely outside our control.
For example, taking the Top.DefaultConfig.sv file, you would have to
include something like this in your testbench:
FPU FP_UNDER_TEST(
    .clk(clk),
    .reset(reset),
    .io_inst( core_io_fpu_inst ),
    .io_fromint_data( core_io_fpu_fromint_data ),
    .io_fcsr_rm( core_io_fpu_fcsr_rm ),
    .io_fcsr_flags_valid( FPU_io_fcsr_flags_valid ),
    .io_fcsr_flags_bits( FPU_io_fcsr_flags_bits ),
    .io_store_data( FPU_io_store_data ),
    .io_toint_data( FPU_io_toint_data ),
    .io_dmem_resp_val( core_io_fpu_dmem_resp_val ),
    .io_dmem_resp_type( core_io_fpu_dmem_resp_type ),
    .io_dmem_resp_tag( core_io_fpu_dmem_resp_tag ),
    .io_dmem_resp_data( core_io_fpu_dmem_resp_data ),
    .io_valid( core_io_fpu_valid ),
    .io_fcsr_rdy( FPU_io_fcsr_rdy ),
    .io_nack_mem( FPU_io_nack_mem ),
    .io_illegal_rm( FPU_io_illegal_rm ),
    .io_killx( core_io_fpu_killx ),
    .io_killm( core_io_fpu_killm ),
    .io_dec_cmd( FPU_io_dec_cmd ),
    .io_dec_ldst( FPU_io_dec_ldst ),
    .io_dec_wen( FPU_io_dec_wen ),
    .io_dec_ren1( FPU_io_dec_ren1 ),
    .io_dec_ren2( FPU_io_dec_ren2 ),
    .io_dec_ren3( FPU_io_dec_ren3 ),
    .io_dec_swap12( FPU_io_dec_swap12 ),
    .io_dec_swap23( FPU_io_dec_swap23 ),
    .io_dec_single( FPU_io_dec_single ),
    .io_dec_fromint( FPU_io_dec_fromint ),
    .io_dec_toint( FPU_io_dec_toint ),
    .io_dec_fastpipe( FPU_io_dec_fastpipe ),
    .io_dec_fma( FPU_io_dec_fma ),
    .io_dec_div( FPU_io_dec_div ),
    .io_dec_sqrt( FPU_io_dec_sqrt ),
    .io_dec_round( FPU_io_dec_round ),
    .io_dec_wflags( FPU_io_dec_wflags ),
    .io_sboard_set( FPU_io_sboard_set ),
    .io_sboard_clr( FPU_io_sboard_clr ),
    .io_sboard_clra( FPU_io_sboard_clra ),
    .io_cp_req_ready( FPU_io_cp_req_ready ),
    .io_cp_req_valid( 1'h0 ),
    .io_cp_resp_ready( 1'h0 ),
    .io_cp_resp_valid( FPU_io_cp_resp_valid ),
    .io_cp_resp_bits_data( FPU_io_cp_resp_bits_data )
);
I'm afraid I don't really know what these signals do, but there is a
sample (not currently working) testbench in the hardfloat sub-directory.
It uses the berkeley-testfloat library to compare the results with a
software implementation. However, this is not a timing model, so it
cannot be used for performance estimation.
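To illustrate the idea of checking results against a software
implementation, here is a minimal Python sketch of a reference model for
single-precision addition. It is not taken from the hardfloat testbench
or from berkeley-testfloat: the names ref_fadd_s and find_mismatches,
and the dut callable that stands in for results captured from a
simulation, are hypothetical. It assumes round-to-nearest-even only and
does not model the exception flags.

```python
import math
import struct
from typing import Callable, Iterable, List, Tuple

CANONICAL_NAN = 0x7FC00000  # assumption: report any NaN result as this quiet NaN


def f32_bits(x: float) -> int:
    """Round a Python float to single precision and return its 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b: int) -> float:
    """Interpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]


def ref_fadd_s(a_bits: int, b_bits: int) -> int:
    """Hypothetical software reference for single-precision add (nearest-even)."""
    total = bits_f32(a_bits) + bits_f32(b_bits)  # computed in double precision
    if math.isnan(total):
        return CANONICAL_NAN
    try:
        # Rounding the double sum to single is understood to give the
        # correctly rounded single-precision sum for addition.
        return f32_bits(total)
    except OverflowError:
        # struct refuses finite values too large for single precision;
        # under round-to-nearest-even these become a signed infinity.
        return f32_bits(math.copysign(math.inf, total))


def find_mismatches(
    vectors: Iterable[Tuple[int, int]],
    dut: Callable[[int, int], int],
) -> List[Tuple[int, int, int, int]]:
    """Return (a, b, expected, got) for every vector where dut disagrees."""
    mismatches = []
    for a, b in vectors:
        expected = ref_fadd_s(a, b)
        got = dut(a, b)  # dut stands in for results read back from simulation
        if got != expected:
            mismatches.append((a, b, expected, got))
    return mismatches
```

For example, find_mismatches([(0x3F800000, 0x40000000)], dut) checks
1.0 + 2.0, where dut would return whatever the simulated FPU produced
for those operands. Like the sample testbench, this only checks values
and says nothing about timing.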
On 17/11/17 08:55, Aleksandar Pajkanovic wrote:
Hello,
my name is Aleksandar Pajkanovic and I am writing to express my interest
in the floating-point unit implementation within the LowRISC project.
I am still learning about the topic, so for starters I would like to be
able to test the FPU's performance.
Does the whole chip need to be generated in order to testbench the FPU?
In either case, what would be the simplest way to have it up and
running within the testbench?
While going through the repo, I first thought that the hardfloat
submodule represents the FPU, but later on I found fpu.scala within
the rocket submodule, and I am a bit confused about the relation
between these two. Seemingly, they have nothing to do with each other.
I am saying this with great fear, as I truly don't know; I am simply
not seeing the connection. Would you please provide some context on
the development of the FPU?
Finally, as I am still trying to bend my mind around RISC-V, Chisel,
Scala, chip generators and chips, please forgive any possible
redundancy. I did try searching the list and found no discussions of
the FPU.
Best regards,
Aleksandar
p.s. Guys, thank you for all the effort you are putting into the
development of open hardware.