ara_dispatcher: Vector Instruction Decoder and Issuer
The ara_dispatcher is the central instruction decoder and legality checker for Ara’s RISC-V vector unit. It receives instructions from the scalar core (CVA6) via the acc_req_i interface and dispatches well-formed vector requests to Ara’s backend using the ara_req_o interface.
Role in Ara
Decodes RISC-V vector (RVV) instructions received from the scalar core
Validates legality based on LMUL, SEW, CSR, segment loads/stores, and support fflags
Manages control and status registers (CSRs) for VL, VTYPE, VSTART, VXRM, and VXSAT
Handles load/store reshuffling to maintain consistent EEW across register groups
Issues vector requests via ara_req_o and coordinates responses via ara_resp_valid
Interface
Input Ports
| Signal | Width | Description |
|---|---|---|
|  | 1 | Clock input |
|  | 1 | Active-low reset |
| acc_req_i | struct | Incoming request from scalar core |
|  | 1 | Back-end ready to receive vector request |
|  | 1 | Back-end has completed a request |
|  | struct | Response metadata from Ara |
|  | 1 | Ara is idle, ready to accept new instructions |
|  | 1 | Vector load completed |
|  | 1 | Vector store completed |
Output Ports
| Signal | Width | Description |
|---|---|---|
|  | struct | Response back to scalar core |
|  | 1 | Ara request is valid |
| ara_req_o | struct | Decoded vector request |
|  | 1 | Pending segment memory operation tracker |
FSM States
IDLE: Waiting for valid vector instructions
WAIT_IDLE: Waiting for Ara to become idle (CSR ops)
WAIT_IDLE_FLUSH: Flushes vector state after exceptions
RESHUFFLE: Triggers register reshuffling before execution
Internal Concepts
CSR Registers
csr_vl_q, csr_vtype_q, csr_vstart_q: Active state of vector CSRs
csr_vxrm_q, csr_vxsat_q: Fixed-point rounding/saturation
EEW Tracking
In Ara, every vector register is encoded with a byte layout that places consecutive vector elements in consecutive lanes (i.e., element 0 in lane 0, element 1 in lane 1, and so on). This means that a vector interpreted with a different element width requires a byte layout reshuffle to keep consecutive vector elements in consecutive lanes.
eew_q[0..31] stores the effective element width (EEW) for each vector register, which is effectively the byte layout encoding of that register
Updated upon successful dispatch of instructions
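The effect of EEW on byte placement can be illustrated with a small sketch. It is a deliberately simplified model, not Ara's actual shuffling function: it only assumes that element i lives in lane i mod (number of lanes), as described above. The function names and the 4-lane, 16-byte register are hypothetical choices for illustration.

```python
def lane_of_byte(byte_idx: int, eew_bytes: int, nr_lanes: int) -> int:
    """Lane holding a given register byte, assuming element i sits in lane i % nr_lanes."""
    element = byte_idx // eew_bytes  # element that contains this byte
    return element % nr_lanes


def needs_reshuffle(old_eew_bytes: int, new_eew_bytes: int,
                    nr_lanes: int, vreg_bytes: int) -> bool:
    """True if any byte would sit in a different lane under the new EEW."""
    return any(
        lane_of_byte(b, old_eew_bytes, nr_lanes) != lane_of_byte(b, new_eew_bytes, nr_lanes)
        for b in range(vreg_bytes)
    )


# Byte 4 belongs to element 4 (lane 0) with 8-bit elements,
# but to element 1 (lane 1) with 32-bit elements.
print(lane_of_byte(4, 1, 4), lane_of_byte(4, 4, 4))
```

Under this model, a register written with one EEW and read with another has bytes in the wrong lanes, which is why the dispatcher tracks one EEW per register.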
Reshuffling
When a vector register needs to be re-interpreted with a different byte encoding, Ara’s dispatcher injects slide micro-operations to reshuffle the vector register’s byte layout.
Needed if the same register is used with a different EEW
Controlled by reshuffle_req_d[2:0] for vs1, vs2, vd
Buffering via eew_old_buffer_d, eew_new_buffer_d, etc.
Interface with CVA6
Vector instructions are dispatched from CVA6 to Ara when they reach the top of CVA6’s scoreboard, i.e., when they are no longer speculative and can be committed from CVA6’s perspective.
Ara’s dispatcher handshakes the request (and returns a response) if exceptions cannot happen for that instruction or if exceptions are immediately raised during decoding.
For example, arithmetic instructions can raise exceptions only during decoding, so the response to CVA6 takes a single cycle.
Memory operations can raise errors on the memory bus or exceptions during virtual-to-physical translation. Therefore, memory instructions freeze the dispatcher until the VLSU has reported back an exception or the absence of it. This process requires more than 1 cycle.
Instruction Decoding
Instructions are decoded based on the RVV encoding using extracted fields:
vmem_type, varith_type, etc.
mop, nf, vm, rs1, rs2, rd, mew, width
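As an illustration of the field extraction, the following sketch pulls the fields listed above out of a 32-bit vector load/store instruction word. The bit positions follow the RVV 1.0 load/store instruction format; the names `VMemFields`, `bits`, and `decode_vmem` are hypothetical and do not correspond to the RTL, which uses packed structs for the same purpose.

```python
from typing import NamedTuple


class VMemFields(NamedTuple):
    nf: int      # bits 31:29, number of fields minus one
    mew: int     # bit 28
    mop: int     # bits 27:26, addressing mode
    vm: int      # bit 25, 1 = unmasked
    rs2: int     # bits 24:20 (lumop/sumop, rs2, or vs2 depending on mop)
    rs1: int     # bits 19:15, base address register
    width: int   # bits 14:12, encoded element width
    rd: int      # bits 11:7 (vd for loads, vs3 for stores)
    opcode: int  # bits 6:0


def bits(insn: int, hi: int, lo: int) -> int:
    """Extract insn[hi:lo] as an unsigned integer."""
    return (insn >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_vmem(insn: int) -> VMemFields:
    return VMemFields(
        nf=bits(insn, 31, 29), mew=bits(insn, 28, 28), mop=bits(insn, 27, 26),
        vm=bits(insn, 25, 25), rs2=bits(insn, 24, 20), rs1=bits(insn, 19, 15),
        width=bits(insn, 14, 12), rd=bits(insn, 11, 7), opcode=bits(insn, 6, 0),
    )


# Unit-stride, unmasked load: rd = 1, rs1 = 10, width = 0b110, opcode = 0x07.
fields = decode_vmem(0x02056087)
is_segment = fields.nf != 0  # segment operations have a non-zero nf field
```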
Memory Operation Handling
Load Types: VLE, VLSE, VLXE, VLVX
Store Types: VSE, VSSE, VSXE, VSVX
Unit-stride, strided, indexed, and whole-register
Segment operations detected if nf != 0
Illegal Instruction Checks
Illegal cases include:
Illegal operand registers given the current SEW, LMUL state
EMUL × NF > 8
Access beyond register 31
Inconsistent EEW across a register group
Disallowed CSR writes or invalid opcodes
Fixed-point ops without hardware support
Floating-point ops (e.g., VFREC7) without FPExt support
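Some of the register-group checks above can be sketched in a few lines. This is a simplified model under stated assumptions rather than the dispatcher's logic: EMUL is computed as (EEW / SEW) × LMUL as in the RVV specification, NF is taken as the number of fields (the encoded nf plus one), and the function name `segment_access_is_legal` is hypothetical.

```python
from fractions import Fraction


def segment_access_is_legal(vd: int, eew: int, sew: int, lmul, nfields: int) -> bool:
    """Check a (segment) memory access against the register-group rules."""
    emul = Fraction(eew, sew) * lmul
    # EMUL itself must stay within the range 1/8 .. 8
    if not Fraction(1, 8) <= emul <= 8:
        return False
    # EMUL x NF > 8 is illegal
    if emul * nfields > 8:
        return False
    # A fractional EMUL still occupies one whole register per field
    regs_per_field = max(1, int(emul))
    # The register group must be aligned to its size
    if vd % regs_per_field != 0:
        return False
    # The access must not run past register 31
    if vd + nfields * regs_per_field > 32:
        return False
    return True


print(segment_access_is_legal(vd=0, eew=32, sew=32, lmul=2, nfields=4))
print(segment_access_is_legal(vd=30, eew=32, sew=32, lmul=1, nfields=4))
```

The first call describes four fields of two registers each starting at v0, which fits; the second would need v30 to v33 and is rejected.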
Reshuffling Flow
Triggered by EEW mismatch for reused vector registers
Masked out if the same register appears in multiple operand slots
FSM state switches to RESHUFFLE and issues internal reshuffle ops
Once reshuffling is complete, the instruction is re-issued
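The trigger and masking steps above can be sketched as follows. This is a behavioral model only: the function name `reshuffle_requests`, the dictionary-based operand description, and the choice to keep the request on the first slot that names a register (in the order vs1, vs2, vd) are assumptions for illustration. The RTL encodes the same information in the reshuffle_req_d bits and may resolve duplicates differently.

```python
def reshuffle_requests(eew_q, operands):
    """Return, per operand slot, whether a reshuffle is requested.

    eew_q:    list of 32 tracked EEWs (in bits), one per vector register
    operands: dict mapping slot name to (register index, required EEW)
    """
    requested_regs = set()
    requests = {}
    for slot in ("vs1", "vs2", "vd"):
        if slot not in operands:
            requests[slot] = False
            continue
        reg, required_eew = operands[slot]
        mismatch = eew_q[reg] != required_eew
        # Mask the request if the same register was already requested by another slot
        requests[slot] = mismatch and reg not in requested_regs
        if requests[slot]:
            requested_regs.add(reg)
    return requests


eew_q = [8] * 32   # every register currently laid out for 8-bit elements
eew_q[2] = 32      # except v2, which already holds 32-bit elements
req = reshuffle_requests(eew_q, {"vs1": (1, 32), "vs2": (1, 32), "vd": (2, 32)})
go_to_reshuffle_state = any(req.values())
```

In this example v1 needs one reshuffle even though it appears in two slots, and v2 needs none because its tracked EEW already matches.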
CSR Handling
All CSR access instructions (e.g., csrrw, csrrs, csrrc, and immediate variants) are handled.
Only vstart, vxrm, vxsat are writable
vl, vtype, vlenb are read-only
Illegal accesses cause an exception
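The access rules above amount to a small permission check, sketched here as a behavioral model. The names `csr_access` and `IllegalCsrAccess` are hypothetical, and a read-only CSR instruction form (for example csrrs with x0 as source) is modeled by passing no write value.

```python
WRITABLE = {"vstart", "vxrm", "vxsat"}
READ_ONLY = {"vl", "vtype", "vlenb"}


class IllegalCsrAccess(Exception):
    """Raised where the dispatcher would report an illegal-instruction exception."""


def csr_access(csrs: dict, name: str, write_value=None) -> int:
    """Return the old CSR value and optionally update the CSR."""
    if name not in WRITABLE | READ_ONLY:
        raise IllegalCsrAccess(f"unknown vector CSR: {name}")
    old = csrs[name]
    if write_value is not None:
        if name in READ_ONLY:
            raise IllegalCsrAccess(f"{name} is read-only")
        csrs[name] = write_value
    return old


csrs = {"vstart": 0, "vxrm": 0, "vxsat": 0, "vl": 16, "vtype": 0, "vlenb": 512}
old_vstart = csr_access(csrs, "vstart", write_value=4)  # allowed write
current_vl = csr_access(csrs, "vl")                     # plain read
```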
Zero VL Behavior
If vl = 0, most instructions are treated as NOPs.
Some exceptions apply (whole-register ops, special instructions)
Response is generated with req_ready and resp_valid set
Ensures the scalar pipeline doesn’t stall