library ieee;
use ieee.std_logic_1164.all;

package insn_helpers is
    function insn_rs (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_rt (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_ra (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_rb (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_rcreg (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_si (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_ui (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_l (insn_in : std_ulogic_vector) return std_ulogic;
    function insn_sh32 (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_mb32 (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_me32 (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_li (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_lk (insn_in : std_ulogic_vector) return std_ulogic;
    function insn_aa (insn_in : std_ulogic_vector) return std_ulogic;
    function insn_rc (insn_in : std_ulogic_vector) return std_ulogic;
    -- Add basic XER support
    --
    -- The carry is currently internal to execute1. We don't handle any of
    -- the other XER fields.
    --
    -- This creates a type called "xer_common_t" that contains the commonly
    -- used XER bits (CA, CA32, SO, OV, OV32).
    --
    -- The value is stored in the CR file (though it could be a separate
    -- module). The rest of the bits will be implemented as a separate
    -- SPR and the two parts reconciled in mfspr/mtspr in later commits.
    --
    -- We always read XER in decode2 (there is little point not to)
    -- and send it down all pipeline branches, as it will be needed in
    -- writeback for all types of instructions when CR0:SO needs to be
    -- updated (such forms exist for all pipeline branches even if we don't
    -- yet implement them).
    --
    -- To avoid having to track XER hazards, we forward it back in EX1. This
    -- assumes that other pipeline branches that can modify it (mult and div)
    -- are running single issue for now.
    --
    -- One additional hazard to beware of is an XER:SO modifying instruction
    -- in EX1 followed immediately by a store conditional. Due to our writeback
    -- latency, the store will go down the LSU with the previous XER value,
    -- thus the stcx. will set CR0:SO using an obsolete SO value.
    --
    -- I doubt any code exists that relies on this behaviour being correct,
    -- but we should account for it regardless, possibly by ensuring that
    -- stcx. remains single issue initially, or later by adding some minimal
    -- tracking or moving the LSU into the same pipeline as execute.
    --
    -- Missing some obscure XER-affecting instructions like addex or mcrxrx.
    --
    -- [paulus@ozlabs.org - fix CA32 and OV32 for OP_ADD, fix order of
    -- arguments to set_ov]
    --
    -- Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    -- Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    function insn_oe (insn_in : std_ulogic_vector) return std_ulogic;
    function insn_bd (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_bf (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_bfa (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_cr (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_bt (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_ba (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_bb (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_fxm (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_bo (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_bi (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_bh (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_d (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_ds (insn_in : std_ulogic_vector) return std_ulogic_vector;
    -- core: Implement quadword loads and stores
    --
    -- This implements the lq, stq, lqarx and stqcx. instructions.
    --
    -- These instructions all access two consecutive GPRs; for example the
    -- "lq %r6,0(%r3)" instruction will load the doubleword at the address
    -- in R3 into R7 and the doubleword at address R3 + 8 into R6. To cope
    -- with having two GPR sources or destinations, the instruction gets
    -- repeated at the decode2 stage, that is, for each lq/stq/lqarx/stqcx.
    -- coming in from decode1, two instructions get sent out to execute1.
    --
    -- For these instructions, the RS or RT register gets modified on one
    -- of the iterations by setting the LSB of the register number. In LE
    -- mode, the first iteration uses RS|1 or RT|1 and the second iteration
    -- uses RS or RT. In BE mode, this is done the other way around. In
    -- order for decode2 to know what endianness is currently in use, we
    -- pass the big_endian flag down from icache through decode1 to decode2.
    -- This is always in sync with what execute1 is using because only rfid
    -- or an interrupt can change MSR[LE], and those operations all cause
    -- a flush and redirect.
    --
    -- There is now an extra column in the decode tables in decode1 to
    -- indicate whether the instruction needs to be repeated. Decode1 also
    -- enforces the rule that lq with RA = RT and lqarx with RA = RT or
    -- RB = RT are illegal.
    --
    -- Decode2 now passes a 'repeat' flag and a 'second' flag to execute1,
    -- and execute1 passes them on to loadstore1. The 'repeat' flag is set
    -- for both iterations of a repeated instruction, and 'second' is set
    -- on the second iteration. Execute1 does not take asynchronous or
    -- trace interrupts on the second iteration of a repeated instruction.
    --
    -- Loadstore1 uses 'next_addr' for the second iteration of a repeated
    -- load/store so that we access the second doubleword of the memory
    -- operand. Thus loadstore1 accesses the doublewords in increasing
    -- memory order. For 16-byte loads this means that the first iteration
    -- writes GPR RT|1. It is possible that RA = RT|1 (this is a legal
    -- but non-preferred form), meaning that if the memory operand was
    -- misaligned, the first iteration would overwrite RA but then the
    -- second iteration might take a page fault, leading to corrupted state.
    -- To avoid that possibility, 16-byte loads in LE mode take an
    -- alignment interrupt if the operand is not 16-byte aligned. (This
    -- is the case anyway for lqarx, and we enforce it for lq as well.)
    --
    -- Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
    function insn_dq (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_dx (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_to (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_bc (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_sh (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_me (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_mb (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_frt (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_fra (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_frb (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_frc (insn_in : std_ulogic_vector) return std_ulogic_vector;
    function insn_u (insn_in : std_ulogic_vector) return std_ulogic_vector;
end package insn_helpers;
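
-- Illustrative sketch only, not part of the original design: the quadword
-- load/store commit note in this file says that one iteration of a repeated
-- lq/stq/lqarx/stqcx. uses RS|1 or RT|1 (first iteration in LE mode, second
-- iteration in BE mode). The package name quad_reg_sketch and the function
-- quad_reg are hypothetical names chosen for this example; the real selection
-- is done in decode2 and may be written quite differently.
library ieee;
use ieee.std_logic_1164.all;

package quad_reg_sketch is
    -- reg:        5-bit RS/RT register number taken from the instruction
    -- big_endian: '1' when the core is running big-endian
    -- second:     '1' on the second iteration of the repeated instruction
    function quad_reg (reg        : std_ulogic_vector(4 downto 0);
                       big_endian : std_ulogic;
                       second     : std_ulogic) return std_ulogic_vector;
end package quad_reg_sketch;

package body quad_reg_sketch is
    function quad_reg (reg        : std_ulogic_vector(4 downto 0);
                       big_endian : std_ulogic;
                       second     : std_ulogic) return std_ulogic_vector is
        variable r : std_ulogic_vector(4 downto 0) := reg;
    begin
        -- LE (big_endian = '0'): the first iteration (second = '0') gets reg|1.
        -- BE (big_endian = '1'): the second iteration (second = '1') gets reg|1.
        -- Both cases reduce to "second = big_endian".
        if second = big_endian then
            r(0) := '1';
        end if;
        return r;
    end;
end package body quad_reg_sketch;

-- Under these assumptions, quad_reg("00110", '0', '0') yields "00111" (R7) and
-- quad_reg("00110", '0', '1') yields "00110" (R6), matching the
-- "lq %r6,0(%r3)" example in the commit note for LE mode.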
package body insn_helpers is
    function insn_rs (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(25 downto 21);
    end;

    function insn_rt (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(25 downto 21);
    end;

    function insn_ra (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(20 downto 16);
    end;

    function insn_rb (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 11);
    end;

    function insn_rcreg (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(10 downto 6);
    end;

    function insn_si (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 0);
    end;

    function insn_ui (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 0);
    end;

    function insn_l (insn_in : std_ulogic_vector) return std_ulogic is
    begin
        return insn_in(21);
    end;

    function insn_sh32 (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 11);
    end;

    function insn_mb32 (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(10 downto 6);
    end;

    function insn_me32 (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(5 downto 1);
    end;

    function insn_li (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(25 downto 2);
    end;

    function insn_lk (insn_in : std_ulogic_vector) return std_ulogic is
    begin
        return insn_in(0);
    end;

    function insn_aa (insn_in : std_ulogic_vector) return std_ulogic is
    begin
        return insn_in(1);
    end;

    function insn_rc (insn_in : std_ulogic_vector) return std_ulogic is
    begin
        return insn_in(0);
    end;

    function insn_oe (insn_in : std_ulogic_vector) return std_ulogic is
    begin
        return insn_in(10);
    end;

    function insn_bd (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 2);
    end;

    function insn_bf (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(25 downto 23);
    end;

    function insn_bfa (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(20 downto 18);
    end;

    function insn_cr (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(10 downto 1);
    end;

    function insn_bb (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 11);
    end;

    function insn_ba (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(20 downto 16);
    end;

    function insn_bt (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(25 downto 21);
    end;

    function insn_fxm (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(19 downto 12);
    end;

    function insn_bo (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(25 downto 21);
    end;

    function insn_bi (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(20 downto 16);
    end;

    function insn_bh (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(12 downto 11);
    end;

    function insn_d (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 0);
    end;

    function insn_ds (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 2);
    end;

    function insn_dq (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 4);
    end;

    -- DX-form immediate: the d0, d1 and d2 pieces are concatenated in that order.
    function insn_dx (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 6) & insn_in(20 downto 16) & insn_in(0);
    end;

    function insn_to (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(25 downto 21);
    end;

    function insn_bc (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(10 downto 6);
    end;

    -- 6-bit shift amount: the high bit lives in bit 1, the low five in bits 15..11.
    function insn_sh (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(1) & insn_in(15 downto 11);
    end;

    -- The 6-bit ME and MB fields occupy the same instruction bits (an
    -- instruction has one or the other), so insn_me and insn_mb are identical.
    function insn_me (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(5) & insn_in(10 downto 6);
    end;

    function insn_mb (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(5) & insn_in(10 downto 6);
    end;

    function insn_frt (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(25 downto 21);
    end;

    function insn_fra (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(20 downto 16);
    end;

    function insn_frb (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 11);
    end;

    function insn_frc (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(10 downto 6);
    end;

    function insn_u (insn_in : std_ulogic_vector) return std_ulogic_vector is
    begin
        return insn_in(15 downto 12);
    end;
end package body insn_helpers;
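
-- Illustrative sketch only, not part of the original file: a minimal
-- self-checking testbench showing how the helpers slice a 32-bit instruction
-- word. The entity name insn_helpers_tb, the constants below and the
-- assumption that insn_helpers is compiled into library "work" are all
-- hypothetical. x"38640005" is the encoding of "addi r3,r4,5" (primary
-- opcode 14, RT = 3, RA = 4, SI = 5). x"00000802" is a synthetic bit pattern,
-- not a real instruction, chosen only to exercise the split SH field.
library ieee;
use ieee.std_logic_1164.all;

library work;
use work.insn_helpers.all;

entity insn_helpers_tb is
end entity insn_helpers_tb;

architecture behave of insn_helpers_tb is
begin
    check: process
        constant addi_insn : std_ulogic_vector(31 downto 0) := x"38640005";
        constant sh_bits   : std_ulogic_vector(31 downto 0) := x"00000802";
    begin
        -- Register and immediate fields of the D-form addi.
        assert insn_rt(addi_insn) = "00011"
            report "insn_rt mismatch" severity failure;
        assert insn_ra(addi_insn) = "00100"
            report "insn_ra mismatch" severity failure;
        assert insn_si(addi_insn) = x"0005"
            report "insn_si mismatch" severity failure;

        -- Bit 1 supplies the top bit of SH, bits 15..11 the low five bits.
        assert insn_sh(sh_bits) = "100001"
            report "insn_sh mismatch" severity failure;

        report "insn_helpers checks passed" severity note;
        wait;
    end process;
end architecture behave;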