execute1: Fix forwarding of result when doing delayed LR update

Random execution testcases showed that a bdnzl which doesn't branch,
followed immediately by a bdnz, uses the wrong value for CTR for the
bdnz.  Decode2 detects the read-after-write hazard on CTR and tells
execute1 to use the bypass path.  However, the bdnzl takes two cycles
because it has to write back both CTR and LR, meaning that by the time
the bdnz starts to execute, r.e.write_data no longer contains the CTR
value, but instead contains zero.

To fix this, we make execute1 maintain the written-back value of CTR
in r.e.write_data across the cycle where LR is written back (this is
possible because the LR writeback uses the exc_write_data path).

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Paul Mackerras 3 years ago
parent 27ac74a341
commit e49192cb5b

@ -1145,6 +1145,9 @@ begin
v.e.exc_write_data := r.next_lr;
v.e.exc_write_reg := fast_spr_num(SPR_LR);
v.e.valid := '1';
-- Keep r.e.write_data unchanged next cycle in case it is needed
-- for a forwarded result (e.g. for CTR).
result := r.e.write_data;
elsif r.cntz_in_progress = '1' then
-- cnt[lt]z always takes two cycles
result := countzero_result;