This adds a TLB to dcache, providing the ability to translate
addresses for loads and stores. No protection mechanism has been
implemented yet. The MSR_DR bit controls whether addresses are
translated through the TLB.
The TLB is a fixed-pagesize, set-associative cache. Currently
the page size is 4kB and the TLB is 2-way set associative with 64
entries per set.
This implements the tlbie instruction. RB bits 10 and 11 control
whether the whole TLB is invalidated (if either bit is 1) or just
a single entry corresponding to the effective page number in bits
12-63 of RB.
As an extension until we get a hardware page table walk, a tlbie
instruction with RB bits 9-11 set to 001 will load an entry into
the TLB. The TLB entry value is in RS in the format of a radix PTE.
Currently there is no proper handling of TLB misses. The load or
store will not be performed but no interrupt is generated.
In order to make timing at 100MHz on the Arty A7-100, we compare
the real address from each way of the TLB with the tag from each way
of the cache in parallel (requiring # TLB ways * # cache ways
comparators). Then the result is selected based on which way hit in
the TLB. That avoids a timing path going through the TLB EA
comparators, the multiplexer that selects the RA, and the cache tag
comparators.
The hack where addresses of the form 0xc------- are marked as
cache-inhibited is kept for now but restricted to real-mode accesses.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This arranges for some mfspr and mtspr to get sent to loadstore1
instead of being handled in execute1. In particular, DAR and DSISR
are handled this way. They are therefore "slow" SPRs.
While we're at it, fix the spelling of HEIR and remove mention of
DAR and DSISR from the comments in execute1.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This adds a "gpr" command for reading 1 or more GPRs/fast SPRs,
and a "mw" command for writing an 8-byte value to memory. It also
adds an "icreset" command for resetting the instruction cache
and fixes the "creset" command to actually reset the core instead
of starting it. The MSR is now printed along with the NIA in the
status information.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This provides commands on the debug interface to read the value of
the MSR or any of the 64 GSPR register file entries. The GSPR values
are read using the B port of the register file in a cycle when
decode2 is not using it.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Mfspr from an unimplemented SPR should be a no-op in privileged state,
so in this case we need to write back whatever was previously in the
destination register. For problem state, both mtspr and mfspr to
unimplemented SPRs should cause a program interrupt.
There are special cases in the architecture for SPRs 0, 4 5 and 6
which we still don't implement.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This mainly required the addition of an entry to the opcode 31 decode
table and a 32-bit sign-extender in the rotator.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
An external signal can control whether the core will start
executing at the standard or the alternate reset address.
This will be used when litedram is initialized by microwatt
itself, to route the reset to the built-in init code secondary
block RAM.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This comes in two parts:
- A generator script which uses LiteX to generate litedram cores
along with their init files for various boards (currently Arty and
Nexys-video). This comes with configs for arty and nexys_video.
- A fusesoc "generator" which uses pre-generated litedram cores
The generation process is manual on purpose. This include pre-generated
cores for the two above boards.
This is done so that one doesn't have to install LiteX to build
microwatt. In addition, the generator script or wrapper vhdl tend to
break when LiteX changes significantly which happens.
This is still rather standalone and hasn't been plumbed into the SoC
or the FPGA toplevel files yet.
At this point LiteDRAM self-initializes using a built-in VexRiscv
"Minimum" core obtained from LiteX and included in this commit. There
is some plumbing to generate and cores that are initialized by Microwatt
directly but this isn't working yet and so isn't enabled yet.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The icache would still spit out an instruction which could
cause a 0x700 instead of a reset.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
LiteDRAM at the moment pretty much enforces 100Mhz, and our software
isn't quite yet adaptable, so switch out default to 100Mhz accross
the board. Recent timing improvements should make it a non-issue.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
During slow instructions such as multiply or divide, if a decrementer
(or other asynchronous) interrupt becomes pending, it disrupts the
logic that keeps stall asserted until the end of the slow
instruction, and the interrupt logic starts trying to deliver the
interrupt before the slow instruction has finished.
To fix that, make the interrupt logic wait until it sees e_in.valid
set before setting exception to 1.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
We can hit the assert for req_op = OP_STORE_HIT and reloading in the
case of dcbz, since it looks like a store. Therefore we need to
exclude that case from the assert.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This adds a test that tries to execute various privileged instructions
with MSR[PR] = 1. This also incidentally tests some of the MSR bit
manipulations.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This adds logic to dcache and loadstore1 to implement dcbz. For now
it zeroes a single cache line (by default 64 bytes), not 128 bytes
like IBM Power processors do.
The dcbz operation is performed much like a load miss, except that
we are writing zeroes to memory instead of reading. As each ack
comes back, we write zeroes to the BRAM instead of data from memory.
In this way we zero the line in memory and also zero the line of
cache memory, establishing the line in the cache if it wasn't already
resident. If it was already resident then we overwrite the existing
line in the cache.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
In preparation for adding a TLB to the dcache, this plumbs the
insn_type from execute1 through to loadstore1, so that we can have
other operations besides loads and stores (e.g. tlbie) going to
loadstore1 and thence to the dcache. This also plumbs the unit field
of the decode ROM from decode2 through to execute1 to simplify the
logic around which ops need to go to loadstore1.
The load and store data formatting are now not conditional on the
op being OP_LOAD or OP_STORE. This eliminates the inferred latches
clocked by each of the bits of r.op that we were getting previously.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This adds logic to execute1 to check, when MSR[PR] = 1, whether each
instruction arriving to be executed is a privileged instruction.
If it is, a privileged-instruction type program interrupt is generated.
For the mtspr and mfspr instructions, we need to look at bit 20 of the
instruction (bit 4 of the SPR number) to determine if the SPR is
privileged.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
This makes our treatment of the MSR conform better with the ISA.
- On reset, initialize the MSR to have the SF and LE bits set and
all the others reset. For good measure initialize r properly too.
- Fix the bit numbering in msr_copy (the code was using big-endian
bit numbers, not little-endian).
- Use constants like MSR_EE to index MSR bits instead of expressions
like '63 - 48', for readability.
- Set MSR[SF, LE] and clear MSR[PR, IR, DR, RI] on interrupts.
- Copy the relevant fields for rfid instead of using msr_copy, because
the partial function fields of the MSR should be left unchanged,
not zeroed. Our implementation of rfid is like the architecture
description of hrfid, because we don't implement hypervisor mode.
- Return the whole MSR for mfmsr.
- Implement the L field for mtmsrd (L=1 copies just EE and RI).
- For mtmsrd with L=0, leave out the HV, ME and LE bits as per the arch.
- For mtmsrd and rfid, if PR ends up set, then also set EE, IR and DR
as per the arch.
- A few other minor tidyups (no semantic change).
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Checks interrupt masking and priorities.
Adds to `make test_xics` which is run in `make check` also.
Signed-off-by: Michael Neuling <mikey@neuling.org>
New unified ICP and ICS XICS compliant interrupt controller.
Configurable number of hardware sources.
Fixed hardware source number based on hardware line taken. All
hardware interrupts are a fixed priority. Level interrupts supported
only.
Hardwired to 0xc0004000 in SOC (UART is kept at 0xc0002000).
Signed-off-by: Michael Neuling <mikey@neuling.org>
This fixes a bug in the logic where we would still send a load
or store instruction to loadstore1 even though we have decided
to take an asynchronous interrupt.
Reported-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Use a symlink to share the console code in hello_world. Not ideal,
but we can improve on it later.
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>
I'm hitting a build error:
error[E0050]: method `alloc` has 2 parameters but the declaration in trait `core::alloc::AllocRef::alloc` has 3
Updating the version of linked_list_allocator fixes it. I updated
heapless to while I was at it.
Signed-off-by: Anton Blanchard <anton@linux.ibm.com>