microwatt

Commit Graph

Author	SHA1	Message	Date
Paul Mackerras	e92d49375f	fetch1: Reorganize fetch1 to provide an asynchronous early next NIA to icache This adds a next_nia field to the Fetch1ToIcacheType record, which provides an indication of what will be in the nia field on the next non-stalled cycle. This is intended to be as fast as possible, being a selection from two redirect addresses (from writeback and decode1) or an internal register (r_int.next_nia). Reset addresses and predicted branch targets come through this internal register. The rearrangement here has the side effect that we can now use the BTC on the first instruction after a taken branch, whereas previously the BTC was only active starting with the second instruction after a taken branch. This provides a slight improvement in performance. This also fixes a buglet in icache where it would assert its stall output when i_in.req was false. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	2dceb28830	Improve timing of redirect_nia going from decode1 to fetch1 This moves the addition that computes the branch target address for statically predicted taken branches before a clock edge, so the redirect_nia signal going to fetch1 comes from a clean latch. The address generation logic is also simplified somewhat, and conditional absolute branches to negative addresses are no longer predicted taken (this should have no impact on performance as such branches are basically never used). Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	1c4b5def36	Improve timing of redirect_nia going from writeback to fetch1 This gets rid of the adder in writeback that computes redirect_nia. Instead, the main adder in the ALU is used to compute the branch target for relative branches. We now decode b and bc differently depending on the AA field, generating INSN_brel, INSN_babs, INSN_bcrel or INSN_bcabs as appropriate. Each one has a separate entry in the decode table in decode1; the *rel versions use CIA as the A input. The bclr/bcctr/bctar and rfid instructions now select ramspr_result for the main result mux to get the redirect address into ex1.e.write_data. For branches which are predicted taken but not actually taken, we need to redirect to the following instruction. We also need to do that for isync. We do this in the execute2 stage since whether or not to do it depends on the branch result. The next_nia computation is moved to the execute2 stage and comes in via a new leg on the secondary result multiplexer, making next_nia available ultimately in ex2.e.write_data. This also means that the next_nia leg of the primary result multiplexer is gone. Incrementing last_nia by 4 for sc (so that SRR0 points to the following instruction) is also moved to execute2. Writing CIA+4 to LR was previously done through the main result multiplexer. Now it comes in explicitly in the ramspr write logic. Overall this removes the br_offset and abs_br fields and the logic to add br_offset and next_nia, and one leg of the primary result multiplexer, at the cost of a few extra control signals between execute1 and execute2 and some multiplexing for the ramspr write side and an extra input on the secondary result multiplexer. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	06ff486567	icache: Restore primary opcode to instruction word The icache stores a predecoded insn_code value for each instruction, and so as to fit in 36 bits, omits the primary opcode (the most significant 6 bits) of each instruction. Previously, for valid instructions, the primary opcode field of the instruction delivered to decode1 was a part-representation of the insn_code value rather than the actual primary opcode. This adds a lookup table to compute the primary opcode from the insn_code and deliver it in the instruction words supplied to decode1. In order that each insn_code can be associated with a single primary opcode value, the various no-operation instructions with primary opcode 31 (the reserved no-ops and dss, dst and dstst) have been given a new insn_code, INSN_rnop, leaving INSN_nop for the preferred no-op (ori r0,r0,0). Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	f59485f157	Merge pull request #420 from paulusmack/master Various minor improvements	2 years ago
Paul Mackerras	8dc24416aa	Merge pull request #421 from paulusmack/fixes Fix instruction logging	2 years ago
Paul Mackerras	b1b1367cd5	icache: Fix instruction sent to log Log the instruction read from the icache, not the instruction (if any) being written to the icache. Fixes: `6db626d245` ("icache: Log 36 bits of instruction rather than 32") Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	af62b9f1eb	scripts/fmt_log: Update for recent changes This updates fmt_log.c to account for the recent changes to insn_type_t and to unit_t. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	f668597f67	Merge pull request #419 from paulusmack/prefix Add support for prefixed instructions	2 years ago
Paul Mackerras	4bef477e29	core_debug: Add support for detecting writes to a memory address This adds a new type of stop trigger for the log buffer which triggers when any byte(s) of a specified doubleword of memory are written. The trigger logic snoops the wishbone for writes to the address specified and stops the log 256 cycles later (same as for the instruction fetch address trigger). The trigger address is a real address and sees DMA writes from devices as well as stores done by the CPU. The mw_debug command has a new 'mtrig' subcommand to set the trigger and query its state. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	a2890745d5	Makefile: Remove long micropython test from check_light It takes a very long time, so remove it from the "light" check. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	9c3d14dd5a	dcache: Make reading of DTLB independent of d_in.valid This improves timing. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	8c5dabd67f	dcache: Make r1.acks_pending independent of r1.state With this, the logic that maintains r1.acks_pending operates in every state based on r1.wb and wishbone_in, rather than only operating in STORE_WAIT_ACK state. This makes things a bit clearer and improves timing slightly. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	84008fbf41	arty: Change shield I/O pin bus into individual signals Make the shield I/O pins be individual signals rather than a bus in order to avoid warnings on pins which don't have both a driver and a receiver. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	b7ccffe2a3	Merge pull request #404 from CodeConstruct:dev/gpio-interrupt Interrupts for GPIO Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	b50170cd1d	Implement byte reversal instructions This implements the byte-reverse halfword, word and doubleword instructions: brh, brw, and brd. These instructions were added to the ISA in version 3.1. They use a new OP_BREV insn_type value. The logic for these instructions is implemented in logical.vhdl. In order to avoid going over 64 insn_type values, OP_AND and OP_OR were combined into OP_LOGIC, which is like OP_AND except that the RS input can be inverted as well as the RB input. The various forms of OR instruction are then implemented using the identity a OR b = NOT (NOT a AND NOT b). The 'is_signed' field of the instruction decode table is used to indicate that RS should be inverted. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	fd8c0000c0	Implement set[n]bc[r] instructions This implements the setbc, setnbc, setbcr and setnbcr instructions. Because the insn_type_t type already has 64 elements, this uses the existing OP_SETB for the new instructions, and has execute1 compute different results depending on bits 6-9 of the instruction. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2 years ago
Paul Mackerras	7c5a2bcaf4	tests: Add a test for prefixed instructions Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	3 years ago
Paul Mackerras	c4492c843a	Implement interrupts for prefixed instructions This arranges to generate an illegal instruction type program interrupt for illegal prefixed instructions, that is, those where the suffix is not a legal value given the prefix, or the prefix has a reserved value in the subtype field. This implementation doesn't generate an interrupt for the invalid 8LS:D and MLS:D instruction forms where R = 1 and RA != 0. (In those cases it uses (RA) as the addend, i.e. it ignores the R bit.) This detects the case where the address of an instruction prefix is equal mod 64 to 60, and generates an alignment interrupt in that case. This also arranges to set bit 34 of SRR1 when an interrupt occurs due to a prefixed instruction, for those interrupts where that is required (i.e. trace, alignment, floating-point unavailable, data storage, data segment, and most cases of program interrupt). Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	3 years ago
Paul Mackerras	39ca675ce3	Decode prefixed instructions This adds logic to do basic decoding of the prefixed instructions defined in PowerISA v3.1B which are in the SFFS (Scalar Fixed plus Floating-Point Subset) compliancy subset. In PowerISA v3.1B SFFS, there are 14 prefixed load/store instructions plus the prefixed no-op instruction (pnop). The prefixed load/store instructions all use an extended version of D-form, which has an extra 18 bits of displacement in the prefix, plus an 'R' bit which enables PC-relative addressing. When decode1 sees an instruction word where the insn_code is INSN_prefix (i.e. the primary opcode was 1), it stores the prefix word and sends nothing down to decode2 in that cycle. When the next valid instruction word arrives, it is interpreted as a suffix, meaning that its insn_code gets modified before being used to look up the decode table. The insn_code values are rearranged so that the values for instructions which are the suffix of a valid prefixed instruction are all at even indexes, and the corresponding prefixed instructions follow immediately, so that an insn_code value can be converted to the corresponding prefixed value by setting the LSB of the insn_code value. There are two prefixed instructions, pld and pstd, for which the suffix is not a valid SFFS instruction by itself, so these have been given dummy insn_code values which decode as illegal (INSN_op57 and INSN_op61). For a prefixed instruction, decode1 examines the type and subtype fields of the prefix and checks that the suffix is valid for the type and subtype. This check doesn't affect which entry of the decode table is used; the result is passed down to decode2, and will in future be acted upon in execute1. The instruction address passed down to decode2 is the address of the prefix. To enable this, part of the instruction address is saved when the prefix is seen, and then the instruction address received from icache is partly overlaid by the saved prefix address. Because prefixed instructions are not permitted to cross 64-byte boundaries, we only need to save bits 5:2 of the instruction address to do this. If the alignment restriction ever gets relaxed, we will then need to save more bits of the address. Decode2 has been extended to handle the R bit of the prefix (in 8LS and MLS forms) and to be able to generate the 34-bit immediate value from the prefix and suffix. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	3 years ago
Paul Mackerras	7af0e001ad	Move insn_codes for mcrfs, mtfsb0/1 and mtfsfi This moves the insn_code values for mcrfs, mtfsb0/1 and mtfsfi into the region used for floating-point instructions. This means that in no-FPU implementations, they will get turned into illegal instructions in predecode. We then don't need the code in execute1 that makes FP instructions illegal in no-FPU implementations. We also remove the NONE value for unit_t, since it was only ever used with insn_type = OP_ILLEGAL, and the check for unit = NONE was redundant with the check for insn_type = OP_ILLEGAL. Thus the check for unit = NONE is no longer needed and is removed here. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	3 years ago
Paul Mackerras	4416ebe92e	fetch1: Change the way predictions from the BTC are sent downstream Instead of sending down the predicted taken/not-taken bits with the target of the branch, we now send them down with the branch itself. Previously icache adjusted for this by sending the prediction bits to decode1 without a 1-clock delay while everything else had a 1-clock delay. Now icache keeps the prediction bits with the rest of the attributes for the request. Also fix a buglet in fetch1 where the first address sent out after reset didn't have .req set. Currently this doesn't cause a problem because icache doesn't really look at .req. Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	3 years ago
Anton Blanchard	83dcfeabf8	Merge pull request #417 from kraigher/master Add VHDL-LS language server configuration	3 years ago
Olof Kraigher	341a507486	Add vhdl_ls.toml dump to run.py Signed-off-by: Olof Kraigher <olof.kraigher@gmail.com>	3 years ago
Michael Neuling	da5d3ded3c	Merge pull request #409 from CodeConstruct/dev/soc-reset Make syscon SOC reset work	3 years ago
Matt Johnston	56f1c41e9c	arty: Add software reset from syscon Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Matt Johnston	1f5a2e8aaa	soc: Expose sw_soc_reset for syscon reset The soc itself will be reset when a syscon soc reset is triggered. Separately, top-* board files can use the sw_soc_rst signal if they need to reset other peripherals. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Matt Johnston	89d8cf0788	Regenerate litedram with updated sdram init Using litedram c770dd62edc281c370f9e2c694fe4ac1525a0b4a litex e570b612b2a9d8f8d2002d79497bda0dc35b936a Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Matt Johnston	1874cad5b7	litedram: only run sdram init at first boot Subsequent boots can skip the dram configuration, it will already be in a usable state. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Matt Johnston	4bd45af739	Move alt_reset to syscon Instead of connecting core_alt_reset to litedram init_done, it moves to a syscon register bit. This simplifies top-* files and future soc_reset handling. sdram main.c can unset the alt_reset bit after sdram init. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Matt Johnston	dfecda3a5f	bin2hex: handle any file length, not just 8 or 4 Treat the input as if it was padded with zeroes to a multiple of 8. This is needed if the .data in a binary changes size, since it then won't be a nice multiple of 4 or 8. At present the microwatt binaries are all multiples of 8, but making code alterations could make bin2hex fail unexpectedly. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Michael Neuling	7d928200b8	Merge pull request #415 from ozbenh/uart16550-core Bundle the uart16550 core file	3 years ago
Michael Neuling	964b97e85c	Merge pull request #414 from ozbenh/misc Fixup plru_tb to use the new plrufn, take out the old plru and vunit test misc changes	3 years ago
Benjamin Herrenschmidt	d299ea925e	Bundle the uart16550 core file We already carry the UART verilog source, so we may as well use it instead of requiring fusesoc to import it from its library Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	3 years ago
Benjamin Herrenschmidt	6068b635ae	Fix plru_tb to use the new plrufn and take out the old plru.vhdl This reworks (and simplifies) plru_tb to use the new plrufn module instead of the old (and now unused) plru module. The latter is now removed completely. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	3 years ago
Michael Neuling	432d9f3150	Merge pull request #413 from ozbenh/fix-io-bridge-qw-store soc: Fix issues with 64-bit stores to IO bridge	3 years ago
Benjamin Herrenschmidt	4e32dcff80	Clean vunit_out on make clean Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	3 years ago
Benjamin Herrenschmidt	41328306f3	Add shebang to run.py It's useful to run the vunit tests by hand, this makes it easier Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	3 years ago
Benjamin Herrenschmidt	3f788e87dc	soc: Fix issues with 64-bit stores to IO bridge The IO bridge would latch the top half of write data and selection signals when issuing the second downstream store. Unfortunately at this point the bridge has already "accepted" the upstream store from the core (due to stall being 0 on the cycle when stb/cyc are 1), so the values on the wishbone signals aren't stable and might already reflect a subsequent wishbone command. This causes occasional data corruption of 64-bit stores through the IO bridge. While at it, take out a bunch of useless conditions on the data latch path. It doesn't matter whether we is 0 or 1, we can just always latch the data, the destination will decide whether to use the content or not, which should save a bit of hardware. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	3 years ago
Matt Johnston	52da535b16	gpio: Add interrupts and trigger registers Allows triggering on rising, falling or both edges, as well as on high or low level. Registers are compatible with the Linux ftgpio010 driver. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Paul Mackerras	413f2dc5d6	Merge pull request #411 from ozbenh/dcache-plru-update-fix Dcache PLRU update fix	3 years ago
Benjamin Herrenschmidt	76f61ef823	dcache: Update PLRU on misses as well as hits The current dcache will not update the PLRU on a cache miss which is later satisfied during the reload process. Thus subsequent misses will potentially evict the same cache line. The same issue happens with dcbz, which is treated more or less as a load miss. This fixes it by triggering a PLRU update when r1.choose_victim is set, which happens for one cycle on a miss in order to snapshot the PLRU output. This means we will update the PLRU on the same cycle as we capture its output, which is fine (the new value will be visible on the next cycle). That way, a "miss" will result in a PLRU update to reflect that the entry being refilled is actually used (and will be used to serve subsequent load operations from the same cache line while being refilled). Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	3 years ago
Benjamin Herrenschmidt	3edbbf5f18	Fix dcache_tb (and add dump of victim way to dcache) It bitrotted... more signals need to be initialized. This also adds a lot more accesses with different timing conditions, making it possible to test cases such as hit during reload, hit with reload forward, hit on idle cache, etc. It also exposes a bug where the cache miss caused by the read of 0x140 uses the same victim way as the previous cache miss of 0x40 (same index). This bug will need to be fixed separately, but at least this exposes it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	3 years ago
Matt Johnston	f5d1deb204	Add more interrupt numbers to microwatt_soc.h Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Matt Johnston	fe62bc50e8	arty: Add switches and buttons as gpio 10-17 Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Matt Johnston	9d53882c48	arty: Add other RGB LEDs, attach to gpio 0-8 Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Matt Johnston	7619c3d089	arty: Add switches and buttons to xdc file Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>	3 years ago
Michael Neuling	84a0fba25d	Merge pull request #408 from paulusmack/plru-improvement PLRU improvements	3 years ago
Michael Neuling	5766dbab37	Merge pull request #406 from shingarov/spi-kintex Add support for flashing the s25fl256s onboard Genesys2	3 years ago
Michael Neuling	d9c55defdb	Merge pull request #407 from shingarov/openocd-012 Recognize version string "0.12" in recent OpenOCD master	3 years ago
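
The b50170cd1d entry above describes folding OP_AND and OP_OR into a single OP_LOGIC operation by using the identity a OR b = NOT (NOT a AND NOT b). The following is a minimal Python sketch of that idea on 64-bit values, not the VHDL in logical.vhdl. The function name `logic_op` and the flags `invert_rs`, `invert_rb` and `invert_out` are hypothetical names chosen for illustration; in particular, the output inversion is an assumption implied by the identity rather than something the commit message spells out.

```python
MASK64 = (1 << 64) - 1  # model 64-bit registers


def logic_op(rs, rb, invert_rs=False, invert_rb=False, invert_out=False):
    """AND-based logical unit with optional input and output inversion.

    Hypothetical model: every flag name here is an assumption made for
    this sketch, not a signal name taken from the microwatt sources.
    """
    a = (~rs if invert_rs else rs) & MASK64
    b = (~rb if invert_rb else rb) & MASK64
    result = a & b
    return (~result if invert_out else result) & MASK64


def and_op(rs, rb):
    # Plain AND: no inversion anywhere.
    return logic_op(rs, rb)


def or_op(rs, rb):
    # a OR b = NOT (NOT a AND NOT b)
    return logic_op(rs, rb, invert_rs=True, invert_rb=True, invert_out=True)


def nor_op(rs, rb):
    # NOT (a OR b) = NOT a AND NOT b, so the output inversion is dropped.
    return logic_op(rs, rb, invert_rs=True, invert_rb=True)
```

With this arrangement a single AND datapath covers both instruction families, which is what allows one insn_type value to replace two.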

1304 Commits (e92d49375f92e930498e6915a7940d584245dcaa)