If you are wondering where the data of this site comes from, please visit https://api.github.com/users/Lingrui98/events. GitMemory does not store any data, but only uses NGINX to cache data for a period of time. The idea behind GitMemory is simply to give users a better reading experience.
Steve Gou (Lingrui98), Beijing. Graduate student majoring in CS at ICT.

Lingrui98/RISC-V-book 88

A translation project of the RISC-V reader

ucassjy/SimpleOCR 6

A simple OCR project for course DD2424 in kth.

Lingrui98/scalaTage 2

A scala version of simple TAGE branch predictor

Lingrui98/UCAS_AutoDownload 2

A GUI program for batch updating and downloading UCAS courseware

Lingrui98/algorithm_course 1

UCAS Algorithm Analysis and Design (Bu Dongbo): collected homework answers

Lingrui98/Verilog-Gadget 1

🔧 Verilog plugin for Sublime Text 2/3. It helps to generate a simple testbench, instantiate a module, insert a user-header, repeat codes with formatted incremental/decremental numbers, etc.

shinezyy/gem5_data_proc 1

data preprocessing scripts for gem5 output

Lingrui98/.tmux 0

Oh My Tmux! My pretty + versatile self-contained tmux configuration (in other words the best tmux configuration)

Lingrui98/CA_P6_Go 0

Add support for AXI Bus

Lingrui98/DirtyStuff 0

An individual repo to contain all the tools that I created for arch research;

create branch OpenXiangShan/XiangShan

branch : dcp-frontend-params

created branch time in 2 days

push event OpenXiangShan/XiangShan

YikeZhou

commit sha e92092e77b5a9868f8c415bf01cbc59fc5b98765

MEFreeList: use tailPtr instead of tailPtrNext in free reg cnt


zfw

commit sha 0a6fa50eb85d82d4dccfa45a8d0be107f61bbae4

alu, decode: fix alu instruction and change instruction name (#1012)
* Alu: fix andn, orn, xnor
* Decode: change instruction name


Lemover

commit sha fa086d5e5585e904f586e7fb1b789648dd976f76

mmu.tlb: set itlb's and l2tlb's size (#1014)
* mmu.tlb: l2tlb's l3 now 128 sets and 4 ways
* mmu.tlb: set itlb default size


Yinan Xu

commit sha 88825c5cc17cb5e1df50964cd03c71974cc7730d

backend: support instruction fusion cases (#1011)
This commit adds some simple instruction fusion cases in decode stage. Currently we only implement instruction pairs that can be fused into RV64GCB instructions.
Instruction fusions are detected in the decode stage by FusionDecoder. The decoder checks every two instructions and marks the first instruction fused if they can be fused into one instruction. The second instruction is removed by setting the valid field to false.
Simple fusion cases include sh1add, sh2add, sh3add, sexth, zexth, etc.
Currently, ftq in frontend needs every instruction to commit. However, the second instruction is removed from the pipeline and will not commit. To solve this issue, we temporarily add more bits to isFused to indicate the offset diff of the two fused instructions. There are four possibilities now. This feature may be removed later.
This commit also adds more instruction fusion cases that need changes in both the decode stage and the function units. In this commit, we add some opcodes to the function units and fuse the new instruction pairs into these new internal uops.
The list of opcodes we add in this commit is shown below:
- szewl1: `slli r1, r0, 32` + `srli r1, r0, 31`
- szewl2: `slli r1, r0, 32` + `srli r1, r0, 30`
- byte2: `srli r1, r0, 8` + `andi r1, r1, 255`
- sh4add: `slli r1, r0, 4` + `add r1, r1, r2`
- sr30add: `srli r1, r0, 30` + `add r1, r1, r2`
- sr31add: `srli r1, r0, 31` + `add r1, r1, r2`
- sr32add: `srli r1, r0, 32` + `add r1, r1, r2`
- oddadd: `andi r1, r0, 1` + `add r1, r1, r2`
- oddaddw: `andi r1, r0, 1` + `addw r1, r1, r2`
- orh48: mask off the first 16 bits and or with another operand (`andi r1, r0, -256` + `or r1, r1, r2`)
Furthermore, this commit adds some complex instruction fusion cases to the decode stage and function units. The complex instruction fusion cases are detected after the instructions are decoded into uop and their CtrlSignals are used for instruction fusion detection.
We add the following complex instruction fusion cases:
- addwbyte: addw and mask it with 0xff (extract the first byte)
- addwbit: addw and mask it with 0x1 (extract the first bit)
- logiclsb: logic operation and mask it with 0x1 (extract the first bit)
- mulw7: andi 127 and mulw instructions. Input to mul is AND with 0x7f if mulw7 bit is set to true.
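
To make the fused opcodes in this commit message concrete, here is a minimal Python sketch that checks two of the listed pairs (sh4add and byte2) against a single fused operation. It is not XiangShan code. The helper names, the 64-bit wrap-around register model, and the assumption that the second instruction of a pair consumes the result of the first are all illustrative assumptions.

```python
MASK64 = (1 << 64) - 1  # assumption: 64-bit registers that wrap on overflow


def slli(x, shamt):
    """Logical left shift, truncated to 64 bits."""
    return (x << shamt) & MASK64


def srli(x, shamt):
    """Logical right shift of a 64-bit value."""
    return (x & MASK64) >> shamt


def add(a, b):
    """64-bit wrapping addition."""
    return (a + b) & MASK64


def andi(x, imm):
    """Bitwise AND with an immediate (treated as a 64-bit pattern)."""
    return x & (imm & MASK64)


def sh4add_pair(r0, r2):
    """Unfused sequence: slli r1, r0, 4 followed by add r1, r1, r2."""
    r1 = slli(r0, 4)
    return add(r1, r2)


def sh4add_fused(r0, r2):
    """Hypothetical single uop computing (r0 << 4) + r2."""
    return ((r0 << 4) + r2) & MASK64


def byte2_pair(r0):
    """Unfused sequence: srli r1, r0, 8 followed by andi r1, r1, 255."""
    r1 = srli(r0, 8)
    return andi(r1, 255)


def byte2_fused(r0):
    """Hypothetical single uop extracting the second byte."""
    return (r0 >> 8) & 0xFF
```

Under these assumptions the pair and the fused form agree for every input, which is the property a fusion decoder relies on when it drops the second instruction.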


Lemover

commit sha cc5a5f222ecf2e0129bc8ce2a76389d953b7963b

mmu.l2tlb: partially rewrite fsm and miss queue for bug and optimization (#1007)
* mmu.l2tlb: l2tlb now support multiple parallel mem accesses
  8 missqueue entry and 1 page table worker
  mq entry only supports page leaf entry
  ptw supports all the three level entries
* mmu.tlb: fix bug of mq.refill_vpn and out.ready
* mmu.tlb: fix bug of perf counter
* mmu.tlb: l2tlb's l3 now 128 sets and 4 ways
* mmu.tlb: miss queue now will 'merge' same mem req addr
* mmu.l2tlb: ptw doesn't access last level pte
* mmu.l2tlb: add mem req mask into ptw
  func block_decoupled doesn't work well and has bug in signal ready
* mmu.l2tlb: fix bug of sfence to fsm add a new state s_check_pte to ptw fsm now take memPte from outside, doesn't store it inside mem_resp_valid will arrive a cycle before mem_resp_data
* mmu.l2tlb: rm some state in fsm
* mmu.tlb: set itlb default size
* mmu.l2tlb: unknown mq wait bug, change code style to avoid it
* mmu.l2tlb: opt, mq's entry with cache_l3 would not be blocked
* mmu.l2tlb: add many time out assert
* mmu.l2tlb: fix bug of mq enq state change & wait_id
* Revert "mmu.tlb: l2tlb's l3 now 128 sets and 4 ways"
  This reverts commit 216e4192e4b01e68ce5502135318bc2473434907.
* Revert "mmu.tlb: set itlb default size"
  This reverts commit 670bf1e408384964c601c0a55defbc767eb80698.
* mmu.l2tlb: set miss queue size to 9 and set filter size to 8
  if they are equal, itlb may lose its req


Yinan Xu

commit sha 66c2a07b8b3ab6f7f7d98549d6ec0ccb68220031

backend, rs: parallelize selection and data read (#1018)
This commit changes how uop and data are read in reservation stations. It helps the issue timing.
Previously, we accessed the payload array and data array after deciding which instructions to issue. This serialized issue selection and array access and created a critical path.
In this commit, we add one more read port to the payload array and data array. This extra read port is for the oldest instruction. We decide whether to issue the oldest instruction and read uop/data simultaneously. This change reduces the critical path to each selection logic + read + Mux (previously it's selection + arbitration + read).
Variable oldestOverride indicates whether we choose the oldest ready instruction instead of the normal selection. An oldestFirst option is added to RSParams to parameterize whether we need the age logic. By default, it is set to true unless the RS is for ALU. If the timing for aged ALU rs meets, we will enable it later.


Jiawei Lin

commit sha a1ea7f76add43b40af78084f7f646a0010120cd7

Use HuanCun instead of block-inclusive-cache (#1016)
* misc: add submodule huancun
* huancun: integrate huancun to SoC as L3
* remove l2prefetcher
* update huancun
* Bump HuanCun
* Use HuanCun instead of old L2/L3
* bump huancun
* bump huancun
* Set L3NBanks to 4
* Update rocketchip
* Bump huancun
* Bump HuanCun
* Optimize debug configs
* Configs: fix L3 bug
* Add TLLogger
* TLLogger: fix release ack address
* Support write prefix into database
* Recording more tilelink info
* Add a database output format converter
* missqueue: add difftest port for memory difftest during refill
* misc: bump difftest
* misc: bump difftest & huancun
* missqueue: do not check refill data when get Grant
* Add directory debug tool
* config: increase client dir size for non-inclusive cache
* Bump difftest and huancun
* Update l2/l3 cache configs
* Remove deprecated fpga/*
* Remove cache test
* Remove L2 prefetcher
* bump huancun
* Params: turn on l2 prefetch by default
* misc: remove duplicate chisel-tester2
* misc: remove sifive inclusive cache
* bump difftest
* bump huancun
* config: use 4MB L3 cache
* bump huancun
* bump difftest
* bump difftest
Co-authored-by: wangkaifan <wangkaifan@ict.ac.cn>
Co-authored-by: TangDan <tangdan@ict.ac.cn>


zoujr

commit sha 7f36ad77cdafc6b31d23e7dca7d08ff8e40a33c8

BPU: Fix bug that false hit in coremark 10


Yinan Xu

commit sha c9ebdf902ce82cc0cb5eb4c2c6b6704fc90f574a

rs,status: simplify logic to optimize timing (#1020)
This commit simplifies status logic in reservations stations. Module StatusArray is mostly rewritten. The following optimizations are applied:
* Wakeup now has higher priority than enqueue. This reduces the length of the critical path of ALU back-to-back wakeup.
* Don't compare fpWen/rfWen if the reservation station does not have float/int operands.
* Ignore status.valid or redirect for srcState update. For data capture, these are necessary and not changed.
* Remove blocked and scheduled conditions in issue logic when the reservation station does not have loadWait bit and feedback.


Lemover

commit sha 9bd9cdfa6cc3f84e6770c84439754da8ca2b7dc6

mmu.l2tlb: add TimeOutAssert & cut down mem resp data buffer (#1021)
* mmu.l2tlb: add object TimeOutAssert
* mmu.l2tlb: add TimeOutAssert to Repeater
* mmu.l2tlb: cut down mem req buffer from 8 ptes to 1 pte each
* util: move some utils from MMUBundle to utils


zhanglinjuan

commit sha 59a7cc929b4251497f9d1518ac414b3cb0053a91

MissQueue: send GrantAck immediately after first beat of GrantData (#1013)
* MissQueue: send GrantAck immediately after first beat of GrantData
* MissQueue: add perf cnts
* MissQueue: fix assertion failure in perf cnt
* MissQueue: add perf cnts for proportion of load merge / load reject
* MissQueue: add perf cnt
* MissQueue: fix merge-conflict error


zoujr

commit sha 3ad99c7ff9c928dc40cc7d4b0ec7ca0c83ec26eb

BPU: Remove the false_hit_fix branch from the list of auto-run ci


Yinan Xu

commit sha a792bcf1a0e095a9bceade5a3f5b481df101a247

backend: add 3-bit shift fused instructions (#1022)
This commit adds 3-bit shift fused instructions. When the program tries to add 8-byte index, these may be used.
List of fused instructions added in this commit:
* szewl3: `slli r1, r0, 32` + `srli r1, r0, 29`
* sr29add: `srli r1, r0, 29` + `add r1, r1, r2`


Yinan Xu

commit sha 64056bed33c318bdd8e9543a75503d44d9d8d8a7

backend,rs: move select logic to stage 0 (#1023)
This commit moves issue select logic in reservation stations to stage 0 from stage 1. It helps timing of stage 1, which load-to-load requires.
Now, reservation stations have the following stages:
* S0: enqueue and wakeup, select. Selection results are RegNext-ed.
* S1: data/uop read and data bypass. Bypassed results are RegNext-ed.
* S2: issue instructions to function units.


Steve Gou

commit sha 42ba7d8c7bff2c008075543ab6372a0d01d7443c

Merge pull request #1025 from OpenXiangShan/false_hit_fix
BPU: Fix bug and significantly reduce false_hit


YikeZhou

commit sha 62d2a04b2f32541ef7bfc530b15331f1e1d4af9c

backend, rename: optimize MEFreeList free logic


Yinan Xu

commit sha c88c3a2ad8d5caddcf38e659cf944c9bb09bb6ad

backend: clean up exception vector usages (#1026)
This commit cleans up exception vector usages in backend. Previously the exception vector will go through the pipeline with the uop. However, instructions with exceptions will enter ROB when they are dispatched. Thus, actually we don't need the exception vector when an instruction enters a function unit.
* exceptionVec, flushPipe, replayInst are reset when an instruction enters function units.
* For execution units that don't have exceptions, we reset their output exception vectors to avoid ROB to record them.
* Move replayInst to CtrlSignals.


YikeZhou

commit sha 0153cd55ca4811a69821e72f146368fb37d56d9a

backend, rename: elimination psrc directly from intRat


zhanglinjuan

commit sha ef90f6bd721ba5601e13a151dd5734d17643dfb6

MissQueue: fix bug in miss-merge logic (#1028)


Lemover

commit sha 82d348fb09576bf1d97afca6011d0fe129e2307f

backend.atomic: when addr_valid, just access tlb, ignore data_valid (#1030)


push time in 2 days

push event OpenXiangShan/XiangShan

zoujr

commit sha 719a3f8a3abcf153a58e001fd73e4d2469f42cdb

BPU: Modify ubtb to direct mapped from fully associative


Steve Gou

commit sha ffcef823738ac31522ad662b9fbd513dd6e359a8

Merge pull request #1057 from OpenXiangShan/ubtb-1K
BPU: Modify ubtb to direct mapped from fully associative


push time in 2 days

delete branch OpenXiangShan/XiangShan

delete branch : ubtb-1K

delete time in 2 days

PullRequestReviewEvent

issue comment riscv-boom/riscv-boom

question about TAGE IUM

It seems that you implemented the IUM in TageTable, and store counters in IUM without updating it. I think that's why it caused the counter value problem. According to the original paper (and code), the IUM stores only the direction, and does not modify the predicted counters. Have you tried the original version of IUM?

Lingrui98

comment created time in 6 days
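
The distinction drawn in the comment above (an IUM that records only the direction of in-flight branches, and never rewrites the predictor's counters) can be sketched in Python. This is a simplified illustration of that reading, not the riscv-boom or XiangShan implementation. The class names, the `(table_id, index)` provider key, and the small fixed capacity are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class InflightBranch:
    provider: tuple          # (table_id, index) of the entry that gave the prediction
    direction: bool          # predicted direction, corrected once the branch executes
    executed: bool = False   # True after the branch outcome is known


class IUM:
    """Immediate update mimicker sketch: tracks directions only, no counters."""

    def __init__(self, capacity=8):
        # Oldest in-flight branch sits at the left end; when full, append
        # silently drops the oldest entry.
        self.entries = deque(maxlen=capacity)

    def predict(self, provider, tage_direction):
        """Return the final direction and the record for the new in-flight branch."""
        direction = tage_direction
        # Search youngest first for an executed branch that used the same entry.
        for entry in reversed(self.entries):
            if entry.provider == provider and entry.executed:
                direction = entry.direction  # override the stale table prediction
                break
        record = InflightBranch(provider, direction)
        self.entries.append(record)
        return direction, record

    def execute(self, record, actual_direction):
        """Record the resolved direction; the predictor tables are untouched here."""
        record.direction = actual_direction
        record.executed = True

    def retire(self):
        """Drop the oldest in-flight branch once its table update has been applied."""
        if self.entries:
            self.entries.popleft()
```

In this sketch the predictor tables are only ever updated at retirement by code outside the class, so the stored counter values are never altered by the IUM itself.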

issue opened riscv-boom/riscv-boom

question about TAGE IUM

Type of issue: question

Q: I noticed there was a commit b836971a2911fcd7c12b0eade84620b7df2b6f2c which removed IUM from TAGE and claimed that wrbypass was better. IUM is proposed to deal with the impact of delayed update, however a wrbypass does not seem to manage it. Is there any more evidence on that decision?

created time in 7 days

PullRequestReviewEvent

push event OpenXiangShan/XiangShan

zoujr

commit sha 65fddcf0358a85e75e3881fcfda6be033c239922

FTQ: Fix the bug that carry calculation is wrong when generating FTB_entry


Steve Gou

commit sha aa9d86a61d5bbe13ec6464186ede1466421be1e7

Merge pull request #1036 from OpenXiangShan/false_hit_fix
FTQ: Fix the false hit bug when run mcf


push time in 7 days

delete branch OpenXiangShan/XiangShan

delete branch : false_hit_fix

delete time in 7 days

PR merged OpenXiangShan/XiangShan

FTQ: Fix the false hit bug when run mcf
  • Fix bug of incorrect calculation of carry when generating ftb entry
  • False hit will no longer appear in mcf
+6 -1

2 comments

1 changed file

zoujr

pr closed time in 7 days

PullRequestReviewEvent
PullRequestReviewEvent

push event OpenXiangShan/XiangShan

zoujr

commit sha 7f36ad77cdafc6b31d23e7dca7d08ff8e40a33c8

BPU: Fix bug that false hit in coremark 10


zoujr

commit sha 3ad99c7ff9c928dc40cc7d4b0ec7ca0c83ec26eb

BPU: Remove the false_hit_fix branch from the list of auto-run ci


Steve Gou

commit sha 42ba7d8c7bff2c008075543ab6372a0d01d7443c

Merge pull request #1025 from OpenXiangShan/false_hit_fix
BPU: Fix bug and significantly reduce false_hit


push time in 16 days

delete branch OpenXiangShan/XiangShan

delete branch : false_hit_fix

delete time in 16 days

PR merged OpenXiangShan/XiangShan

BPU: Fix bug and significantly reduce false_hit
  • Fix bug of incorrect calculation of target stat when updating ftb entry
  • False hit will no longer appear in some programs
+1 -1

1 comment

1 changed file

zoujr

pr closed time in 16 days

PullRequestReviewEvent

push event OpenXiangShan/env-scripts

Lingrui98

commit sha 2c03420d7c1e453306e71e9ec01de312de01f3d3

ipc_diff: align outputs


push time in 20 days

push event OpenXiangShan/XiangShan

William Wang

commit sha 103b691438e3afd99c245eed144b9149ff7c61ef

mem: reduce refill writeback delay by 1 cycle
* The inst currently being refilled can now be selected as wb candidate


William Wang

commit sha 594ba8ac93360d08b5da8cc83597e39f211db3fe

mem: let lq refill width be equal to l1d bus width


William Wang

commit sha 7ab59370ffc775aed60862417b6a48af7db40b5f

chore: update load_miss_penalty_to_use counter


William Wang

commit sha 63d95f38401521cb24e7e3507328484614c958c3

ci: run ci on fastpath (without master)


William Wang

commit sha dd9fd7228d205ca63bd710345acb9f03b94d015d

Merge remote-tracking branch 'origin/master' into fastpath


William Wang

commit sha 58628cdc809320a90f350988df672d2179a33ade

Merge branch 'fastpath' into fastpath-ci


William Wang

commit sha aaf9f60c9dc48ff7384d5405a2b0b84d80519368

dcache: fix refill when merge refill request
Update should_refill_data earlier to refill first half of refill data


William Wang

commit sha 99aa3a7e43199cc4e00eb7165e0a075d2e409f2b

difftest: bump version to support clang


William Wang

commit sha 16ce2b800c039b7c80e33f6a528faba880e52345

Revert "ci: run ci on fastpath (without master)"
This reverts commit 63d95f38401521cb24e7e3507328484614c958c3.


William Wang

commit sha b603de6077afcb821dd6b366b60e253e4b2dc3e2

Merge remote-tracking branch 'origin/master' into fast-refill


William Wang

commit sha 588e93e03b231b74ac1dae8651cc63b28d07b301

chore: fix frontend / memblock merge conflict


William Wang

commit sha b460b7e4c6a6eeb8c959024a15c44273fd41063b

Merge remote-tracking branch 'origin/master' into fast-refill


Jiawei Lin

commit sha 842f79915a809555ec3a945b18383b21a74b9aa0

FPToFP: fix precision width && reuse fcmp to compute min/max (#1005)


YikeZhou

commit sha 31ebfb1dd00f6473404eca1f5d9153cc53e133a7

backend, rename: support elimination of move instruction whose lsrc is 0 + bug fix (#1008)
* backend, rename: support elimination of mv inst whose lsrc=0 [known bug] instr page fault not properly raised after sfence.vma
* backend, roq: [bug fix] won't label me with exception as writebacked


Yinan Xu

commit sha 698b404af9a2ec6b4ede9981c6564d9ee35f48d3

exu: select RegNext(fflags) if fastNotImplemented (#1006)
This commit assigns exu.io.out.fflags to RegNext(fu.io.fflags) if the function unit has fastUopOut but has not implemented it. Previously it causes a bug that fflags may be one cycle earlier than expected.
This commit also removes the extra logic in FmacExeUnit and FmiscExeUnit. They are exactly the same as ExeUnit now.


William Wang

commit sha 0292440ac9eec23b6929e14e8a206a29b43a5efb

Merge pull request #987 from OpenXiangShan/fast-refill
dcache,lq: make dcache to lq refill faster


Steve Gou

commit sha 31e152efe65e0eb1bba2cff84808188f772ba13c

Merge pull request #1002 from OpenXiangShan/decoupled-frontend
add new ittage indirect target predictor


push time in 22 days

create branch OpenXiangShan/XiangShan

branch : decoupled-frontend

created branch time in 23 days

push event OpenXiangShan/XiangShan

Lingrui98

commit sha 3bcae573fc6ed87eb736b42d2ac60f1adc0959d1

ftq: modify jmpTarget in FtbEntry whenever jalr target changes
* Previously we only modified jmpTarget on misprediction, because we only used ftb to predict the jalr target. However, with an indirect branch predictor present, there are cases where an indirect branch is correctly predicted while the target in the ftb entry is wrong.
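
The change in update policy described in this commit message can be illustrated with a small Python sketch. It only contrasts the two conditions as stated in the message. The function names and the flat integer targets are hypothetical and do not come from the XiangShan sources.

```python
def update_on_mispredict_only(entry_target, resolved_target, mispredicted):
    """Old policy as described: rewrite the stored target only after a misprediction."""
    return resolved_target if mispredicted else entry_target


def update_on_target_change(entry_target, resolved_target, mispredicted):
    """New policy as described: rewrite the stored target whenever it differs.

    The mispredicted flag is accepted for symmetry but intentionally unused.
    """
    if resolved_target != entry_target:
        return resolved_target
    return entry_target


# Scenario from the message: the stored target is stale, but a separate
# indirect predictor supplied the right target, so nothing was mispredicted.
stale, actual = 0x100, 0x200
old_result = update_on_mispredict_only(stale, actual, mispredicted=False)
new_result = update_on_target_change(stale, actual, mispredicted=False)
```

Here `old_result` keeps the stale target while `new_result` picks up the resolved one, which is the case the commit message calls out.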


Guokai Chen

commit sha 60f966c8ac39978396a024659277ddcfcfb7b0d6

frontend: add ittage indirect predictor


Lingrui98

commit sha 8ffcd86a9401d6327c221129d36c7660c2fe99a7

bundle: add a full target in update bundle


Guokai Chen

commit sha b0ac2a691c0109faf3f997a7dbea958d19a22ec4

frontend: ittage fix update valid condition


Lingrui98

commit sha abdbe4b74043b0646d9d9d4ca875ec8c050f3833

bundle: add a full target in update bundle


Guokai Chen

commit sha e5d060c15a8b936e5633a105fe8ff7b50377766e

frontend: ittage: switch to full length jmp target


Steve Gou

commit sha 1c8d55c9066bf94c47305e2d396b9c93512323c5

Merge pull request #992 from OpenXiangShan/decoupled-frontend-indirect
frontend: add ittage indirect predictor


Lingrui98

commit sha ba4cf51546af7ae0ff7fbef3207618f7f32b2c4f

parameters: ras size 32, btb size 4096


LinJiawei

commit sha a9bb1d5a01d9a27b07fe81feac7bebece197355e

Makefile: add '--gen-mem-verilog'


Lingrui98

commit sha 03ebac49877c1e107f2313aebfc5557c9af32c30

Merge remote-tracking branch 'origin/gen-sram-conf' into decoupled-frontend


Lingrui98

commit sha 9eb7e915958d56aeced1250537f1d273bcca223a

Merge remote-tracking branch 'origin/master' into decoupled-frontend


Lingrui98

commit sha d392ebe5094e9fcf1f5b3aa862391cc03a7ae074

Merge remote-tracking branch 'origin/master' into decoupled-frontend


Steve Gou

commit sha 31e152efe65e0eb1bba2cff84808188f772ba13c

Merge pull request #1002 from OpenXiangShan/decoupled-frontend
add new ittage indirect target predictor


push time in 23 days

delete branch OpenXiangShan/XiangShan

delete branch : decoupled-frontend

delete time in 23 days

push event OpenXiangShan/XiangShan

Jiawei Lin

commit sha 4b65fc7eead70ec79016114a097eebc268efb8bd

FMA: separate fmul/fadd/fma (#996)
* FMA: separate fadd/fmul/fma
* exu: enable fast uop out from fmacExeUnit
Co-authored-by: Yinan Xu <xuyinan@ict.ac.cn>


Jiawei Lin

commit sha dfc810ae6cc5b40c9b53860f01ccba0becee83d9

Makefile: add '--gen-mem-verilog' (#1000)
* Makefile: add '--gen-mem-verilog'


Yinan Xu

commit sha 5dabf2df532e6ae1d39a39e1a33bfe8e2d18cc89

utils,MaskData: assert wmask is wider than data (#1001)
This commit adds assertion in MaskData to check the width of mask and data. When the width of mask is smaller than the width of data, (~mask & data) and (mask & data) will always clear the upper bits of the data. This usually causes unexpected behavior.
This commit adds explicit width declarations where MaskData is used.
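
The width problem described in this commit message can be reproduced with plain integers. The sketch below is a Python model, not the Chisel utility: the function names, the explicit width arguments, and the merge formula `(new & mask) | (old & ~mask)` are assumptions made for illustration.

```python
def bit_not(value, width):
    """Bitwise NOT restricted to `width` bits, mimicking a fixed-width signal."""
    return ~value & ((1 << width) - 1)


def mask_data_unchecked(old, new, mask, mask_width):
    """Masked merge with no width check (the failure mode described above)."""
    return (new & mask) | (old & bit_not(mask, mask_width))


def mask_data(old, new, mask, mask_width, data_width):
    """Masked merge that refuses a mask narrower than the data."""
    assert mask_width >= data_width, "mask must be at least as wide as data"
    return mask_data_unchecked(old, new, mask, mask_width)


# 16-bit data merged through an 8-bit mask: the inverted mask is only 8 bits
# wide, so the upper byte of the old data is cleared.
narrow = mask_data_unchecked(old=0xFFFF, new=0x0005, mask=0x0F, mask_width=8)

# Same merge with a 16-bit mask: the upper byte of the old data survives.
wide = mask_data(old=0xFFFF, new=0x0005, mask=0x000F, mask_width=16, data_width=16)
```

With the narrow mask the result loses the upper byte of the old value, while the full-width mask preserves it, which is why an assertion on the widths is useful.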


Lemover

commit sha b848eea577a14a673dec8efb4a812696b1bddba9

mmu.l2tlb: l2tlb now supports multiple mem access at the same time (#1003)
* mmu.l2tlb: l2tlb now support multiple parallel mem accesses
  8 missqueue entry and 1 page table worker
  mq entry only supports page leaf entry
  ptw supports all the three level entries
* mmu.tlb: fix bug of mq.refill_vpn and out.ready


Yinan Xu

commit sha bd2788978542340035a9a8bfcbd6165d318bd8fb

backend,exu: load balance between issue ports (#947)
This commit adds support for load balance between different issue ports when the function unit is not pipelined and the reservation station has more than one issue port.
We use a ping pong bit to decide which port to issue the instruction. At every clock cycle, the bit is flipped.
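
The ping pong scheme in this commit message can be sketched as a cycle-by-cycle Python model. It assumes exactly two issue ports and a bit that flips on every cycle whether or not an instruction issues. The class and method names are hypothetical.

```python
class PingPongBalancer:
    """Two-port issue balancer driven by a bit that flips every clock cycle."""

    def __init__(self):
        self.bit = 0  # selects port 0 or port 1 for the current cycle

    def tick(self, has_instruction):
        """Advance one cycle; return the chosen port, or None if nothing issues."""
        port = self.bit if has_instruction else None
        self.bit ^= 1  # flipped at every clock cycle, as the message describes
        return port


balancer = PingPongBalancer()
ports = [balancer.tick(ready) for ready in (True, True, False, True)]
```

Because the bit flips unconditionally, the port choice depends only on the cycle number, which keeps the selection logic trivial at the cost of perfectly even balancing.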


Lingrui98

commit sha d392ebe5094e9fcf1f5b3aa862391cc03a7ae074

Merge remote-tracking branch 'origin/master' into decoupled-frontend


push time in 24 days

push event OpenXiangShan/XiangShan

Jiuyang Liu

commit sha 510ae4ee6820204606a64d649eefa05784ab6021

use ExtModule instead of Chisel3.BlackBox. (#988)


Lingrui98

commit sha 9eb7e915958d56aeced1250537f1d273bcca223a

Merge remote-tracking branch 'origin/master' into decoupled-frontend


push time in 24 days

push event OpenXiangShan/XiangShan

William Wang

commit sha 5f235c685292aa0b75d618d3936a19f18065cb3f

misc: remove unused files, bump difftest


William Wang

commit sha 7822eea61ce3abe046760e327213c3dec7cc3ee6

misc: update ready-to-run nemu


William Wang

commit sha 88fbccdd7f0549848a3b175892e1fbb76ca73c46

mem: add vaddr forward profiling framework


William Wang

commit sha 672f1d35be39207dcee2737156eaf93540f3f3e4

mem: use vaddr match, paddr fix forward in SQ
Vaddr Match, Paddr Fix (VMPF) store to load forward uses vaddr cam result to select data to be forwarded. Vaddr cam result and paddr cam result will be compared to check if vaddr based forward is correct. If not, a microarchitectural exception should be raised to flush SQ and committed sbuffer.
TODO: forward fail microarchitectural exception
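
The VMPF check in this commit message can be sketched in Python as two address comparisons over the store queue. This is a simplified model under stated assumptions: whole-address matching with no byte masks, a list ordered oldest to youngest, and hypothetical names (`StoreEntry`, `forward_vmpf`) that do not come from the XiangShan sources.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StoreEntry:
    vaddr: int  # virtual address of the store
    paddr: int  # physical address of the store
    data: int   # store data available for forwarding


def forward_vmpf(
    store_queue: List[StoreEntry], load_vaddr: int, load_paddr: int
) -> Tuple[Optional[int], bool]:
    """Select forward data by vaddr match, then verify it against the paddr match.

    Returns (forwarded_data, forward_fail). forward_fail is True when the
    vaddr and paddr match vectors disagree, meaning the fast vaddr-based
    choice cannot be trusted and recovery is needed.
    """
    vaddr_hits = [s.vaddr == load_vaddr for s in store_queue]
    paddr_hits = [s.paddr == load_paddr for s in store_queue]

    data = None
    for store, hit in zip(store_queue, vaddr_hits):
        if hit:
            data = store.data  # later (younger) matching stores overwrite older ones

    forward_fail = vaddr_hits != paddr_hits
    return data, forward_fail
```

For example, when two different virtual addresses alias to the same physical address, the vaddr comparison misses the store while the paddr comparison hits, so the function reports a forward failure.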


William Wang

commit sha 41962d72a68d0f592d81244f96d9f01708ea990a

mem: use vaddr match, paddr fix forward in sbuffer
Now we use vaddr tag to select data to be forwarded in sbuffer. Vtag / ptag match result will be compared later to check if vaddr based forward is correct. If not, a microarchitectural exception should be raised to flush SQ and committed sbuffer.
TODO: forward fail microarchitectural exception


William Wang

commit sha 112138964481e46a325f56dbeb87ccce43efc1ac

mem: drain sbuffer when v/ptag mismatch


William Wang

commit sha 4f2594f26d88fd26a6bac6d9282b9688841d753c

sbuffer: ignore invalid forward request


William Wang

commit sha 6e162816a7fb0411a2efd943c730c1895f3f2645

mem: enable vaddr based sbuffer forward
Frontend will be refactored soon. Rollback will not be added until then.


William Wang

commit sha 6a2edd8a8b39d7320fa1ccd85015de85dec8f3f3

rob: support replay inst from rob


William Wang

commit sha 4457bfcd222457b14716df63abe6eb389c4941d7

mem: replay forward_fail inst from rob


William Wang

commit sha 0a24fac31eaa52dc7a19bf1f294df4cda9bf1624

Merge remote-tracking branch 'origin/master' into vaddr-fwd


William Wang

commit sha 4887ca7fbd391a890b1877f9399c80b25572a67d

mem: fix replay inst from rob logic


William Wang

commit sha 3db2cf7579cc9b3c36124096d12f797410f73430

mem: loadpipe will not miss if fullForward succeed
New option `EnableFastForward` is added to config list. EnableFastForward will reduce L1D$ miss but make timing worse.
* `forwardMaskFast` is generated at load_s1, it is used to generate fastUop for fast wakeup
* `forwardMask` is generated at load_s2, it will be used to check if forward result is correct


William Wang

commit sha ce28536f0fddf6bfda31615ff3dd5e01e2a3c0e9

mem: fix rsFeedback for fast forward


William Wang

commit sha e3f759ae573d6f4fabbfe9e4dcf7987b1d32d06d

mem: add load to load addr fastpath framework


YikeZhou

commit sha 6e3cddfe588ea01af869549161ffcd8ac18a76f5

AlternativeFreeList: parameterize length of FL
FreeList: same as above
Parameters: add 2 core param and 2 derived param
[TODO] use EnableIntMoveElim to control ME function


YikeZhou

commit sha 39d3280eb31ef056506ae9fdaa3982f76a903ba7

rename: [refactor] move free list into 'freelist' package
"trait" was used to improve code style
parameters: use EnableIntMoveElim to control code generation
[WIP] EnableIntMoveElim=false hasn't been tested


YikeZhou

commit sha 5eb4af5ba4d9ec70191944b8386c8983f355d751

rename/roq/dispatch1: support EnableIntMoveElim=false (finish refactoring) [TODO] remove useless code


YikeZhou

commit sha 2824417d2e22b2c4ee5cdb3f1ad0a33680492666

rename: [refactoring] remove useless file + comment added


William Wang

commit sha 00a565697570aee160a003c022b96a96ff0850b7

mem: mark inst as datavalid in lq if fullForward


push time in 25 days

push event OpenXiangShan/XiangShan

Lingrui98

commit sha ba4cf51546af7ae0ff7fbef3207618f7f32b2c4f

parameters: ras size 32, btb size 4096


push time in 25 days

push event OpenXiangShan/XiangShan

Guokai Chen

commit sha 60f966c8ac39978396a024659277ddcfcfb7b0d6

frontend: add ittage indirect predictor


Guokai Chen

commit sha b0ac2a691c0109faf3f997a7dbea958d19a22ec4

frontend: ittage fix update valid condition


Lingrui98

commit sha abdbe4b74043b0646d9d9d4ca875ec8c050f3833

bundle: add a full target in update bundle


Guokai Chen

commit sha e5d060c15a8b936e5633a105fe8ff7b50377766e

frontend: ittage: switch to full length jmp target


Steve Gou

commit sha 1c8d55c9066bf94c47305e2d396b9c93512323c5

Merge pull request #992 from OpenXiangShan/decoupled-frontend-indirect
frontend: add ittage indirect predictor


push time in 25 days