[MemPool-Spatz][TCDM ACK Handling Fix]
1. Add a `wen` field to the TCDM response signal to distinguish store responses from load responses.
2. Fix a bug in runtime/arch.ld.c where the L1 size was not configured correctly for Spatz configurations.
3. Add more configurations for TeraPool-Spatz.
4. Other small bug fixes.
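
The first item of the commit message can be pictured with a small behavioural model. This is only a sketch and is not taken from the RTL: the class `TcdmResponse`, the fields `req_id` and `data`, and the polarity (`wen` set for a store acknowledgement, cleared for load data) are assumptions made for illustration. Only the name `wen` comes from the commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcdmResponse:
    """Hypothetical TCDM response beat; only `wen` is named in the commit."""
    req_id: int  # assumed transaction identifier
    data: int    # read data; assumed to be ignored for store acknowledgements
    wen: bool    # assumed polarity: True = store ACK, False = load data


def split_responses(responses):
    """Separate load data from store acknowledgements using `wen`."""
    loads = [r for r in responses if not r.wen]
    store_acks = [r for r in responses if r.wen]
    return loads, store_acks


responses = [
    TcdmResponse(req_id=0, data=0xDEADBEEF, wen=False),
    TcdmResponse(req_id=1, data=0, wen=True),
    TcdmResponse(req_id=2, data=0x1234, wen=False),
]
loads, store_acks = split_responses(responses)
print(len(loads), len(store_acks))
```

Without such a flag, a consumer of the response channel would have no way to tell an acknowledgement from returned data, which is presumably why the field is needed for ACK handling.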
msc23h24 Diyou Shen (dishen) committed Jan 5, 2024
1 parent e24f8df commit 1057b86
Showing 30 changed files with 555 additions and 262 deletions.
6 changes: 3 additions & 3 deletions Bender.lock
@@ -9,8 +9,8 @@ packages:
- common_cells
- common_verification
cluster_interconnect:
-revision: 7d0a4f8acae71a583a6713cab5554e60b9bb8d27
-version: 1.2.1
+revision: 8c6c2273d60077002834d2cb5d8e44ee0de3e32c
+version: null
source:
Git: "https://github.com/pulp-platform/cluster_interconnect.git"
dependencies:
@@ -85,7 +85,7 @@ packages:
dependencies:
- common_cells
spatz:
-revision: 5e854f1fd9e82df236565a61a710d3092059f471
+revision: 32038321ecb42d14f5f28444fff2d7e7248d8e41
version: null
source:
Git: git@iis-git.ee.ethz.ch:spatz/spatz.git
6 changes: 3 additions & 3 deletions Bender.yml
@@ -7,15 +7,15 @@ package:

dependencies:
axi: { git: "https://github.com/pulp-platform/axi.git", version: 0.36.0 }
-cluster_interconnect: { git: "https://github.com/pulp-platform/cluster_interconnect.git", version: 1.2.1 }
+cluster_interconnect: { git: "https://github.com/pulp-platform/cluster_interconnect.git", rev: 8c6c227 }
common_cells: { git: "https://github.com/pulp-platform/common_cells.git", version: 1.23.0 }
idma: { path: "hardware/deps/idma" }
register_interface: { git: "https://github.com/pulp-platform/register_interface.git", version: 0.3.1 }
reqrsp_interface: { path: "hardware/deps/reqrsp_interface" }
snitch: { path: "hardware/deps/snitch" }
tech_cells_generic: { git: "https://github.com/pulp-platform/tech_cells_generic.git", version: 0.2.5 }
-spatz: { git: "git@iis-git.ee.ethz.ch:spatz/spatz.git", rev: 5e854f1f }
-FPnew: { git: "https://github.com/pulp-platform/cvfpu.git", rev: pulp-v0.1.3 }
+spatz: { git: "git@iis-git.ee.ethz.ch:spatz/spatz.git", rev: 32038321 }
+FPnew: { git: "https://github.com/pulp-platform/cvfpu.git", rev: pulp-v0.1.3 }

workspace:
checkout_dir: "./hardware/deps"
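
The two manifest changes pin `cluster_interconnect` and `spatz` by abbreviated revision, while Bender.lock records the full hashes. A quick consistency check is sketched below. The hashes are copied from the hunks of this commit; the helper `manifest_matches_lock` is hypothetical and does not describe how Bender itself resolves revisions.

```python
# Values copied from the Bender.lock and Bender.yml hunks of this commit.
LOCKED_REVISIONS = {
    "cluster_interconnect": "8c6c2273d60077002834d2cb5d8e44ee0de3e32c",
    "spatz": "32038321ecb42d14f5f28444fff2d7e7248d8e41",
}
MANIFEST_REVS = {
    "cluster_interconnect": "8c6c227",
    "spatz": "32038321",
}


def manifest_matches_lock(manifest, lock):
    """Per package, report whether the manifest rev abbreviates the locked hash."""
    return {name: lock[name].startswith(rev) for name, rev in manifest.items()}


result = manifest_matches_lock(MANIFEST_REVS, LOCKED_REVISIONS)
print(result)
```

Both entries should report `True`; a `False` would indicate that the manifest and the lock file were updated inconsistently.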
44 changes: 44 additions & 0 deletions config/mempool_spatz2_fpu.mk
@@ -0,0 +1,44 @@
# Copyright 2021 ETH Zurich and University of Bologna.
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0

# Author: Matheus Cavalcante, ETH Zurich

###############
## MemPool ##
###############

# Number of cores
num_cores ?= 128

# Number of groups
num_groups ?= 4

# Number of cores per MemPool tile
num_cores_per_tile ?= 2

# L1 scratchpad banking factor
banking_factor ?= 4

# Radix for hierarchical AXI interconnect
axi_hier_radix ?= 20

# Number of AXI masters per group
axi_masters_per_group ?= 1

# Activate Spatz and RVV
spatz ?= 1

# Length of a single vector register
vlen ?= 256

# Number of IPUs
n_ipu ?= 2

n_fpu ?= 2

# Deactivate the XpulpIMG extension
xpulpimg ?= 0

rvf ?= 1
rvd ?= 0
62 changes: 62 additions & 0 deletions config/terapool_spatz2_fpu.mk
@@ -0,0 +1,62 @@
# Copyright 2021 ETH Zurich and University of Bologna.
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0

# Author: Matheus Cavalcante, ETH Zurich

################
## TeraPool ##
################

# Global Control
terapool ?= 1

# Number of cores
num_cores ?= 512

# Number of groups
num_groups ?= 4

# Number of cores per Terapool tile
num_cores_per_tile ?= 4

# Number of sub groups per Terapool group
num_sub_groups_per_group ?= 4

# L1 scratchpad banking factor
banking_factor ?= 4

# Access latency between remote groups
# Options: "7", "9" or "11":
remote_group_latency_cycles ?= 7

# Radix for hierarchical AXI interconnect
axi_hier_radix ?= 9

# Number of AXI masters per group
axi_masters_per_group ?= 4

# Number of DMA backends in each group
dmas_per_group ?= 4

# L2 Banks/Channels
l2_banks = 16

# Makefile RTL Filtering Control
subgroup_rtl = 1

# Activate Spatz and RVV
spatz ?= 1

# Length of a single vector register
vlen ?= 256

# Number of IPUs
n_ipu ?= 2

n_fpu ?= 2

# Deactivate the XpulpIMG extension
xpulpimg ?= 0

rvf ?= 1
62 changes: 62 additions & 0 deletions config/terapool_spatz8_fpu.mk
@@ -0,0 +1,62 @@
# Copyright 2021 ETH Zurich and University of Bologna.
# Licensed under the Apache License, Version 2.0, see LICENSE for details.
# SPDX-License-Identifier: Apache-2.0

# Author: Matheus Cavalcante, ETH Zurich

################
## TeraPool ##
################

# Global Control
terapool ?= 1

# Number of cores
num_cores ?= 128

# Number of groups
num_groups ?= 4

# Number of cores per Terapool tile
num_cores_per_tile ?= 1

# Number of sub groups per Terapool group
num_sub_groups_per_group ?= 4

# L1 scratchpad banking factor
banking_factor ?= 4

# Access latency between remote groups
# Options: "7", "9" or "11":
remote_group_latency_cycles ?= 7

# Radix for hierarchical AXI interconnect
axi_hier_radix ?= 9

# Number of AXI masters per group
axi_masters_per_group ?= 4

# Number of DMA backends in each group
dmas_per_group ?= 4

# L2 Banks/Channels
l2_banks = 16

# Makefile RTL Filtering Control
subgroup_rtl = 1

# Activate Spatz and RVV
spatz ?= 1

# Length of a single vector register
vlen ?= 1024

# Number of IPUs
n_ipu ?= 8

n_fpu ?= 8

# Deactivate the XpulpIMG extension
xpulpimg ?= 0

rvf ?= 1
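
The three new configuration files differ mainly in core count, cores per tile, vector length, and FPU count. The sketch below parses a few values copied from the files above and derives the number of tiles and the vector-register bits available per FPU. The parser `parse_make_defaults` and the helper `summarize` are hypothetical, written only for this comparison, and the parser handles nothing beyond simple integer assignments.

```python
def parse_make_defaults(text):
    """Parse `name ?= value` / `name = value` lines into a dict of ints."""
    values = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        values[name.strip().rstrip("?").strip()] = int(value)
    return values


# Excerpts of the three config files added by this commit.
CONFIGS = {
    "mempool_spatz2_fpu": """
        num_cores ?= 128
        num_cores_per_tile ?= 2
        vlen ?= 256
        n_fpu ?= 2
    """,
    "terapool_spatz2_fpu": """
        num_cores ?= 512
        num_cores_per_tile ?= 4
        vlen ?= 256
        n_fpu ?= 2
    """,
    "terapool_spatz8_fpu": """
        num_cores ?= 128
        num_cores_per_tile ?= 1
        vlen ?= 1024
        n_fpu ?= 8
    """,
}


def summarize(cfg):
    """Derive tile count and vector-register bits per FPU from one config."""
    return {
        "tiles": cfg["num_cores"] // cfg["num_cores_per_tile"],
        "vlen_bits_per_fpu": cfg["vlen"] // cfg["n_fpu"],
    }


summary = {name: summarize(parse_make_defaults(text)) for name, text in CONFIGS.items()}
for name, row in summary.items():
    print(name, row)
```

All three configurations keep the same ratio of vector length to FPU count; the Spatz8 variant reaches it with fewer, wider cores.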
11 changes: 10 additions & 1 deletion hardware/Makefile
@@ -109,8 +109,11 @@ vlog_defs += -DNUM_SUB_GROUPS_PER_GROUP=$(num_sub_groups_per_group) -DREMOTE_GRO

ifeq ($(spatz), 1)
vlog_defs += -DVLEN=$(vlen) -DN_IPU=$(n_ipu) -DN_FPU=$(n_fpu) -DN_FU=$(shell awk 'BEGIN{print ($(n_ipu) > $(n_fpu)) ? $(n_ipu) : $(n_fpu)}')
-vlog_defs += -DMEMPOOL_SPATZ=$(spatz)
+# Spatz needs the wen signal on the TCDM response channel for ACK handling
+vlog_defs += -DMEMPOOL_SPATZ=$(spatz) -DRESPWEN=$(spatz)
bender_defs += -t spatz
+SPATZ_DIR := $(shell $(bender) path spatz)
+SPATZ_CLUSTER_DIR := $(SPATZ_DIR)/hw/system/spatz_cluster
endif

# Traffic generation enabled
@@ -142,6 +145,12 @@ $(buildpath):
$(bender):
make -C $(MEMPOOL_DIR) bender

+.PHONY: buildspatz
+buildspatz:
+@if [ "$(spatz)" = "1" ]; then \
+$(MAKE) -BC $(SPATZ_CLUSTER_DIR) SPATZ_CLUSTER_CFG=$(SPATZ_CLUSTER_DIR)/cfg/mempool.hjson generate; \
+fi

################
# Modelsim #
################
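
The Spatz branch of the Makefile hunk above builds a list of Verilog defines, with `N_FU` computed by an awk one-liner as the larger of `n_ipu` and `n_fpu`. The same logic is sketched below in Python for readability. The function `spatz_vlog_defs` is a hypothetical stand-in and is not part of the build flow.

```python
def spatz_vlog_defs(vlen, n_ipu, n_fpu, spatz=1):
    """Rebuild the Spatz-related defines that the hunk appends to vlog_defs."""
    if spatz != 1:
        return []  # the ifeq ($(spatz), 1) branch is skipped
    # Same comparison as the awk expression: (n_ipu > n_fpu) ? n_ipu : n_fpu
    n_fu = n_ipu if n_ipu > n_fpu else n_fpu
    return [
        f"-DVLEN={vlen}",
        f"-DN_IPU={n_ipu}",
        f"-DN_FPU={n_fpu}",
        f"-DN_FU={n_fu}",
        f"-DMEMPOOL_SPATZ={spatz}",
        f"-DRESPWEN={spatz}",  # added by this commit
    ]


defs = spatz_vlog_defs(vlen=256, n_ipu=2, n_fpu=2)
print(" ".join(defs))
```

Because `RESPWEN` is tied to `$(spatz)`, it is only defined for Spatz builds.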
6 changes: 3 additions & 3 deletions hardware/deps/snitch/src/snitch.sv
@@ -2799,7 +2799,7 @@ module snitch
gpr_we[0] = 1'b1;
gpr_waddr[0] = lsu_rd;
gpr_wdata[0] = ld_result[31:0];
-end else if (acc_pvalid_i & acc_pwrite_i) begin
+end else if (acc_pvalid_i) begin
// if we are not retiring another instruction retire the accelerated one now
retire_acc = 1'b1;
gpr_we[0] = 1'b1;
@@ -2836,7 +2836,7 @@ module snitch
retire_load = 1'b1;
gpr_we[1] = 1'b1;
lsu_pready = 1'b1;
-end else if (acc_pvalid_i & acc_pwrite_i) begin
+end else if (acc_pvalid_i) begin
retire_acc = 1'b1;
gpr_we[1] = 1'b1;
gpr_waddr[1] = acc_pid_i;
@@ -2845,7 +2845,7 @@ module snitch
end
// if we are not retiring another instruction retire the load now
end else begin
-if (acc_pvalid_i & acc_pwrite_i) begin
+if (acc_pvalid_i) begin
retire_acc = 1'b1;
gpr_we[0] = 1'b1;
gpr_waddr[0] = acc_pid_i;
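
All three snitch.sv hunks make the same change: the accelerator retire guard no longer requires `acc_pwrite_i`. The sketch below enumerates the truth table of the old and new guards to show which input combination behaves differently. It models only the guard expression, not the surrounding priority logic between loads and accelerator responses, and the function names are hypothetical.

```python
from itertools import product


def retires_acc_old(acc_pvalid, acc_pwrite):
    """Guard before the commit: acc_pvalid_i & acc_pwrite_i."""
    return acc_pvalid and acc_pwrite


def retires_acc_new(acc_pvalid, acc_pwrite):
    """Guard after the commit: acc_pvalid_i only."""
    return acc_pvalid


# Collect every (acc_pvalid, acc_pwrite) pair on which the two guards disagree.
changed = [
    (valid, write)
    for valid, write in product([False, True], repeat=2)
    if retires_acc_old(valid, write) != retires_acc_new(valid, write)
]
print(changed)
```

The only differing case is a valid accelerator response with `acc_pwrite_i` low, which the new guard retires and the old one did not.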
2 changes: 1 addition & 1 deletion hardware/deps/spatz
Submodule spatz updated from 5e854f to 320383
35 changes: 30 additions & 5 deletions hardware/scripts/gen_trace.py
@@ -149,7 +149,8 @@ def annotate_snitch(
retired_reg: dict,
perf_metrics: list,
force_hex_addr: bool = True,
-permissive: bool = False
+permissive: bool = False,
+spatz_active: int = 0
) -> (str, dict):
# Compound annotations in datapath order
ret = []
@@ -242,6 +243,20 @@ def annotate_snitch(
# Any kind of PC change: Branch, Jump, etc.
if not extras['stall'] and extras['pc_d'] != pc + 4:
ret.append('goto {}'.format(int_lit(extras['pc_d'])))
+if not extras['stall_spatz']:
+if extras['stall_totacc']:
+ret.append('// spatz stall {} cycles'.format(extras['stall_totacc']))
+perf_metrics[-1]['stall_totacc'] += extras['stall_totacc']
+if extras['stall_vfu']:
+perf_metrics[-1]['stall_vfu'] += extras['stall_vfu']
+ret.append('({} vfu)'.format(extras['stall_vfu']))
+if extras['stall_vlsu']:
+perf_metrics[-1]['stall_vlsu'] += extras['stall_vlsu']
+ret.append('({} vlsu)'.format(extras['stall_vlsu']))
+if extras['stall_vsldu']:
+perf_metrics[-1]['stall_vsldu'] += extras['stall_vsldu']
+ret.append('({} vsldu)'.format(extras['stall_vsldu']))
+perf_metrics[-1]['spatz_active'] = extras['spatz_active']
# Count stalls, but only in cycles that execute an instruction
if not extras['stall']:
if extras['stall_tot']:
@@ -309,16 +324,19 @@ def annotate_insn(
show_time_info = (dupl_time_info or time_info != last_time_info)
time_info_strs = tuple((str(elem) if show_time_info else '')
for elem in time_info)
+spatz_active = 0
# Annotated trace
if extras_str:
extras = read_annotations(extras_str)
+if 'spatz_active' in extras:
+spatz_active = extras['spatz_active']
# Annotate snitch
(annot, retired_reg) = annotate_snitch(
extras, time_info[1], last_time_info[1],
int(pc_str, 16), gpr_wb_info, prev_wfi_time, retired_reg,
perf_metrics, force_hex_addr,
-permissive)
-if extras['stall']:
+permissive, spatz_active)
+if extras['stall'] or extras['stall_spatz']:
insn, pc_str = ('', '')
else:
perf_metrics[-1]['snitch_issues'] += 1
@@ -423,6 +441,7 @@ def eval_perf_metrics(perf_metrics: list, id: int):
def fmt_perf_metrics(perf_metrics: list, idx: int, omit_keys: bool = True):
ret = ['Performance metrics for section {} @ ({}, {}):'.format(
idx, perf_metrics[idx]['start'], perf_metrics[idx]['end'])]
+ret.append('{:<40}{:>10}'.format('Spatz Active', int_lit(perf_metrics[idx]['spatz_active'])))
for key, val in sorted(perf_metrics[idx].items()):
if omit_keys and key in PERF_EVAL_KEYS_OMIT:
continue
@@ -447,7 +466,7 @@ def sanity_check_perf_metrics(perf_metrics: list, idx: int):
# Sum up all stalls
sum_tot = perf_metric.get('stall_ins', 0) + \
perf_metric.get('stall_lsu', 0) + perf_metric.get('stall_raw', 0) + \
-perf_metric.get('stall_wfi', 0)
+perf_metric.get('stall_wfi', 0) + perf_metric.get('stall_acc', 0)
if (sum_tot != perf_metric.get('stall_tot', 0)):
error['total_stalls'] = sum_tot
# Sum up all cycles
@@ -503,7 +522,13 @@ def perf_metrics_to_csv(perf_metrics: list, filename: str):
'seq_stores_local',
'seq_stores_global',
'itl_stores_local',
-'itl_stores_global']
+'itl_stores_global',
+'stall_spatz',
+'stall_totacc',
+'stall_vfu',
+'stall_vlsu',
+'stall_vsldu',
+'spatz_active']
for key in keys:
if key not in known_keys:
known_keys.append(key)
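
The change to `sanity_check_perf_metrics` adds `stall_acc` to the per-cause stalls that must add up to `stall_tot`. The stand-alone sketch below reproduces only that check. The function `check_total_stalls` and the sample numbers are hypothetical; the key names are taken from the diff.

```python
OLD_STALL_KEYS = ("stall_ins", "stall_lsu", "stall_raw", "stall_wfi")
NEW_STALL_KEYS = OLD_STALL_KEYS + ("stall_acc",)  # key added by this commit


def check_total_stalls(perf_metric, stall_keys=NEW_STALL_KEYS):
    """Return {} if per-cause stalls add up to stall_tot, else the mismatching sum."""
    sum_tot = sum(perf_metric.get(key, 0) for key in stall_keys)
    if sum_tot != perf_metric.get("stall_tot", 0):
        return {"total_stalls": sum_tot}
    return {}


# Made-up section in which 4 of the 14 stall cycles are accelerator stalls.
section = {
    "stall_ins": 3, "stall_lsu": 5, "stall_raw": 2,
    "stall_wfi": 0, "stall_acc": 4, "stall_tot": 14,
}
print(check_total_stalls(section, OLD_STALL_KEYS))  # old check flags a mismatch
print(check_total_stalls(section))                  # new check passes
```

With accelerator stalls present, the old sum falls short of `stall_tot` and reports a spurious error, which the added term avoids.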