Activation unit #1 (Merged)

210 commits merged from activation-unit into main on Sep 24, 2024

Conversation

gamzeisl (Collaborator) commented:

This PR cleans up and extends merge request !10 on GitLab.

Changes:

  • Implementation of an integer-only GELU and ReLU activation unit for ITA (see the sketch after this list).
  • Feedforward and linear layers are added to the FSM.
  • FFN requant params are reserved in the register file.
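
The commit history below references i_poly, i_erf, and i_gelu helpers, which suggests the integer-only GELU follows the second-order polynomial erf approximation popularized by I-BERT. As a reference, here is a minimal Python sketch of that formulation; the coefficients and the example scaling factor are the published I-BERT values and an assumed input scale, not necessarily the exact constants used in PyITA/ITA.py.

```python
import math
import numpy as np

# I-BERT polynomial coefficients: erf(x) ~= sign(x) * (a * (clip(|x|, max=-b) + b)^2 + c)
A, B, C = -0.2888, -1.769, 1.0

def i_poly(q, s):
    """Integer-only evaluation of a*(x + b)^2 + c for x = q * s."""
    q_b = int(math.floor(B / s))
    q_c = int(math.floor(C / (A * s * s)))
    return (q + q_b) ** 2 + q_c, A * s * s

def i_erf(q, s):
    """Integer-only approximation of erf(q * s) on the clipped domain [b, -b]."""
    q_sgn = np.sign(q)
    q_abs = np.clip(np.abs(q), 0, int(math.floor(-B / s)))
    q_l, s_l = i_poly(q_abs, s)
    return q_sgn * q_l, s_l

def i_gelu(q, s):
    """Integer-only GELU(q * s), with GELU(x) = 0.5 * x * (1 + erf(x / sqrt(2)))."""
    q_erf, s_erf = i_erf(q, s / math.sqrt(2))
    q_one = int(math.floor(1.0 / s_erf))
    return q * (q_erf + q_one), s * s_erf / 2

# Example with int8 pre-activations and a hypothetical input scale.
q_in = np.arange(-128, 128, dtype=np.int64)
s_in = 4.0 / 128  # assumed scaling factor, not the value PyITA derives
q_out, s_out = i_gelu(q_in, s_in)
ref = np.array([0.5 * x * s_in * (1.0 + math.erf(x * s_in / math.sqrt(2))) for x in q_in])
print("max abs error vs. float GELU:", np.abs(q_out * s_out - ref).max())
```

In hardware, only the integer parts (q_b, q_c, q_one, and the squaring) need to be computed; the floating-point scaling factors can be folded into the requantization constants that the commits below write out and load.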

gamzeisl requested a review from Xeratec on September 11, 2024
gamzeisl self-assigned this on September 11, 2024

Xeratec (Collaborator) left a comment:

There are a lot of changes, but they look good to me. I only have a few minor comments about the golden model and the CI. Furthermore, I suggest squashing the changes to keep a clean commit history.

Resolved review threads (now outdated):
  • .gitlab-ci.yml
  • PyITA/ITA.py (3 threads)
  • PyITA/test_ITA.py

gamzeisl and others added 27 commits on September 24, 2024 at 18:22
gamzeisl merged commit 2e6e8d8 into main on Sep 24, 2024
6 checks passed
gamzeisl deleted the activation-unit branch on September 24, 2024 at 16:44
gamzeisl added a commit to gamzeisl/ITA that referenced this pull request on Oct 24, 2024:
* build(requirements): add pytorch

* build(requirements): add pytest

* test(py-ita): add feedforward size parameter

* test(test-generator): add feedforward parameter and adjust output path

* feat(py-ita): extend print_properties with feedforward size

* feat(py-ita): implement floating point i_poly

* feat(py-ita): add i_poly_wrapper function

* build(requirements): add pytest-check

* feat(py-ita): add symmetric (de-)quantization functions

* feat(py-ita): add functions for floating point poly and erf

* test(py-ita): add test for quantization, poly and erf functions

* fix(py-ita): fix q_c typo from definition

* feat(py-ita): implement floating point i_gelu

* test(py-ita): add i_gelu tests

* fix(py-ita): fix ipoly types to avoid overflow

* refactor(py-ita): type i-gelu and i-erf

* test(py-ita): clean up and pretty print

* refactor(py-ita): fix typo

* test(py-ita): make scaling factor output more readable

* test(py-ita): verify i-gelu domain interval [b, -b]

* test(py-ita): add error plotting

* refactor(py-ita): rename variable

* feat(py-ita): add missing imports

* feat(py-ita): add gelu golden model based on random preactivation values

* feat(ita-tb): adjustments for feedforward parameter

* feat(return-status): adjust parameters and check number of outputs

* build(makefile): add feedforward parameter

* build(gitlab-ci): adjust for new parameters

* fix(test-generator): typo in simvector path

* fix(test-generator): fix and refactor simvector paths

* feat(py-ita): write out gelu constants

* feat(gelu-tb): add testbench skeleton

* feat(accel-pkg): add gelu bit width parameters

* feat(gelu): add initial scalar gelu

* build(bender): add gelu and gelu testbench

* feat(modelsim): add gelu simulation script

* build(modelsim): add gelu simulation target

* test(gelu-tb): validate scalar gelu for all activations from precomputed stimuli files

* chore(gitignore): ignore vscode settings.json

* fix(py-ita): clip i-gelu within valid range -127, 127

* test(py-ita): add tests for i-gelu edge cases

* refactor(gelu): explicitly sign-extend input to output bit width

* fix(gelu): clip input within valid bounds

* style(gelu): auto-format

* feat(py-ita): implement i-gelu with requantization

* test(py-ita): add tests for i-gelu with requantization

* feat(py-ita): use 8 bit eps_mul for requantizer i-gelu

* test(py-ita): auto-format

* feat(py-ita): write out i-gelu requantization constants

* test(gelu): read and apply gelu requantization constants

* feat(gelu): implement requantization

* test(gelu): remove redundant print statements

* refactor(gelu): rename types

* refactor(gelu): use correct type for add

* test(gelu): refactor to test higher-level activation module instead of just GELU

* feat(relu): add RELU activation module

* feat(activation): add module which lifts GELU and RELU to match output dimension of requantizer

* feat(accel-pkg): add enum to select activation function

* build: adjust simulation and build files for activation testbench

* test(activation): read activations in blocks of N_PE size

* test(activation): also verify ReLU and identity activations

* test(activation-tb): extract validation function to reduce redundancies

* test(activation-tb): rename variables holding gelu constants for clarity

* refactor(py-ita): rename files for GELU constants for clarity

* test(activation-tb): extract function which reads all GELU constants

* test(activation-tb): fix equality check + refactor reading of GELU constants

* test(activation-tb): parametrize function for reading GELU constants

* refactor(activation): reorder inputs

* feat(accel-pkg): extend controller interface with constants for GELU and activation mode

* feat(ita): insert activation module in datapath between requantizer and output FIFO

* test(ita-tb): supply ITA with additional activation-related control signals and constants

* feat(py-ita): implement almost symmetric quantization

* test(py-ita): add simple test for almost symmetric quantization

* test(py-ita): fix last test case of quantize

* test(py-ita): use almost symmetric quantization instead of symmetric quantization

* fix(ita): compute scaling factor based on almost symmetric quantization

* fix(gelu): remove edge case treatment of q=-128, no longer necessary with almost symmetric quantization

* refactor(activation): rename requant variables and extract typedefs

* refactor(gelu): extract type for gelu output

* fix(py-ita): make sure to round before clipping and properly apply half-way rounding from zero

* feat(py-ita): ensure requantization constants are unsigned and compensate for eps_mul sign flip

* test(py-ita): reduce error tolerances

* feat(activation): apply requantization using existing vector module instead of inside GELU

* test(activation-tb): account for requantization latency

* refactor(activation): reorder condition for clarity

* test(activation-tb): make sure to use the correct input file when reading expected values for RELU

* refactor(fifo-controller): rename port for clarity

* fix(fifo-controller): delay insertion by two cycles to account for activation requantizer latency

* feat(activation): don't introduce unnecessary delay for RELU and IDENTITY

* test(activation-tb): extend and refactor for requantized GELU

* feat(fifo-controller): let fifo insertion condition depend on activation latency

* refactor(ita): rename fifo controller port

* refactor(relu): use typedef for clarity

* feat(py-ita): allow choosing activation function for step 6

* feat(ita): delay activation signal to keep in sync with activation input

* test(ita-tb): make value mismatch info more verbose

* test(ita-tb): use RELU activation by default

* feat(activation): fix activation latency to two cycles

* fix(fifo-controller): account for fixed latency of activation module

* fix(ita): account for fixed latency of activation module

* test(ita-tb): use GELU activation by default

* feat(py-ita): use GELU activation by default

* test(ita-tb): introduce activation parameter and set to identity by default

* feat(py-ita): allow setting activation function using command line arguments

* test(ita-tb): expose activation function as parameter

* build(makefile): allow configuring activation function

* test(activation-tb): fix latency

* refactor(accel_pkg): clean up typedefs

* refactor(ita): remove unused first_inner_tile signal

* feat(activation): requantize relu output

* feat(py-ita): requantize relu output

* test(activation-tb): read requantized relu output from generated file

* refactor(test-ITA): move plot files into subdir

* chore(gitignore): ignore pytest cache

* test(activation-tb): rewrite debug msg

* refactor(ita): rename gelu requant constants to more general activation requant constants

* build(requirements): set up pre-commit hooks for auto-formatting python files using yapf

* style: auto-format files with yapf

* build(pre-commit-config): add some linters like checking for trailing whitespace and verifying python ast correctness

* style: fix trailing whitespace

* test(py-ita): fix plot dir

* refactor(ita): remove unused signals

* refactor(accel-pkg): reorganize for clarity

* feat(ita): add requantization controller which decouples state value from constants index using indirection

* refactor(requantizer): incorporate new requant_mode type

* test(activation-tb): use new requant_mode type

* refactor: make enums camel case

* feat(accel-pkg): add feedforward layer type

* refactor(py-ita): extract apply_activation function

* refactor(py-ita): vectorize apply_activation

* test(activation-tb): update file names of activation requant constants

* feat(py-ita): update file names of activation requant constants

* test(ita-tb): rename files of activation and gelu constants

* test(ita-tb): extend testbench to run a single feedforward layer with GELU activation after attention layer

* feat(py-ita): generate testvectors for feedforward layer

* feat(requantization-controller): reuse requantization constants at index 0 for feedforward layer

* feat(accel-pkg): extend interface with layer mode and number of feedforward tiles

* feat(controller): extend FSM for feedforward layer

* feat(py-ita): execute arbitrary activation in feedforward layer

* test(ita-tb): allow executing arbitrary activation function for feedforward layer

* feat(py-ita): add second stage with identity activation to feedforward layer according to transformer architecture

* test(ita-tb): execute second stage with identity activation in feedforward layer

* fix(ita): use correct imported function

* fix(py-ita): use instance field instead of removed local

* fix(py-ita): call correct functions

* refactor(makefile): remove duplicate variable definitions

* refactor(ita-tb): rename projection size to projection space

* fix(py-ita): use RQS constants at index 0 for 2nd FF layer

* fix(ita-tb): add missing required finish_number argument for $fatal call

* fix(hwpe-ita-tb): add missing required finish_number argument for $fatal call

* refactor(accel-pkg): rename layer typedef

* refactor(accel-pkg): reorder ctrl fields

* test(ita-tb): use matching types

* refactor(ita-tb): rename activation constant variables

* refactor(ita-tb): rename index variable since it has no relationship to phase

* fix(hwpe-ita-tb): fix simdir path

* fix(ita-package): update control engine to match control struct in accel_pkg

* test(hwpe-ita-tb): print correct base ptrs

* fix(hwpe-ita-tb): create arrays of correct size so that indexing with the step state does not go out-of-bounds when number of states changes

* feat(ita-package): reorder control engine type

* refactor(ita-package): remove redundant definition of control engine structure

* feat(ita-package): extend register mapping for feedforward layer

* refactor(accel-pkg): explicitly state enum type to allow direct assignment without cast from implicit int

* style(ita-package): formatting

* feat(hwpe-ita-tb): load constants and prepare registers for feedforward layer

* feat(ita-ctrl): pass constants and control signals for feedforward layer to control engine

* refactor(accel-pkg): explicitly compute N_STATES parameter

* refactor(py-ita): extract memfile constant

* fix(py-ita): include all files in file list for hwpe

* perf(gelu): apply strength reduction to remove signed mult when computing gelu_erf

* perf(gelu): reduce intermediate bitwidths

* perf(gelu): use lower bitwidth for poly_d

* feat(py-ita): export mempool and snitch cluster on demand

* chore(gitlab-ci): use gelu activation for FF layer by default

* feat(activation): pipeline gelu

* feat(accel-pkg): increase FIFO depth to account for gelu pipelining

* test(activation-tb): adjust for increased latency

* test(ita-tb): fix input, weight and output timing

* test(activation-tb): fix latency for identity

* refactor(gelu): split up combinational block by stages

* perf(activation): reduce flop resources for RELU buffering by 70% by performing sign extension when reading out buffers

* fix(ita): delay activation control signal until end of activation computations

* perf(gelu): use calc_en signal to only compute during valid cycles

* test(activation): add extra calc_en input

* refactor(accel-pkg): removed unused fields in control_t

* test(ita-tb): don't reference unused control_t fields

* refactor(gelu): merge gelu one and gelu b constants

* test(activation): removed unused signal

* fix(py-ita): correctly compute L2 error

* refactor(ita-package): remove unused gelu one constant

* chore(accel-pkg): increase fifo depth

* build(ci): fix mismatch of generated testvectors

* feat(return-status): add ff checks

* change(hwpe-pkg): do not reuse regs for activations

* fix(ita_tb): lower input valid signal after handshake

* feat: add support for two layer ffn

* fix(PyITA): correct random vector gen for ffn

* feat(PyITA): write hwpe files for ffn

* feat(hwpe_tb): extend to test ffn

* fix(PyITA): correct typecast in gelu

* change(ci): add activation to hwpe sim

* feat(PyITA): add separate requantization params for ffn

* feat(hw): add separate ffn requant params

* [PyITA] Move GELU functions

* [ci] Add tests with relu

* [PyITA] Modify license headers

* Remove config yaml file

* Add header

* Fix python format

* Add relu test vectors to ci

---------

Co-authored-by: Timon Fercho <tfercho@student.ethz.ch>
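
Several of the commits above concern the requantizer that maps wide intermediate results back to int8 using an 8-bit eps_mul, a right shift, and an additive constant, with rounding half away from zero before clipping. The following is a minimal sketch of such a multiply-shift-add requantization step; the parameter names, operand order, and example values are assumptions for illustration, not the exact PyITA or RTL implementation.

```python
import numpy as np

def requantize(x, eps_mul, right_shift, add, n_bits=8):
    """Sketch of a multiply-shift-add requantization to n-bit signed integers."""
    x = np.asarray(x, dtype=np.int64)
    scaled = x * int(eps_mul)
    # Round half away from zero instead of truncating the arithmetic shift.
    half = (1 << (right_shift - 1)) if right_shift > 0 else 0
    rounded = np.where(scaled >= 0,
                       (scaled + half) >> right_shift,
                       -((-scaled + half) >> right_shift))
    y = rounded + int(add)
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1)) - 1
    return np.clip(y, lo, hi).astype(np.int8)

# Hypothetical example: wide accumulator values requantized to int8.
acc = np.array([-123456, -1, 0, 1, 98765], dtype=np.int64)
print(requantize(acc, eps_mul=91, right_shift=16, add=0))
```

Rounding before the clip matches the behavior described in the commit "fix(py-ita): make sure to round before clipping and properly apply half-way rounding from zero".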