Add gpt2 converter, hellaswag eval tool, misc fixes #38

Merged
4 commits merged into main from gpt2_hellaswag on Jun 20, 2024

Conversation

@francoishernandez (Contributor) commented Jun 19, 2024

While experimenting in #32, I gathered that it would be good to support the official gpt2 baseline, so here it is.

Some notes:

  • gpt2 does not reproduce the official huggingface implementation 100%, most probably because of slight numerical differences between nn.Linear (ours) and Conv1D (huggingface)
  • addition of a "TRANSPOSE" mechanism in convert_HF (again, Linear vs Conv1D: the two store their weight matrices with swapped dimensions)
  • the hellaswag tool is a bit janky
  • addition of "Learned" position_encoding; some additional factorization around this might be good
  • ⚠️ modification of the default left_padding behaviour (might still be improved)
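
To illustrate the Linear vs Conv1D point above, here is a minimal, framework-free sketch (plain nested lists stand in for tensors). It relies on the fact that huggingface's Conv1D stores its weight as (in_features, out_features) while nn.Linear stores it as (out_features, in_features), so the checkpoint weights have to be transposed on conversion. The names `convert_weight` and `TRANSPOSE_SUFFIXES` are hypothetical and only mimic the idea; they are not the actual convert_HF code from this PR.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(w):
    """Swap the two dimensions of a matrix."""
    return [list(col) for col in zip(*w)]


def conv1d_forward(x, weight, bias):
    """Conv1D-style layer: weight is (in_features, out_features), y = x @ W + b."""
    return [[v + b for v, b in zip(row, bias)] for row in matmul(x, weight)]


def linear_forward(x, weight, bias):
    """Linear-style layer: weight is (out_features, in_features), y = x @ W^T + b."""
    return [[v + b for v, b in zip(row, bias)] for row in matmul(x, transpose(weight))]


# Hypothetical list of checkpoint key suffixes whose weights come from Conv1D
# layers in the huggingface gpt2 checkpoint (assumption, for illustration only).
TRANSPOSE_SUFFIXES = ("c_attn.weight", "c_proj.weight", "c_fc.weight")


def convert_weight(name, weight):
    """Transpose Conv1D weights so they can be loaded into a Linear layer."""
    if name.endswith(TRANSPOSE_SUFFIXES):
        return transpose(weight)
    return weight


# Toy check: 2 input features, 3 output features.
x = [[1.0, 2.0]]
hf_weight = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # (in, out), Conv1D layout
bias = [0.5, 0.5, 0.5]

converted = convert_weight("h.0.attn.c_attn.weight", hf_weight)  # (out, in)
reference = conv1d_forward(x, hf_weight, bias)
ours = linear_forward(x, converted, bias)
```

With exact toy values both layers give the same result. In floating point on real models, the two code paths can accumulate in different orders, which is one plausible source of the small numerical differences mentioned above.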

@funboarder13920 @l-k-11235 this will conflict with #26 and potential future work there

@francoishernandez francoishernandez merged commit c4a8a3d into main Jun 20, 2024
4 checks passed
francoishernandez added a commit that referenced this pull request Jun 20, 2024
francoishernandez added a commit that referenced this pull request Jun 20, 2024
@francoishernandez francoishernandez deleted the gpt2_hellaswag branch July 1, 2024 13:46