AISE-TUDelft/nlbse23_reading_list
The (Ab)use of Open Source Code to Train Large Language Models

This repository is a reading list accompanying the position paper published at the 2nd International Workshop on NL-Based Software Engineering (NLBSE 2023).

Articles

Social Media Posts

Papers

  • Andreotta, Adam J., Nin Kirkham, and Marco Rizzi. "AI, big data, and the future of consent." AI & Society 37.4 (2022): 1715-1728.
  • Carlini, Nicholas, et al. "Extracting training data from large language models." 30th USENIX Security Symposium (USENIX Security 21). 2021.
  • Carlini, Nicholas, et al. "Membership inference attacks from first principles." 2022 IEEE Symposium on Security and Privacy (SP). IEEE, 2022.
  • Carlini, Nicholas, et al. "Quantifying memorization across neural language models." arXiv preprint arXiv:2202.07646 (2022).
  • Fried, Daniel, et al. "InCoder: A generative model for code infilling and synthesis." arXiv preprint arXiv:2204.05999 (2022).
  • Pearce, Hammond, et al. "Asleep at the keyboard? Assessing the security of GitHub Copilot's code contributions." 2022 IEEE Symposium on Security and Privacy (SP). IEEE, 2022.
  • Sun, Zhensu, et al. "CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning." Proceedings of the ACM Web Conference 2022. 2022.

Other
