NikitaEmberi/Unveiling-the-npm-Ecosystem-STA220-project

Unveiling the npm Ecosystem: Insights into Popularity, Diversity, and Security

Welcome to our project! This repository contains code and documentation related to our exploration of the npm ecosystem.

The npm ecosystem has grown rapidly and now hosts over 800,000 packages, more than other package managers. Understanding its impact, however, takes more than package counts. This project examines several aspects of the npm ecosystem: package popularity, gender representation among contributors, common domains and keywords, dependency networks, contributor locations, and package vulnerabilities. By analyzing these factors, we aim to uncover trends, community priorities, and potential security issues, and to provide useful insights for developers, researchers, and other stakeholders invested in the growth and sustainability of the npm ecosystem.

Research Questions:

  1. Identification of Popular Packages: How can we effectively identify popular npm packages? This entails defining and exploring metrics such as downloads, GitHub stars, and community engagement to gauge package popularity comprehensively.

  2. Exploring the Gender Distribution among Contributors: What is the gender distribution among contributors in the npm ecosystem? Are women underrepresented? Do popular packages have a higher proportion of female contributors?

  3. Prevalent Domains and Keywords within npm Packages: What are the prevalent domains and keywords within npm packages, particularly among popular ones? How do these patterns reflect current trends and areas of focus within the broader software development community?

  4. Construction of Dependency Networks and Ranking Package Importance: How can a dependency network built from npm package data, combined with Google's PageRank algorithm, be used to identify the top 10 packages? How does this ranking deepen our understanding of package importance within the ecosystem?

  5. Geographic Distribution of npm Developers: How are npm contributors distributed across countries, especially contributors to popular npm packages?

  6. Number and Classification of Vulnerable Packages: How many npm packages are vulnerable, and how are those vulnerabilities classified, particularly among popular packages?
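
To make research question 4 concrete, here is a minimal sketch of PageRank computed by power iteration over a toy dependency graph. It is an illustration only, not the project's actual implementation: the package names (`app-a`, `app-b`, `util`, `core`), the edge list, the `pagerank` helper, and the damping factor of 0.85 (the conventional default) are all assumptions. Each edge points from a dependent package to the package it depends on, so heavily depended-upon packages accumulate rank.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank over (dependent, dependency) edges."""
    nodes = sorted({name for edge in edges for name in edge})
    out_links = {node: [] for node in nodes}
    for dependent, dependency in edges:
        out_links[dependent].append(dependency)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Packages with no dependencies ("dangling" nodes) spread
        # their rank evenly over every node.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: (dependent, dependency) pairs.
edges = [
    ("app-a", "util"),
    ("app-b", "util"),
    ("util", "core"),
    ("app-a", "core"),
]

ranks = pagerank(edges)
# Keep the ten highest-ranked packages (the toy graph has only four).
top_packages = sorted(ranks, key=ranks.get, reverse=True)[:10]
for name in top_packages:
    print(f"{name}: {ranks[name]:.4f}")
```

On real data, the edge list would come from each package's declared dependencies, and the same ranking step would yield the top 10 packages.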

Interactive visualizations:

  • Contributor Locations: here
  • Locations of Contributors in Popular npm Packages: here
  • LDA Model: here

For a detailed description of our findings, including in-depth analysis, insights, and conclusions, please refer to our report: Link to Report.

Project Acknowledgements

Both teammates, V V S Aakash Kotha and Nikita Bhrugumaharshi Emberi, contributed equally to the development of this project. While most of the work was done independently, we used language models, specifically ChatGPT [10], where necessary to confirm details and deepen our understanding, and we acknowledge its assistance throughout the project.

Additionally, we would like to thank our professor, Peter Kramlinger, and our TA, Sophia Sun, for their guidance and support. Their insights and feedback were invaluable in shaping the direction of our work.

Languages

  • Jupyter Notebook 95.1%
  • HTML 4.9%