non-npm-package-json-files

Get a collection of package.json files for non-NPM packages

Why?

We needed package.json files from real projects that aren't packages published to NPM. While NPM can tell you the absolute usage of NPM packages in terms of download numbers, we were interested in the set of dependencies that people were using together in a given project.

For more details about how we used a sample of these package.json files in a simulation for StackAid, see: StackAid in Beta

Requirements

On macOS:

brew install go-task/tap/go-task && task brew:requirements

Get a Sourcegraph access token

Use the src CLI to see if you're authenticated:

task src:login

If you're not logged in, then you should see a link in the output for creating an access token. Once you have an access token, put it in the .env file. It should look like this:

SRC_ACCESS_TOKEN=<your access token>

Once the token is configured, rerun the src:login task to confirm your configuration.

Query Sourcegraph

To query for all package.json files on GitHub that aren't in node_modules or directories such as test, fixture or examples:

task src:query

The command takes about 1 minute and returns just over 1M results. The results are written to ./data/src_github_results.jsonl in the data directory, and the file should look like this:

{"type":"path","path":"package.json","repository":"freeCodeCamp/freeCodeCamp","branches":[""],"commit":"382717cce4ea5593eb623ba5ef0bd47c534411d1"}
{"type":"path","path":"web/package.json","repository":"freeCodeCamp/freeCodeCamp","branches":[""],"commit":"382717cce4ea5593eb623ba5ef0bd47c534411d1"}
{"type":"path","path":"curriculum/package.json","repository":"freeCodeCamp/freeCodeCamp","branches":[""],"commit":"382717cce4ea5593eb623ba5ef0bd47c534411d1"}
{"type":"path","path":"tools/crowdin/package.json","repository":"freeCodeCamp/freeCodeCamp","branches":[""],"commit":"382717cce4ea5593eb623ba5ef0bd47c534411d1"}
{"type":"path","path":"tools/scripts/seed/package.json","repository":"freeCodeCamp/freeCodeCamp","branches":[""],"commit":"382717cce4ea5593eb623ba5ef0bd47c534411d1"}
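Since the goal is to see which package.json files belong to the same project, it can help to group the results by repository. The following is a minimal Python sketch, not part of this repo's tasks: the function names and the `EXCLUDED_DIRS` list are hypothetical, and the exclusion check only approximates the directories named above (the real Sourcegraph query may filter differently).

```python
import json
from collections import defaultdict

# Assumption: mirrors the directories named in this README; the actual query may differ.
EXCLUDED_DIRS = {"node_modules", "test", "fixture", "examples"}


def is_excluded(path):
    """Return True if any directory component of the path is in EXCLUDED_DIRS."""
    directories = path.split("/")[:-1]  # drop the file name itself
    return any(part in EXCLUDED_DIRS for part in directories)


def group_paths_by_repo(jsonl_lines):
    """Map each repository to the list of package.json paths found in it."""
    grouped = defaultdict(list)
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record.get("type") != "path" or is_excluded(record["path"]):
            continue
        grouped[record["repository"]].append(record["path"])
    return dict(grouped)


# Example usage (assumes the results file from `task src:query` exists):
# with open("./data/src_github_results.jsonl", encoding="utf-8") as f:
#     for repo, paths in group_paths_by_repo(f).items():
#         print(repo, len(paths))
```

Reading the file line by line keeps memory use modest even with around a million results, since only the grouped paths are held in memory rather than the raw text.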

To convert the file to a CSV:

task src:query:csv

The results will be in ./data/src_github_results.csv and should look like this:

repo,commit_sha,path
freeCodeCamp/freeCodeCamp,382717cce4ea5593eb623ba5ef0bd47c534411d1,package.json
freeCodeCamp/freeCodeCamp,382717cce4ea5593eb623ba5ef0bd47c534411d1,web/package.json
freeCodeCamp/freeCodeCamp,382717cce4ea5593eb623ba5ef0bd47c534411d1,curriculum/package.json
freeCodeCamp/freeCodeCamp,382717cce4ea5593eb623ba5ef0bd47c534411d1,tools/crowdin/package.json
freeCodeCamp/freeCodeCamp,382717cce4ea5593eb623ba5ef0bd47c534411d1,tools/scripts/seed/package.json
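The mapping from the JSONL fields to the CSV columns can be sketched in Python as follows. This is only an illustration of the transformation implied by the two samples (`repository` to `repo`, `commit` to `commit_sha`, `path` to `path`); it is not necessarily how the src:query:csv task is implemented, and `jsonl_to_csv` is a hypothetical name.

```python
import csv
import json


def jsonl_to_csv(jsonl_lines, out):
    """Write one CSV row (repo, commit_sha, path) per JSONL record to `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["repo", "commit_sha", "path"])
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        writer.writerow([record["repository"], record["commit"], record["path"]])


# Example usage (assumes the JSONL results file already exists):
# with open("./data/src_github_results.jsonl", encoding="utf-8") as src, \
#         open("./data/src_github_results.csv", "w", encoding="utf-8", newline="") as dst:
#     jsonl_to_csv(src, dst)
```

Using the `csv` module rather than joining strings by hand means paths containing commas or quotes are escaped correctly.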

Debug Sourcegraph query

Try the query on Sourcegraph!