* Make log level configurable
* Add basic CI checks
* Add checkout, oops
* Install pipenv
* Fix pipenv cmd
* Use the Python setter upper
* Syntax error
* YAML is hard
* Add fixtures
* Spruce up logging
* Update dat readme
* Add header image
* Accept regions on command line
* Fix typo
Showing 18 changed files with 893 additions and 49 deletions.
@@ -0,0 +1,16 @@

```yaml
name: analyze
on: [push]

jobs:
  analyze:
    name: make analyze
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v1
        with:
          python-version: 3.8
      - name: install dependencies
        run: pip install pipenv && python3 -m pipenv sync --dev
      - name: run static type analysis
        run: make analyze
```
@@ -0,0 +1,16 @@

```yaml
name: check
on: [push]

jobs:
  check:
    name: make check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v1
        with:
          python-version: 3.8
      - name: install dependencies
        run: pip install pipenv && python3 -m pipenv sync --dev
      - name: run unit and functional tests
        run: make check
```
@@ -0,0 +1,16 @@

```yaml
name: format check
on: [push]

jobs:
  format:
    name: make format-check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v1
        with:
          python-version: 3.8
      - name: install dependencies
        run: pip install pipenv && python3 -m pipenv sync --dev
      - name: run formatting check
        run: make format-check
```
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,31 +1,114 @@ | ||
# PyRS990 | ||
![PyRS990 Header](pyrs990_header.png) | ||
|
||
It's a pun. Get it? | ||
|
||
A Python application that can grab all sorts of IRS Form 990 data on non-profit organizations and | ||
put it into a format that can be consumed easily. | ||
A Python application and library that can grab all sorts of IRS Form 990 | ||
data on non-profit organizations and put it into a format that can be | ||
consumed easily by other applications. | ||
|
||
## Running | ||
## Up and Running | ||
|
||
The instructions below should allow you to get the software working | ||
for your purpose (user or developer). If you run into trouble, | ||
please let us know so that we can update the instructions (or fix the | ||
bug you ran into). | ||
|
||
### User | ||
|
||
For now you need to clone the repo to use it. Eventually we'll package it. | ||
|
||
1. Make sure you have Python 3.8 available | ||
1. Install `pipenv` - `pip install --user pipenv` | ||
1. Clone the whole repo, `cd` into the `pyrs990` directory | ||
1. Install dependencies - `pipenv sync` | ||
1. Run the example: `pipenv run python -m pyrs990` | ||
1. Run it again to use the cache (notice the speedup) | ||
1. Run it, some very simple examples are below: | ||
1. `pipenv run python -m pyrs990 --zip 59801 --use-disk-cache` | ||
1. ...more examples coming soon | ||
1. Run the commands again, notice the cache speedup | ||
1. The cache is set to `./.pyrs990-cache/` | ||
|
||
## Development | ||
### Developer | ||
|
||
1. Make sure you have Python 3.8 available | ||
1. Install `pipenv` - `pip install --user pipenv` | ||
1. Clone the whole repo, `cd` into the `pyrs990` directory | ||
1. Install dependencies - `pipenv install --dev` | ||
1. Install dependencies - `pipenv sync --dev` | ||
1. Make your changes | ||
1. If you need to add dependencies: | ||
1. `pipenv install coolpkg` | ||
1. `pipenv lock` (updates the lock file) | ||
1. Make a pull request | ||
|
||
## About the Data | ||
|
||
Right now we pull data that originated with the IRS (hence the silly name) | ||
but we get it from a couple sources and information about what is actually | ||
available is a little spread out as well. | ||
|
||
### Structure | ||
|
||
There are two indices used to narrow down the list of filing documents | ||
that must be downloaded to satisfy a given query. The first is an | ||
annual index (we refer to it as "Annual" or "Annual Index" in the | ||
code). This index contains all filings processed by the IRS for a | ||
given calendar year. | ||
|
||
Note that this does not necessarily have anything | ||
to do with the filing year. An organization might, for example, file | ||
its 2016 990 in either 2017 or 2018 (or even later). There is a field, | ||
described below, called `tax_period` that reflects the filing period. | ||
In the future, we intend to further abstract this so that it is | ||
easier to use. | ||
|
||
The annual index also contains a field called `object_id` that tells | ||
us where to find the XML document that corresponds to that row in | ||
the index. PyRS990 abstracts this away, but it is still good to be | ||
aware of it. | ||
|
||
The second index is the "Exempt Organizations Business Master File" | ||
distributed by the IRS. We refer to it as the "BMF Index". This | ||
index provides the physical address of each organization, along | ||
with some other helpful information. This allows the data to be | ||
queried by state, zip code, and so on, which greatly reduces the | ||
number of filing documents that must be downloaded for many queries. | ||
|
||
Indices may be used to query filing documents from the command | ||
line using various options. Note that there are options for both | ||
indices and for the filing documents themselves. If possible, it | ||
is a good idea to try to use as many index fields as you can to | ||
reduce the number of files you have to download. | ||
|
||
See the example queries for more information. | ||
|
||
### Sources | ||
|
||
The [IRS BMF index files](https://www.irs.gov/charities-non-profits/exempt-organizations-business-master-file-extract-eo-bmf) | ||
are hosted by the IRS directly and are available by state and region. | ||
|
||
[Descriptions of the variables](https://www.irs.gov/pub/irs-soi/eo_info.pdf) | ||
contained in the files and the process used to build them are | ||
also available (it is also linked from the page above). | ||
|
||
The annual index files come from an | ||
[AWS S3 bucket](https://registry.opendata.aws/irs990/) | ||
managed by the IRS. The contents of the bucket are described there. | ||
|
||
There is also [a readme](https://docs.opendata.aws/irs-990/readme.html) | ||
that demonstrates how to download the files (it is also linked | ||
from the page above). | ||
|
||
The filing documents themselves also come from this same | ||
[AWS S3 bucket](https://registry.opendata.aws/irs990/) | ||
in XML format. For the extremely XML-savvy, you can check out the | ||
[schema documentation](https://www.irs.gov/e-file-providers/current-valid-xml-schemas-and-business-rules-for-exempt-organizations-modernized-e-file) | ||
on the IRS website. PyRS990 abstracts this away, however, | ||
so there's no real need to understand it if you only want to access the | ||
data in a convenient format. | ||
|
||
Finally, while not strictly a data source, the | ||
[IRSx documentation](http://www.irsx.info/) created | ||
by ProPublica contains descriptions of many of the filing fields in a | ||
simple, readable format. For developers, PyRS990 has been designed to | ||
work with the exact XPath selectors listed in the IRSx documentation, so | ||
if you want to add a field to the `Filing` object, this is the place to | ||
look first. |
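
The two-index narrowing described in the README's "Structure" section can be sketched roughly as follows. This is a minimal illustration using only the standard library, not PyRS990's actual implementation: the helper names `eins_in_zip` and `object_ids_for` are hypothetical, and the sample rows are made up (only the column names are taken from the fixture files in this commit). The BMF index is filtered by ZIP code first, and the resulting EINs are then used to pick rows, and therefore `OBJECT_ID` values, out of the annual index.

```python
import csv
import io

# Illustrative rows only: the column names follow the fixture files in this
# commit, but the EINs, names, and object IDs below are made up.
BMF_CSV = """EIN,NAME,STATE,ZIP
000000001,EXAMPLE FOOD BANK,MT,59801-1234
000000002,EXAMPLE ARTS COUNCIL,MT,59937-0000
"""

ANNUAL_CSV = """EIN,TAX_PERIOD,TAXPAYER_NAME,RETURN_TYPE,OBJECT_ID
000000001,201712,EXAMPLE FOOD BANK,990,000000000000000001
000000001,201812,EXAMPLE FOOD BANK,990,000000000000000002
000000002,201812,EXAMPLE ARTS COUNCIL,990,000000000000000003
"""


def eins_in_zip(bmf_text, zip5):
    """Return the EINs of BMF rows whose ZIP starts with the 5-digit code."""
    rows = csv.DictReader(io.StringIO(bmf_text))
    return {row["EIN"] for row in rows if row["ZIP"].startswith(zip5)}


def object_ids_for(annual_text, eins):
    """Return OBJECT_IDs of annual index rows filed by the given EINs."""
    rows = csv.DictReader(io.StringIO(annual_text))
    return [row["OBJECT_ID"] for row in rows if row["EIN"] in eins]


# Step 1: narrow by address using the BMF index.
eins = eins_in_zip(BMF_CSV, "59801")
# Step 2: look up which filing documents belong to those organizations.
to_download = object_ids_for(ANNUAL_CSV, eins)
print(to_download)
```

Only the filings listed in `to_download` would need to be fetched, which is the reduction in downloads the README refers to.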
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
RETURN_ID,FILING_TYPE,EIN,TAX_PERIOD,SUB_DATE,TAXPAYER_NAME,RETURN_TYPE,DLN,OBJECT_ID | ||
16285381,EFILE,133085892,201809,5/10/2019 6:06:12 AM,LOGOS ENCOUNTER INC,990,93493091012069,201910919349301206 | ||
16279505,EFILE,640411847,201805,5/8/2019 9:46:22 PM,MISSISSIPPI CHRISTIAN FOUNDATION,990,93493101010839,201931019349301083 |
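
The README notes that the year a filing was processed is not necessarily its filing year. A small sketch with the two fixture rows above shows the difference. It assumes, without confirmation from the source, that `TAX_PERIOD` is encoded as `YYYYMM` and that `SUB_DATE` is a US-style 12-hour timestamp; the function names are hypothetical.

```python
import csv
import io
from datetime import datetime

# The two rows of the annual index fixture added in this commit.
ANNUAL_FIXTURE = """RETURN_ID,FILING_TYPE,EIN,TAX_PERIOD,SUB_DATE,TAXPAYER_NAME,RETURN_TYPE,DLN,OBJECT_ID
16285381,EFILE,133085892,201809,5/10/2019 6:06:12 AM,LOGOS ENCOUNTER INC,990,93493091012069,201910919349301206
16279505,EFILE,640411847,201805,5/8/2019 9:46:22 PM,MISSISSIPPI CHRISTIAN FOUNDATION,990,93493101010839,201931019349301083
"""


def parse_tax_period(value):
    """Split a tax period into (year, month), assuming a YYYYMM encoding."""
    return int(value[:4]), int(value[4:6])


def parse_sub_date(value):
    """Parse SUB_DATE, assuming month/day/year with a 12-hour clock."""
    return datetime.strptime(value, "%m/%d/%Y %I:%M:%S %p")


summary = []
for row in csv.DictReader(io.StringIO(ANNUAL_FIXTURE)):
    period_year, period_month = parse_tax_period(row["TAX_PERIOD"])
    submitted = parse_sub_date(row["SUB_DATE"])
    summary.append(
        (row["TAXPAYER_NAME"], period_year, period_month, submitted.year)
    )

for name, period_year, period_month, submitted_year in summary:
    print(f"{name}: tax period {period_year}-{period_month:02d}, "
          f"submitted in {submitted_year}")
```

Under those assumptions, both fixture rows carry a 2018 tax period but a 2019 submission date.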
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
EIN,NAME,ICO,STREET,CITY,STATE,ZIP,GROUP,SUBSECTION,AFFILIATION,CLASSIFICATION,RULING,DEDUCTIBILITY,FOUNDATION,ACTIVITY,ORGANIZATION,STATUS,TAX_PERIOD,ASSET_CD,INCOME_CD,FILING_REQ_CD,PF_FILING_REQ_CD,ACCT_PD,ASSET_AMT,INCOME_AMT,REVENUE_AMT,NTEE_CD,SORT_NAME | ||
010571299,HILLSIDE CHURCH,% TERESA PARKER,5685 HWY 93 S,WHITEFISH,MT,59937-8523,1489,03,9,7000,196007,1,10,001268000,1,01,,0,0,06,0,12,,,,, | ||
010613656,CHAPEL OF HOPE MINISTRIES OF ROUNDUP,% MERLE & LOUISE HUNT,16843 HWY 12 WEST,ROUNDUP,MT,59072-0000,0000,03,3,7000,200206,1,10,000000000,1,01,,3,3,06,0,09,,,,X20,YCH |
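
The BMF fixture above has two details worth handling when querying by ZIP code: the `ZIP` column holds ZIP+4 values, and the `EIN` column has leading zeros that would be lost if parsed as integers. The sketch below reads the fixture with the standard `csv` module and keeps every field as a string. The helpers `zip5` and `find_by_zip` are hypothetical, and matching on the first five digits is an assumption about how a `--zip` style query might compare values, not a description of PyRS990's behavior.

```python
import csv
import io

# The two rows of the BMF index fixture added in this commit.
BMF_FIXTURE = """EIN,NAME,ICO,STREET,CITY,STATE,ZIP,GROUP,SUBSECTION,AFFILIATION,CLASSIFICATION,RULING,DEDUCTIBILITY,FOUNDATION,ACTIVITY,ORGANIZATION,STATUS,TAX_PERIOD,ASSET_CD,INCOME_CD,FILING_REQ_CD,PF_FILING_REQ_CD,ACCT_PD,ASSET_AMT,INCOME_AMT,REVENUE_AMT,NTEE_CD,SORT_NAME
010571299,HILLSIDE CHURCH,% TERESA PARKER,5685 HWY 93 S,WHITEFISH,MT,59937-8523,1489,03,9,7000,196007,1,10,001268000,1,01,,0,0,06,0,12,,,,,
010613656,CHAPEL OF HOPE MINISTRIES OF ROUNDUP,% MERLE & LOUISE HUNT,16843 HWY 12 WEST,ROUNDUP,MT,59072-0000,0000,03,3,7000,200206,1,10,000000000,1,01,,3,3,06,0,09,,,,X20,YCH
"""


def zip5(value):
    """Reduce a ZIP+4 value such as '59937-8523' to its 5-digit prefix."""
    return value.split("-", 1)[0]


def find_by_zip(bmf_text, wanted):
    """Return (EIN, NAME, CITY) for rows whose 5-digit ZIP equals `wanted`."""
    rows = csv.DictReader(io.StringIO(bmf_text))
    return [
        (row["EIN"], row["NAME"], row["CITY"])
        for row in rows
        if zip5(row["ZIP"]) == wanted
    ]


matches = find_by_zip(BMF_FIXTURE, "59072")
print(matches)
```

Because `csv.DictReader` yields strings, the EIN `010613656` keeps its leading zero.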