org name matching #9

Open
mcneillucy opened this issue Jul 2, 2019 · 3 comments

mcneillucy commented Jul 2, 2019

Here is a list of what is still incorrect, what still needs work, and what to keep in mind about my rules. The #Testing section of the orgnames R script has the tools I used to create the rules.

mcneillucy commented Jul 3, 2019

ORG.COMMENT *filtered by rulemaking

  • completed:
    FWS, NPS, FDA, EERE, EPA, VA, IRS, CFPB, OSHA, ATF, OFCCP, ACF, ETA, DOD, EEOC, PHMSA, CDC, BSEE, EBSA, OCC, DOI, MMS, CPSC, LMSO, OPM, OSM, CMS, ED, BLM, FNS, USCIS (0), FAA (0), NOAA

  • some agencies were too big to look through using the matching dockets within all comments:
    BLM, WHD, HHS

  • some agencies were pretty messy:
    NHTSA, NRC

  • had problems loading as groups; I assume these are fine on their own, but I couldn't get to them:
    BOEM, DOL, OTS, HHS, NRC, BLM, DOD, CMS, ETA, FNS

  • Notes:

    • I played it safe after completing the general org.comment coding. A lot of the coding pertains to just one agency and could potentially be broadened. For example, when I use str_dct(org, ".*"), I often apply it to a group of agencies only after glancing through them
    • within mass comments it was hard to tell whether any single comment came directly from the org, so there will be some missing TRUEs there
    • I knew less when I did the first few agencies, so there may be some easy code that would capture more. I can't tell given the loading speed over the last few days. I would at least check back to make sure nothing is badly miscoded (i.e. EPA, NPS, FWS)
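
One way to make single-agency rules easier to broaden is to store each pattern together with the agencies it applies to, so that widening a rule means editing a set rather than copying code. The sketch below is in Python and is not taken from the orgnames R script; the rule list, the two patterns, and the function name are hypothetical and only illustrate the idea.

```python
import re

# Hypothetical rules: (pattern, agencies). agencies=None means "apply to all agencies".
# Both patterns are made up for illustration; they are not from the R script.
RULES = [
    (re.compile(r"\bon behalf of\b", re.IGNORECASE), None),
    (re.compile(r"\bchapter\b", re.IGNORECASE), {"FWS", "NPS"}),
]

def is_org_comment(text, agency, rules=RULES):
    """Return True if any rule that covers this agency matches the text."""
    for pattern, agencies in rules:
        # Skip rules that are scoped to other agencies.
        if agencies is not None and agency not in agencies:
            continue
        if pattern.search(text):
            return True
    return False
```

Broadening the second rule would then mean adding agencies to its set, or replacing the set with None once it has been checked everywhere.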

mcneillucy commented Jul 3, 2019

POSITION (two per docket)

  • completed:
    EERE, NPS, FWS, FDA, VA, OSHA, ATF
    (note: we decided to wait for spreadsheet for EPA)

mcneillucy commented Jul 3, 2019

ORG

  • for the most part this has been working well. A few orgs pick up more words in front of the name than they should, but these should be easy to remove (ex. attended the art institute)

  • created a script for one-word orgs that are getting dropped because of a coding choice (line 255). These need to be added back, and/or the code at line 1069 can be run afterward to bring the organization variable back into org if org.comment == "T"

  • Problems:

    • (line 250) org names with lots of "." are dropping out because of the code that removes gibberish names like J. Smith from org. This loses names like N.R.A, A.O.I.B.P, and I.A.M National Pension Fund
    • there is a lot of mismatching where a first name and last name end up in org
    • sometimes multiple orgs comment together, and that may not be captured in org
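
The dotted-name problem (line 250) might be narrowed by dropping only strings that look like initials plus a single surname, rather than anything containing several dots. The following is a rough sketch in Python, not the code in the R script: the regular expression and both function names are assumptions, and it assumes org is already a trimmed string. The second function mirrors the line 1069 idea of falling back to the organization variable when org is empty and org.comment is "T".

```python
import re

# Assumed shape of a personal name written with initials: one or two dotted
# capitals followed by exactly one capitalized surname, e.g. "J. Smith",
# "J. R. Smith", or "J.R. Smith".
PERSON_INITIALS = re.compile(r"^(?:[A-Z]\.\s*){1,2}[A-Z][a-z]+$")

def keep_dotted_org(org):
    """Return False only for initials-plus-surname strings; keep everything else."""
    return PERSON_INITIALS.search(org) is None

def restore_org(org, organization, org_comment):
    """Fall back to the organization value when org is empty and org_comment is "T"."""
    if not org and org_comment == "T" and organization:
        return organization
    return org

# The names mentioned in the notes above.
for name in ["N.R.A", "A.O.I.B.P", "I.A.M National Pension Fund", "J. Smith"]:
    print(name, keep_dotted_org(name))
```

A pattern this strict will still let some personal names through (for example, a name with three initials), so the result would need the same spot-checking as the other rules.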

judgelord self-assigned this Oct 10, 2024
judgelord changed the title from "Summary/Notes" to "org name matching" Oct 10, 2024