Update ct.gov connector and parser for ct.gov API v2 #158

Merged on Jul 24, 2024 (1 commit)

machinehum

Refactor to use v2 API and update URLs that referenced "classic" site for study pages.

Connector changes:

  1. Page through the search results recursively. There is no longer an option to return more than 100 studies at once when searching by location and similar criteria.
  2. Show status messages for both current "page" and overall.
  3. Fix #clear, which had not been updated to delete TrialSubgroups when those were added, so deleting Trials failed with a reference error.
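
For illustration, the paging and status reporting in items 1 and 2 could look roughly like the sketch below. It is not the connector's actual code: `each_study_page` and `fetch_page` are hypothetical names, and the `"studies"` and `"nextPageToken"` keys are assumed to mirror the v2 response shape. The page fetcher is passed in as a callable, so the example runs without network access.

```ruby
# Hypothetical sketch: walk every page of results by following page tokens.
# `fetch_page` is any callable that takes a page token (nil for the first
# page) and returns a Hash with "studies" and, when more pages remain,
# "nextPageToken". These key names are assumptions about the response.
def each_study_page(fetch_page, token = nil, page_number = 1, total_so_far = 0, &block)
  page = fetch_page.call(token)
  studies = page.fetch("studies", [])
  total_so_far += studies.size

  # Hand the caller both the current page and the running overall total.
  block.call(studies, page_number, total_so_far)

  next_token = page["nextPageToken"]
  return total_so_far if next_token.nil?

  # Recurse into the next page until no token is returned.
  each_study_page(fetch_page, next_token, page_number + 1, total_so_far, &block)
end

# Stubbed pages standing in for real API responses.
pages = {
  nil  => { "studies" => [{ "id" => "A" }, { "id" => "B" }], "nextPageToken" => "t2" },
  "t2" => { "studies" => [{ "id" => "C" }] }
}
fetch_page = ->(token) { pages.fetch(token) }

messages = []
total = each_study_page(fetch_page) do |studies, page_number, running_total|
  messages << "page #{page_number}: #{studies.size} studies (#{running_total} overall)"
end
```

Injecting the fetcher keeps the paging logic separate from HTTP details, which also makes it straightforward to stub in specs.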

Parser changes:

  1. Switch from XML/XPath to the new JSON format for API responses (major changes).
  2. Don't re-fetch data for each study individually. Study data now loads with each "page" in the connector, and each study's data is passed to the parser when it is initialized (each parser instance handles the data for a single study).
  3. Change location matching from exact to substring (via regex). This fixes an issue where e.g. a location of "University of Minnesota/Cancer Center" would not match if the location in the site settings is "University of Minnesota". This was causing data for studies to be incorrectly omitted.
  4. New contacts algorithm: the v2 API no longer has "contact" and "backup contact".
  5. Misc. updates for API changes to the case of enumerated values, naming, nesting, etc.
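
As a rough illustration of item 3, substring matching via regex might be sketched as follows. `location_match?` and `site_locations` are hypothetical names, not the parser's actual API, and the matching here is case-sensitive, which is an assumption.

```ruby
# Hypothetical helper: true when any configured site location appears
# anywhere inside the facility name reported for a study.
def location_match?(facility_name, site_locations)
  return false if facility_name.nil?

  site_locations.any? do |site_location|
    # Regexp.escape keeps characters such as "(" or "." literal, so a
    # configured location is matched as plain text, not as a pattern.
    facility_name.match?(/#{Regexp.escape(site_location)}/)
  end
end

site_locations = ["University of Minnesota"]

# An exact comparison would reject the first name; substring matching accepts it.
matches_suffixed = location_match?("University of Minnesota/Cancer Center", site_locations)
matches_exact    = location_match?("University of Minnesota", site_locations)
matches_other    = location_match?("Mayo Clinic", site_locations)
```

One trade-off of substring matching is that a short configured location can match more facilities than intended, so the configured names need to be specific enough.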

Spec changes:
Added many tests and updated existing ones for ct.gov parser.

Rake task changes:
Update for new private connector API.

README changes:
Include update notes.

EpicureanHeron left a comment

I think this all looks OK aside from the commented-out line at line 118, "#process_locations(trail.id)".

Fix that and it should be OK.

lib/parsers/ctgov.rb (outdated)

uncomment line and update docker-compose
EpicureanHeron left a comment

I think it all looks good. I ran it locally and ran the tests and the rake task. Thanks!

machinehum merged commit 905cfc5 into main on Jul 24, 2024
1 check failed
machinehum deleted the ctgov-v2-api branch on July 24, 2024 14:47