Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Harmonize low-level parsers #53

Open
Bisaloo opened this issue Oct 24, 2023 · 3 comments
Open

Harmonize low-level parsers #53

Bisaloo opened this issue Oct 24, 2023 · 3 comments
Assignees
Milestone

Comments

@Bisaloo
Copy link
Member

Bisaloo commented Oct 24, 2023

The goal of readepi is to provide a unified interface to various health information system or common epidemiological databases. A key principle and the challenge is the harmonization of these very different data sources.

This harmony should also visible internally in low-level parsers to 1. facilitate the understanding of the package structure, thus making easier to maintain it and detect potential issues, 2. possibly eventually export these low-level parsers, as discussed in #45.

I see the following sources of inconsistency that should be addressed:

  • naming conventions. Function names should all be based on the same model. In particular, there should not be functions with names such as login() or make_api_request() when they only apply to a specific data source. This video from Felienne Hermans, a researcher in software engineering, is a good resource of this topic.
  • function signature. Right now, the same argument can be in different orders and have different name in different functions, which make it error-prone and require an extra verification step each time one wants to use these functions. A good way to enforce good practice here is to rely on roxygen's documentation inheritance. One example among others:

read_from_dhis2 <- function(base_url,
user_name,
password,

login <- function(username, password, base_url) {

  • argument defaults. On a related note, arguments do not consistently include (or not) default values. A global policy for the whole package should be picked and applied to all low-level parsers.
  • argument checking. At the moment, the same arguments are not validated as strictly across different functions. A good way to enforce good practices and coherence here could be to create checking helpers (such as assert_credentials(username, password), or assert_url())
  • return values. Some functions return list(data) where others return list(data, metadata). I understand metadata may be available only in some cases but the return values should be consistent across functions, even if it means returning list(data = data, metadata = NA). This can again be enforced via roxygen's documentation inheritance.
  • (patterns. This last one is more difficult but important as well for readability and maintainability. The exact same operation is performed via different patterns in different sections of the code. For example do.call(rbind.data.frame, ...) vs dplyr::bind_rows(...); paste(collapse) vs glue::glue_collapse(), passing request parameters to httr::GET() as a list or directly in the URL, etc.)
@Karim-Mane
Copy link
Member

Karim-Mane commented Oct 26, 2023

  • check whether all arguments have default values
  • check for arguments order consistency across functions
  • argument validation to be the same across functions
  • return list(data, metadata). use metadata = NA if not applicable
  • harmonise between do.call() and dplyr::bind_rows(), paste() and glue::glue_collapse()

@Bisaloo
Copy link
Member Author

Bisaloo commented Oct 26, 2023

Thanks for getting started on this. Please make each of these points a separate PR.

@Karim-Mane Karim-Mane self-assigned this Oct 27, 2023
@Karim-Mane
Copy link
Member

I have already made the changes for setting argument's default values and argument consistency.

I will make the pull request for these 2 now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants