-
Notifications
You must be signed in to change notification settings - Fork 15
RMassBank development task list
This page is supposed to be a loose collection of development proposals for RMassBank, which will be discussed in an upcoming dev meeting.
Use a h2 tag for every proposal. Descriptions may be as long or short as needed; make subsections with h3 if required.
Scope:
- narrow are incremental changes in the code, fixing bugs in a single function, etc.
- medium is the addition of new functionality, rewrites affecting a small number of functions
- broad regards changes that affect a large part of the code. This does not mean that this needs to be an all-or-nothing task (e.g. linting can be done incrementally).
Risk:
- minimal: This will be a Pareto improvement, hurting no one and helping at least one person. Also will not make the package more complex.
- medium: This is a change, but is unlikely to break existing functionality. Might need additional documentation and clarification
-
high: This could
- change existing behaviour for the same input
- change the user interface
- change an implementation in a way that needs to be verified (and since we don't have unit tests, needs to be done by hand)
- require changes that "make things worse" before "making things better" later
State:
- idea: proposal before detailed consideration, something to think about: The author does not necessarily already think this should be done. Mainly for collection / triage.
- proposal: The author thinks this should be done eventually. This should be discussed at the next dev meeting
- rejected proposal: Was discussed at a meeting and rejected.
- goal: Was discussed at a meeting and accepted as something that should eventually be done with no specific priorization.
- task: Was discussed at a meeting and specifically added to the list of things that are being worked on
After the task state, items should go to issues & milestones.
There is no automatic TOC here apparently, except for the bar at the right side :(
- Author: meowcat
- Scope: broad
- Risk: minimal
- State: task to be performed incrementally
Over the years we have amassed a lot of codestyle inconsistencies (and some of the things were bad to start with). We should fix bad practices.
- I propose to run
lintr
and fix everything it suggests, with prio 1 to the actual package code (R
directory). - Default
lintr
(and tidyverse, and most of modern R) style is snake_case. However Bioconductor style is camelCase for functions (and CamelCaps for S4 classes) and I suggest we stick with this because all our user-facing functions are camelCase. A lot of internal variables are snake_case or some other inconsistent mess, we should clean those up. - There is https://lcolladotor.github.io/biocthis/reference/bioc_style.html which might work.
- Author: meowcat
- Scope: broad
- Risk: medium if done incrementally.
- State: task to be performed incrementally
An awful lot of RMassBank functionality is a bad implementation of what Tidyverse already does well, very crufty ways of doing the equivalent of mutate
, bind_rows
, map
. We could cut out a lot of code and make the code much more readable if we rewrote the code to use tidyverse functionality.
- The one drawback is that we would add a dependency, since we technically don't yet depend on dplyr and purrr. However these packages are nearly globally installed nowadays.
- Ideally some test is added before making the change.
- Author: meowcat
- Scope: broad
- Risk: minimal
- State: goal
Old topic, but still not done. We should test more of our code.
- Maybe using doctests to test directly in our examples (https://hughjonesd.github.io/doctest/articles/doctest.html) would make things easier. My reservation is that the package seems to be quite new.
- There is also https://rorynolan.github.io/exampletestr/ but this does a one-time conversion.
- Author: meowcat
- Scope: medium
- Risk: medium
- State: proposal, was IMO not clearly decided upon
At Eawag/Uchem and at ETH, we already use wrapper scripts that extract EICs for all fragments and measure the correlation to the precursor, this is a second criterion in addition to formula assignment to get better discrimination between noise and signal. The current way is very hacky and uses attributes to tack stuff into the existing S4 objects.
I suggest to move this as an option to msmsRead
. The data should go directly into the child spectra, probably as a List column in the properties dataframe.
- Scope: narrow
- Risk: medium
- Author: meowcat
- State: proposal, was IMO not clearly decided upon
Existing issue: https://github.com/MassBank/RMassBank/issues/316
I have good experience with using milestones to get things done and to focus on important issues. Issue hoping is not very efficient.
- Author: tsufz
- Scope: broad
- Risk: low
- Scope: medium
- Risk: low
- Author: meowcat
- State: idea
A quick POC shows that we could use jinjar to render spectra using templates rather than by a handmade writer. This is not urgently needed and just one more alternative to using Spectra
-based IO, but worth a consideration.
Copyright (C) MassBank Consortium 2023
Authors: Michael Stravs, Tobias Schulze