RFC schedule and deployment plan #97

Open
Bankso opened this issue Apr 5, 2024 · 12 comments
Bankso commented Apr 5, 2024

To expand the MC2 data model, data type-specific metadata models should be assembled and reviewed by the community via the Request for Comments (RFC) process. One strategy for implementing this is as follows:

  • identify priority metadata models (assay or data type, processing level) and the key information we want to be able to provide to the community: common pipelines/processing, common repos, common access requirements, etc. Basically, any question posed about data reuse/sharing should be considered as content to ask about during the RFC
  • assemble schedule of RFC dates
  • hold an introductory webinar on the RFC process for the community, to discuss the guidelines, expected outcomes, rules of engagement, use case requirements, etc.

For each metadata model:

  • select key community members to email about the RFC
  • assemble version 0, based on existing models/standards (HTAN, NF, AD, CRDC, etc.)
  • have a sheet with the preliminary model and the RFC document available for contributors
  • have the storage repo included (type-specific if possible)
  • record common processing methods/pipelines
  • for QC metrics, indicate the pipeline or tool used and the source doc or output

Some priority data types:

  • biospecimen
  • tools
  • bulk RNA-seq levels 1-4
  • scRNA-seq levels 1-4
  • 10X Visium (or general spatial, if we can decide on how to put that together)
  • imaging/microscopy (recommend stitched images and then masks and analysis, avoiding the giant raw images where possible, unless required for secondary use)

Bankso self-assigned this Apr 5, 2024

Bankso commented Jul 3, 2024

Work here will be informed by #74, in particular: 1) the level of CDS/DataHub model integration we reach and 2) our perceived sufficiency of the CDS/DataHub models for MC2 use cases. If we primarily adopt the CDS/DataHub assay models, I'm not sure how critical RFCs will be.

@aclayton555

Expect this work to fall within the scope of #115, where it will be a specific output.

@aclayton555

Initial brainstorm with Milen, Ashley, and Orion: https://docs.google.com/document/d/1dF1-FjGSdO3nkKArEsrnjnWFLeOV78MlvGZvM8smJVk/edit

@aclayton555

mid-sprint:

  • start to work on design doc
  • schedule for RFC to be informed by what we are receiving (e.g. imaging data likely first)
  • need to start thinking about how we organize the data model (e.g. data types, processing levels)
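
One possible way to think about organizing models by data type and processing level is sketched below. This is only an illustration: the `MetadataModel` class and `expand` helper are hypothetical names, not part of the MC2 data model, and the data types are taken from the priority list at the top of this issue.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class MetadataModel:
    """Hypothetical record identifying one metadata model to put through an RFC."""
    data_type: str
    level: Optional[int] = None  # None for models without processing levels

    @property
    def name(self) -> str:
        if self.level is None:
            return self.data_type
        return f"{self.data_type} level {self.level}"


def expand(data_type: str, levels: Iterable[int] = ()) -> List[MetadataModel]:
    """Return one model per processing level, or a single model if there are none."""
    levels = list(levels)
    if not levels:
        return [MetadataModel(data_type)]
    return [MetadataModel(data_type, lv) for lv in levels]


# Data types and level ranges as listed in the opening comment of this issue.
models = (
    expand("biospecimen")
    + expand("bulk RNA-seq", range(1, 5))
    + expand("scRNA-seq", range(1, 5))
)

for model in models:
    print(model.name)
```

Keeping the level as a separate field (rather than baking it into the name) would let an RFC schedule be grouped either by data type or by level.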


Bankso commented Aug 30, 2024

Work completed:

  • RFC tool design doc
  • RFC tool mockups
  • Data model visualizer for MC2
  • RFC tool app repo

Next steps:

  • refine app UI
  • finalize functions required to handle form data + GitHub integration
  • build API
  • build GitHub Actions to handle API calls

aclayton555 commented Aug 30, 2024

24-7/8 close-out:

  • @Bankso will do a bit of tidying up and share the app prototype with Milen to assess whether support is available to improve the data model visualization tool, and what the timelines would be. If immediate support is not available, we could link users to the data model docs for more information. This is what Aditi has done in HTAN.
  • Ensure we build in a robust method for pulling in the latest version of the data model (should be able to do this similar to how the docs site works).
  • Think about the (external) user experience and instructions for users
  • Incorporate "Attribute Required?" true/false

Roll this into the next sprint to complete next steps. A target will be to use this in a pilot with the Tools schema. An alternative could be biospecimen. Target timeline: October.
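
As a rough sketch of the "Attribute Required?" item above, the snippet below reads a data model CSV and returns a true/false flag per attribute. It assumes the model is available as CSV text with `Attribute` and `Required` columns; those column names and the sample rows are assumptions for illustration, not confirmed against the MC2 data model.

```python
import csv
import io
from typing import Dict

# Hypothetical sample rows; real attributes would come from the latest data model CSV.
SAMPLE_MODEL_CSV = """Attribute,Description,Required
Tool Name,Name of the tool,TRUE
Tool Homepage,Link to the tool homepage,FALSE
"""


def required_flags(model_csv: str) -> Dict[str, bool]:
    """Map each attribute name to whether the model marks it as required."""
    reader = csv.DictReader(io.StringIO(model_csv))
    return {
        row["Attribute"]: row["Required"].strip().lower() == "true"
        for row in reader
    }


flags = required_flags(SAMPLE_MODEL_CSV)
for attribute, required in flags.items():
    print(f"{attribute}: required={required}")
```

In practice the CSV text would be fetched from wherever the latest model version is published, so the RFC form always reflects the current model.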

aclayton555 commented Sep 4, 2024

24-9: secondary to site visit priorities

@aclayton555

During home week, discussed with Adam T leveraging established tools to deploy this (rather than building a new app). Use GitHub integrations and think about keeping the front end relatively low lift (e.g. a Google Sheet with GitHub integration).

Schedule for RFCs - pick one to start: Tools schema is ready to pilot with this.

Aim to have this ready to pilot end of Nov 2024

aditigopalan self-assigned this Oct 31, 2024

@aclayton555

24-10 close out: We don't want to build an app. Orion has been digging around and maybe found an existing solution (link here). Aditi may be able to pick this up and work with this.


Bankso commented Oct 31, 2024

Potential repository that could help us integrate Google Sheets with GitHub: https://github.com/mahaker/gas-github

aclayton555 commented Nov 1, 2024

24-11/12: Build solution and potentially deploy Tools schema. Need to dig into the package a bit and see if this will work. Focus here is on the linkage between Google Sheets and GitHub.

Would existing functionality that the DCA leverages to generate Google sheets be helpful here?

aclayton555 commented Dec 12, 2024

24-11/12: With the sunset of DCA, we will likely need to use schematic (not DCA) to generate sheets. We also anticipate that LinkML will be able to help with this once we have our data model converted, which might open up a new route for us.

Approaches:

  • Can leverage a Google Sheets approach now (integration with GitHub that shows changes in the CSV)
  • Longer term (or we can wait), make this LinkML-compatible

Suggest that we proceed with deploying a Google Sheets approach. Consider a phased approach to pilot this, but with longer-term compatibility with LinkML in mind.
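
To illustrate what "shows changes in the CSV" could look like, here is a minimal sketch that compares two CSV exports of a sheet and reports added, removed, and changed attributes. It is not tied to any particular Google Sheets or GitHub integration; the `Attribute` key column, the `diff_models` helper, and the sample rows are all assumptions for illustration.

```python
import csv
import io


def load_rows(csv_text, key="Attribute"):
    """Index the CSV rows by the value in the key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def diff_models(old_csv, new_csv, key="Attribute"):
    """Summarize differences between two CSV versions of a model."""
    old, new = load_rows(old_csv, key), load_rows(new_csv, key)
    changed = {}
    for attr in sorted(old.keys() & new.keys()):
        columns = sorted(set(old[attr]) | set(new[attr]))
        # Record (old value, new value) for every column that differs.
        fields = {
            col: (old[attr].get(col), new[attr].get(col))
            for col in columns
            if old[attr].get(col) != new[attr].get(col)
        }
        if fields:
            changed[attr] = fields
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": changed,
    }


# Hypothetical before/after exports of a Tools sheet.
OLD = """Attribute,Description,Required
Tool Name,Name of the tool,TRUE
Tool Version,Version string,FALSE
"""
NEW = """Attribute,Description,Required
Tool Name,Name of the software tool,TRUE
Tool Homepage,Link to the homepage,FALSE
"""

result = diff_models(OLD, NEW)
print(result)
```

A summary like this could be posted alongside the raw CSV diff so that RFC reviewers see proposed changes per attribute rather than per line.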
