forked from SimonXIX/quarto_semanticclimate
-
Notifications
You must be signed in to change notification settings - Fork 3
/
workflow.qmd
35 lines (23 loc) · 2.89 KB
/
workflow.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
---
title: Prototype Next Steps & Workflows
filters:
- lightbox
lightbox: auto
---
## Prototype Next Steps & Workflows
![Workflow: City - Open Climate Reader](workflows/open climate reader workflow.drawio.png "Workflow: City - Open Climate Reader")
![Workflow: AI Open Climate Reader](workflows/AI open climate reader workflow.drawio.png "Workflow: AI Open Climate Reader")
See diagrams here: [https://github.com/semanticClimate/city-open-climate-reader/tree/main/workflows](https://github.com/semanticClimate/city-open-climate-reader/tree/main/workflows)
## Next steps?
If you’re interested in helping out on the next round of prototyping by volunteering or contributing resources check out the semanticClimate [discussion](https://github.com/petermr/semanticClimate/discussions/32) and [project tasks](https://github.com/orgs/semanticClimate/projects/3) on GitHub.
### A working prototype for Climate Reader
The round of work from the hackathon allowed the team to see the gaps in the system and what would need to be put in place to have a working prototype — meaning we now have a roadmap. There are four priority areas to work on for a working prototype:
1. How to smoothly move contributors' questions into the search and content retrieval process?
2. How to create citation links to sections of the source material which are only held as PDFs and so can’t be targeted on a granular basis? [Wikibase](https://semanticclimate.org/IPCC-Queries/) has already been used by semanticClimate and is one route forward for this.
3. Further automating the conversion of HTML content to Markdown.
4. Adding a review process to the collated reader content.
### AI prototype (self-hosted)
AI Climate Reader is a proof of concept software prototype that can query scientific corpora and create ‘reader’ publications combining AI with added human review. The types of 'reader’ publications it would make are: Referenced summaries, suggestions for literature, and guides on specific topics. The publications would have all the references collated as full text in the publication.
1. Deposit content in a Wikibase instance as with the [AR6 IPCC Report](https://semanticclimate.org/IPCC-Queries/). New Wikibase instances are supplied by [Wikibase4Research](https://nfdi4culture.de/services/details/wikibase4research.html).
2. Then self-hosted AI would be used to create the publications — example AI models are [falcon-7b](https://huggingface.co/h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3), [falcon-40b-v2](https://huggingface.co/h2oai/h2ogpt-gm-oasst1-en-2048-falcon-40b-v2), [vicuna-33b-v1.3](https://huggingface.co/lmsys/vicuna-33b-v1.3).
3. Review and multi-format publishing is then run to create a reproducible and replicable open science, semantic publication on GitHub/Lab Pages using the [ADA Semantic Publishing Pipeline](https://github.com/TIBHannover/ADA) which harnesses Fidus Writer and its Open Journals System JATS editor plugin.