introduction: shorten text
miltondp committed Mar 26, 2024
1 parent ec91858 commit 6d199c0
Showing 2 changed files with 54 additions and 105 deletions.
33 changes: 8 additions & 25 deletions content/02.introduction.md
@@ -1,30 +1,13 @@
## Introduction

The tradition of scholarly writing dates back thousands of years, evolving significantly with the advent of scientific journals approximately 350 years ago [@isbn:0810808447].
External peer review, used by many journals, is even more recent, having been around for less than 100 years [@doi:10/d26d8b].
Most manuscripts are written by individuals or teams working together to describe new advances, summarize existing literature, or argue for changes in the status quo.
However, scholarly writing is a time-consuming process in which the results of a study are presented using a specific style and format.
Academics can sometimes be long-winded in getting to key points, making their writing more impenetrable to their audience [@doi:10.1038/d41586-018-02404-4].

Scholarly writing has evolved since the first scientific journals 350 years ago, adopting practices like external peer review in the last century [@isbn:0810808447; @doi:10/d26d8b].
It often involves dense language to convey new advances or literature summaries [@doi:10.1038/d41586-018-02404-4].
Meanwhile, recent computing advances have enabled large language models (LLMs) like OpenAI's GPT-3 and GPT-4 [@arxiv:2005.14165], revolutionizing technologies and applications in various fields, including medical informatics and scientific communication [@doi:10.1093/jamia/ocad072; @doi:10.1093/jamia/ocad245].
These models promise to streamline scientific writing [@doi:10.1038/d41586-022-03479-w], though their use raises accuracy and ethical concerns [@doi:10.1093/jamia/ocad104; @doi:10.1093/jamia/ocad091].

Recent advances in computing capabilities and the widespread availability of text, images, and other data on the internet have laid the foundation for artificial intelligence (AI) models with billions of parameters.
Large language models (LLMs), in particular, are opening the floodgates to new technologies with the capability to transform how society operates [@arxiv:2102.02503].
OpenAI's models, for instance, have been trained on vast amounts of data and can generate human-like text [@arxiv:2005.14165].
These models are based on the transformer architecture, which uses self-attention mechanisms to model the complexities of language.
The most well-known of these models is the Generative Pre-trained Transformer (GPT-3, and more recently, GPT-4), which is highly effective for a range of language tasks such as generating text, completing code, and answering questions [@arxiv:2005.14165].
In the realm of medical informatics, scientists are beginning to explore the utility of these tools in optimizing clinical decision support [@doi:10.1093/jamia/ocad072] or assessing its potential to reduce health disparities [@doi:10.1093/jamia/ocad245], while also raising concerns about their impact on medical education [@doi:10.1093/jamia/ocad104] and the importance of keeping the human aspect central in AI development and application [@doi:10.1093/jamia/ocad091].
These tools have also been used to enhance scientific communication [@doi:10.1038/d41586-022-03479-w].
This technology has the potential to revolutionize how scientists write and revise scholarly manuscripts, saving time and effort and enabling researchers to focus on more high-level tasks such as data analysis and interpretation.
However, using LLMs in research has sparked controversy, primarily due to their propensity to generate plausible yet factually incorrect or misleading information.


In this work, we present a human-centric approach for using AI in manuscript writing, where scholarly text, initially created by humans, is revised through edit suggestions from LLMs and is ultimately reviewed and approved by humans.
This approach mitigates the risk of generating misleading information while still providing the benefits of AI-assisted writing.
We developed an AI-assisted revision tool that implements this approach and builds on the Manubot infrastructure for scholarly publishing [@doi:10.1371/journal.pcbi.1007128], a platform designed to enable both individual and large-scale collaborative projects [@doi:10.1098/rsif.2017.0387; @pmid:34545336].
Our tool, named the Manubot AI Editor, parses the manuscript, utilizes an LLM with section-specific prompts for revision, and then generates a set of suggested changes to be integrated into the main document.
These changes are presented to the user through the GitHub interface for review.
During prompt engineering, we developed unit tests to ensure that the AI revisions met a minimum set of quality measures.
For an end-to-end evaluation, we applied our tool to five manuscripts that included sections of varying complexity.
We performed 1) a human evaluation by manually reviewing the AI revisions, and 2) an automatic evaluation using the LLM-as-a-Judge technique [@arxiv:2306.05685] to assess the quality of the AI revisions.
Our findings indicate that, in most cases, the models could maintain the original meaning of the text, improve the writing style, and even interpret mathematical expressions.
Officially part of the Manubot platform, our Manubot AI Editor can be readily incorporated into Manubot-based manuscripts, and we anticipate it will help authors more effectively communicate their work.
We introduce a human-centric AI method for scholarly writing, leveraging LLMs for draft revision within the Manubot platform, a tool for collaborative publishing [@doi:10.1371/journal.pcbi.1007128].
Here, we propose the Manubot AI Editor, which suggests revisions via GitHub, balancing AI's efficiency with human oversight to ensure accuracy.
Tested on five manuscripts, we found it maintained the original meaning, improved style, and handled complex expressions, proving a valuable addition to the Manubot suite.
We anticipate our tool will help authors more effectively communicate their work.