GROBID detects funding information, but funder is left out in tei xml output #1144

Open
mariadelmarq opened this issue Jul 25, 2024 · 5 comments
Labels: error cases (Some error/test case for future improvements)
Comments

@mariadelmarq

Hi,

Me again, sorry! I have another potential error case where the funder is correctly identified, but the name of the funder is left out of the TEI/XML output in a Cambridge University Press article. See screenshots below:

From the PDF:
[image]

Resulting TEI XML:
[image]

Because we're using NLP techniques to find the funding statements in the text of the article (I acknowledge we are potentially duplicating what GROBID is attempting to do), the missing name makes it really hard to identify the funder, or even to tell that there is a funding statement at all. Grateful for any ideas/advice!
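
In case it helps anyone triaging similar output, here is a minimal sketch of how one might flag TEI files where a funder element exists but its name is empty. It rests on assumptions that should be checked against real GROBID output: that funders appear as `funder` elements and that the funding statement sits in a `div` with `type="funding"`. The helper `funder_report` and the sample XML are made up for illustration and are not part of GROBID or its Python client.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def funder_report(tei_xml):
    """Return (funder_names, has_funding_div) for one TEI document string.

    Assumption: funders are encoded as <funder> elements and the funding
    statement as <div type="funding">. Verify against your own output.
    """
    root = ET.fromstring(tei_xml)
    names = []
    has_funding_div = False
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "funder":
            # Collapse all nested text; an empty string means no name was emitted.
            names.append(" ".join("".join(elem.itertext()).split()))
        elif name == "div" and elem.get("type") == "funding":
            has_funding_div = True
    return names, has_funding_div


# Hand-written sample (not real GROBID output): one named funder, one empty.
sample = f"""<TEI xmlns="{TEI_NS}">
  <teiHeader><fileDesc><titleStmt>
    <funder ref="#_x1"><orgName type="full">Example Research Council</orgName></funder>
    <funder ref="#_x2"/>
  </titleStmt></fileDesc></teiHeader>
  <text><back><div type="funding"><p>Supported by the Example Research Council.</p></div></back></text>
</TEI>"""

names, has_div = funder_report(sample)
empty_funders = [n for n in names if not n]
print(names, has_div, len(empty_funders))
```

A file with a funding `div` or a `funder` element but an empty name would be a candidate for the error case described here.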

@lfoppiano
Collaborator

@mariadelmarq thanks again for reporting this issue, feel free to send me the source via email.

@lfoppiano
Collaborator

@mariadelmarq which grobid version/environment/OS are you using?

@mariadelmarq
Author

mariadelmarq commented Jul 25, 2024

Linux OS (Gnome Classic Desktop), running GROBID 0.8.0 via Docker with:

docker run --rm --init --ulimit core=0 -p 8070:8070 grobid/grobid:0.8.0

I installed the python client and in my script have:

from grobid_client.grobid_client import GrobidClient

# fulltext_dir (input PDFs) and grobid_path (output directory) are set earlier in the script
client = GrobidClient(config_path="./config.json")
client.process("processFulltextDocument", fulltext_dir, output=grobid_path)

@lfoppiano
Collaborator

lfoppiano commented Jul 25, 2024

Thanks. I've checked, and this bug is going to be fixed in the upcoming version 0.8.1. We can leave the issue open, and I will double-check after the release.

lfoppiano added the "error cases" label (Some error/test case for future improvements) on Jul 25, 2024
lfoppiano added this to the 0.8.1 milestone on Jul 25, 2024

@mariadelmarq
Author

Brilliant, thanks heaps!!!
