Benchmark comparison between LLMs and GROBID for scholarly PDF data extraction #1146

Open
IlievskiV opened this issue Jul 31, 2024 · 0 comments

Hello everyone,

First of all, thank you for your time and for the valuable work you're doing on GROBID.

Does a benchmark exist that compares multimodal LLMs with GROBID, in both its Conditional Random Fields (CRF) and Deep Neural Network (DNN) variants?

LLMs are becoming more capable on multimodal inputs (for example GPT-4 with vision, or GPT-4o), so it is now possible to parse PDFs and produce structured output with an LLM.

However, before considering if it is worth going in that direction, I was wondering if there's any comparison between any multimodal LLM and GROBID. If such a benchmark doesn't exist, do you have any insights or opinions on how these approaches might compare?

Thanks in advance! Any information would be greatly appreciated!
