Introduction

This repo demonstrates how to use the Unstructured library with Weaviate. The Unstructured library parses a variety of data sources and extracts structured text from them. Supported inputs include, but are not limited to, PDFs, PowerPoint presentations, and JPEG images.
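As a rough illustration of that parsing step, the sketch below assumes the `unstructured` package is installed with the extras needed for your file type. The helper names and the `data/paper.pdf` path are placeholders for illustration, not names taken from this repo's notebook.

```python
from collections import Counter


def partition_document(path):
    """Parse a file into a list of Unstructured elements."""
    # Imported lazily so the helpers below can be used without the library installed.
    from unstructured.partition.auto import partition

    # partition() detects the file type and dispatches to the matching parser.
    return partition(filename=path)


def summarize_elements(elements):
    """Count how many elements of each category (Title, NarrativeText, ...) were found."""
    return Counter(el.category for el in elements)


def narrative_text(elements):
    """Keep only the body-text elements and return their text."""
    return [el.text for el in elements if el.category == "NarrativeText"]


# Example usage (hypothetical path):
# elements = partition_document("data/paper.pdf")
# print(summarize_elements(elements))
# print(narrative_text(elements)[:3])
```

Each element carries its text plus a category label, which is what makes it possible to separate titles from body text before loading anything into a database.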

The included dataset consists of two publicly available research papers. One paper uses a single-column layout, and the other uses a two-column layout. The notebook starts with a basic approach to using Unstructured and finishes with an end-to-end example: connecting to your Weaviate instance, defining your schema, uploading the data, and then running two queries.
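The end-to-end flow can be sketched roughly as follows. This is a minimal sketch under several assumptions: it uses the v4 `weaviate-client` API against a local instance, while the notebook may use a different client version, and the collection name `Document`, the property names, and both helper functions are hypothetical. It runs a single keyword (BM25) query, which needs no vectorizer module; the notebook's two queries may differ.

```python
def to_objects(chunks, source):
    """Turn (section title, text) pairs into property dicts ready for upload."""
    return [
        {"source": source, "section": title, "content": text}
        for title, text in chunks
        if text.strip()  # skip empty chunks
    ]


def ingest_and_query(objects, query, collection_name="Document"):
    """Create a collection, upload the objects, and run one keyword query."""
    # Lazy imports: requires weaviate-client v4 and a running local Weaviate instance.
    import weaviate
    from weaviate.classes.config import DataType, Property

    client = weaviate.connect_to_local()
    try:
        # Define the schema. This call fails if the collection already exists.
        collection = client.collections.create(
            name=collection_name,
            properties=[
                Property(name="source", data_type=DataType.TEXT),
                Property(name="section", data_type=DataType.TEXT),
                Property(name="content", data_type=DataType.TEXT),
            ],
        )
        # Upload all objects in one batched request.
        collection.data.insert_many(objects)
        # Keyword search over the uploaded text.
        response = collection.query.bm25(query=query, limit=3)
        return [obj.properties for obj in response.objects]
    finally:
        client.close()


# Example usage (hypothetical data):
# objs = to_objects([("Abstract", "We study ...")], source="paper.pdf")
# print(ingest_and_query(objs, "study"))
```

Keeping the conversion to plain dictionaries separate from the database calls makes it easy to inspect what will be uploaded before connecting to Weaviate.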

Read the blog post for more information!

dalty999/how-to-ingest-pdfs-with-unstructured
