Commit

Merge pull request #24 from CambioML/dev
Rebrand and bump version
CambioML authored May 6, 2024
2 parents fc1b9fe + e933184 commit c036739
Showing 18 changed files with 79 additions and 79 deletions.
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug-report.yml
@@ -5,7 +5,7 @@ body:
- type: markdown
attributes:
value: >
#### Before submitting a bug, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/CambioML/open-parser/issues?q=is%3Aissue+sort%3Acreated-desc+).
#### Before submitting a bug, please make sure the issue hasn't been already addressed by searching through [the existing and past issues](https://github.com/CambioML/any-parser/issues?q=is%3Aissue+sort%3Acreated-desc+).
- type: textarea
attributes:
label: 🐛 Describe the bug
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/documentation.yml
@@ -1,12 +1,12 @@
name: 📚 Documentation
description: Report an issue related to https://www.cambioml.com/docs/open-parser/index.html
description: Report an issue related to https://www.cambioml.com/docs/any-parser/index.html

body:
- type: textarea
attributes:
label: 📚 The doc issue
description: >
A clear and concise description of what content in https://www.cambioml.com/docs/open-parser/index.html is an issue.
A clear and concise description of what content in https://www.cambioml.com/docs/any-parser/index.html is an issue.
validations:
required: true
- type: textarea
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/feature-request.yml
@@ -1,5 +1,5 @@
name: 🚀 Feature request
description: Submit a proposal/request for a new open-parser feature
description: Submit a proposal/request for a new any-parser feature

body:
- type: textarea
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -26,7 +26,7 @@ repos:
- "--remove-duplicate-keys"
- "--remove-unused-variables"
- "--remove-all-unused-imports"
exclude: "open-parser/__init__.py"
exclude: "any-parser/__init__.py"

# run all unittests
- repo: local
22 changes: 11 additions & 11 deletions README.md
@@ -1,10 +1,10 @@
# 🌊 OpenParse
# 🌊 AnyParser

OpenParse provides an API to accurately extract your unstructured data (e.g. PDF, images, charts) into structured format.
AnyParser provides an API to accurately extract your unstructured data (e.g. PDF, images, charts) into structured format.

## :seedling: Set up your OpenParser API key
## :seedling: Set up your AnyParser API key

OpenParse is still in private beta. If you are interested in testing our document models, please reach out at info@cambioml.com for a FREE API key.
AnyParser is still in private beta. If you are interested in testing our document models, please reach out at info@cambioml.com for a FREE API key.


To set up your API key `CAMBIO_API_KEY`, you will need to :
@@ -18,13 +18,13 @@ To set up your API key `CAMBIO_API_KEY`, you will need to :
## :computer: Installation

```
conda create -n openparse python=3.10 -y
conda activate openparse
pip3 install open_parser
conda create -n any-parse python=3.10 -y
conda activate any-parse
pip3 install any-parser
```

## :bashfile usage
To use OpenParse via `curl` requests, you can run the following bash command from the root folder of this repository:
To use AnyParser via `curl` requests, you can run the following bash command from the root folder of this repository:
```
bash parse.sh <your apiKey> <file path> <prompt for parse (optional, default="")>
```
@@ -36,10 +36,10 @@ bash parse.sh gl************************************** /path/to/your/file.pdf "

## :scroll: Examples

OpenParse can extract text, numbers and symbols from PDF, images, etc. Check out each notebook below to run OpenParse within 10 lines of code!
AnyParser can extract text, numbers and symbols from PDF, images, etc. Check out each notebook below to run AnyParser within 10 lines of code!

### [Prompt to Extract Key-values into JSON from W2 (PDF)](https://github.com/CambioML/open-parser/blob/main/examples/prompt_to_extract_table_from_pdf_to_json.ipynb)
### [Prompt to Extract Key-values into JSON from W2 (PDF)](https://github.com/CambioML/any-parser/blob/main/examples/prompt_to_extract_table_from_pdf_to_json.ipynb)
Do you want to extract key-values from a W2 PDF into JSON format? Check out this notebook (3-min read)!

### [Extract a Table from an Image into Markdown Format](https://github.com/CambioML/open-parser/blob/main/examples/extract_table_from_image_to_markdown.ipynb)
### [Extract a Table from an Image into Markdown Format](https://github.com/CambioML/any-parser/blob/main/examples/extract_table_from_image_to_markdown.ipynb)
Are you a financial analyst who need to extract ACCURATE number from a table in an image or a PDF. Check out this notebook (3-min read)!
5 changes: 5 additions & 0 deletions any_parser/__init__.py
@@ -0,0 +1,5 @@
from any_parser.base import AnyParser

__all__ = ["AnyParser"]

__version__ = "0.0.8"
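
Since this commit renames the import package from `open_parser` to `any_parser`, code that must run against either release could probe for whichever one is installed before importing. The following is only a minimal sketch under that assumption: `installed_parser_module` is a hypothetical helper for illustration and is not part of the library.

```python
import importlib.util


def installed_parser_module(candidates=("any_parser", "open_parser")):
    """Return the first importable module name from candidates, or None.

    The default candidates are the new and old package names from this
    commit; the helper itself is illustrative and not shipped by the library.
    """
    for name in candidates:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    found = installed_parser_module()
    print(found if found else "neither any_parser nor open_parser is installed")
```

Listing the new name first means an environment that still has both packages installed prefers `any_parser`.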
2 changes: 1 addition & 1 deletion open_parser/base.py → any_parser/base.py
@@ -13,7 +13,7 @@
)


class OpenParser:
class AnyParser:
def __init__(self, apiKey) -> None:
self._uploadurl = CAMBIO_UPLOAD_URL
self._extracturl = CAMBIO_EXTRACT_URL
File renamed without changes.
@@ -6,11 +6,11 @@
"source": [
"# Prompt to Extract Key-values into JSON from CBC Reports (Image) using advanced mode\n",
"\n",
"Below it's an example of using OpenParser to extract key-values from a medical CBC report into JSON format. (Note: the model is still in beta and is NOT robust enough to generate the same output. Please bear with it!)\n",
"Below it's an example of using AnyParser to extract key-values from a medical CBC report into JSON format. (Note: the model is still in beta and is NOT robust enough to generate the same output. Please bear with it!)\n",
"\n",
"### 1. Load the libraries\n",
"\n",
"If you have install `open_parser`, uncomment the below line."
"If you have install `any_parser`, uncomment the below line."
]
},
{
@@ -20,7 +20,7 @@
"outputs": [],
"source": [
"# !pip3 install python-dotenv\n",
"# !pip3 install --upgrade open_parser\n",
"# !pip3 install --upgrade any-parser\n",
"# !pip3 install pandas\n",
"# !pip3 install jinja2"
]
@@ -35,7 +35,7 @@
"import pandas as pd\n",
"\n",
"from dotenv import load_dotenv\n",
"from open_parser import OpenParser\n",
"from any_parser import AnyParser\n",
"from IPython.display import Image\n",
"from medical_cbc_report_data.expected_result import expected_result\n"
]
@@ -44,7 +44,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Set up your OpenParser API key\n",
"### 2. Set up your AnyParser API key\n",
"\n",
"To set up your `CAMBIO_API_KEY` API key, you will:\n",
"\n",
@@ -71,9 +71,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3. Load sample data and Run OpenParser\n",
"### 3. Load sample data and Run AnyParser\n",
"\n",
"OpenParser supports both image and PDF. First let's load a sample data to test OpenParser's capabilities."
"AnyParser supports both image and PDF. First let's load a sample data to test AnyParser's capabilities."
]
},
{
@@ -147,7 +147,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Finally, we set up our prompt and run OpenParser on the data."
"Finally, we set up our prompt and run AnyParser on the data."
]
},
{
@@ -184,7 +184,7 @@
"\n",
"Return all values as strings and always use `.` to separate decimals.\n",
"\"\"\"\n",
"op = OpenParser(api_key)\n",
"op = AnyParser(api_key)\n",
"qa_results = []\n",
"for file in report_files:\n",
" qa_result = op.parse(report_folder + file, prompt, mode=\"advanced\")\n",
@@ -311,7 +311,7 @@
"metadata": {},
"source": [
"## Output Analysis\n",
"Now, we will analyze the output and compare it with the expected result. We'll take note of any missing keys, any additional keys added by OpenParser, and any incorrect values."
"Now, we will analyze the output and compare it with the expected result. We'll take note of any missing keys, any additional keys added by AnyParser, and any incorrect values."
]
},
{
@@ -434,7 +434,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "open-parser",
"display_name": "any-parser",
"language": "python",
"name": "python3"
},
20 changes: 10 additions & 10 deletions examples/extract_table_from_image_to_markdown.ipynb
@@ -6,11 +6,11 @@
"source": [
"# Extract a Table from an Image into Markdown Format\n",
"\n",
"Below it's a simple example of using OpenParser to accurately extract a table from an image into markdown format.\n",
"Below it's a simple example of using AnyParser to accurately extract a table from an image into markdown format.\n",
"\n",
"### 1. Load the libraries\n",
"\n",
"If you have install `open_parser`, uncomment the below line."
"If you have install `any_parser`, uncomment the below line."
]
},
{
@@ -20,7 +20,7 @@
"outputs": [],
"source": [
"# !pip3 install python-dotenv\n",
"# !pip3 install --upgrade open_parser"
"# !pip3 install --upgrade any-parser"
]
},
{
@@ -32,14 +32,14 @@
"import os\n",
"from dotenv import load_dotenv\n",
"from IPython.display import Image, Markdown\n",
"from open_parser import OpenParser\n"
"from any_parser import AnyParser\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Set up your OpenParser API key\n",
"### 2. Set up your AnyParser API key\n",
"\n",
"To set up your `CAMBIO_API_KEY` API key, you will:\n",
"\n",
@@ -68,7 +68,7 @@
"source": [
"### 3. Load the test sample data\n",
"\n",
"Now let's load a sample data to test OpenParser's capabilities. OpenParser supports both image and PDF. \n",
"Now let's load a sample data to test AnyParser's capabilities. AnyParser supports both image and PDF. \n",
"\n",
"Let's visualize the sample image first!"
]
@@ -99,9 +99,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4. Run OpenParser and Visualize the Markdown Output\n",
"### 4. Run AnyParser and Visualize the Markdown Output\n",
"\n",
"We will run OpenParser on our sample data and then display it in the Markdown format."
"We will run AnyParser on our sample data and then display it in the Markdown format."
]
},
{
@@ -140,7 +140,7 @@
}
],
"source": [
"op = OpenParser(example_apikey)\n",
"op = AnyParser(example_apikey)\n",
"content_result = op.extract(example_local_file)\n",
"\n",
"for content in content_result:\n",
@@ -163,7 +163,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "open-parser",
"display_name": "any-parser",
"language": "python",
"name": "python3"
},
20 changes: 10 additions & 10 deletions examples/prompt_to_extract_table_from_pdf_to_json.ipynb
@@ -6,11 +6,11 @@
"source": [
"# Prompt to Extract Key-values into JSON from W2 (PDF)\n",
"\n",
"Below it's an example of using OpenParser to extract key-values from a W2 PDF into JSON format. (Note: the model is still in beta and is NOT robust enough to generate the same output. Please bear with it!)\n",
"Below it's an example of using AnyParser to extract key-values from a W2 PDF into JSON format. (Note: the model is still in beta and is NOT robust enough to generate the same output. Please bear with it!)\n",
"\n",
"### 1. Load the libraries\n",
"\n",
"If you have install `open_parser`, uncomment the below line."
"If you have install `any_parser`, uncomment the below line."
]
},
{
@@ -20,7 +20,7 @@
"outputs": [],
"source": [
"# !pip3 install python-dotenv\n",
"# !pip3 install --upgrade open_parser"
"# !pip3 install --upgrade any-parser"
]
},
{
@@ -33,14 +33,14 @@
"import pandas as pd\n",
"\n",
"from dotenv import load_dotenv\n",
"from open_parser import OpenParser\n"
"from any_parser import AnyParser\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Set up your OpenParser API key\n",
"### 2. Set up your AnyParser API key\n",
"\n",
"To set up your `CAMBIO_API_KEY` API key, you will:\n",
"\n",
@@ -67,11 +67,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3. Load sample data and Run OpenParser\n",
"### 3. Load sample data and Run AnyParser\n",
"\n",
"OpenParser supports both image and PDF. First let's load a sample data to test OpenParser's capabilities.\n",
"AnyParser supports both image and PDF. First let's load a sample data to test AnyParser's capabilities.\n",
"\n",
"Now we can run OpenParser on our sample data and then display it in the Markdown format."
"Now we can run AnyParser on our sample data and then display it in the Markdown format."
]
},
{
@@ -92,7 +92,7 @@
"example_local_file = \"./sample_data/test1.pdf\"\n",
"example_prompt = \"Return table in a JSON format with each box's key and value.\"\n",
"\n",
"op = OpenParser(example_apikey)\n",
"op = AnyParser(example_apikey)\n",
"# mode can be \"basic\" or \"advanced\"\n",
"qa_result = op.parse(example_local_file, example_prompt, mode=\"basic\")\n"
]
@@ -237,7 +237,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "open-parser",
"display_name": "any-parser",
"language": "python",
"name": "python3"
},
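
The hunks above apply the same handful of substitutions across YAML, Markdown, Python and notebook files. For downstream projects that still reference the old names, the rename could be sketched as a small text rewrite. This is a hedged illustration only: the `RENAMES` table is read off this diff, and `rebrand` is a hypothetical helper, not a tool used by the authors.

```python
import re

# Old-to-new names as they appear in this commit's diff.
RENAMES = {
    "OpenParser": "AnyParser",    # class name
    "OpenParse": "AnyParser",     # product name used in the old README
    "open_parser": "any_parser",  # import package
    "open-parser": "any-parser",  # repository, docs path, kernel display name
    "openparse": "any-parse",     # conda environment name
}

# Try longer names first so "OpenParser" is matched before its prefix "OpenParse".
_PATTERN = re.compile(
    "|".join(re.escape(old) for old in sorted(RENAMES, key=len, reverse=True))
)


def rebrand(text: str) -> str:
    """Replace every old name in text with its new counterpart (case-sensitive)."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(0)], text)


if __name__ == "__main__":
    print(rebrand("from open_parser import OpenParser"))
```

One difference from the commit: the diff writes the install command as `pip3 install any-parser`, while this sketch would turn `pip3 install open_parser` into `pip3 install any_parser`. pip normalizes hyphens and underscores in project names, so both spellings refer to the same project.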