-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ocr showcase #866
Ocr showcase #866
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Let's always put the images to the
asset
branch, besides the size problem, there are two more reasons: 1. git is also not good at managing binary data 2. the asset branch won't subject to frequent change like master (so the url will be more stable) - I have attempted to clean up the payload structure in another branch. It is not finished and I deliberately make a wrong test case, but some changes there may help you.
- It feels like there are a lot of payload basic not ready to finish up this OCR showcase.
@@ -47,13 +46,13 @@ | |||
from forte.data.ontology.core import EntryType | |||
from forte.data.ontology.top import ( | |||
Annotation, | |||
Grids, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Grids
is still an entry? I thought we discuss that it should be a data structure. Or is that change not merged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not merged yet and it's in #827
@@ -244,7 +243,7 @@ def text(self) -> str: | |||
@property | |||
def audio(self) -> Optional[np.ndarray]: | |||
r"""Return the audio of the data pack""" | |||
return self.get_payload_data_at(Modality.Audio, 0) | |||
return cast(np.ndarray, self.get_payload_data_at(Modality.Audio, 0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this cast
is not safe, not everything can be cast this way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a typing issue mypy forte
, and I haven't find a better way to solve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I don't think we should assume audio
to be ndarray
@@ -569,7 +565,7 @@ def set_text( | |||
# temporary solution for backward compatibility | |||
# past API use this method to add a single text in the datapack | |||
if len(self.text_payloads) == 0 and text_payload_index == 0: | |||
from ft.onto.base_ontology import ( # pylint: disable=import-outside-toplevel | |||
from ft.onto.payload_ontology import ( # pylint: disable=import-outside-toplevel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we importing locally here, these should be imported top level
"description": "A payload that caches image data", | ||
"attributes":[] | ||
}, | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's make the indent aligned.
"type": "str" | ||
}, | ||
{"name": "mime", | ||
"type": "str"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just use a json formatter
@@ -435,7 +463,12 @@ class CrossDocEventRelation(MultiPackLink): | |||
ParentType = EventMention | |||
ChildType = EventMention | |||
|
|||
def __init__(self, pack: MultiPack, parent: Optional[Entry] = None, child: Optional[Entry] = None): | |||
def __init__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this code generated? I feel like there are some manual or tool modifications to this file.
img_meta = JpegMeta(datapack) | ||
img_meta.source_type = "local" | ||
|
||
self.f.register(img_meta) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of a register
is to associate something together. For example, here we should associate the meta
data info with the function to load the meta data, and the function to serialize the meta data. But if you only provide one value to register
, it cannot achieve what we want.
@@ -0,0 +1,181 @@ | |||
{ | |||
"cells": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you link the notebook to the tutorial section next time? So that I can also review what it looks like in the tutorial section
" pack.set_text(ocr_text)\n", | ||
" pack.pack_name = data_source\n", | ||
" \n", | ||
" yield pack\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loading part is not what we desired, for example, it requires one to explicitly set the cache. And I don't even see metadata being used here.
pack: DataPack = DataPack()
jpegMeta = JpegMeta()
ImagePayload(pack, url="some_url", meta=jpegMeta) # payload_index default to 0
Now the user's job is done, the rest should all be handled by Forte (well, the user could also provide a method to load jpegMeta
)
This is what we call lazy_loading
, it doesn't happen anywhere near the reader, but behind the scene.
When the Payload
is used, it should load the data to cache based on URL and jpegMeta, there doesn't need to be a step where the user set the cache.
" yield pack\n", | ||
"\n", | ||
"\n", | ||
"# TODO: split ocr part into a processor" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah please do
This PR fixes #867
Description of changes
Implement a notebook tutorial for OCR to show forte's capability on managing related data such as images containing text and recognized text, texts' bounding boxes and their relations..
Possible influences of this PR.
Describe what are the possible side-effects of the code change.
Test Conducted
Describe what test cases are included for the PR.