This repository has been archived by the owner on Jun 22, 2023. It is now read-only.

clarification : gram ctc - alphabet_size ? "a","b" or "ab" single logits output? #1

Open
danFromTelAviv opened this issue Feb 19, 2019 · 3 comments

Comments

@danFromTelAviv

Thank you very much for your implementation of CTC variants. To be frank, I think that is the main value of this repo, and I would rename it to "pytorch-ctc-variants" or something similar, because it is very hard to find these great implementations you made, whereas OCR code is pretty prevalent.

Just to clarify: for Gram-CTC, should the logits represent single characters such as "a" and "b", or grams such as "ab"?

And just to validate: this is an implementation of this paper, right? https://arxiv.org/pdf/1703.00096.pdf
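To make the question concrete, here is a minimal sketch of the output layer as I understand it from the Gram-CTC paper (not from this repo's code): the softmax runs over a gram set that contains both single characters and selected n-grams, plus the CTC blank, so "a", "b", and "ab" would each get their own logit. The alphabet and gram set here are made-up toy values.

```python
# Sketch of a Gram-CTC-style output layer, following the paper's
# formulation (arXiv 1703.00096) rather than this repo's API.
import torch
import torch.nn as nn

chars = ["a", "b"]            # basic alphabet
gram_set = chars + ["ab"]     # unigrams plus one chosen bigram
num_classes = len(gram_set) + 1  # +1 for the CTC blank

# toy encoder output: (time_steps, batch, features)
features = torch.randn(10, 2, 32)
output_layer = nn.Linear(32, num_classes)  # one logit per gram, not per character
logits = output_layer(features)
print(logits.shape)  # torch.Size([10, 2, 4]): "a", "b", "ab", blank
```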

Thanks,
Dan

@danFromTelAviv
Author

danFromTelAviv commented Feb 19, 2019

From reading the code, it does look like this is actually Gram-CTC, but the test doesn't run: it's missing the mandatory grams input.

Based on:

```python
max_gram_length = len(grams.shape)
if max_gram_length >= 4:
    raise NotImplementedError
# num_basic_labels = grams.shape[0]
```

Should it be a tensor of shape (alphabet_size + 1) × (alphabet_size + 1) × (alphabet_size + 1) for grams of maximum length 3?
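If that reading is right, one way such a tensor could be built is sketched below. This is purely my guess at the intended encoding, not the repo's actual API: index 0 stands for "no character" (padding), so shorter grams occupy the leading axes, and a 1 at position (i, j, k) marks that gram as part of the gram set.

```python
# Hypothetical grams-tensor encoding (an assumption, not this repo's API):
# a dense 0/1 tensor of shape (V+1, V+1, V+1) for max gram length 3,
# with index 0 reserved for padding of grams shorter than 3.
import numpy as np

chars = ["a", "b"]
V = len(chars)                                  # alphabet size
idx = {c: i + 1 for i, c in enumerate(chars)}   # 0 reserved for padding

grams = np.zeros((V + 1, V + 1, V + 1), dtype=np.int8)

def add_gram(g):
    """Mark gram g (a string of up to 3 characters) as present."""
    ids = [idx[c] for c in g] + [0] * (3 - len(g))
    grams[tuple(ids)] = 1

for g in ["a", "b", "ab"]:
    add_gram(g)

print(grams.shape)       # (3, 3, 3)
print(int(grams.sum()))  # 3 grams marked
```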

@artbataev
Owner

I'm sorry, Gram-CTC is not implemented yet, but it is the first-priority task in the future plans (https://github.com/artbataev/end2end#future-plans), and I'm working on it.
For now, only the CTC loss and the CTC beam search decoder with a language model are working: https://artbataev.github.io/end2end/pytorch_end2end.html

@danFromTelAviv
Author

OK, thank you for your work. Good luck!
