video-caption pair #19

Open
jihwanp opened this issue Mar 14, 2021 · 2 comments


jihwanp commented Mar 14, 2021

I noticed that the caption CSV file seems to contain a lot of overlapping text across clips. I want long-range video-caption pairs, but simply concatenating the captions would duplicate the overlapping text. Is there a way to get clean video-caption pairs, or do I have to run ASR on my own?
By the way, thanks for sharing this nice work.

@antoine77340
Owner

What if you remove the redundant overlapping part of the concatenated clips?
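
One way to read this suggestion, as a minimal sketch rather than the repository's own method: when appending each caption, drop the longest run of words at its start that repeats the end of the text collected so far. The helper name `merge_overlapping` is hypothetical, and the approach assumes captions are in temporal order and that overlaps are exact word-for-word repeats.

```python
def merge_overlapping(captions):
    """Concatenate caption strings, dropping words repeated at the seams.

    Assumes `captions` is in temporal order and that overlaps are exact
    word-level repeats (an assumption, not a guarantee of the dataset).
    """
    merged = []
    for caption in captions:
        words = caption.split()
        # Longest suffix of `merged` that is also a prefix of `words`.
        overlap = 0
        for k in range(min(len(merged), len(words)), 0, -1):
            if merged[-k:] == words[:k]:
                overlap = k
                break
        merged.extend(words[overlap:])
    return " ".join(merged)


captions = [
    "hey there real woman of philadelphia and",
    "hey there real woman of philadelphia and paula dean",
    "paula dean my name is peyton kaminski and i'm about",
]
print(merge_overlapping(captions))
```

Note that this would also collapse a genuine repetition that happens to fall on a caption boundary, so it is worth spot-checking the output.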


Jabb0 commented Jan 24, 2022

@jihwanp I noticed this as well. The textual overlap is already present in the ASR subtitles from YouTube.
However, it depends on the subtitle format you download from YouTube.
VTT has the issue, while the TTML subtitles look cleaner.
Note: in TTML the overlap is still there, but only in the timestamps. The text itself is clean.
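
If the overlapping timestamps get in the way of building pairs, one possible workaround (an assumption on my side, not something confirmed by the dataset authors) is to clip each cue's end time to the begin time of the next cue. A minimal sketch, assuming cues are `(begin, end, text)` tuples with times in seconds, sorted by begin time; `trim_overlaps` is a hypothetical helper name:

```python
def trim_overlaps(cues):
    """Clip each cue's end to the next cue's begin so intervals do not overlap.

    `cues` is a list of (begin, end, text) tuples, times in seconds,
    assumed to be sorted by begin time.
    """
    trimmed = []
    for i, (begin, end, text) in enumerate(cues):
        if i + 1 < len(cues):
            next_begin = cues[i + 1][0]
            end = min(end, next_begin)
        trimmed.append((begin, end, text))
    return trimmed


cues = [
    (4.400, 6.960, "hey there real woman of philadelphia and"),
    (6.319, 9.040, "paula dean"),
    (6.960, 10.400, "my name is peyton kaminski and i'm about"),
]
for begin, end, text in trim_overlaps(cues):
    print(f"{begin:.3f} -> {end:.3f}  {text}")
```

The last cue keeps its original end time because there is no following cue to clip against.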

I guess this comes from how the subtitle format has to encode the desired on-screen effect: these formats are meant for display, not for data storage.

Hope this helps.

Example (video qREX695vxKs):
VTT:

00:00:04.400 --> 00:00:06.309 align:start position:0%
hey there real woman of philadelphia and

00:00:06.309 --> 00:00:06.319 align:start position:0%
hey there real woman of philadelphia and
 
00:00:06.319 --> 00:00:06.950 align:start position:0%
hey there real woman of philadelphia and
paula dean

00:00:06.950 --> 00:00:06.960 align:start position:0%
paula dean
 
00:00:06.960 --> 00:00:09.030 align:start position:0%
paula dean
my name is peyton kaminski and i'm about

TTML:

<p begin="00:00:04.400" end="00:00:06.960" style="s2">hey there real woman of philadelphia and</p>
<p begin="00:00:06.319" end="00:00:09.040" style="s2">paula dean</p>
<p begin="00:00:06.960" end="00:00:10.400" style="s2">my name is peyton kaminski and i&#39;m about</p>
<p begin="00:00:09.040" end="00:00:12.080" style="s2">to be moving to florida in a couple</p>
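
To turn TTML like the excerpt above into plain `(begin, end, text)` records, something along these lines may work. It is a rough sketch using only the Python standard library; the `<tt><body><div>` wrapper in the sample is an assumption about the surrounding file, and `parse_ttml` and `to_seconds` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def to_seconds(stamp):
    """Convert an "HH:MM:SS.mmm" timestamp to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_ttml(ttml_text):
    """Return a list of (begin, end, text) tuples, one per <p> cue."""
    root = ET.fromstring(ttml_text)
    cues = []
    for elem in root.iter():
        # Strip an XML namespace prefix such as "{...}p" if one is present.
        tag = elem.tag.rsplit("}", 1)[-1]
        if tag != "p" or "begin" not in elem.attrib or "end" not in elem.attrib:
            continue
        # Join nested text nodes and normalize whitespace.
        text = " ".join("".join(elem.itertext()).split())
        cues.append((to_seconds(elem.attrib["begin"]),
                     to_seconds(elem.attrib["end"]),
                     text))
    return cues


# Wrapper elements here are assumed; only the <p> lines come from the example.
SAMPLE = """<tt><body><div>
<p begin="00:00:04.400" end="00:00:06.960" style="s2">hey there real woman of philadelphia and</p>
<p begin="00:00:06.319" end="00:00:09.040" style="s2">paula dean</p>
<p begin="00:00:06.960" end="00:00:10.400" style="s2">my name is peyton kaminski and i&#39;m about</p>
</div></body></tt>"""

for cue in parse_ttml(SAMPLE):
    print(cue)
```

The XML parser decodes entities such as `&#39;`, so the extracted text contains a plain apostrophe.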

3 participants