-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix issues with parsing .ssa subtitles #68
base: master
Are you sure you want to change the base?
Conversation
@adracea please review the code. |
@@ -688,7 +691,7 @@ mod parse { | |||
strikeout: parse_str_to_bool( | |||
get_line_value( | |||
&headers, | |||
"Strikeout", | |||
"StrikeOut", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It needs to support Strikeout
and StrikeOut
i think. At least in the spec in the section 5 overview it says that the field is called StrikeOut
, but in the more detailed explanation beneath (field 9.2) it says Strikeout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the get_line_value function it will be in lowercase anyway, so it does not matter anymore.
pub outline: f32, | ||
/// If [SSAStyle::border_style] is `1`, then this specifies the depth of the drop shadow behind | ||
/// the text (in pixels). Values may be `0`, `1`, `2`, `3` or `4`. Drop shadow is always used in | ||
/// addition to an outline - SSA will force an outline of 1 pixel if no outline width is given. | ||
pub shadow: i8, | ||
pub shadow: f32, | ||
/// Sets how text is "justified" within the Left/Right onscreen margins, and also the vertical | ||
/// placing. | ||
pub alignment: Alignment, | ||
/// Defines the Left Margin in pixels. | ||
pub margin_l: i32, | ||
pub margin_l: f32, | ||
/// Defines the Right Margin in pixels. | ||
pub margin_r: i32, | ||
pub margin_r: f32, | ||
/// Defines the Vertical Left Margin in pixels. | ||
pub margin_v: i32, | ||
pub margin_v: f32, | ||
/// Specifies the font character set or encoding and on multilingual Windows installations it | ||
/// provides access to characters used in multiple than one language. It is usually 0 (zero) | ||
/// for English (Western, ANSI) Windows. | ||
pub encoding: i32, | ||
pub encoding: f32, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you changed the types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In reality those properties are floats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I see. I've only used this spec as reference when rewriting the ass parsing and interpreted the "Values may be ..." part as "values must be ...".
I don't want to sound know-it-all, but the documentation of libass is not a ass/ssa specification, they have stated that right on the top. But I think most of subtitle software that utilized ass subtitles is built around libass so having compatibility to it is a good thing ig
@@ -246,7 +246,10 @@ impl SSA { | |||
let mut line_num = 0; | |||
|
|||
let mut blocks = vec![vec![]]; | |||
for line in content.as_ref().lines() { | |||
|
|||
let contents: String = content.as_ref().to_string().replace("\u{feff}", ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this character do / why removing it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you try to open any subtitle file the first char will be \u{feff} and it breaks parsing, so we need to remove it. I don't why it appears when I do read_to_string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently \ufeff
is a byte order mark (BOM). Normally this is only set when the file is UTF-16 or UTF-32 encoded. Rust is completely UTF-8 oriented, so read_to_string
expects the input file to be in UTF-8 and thus does not strip it away. For me that's something that should be sanitized out by the user and not the library
~украл отсюда adracea#68
Hi! I've fixes issues with parsing real complex .ass subtitles.