Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove couple of lint checks and modify another, add CSS to core to support new hgroup definition #593

Merged
merged 3 commits into from
Jul 21, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 26 additions & 0 deletions se/data/templates/core.css
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,32 @@ hgroup{
text-align: center;
}

/* simulate h3 in an hgroup */
hgroup h2 + p{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use a child selector here for performance? hgroup > h2 + p

font-size: 1.17em;
}

/* simulate h4 in an hgroup */
hgroup h2 + p + p,
hgroup h3 + p{
font-size: 1em;
}

/* simulate h5 in an hgroup */
hgroup h2 + p + p + p,
hgroup h3 + p + p,
hgroup h4 + p{
font-size: .83em;
}

/* simulate h6 in an hgroup */
hgroup h2 + p + p + p + p,
hgroup h3 + p + p + p,
hgroup h4 + p + p,
hgroup h5 + p{
font-size: .67em;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we should add an hgroup p selector to line 38 so that the <h#> styling carries on to <p>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And, we also will need text-indent: 0 to all of these new selectors, to override p + p text indent styling in case we have multiple <p> in an <hgroup>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those h#/hgroup selectors (at 38) have a 3em margin, which would give each p in the hgroup that margin (I tested to confirm). I'm assuming we don't want that. :) Would you prefer to make a separate selector for just hgroup p and put everything but the margin on that, or add hgroup p to the existing group, remove margin, and make a separate selector for everything but hgroup p with just the margin on it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do want that, we want to replicate <h#> styling for these new <p>s. <h#> has a 3em margin so the <p>s should also. Margins of direct siblings will collapse.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we make a separate hgroup p selector and put everything in it but the margin, we can also add the text-indent: 0, so we don't have to add it to each of the other selectors. We also don't need the text-align: center, since that will be handled by the hgroup selector in the original. So:

hgroup p{
	break-after: avoid;
	break-inside: avoid;
	font-variant: small-caps;
	hyphens: none;
	-epub-hyphens: none;
	text-indent: 0;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hyphens will inherit. It's useful to open it in a browser and use the browser's CSS inspector to see how things are being inherited and what's overriding what.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Margins of direct siblings do not collapse. This is with just hgroup p added to the existing selectors.
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I made a mistake, though for a different reason. Line 48 removes margins for all children of <hgroup>. So the margins aren't important after all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right, I updated the message and added the unused block, and did some more testing with the CSS.

I created a test chapter with an hgroup with h2 thru h4 elements. With the current core.css, or the new core.css, they display like this.
image

If I change them to p's, however, and add hgroup p to the selector in question, they display like this:
image

Which is what I would expect. As h#s, they are caught by the hgroup > * selector, which zeroes their margin. We have to have that since a standalone h# should have a margin, but one in an hgroup shouldn't. However, with the addition of hgroup p to the other selectors, that has higher precedence, and so the p's in the hgroup are caught by it rather than by hgroup > *, and we get margins inside the hgroup, which we don't want. Thus, we don't need the hgroup p; the hgroup handles the hgroup margins, and we don't want margins within the hgroup.

There is another complication however. There is a separate set of selectors (starting with p.continued,) that ensure the first paragraph after an h# (and hr, etc.) aren't indented, and use hanging punctuation. Several of the selectors are of the form h# + p, which of course now matches the p's in the hgroup, and so they are handled by that selector. The good news is that gives us the text-indent: 0 we wanted; the other news is that that also gives them hanging punctuation, which maybe we want in the header and maybe we don't, and maybe it doesn't matter since they're centered?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't see your followup message until just now. So the only new news is the above complication.

hgroup > *{
font-weight: normal;
margin: 0;
Expand Down
22 changes: 8 additions & 14 deletions se/se_epub_lint.py
Original file line number Diff line number Diff line change
Expand Up @@ -258,11 +258,10 @@
"s-067", "Header element with a [val]label[/] semantic child, but without an [val]ordinal[/] semantic child."
"s-068", "Header element missing [val]ordinal[/] semantic."
"s-069", "[xhtml]<body>[/] element missing direct child [xhtml]<section>[/] or [xhtml]<article>[/] element."
"s-070", "[xhtml]<h#>[/] element without [xhtml]<hgroup>[/] parent and without semantic inflection."
"s-070", "[xhtml]<h#>[/] element without semantic inflection."
"s-071", "Sectioning element with more than one heading element."
"s-072", "Element with single [xhtml]<span>[/] child. [xhtml]<span>[/] should be removed and its attributes promoted to the parent element."
"s-073", "Header element that requires [val]label[/] and [val]ordinal[/] semantic children."
"s-074", "[xhtml]<hgroup>[/] element containing sequential [xhtml]<h#>[/] elements at the same heading level."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to list removed rules as available for the next time we want to add one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know the tests had a core.css; I'll add that.

I used the structure Alex initially posted. These are epubs, not high-traffic web sites, so I can't imagine that makes the slightest difference at all in performance on an ereader, but I'll be glad to make the change if Alex prefers.

List them where? They're available if they're not being used. :)

Copy link
Member

@robinwhittleton robinwhittleton Jul 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the state of CPUs on ereaders I’d say it’s more important here than on a website, but it’s a minimal difference either way.

Have a look at this commit for an example of retiring a lint rule: 98fa396

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear: I don't care which we use, and what you suggest is what I would have done had I written it from scratch. But I learned long ago that when Alex suggests CSS, that's the CSS to use. :)

That UNUSED block no longer exists. I'm sure there's a way in git to figure out when it went away, but I don't know what it is (without looking at every commit involving se_epub_lint, which I'm not going to do). Regardless, it's not there now. And, again, anything that is not in the used list is by definition unused, so I don't understand why it would be needed anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a new equivalent block to the bottom of the s- rules. The block is removed when those unused rules are all reused. It’s just there as a reference so that those numbers can be reused in future rule additions. Keeps the numbering in check, and everyone’s OCD happy.

"s-075", "[xhtml]<body>[/] element with direct children that are not [xhtml]<section>[/], [xhtml]<article>[/], or [xhtml]<nav>[/]."
"s-076", "[attr]lang[/] attribute used instead of [attr]xml:lang[/]."
"s-077", "[xhtml]<header>[/] element preceded by non-sectioning element."
Expand All @@ -283,7 +282,6 @@
"s-092", "Anonymous contributor with [val]z3998:*-name[/] semantic."
"s-093", "Nested [xhtml]<abbr>[/] element."
"s-094", "Element has an [attr]xml:lang[/] attribute that incorrectly contains [val]-latn[/] instead of [val]-Latn[/]."
"s-095", "[xhtml]<hgroup>[/] element containing [xhtml]<h#>[/] element in the wrong heading order."
"s-096", "[xhtml]h1[/] element in half title page missing the [val]fulltitle[/] semantic."
"s-097", "[xhtml]a[/] element without [attr]href[/] attribute."
"s-098", "[xhtml]<header>[/] element with only one child."
Expand All @@ -292,6 +290,10 @@
"s-101", "Anonymous primary contributor value not exactly [text]Anonymous[/]."
"s-102", "[attr]lang[/] attribute detected. Hint: Use [attr]xml:lang[/] instead."
"s-103", "Probable missing semantics for a roman I numeral."
UNUSED
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
"s-074", "[xhtml]<hgroup>[/] element containing sequential [xhtml]<h#>[/] elements at the same heading level."
"s-095", "[xhtml]<hgroup>[/] element containing [xhtml]<h#>[/] element in the wrong heading order."

TYPOGRAPHY
"t-001", "Double spacing found. Sentences should be single-spaced. (Note that double spaces might include Unicode no-break spaces!)"
Expand Down Expand Up @@ -1739,10 +1741,10 @@ def lint(self, skip_lint_ignore: bool, allowed_messages: Optional[List[str]] = N
if nodes:
messages.append(LintMessage("t-018", "Stage direction ending in period next to other punctuation. Remove trailing periods in stage direction.", se.MESSAGE_TYPE_WARNING, filename, [node.to_string() for node in nodes]))

# Check for h# without semantics. h# that are children of hgroup are OK, as are any h# that have child elements (which likely have the correct semantics)
nodes = dom.xpath("/html/body//*[re:test(name(), '^h[1-6]$')][not(@epub:type)][not(./*[not(name()='a' and contains(@epub:type, 'noteref'))])][ancestor::*[1][name() !='hgroup']]")
# Check for h# without semantics; h# that have child elements (which likely have the correct semantics) are OK
nodes = dom.xpath("/html/body//*[re:test(name(), '^h[1-6]$')][not(@epub:type)][not(./*[not(name()='a' and contains(@epub:type, 'noteref'))])]")
if nodes:
messages.append(LintMessage("s-070", "[xhtml]<h#>[/] element without [xhtml]<hgroup>[/] parent and without semantic inflection.", se.MESSAGE_TYPE_WARNING, filename, [node.to_string() for node in nodes]))
messages.append(LintMessage("s-070", "[xhtml]<h#>[/] element without semantic inflection.", se.MESSAGE_TYPE_WARNING, filename, [node.to_string() for node in nodes]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure to change the message to match in the big list of messages at the top of the file too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh! I thought of that once and obviously didn't get back to change it. I have RL things today, I'll make this and the rest of the changes tonight after I hear from you on the above.


# Check for <table> element without a <tbody> child
if dom.xpath("/html/body//table[not(tbody)]"):
Expand Down Expand Up @@ -2554,14 +2556,6 @@ def lint(self, skip_lint_ignore: bool, allowed_messages: Optional[List[str]] = N
if nodes:
messages.append(LintMessage("s-019", "[xhtml]<h#>[/] element with [attr]id[/] attribute. [xhtml]<h#>[/] elements should be wrapped in [xhtml]<section>[/] elements, which should hold the [attr]id[/] attribute.", se.MESSAGE_TYPE_WARNING, filename, [node.to_tag_string() for node in nodes]))

nodes = dom.xpath("/html/body//hgroup[./*[following-sibling::*[1][name()!='h6' and name()=name(preceding-sibling::*[1])]]]")
if nodes:
messages.append(LintMessage("s-074", "[xhtml]<hgroup>[/] element containing sequential [xhtml]<h#>[/] elements at the same heading level.", se.MESSAGE_TYPE_WARNING, filename, [node.to_string() for node in nodes]))

nodes = dom.xpath("//hgroup/*[./preceding-sibling::* and number(substring(name(./preceding-sibling::*[1]), 2, 1)) != number(substring(name(), 2, 1)) - 1 ]")
if nodes:
messages.append(LintMessage("s-095", "[xhtml]<hgroup>[/] element containing [xhtml]<h#>[/] element in the wrong heading order.", se.MESSAGE_TYPE_WARNING, filename, [node.to_string() for node in nodes]))

# Check for common errors in language tags
# `gr` is often used instead of `el`, `sp` instead of `es`, and `ge` instead of `de` (`ge` is the Georgian geographic region subtag but not a language subtag itself)
nodes = dom.xpath("//*[re:test(@xml:lang, '^(gr|sp|ge)$')]")
Expand Down
26 changes: 26 additions & 0 deletions tests/data/draft/jane-austen_unknown-novel/src/epub/css/core.css
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,32 @@ hgroup{
text-align: center;
}

/* simulate h3 in an hgroup */
hgroup h2 + p{
font-size: 1.17em;
}

/* simulate h4 in an hgroup */
hgroup h2 + p + p,
hgroup h3 + p{
font-size: 1em;
}

/* simulate h5 in an hgroup */
hgroup h2 + p + p + p,
hgroup h3 + p + p,
hgroup h4 + p{
font-size: .83em;
}

/* simulate h6 in an hgroup */
hgroup h2 + p + p + p + p,
hgroup h3 + p + p + p,
hgroup h4 + p + p,
hgroup h5 + p{
font-size: .67em;
}

hgroup > *{
font-weight: normal;
margin: 0;
Expand Down