Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identify glif file that caused XMLSyntaxError #264

Open
Hoolean opened this issue Mar 6, 2023 · 3 comments
Open

Identify glif file that caused XMLSyntaxError #264

Hoolean opened this issue Mar 6, 2023 · 3 comments

Comments

@Hoolean
Copy link

Hoolean commented Mar 6, 2023

When ufoLib2 reads a UFO that contains a glif file with invalid XML, we do not provide enough information to identify which glif file is causing the problem.

Example

Making and then reading from a UFO with an XML error in /b:

from pathlib import Path
from ufoLib2 import Font

# Create and save UFO with three glyphs
before = Font()
before.newGlyph("a")
before.newGlyph("b")
before.newGlyph("c")
before.save("Font.ufo")

# Break the XML of a single glyph in any way
Path("Font.ufo", "glyphs", "b.glif").write_text("<broken></xml>")

# Reload the UFO, or otherwise indirectly parse glifs
after = Font.open("Font.ufo", lazy=False)

Stack trace does not identify in which file the issue originates:

Traceback (most recent call last):
  File ".\make-broken.py", line 12, in <module>
    after = Font.open("Font.ufo", lazy=False)
  File "C:\Program Files\Python38\lib\site-packages\ufoLib2\objects\font.py", line 189, in open
    self = cls.read(reader, lazy=lazy)
  File "C:\Program Files\Python38\lib\site-packages\ufoLib2\objects\font.py", line 205, in read
    layers=LayerSet.read(reader, lazy=lazy),
  File "C:\Program Files\Python38\lib\site-packages\ufoLib2\objects\layerSet.py", line 165, in read
    layer = cls._loadLayer(reader, layerName, lazy, isDefault)
  File "C:\Program Files\Python38\lib\site-packages\ufoLib2\objects\layerSet.py", line 198, in _loadLayer
    return Layer.read(layerName, glyphSet, lazy=lazy, default=default)
  File "C:\Program Files\Python38\lib\site-packages\ufoLib2\objects\layer.py", line 151, in read
    glyphSet.readGlyph(glyphName, glyph, glyph.getPointPen())
  File "C:\Users\harry.dalton\AppData\Roaming\Python\Python38\site-packages\fontTools\ufoLib\glifLib.py", line 407, in readGlyph
    tree = _glifTreeFromString(text)
  File "C:\Users\harry.dalton\AppData\Roaming\Python\Python38\site-packages\fontTools\ufoLib\glifLib.py", line 1030, in _glifTreeFromString
    root = etree.fromstring(data, parser=etree.XMLParser(remove_comments=True))
  File "src\lxml\etree.pyx", line 3252, in lxml.etree.fromstring
  File "src\lxml\parser.pxi", line 1913, in lxml.etree._parseMemoryDocument
  File "src\lxml\parser.pxi", line 1800, in lxml.etree._parseDoc
  File "src\lxml\parser.pxi", line 1141, in lxml.etree._BaseParser._parseDoc
  File "src\lxml\parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src\lxml\parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src\lxml\parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Opening and ending tag mismatch: broken line 1 and xml, line 1, column 15

Potential change

Depending on the exceptions that dependent libraries or tools expect us to throw, we could wrap the exception to include the path to the glif file, and/or the name and layer of the glyph it represents.

This is done for other areas of the UFO, e.g. layerscontents.plist:

UFOLibError: 'layercontents.plist' could not be read on <osfs 'Font.ufo'>: Opening and ending tag mismatch: array line 4 and plist, line 9, column 9 (layercontents.plist, line 9)
@anthrotype
Copy link
Member

sounds good to me! Got a PR?

@Hoolean
Copy link
Author

Hoolean commented Mar 7, 2023

Happy to put something together :)

One further thought: would it better for us to wrap the XML library error here (Layer.loadGlyph) or in fonttools (GlyphSet.readGlyph)?

@anthrotype
Copy link
Member

If we have all the info inside fonttools, it might be better to raise it over there

Hoolean added a commit to daltonmaag/fonttools that referenced this issue Mar 9, 2023
This allows dependent projects to catch errors parsing glifs without
requiring logic to account for which XML library fonttools is using
internally (e.g. for implementing fonttools/ufoLib2#264).

This commit also adds tests to ensure that the exception we expose when
glifs have invalid syntax remains stable across future releases.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants