Full font embedding #1322

pointlessone · 2023-11-21T19:35:20Z

This adds an option to disable font subsetting, so that fonts can be embedded in their full original form.

This feature can make documents substantially bigger. In addition to the embedded fonts themselves being bigger, PDF requires additional information in order to render text properly. Specifically, it requires glyph widths. Some fonts contain thousands of glyphs. A thousand glyph widths would add about 4 KB to the document on average. Additionally, PDF requires another mapping to make the text intelligible when copying. This additional size is much harder to estimate, as it depends greatly on the font's coverage, but it is usually on the order of 1-10 KB per font.
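As a rough check on the 4 KB figure, here is a minimal sketch of the arithmetic. It assumes the widths are written as space-separated decimal integers of about three digits each inside a bracketed array; the helper name and the sample widths are made up for illustration, and the bytes Prawn actually emits may differ.

```ruby
# Rough size estimate for a glyph width table written as plain text.
# Assumption: each width is a decimal integer (thousandths of an em),
# separated by a single space and wrapped in brackets.
def widths_array_bytes(widths)
  "[#{widths.join(' ')}]".bytesize
end

# Hypothetical font: 1,000 glyphs, every advance width three digits long.
sample_widths = Array.new(1000) { |i| 500 + (i % 400) }
estimate = widths_array_bytes(sample_widths)
puts "#{estimate} bytes (~#{(estimate / 1000.0).round(1)} KB)"
```

Three digits plus one separator per glyph comes to roughly four bytes per width, which is where an estimate of about 4 KB per thousand glyphs would come from under these assumptions.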

The intended use case is a workaround for when TTFunk breaks fonts during subsetting. It might also be useful for documents that are going to be edited: for example, templates to which more text will be added later, or the AcroForm feature, which allows end users to fill in forms.


mojavelinux · 2024-01-17T09:54:27Z

This change caused numerous errors in the visual tests for Asciidoctor PDF, even without the new option to embed the full font turned on. It seems there's more to this change than just the option to embed the full font. Missing glyphs now seem to have variable width, whereas before they had the width of the missing (notdef) glyph, typically a fixed-width box (see https://github.com/asciidoctor/asciidoctor-pdf/blob/main/spec/reference/font-i18n-default.pdf). It may be that this is just how it is going to be now (and I'll have to figure out how to accommodate this change in the tests), but I'm letting you know. Perhaps a notice is warranted to inform users of the change. Ideally, however, it would continue to operate as it does today when the new option is not used.

pointlessone · 2024-01-17T15:33:30Z

@mojavelinux Text is hard, man…

Do you have by chance a document that I could look at before and after that demonstrates the issue?

mojavelinux · 2024-01-17T21:23:26Z

> Text is hard, man…

Trust me, I get it. I've been messing around with text in PDF for over a decade now. It's not only hard, it's tedious ;)

> Do you have by chance a document that I could look at before and after that demonstrates the issue?

The following test demonstrates the problem:

https://github.com/asciidoctor/asciidoctor-pdf/blob/main/spec/font_spec.rb#L7-L11

There's nothing really special that test is doing other than converting the file with a bunch of non-Latin characters in it to PDF and comparing it against the expected result.

If you are interested in digging into it, I suppose I could distill that down to code that only uses Prawn.

pointlessone · 2024-01-21T11:12:29Z

@mojavelinux I'd love a minimal reproduction script that demonstrates the issue. But even just the same document before and after the change might be useful. At the moment I don't even know where to start. Admittedly, I'm not familiar with asciidoc source code.

mojavelinux · 2024-01-22T09:45:57Z

This has nothing to do with AsciiDoc. The issue is when a glyph needed by the text is missing from a non-AFM (e.g., TTF) font. Prior to this change, the width would be fixed to the size of the missing glyph (so each missing glyph in a fragment was adjacent to the previous one). After this change, the width is 0.

Here's a reproducible test case using Prawn's own test suite:

# frozen_string_literal: true

require 'spec_helper'

class PDF::Inspector::TextWithStringWidths < PDF::Inspector::Text
  attr_reader :string_widths

  def initialize(*)
    super
    @string_widths = []
  end

  def show_text text, kerned = false
    super
    @string_widths << ((@state.current_font.unpack text).reduce(0) do |width, code|
      width + (@state.current_font.glyph_width code) * @font_settings[-1][:size] / 1000.0
    end)
  end
end

describe Prawn::Font do
  it '.only does not change width of unknown glyph' do
    pdf = Prawn::Document.new do
      font_families.update(
        'DejaVu Sans' => {
          normal: "#{Prawn::DATADIR}/fonts/DejaVuSans.ttf",
          bold: "#{Prawn::DATADIR}/fonts/DejaVuSans-Bold.ttf"
        },
      )

      # changing option to subset: false fixes issue (albeit using different behavior)
      font 'DejaVu Sans', subset: true do
        text '日本語<b>end</b>', inline_format: true
      end
    end

    rendered_pdf = pdf.render
    File.write '/tmp/debug.pdf', rendered_pdf
    analyzed_pdf = PDF::Inspector::TextWithStringWidths.analyze(rendered_pdf)
    expect(analyzed_pdf.string_widths.length).to eql(2)
    expect(analyzed_pdf.string_widths[0]).to be > 0
  end
end

On the master branch, this test fails. If you roll back to the commit before this change (to 772a41e), it passes. You can also look at debug.pdf to see that the missing glyphs within the fragment are all sitting on top of one another. Note that the position of the next fragment seems to be correct in either case.

Changing the option to subset: false also fixes the issue, which means that when the full font is embedded, it works as before. While that's a workaround, it does not preserve the previous behavior when font subsetting is enabled.
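The difference between the two behaviors can be illustrated with a toy model of width lookup. This is only a sketch: the tables, constant names, and the `fallback_to_notdef` switch are hypothetical and are not Prawn's or TTFunk's actual code. The only font fact it relies on is that glyph 0 in a TrueType font is the missing-glyph (.notdef) glyph.

```ruby
# Toy model of per-glyph width lookup; not Prawn's or TTFunk's actual code.
NOTDEF_GLYPH_ID = 0 # TrueType reserves glyph 0 for the "missing glyph" box

# Hypothetical font data: character -> glyph id, and glyph id -> advance
# width in thousandths of an em. The numbers are made up.
CMAP = { 'e' => 1, 'n' => 2, 'd' => 3 }.freeze
ADVANCE_WIDTHS = { 0 => 600, 1 => 615, 2 => 634, 3 => 635 }.freeze

# Width of a string at the given font size. With the fallback enabled,
# characters without a glyph use the .notdef width, so they still advance
# the cursor. With it disabled, they contribute a width of 0.
def string_width(text, size, fallback_to_notdef: true)
  text.each_char.sum do |char|
    glyph_id = CMAP[char]
    glyph_id = NOTDEF_GLYPH_ID if glyph_id.nil? && fallback_to_notdef
    ADVANCE_WIDTHS.fetch(glyph_id, 0) * size / 1000.0
  end
end

puts string_width('日本語', 12)                            # boxes laid out side by side
puts string_width('日本語', 12, fallback_to_notdef: false) # boxes drawn on top of one another
```

In this model the first call corresponds to the behavior before the change (each missing glyph takes up the width of the box), and the second to the regression, where every missing glyph has zero width.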

pointlessone · 2024-01-22T15:51:49Z

@mojavelinux Could you please try #1327 and let me know if it fixes the issue?

mojavelinux · 2024-01-23T10:22:41Z

Yep, that did it! Thanks a bunch. I appreciate you taking the time to circle back and apply this update.

(My apologies again for my misdirected comment on #1103. Since both were font-related issues, I wrongly assumed it was the same thread.)

Skulli · 2024-04-08T08:19:10Z

See #1347. I've been having issues since 2.5, but only with Adobe Acrobat.

koffeinfrei · 2024-04-11T08:39:41Z

lib/prawn/font.rb

+    # A hash font definition can specify a number of options:
+    #
+    # - :file -- path to the font file (required)
+    # - :subset -- whether to subset the font (default false). Only


I think the default is true

laurafeier · 2024-05-29T11:54:18Z

Had issues with characters not being rendered after the 2.5 upgrade when using .ttf fonts. Fixed this by setting the font to subset: false.

Example:
font_families.update("family_name" => { bold: { file: "path/to/file", subset: false } })

pointlessone force-pushed the full-font-embedding branch 2 times, most recently from b1a6232 to fcf1e74 Compare December 6, 2023 14:29

pointlessone mentioned this pull request Dec 6, 2023

Allow embedding CFF fonts #1282

Closed

pointlessone force-pushed the full-font-embedding branch from fcf1e74 to 4abd211 Compare December 7, 2023 20:59

pointlessone force-pushed the full-font-embedding branch from 4abd211 to 59c1221 Compare January 15, 2024 13:28

pointlessone force-pushed the full-font-embedding branch from 59c1221 to 528a37d Compare January 15, 2024 14:05

pointlessone merged commit 385116d into master Jan 15, 2024
28 of 30 checks passed

pointlessone deleted the full-font-embedding branch January 15, 2024 14:18

gettalong mentioned this pull request Jan 22, 2024

Possible Bug: Unicode conversion. #1103

Closed

pointlessone mentioned this pull request Jan 22, 2024

Fix missing glyph widths #1327

Merged

Skulli mentioned this pull request Apr 8, 2024

Font error with 2.5 and Adobe Acrobat #1347

Open

koffeinfrei reviewed Apr 11, 2024
