-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME
218 lines (150 loc) · 6.78 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
These scripts require the following binaries:
jruby (package jruby)
ruby (package ruby1.8)
curl (package curl)
pdftoppm (package poppler-utils)
make (package make)
ocrad (package ocrad)
convert (package imagemagick)
... and many from the netpbm package. You will also need to have the
following libraries installed:
libpng12-dev
libgd-ruby1.8
... and the font DejaVu installed as well:
ttf-dejavu-core
You can install all of these dependencies with:
apt-get install jruby ruby1.8 ruby curl poppler-utils make ocrad imagemagick libpng12-dev libgd-ruby1.8 ttf-dejavu-core netpbm
A completed build tree for this poster is about 18 GiB for me so
you need lots of disk space.
Step 1: Extracting the Glyphs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You should be able to run:
./make-unicode-poster
After a long time (at least a day on my computer) this will
generate some useful output in the directory
'individual-characters'. These files include:
U????-???-????????.png
The glyph images with Unicode codepoint
U????-???-????????-top.png
Just the glyph image, cropped from the image above
U????-???-????????-bottom.png
Just the image of the Unicode codepoint number, also cropped
from the full glyph image.
In the base directory there will also be:
top-sizes.yaml
Information about the size of the -top.png image files and
the block that the characters are from.
codepoints.yaml
This file gives the name of each image glyph and the Unicode
codepoint acquired from doing OCR on the codepoint in the
image. These values will have a number of errors in them,
since the OCR isn't perfect; if you want to be able to map
each image to a codepoint, see the option Step 5 below.
Step 2: Drawing the Poster Strips
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The script "compose-inline-breaks" will create poster strips
from these blocks. You can invoke it like:
./compose-inline-breaks "Unicode 5.1.0 (Single Page Test)" 1
... where the first argument is the title of the page and the
second argument is the number of posters to spread across. This
will generate a series of strips that we can compose into the
final posters, called something like:
poster-inline-breaks-00-00000030.png
Note that these images are up to 90000 pixels wide, which
creates a number of problems. You can view them in gimp quite
happily, but many other programs won't cope. There are even
bugs in some graphical file managers that mean that your file
manager will crash if you open this directory.
Step 3: Composing the Poster Strips
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Now, to compose the strips into a single file, just run:
./compose-pnm-files-multi-page
That script converts each strip to PGM format and concatenates
them into gigantic PNM files, one per page of poster.
****** WARNING *****
You need a great deal of disk space for this to run to
completion. The concatenated version of the poster is about
12GB. So, before you consider running this file you probably
want to make sure that you have 30GB of disk free to be on
the safe side.
********************
The eventual output should be called (for a single page poster):
poster-complete-00.pnm
... or multiple files for multi-page posters.
In some sense this is the final output (a complete poster!) but
it's 12GB and 90000x130000 pixels, so not very useful for
printing. To generate output you have a chance of printing see
the next section:
Step 4: Generating Printable Output
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[This is by far the worst script, but you'll probably want to
customize it quite a bit anyway, depending on what your
printing situation is - see Step 5.]
First, as a test, you can try generating some 600dpi A4 versions:
./generate-output-multi-page A4 600
If the PNG and EPS files generated from that look sensible then
you can try:
./generate-output-multi-page A0 600
... to generate A0 versions.
Step 5: Printing Out Your Poster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Find someone with an A0 plotter.
2. Negotiate with them at great length about whether what you
want to do is sensible (it is :)) and what file formats they
can deal with.
Step 1 is relatively easy, and step 2 is often very difficult.
Some printers will want to load the bitmap into Photoshop and
print it out from there. That will probably work OK, but you
have to be very careful that they don't enable any option in the
"Print" dialog that will cause JPEG compression to be applied to
your image or the output will look horrible. Even at A0 size
you really need the fine detail for this poster.
Many of the large HP plotters will accept TIFF files if you can
send files directly to them - this sometimes takes some
negotiation. Before doing this, make sure that the resolution
set in the TIFF file is correct for your poster.
One other option here is to use pamdice to split the A0 image
into many A4 sheets that you can stick together. It's much
nicer to have a proper A0 version, though.
Good luck with this step - I've found it a bit of a nightmare :/
[Optional] Step 5: Mapping Codepoints to Glyph Images
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For a number of applications you might want to be able to find the
glyph image for a particular codepoint. In order to do this, you'll
need to correct the generated codepoints.yaml file. We can detect
most of the errors (all if we're lucky) by looking for the codepoint
values that appear to be out of place. The script in this directory
called "test-codepoints" is designed for this. You should do the
following:
# ./test-codepoints
Problem with: U0C00-002-00000099.png => 0x0CS3
Not a Fixnum
Problem with: U0E00-002-00000016.png => 0x
Not a Fixnum
Problem with: U1D400-002-00000165.png => 0x_D4A5
Not a Fixnum
Problem with: U1D400-002-00000181.png => 0x_D4B5
Not a Fixnum
Problem with: U1D400-002-00000191.png => 46271
Out of order...
Problem with: U1D400-002-00000198.png => 0x1_4C6
Not a Fixnum
[.. etc ..]
For each of these, do something like:
# Look for what the codepoint should be from the image:
xli U0C00-002-00000099.png
# Correct it in the file:
emacs individual-characters/non-blank/codepoints.yaml
Run ./test-codepoints again and correct any further errors,
repeating if necessary. Once that's done you can use the script
./copy-characters-to-webspace to scale them down by 50% and copy
the glyph images to a directory where they'll have sensible
names:
mkdir ~/renamed-unicode-glyphs/
./copy-characters-to-webspace codepoints.yaml ~/renamed-unicode-glyphs/
Those files will be called things like:
reduced-0x00260E.png
... for U+00260E (BLACK TELEPHONE)
------------------------------------------------------------------------
Mark Longair <mark-unicode@longair.net>
http://longair.net/mark/