Remove gratuitous spaces in ls #77
Comments
Will change space between comma and digit within scope of an
Most lines of pwg.txt end with a space, which currently serves no purpose, so I am removing that trailing space. These two changes modify over half of the million-plus lines of pwg.txt.
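A minimal sketch of the two changes described above, in Python. It assumes that references are marked up as `<ls>...</ls>` spans contained on a single line and that the comma-digit change means deleting the space; both are assumptions, the function names are hypothetical, and this is not the script actually run on pwg.txt.

```python
import re

# Assumption: a reference is an <ls>...</ls> span that does not cross lines.
LS_SPAN = re.compile(r"<ls>.*?</ls>")
# One or more spaces that sit between a comma and a digit.
COMMA_SPACE_DIGIT = re.compile(r", +(?=\d)")


def tighten_ls(line: str) -> str:
    """Delete spaces between a comma and a digit, but only inside <ls> spans."""
    return LS_SPAN.sub(lambda m: COMMA_SPACE_DIGIT.sub(",", m.group(0)), line)


def strip_trailing_space(line: str) -> str:
    """Remove spaces at the end of a line (the line has no newline attached)."""
    return line.rstrip(" ")


sample = "foo, 2 <ls>ṚV. 1, 164, 20.</ls> bar "
cleaned = strip_trailing_space(tighten_ls(sample))
print(cleaned)
```

Text outside the `<ls>` span (here `foo, 2`) is left alone, which keeps the change limited to the reference markup.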
Excellent; I would also ask Jim to remove the blank lines within the body matter (mostly preceding the <ls or <div breaks). [As a norm, a blank line should be present only between two entries (or at the header H-lines).] There are about 13k such blank lines in all. There are also some cases [3 blanks (2 places), 2 blanks (95 places)] where multiple blank lines precede the new entry meta-line.
These are all at the (artificial) line-splitting introduced at the <ls (~500k), <div (~100k) and <is (94) tags!
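The blank-line cleanup requested above could be sketched roughly as follows. The sketch assumes that an entry starts at a meta-line beginning with `<L>` and ends at a line beginning with `<LEND>`; the function name is hypothetical, and the real cleanup may have been done differently.

```python
from typing import Iterable, List


def drop_blank_lines_in_entries(lines: Iterable[str]) -> List[str]:
    """Remove blank lines inside entries; keep one blank line between entries."""
    out: List[str] = []
    inside = False  # True between an <L> meta-line and its <LEND> line
    for line in lines:
        if line.startswith("<L>"):
            inside = True
        if line.strip() == "":
            if inside:
                continue  # blank line in the entry body: drop it
            if out and out[-1].strip() == "":
                continue  # repeated blank line between entries: drop it
        out.append(line)
        if line.startswith("<LEND>"):
            inside = False
    return out


sample = [
    "<L>1<pc>1-0001", "body", "", "<ls>ref", "<LEND>",
    "", "",
    "<L>2<pc>1-0001", "body2", "<LEND>",
]
result = drop_blank_lines_in_entries(sample)
print(result)
```

Blank lines outside entries are only collapsed, not removed, so the single separator between two entries survives.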
This cleanup completed. Summary:
The lines outside of entries can be summarized as:
grep -E '^' temp_pwg_5.txt | wc -l
Three instances still remain where a [Pagev-xxxx] line precedes the LEND line; these are of the same nature as the above criterion.
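One way to locate such instances is a scan over adjacent line pairs. This is only a sketch: it assumes page markers are lines starting with `[Page` and that the closing line starts with `<LEND>`, and the function name is hypothetical.

```python
from typing import List, Sequence


def page_lines_before_lend(lines: Sequence[str]) -> List[int]:
    """Return 1-based numbers of [Page...] lines directly followed by <LEND>."""
    hits: List[int] = []
    for i in range(len(lines) - 1):
        if lines[i].startswith("[Page") and lines[i + 1].startswith("<LEND>"):
            hits.append(i + 1)
    return hits


sample = [
    "<L>1", "text", "[Page1-0002]", "<LEND>",
    "",
    "<L>2", "[Page1-0003]", "more", "<LEND>",
]
found = page_lines_before_lend(sample)
print(found)
```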
Also found some misc. cases that need correction: And the 85 cases of "more than a single space together",
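Cases of "more than a single space together" can be listed, and optionally collapsed, with a small regular expression. This is an illustrative sketch with hypothetical function names; whether every such run should be collapsed automatically is a judgment call that the sketch does not settle.

```python
import re
from typing import Iterable, List, Tuple

# Two or more consecutive spaces.
MULTI_SPACE = re.compile(r" {2,}")


def lines_with_space_runs(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (1-based line number, line) for lines containing a space run."""
    return [(n, line) for n, line in enumerate(lines, start=1)
            if MULTI_SPACE.search(line)]


def collapse_spaces(line: str) -> str:
    """Replace each run of two or more spaces with a single space."""
    return MULTI_SPACE.sub(" ", line)


sample = ["a b", "a  b", "a   b  c"]
flagged = lines_with_space_runs(sample)
print(flagged)
print([collapse_spaces(line) for _, line in flagged])
```

Listing the hits first makes it possible to review them by hand before any replacement is applied.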
I am glad that Jim has taken his "first step" toward getting the digital text closer to the PWG print.
Dear Jim, now that you've reopened this issue, you might consider this point as well!
AB additional changes
Per above suggestions. Jim notes in issue77 readme at Jim will consider the
Thank you, Jim; now this issue can be closed again!
Hyphen to em-dash changes made. Many improvements have been made to cdsl pwg, thanks to @Andhrabharati's suggestions and his follow-up of some 3.5-year-old suggestions from @gasyoun. @Andhrabharati I suspect many of these changes have the side effect of decreasing the diff between your version(s) and the cdsl version of pwg. Perhaps there are other 'discrete' (well-defined) changes that cdsl could tackle now? If so, these can be taken up in new issues or in existing issues which have been neglected by cdsl.
Thank you, Jim, for appreciating my work. And yes, there are quite a few refinements in my version that could (and should) be carried into the cdsl version. If you are serious and willing, I can make and post an initial version (with sufficient details of working) for you to start with.
Systematic attention to pwg is a worthwhile goal. Incorporation of digitizations of the missing VN (#76) should be done first. Then posting of your initial version. Agree?
And probably after the AES "revision", which is more or less very straightforward and easy work (from my version).
Makes sense -- just to finish it.
This is in response to this comment.
I