Add Word conversion via Pandoc #256

Open
wants to merge 3 commits into
base: main

Conversation

stefanv
Member

@stefanv stefanv commented Sep 24, 2019

📰 Rendered document

Download this document and place it inside the same directory as paper.tex before viewing.

@tylerjereddy
Contributor

tylerjereddy commented Sep 28, 2019

paper-tyler-sept28.docx

I've attached a revised version of the Word document, with some formatting improvements visible via "track changes", mostly to the author list and Table 1 formatting. It's not ideal to pass around a Word document, but we have to switch to Word at some point anyway because of the Editorial constraint.

Worst case scenario the changes can be copy-pasted in as appropriate.

I'm using native Microsoft Word with "backwards compatibility" mode for saving...

@tylerjereddy
Contributor

Now with Matt's revised abstract from #258 added as well (pandoc skipped the abstract entirely during conversion):
paper-tyler-sept28.docx

@mdhaber
Contributor

mdhaber commented Sep 29, 2019

Looking good. How should we proceed with manual editing of the Word document (formatting Table 1, converting sections to Boxes)?

@tylerjereddy
Contributor

I guess we just start doing it in Word as you say--if you want to make a few changes in the next day or two go ahead. If there's stuff you want me to do specifically let me know.

Some stuff can be done in parallel as long as we can copy-paste..

@mdhaber
Contributor

mdhaber commented Sep 30, 2019

Current version

Tried to reorder all the content as requested in the formatting guide.
• Title
• Author List and affiliations
• Abstract
• Main Text
• Data Availability Statement
• Acknowledgements
• Competing Financial Interests Statement
• References (for main text only)
• Figure legends (for main text only)
• Tables (note: tables should be pasted into Word files as editable tables, not as images)
• Boxes

References are probably all messed up. I'd better just fix them manually, as they'll need to be checked manually in the end.
I'll try to add the first-page author affiliations.

@mdhaber
Contributor

mdhaber commented Sep 30, 2019

@tylerjereddy If you get a chance, please take a look at the version linked above, accepting changes or noting what you disagree with. Some things that it still needs:

  • Remove ~200 words, if possible
  • Remove detail from figure captions
  • Add missing numbers. The pandoc conversion appears to have removed some numbers like "more than ___ unit tests" in the test suite. I have added some back but haven't checked the whole document.
  • Check reference order
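
One way to hunt for numbers dropped during conversion is to compare the numeric tokens in two plain-text exports. The following is a minimal sketch, not something used in this thread: it assumes plain text has already been extracted from both the compiled PDF and the Word document, and the function name and sample strings are hypothetical.

```python
import re
from collections import Counter
from typing import List

# Matches integers and numbers with internal separators, e.g. "87", "1.3", "13,000".
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def missing_numbers(source_text: str, converted_text: str) -> List[str]:
    """Return numeric tokens that occur more often in the source than in the conversion."""
    source_counts = Counter(NUMBER.findall(source_text))
    converted_counts = Counter(NUMBER.findall(converted_text))
    # Counter subtraction keeps only positive differences.
    return sorted((source_counts - converted_counts).elements())


if __name__ == "__main__":
    # Hypothetical sample sentences, not taken from the paper.
    pdf_text = "more than 13,000 unit tests and 87% coverage"
    docx_text = "more than unit tests and 87% coverage"
    print(missing_numbers(pdf_text, docx_text))
```

A check like this only flags candidates; each hit would still need to be located and restored by hand in the Word document.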

@tylerjereddy
Contributor

Probably won't get to it until tonight, but thinking about the references, do you think we're going to need to use, e.g., EndNote to avoid the manual burden? That might be a little painful to switch over to from BibTeX.

@mdhaber
Contributor

mdhaber commented Sep 30, 2019

@tylerjereddy actually I think it all will work out. Below I'll use "citation" to refer to the superscript in the text of the paper, and "reference" to refer to the entry in the reference list. Here's what happened:

  • There was a shuffling of citation/references 25-27, but that was easy to fix manually
  • The citation/reference for your code coverage repo (131 in the PDF) was missing from the Word document. According to Editorial Revision: Figure Legends #262, the portion of the figure caption in which the citation appeared needed to be moved to the main text, so I did that and re-added the citation. This required manually shuffling all the remaining citations/references by one; not a big deal since it was only a few.
  • The references in Table I are missing in the Word document, but that's OK, because they need to be moved to the end anyway.
  • The Abbasi reference in Table II should still be the last of the paper.

If you can just hack the LaTeX document/pandoc to generate the references for the table as 127-146, I think we can paste them near the end of the Word doc reference list and we'll be golden.

Update: Oops, one more thing. References 56-64 have now been moved into Box 2. That will shuffle things some more, so my list above is not quite right. If you can generate the Table II references and paste them in, I'll fix this, too.

@mdhaber
Contributor

mdhaber commented Sep 30, 2019

@tylerjereddy can you ask Rita for clarification about reference order?

The Preparing your Submission document used to read:

References should be numbered sequentially, first throughout the text, then in tables, followed by figures; that is, references that only appear in tables or figures should be last in the reference list.

Now it has changed to:

References are numbered sequentially as they appear in the text, tables, figure legends and online Methods.

It was much more explicit above. Why would they change it to something less explicit unless they actually meant to change something?

There is no clarification in the Word docs Rita sent. In fact, the Word document "Formatting Guidelines" only mentions:

References (for main text only)

It doesn't specify where references for figures, tables, and boxes should appear.

@tylerjereddy
Contributor

ok, I've sent an email asking for clarification on reference ordering

@tylerjereddy
Contributor

latest_version.docx

I filled in some more of the numbers that pandoc stripped out, accepted some of your changes (didn't reject any), and added a comment about one more box we could make to save 200 words (though I'm not sure it is worth it to be honest..).

I'm a little worried about the time we'll lose with organizing references, but anyway, tried to chip away at a few things.. I think Rita is checking with copy editor on the referencing formalities you asked about.

@mdhaber
Contributor

mdhaber commented Oct 1, 2019

> I'm a little worried about the time we'll lose with organizing references

What's the alternative?
You mean that if she has to check with someone else, maybe it doesn't actually matter that much? : ) I would agree, but at this point I'd prefer to just push through.

I've noticed that a lot of URLs have also disappeared in the conversion. If you're able to re-generate the references with the URLs in there, I'd appreciate it, even though I'd have to re-do some work. It's probably faster than manually adding them back.

@tylerjereddy
Contributor

Here's the clarification on reference ordering:

From our copyeditor: References should be numbered in order of main text, tables, figures, boxes, and methods-only. So if there is a reference that is, for example, only cited in Fig 1 then it would appear at the end of the main text references.
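
The copyeditor's rule can be expressed as a small numbering routine. This is only a sketch under stated assumptions: the section labels, the citation keys, and the function name are hypothetical, and each section's citations are assumed to be listed in order of first appearance.

```python
from typing import Dict, List

# Order given by the copyeditor: main text, tables, figures, boxes, methods-only.
SECTION_ORDER = ["main_text", "tables", "figures", "boxes", "methods"]


def number_references(citations: Dict[str, List[str]]) -> Dict[str, int]:
    """Assign reference numbers by first appearance, walking sections in the required order."""
    numbering: Dict[str, int] = {}
    for section in SECTION_ORDER:
        for key in citations.get(section, []):
            # A reference keeps the number from its first appearance.
            if key not in numbering:
                numbering[key] = len(numbering) + 1
    return numbering


if __name__ == "__main__":
    # Hypothetical citation keys for illustration only.
    cited = {
        "main_text": ["a", "b", "a"],
        "figures": ["c", "b"],
        "tables": ["d"],
    }
    print(number_references(cited))
```

In this example a reference cited only in a figure ("c") is numbered after every main-text and table reference, matching the clarification above.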

@tylerjereddy
Contributor

I converted the BibTeX data to EndNote and then went through and replaced all the in-text citations up to the end of the Discussion with EndNote refs (didn't do Tables, etc., yet). Of course, the conversion using JabRef wasn't perfect; mostly the online-only refs need to be cleaned up or maybe re-categorized. Anyway, with > 100 refs I wanted them tracked automatically if possible.
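
For readers curious what a scripted conversion might look like, here is a minimal sketch that writes RIS records, a plain-text format that reference managers such as EndNote can import. It is not what was done here (JabRef was used), it assumes the BibTeX entries have already been parsed into dictionaries, and the type mapping, field names, and sample entry are all hypothetical.

```python
from typing import Any, Dict

# Assumed mapping from a few BibTeX entry types to RIS reference types.
RIS_TYPES = {
    "article": "JOUR",
    "inproceedings": "CONF",
    "book": "BOOK",
    "online": "ELEC",
    "misc": "GEN",
}


def to_ris(entry: Dict[str, Any]) -> str:
    """Render one parsed bibliography entry as an RIS record."""
    # TY must be the first tag of a record; unknown types fall back to generic.
    lines = [f"TY  - {RIS_TYPES.get(entry.get('type', ''), 'GEN')}"]
    for author in entry.get("authors", []):
        lines.append(f"AU  - {author}")
    for tag, field in (("TI", "title"), ("PY", "year"), ("UR", "url")):
        if entry.get(field):
            lines.append(f"{tag}  - {entry[field]}")
    # ER closes the record.
    lines.append("ER  - ")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical entry for illustration only.
    print(to_ris({
        "type": "online",
        "authors": ["Doe, Jane"],
        "title": "Example page",
        "year": "2019",
        "url": "https://example.org/page",
    }))
```

Mapping online-only entries to a web-page type up front is the kind of re-categorization that otherwise has to be done by hand after import.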

NMETH_Oct1.tar.gz

@tylerjereddy
Contributor

NMETH_Oct2.tar.gz

The above tarball contains updated Word document & EndNote refs files. We now have an exact match to the number of references in the latest compiled PDF, since I've now gone through and manually dealt with adding EndNote refs to the Tables & Boxes at the end of the doc.

I've also started cleaning up the EndNote conversion a bit. There are still plenty more references that should be converted to "Webpage" from "Generic" so that the URL & Title get displayed instead of just the author. A few more author names need to be clarified as organizations by adding a comma after "Organization Name" so it doesn't get abbreviated to "Name, O."

I'll continue to chip away at this as free time permits.. others are also free to clean that up of course.

@tylerjereddy
Contributor

tylerjereddy commented Oct 4, 2019

NMETH_Oct3.tar.gz

Ok, the latest tarball above contains my full manual curation of the 147 references in EndNote. It's looking reasonably solid now, I think. There are still things that could be nit-picked, but if people have a problem with it they should probably fix it themselves!

A few accents have been replaced by plain letters (@stefanv's given name, for example). I think I managed to keep some in surnames, like Fernando's, since those are a bit more visible, but my main priority was to make the formatting look reasonable, not to achieve perfect letter fidelity.

Note that the new EndNote reference list is currently separate at the end of the Word doc. Presumably we can copy/paste it to the appropriate location before submitting back.

@tylerjereddy
Contributor

@mdhaber If there are things in the reference list at the very end (the EndNote-produced one) that you don't like, can you let me know & I'll do what I can.

@mdhaber
Contributor

mdhaber commented Oct 4, 2019

I'm not asking for any changes, but here's what I noticed:
Underscores are preceded by \.
Some seem to be missing article titles (e.g. 41, 79, 85, 86, 141)
112 has author Anaconda, I.
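
Two of these issues (escaped underscores and abbreviated organization names) could be handled in a small cleanup pass before import. The sketch below is an assumption-laden illustration, not the fix applied here: the function names and the set of organization names are hypothetical, and the trailing-comma convention is the one described earlier in this thread.

```python
from typing import Set


def unescape_underscores(text: str) -> str:
    """Turn LaTeX-escaped underscores (backslash + underscore) back into plain underscores."""
    return text.replace("\\_", "_")


def endnote_author(name: str, organizations: Set[str]) -> str:
    """Append a trailing comma to organization names so they are not abbreviated like personal names."""
    if name in organizations and not name.endswith(","):
        return name + ","
    return name


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(unescape_underscores(r"https://example.org/my\_repo"))
    print(endnote_author("Example Org", {"Example Org"}))
```

The organization list has to be curated by hand, since nothing in the name itself reliably distinguishes a company from a person.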

@tylerjereddy
Contributor

Thanks Matt -- I'll see what I can do tomorrow.

@mdhaber
Contributor

mdhaber commented Oct 4, 2019

@tylerjereddy Re: "important anecdotes" box - maybe let's submit without any more changes for the sake of word removal. If we need to cut out another few hundred words, this sounds like something to consider. But since this is where Numeric is introduced and other changes might be needed, I'd wait on it unless we really really have to.

@tylerjereddy
Contributor

agreed -- I'll try to touch up what you mention & hopefully the Figures are getting close now

@tylerjereddy
Contributor

NMETH_Oct4.tar.gz

Ok, I think I dealt with the requested changes in the above files--spurious URL underscores gone in the EndNote ref list, ref 112 cleanup, dismiss suggestion about extra box.

Note that the missing titles you mention seem to be a "feature" of the Nature style used by EndNote. If you'd like to suggest another style, feel free & we can switch. I suspect each may have its drawbacks. I do see at least some "book" entry examples in Nature Methods publications where there is just the "In collection" type entry with no title proper for the work.

I haven't looked around for a Nature Methods specific EndNote style--not sure if different from Nature one I'm using, or if we really need to fret about that...

@tylerjereddy
Contributor

Mostly conference proceedings that get title suppression.

@mdhaber
Contributor

mdhaber commented Oct 4, 2019

OK. Yeah, if it's their preferred style, there's not much to be done.

@mdhaber
Contributor

mdhaber commented Oct 6, 2019

Do we need to track changes to show Rita what has changed, or can we accept everything? Did you want to discuss any of the changes I made or should I accept them all? All the reference changes look fine.

It looks like you've updated the references in Table 1?
I don't think we need to worry about formatting Table 1 to a single page. The information is there, and they will be formatting it.

@tylerjereddy
Contributor

Accept your own changes, I think. Everything including Table 1 should be using EndNote refs now & should be up-to-date. Hopefully we don't need to show another diff on the document, given that these are Editorial adjustments...

@tylerjereddy
Contributor

Are you doing touch-ups now, Matt? I will maybe try to read through in a few hours or tomorrow at the latest.

@mdhaber
Contributor

mdhaber commented Oct 6, 2019

OK, if you are OK with the changes so far I'll accept them all and do a read-through shortly, tracking only new changes.

@mdhaber
Contributor

mdhaber commented Oct 7, 2019

After accepting all changes:
scipy_collaborative_edit_NM-2019-10-6-6-30.docx

I added a Code Availability statement and modified the data availability statement, intending to make it more specific. I'm not happy with it, though - it should be rewritten following one of the examples in:
https://www.nature.com/documents/nr-data-availability-statements-data-citations.pdf

I haven't done a read-through; I've been working through forms linked at the bottom of #264. I don't think I'll touch the document again tonight; instead I'll start the "point-by-point response to the remaining editorial and reviewer point" required in the Publishing Policy worksheet.

@mdhaber
Contributor

mdhaber commented Oct 7, 2019

Just noticed that the original decision email said:
"Please upload the text (in Microsoft Word format, version 2010 or earlier) and figures using the link provided below."

Can't convert to a .doc because the equations will turn to bitmaps, so I'm not sure how we're supposed to do this.

@ilayn
Member

ilayn commented Oct 7, 2019

That's probably the guideline being too old. You don't have to worry about following the guidelines this strictly. Their copy editing office will retype the whole thing anyways in their own way. Something in the ball park of the final product is sufficient.

@tylerjereddy
Contributor

I've been saving in "compatibility mode" FWIW. Ok, I'll try to read through the manuscript now, which will hopefully be my "final pass." Thanks for working on the forms--I obviously missed that we needed a formal rebuttal letter.

@mdhaber
Contributor

mdhaber commented Oct 7, 2019

Unfortunately, I didn't save in compatibility mode : / It told me it was updating to latest version and I accepted before I noticed the requirement : /

@tylerjereddy
Contributor

scipy_collaborative_TR_oct_7.docx

I've read through the entire 31-page Word document now & made only microscopic adjustments using track changes in the above attached file.

Overall, it reads well and I think we've managed to fill in the gaps left by the pandoc conversion & editorial refactoring now.

@mdhaber
Contributor

mdhaber commented Oct 7, 2019

scipy_collaborative_TR_oct_7b.docx

Besides a few micro-edits, I removed the third paragraph of the Linear Optimization section. Since I'm probably the only person who would care, it seems like an easy way to remove 108 words.

I also noticed that quotes were added around "version added" and "wrapper", but not around any of the other fields in Table 1. I accepted the changes, but I think this should be consistent either way.

If you accept the changes and address the quotes in the Table I caption, I think the resulting document can be submitted.

4 participants