word2asciidoc is a Python toolkit for converting Word documents to AsciiDoc files. It uses Pandoc for the initial conversion, then applies a set of post-processing scripts that refine the output, improving the formatting, readability, and compatibility of the final AsciiDoc document.
- Image Conversion: Converts EMF images to PNG and updates document references.
- Character Escaping: Escapes double angle brackets to avoid tag misinterpretation.
- Note Styling: Enhances the visibility of note boxes.
- Bibliography Anchors: Adds anchors to bibliography entries for in-document references.
- Caption Fixing: Ensures image captions use the correct block tag.
- Bracket Escaping: Prevents misinterpretation of square brackets by AsciiDoc processors.
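To illustrate the kind of post-processing listed above, here is a minimal sketch of the character-escaping step. It is not the package's actual implementation: the function name `escape_double_angle_brackets` is hypothetical, and it assumes that prefixing `<<` with a backslash is the desired escape so that AsciiDoc processors do not read the text as a cross-reference.

```python
import re

# Matches "<<" that is not already preceded by a backslash.
_UNESCAPED_OPEN = re.compile(r"(?<!\\)<<")


def escape_double_angle_brackets(text: str) -> str:
    """Prefix each unescaped '<<' with a backslash (hypothetical helper)."""
    # In the replacement string, r"\\<<" yields a single literal backslash
    # followed by "<<".
    return _UNESCAPED_OPEN.sub(r"\\<<", text)
```

Because already-escaped occurrences are skipped, running the function twice on the same text leaves it unchanged.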
For conversion through the issue tracker:
- Visit the issue creation page and select the "Transform Word Document to AsciiDoc" template.
- Upload your Word file and submit the issue.
- An AsciiDoc file will be generated from your Word file, and the package scripts will be applied to it.
- Once the conversion is complete, you'll receive a notification.
- Download the converted AsciiDoc file from the link in the notification comment.
- Make manual adjustments as needed (see Manual Adjustments).
For local conversion:
- Install Pandoc.
- Create a directory for your Word file, AsciiDoc output, and media.
- Convert your document with Pandoc:
pandoc [YOUR_DOCUMENT].docx -f docx -t asciidoctor --wrap=none --markdown-headings=atx --extract-media=[MEDIA_DIRECTORY] -o [OUTPUT_DIRECTORY]/[YOUR_DOCUMENT].adoc
- Install word2asciidoc using pip:
pip install git+https://github.com/admin-shell-io/word2asciidoc.git@master
- Apply the word2asciidoc scripts to refine the AsciiDoc file with the following command:
fix_adoc --adoc_input /path/to/input.adoc --adoc_output /path/to/output.adoc
If you encounter issues, try running the script as a module:
python -m word2asciidoc.fix_adoc --adoc_input /path/to/input.adoc --adoc_output /path/to/output.adoc
- Make manual adjustments as needed (see Manual Adjustments).
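The local conversion steps above can also be scripted. The following is a rough sketch rather than part of the package: the helper names (`pandoc_command`, `fix_adoc_command`, `convert`) and the `.fixed.adoc` output suffix are assumptions made for illustration. It uses only the Pandoc options and the `fix_adoc` arguments shown in the steps, and it calls the script in its module form.

```python
import subprocess
import sys
from pathlib import Path


def pandoc_command(docx: Path, output_dir: Path, media_dir: Path) -> list[str]:
    """Build the Pandoc call from the steps above as an argument list."""
    adoc = output_dir / (docx.stem + ".adoc")
    return [
        "pandoc", str(docx),
        "-f", "docx", "-t", "asciidoctor",
        "--wrap=none", "--markdown-headings=atx",
        f"--extract-media={media_dir}",
        "-o", str(adoc),
    ]


def fix_adoc_command(adoc_input: Path, adoc_output: Path) -> list[str]:
    """Build the module-form fix_adoc call, using the current interpreter."""
    return [
        sys.executable, "-m", "word2asciidoc.fix_adoc",
        "--adoc_input", str(adoc_input),
        "--adoc_output", str(adoc_output),
    ]


def convert(docx: Path, output_dir: Path, media_dir: Path) -> Path:
    """Run both steps; raises CalledProcessError if either command fails."""
    output_dir.mkdir(parents=True, exist_ok=True)
    raw = output_dir / (docx.stem + ".adoc")
    # Hypothetical naming choice: keep the raw Pandoc output next to the fixed file.
    fixed = output_dir / (docx.stem + ".fixed.adoc")
    subprocess.run(pandoc_command(docx, output_dir, media_dir), check=True)
    subprocess.run(fix_adoc_command(raw, fixed), check=True)
    return fixed
```

Calling `convert(Path("report.docx"), Path("out"), Path("out/media"))` would require both Pandoc and word2asciidoc to be installed. Writing the refined output to a separate file keeps the raw Pandoc result available for comparison during the manual review.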
After running the scripts, manually check [YOUR_DOCUMENT].adoc for any unresolved issues. Refer to the AsciiDoc User Manual for guidance.
Common manual tasks include:
- Table of Contents: Remove or adjust the placement.
- Document Attributes: Set up necessary AsciiDoc attributes at the beginning of the file.
- Table Formatting: Ensure tables have headers and are formatted correctly.
- Image Text Flow: Verify that text flows correctly around images.
Example of setting document attributes:
:toc: left
:toc-title: Table of Contents
:stylesheet: style.css
:favicon: favicon.png
:nofooter:
= Document Title
:author: Your Name
:revnumber: 1.0
:revdate: January 2023
:revremark: Initial release
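If the same attributes are needed across several converted documents, the header could be generated instead of typed by hand. This is only a sketch under assumptions: `render_header` and `prepend_header` are hypothetical helpers, not part of word2asciidoc, and the sketch places the title first followed by all attribute entries, which is one valid header layout among others.

```python
def render_header(title: str, attributes: dict[str, str]) -> str:
    """Render a document title plus attribute entries as AsciiDoc header text."""
    lines = [f"= {title}"]
    for name, value in attributes.items():
        # An empty value produces a bare flag such as ":nofooter:".
        lines.append(f":{name}: {value}".rstrip())
    # A blank line ends the header.
    return "\n".join(lines) + "\n\n"


def prepend_header(adoc_text: str, title: str, attributes: dict[str, str]) -> str:
    """Put the rendered header in front of existing AsciiDoc text."""
    return render_header(title, attributes) + adoc_text.lstrip("\n")
```

For example, `render_header("Document Title", {"toc": "left", "nofooter": ""})` produces the title line, a `:toc: left` entry, and a bare `:nofooter:` entry. Any title or table of contents that Pandoc already emitted would still need to be removed by hand.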
This project is licensed under the Apache License 2.0. See the LICENSE file for details.