Microsoft Word File Import/Export (Book)

Book tools ::: booktool_wordimport
Maintained by Eoin Campbell
Import the contents of a Microsoft Word file into a book, splitting it into chapters and (optionally) subchapters, based on the heading styles. The file can be saved from Microsoft Word, Google Docs or LibreOffice, as long as it has a '.docx' suffix. Also supports exporting books to Word format, for round-trip editing.
8106 sites
1k downloads
118 fans
Current versions available: 3

This plugin supports importing a Microsoft Word docx-formatted file as chapters to a book. The file is split into chapters and subchapters based on the built-in heading styles "Heading 1" and "Heading 2" in Word. Embedded images are also imported if they are in web-compatible format (GIF, PNG, JPEG).
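
As a rough illustration of heading-based splitting (not the plugin's actual code, which relies on the PHP XSL extension), the sketch below groups the paragraphs of a `word/document.xml` part under "Heading 1" and "Heading 2" paragraphs. It assumes the paragraph style IDs are `Heading1` and `Heading2`, as they usually are in English-language Word documents; the function name and the dictionary layout are hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx file.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def split_into_chapters(document_xml, use_subchapters=True):
    """Group paragraph text under Heading 1 / Heading 2 paragraphs.

    Assumption: the style IDs are "Heading1" and "Heading2"; localised
    versions of Word may use different IDs.
    """
    root = ET.fromstring(document_xml)
    chapters = []
    for para in root.iter(W + "p"):
        style = para.find(W + "pPr/" + W + "pStyle")
        style_id = style.get(W + "val") if style is not None else None
        # A paragraph's text may be spread over several runs.
        text = "".join(t.text or "" for t in para.iter(W + "t"))
        if style_id == "Heading1":
            chapters.append({"title": text, "subchapter": False, "body": []})
        elif style_id == "Heading2" and use_subchapters and chapters:
            chapters.append({"title": text, "subchapter": True, "body": []})
        elif chapters:
            # Body text (or an unsplit subheading) joins the current chapter.
            chapters[-1]["body"].append(text)
        # Anything before the first "Heading 1" is skipped in this sketch.
    return chapters
```

With `use_subchapters=False`, "Heading 2" paragraphs simply stay inside the enclosing chapter's body.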

It imports .docx files only, not the older .doc format. Files in .docm format (i.e. including macros) are not supported either.
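
Before uploading, a file can be checked against these constraints with a short script. The following is a minimal sketch, not part of the plugin: it assumes that a .docx file is a Zip container whose main part is `word/document.xml` and whose embedded images conventionally sit under `word/media/`. The function name and the messages are hypothetical.

```python
import zipfile
from pathlib import PurePosixPath

# Image suffixes treated as web-compatible (GIF, PNG, JPEG).
WEB_IMAGE_SUFFIXES = {".gif", ".png", ".jpg", ".jpeg"}


def preflight_docx(path):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not str(path).lower().endswith(".docx"):
        # Covers .doc and .docm as well as anything else.
        problems.append("file suffix is not .docx")
        return problems
    if not zipfile.is_zipfile(path):
        problems.append("file is not a Zip container, so it is not a valid .docx")
        return problems
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
    if "word/document.xml" not in names:
        problems.append("word/document.xml is missing")
    for name in names:
        # Assumption: embedded images are stored under word/media/.
        if name.startswith("word/media/"):
            suffix = PurePosixPath(name).suffix.lower()
            if suffix not in WEB_IMAGE_SUFFIXES:
                problems.append(f"image {name} is not GIF, PNG or JPEG")
    return problems
```

A call such as `preflight_docx("handbook.docx")` (a hypothetical file name) returns the list of problems, so an empty list suggests the file is at least structurally suitable for import.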

Both Google Docs and LibreOffice 5.x can also save files in .docx format, and these will import too, but the results are generally not as good as for files saved from Word itself, even if the document uses the same built-in "Heading 1" and "Heading 2" styles. Your mileage may vary.

Note also that the PHP XSL extension must be enabled on your webserver, and the plugin requires Moodle 2.7 or higher.

After installation, the Book administration menu should have a new item added, similar to the screenshot below. You must create a new book, or turn on editing in an existing book, to see this menu.

Book administration menu

The plugin can also be used to export books, or chapters from books, back into Word .doc format. In general, heading elements in HTML are converted back into the corresponding heading styles in Word, and likewise for other styles. The exported file must first be opened in Microsoft Word. You can then save it in .docx format and edit it in LibreOffice or Google Docs. If the exported file contains images, you must use Word 2019 or higher, or Word 365 (web version), for the images to open correctly.
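
In the comments further down, the maintainer explains that the exported file is really XHTML saved with a ".doc" suffix, which Word then interprets into its native format. The sketch below illustrates that idea only; it is not the plugin's code, the real export almost certainly carries more markup than this, and the function names and the `(level, title, paragraphs)` tuple layout are hypothetical.

```python
from html import escape


def book_to_xhtml(book_title, chapters):
    """Build a simple XHTML document from (level, title, paragraphs) tuples.

    level 1 is rendered as <h1> (a chapter) and level 2 as <h2> (a subchapter).
    """
    parts = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<html xmlns="http://www.w3.org/1999/xhtml">',
        f"<head><title>{escape(book_title)}</title></head>",
        "<body>",
    ]
    for level, title, paragraphs in chapters:
        parts.append(f"<h{level}>{escape(title)}</h{level}>")
        for paragraph in paragraphs:
            parts.append(f"<p>{escape(paragraph)}</p>")
    parts.extend(["</body>", "</html>"])
    return "\n".join(parts)


def write_export(path, xhtml):
    """Write the XHTML to disk; the caller chooses a path ending in .doc."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(xhtml)
```

For example, `write_export("book.doc", book_to_xhtml("My book", [(1, "Introduction", ["First paragraph."])]))` produces a file that Word should open with the chapter title shown as a heading.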

Contributors

Eoin Campbell (Lead maintainer)

Comments
  • Iván Castiblanco Ramírez
    Wed, 11 Aug 2021, 10:18 PM
    Ok, thanks!!!
  • Iván Castiblanco Ramírez
    Thu, 24 Mar 2022, 2:38 AM
    Hi, When I import a .docx it converts paragraph text to Small Caps... any idea how to fix this?
  • Eoin Campbell
    Fri, 25 Mar 2022, 6:57 AM
    Hi Iván, this problem typically arises if the paragraph style definition specifies one format, but it is overridden with a different format in a specific paragraph using that style. For example, the style definition for a "Quotation" paragraph might be italic by default, but this is overridden with bold. Reset the text formatting in the paragraph by selecting all the text and pressing +to clear the formatting.
  • Josh Lim
    Thu, 30 June 2022, 7:26 PM
    Hi Eoin! Thanks for the plugin: it looks really useful! I've tested it out and have a few concerns around accessibility about the conversion to html.

    1) The alt-text gets put in the longdesc attribute (which is deprecated) rather than alt. This has a few issues: users can't edit alt-text through the Atto editor (but must edit the html instead), accessibility checkers like Brickfield won't see it, and some screenreaders (e.g. Chromevox) don't handle longdesc well.

    2) Tables: these get imported without header row tags (even if this is set in the Word document originally).

    3) Tags everywhere! In Word, I use a 14pt font as my normal style (the default 11pt in Word is smaller than recommended for print accessibility) - this formatting gets applied to each paragraph in a tag, but sometimes for no discernible reason each word in a sentence gets its own tag. This has a few potential issues:
    a) with a tag per word, it can make screenreader navigation cumbersome.
    b) Sometimes my screenreader skips the image (potentially because there are so many nested formatting instructions: centre, font size, etc.). I'm not really sure why this happens, but it is not an issue with content authored directly with the Book Activity.
    c) it makes the html very long and difficult to edit.

    4) Bulleted/Numbered lists get removed. At least this issue is apparent to the author, but bulleted/numbered lists are important in supporting accessibility.
  • Eoin Campbell
    Thu, 30 June 2022, 7:51 PM
    Hi Joshua, many thanks for your comments, most of which are true. However, regarding item 2) Tables, heading row elements (.......) should be included if you have set the "Repeat Header Rows" flag in the Table Tools Layout ribbon menu (visible when the cursor is inside a table). Regarding item 4) Bulleted/Numbered lists, list paragraphs must use the "List Bullet" or "List Number" styles. By default, Word applies a bullet or number to a "Normal" or "Body Text" style paragraph, and the conversion tool isn't smart enough to figure this out. Note also that nested lists (using "List Bullet 2") are not supported. I will try to fix items 1) and 3) at some point. 3) is a bit tricky because I try to support some simple visual formatting like coloured text, but don't do a good enough job at stripping out font-size style directives.
  • Eoin Campbell
    Thu, 30 June 2022, 7:53 PM
    heading row elements (<table><thead><tr><th...</th><th>...</th></tr></thead><tbody>....</tbody></table>)
  • Alex Williams
    Fri, 29 July 2022, 11:40 AM
    Hi Eoin,
    I posted a comment to https://github.com/ecampbell/moodle-booktool_wordimport/issues/8 about a potential continuation of the same problem in v1.4.11 (2021083100) of this plugin. The exported file shows broken images when opened in current (2022) desktop versions of Word on Windows and Mac. The issue isn't occurring 100% of the time, but reliably enough for the majority of embedded images in Book chapters. Any tips for troubleshooting this at all?
    Thanks,
    Alex
  • Eoin Campbell
    Fri, 29 July 2022, 4:18 PM
    Hi Alex, this looks like an incompatibility introduced in the newest version of Word. Have you been able to test with an older version? Regarding troubleshooting, things to check are a) that the image is actually embedded into the Book, and not a reference to an image stored somewhere else; b) the image file suffix is "standard", like png, jpg, jpeg, gif; and c) the image format is web-compatible (i.e. not BMP or WMF). You could try saving the book as an ePub (there's a "Download as ebook" option in the Book administration menu), and checking that the image appears when read in an ePub Reader, or by opening the ePub file in a Zip reader to check that the images are present and have the right suffixes. If you backup the Book to a Moodle course backup file and send it to me, I can take a closer look.
  • Alex Williams
    Fri, 5 Aug 2022, 12:29 PM
    Hi Eoin, Thanks for the follow-up and suggestions. Have you been able to confirm whether this is actually an incompatibility with the newest Word release versions? I am unfortunately not able to test in older versions (My Word license/access is corporate, so is managed outside my control). Regarding A) The images should have been directly embedded. I replicated the problem using ATTO to simply browse for an image and upload it. I did not use any external repository and the same method worked for PNG and GIF formats, just not reliably for JPG/JPEG. For B) The file extensions used to test and replicate were standard. I tested with multiple common formats (PNG, JPG, JPEG and GIF). For C) no web-incompatible formats were used and D) at this time I am not at liberty to authorise the release of a backup containing the Book activities. The Moodle sites I administer and have tested within are not owned/maintained by myself.
  • Malcolm Green
    Wed, 7 Dec 2022, 1:23 AM
    I would like to create a side index for my book based on H2 headings, and from the documentation it seems that this should be possible, but I'm not seeing it. Do I need to enable this in the settings?
  • Eoin Campbell
    Wed, 7 Dec 2022, 5:32 PM
    Hi Malcolm, you can create subheadings based on the Word "Heading 2" style (in addition to headings based on the "Heading 1" style), by checking the "Create subchapters based on subheadings" box when importing the Word file.
  • Malcolm Green
    Wed, 7 Dec 2022, 10:19 PM
    Thanks Eoin. I have tried checking the box before importing the Word file but I'm still only seeing the topic title in the side menu. I'm using Moodle 4, is that version supported?
  • Eoin Campbell
    Wed, 7 Dec 2022, 11:53 PM
    Hi Malcolm, the import works on all versions up to 4.1. I wonder have you formatted your Word file correctly to use the "Heading 2" style? You also need to have at least 1 "Heading 1" style before any "Heading 2" styles, or nothing gets imported.
  • Eliot Hoving
    Mon, 2 Oct 2023, 5:23 PM
    Hi Eoin, we really appreciate this plugin. I've had some questions from our end users which it would be great to get your responses to. Is there a reason this plugin exports to .doc instead of .docx? Have you thought of making the plugin available to students, i.e. allowing students to export Moodle books to Word (as an improvement on the print to PDF they can currently do, which doesn't allow them to edit the content for accessibility reasons)?
  • Eoin Campbell
    Tue, 3 Oct 2023, 7:28 PM
    Hi Eliot, the plugin actually exports an XHTML file, not a native Word file. By using the ".doc" suffix, it fools Windows into opening the file using the Microsoft Word app, which then interprets the HTML content into native Word format. Saving as ".docx" or ".htm" would not achieve this effect.
    I did investigate allowing students to download as Word a while ago, but it required a bit more work than I thought, so I put it aside. I might take another look.