OOXML Hacking: Editing in macOS

Note: I’ve included the original article text to describe the background issues about XML editing in macOS, but to retain your sanity, be sure to follow the May 2016 and July 2018 updates at the end and use a text editor that doesn’t require unzipping and rezipping the files.

When you’re hand-editing Office files in Windows, it’s pretty straightforward: unzip file > edit > rezip and you’re done. Editing in macOS requires a couple of extra precautions. This is because the graphical user interface adds Mac attributes to files and plants hidden files in folders. Office will not tolerate either of these:

Editing in macOS - The Open XML file cannot be opened because there are problems with the contents. Details The file is corrupt and cannot be opened.

XML error message in 2008


Editing in macOS - The Open XML file cannot be opened because there are problems with the contents or the file name might contain invalid characters (for example, \/). Details The file is corrupt and cannot be opened.

XML error message in 2011


Editing in macOS - The Open XML file cannot be opened because there are problems with the contents or the file name might contain invalid characters (for example, \/). Details The file is corrupt and cannot be opened.

XML error message in 2016

If you use macOS’s Archive Utility to unzip or zip the files, Word will refuse to open the resulting file. On top of that, if you look in any of the folders using the Finder, a hidden .DS_Store file will be created in the folder. When re-zipped, Word will not accept the extra file and again report an XML error. The solution to these issues is to use the command line, like the Unix warrior you want to be! Remember to run each Terminal command by pressing the Return key after typing the command.

A valuable utility for this is OpenTerminalHere. Open any Finder window, click on OpenTerminalHere and a terminal window opens pointed to the Finder window. So download and install it, then follow these steps to open, edit and re-zip Office files:

  1. Move a copy of the Office document (let’s call it TestDoc.docx) to a separate folder and open that folder in the Finder.
  2. Click on OpenTerminalHere to open a copy of Terminal aimed at the folder.
  3. In the Terminal, type
    unzip TestDoc.docx

    then press Return. The file is unzipped into several folders plus a file called [Content_Types].xml.

  4. Do not look in any of the folders using the Finder, or you’ll have to start over. To examine a folder’s contents, use the Terminal to change the folder, then list the contents:
    cd word

    ls -l
  5. To go back up to the previous folder, type:
    cd ..
  6. To edit the files, open your text editor, then navigate using the File>Open dialog to find the file. Edit the file, then save and close.
  7. When you’re all done, double-check that Terminal is pointing at the original folder holding the documents and the expanded folders. If you’re unsure, close Terminal, then click on OpenTerminalHere to reopen in the right spot.
  8. In Terminal, re-zip the files with this style of command:
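    zip -r RevisedDoc.docx '[Content_Types].xml' _rels docProps word

    This example is for Word, but the correct syntax after zip -r is to type the name of the final document, followed by the file and folders, each separated by a space. Keep the same file extension as the document you unzipped. The quotes around [Content_Types].xml stop the shell from treating the square brackets as a wildcard pattern. For PowerPoint or Excel files, the main folder is ppt or xl instead of word. The file is reassembled into an Office file.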

  9. Test that you can open it. If you get an XML error notice, re-read the above steps and try again.

Please note: these editing techniques are required when editing in macOS with Word, PowerPoint and Excel documents and templates, plus Office Theme files (the kind exported from PowerPoint that combine all Theme elements).

If, on the other hand, you are editing a Font Theme or a Color Theme, those are simple XML files. They don’t need to be unzipped or re-zipped and Office doesn’t seem to care about macOS attributes attached to them. These plain XML files don’t need to be handled through the terminal, just use the Finder.

Next time, we’ll be looking at managing Word styles in macOS. Finally, a way to get rid of the zombie styles automatically created by Word! Happy hacking!


March 2016 Update

An alternative to working entirely in Terminal is to work on a network or USB disk where creation of .DS_Store files has been turned off. On a network disk, open Terminal in your choice of folder and run the command:

defaults write com.apple.desktopservices DSDontWriteNetworkStores true

To use a USB disk, run this command instead:

defaults write com.apple.desktopservices DSDontWriteUSBStores true

While this will prevent future generation of the .DS_Store files in that folder and any subfolders, it’s very likely you already have such files, since they’re created almost as soon as you view a folder’s contents in the Finder. In addition, some important XML parts are hidden and need to be revealed. So while Terminal is open, run:

defaults write com.apple.finder AppleShowAllFiles YES

followed by:

killall Finder

The second command restarts the Finder to force a refresh of the view. Now you can see any .DS_Store files and delete them before re-zipping the files into an Office document. You’ll still have to do the zipping in Terminal. Also, no .DS_Store files means OpenTerminalHere doesn’t work, so you’ll have to navigate manually via Terminal commands. Now you know why this is a lame alternative.

If you try this technique, you can always restore the clean file view by running:

defaults write com.apple.finder AppleShowAllFiles NO
killall Finder

Once you’ve created this OOXML editing drive, you can use the command-line zip utility to unzip the files. But there’s also a very useful GUI utility that works better than Archive Utility with Office files. Visit the App Store and get The Unarchiver. Then use it to unzip and expand the Office file.
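Any .DS_Store files that slipped into the expanded folders can also be removed from Terminal instead of by hand in the Finder. A minimal sketch, assuming the function name clean_ds_store (made up for this example) and that you pass it the folder holding the unzipped parts:

```shell
# Hypothetical helper: print, then delete, every .DS_Store file
# under the given folder (defaults to the current folder).
clean_ds_store() {
    dir=${1:-.}
    find "$dir" -type f -name '.DS_Store' -print -delete
}
```

Each deleted file is printed as it goes, so an empty result means the folder tree was already clean.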


Editing in macOS – May 2016 Update

BBEdit 11 and later can open and edit Office files directly, avoiding all of the above hassle when editing in macOS. BBEdit has a 30-day free trial with all features included. While older versions of BBEdit used Tidy to format text, that utility has been retired. The BBEdit programmers have written a script to format XML in human-readable form. You can download it here; please be sure to read the installation instructions first: Click to download XML Tidy Script for BBEdit

Here’s your working procedure:

  1. Open your Office file in BBEdit 11 or later. In the left-hand pane, you’ll see a folder tree of the files contained within, so no unzipping is required.
  2. Select the file you want to edit. The file opens in the main BBEdit window, displaying two lines. The first is the XML header, the second is the actual content.
  3. Click at the left end of the second line.
  4. Choose Text>Apply Text Filter>run_tidy.
  5. Make your edits and save. It’s not necessary to linearize the XML. The Office program will do that anyway the first time you save it. However, if you like to leave things exactly the way you found them, click in front of the first line of content (after the header line), choose Markup>Utilities>Format…, change the Mode to Compact and click on the Format button. Save the file and test your editing in macOS.


Editing in macOS – July 2018 Update

Technology marches on! If you use the Chrome browser, there is a free XML editing alternative that avoids unzipping and rezipping files. Open this link in Chrome: OOXML Tools and download the free plugin. After installation, click on the OOXML icon to the right of the browser address bar. Drag your Office files onto the browser window to begin editing. When you’re finished, click on the Save button, then the Download button in the upper left corner and give the new file an appropriate name. Chrome will place the new file in your Downloads folder and leave the original file untouched. The EMF/WMF bug in OOXML Tools has been fixed, so download the most recent version. Thanks to Bram Alkema of the Netherlands for informing us about OOXML Tools.

Please note, for any OOXML Hacking that requires adding new XML parts (Ribbon mods, creating SuperThemes), BBEdit and OOXML Tools will not work. You’ll have to use the March 2016 update solution and create a network or USB disk set up for XML editing.

We’re experts in XML hacking, so you don’t have to be. Contact me at production@brandwares.com with the details of what you need hacked.


19 thoughts on “OOXML Hacking: Editing in macOS”

    • Your document had several mc:Fallback errors, which are caused by a Word bug. I fixed those.

      Then I noticed that 2 mc:Fallback tags had a number of parameters added to them that Word couldn’t parse. I removed the parameters and the document opened in Word. If you added the custom parameters, that never works in Office programs. It has a very strict parser, so custom parameters will be ignored at best, or cause the file to be unopenable at worst. Here is the repaired document: Repaired Document

  1. Thank you for this, John. I feel I’m close, but when I follow your instructions to edit normal.dotm (I’m trying to adjust the priority of my heading style) I get an XML file that doesn’t work. Should it work, or is there something special about a .dotm?

    • By default, Heading styles appear after Normal in the Style Gallery. The XML looks like this:

      <w:lsdException w:name="heading 1" w:semiHidden="0" w:uiPriority="9" w:unhideWhenUsed="0" w:qFormat="1"/>

      To move it before Normal, give it the same w:uiPriority value as Normal. Because Word lists styles with the same priority in alphabetical order, Heading 1 will pop before Normal:

      <w:lsdException w:name="heading 1" w:semiHidden="0" w:uiPriority="0" w:unhideWhenUsed="0" w:qFormat="1"/>

      Especially when trying out new operations with XML, make just one change at a time, then rezip and test the file in Word. Otherwise it’s very difficult to pinpoint an error.

      If this doesn’t answer your question, please post more details, any error messages or other symptoms that may help diagnose the issue. A simple “doesn’t work” doesn’t tell me enough. Thanks!
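For anyone working on unzipped files in Terminal, the single change described above could also be made with sed. This is only a sketch: the function name set_heading1_priority is invented for the example, and it assumes the heading 1 exception already carries a w:uiPriority attribute and is written with straight quotes, as in a real styles.xml.

```shell
# Hypothetical helper: set w:uiPriority on the "heading 1" latent style.
# Writes to a temporary file first, because in-place editing flags
# differ between the macOS and GNU versions of sed.
set_heading1_priority() {
    styles=$1   # path to the unzipped word/styles.xml
    prio=$2     # new priority value, e.g. 0
    sed -E 's/(<w:lsdException w:name="heading 1"[^>]*w:uiPriority=)"[0-9]+"/\1"'"$prio"'"/' \
        "$styles" > "$styles.tmp" && mv "$styles.tmp" "$styles"
}
```

For example, set_heading1_priority word/styles.xml 0 changes only that one attribute, which fits the advice to make one change at a time and then test.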

  2. This has been very helpful. I am currently working on automating a ppt report which has embedded xls files in it. Currently I unzip, write new data to xls files, zip. I can open the ppt with no warnings, but my embedded xls tables are not updated. If I double click on the table, it opens the excel file and shows the updated values. I close excel and then the table opens. Is it possible to get ppt to notice changes have been made?

    • Your method bypasses the ways by which PowerPoint finds that data has been updated. It would be possible to write a macro in a macro-enabled presentation to do the updating. If you saved that as an add-in, the updater could run automatically when you open a presentation. Otherwise, you’ll have to manually update as you’ve been doing.

      • Thanks for the quick reply. I was looking into a macro but was concerned it might raise security warnings on opening. Do you think this will be a problem? Could I unzip again and remove the macro after running it? If you have any resources to point me to, that would be great! Thanks again!

        • In Windows versions of Word, it’s possible through Office settings, or by using a digital certificate, to eliminate the macro warning after the first time you open it. In Mac versions, you are always warned. Also, writing a PowerPoint for Mac macro is a difficult beginner project. But if you’re the only one using it, you can just ignore the warning.

          When I’m searching for info, adding “powerpoint” and “vba” in quotes to the search terms gets me more useful information. There are a number of examples of others doing similar things, so you should be able to find useful code with a search.

          Useful VBA reference pages include the PPTFAQ page, about halfway down, Shyam Pillai’s VBA Reference page, the OfficeOne VBA page and PowerPoint Alchemy. We can also program it for you, if the learning curve is too steep.

  3. Hello
    Is it possible to hide a file, a directory or vbaProject.bin in an OOXML Office file without causing an error when it is opened?

    Please answer me by email :
    Email Address Removed

    Thanks

    • It’s not possible to hide a VBA macro file. That would introduce a major security breach.

      It might be possible to include content in a custom XML part, but I haven’t done any serious research on that yet.

  4. Thank you for the OOXML Tools link. I was trying to find fonts in a PowerPoint file (on a Mac) to eliminate some save errors. I learned that I could search the XML to find and replace the font references, but PowerPoint for Mac does not export to XML.

    • Editing on a Mac does present challenges that don’t exist in Windows. The lack of the XML save format is one. The tendency of macOS to write hidden .DS_Store files in every viewed directory is another.

      This article looks at these problems and how to solve them: OOXML Hacking: Editing in macOS. If you work on PowerPoint files on a regular basis, setting up either a USB or network disk that doesn’t create .DS_Store files will make life easier. The Unarchiver utility makes it a breeze to unzip Office files. The other part of the puzzle to replace fonts is a good text editor that can find and replace in a folder full of files. The free version of BBEdit will do that. Best of luck!
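If you prefer Terminal to a text editor for this, the font hunt can be sketched in shell. The function names list_typefaces and replace_typeface are made up for illustration, and the sketch assumes fonts appear as typeface="..." attributes in the unzipped XML and that the font names contain no characters special to sed, such as / or &.

```shell
# Hypothetical helper: count each typeface="..." reference under a folder.
list_typefaces() {
    dir=${1:-ppt}
    grep -rhoE 'typeface="[^"]*"' "$dir" | sort | uniq -c | sort -rn
}

# Hypothetical helper: replace one typeface name with another in every
# file under the folder that mentions it.
replace_typeface() {
    dir=$1
    old=$2
    new=$3
    grep -rlF "typeface=\"$old\"" "$dir" | while IFS= read -r f; do
        sed "s/typeface=\"$old\"/typeface=\"$new\"/g" "$f" > "$f.tmp" &&
            mv "$f.tmp" "$f"
    done
}
```

Running list_typefaces ppt first shows which font names are actually in the file, so you know what to replace before changing anything.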

  5. John, huge thanks for this article. I spent several hours, several times over the past year, trying to come up with a way to generate properly styled Word docs from a script with dynamic content. The OOXML Tools extension is a godsend, it just works! I’m very excited to start collecting on the time savings in my workflow with this method. Thank you!

  6. John, why should anyone even consider using OpenTerminalHere instead of simply using the standard Terminal app created by Apple that already comes with MacOS by default? I’m unclear regarding what major advantage such a third-party program would theoretically offer.

    • OpenTerminalHere is a simple AppleScript that opens a standard Terminal window at the selected folder. It is not a replacement for Terminal. When the article was written in 2016, macOS didn’t have the ability to right-click on a folder to open a Terminal window.

  7. Thank you so much! This was the only resource I could find anywhere about accessing these types of files on a Mac. You saved the day!

  8. Hi John
    OSX and Powerpoint 16.88.1

    I’m trying to make it work but apparently no changes to tableStyles.xml reflect on the file itself.
    Specifically I deleted all the tablestyles but one, although when I open my pptx all the default ones are still there and can’t see my own that I changed the name of.
    I am using OOXML ext for Brave, then downloaded the modified file and opened it normally. Am I missing something? Is it the .DS_Store issue that’s not allowing me to go forward?

    Thanks

    • Modifying OOXML by hand usually doesn’t give you any feedback if there is a mistake. When you’re starting out with this process, the best idea is to start with an existing table style. Make 1 change to it, then test it. Make one more change, then test that. It’s frustrating and slow. If you make many changes at once and create a mistake, it’s very difficult to find the error by reading the XML.

      Using the OOXML Tools extension sidesteps the whole problem of unzipping and rezipping, so .DS_Store is not likely to be the problem.

      Brandwares is always available to hire, if you can’t get it working.
