I have just finished drafting a new novel. It was composed using OpenOffice Writer running under Linux, and has now migrated to Microsoft Word on a Mac for further formatting.

OpenOffice uses a much more compact file format than Word. The finished book occupies 305 KB in odt format and 1.5 MB in doc format. At the end of every working day during the drafting period, I saved what I had written so far under a name like “101103.odt” (the saved file for 3 November 2010). The result is a directory containing 121 cumulative versions of the book, totalling 16.7 MB.
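For anyone who would rather script the end-of-day save than do it by hand, here is a minimal sketch in Python. It assumes the manuscript is a single file and that one dated copy per day is enough; the function name `snapshot` and the file and folder names in the usage note are hypothetical, not part of the workflow described above.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def snapshot(manuscript: Path, archive_dir: Path, today: Optional[date] = None) -> Path:
    """Copy the manuscript into archive_dir under a YYMMDD name, e.g. 101103.odt."""
    today = today or date.today()
    # Create the archive folder on first use.
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Build the dated name and keep the original extension (.odt, .doc, ...).
    target = archive_dir / f"{today:%y%m%d}{manuscript.suffix}"
    # copy2 copies the contents and also tries to preserve the file's timestamps.
    shutil.copy2(manuscript, target)
    return target
```

Calling `snapshot(Path("novel.odt"), Path("drafts"))` at the end of a session would leave a copy such as `drafts/101103.odt`. Running it twice on the same day overwrites that day's copy, which matches the one-file-per-working-day habit.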

Thus I have a permanent record of the drafting process, complete with all the errors, cul-de-sacs and general groping for direction that accompany the construction of any novel, no matter how well planned it may be – and on this project I dispensed with my usual synopsis and flew by the seat of my pants.

I am making this post for the benefit of other authors. Such a record might prove instructive long after composition. It retains ideas and passages that, even if at first rejected, you may decide to use later. Finally, it provides incontrovertible proof of authorship, should there ever (heaven forfend!) be a need to produce it.

OpenOffice has become very stable and sophisticated, and if you haven’t checked it out recently or at all I recommend that you do. It crashed three times during perhaps 500 instantiations, which beats Word on the Mac hands down; and only crashed at all when I was doing unusual things. The autosave feature is configurable. Very little is lost even if a crash occurs. The flavour of Linux I used is Linux Mint, which is an Ubuntu derivative I can also recommend.

Via Richard Herley’s blog

11 COMMENTS

  1. Interesting workflow, but I have to wonder, why switch to MS Word for further formatting? I find that the formatting control in OpenOffice.org (or LibreOffice, which I have been using) is far more robust, with options for not just paragraphs, but also frames, characters, and lists. I also find that OpenOffice.org is much more reliable for longer documents.

    Also, have you considered using the “versions” feature (see the File menu) in OpenOffice.org? I generally use that feature to record my incremental revisions–or you can have the program save a new version each time you close the file. That way, you don’t end up with hundreds of files, but you still have access to all versions. I typically save a new file (named by date, as you do) whenever I feel my document has reached a “major” version.

  2. I have to admit I also find the choice of OpenOffice when you already have MSWord … and even the other way around … odd.
    I have run MSWord on my work office Mac for four years on a daily basis and I have never once had it crash.

  3. My Linux box is nowhere near my (wired) printer, and my agent likes Word. When I saved from OO in Word format and opened that file in Word, I found various inconsistencies, probably caused by all the fiddling with the text over a period of months.

    @Ananda, thanks for the tip. I will check that feature out — didn’t even know it was there!

  4. Yes, I know that archiving previous versions is a cause célèbre among two or three digital preservationists. I suppose it makes sense that someone with an obsession with digital archiving would also be an OpenOffice and Linux propagandist.

    But.

    Use an unambiguous date format or your preservation efforts are useless. 20111103, 2011-Nov-03, or 2011-11-03 are obvious contenders; you could stretch it to the Lotus 1-2-3–style 3-Nov-2011. Otherwise future historians are simply going to have to guess what your dates are, because 101103 could represent anytime from 2003 to 9910. (I wouldn’t depend on file metadata for that sort of thing, but it will probably still be there.)

  5. @Richard: As far as I remember, earlier versions of MS Office had versioning built in, but for some reason it was removed with Office 2007 (unless you were using SharePoint Server), only to be reintroduced with Office 2010. I guess Microsoft assumed that the only people interested in versioning would be those working in a collaborative writing/editing environment. Virtually all online word processors (like Google/Zoho docs) also have versioning enabled by default.

    I see versioning as a form of “non-destructive editing” (as they call it in audio, video, and photo software).

    @Joe: I agree with your point about the date format. I believe the ISO standards recommend your third option (2011-11-03), and I find that format most useful because of the ease of visually scanning the dates (the hyphens make the string of numbers easier on the eye); the format also facilitates easy sorting of files.

  6. The sarcasm about Richard’s archiving is utter nonsense. Keeping separate files in this way is a very valuable way to preserve an ‘audit trail’ of versions of files, where access to the full file contents as of a previous date is critically important. Anyone who has been involved in any project with those priorities knows that.

    Relying on the internal ‘versions’ function within Word or other apps is also useful, but far more prone to data loss if there are disk problems or backup problems.

    In addition, the format “101103.odt” is perfectly adequate and sorts correctly.

  7. Archiving each day’s versions has saved my bacon many times. Sometimes I find I made some stupid mistake, or else the software ‘ate the homework’, and I could open a past version, find what was missing or changed, and copy it into the current version.

    Re: inconsistencies between LO.org, OO.org and MS-Word, I only use the basic, ‘available to everybody’ fonts, and only the basic paragraph styles; in fact, I only use the basic HTML styles, and sometimes use LO.org to work on a project in HTML. For archiving these I zip-compress them. Using HTML forces me to concentrate on basic structure rather than formatting, and lets me easily view and correct the code underneath, something I’m not at all competent at when it comes to the .odt XML codes.

    I don’t go back to Word; for Kindle editions I save as HTML and then hand-code for Amazon’s subset of HTML, and for PDF editions I hand-edit the HTML into LaTeX and then use LyX to produce the final product.

    But I only write text. It would be a different story if I were doing work that required more complex formatting involving images, diagrams, charts, tables, and the like.

  8. It seems the “ate my homework” concern is the main argument offered for keeping multiple whole-file versions rather than using the versioning system built into the software. I like the versioning option since it also lets me add notes to remind myself of what I have changed. However, if that’s not your cup of tea, two solutions would be to (1) use a RAID setup where your data is simultaneously mirrored on one or more additional hard drives, or (2) use a service like Dropbox to synchronize your working folder.

    Since Richard is already using multiple computers, Dropbox can be really handy since the files would automatically synchronize across all his systems whenever he goes online. The downside to a program like Dropbox is that it usually can’t synchronize when a file is in use, so for your changes to be updated on the Dropbox server, you would occasionally need to close out of whatever program you are using.

  9. asotir is absolutely right about the ‘software ate my homework’ experience.

    Having worked for many years in large project planning with many rolling document versions, from Word documents to Excel spreadsheets, I can say that it is quite common for files to become inexplicably damaged or unreadable because of unknown flaws in the software, the OS or the disks. Relying on internal ‘version’ software leaves you totally vulnerable to this wrecking your work.

    Maintaining an archive of whole-file versions is a very worthwhile strategy to combat this. After all, the file sizes are tiny and the process is very easy to operate.

    The use of something like Dropbox is indeed a very useful additional backup strategy to use in parallel with a personal backup strategy, which I assume is ALWAYS being used.
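On the date-format question raised in comments 4 to 6, both sides can be checked with a few lines of Python. This is only a sketch: it assumes the existing names really are YYMMDD, as in “101103.odt”, and `to_iso` is a hypothetical helper, not a feature of any of the programs mentioned.

```python
from datetime import datetime


def to_iso(name: str) -> str:
    """Turn a YYMMDD file name such as '101103.odt' into '2010-11-03.odt'."""
    stem, dot, ext = name.partition(".")
    # Assumes YYMMDD order. %y reads two-digit years 00-68 as 2000-2068
    # and 69-99 as 1969-1999, which is exactly the guess a reader must make.
    parsed = datetime.strptime(stem, "%y%m%d")
    return parsed.strftime("%Y-%m-%d") + dot + ext


names = ["110104.odt", "101103.odt", "101231.odt"]

# Six-digit YYMMDD names sort chronologically, as long as they stay in one century.
print(sorted(names))

# The ISO-style names sort the same way and no longer need the century guessed.
print(sorted(to_iso(n) for n in names))
```

The six-digit names do sort correctly, so nothing is lost by keeping them; renaming to the hyphenated form only removes the need to know in advance which two digits are the year.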
