Version Control: Why is it Critical when Authoring DITA XML?

Keywords: FrameMaker, book, DITAMAP, DITA files, oXygen, SVN, GiT, SourceTree, Tortoise, TortoiseGiT, version control, DITA, XML

Warning! Possible data destruction.
Some tools, including some Samalander modules, change the DITA XML on the file system.

Because of this, it is an INDUSTRY BEST PRACTICE to adopt a version control system and to commit your files to it before running Samalander software or any other software that could change the content of DITA files.

A version control system allows you to revert to a previous version of a file or files in case something goes wrong.
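As a minimal sketch of that routine, assume Git is the version control system and that the DITA files already sit in a Git working copy. The Python helpers below commit everything before a risky run and discard the changes afterwards if needed. The names run_git, checkpoint and revert_to_checkpoint are hypothetical, chosen for illustration; only the git subcommands themselves (add, commit, reset) are real.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes the arguments that follow "git" on the command line.
Runner = Callable[[Sequence[str]], None]


def run_git(args: Sequence[str]) -> None:
    """Run one git command in the current directory; raise if it fails."""
    subprocess.run(["git", *args], check=True)


def checkpoint(message: str, run: Runner = run_git) -> None:
    """Commit the current state of all files before a risky batch change."""
    run(["add", "--all"])
    # --allow-empty keeps the checkpoint from failing when nothing has changed.
    run(["commit", "--allow-empty", "-m", message])


def revert_to_checkpoint(run: Runner = run_git) -> None:
    """Discard every change made to tracked files since the last commit."""
    run(["reset", "--hard", "HEAD"])
```

A typical session would call checkpoint("before batch change"), run the tool, inspect the result, and call revert_to_checkpoint() only if the output is damaged. Note that git reset --hard restores tracked files only; brand-new files created by the tool are left in place.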

What is version control?

Version control was invented to manage software source files, which are all text files. A typical version control system stores the changes from one version of a text file to the next.

It therefore consumes very little disk space and lets the user revert to a previous version of a file quite easily. It also supports more advanced functions such as branching, but these are not particularly relevant to DITA XML projects.
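To make the idea of storing changes concrete, here is a small Python sketch using the standard difflib module. The topic content and the revision labels (intro.dita@r1, intro.dita@r2) are invented for illustration, and real version control systems use their own, more compact storage formats. The point is only that a one-line edit produces a very small delta, and that a stored delta is enough to rebuild the earlier version.

```python
import difflib

# Hypothetical DITA topic; the two versions differ only in the copyright year.
old_version = """<topic id="intro">
  <title>Introduction</title>
  <prolog><copyright><copyryear year="2013"/><copyrholder>Example Ltd.</copyrholder></copyright></prolog>
  <body><p>Welcome.</p></body>
</topic>
""".splitlines(keepends=True)

new_version = [line.replace('year="2013"', 'year="2014"') for line in old_version]

# A unified diff with no context lines records only what changed.
delta = list(difflib.unified_diff(old_version, new_version,
                                  fromfile="intro.dita@r1",
                                  tofile="intro.dita@r2", n=0))
changed = [line for line in delta
           if line[0] in "+-" and not line.startswith(("+++", "---"))]

# An ndiff delta carries enough information to rebuild either version.
full_delta = list(difflib.ndiff(old_version, new_version))
restored_old = list(difflib.restore(full_delta, 1))

print("".join(delta))
print(len(changed), "delta lines for a file of", len(old_version), "lines")
print("revert reproduces the old version:", restored_old == old_version)
```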

What does this have to do with DITA XML?

DITA XML files are also text files. Therefore, version control systems give DITA authors the same reversion benefits that software programmers enjoy. If you make a global or batch change to a suite of DITA files (think of updating the copyright date), a version control system lets you revert all of your files to a previous version if the change somehow goes wrong.

Anyone with any experience of GREP changes will know why this is valuable. It is terribly easy to turn all of your XREFs to trash in an instant and create an enormous amount of recovery work, unless you can revert to a previous version. This is a key reason it is generally recommended to store your DITA files in a Content Management System (CMS). Most CMSs offer the ability to revert to a previous version of a file as a key feature.
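The following Python sketch shows how little it takes to trash an XREF. It assumes a hypothetical batch job that moves topics from a folder named old to a folder named topics; the sample paragraph, the folder names, and the is_well_formed helper are all invented for illustration. The careless pattern uses a greedy wildcard and swallows everything up to the last slash in the line, while the careful one touches only the href prefix.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical paragraph from a DITA topic, containing two cross-references.
paragraph = (
    '<p>See <xref href="old/install.dita" format="dita">Install</xref> and '
    '<xref href="old/setup.dita" format="dita">Setup</xref>.</p>'
)


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Careless: the greedy ".*" runs to the LAST slash in the line.
careless = re.sub(r'href=".*/', 'href="topics/', paragraph)

# Careful: match only the literal folder prefix inside each href.
careful = re.sub(r'href="old/', 'href="topics/', paragraph)

print("careless:", careless)
print("careless is well-formed:", is_well_formed(careless))
print("careful: ", careful)
print("careful is well-formed:", is_well_formed(careful))
```

A well-formedness check like this catches only the grossest damage. A change that leaves the XML well-formed but points every XREF at the wrong topic would pass, which is why being able to revert matters.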

But what if we could get that key benefit of an expensive CMS for free? Well, you can, by using free version control tools such as SVN or GiT.

SVN and Tortoise

Tortoise (TortoiseSVN in full) is a Graphical User Interface (GUI) front end to the SVN version control system. It makes using the SVN version control system relatively easy.

Files that have been changed in your Windows file system are flagged with various icons that tell you whether or not a file needs to be committed to the repository or whether there is a conflict between your change and someone else's change.

Tortoise and SVN assume that the repository resides on a server somewhere and that one or more people submit their changes to that central repository. This architecture is ideal for self-contained teams in one location but can become a bit of a problem when teams are scattered over many locations around the globe.

Samalander-OS Ltd. uses SVN with major customers on self-contained projects with good results. It has saved us a lot of major project grief in the last few years of working with DITA. We can recommend it with no reservations except that the Tortoise GUI is sometimes confusing. However, overall it is an excellent product combination. Currently, when used for our major customer, the repository actually resides on a USB key with repository backups to servers in various places.

GiT, SourceTree and TortoiseGiT

SourceTree (for Mac OS X) and TortoiseGiT (for Windows) are GUI front ends to the GiT version control system. As with Tortoise for SVN, they make using the GiT version control system relatively easy.

In TortoiseGiT, files that have been changed in your Windows file system are flagged with various icons that tell you whether or not a file needs to be committed to the repository or whether there is a conflict between your change and someone else's change.

We use SourceTree (for Mac OS X) for our own source code files and DITA files on the Macintosh platform. A major advantage of GiT is that it does not need a central repository and makes branching and merging easy. This likely makes it a better versioning system for DITA authoring groups, but we will have to use it for another year or so before coming to a firm conclusion. A further advantage of SourceTree is that it seems easier to learn and use than either Tortoise with SVN or TortoiseGiT.

Version Control Systems and Software Development Groups

It can be hard to decide which of several version control systems might be better for your DITA project. On the other hand, if you are supporting a software product, your decision has likely already been made: you'll want to use the same system your software group is using. The reasons are simple and all related to money:

  • adopting the software group's versioning system relieves the documentation group of supporting the system; you can simply tag along,
  • the software group will provide the expertise, and usually the minimal training required,
  • the hardware and software infrastructure is usually already in place,
  • your documents can more easily be coordinated with software products, and
  • you will improve the teamwork between software development and your documentation group.