Upgrading community archive software

One of the constant themes you will find on this web site is the concept of taking a long view regarding running a digital archive.  This truism is sometimes in conflict with the world in which it operates, the technological and digital world, which is driven by constant expectations of "upgrades", "features", "faster" and other implications of improvement.  In the consumer digital world, we are used to the short life spans of technology, but in the world of systems providing particular services, such as a community archive, change for the sake of change is not always welcome.

But we must live in the real world, and the reality is that, after a while, the developers no longer wish to support older systems.  This is fully understandable; if you have spent time improving your software and solving bugs, you don't want to have to deal with those bugs in older versions when you have already fixed them in newer versions.  The Free Software world does give you the option of providing your own support, so you are not bound by the services your supplier wishes to offer.  But sooner or later, part of the longevity equation means keeping reasonably up to date.

In the case of the Assynt Community Digital Archive, we received an email at the beginning of 2014 pointing out that the version of DSpace that we were running was considered to be at the end of its life, and it was recommended that we upgrade.  We chose not to do that at the time for a variety of reasons, but summer is a good time to work on the systems because, in communities like ours, a lot of voluntary effort happens in the slower-paced winter months.

One of the beauties of the way in which the Assynt Archive is implemented is that it uses the concept of virtualisation.  This type of technology, and the reasons for it suiting a community project so well, are explained elsewhere on this website.  Virtualisation allows one to run an entire system independently of the physical hardware that underlies it.  In the case of doing an upgrade, this means that it was possible to take a copy of the entire virtual machine, and work on that, such that the live Archive was not in any way at risk as part of the process.  It also means that, as you go through a complex update procedure, you can take "snapshots" along the way, so that any oopsies do not mean hours of wasted effort.

An update to something like DSpace is not always easy for lay people to understand.  Updating systems software such as DSpace is not like updating productivity software on a laptop or desktop, where it's a case of inserting a CD or downloading a zip file, and clicking "Setup.exe" or an "Installer" icon.  DSpace needs a runtime framework and a build framework, which consist of the Java runtime system and various other components.  In addition, it needs an industrial-strength database.  So it is the type of process that only suitably skilled people should undertake.

Another early design decision was to do the least amount of customisation possible, ideally restricting the customisation to a logo and naming.  This pays dividends when it comes to upgrading.  It means simply following the processes outlined for the upgrade: download and unpack the new version, build it using the supplied tools, make the required changes to the database, deploy the new system and start it all up.  The bits of customisation, if they are restricted to the minimum, need not affect that ideal process too much.

In our case, though, we needed to go through upgrades from version 1.7.2 to version 4.1.  As this is not advisable in one step, it meant carrying out the upgrade process from version 1.7.2 to version 1.8, then 1.8 to 3.0, then 3.0 to 4.1.  At each step, it is necessary to carry out a battery of tests to ensure that everything is still working.  The skills involved are a mixture of Linux/Unix skills, some Java development skills, PostgreSQL database skills and some experience as to how these types of things work.  Version 4.1 also required updates to the deployment system, Tomcat, as well as the build mechanism, Maven, and ideally to the database, PostgreSQL.  This meant that one step was also to upgrade the operating system running the virtual machine from Debian 6, "squeeze", to Debian 7, "wheezy".  Fortunately, this is a well documented and bullet-proof procedure.  But with all that work, we are now... I nearly said "future-proof", but maybe "immediate-future-proof" would be more accurate, until the end of 2016 or maybe 2017 anyway, when we expect the next steps to be very similar to these.

The virtual machine can then be transferred back to the live Archive, and it will magically be running the new version.
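
For readers who think in code, the stepwise approach described above (snapshot, upgrade one version, test, and fall back to the snapshot if the tests fail) can be sketched roughly as follows.  This is only an illustration in Python, not the procedure we actually ran: the version path comes from the text, but `VMCopy`, `apply_hop` and `run_tests` are hypothetical stand-ins, and the real work of unpacking, building, updating the database and deploying is reduced to a placeholder.

```python
from typing import Callable, Dict, List, Tuple

# Version path taken from the text above.
UPGRADE_PATH = ["1.7.2", "1.8", "3.0", "4.1"]


def hops(path: List[str]) -> List[Tuple[str, str]]:
    """Pair each version with its successor: one pair per upgrade step."""
    return list(zip(path, path[1:]))


class VMCopy:
    """Hypothetical stand-in for the copied virtual machine.

    It tracks only a version string; a real snapshot would be taken with
    whatever virtualisation tool is in use.
    """

    def __init__(self, version: str) -> None:
        self.version = version
        self._snapshots: Dict[str, str] = {}

    def snapshot(self, label: str) -> None:
        # Remember the current state under a label.
        self._snapshots[label] = self.version

    def revert(self, label: str) -> None:
        # Restore the state remembered under that label.
        self.version = self._snapshots[label]


def apply_hop(vm: VMCopy, old: str, new: str) -> None:
    """Placeholder for: unpack, build, update database, deploy, start."""
    if vm.version != old:
        raise ValueError(f"expected version {old}, found {vm.version}")
    vm.version = new


def upgrade_in_steps(
    vm: VMCopy,
    path: List[str],
    apply: Callable[[VMCopy, str, str], None],
    run_tests: Callable[[VMCopy], bool],
) -> str:
    """Walk the path one hop at a time, reverting and stopping on failure."""
    for old, new in hops(path):
        label = f"before-{new}"
        vm.snapshot(label)          # safety net before touching anything
        apply(vm, old, new)
        if not run_tests(vm):       # the "battery of tests" for this step
            vm.revert(label)
            break
    return vm.version


if __name__ == "__main__":
    vm = VMCopy("1.7.2")
    # Assumes every test passes; prints the final version, 4.1.
    print(upgrade_in_steps(vm, UPGRADE_PATH, apply_hop, lambda v: True))
```

If the tests failed after the hop to 3.0, the sketch would revert to the snapshot taken just before that hop and stop at 1.8, which is the whole point of taking snapshots along the way: a mistake costs one step, not the whole afternoon.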