Research Software Sustainability

One issue that was bothering a lot of participants at the DH 2019 conference in Utrecht, from which we’ve just returned, is “Software Sustainability.” They want the software they write now to keep working in the future, after the project is finished. This is a very reasonable concern – you want to know that a decade from now scholars will still be able to access and use the product of your current research.

Unfortunately, it’s not really possible. Just as the moldy, discarded pile of VHS tapes in your parents’ basement was once the “hot new thing,” software becomes outdated as well. It can last 10 years, maybe 20, but it’s hard to get software to last longer than that. This is true for all technology, not just research software and the VHS tape of your 8th grade talent show. Sometimes this can be a good thing.

We have a few projects where we have revived “ancient” software. Ancient in this case means about 15 years old. Using a website created 15 years ago is an anthropological experience, just like driving a car from 2003 or watching reruns of Friends. (They were on a break!) Fifteen years ago, web developers had to make sure their sites would work on Internet Explorer 6, the most popular computer screens had a lower resolution than today’s smart watches, and nobody imagined rotating their screen from landscape to portrait.

We have no idea today how to write software that will still look good in the year 2033. We can be reasonably sure Java, Linux and Windows will still be around (or at least a compatibility layer for them), but we don’t know much else.

This doesn’t mean we should ignore sustainability completely. On the contrary, we must plan for it. However, we need to plan for it properly. If what you develop today is still going to be useful in 15 years, it will need to be rewritten in whatever technology is going to be available then. There’s one thing that you must make sure will last the next 15 years (and more) – the data.

You must write your software in such a way that 15 years from now, when a future programmer who is today binge-watching Game of Thrones is hired to rewrite it, they will be able to reuse the data your software uses and generates now, even while they are laughing at your out-of-date design.

Therefore, you must document your data properly – the database tables or collections, any custom formats you’re using – everything. This is so important that I’m going to repeat it. Document your data! And if you’re using standard formats, make sure you have the documentation for them as well, in case someone tries to rewrite your program in 50 years and nobody knows what a RAR file looks like. If you can’t find the documentation of the RAR file format, switch to a format that is documented.
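
To make “document your data” a little more concrete, here is a minimal sketch of one way to do it, assuming your data happens to live in an SQLite database. The table `letters`, its columns, and the `notes` dictionary are all made up for illustration; the idea is simply to generate a plain data dictionary that can be stored next to the data, and to make any column nobody has described stand out.

```python
import json
import sqlite3


def describe_database(conn, notes):
    """Build a data dictionary: every table, every column, and what it means."""
    dictionary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        columns = []
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
        for _cid, name, col_type, notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            columns.append({
                "name": name,
                "type": col_type,
                "required": bool(notnull),
                "primary_key": bool(pk),
                # Human-written explanation; missing ones are flagged loudly.
                "meaning": notes.get(table, {}).get(name, "UNDOCUMENTED"),
            })
        dictionary[table] = columns
    return dictionary


# Hypothetical example database and notes, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE letters ("
    "id INTEGER PRIMARY KEY, sender TEXT NOT NULL, sent_on TEXT)"
)
notes = {
    "letters": {
        "id": "Internal identifier, no meaning outside this database",
        "sender": "Name of the letter writer as it appears in the source",
    }
}

data_dictionary = describe_database(conn, notes)
# Plain JSON text should still be readable long after the software is gone.
print(json.dumps(data_dictionary, indent=2))
```

In this sketch the `sent_on` column comes out marked as UNDOCUMENTED, which is exactly the kind of gap you want to find now rather than leave for the programmer who inherits your data.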

If you document your data properly, rewriting the software will be a lot easier and will not require a lot of guesswork, and your data won’t have to be thrown onto the scrap heap along with your outdated smart watches and the holograms of your 8th grade talent shows.
