Part I – Introduction
Several years ago, we were approached by a research lab at The Institute of Chemistry at Hebrew University with a request to help with a project called the Continuous Symmetry Measure (CSM). Like many projects on which we are asked to consult, the code had been written and maintained by a PhD student. The student had graduated and left the lab, leaving the researchers with code they relied on, but with no one dedicated to maintaining it. Trying to do it themselves was detracting from their research time.
This project, which we still maintain, provides an excellent case study of how we here at The Research Software Company work with university research labs. First, though, a brief introduction to the concept behind the project:
The Continuous Symmetry Measure expresses how symmetric a molecule is as a continuous quantity, rather than as a binary yes/no answer to whether it has symmetry. Symmetry is an incredibly important trait for understanding biological systems. To simplify how CSM works: it takes a molecule, creates the idealized symmetric version of that molecule, and then measures the distance between the actual molecule and its symmetric counterpart.
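To make the "distance to the symmetric counterpart" idea concrete, here is a minimal sketch in Python for the simplest case we could think of: mirror symmetry of a flat, two-dimensional shape. It is an illustration under stated assumptions, not the lab's code. The function name `mirror_csm`, the fixed vertical mirror line through the centroid, the pairing list `perm`, and the 0 to 100 scaling are all assumptions made for this example; the real program also has to search for the best axis and the best pairing.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def mirror_csm(points: Sequence[Point], perm: Sequence[int]) -> float:
    """Distance from a 2D shape to its nearest mirror-symmetric counterpart.

    Assumptions (for illustration only): the mirror line is vertical and
    passes through the centroid, and `perm` pairs each atom with its mirror
    partner, so perm[perm[k]] == k.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    centered = [(x - cx, y - cy) for x, y in points]

    # Idealized symmetric structure: average each atom with the mirror image
    # of its partner (reflecting across the vertical line maps x to -x).
    symmetric: List[Point] = []
    for k, (x, y) in enumerate(centered):
        px, py = centered[perm[k]]
        symmetric.append(((x - px) / 2, (y + py) / 2))

    # Squared distance between the actual and idealized structures,
    # normalized by the size of the shape and scaled to 0..100.
    moved = sum((x - sx) ** 2 + (y - sy) ** 2
                for (x, y), (sx, sy) in zip(centered, symmetric))
    size = sum(x * x + y * y for x, y in centered)
    return 100.0 * moved / size


# Atoms 0 and 1 are mirror partners; atom 2 sits on the mirror line.
isosceles = [(-1.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
distorted = [(-1.0, 0.0), (1.2, 0.0), (0.0, 2.0)]
print(mirror_csm(isosceles, [1, 0, 2]))
print(mirror_csm(distorted, [1, 0, 2]))
```

A perfectly symmetric shape scores zero (up to rounding), and the score grows as the shape is distorted, which is the "beyond yes/no" behavior described above.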
The program had two parts:
- The exact CSM worked by creating every single possible permutation of the molecule’s atoms, measuring the distance for each, and reporting the best symmetry it found. It was guaranteed to find the correct symmetry measure. However, because the number of permutations grows factorially with the number of atoms, its runtime exploded for larger molecules, so that even a modest-sized 20-atom molecule could take over a billion years to finish running.
- The approximate CSM was built to overcome the limitations of the exact CSM. Instead of checking every single possible permutation, the program followed a specific mathematical algorithm to iterate over symmetry-axis-direction/permutation guesses, a method proven to converge after a small number of iterations. While the result wasn’t guaranteed to be exactly the correct symmetry measure, it could be proven mathematically to be close enough.
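The exhaustive approach, and why it stops scaling, can be sketched in a few lines of Python. This is a toy version under simplifying assumptions, not the original C++ program: it only handles mirror symmetry of a 2D shape across the fixed line x = 0, and the names `mirror_cost` and `exact_search` and the coordinates are invented for the example.

```python
from itertools import permutations
from math import factorial


def mirror_cost(points, perm):
    """Squared distance from `points` to the shape made mirror-symmetric
    across the line x = 0, when atom k is paired with atom perm[k]."""
    total = 0.0
    for k, (x, y) in enumerate(points):
        px, py = points[perm[k]]
        # Offset between atom k and the average of itself with the mirror
        # image (-px, py) of its partner.
        total += ((x + px) / 2) ** 2 + ((y - py) / 2) ** 2
    return total


def exact_search(points):
    """Try every pairing of atoms and keep the one with the lowest cost."""
    n = len(points)
    best_cost, best_perm = None, None
    for perm in permutations(range(n)):
        # A mirror swaps atoms in pairs, so only keep permutations that
        # undo themselves when applied twice.
        if any(perm[perm[k]] != k for k in range(n)):
            continue
        cost = mirror_cost(points, perm)
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


# A slightly distorted four-atom shape (made-up coordinates).
shape = [(-1.0, 0.0), (1.1, 0.1), (0.0, 2.0), (0.05, -1.0)]
cost, perm = exact_search(shape)
print(perm, round(cost, 4))

# The number of permutations to examine is n!, which is what makes the
# exhaustive search impractical for larger molecules.
for n in (4, 10, 20):
    print(n, factorial(n))
```

With four atoms the loop visits 24 permutations; with 20 atoms it would have to visit 20! of them, roughly 2.4 × 10^18, before any per-permutation work is counted.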
The initial request was just to tidy up and add comments to the code. The code had been written in C++ and sprawled over many pages, so that it wasn’t clear which parts were doing what. We cleaned up the code and added explanations, including citations back to the article.
Coming in Part II –
Alon, G., and Tuvi-Arad, I. “Improved algorithms for symmetry analysis: Structure preserving permutations”. J. Math. Chem., 56(1), 193–212 (2018).
Zabrodsky, H., Peleg, S., and Avnir, D. “Continuous Symmetry Measures”. J. Am. Chem. Soc., 114, 7843–7851 (1992).