One of my big passions in software development is automated testing, which can be seen by how often I write on the subject. This passion stems from having spent years working with organizations that struggled delivering to production in a timely manner and the anxiety I would experience during deploys. Too often when a release was pushed to production I would soon hear back that a bug has made it to production and the commit logs would show “bkorando” for when the bug was introduced. I would feel ashamed that a change I made caused a problem in production, and felt that it was because I wasn’t being careful enough.
Of course I wasn’t the only developer accidentally introducing bugs into production and the organizations I worked with, and many others, would respond to this problem by introducing new processes. Processes like; having manager sign off on a release, a tester manually run through a test script, or requiring extensive justification for why a change should be made. These additional processes would do little to resolve the issue of bugs getting to production, they did however result in two things:
- Production pushes became more painful, which lead to them happening less often
- Encouraged developers to limit the scope of their changes to reduce risk
In this article we are going to look at how relying on manual processes hinder an organization’s ability to keep their dependencies up-to-date, why this is a problem, and look at a use case of how an automated test can catch a subtle breaking change that occurred from upgrading a dependency.
Coding On the Shoulders of Giants
As modern software developers, we owe a lot to developers that came before us. Margret Hamilton pioneered error handling and recovery techniques while developing the software for the Apollo flight control computer. If you are performing distributed transactions, Hector Garcia-Molina and Kenneth Salem paved the way with the development of the SAGA pattern on how to rollback disturbed transactions when errors inevitably occur.
Our foremothers and forefathers, have allowed our field to advance by pushing on the boundaries of what was possible and introducing new concepts and patterns. Often though the way we benefit from these pioneering advances is through the use of libraries and frameworks that implement or provide abstractions into those concepts and patterns. While we labor all day building and maintaining applications for our organizations, the reality is the code we write is only a small fraction of the code actually running in production.
Just like how the applications we maintain are constantly changing, so too are the libraries and frameworks we depend upon. New features are being added, bugs fixed, performance improved, security holes plugged. If you want the hair to stand on the back of your neck, I would recommend checking out shodan.io, which can be a great resource for showing not only the vulnerabilities of a website, but how to exploit them!
Frozen in Time
In the introduction, I talked about how additional process would encourage developers to minimize their changes. An easy way of keeping a change set small is by not updating an application’s dependencies. I have quite frequently worked on applications whose dependencies were years old! At times working on these applications made me feel like I was a paleontologist, examining applications frozen in amber from a time long ago.
From the perspective of an organization that depends upon manual process, the decision to freeze dependencies is understandable. Updating a dependency, especially if it’s a framework dependency like Spring Boot, could impact an entire application. If an entire application might be impacted from a change, then it would need a full regression test, and when that is manual, that is a time consuming and expensive effort. However in the attempt to resolve one issue, prevent bugs from being introduced into production, this created another, deploying applications with significantly out-of-date dependencies which might contain critical security vulnerabilities (CVEs).
Focusing on the Right Issues
Manual regression testing, along with being very time consuming, isn’t a very good way to verify the correctness of an application. Issues with manual regression testing include:
- Difficult to impossible to recreate some scenarios (e.g. system failure cases)
- Test cases not being executed correctly leading to false positives or false negatives
- Test cases not being executed at all
Automated testing can address these issues as automated tests are: executed much more rapidly, have very little manpower requirement to execute, are much more granular and flexible, and auditable to ensure they are being executed and testing the intended behavior.
While it is nice to sing the praises of automated testing in the abstract, it would be good to see an example of how they can catch a non-trivial, non-obvious bug that occurred from updating a dependency. As luck would have it I just so happen to have an example.
When I was working on a recent article, wiring multiple datasources, I ran into a bit of a frustrating issue. To distill what happened to the essential points. While checking my work with existing articles out there on the same subject, I tried implementing a version of a solution another author demonstrated. However this solution didn’t work for me, even though the code looked the same. The gif below demonstrates the issue I ran into, as well as how to write a test that would catch the issue:
Code can be found on my GitHub. Note the code from the gif above are in branches off master.
A Perfect and Imperfect Use Case
Most any use case is going to run into relevancy. Problems with this use case, in showing the value of automated testing when upgrading dependencies, is that this problem would show up at start up time. Upgrading to Spring Boot 2.x and not updating the properties to use
*.jdbc-url would cause the application to fail at start up as it is not able to properly configure an
EntityManager. So the effect of this change would be fairly obvious.
On the other hand, this is a perfect use case for demonstrating the usefulness of automated testing for several reasons. Let’s quickly step through some of them:
- The test is easy to write – The test case as can be seen in the above gif is pretty simple to write. Recreating the failure scenario is easy and can be done with an in-memory database, so the test is highly portable.
- The test covers other concerns – The test isn’t strictly about recreating a narrow error case, i.e. you’d have to know about the problem before writing the test case. Database communication is important, and is something that would require a test regardless. If there were other subtle breaking changes in that area as well resulting from a dependency upgrade, they might be caught with a test case like this one as well.
- An automated test suite can find multiple issues simultaneously – This issue was found at startup. In a more substantial application it is likely there would be other issues that come from doing a major version upgrade of a framework dependency like Spring Boot. Even if all issues were at startup, which wouldn’t always be the case, it would be a grueling process to sequentially go through fixing each issue as it comes up having to restart the application every time. A test suite could catch multiple issues in a single execution, allowing them to be fixed simultaneously.
Resolving issues that come up when upgrading underlying dependencies, even with a good automated test suite can be a difficult process. It took me awhile to track down the solution to my issue, which would be the case regardless of how the issue was discovered.
Safe to Fail
To begin summarizing this article, I was listening to a recent episode of A Bootiful Podcast and this quote really stuck out to me (lightly edited):
You become safe to fail, you accept that no matter what you do there will be bugs… you’re just able to find them quickly and push the fix quickly
An important takeaway is that no automated test suite is going to entirely eliminate the risk of a bug making it to production. What an automated test suite does gives you though is a safety net and ability to rapidly respond to bugs when they do occur. It gives you confidence that generally it will find bugs in your code, and when a bug inevitably slips through you can respond to it rapidly and with confidence you will not be introducing yet another bug into production. This protection applies to changes made within your code, as well as changes made to dependencies your code depends upon as well.
Awareness and importance of automation, and particularly automated testing, has been slowly increasing in the software development industry over recent years. I am pleasantly surprised when I ask during presentations how many people are working at organizations that do continuous delivery, usually between 25-50% of the room raise their hands.
Still that leaves half or more of organizations which haven’t yet turned the corner on automation. I often find the best way to encourage change at organizations is to lay out the cost of staying with the status quo, and how the proposed changes resolve these problems. If you’re at an organization that still relies heavily on manual processes, and been wanting to change this, hopefully this article might help build your case for your organization to start embracing automation and automated testing.