Automatic Code Transformation With OpenRewrite
Learn how OpenRewrite enhances automated refactoring, improves code quality, and tackles maintenance challenges with real-world examples and benefits.
Code Maintenance/Refactoring Challenges
As with most problems in business, the challenge with maintaining code is to minimize cost and maximize benefit over some reasonable amount of time. For software maintenance, costs and benefits largely revolve around two things: the quantity and quality of both old and new code.
Quantity
SonarQube reports that our organization maintains at least 80 million lines of code. That’s a lot, especially if we want to stay current with security patches and rapid library upgrades.
Quality
In fast-paced environments, which we often find ourselves in, a lot of code changes must come from:
- Copying what you see, either from nearby code or from places like StackOverflow.
- Knowledge that can be applied quickly.
These typically boil down to decisions made by individual programmers, which comes with pros and cons. This post is not meant to suggest that human contributions are anything less than extremely beneficial! I will discuss some benefits and costs of automated refactoring and why we are moving from Butterfly, our current tool for automated refactoring, to OpenRewrite.
Benefits and Costs of Automated Refactoring
When we think about automation, we typically think about the benefits, and that is where I’ll start. Some include:
- If a recipe exists and works perfectly, the human cost is almost 0, especially if you have an easy way to apply recipes on a large scale. Of course, this human cost saving is the obvious and huge benefit.
- Easy migration to newer libraries/patterns/etc. brings security patches, performance improvements, and lower maintenance costs.
- An automated change can be educational. Hopefully, we still find time to thoroughly read documentation, but we often don’t! Seeing your refactored code should be educational and should help with future development costs.
There are costs to automated refactoring. I will highlight:
- If a recipe does not exist, OpenRewrite is not cost-free. As with all software, the cost of creating a recipe will need to be justified by its benefit. These costs may become substantial if we try to move towards a code change that is not reviewed by humans.
- OpenRewrite and AI reward you if you stick with commonly used programming languages, libraries, tools, etc. Sometimes going against the norm is justified. For example, Raptor 4's initial research phases looked at other technology stacks besides Spring and JAX-RS. Some goals included performance improvement. One of the reasons those other options were rejected is that they did not have support in Raptor's automated refactoring tool. Decisions like this can have a big impact on a larger organization.
- Possible loss of ‘design evolution.’ I believe in the ‘good programmers are lazy’ principle, and part of that laziness is avoiding the pain you go through to keep software up to date. This laziness serves to evolve software so that it can easily be updated. If you take away the pain, you take away one of the main incentives for doing that.
What We’ve Been Using: Butterfly
‘Butterfly’ is a two-part system. The first part is its open-source command-line interface (CLI), officially named ‘Butterfly’, which modifies files. The second part is a hosted transformation tool, also called Butterfly, which can be used to run Butterfly transformations on GitHub repositories.
This post focuses on replacing the CLI and its extension API with OpenRewrite. There is an OpenRewrite-powered large-scale change management tool (LSCM) named Moderne, which is not free.
Where We Are Going: OpenRewrite
Why are we switching to OpenRewrite?
- Adopted by open source projects that we use (Spring, Java, etc.).
- Maintained by a company, Moderne.
- Lossless Semantic Trees (akin to Abstract Syntax Trees), which allow compiler-like transformation. These are much more powerful than tools like regular expression substitution.
- Visitor pattern. Tree modification happens primarily by visiting tree members.
- Moderne is tracking artificial intelligence to see how it can be leveraged for code transformation.
We are still early in the journey with OpenRewrite. While it is easy to use existing recipes, crafting new ones can be tricky.
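To make the tree-plus-visitor idea above more concrete, here is a minimal, self-contained sketch in plain Java. It does not use OpenRewrite's API: the `Node`, `MethodCall`, `StringLiteral`, `Visitor`, and `RenameMethod` types are hypothetical stand-ins invented for illustration, and a real Lossless Semantic Tree carries far more information (types, whitespace, comments). The sketch only shows why visiting a tree can rename a method call without touching a string literal that happens to contain the same text, which a regular expression cannot easily tell apart.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy illustration only: these types are hypothetical, not OpenRewrite classes.
class VisitorSketch {

    interface Node {
        Node accept(Visitor visitor);
    }

    static final class MethodCall implements Node {
        final String name;
        final List<Node> args;

        MethodCall(String name, List<Node> args) {
            this.name = name;
            this.args = args;
        }

        public Node accept(Visitor visitor) {
            return visitor.visitMethodCall(this);
        }

        @Override
        public String toString() {
            List<String> rendered = new ArrayList<>();
            for (Node arg : args) {
                rendered.add(arg.toString());
            }
            return name + "(" + String.join(", ", rendered) + ")";
        }
    }

    static final class StringLiteral implements Node {
        final String value;

        StringLiteral(String value) {
            this.value = value;
        }

        public Node accept(Visitor visitor) {
            return visitor.visitStringLiteral(this);
        }

        @Override
        public String toString() {
            return "\"" + value + "\"";
        }
    }

    // Base visitor: walks the whole tree and rebuilds it unchanged.
    static class Visitor {
        Node visitMethodCall(MethodCall call) {
            List<Node> visitedArgs = new ArrayList<>();
            for (Node arg : call.args) {
                visitedArgs.add(arg.accept(this));
            }
            return new MethodCall(call.name, visitedArgs);
        }

        Node visitStringLiteral(StringLiteral literal) {
            return literal;
        }
    }

    // Overrides only the node kind it cares about; string literals pass through untouched.
    static final class RenameMethod extends Visitor {
        final String from;
        final String to;

        RenameMethod(String from, String to) {
            this.from = from;
            this.to = to;
        }

        @Override
        Node visitMethodCall(MethodCall call) {
            MethodCall visited = (MethodCall) super.visitMethodCall(call);
            return visited.name.equals(from) ? new MethodCall(to, visited.args) : visited;
        }
    }

    static String renameWithVisitor(Node tree, String from, String to) {
        return tree.accept(new RenameMethod(from, to)).toString();
    }

    // Even a word-boundary regex cannot tell code apart from text inside a string literal.
    static String renameWithRegex(String source, String from, String to) {
        return source.replaceAll("\\b" + Pattern.quote(from) + "\\b", Matcher.quoteReplacement(to));
    }

    // Tree for the source text: oldLog(format("call oldLog here"))
    static Node demoTree() {
        Node literal = new StringLiteral("call oldLog here");
        Node inner = new MethodCall("format", Arrays.asList(literal));
        return new MethodCall("oldLog", Arrays.asList(inner));
    }

    public static void main(String[] args) {
        Node tree = demoTree();
        System.out.println(renameWithVisitor(tree, "oldLog", "newLog"));
        // prints newLog(format("call oldLog here"))
        System.out.println(renameWithRegex(tree.toString(), "oldLog", "newLog"));
        // prints newLog(format("call newLog here"))
    }
}
```

The visitor changes only the method call, while the regular expression also rewrites the message inside the string literal. Real recipes face the same distinction at much larger scale, which is part of why writing one directly against a tree can be tricky.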
What About Artificial Intelligence?
If you aren’t investigating AI, you certainly should be. If AI can predict what code should be created for a new feature, it should also be useful in code transformation, which is arguably easier than creation.
Our organization has started the journey of incorporating AI into its toolset. We will be monitoring how tools like OpenRewrite and AI augment one another. On that note, we are investigating using AI to create OpenRewrite recipes.
How We’ve Used OpenRewrite
Manually running recipes against a single software project.
There have been multiple uses of OpenRewrite against an individual software project. I come from the JVM framework team, so our usage involved refactoring Java libraries. You can find some examples of that below:
- JUnit 4 to JUnit 5 JAX-RS refactoring. Comments discuss some impressive changes. Note that there are multiple commits. More on why that was needed later.
- Nice GitHub release notes refactoring. This is a trivial PR, but being able to do it on a large scale with low cost helps with cost-based arguments when value is not widely agreed upon.
- Running UpgradeSpringBoot_3_2, CommonStaticAnalysis, UpgradeToJava17, and MigrateHamcrestToAssertJ recipes on a larger organization project with a whopping 800K lines of code resulted in ~200K modified lines spanning ~4K files with an estimated time savings of ~8 days. I believe that is quite an underestimate of the savings!
- JUnit4 -> JUnit5 refactoring. Estimated savings: 1d 23h 31m.
- Common static analysis refactoring. Estimated savings: 3d 21h 29m. If you are tired of manually satisfying Sonar, then this recipe could be for you! Unfortunately, these need to be bulk closed due to an issue (we’re not trying to hide anything!). You can read about that here.
Again, I think OpenRewrite significantly underestimates some of these savings. Execution time was ~20 minutes. That was the computer’s time, not mine!
Caveat: It’s Only Easy When It’s Easy
When a recipe exists and has no bugs, everything is great! When it doesn’t, you have multiple questions. The two main ones are:
- Does an LST/parser exist? For example, OpenRewrite has no parser for C++ code, so there is no way to create a recipe for that language.
- If there is an LST/parser, how difficult is it to create a recipe? There are a bunch of interesting and easy ways to compose existing recipes; however, when you have to work directly with an LST, it can be challenging.
In short, it’s not always the answer. Good code development and stewardship still play a large role in minimizing long-term costs.
Manual Intervention
So far, the most complicated transformations have required human cleanup. Fortunately, those were in test cases, and the issues were apparent in a failed build. Until we get more sophisticated with detecting breaking changes, please understand that you own the changes, even if they come via a tool like OpenRewrite.
Triaging Problems
OpenRewrite does not have application logging like normal Java software. It also does not always produce errors in ways that you might expect.
To help with these problems, we have a recommendations page in our internal OpenRewrite documentation.
Conclusion
Hopefully, you are excited about the new tools on the way that will help you get the most value from your code while keeping maintenance costs down.