The economics of reuse
If you need the same functionality in two projects, you should reuse code between them, right? Or should you? For as long as there has been a profession of software engineering, we have tried to achieve more reuse. But reuse has both a benefit and a cost. Too often, the cost is forgotten. In this article, I examine the economics of reuse.
True story: One of the earliest projects to embrace object-oriented programming in the 1990s did so with the goal of maximizing reuse. The team responsible for creating the company-wide framework used the following formula for calculating the value of their work:
[Value of reuse] = [number of uses of framework] * [value of the framework to reusers] - [cost of developing the framework]
This formula is obviously correct, but this is where they went horribly wrong: The organization said [value of framework to reusers] = [cost of developing framework]. In other words: The more expensive it was to create, the more valuable it was to use.
We have clearly progressed beyond this thinking. A more updated formula would say: [value of framework to reusers] = [cost of developing the feature in question]. But even this is too optimistic.
No library comes for free to its users. At the very least, you have to discover the features and learn about the details. The cost of reusing depends on many factors, such as the quality of the framework and the documentation, and also on the type of feature. A complex algorithm with a simple interface is cheap to use, while most domain-specific frameworks take considerably more work to reuse. We can express this as a reuse value factor: the share of the development cost of the feature that a reuser actually saves, likely between 50% and 90%. For most cases, my guess would be about 75%.
So we have:
[value of reuse] = [number of users] * ([cost of feature] * [reuse value factor]) - [cost of developing the reusable component]
What about the other important factor: [cost of developing the reusable component]?
It’s easy to assume that the cost of developing a feature in a framework is equal to that of developing the feature in an application, but further analysis shows that this is far from true. A reusable component needs more documentation, it needs to handle more special cases and it has a slower feedback cycle. This cost is substantial and may mean that it costs between 150% and 300% or more to develop a feature for reuse. Personally, I think the reusability cost factor lies around 300%. And the lower this number, the higher the cost of reusing the component is likely to be (that is, the lower the reuse value factor), because that may mean we skimped on documentation etc.
A revised formula would be:
[value of reuse] = [number of users] * ([cost of feature] * [reuse value factor]) - [reusability cost factor] * [cost of feature]
Or
[value of reuse] = [cost of feature] * ([number of users] * [reuse value factor] - [reusability cost factor])
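As a minimal sketch, the two forms of the formula above can be written as small Python functions to check that they agree. The function and parameter names are illustrative assumptions that mirror the bracketed terms, and the cost of 100 is an arbitrary unit chosen for the example. Factors are given as fractions, so 75% is 0.75 and 300% is 3.0.

```python
def value_of_reuse(cost_of_feature, number_of_users,
                   reuse_value_factor, reusability_cost_factor):
    """Factored form: cost * (users * value factor - cost factor).

    Hypothetical helper; names mirror the bracketed terms in the article.
    """
    return cost_of_feature * (
        number_of_users * reuse_value_factor - reusability_cost_factor
    )


def value_of_reuse_expanded(cost_of_feature, number_of_users,
                            reuse_value_factor, reusability_cost_factor):
    """Expanded form: savings across all reusers minus development cost."""
    savings = number_of_users * (cost_of_feature * reuse_value_factor)
    development_cost = reusability_cost_factor * cost_of_feature
    return savings - development_cost


# Example with assumed numbers: a feature costing 100 units, 5 reusers.
print(value_of_reuse(100, 5, 0.75, 3.0))           # 75.0
print(value_of_reuse_expanded(100, 5, 0.75, 3.0))  # 75.0
```

With five reusers the component only recovers 75 units on a 300-unit investment, which is why the number of reusers matters so much in what follows.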
The more complex formula actually lets us make a few predictions. Let’s say we assume a reuse value factor of 75% (meaning that it requires 1/4 of the effort to reuse a library rather than creating the feature from scratch) and a reusability cost factor of 300% (meaning that it requires three times the effort to create something that’s worth reusing). This means:
[value of reuse] = [cost of feature] * ([number of users] * 75% - 300%)
This equation breaks even when [number of users] = 4. That means that to get any value from your reused component, you had better have five or more reusers, or you have to find a way to substantially improve the [reuse value factor] or the [reusability cost factor]. Very smart people have failed to do this.
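The break-even point follows from setting the value to zero: [number of users] * [reuse value factor] = [reusability cost factor], so [number of users] = [reusability cost factor] / [reuse value factor]. A minimal sketch of that arithmetic, with hypothetical function names:

```python
import math


def break_even_users(reuse_value_factor, reusability_cost_factor):
    """Number of reusers at which the value of reuse is exactly zero."""
    return reusability_cost_factor / reuse_value_factor


def min_users_for_positive_value(reuse_value_factor, reusability_cost_factor):
    """Smallest whole number of reusers that gives a strictly positive value."""
    return math.floor(
        break_even_users(reuse_value_factor, reusability_cost_factor)
    ) + 1


print(break_even_users(0.75, 3.0))              # 4.0
print(min_users_for_positive_value(0.75, 3.0))  # 5
```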
Improving the value:
- Increase the number of reusers: Simple enough, but when you do, you risk that the [reuse value factor] goes down as the framework doesn’t suit everybody equally well.
- Reduce the cost of reusing the library: This means investing in documentation, improving your design, improving testing to reduce the number of bugs, and handling bug reports and feature requests from your reusers faster - all of which increase your reusability cost factor.
- Reduce the extra work in making the library reusable: The most important way to reduce the cost of developing for reuse is to choose the right kind of problem to solve. Problems with a small surface and big volume are best. That means: Easy to describe, hard to implement. Sadly, most of the juiciest fruit was picked years ago by the standard library in your programming language and by open source frameworks.
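To get a feel for how much these levers matter, the sketch below tabulates the break-even number of reusers across the ranges guessed at earlier (reuse value factor of 50% to 90%, reusability cost factor of 150% to 300%). The helper names are hypothetical, and the grid points are simply the endpoints and guesses mentioned in the text, not measurements.

```python
def break_even_users(reuse_value_factor, reusability_cost_factor):
    """Number of reusers at which the value of reuse is exactly zero."""
    return reusability_cost_factor / reuse_value_factor


def break_even_table(value_factors, cost_factors):
    """Map each (value factor, cost factor) pair to its break-even user count."""
    return {
        (v, c): break_even_users(v, c)
        for v in value_factors
        for c in cost_factors
    }


# Assumed grid: factors taken from the ranges discussed in the article.
table = break_even_table([0.5, 0.75, 0.9], [1.5, 3.0])
for (v, c), users in sorted(table.items()):
    print(f"value factor {v:.0%}, cost factor {c:.0%}: "
          f"break-even at {users:.1f} reusers")
```

Under these assumptions, break-even ranges from just under two reusers in the best case (90% value, 150% cost) to six in the worst (50% value, 300% cost). The two factors are not independent, though: as the list above notes, lowering the cost of reusing tends to raise the cost of developing.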
On a global scale, reuse has saved the software industry tremendous amounts. In an organization, it can be hard to get the same effect. Reuse comes at a cost to the reuser and to the developer of the reusable library. How do you evaluate and improve your [reuse value factor] and your [reusability cost factor]?
Comments:
Johannes Brodwall - Mar 24, 2014
Developers can go wrong both ways here, Niklas. I’ve seen teams search for a long time and find a pre-alpha library that looked like it solved their problem, but introduced twice as many. The balance is difficult and people fail both ways.
As a side-note, I see them fail hardest when they do reuse something that wasn’t good.
[Kim] - Mar 24, 2014
This fits very well with my own experience. My life as a developer became a lot simpler after I gave up on this reuse mania and just copied the damn code + tests between projects. It’s so much easier to extract code into a library for reuse when it’s obvious that the code actually can be reused.
[niklasbjrnerstedt] - Mar 24, 2014
I agree with your assessment of building for reuse. That said, a very common mistake I see is the development of solutions when there is a reusable alternative available. Sometimes the developers did not find the alternative and other times “not invented here” prevailed.
Johannes Brodwall - Mar 24, 2014
It’s true, Eike - I have not discussed the value of maintenance. However, maintenance also has a cost on a reuser: You have to monitor, evaluate and apply upgrades. And you face the risk of regression in maintenance releases. In addition, if you discover the bug yourself, you may face a cost in getting it fixed unless the library maintainer is merging and releasing your pull request immediately.
The cost and benefit tradeoffs are similar, but probably slightly different in details.
[Eike Lang] - Mar 24, 2014
Interesting point, and not without merits, but I think there’s one missing factor here: Maintenance cost. I consider it unlikely that either a framework/library implementation or any of the n custom implementations will manage to be 100% free of errors from the get-go, so any custom implementations that are not 100% perfect when conceived will incur a cost disadvantage compared to a reusable component.
Johannes Brodwall - Mar 24, 2014
I haven’t thought a lot about this as a separate factor, but the point is very valid. The way I relate to it is that making something more generic probably reduces the [reuse value factor], but has the potential of increasing [number of users].
[geirhedemark] - Mar 24, 2014
I have often wondered at the cost of actually learning to reuse something that is more generalized (in order to be reusable) than necessary. Do you think there should be a separate cost factor for that?
[bin] - Mar 24, 2014
Shouldn’t it be reuse value factor instead of reuse cost factor in your second formula?
The very reason one creates a framework out of reusable code is that it has more than 2 reusers. The reusability cost factor is increased iff the assumptions taken before its development contradict the new use cases. And therefore, the assumptions should be chosen very carefully when one is designing a framework which is supposed to be used by more than 2 reusers.
J. B. Rainsberger - Mar 24, 2014
I tend not to make features reusable, but instead I tend to see reusable (generic) code trapped inside one-off uses. I take some time to extract that code from its context (how I’m currently using it). I find, but haven’t measured, that understanding and fixing context-dependent code costs me more than fixing context-independent code. (Not always true, but true on average.) I don’t know how to include this savings in your cost model.
We often don’t extract code because we don’t have a place for it. That’s why I invest a few hours in creating a “flow pipeline” for reusable code, which I really only have to do once per technology platform (I understand how to create libraries in Java and Ruby, but I remain hopeless in doing it for Haskell and Python). Once I have a place for reusable code to go, then I feel less inertia related to extracting it. I wish I could measure the results, but it feels better, and perhaps that’s enough to improve my overall effectiveness. :)
This, of course, leads to the Catch-22 argument: “I’m not going to bother extracting this, because nobody’s going to use it.” On the other hand, if I bury this code inside its context, and couple it to its environment, then when I want to reuse it, I won’t put in the effort to extract it, due to uncertain cost/benefit. This virtually guarantees two negative outcomes: (1) no reuse and (2) increased cost to understand and fix context-dependent code. (You call the latter “legacy code” most of the time.)
So this seems to me to amount to “there’s never a good time to start, so let’s start now”.
Johannes Brodwall - Mar 24, 2014
I think you’re right about the formula, bin. I have updated the article.
I also agree with your second point - I’ll sum it up as: Having a good reuse cost factor is hard, and it’s harder the more reusers you have. Do I understand you correctly?
[Krisztina Hirth] - Mar 24, 2014
I decide this based on the “fool me twice” rule: “Fool me once, shame on me. Fool me twice… you never fool me twice!” I mean, if I need the same functionality once again then so be it. Once. But the second time I have to think about reusing this code.
It differs if we are talking about features. I see it like J.B.: a feature depends on its context. I never would agree to multiply code like 2*2 or logging or VAT-calculation. This is ALWAYS the same, I don’t need/want to reimplement it. The functionality depends on the same parameters. But if my code is only a part of a behavior and it depends on the context then either I have different implementations which are not depending on the context or it can not be reused as a whole feature.
Anyway your calculations are amazing, I never looked at it this way.
[Christian B. Hauknes] - Mar 24, 2014
Regarding reuse in an organization (especially one with a specialized and complex domain), I find that whether to reuse or not is primarily a business problem. When you start reusing functionality across processes (or process steps), you bind them together and make them interdependent. How certain are you (or the Product Owner, actually) that this functionality will change for the same reason across all these processes / process steps?
Get it wrong, and you risk either inconsistency or unwanted dependency restricting wanted change. Both can be bad, but you can manually mitigate the risk of inconsistency, and overeager reuse is in my experience worse.
Johannes Brodwall - Mar 25, 2014
An excellent point, Christian. There are other reasons for reuse than cost savings, especially for data or functional consistency. Like you point out, you can go wrong here as well, but the reasoning is different than for reuse for cost savings.
Johannes Brodwall - Mar 25, 2014
There’s one additional risk with this approach, @jbrainsberger. When you extract generic code for your own reuse, everything is good. But if others are to reuse it, they may either find that it only solves the subset of the problem that interested you, or you have to invest more time in expanding it. (Which puts you squarely into my equation ;-))
[leif] - Mar 25, 2014
Nice write-up. I think generalization is dangerous when it comes to the reuse discussion. Features or business requirements are, I think, hard to reuse. When it comes to technical stuff that is fairly small, I find reuse and investing in building libraries worthwhile, given that your culture is one where reuse among peers comes naturally. If you have a hostile culture where isolation and competition are favored, you are probably better off not writing it at all.
Johannes Brodwall - Mar 25, 2014
Good input. I think good culture can bring up the reuse value factor, especially by removing impediments to learning and dealing with problems.
J. B. Rainsberger - Mar 25, 2014
Indeed. All the more reason that we benefit from a system (including standards) for publishing reusable code. Now we have a balancing problem: we must have high-enough standards to allow prospective clients to trust published code, but we must have standards that encourage people to publish libraries.
On the other hand, perhaps the natural barrier to publishing code itself encourages enough people to do enough of the “right enough” thing.