Category Archives: Software Development

Replanning your project with a time machine

I have an amazing time machine that lets me think better about projects. This is part 1 in a series of blog posts exploring the use of a time machine.

Let’s say that you have a project that has been running for a couple of months. Looking back at your issue tracker and other artifacts, you notice that it’s hard to see what has been done and especially how much time remains. You really wish that you had a proper product backlog of what has been done, so you could forecast how the future will be.

Pulling out a mental time machine, you can answer this question. This is how I create a plan that I’d like to travel back in time to give to myself.

  1. Look over the actual features that you have build. If the application is a normal application with a set of web pages, list all the screens.
  2. Which of the screens have features that required extra work, such as integration, search, dialogues or complex business rules? Split the screens that required extra work into one item per work
  3. Which of the work items did you have to substantially change or even throw away all together? Add a work item for each of these events
  4. For each work item, list a rough date when it was completed. If you have know the sprint, that’s okay. Just list the end date of the sprint. If you know the week, that’s great!
  5. Look over the list that you have so far: Are there things that are especially big? Can you find a way to split them? Are there things that are so small that they don’t really count? Can you logically merge some of these together?
  6. Now you’re ready to count the number of work items completed per week.

If we’re lucky, you may come up with a list of 2-5 items done every week. Each of the items has a clear demonstrable effect that could be seen by a customer.

Now you can look at the work ahead of you. Can you find a similar-size items? If you have some sort of screen mockups of the rest of the work you’re interested in, this is a great source of information.

To complete a plan, list a planned completion date for each work item:

  • For past work items, the planned date is the same as the completed date.
  • For future work items, put a planned date that gives the same rate of items per week as in a reasonable interval of the past.

There are many sources of uncertainty still left in such a plan, but it is a quick way to get a reasonable idea of the work ahead.

Posted in English, Software Development | Leave a comment

The reuse dilemma

The first commandment that any young programmer learns is “Thou Shall Not Duplicate”. Thus instructed, whenever we see something that looks like it may be repeated code, we refactor. We create libraries and frameworks. But removing duplication doesn’t come for free.

If I refactor some code so that instead of duplicating some logic in Class A and Class B, these classes share the logic in Class R (for reuse!). Now Classes A and B are indirectly coupled. This is not necessarily a bad thing, but it comes with some consequences, which are often overlooked.

If Class A require the shared functionality to change, we have a choice: We make the change or we stymie Class A. Making the change comes at a cost of breaking Class B. If these classes are in the same Java package (or .NET Namespace), chances are that we will be able to verify that the change didn’t break anything. If the reused functionality is in a library that is used by another library that is used by a Class B, verifying that the change was good is harder.

This is where Continuous Integration comes in. We check in our change to class R (for reuse, remember). A build is triggered on Jenkins, Bamboo, TFS or what have you. The build causes other projects to get built and eventually, we get a failing build where a unit test for Class B breaks. If we did our job right.

Even with this failing test, we’re not out of the woods. A build system for a large enterprise may take up to an hour or even more to run. This means that by trying to improve Class R, the developers of Class A have made a mess for the developers of Class B at least for a good while.

When organizations see repeated build breaks due to changes in dependencies, the reaction is usually the same: Lets limit changes to the shared code. Perhaps we’ll even version it, so that any change that you need will be released in the next version and dependencies can upgrade at their own pace.

Now we’ve introduced another problem: We are no longer able to change the code. Developers of Class A will have to take Class R as it is, or at the very least go through substantial work to change it. If the reused code is very mature, this is fine. After all, this is what we have to deal with when it comes to language standard libraries and open source projects.

Most reused code inside an organization, however, isn’t so mature. The developers of Class A will often be frustrated by the limitations of Class R. If the reused class is something very domain specific, the limitation is even worse.

So the developers do what any reasonable developer would do: They make a copy of Class R (or they fork the repository where it lives), creating a new duplication of the code.

And so it goes: Removing duplication leads to risk of adverse interactions between reusers. Organizations enforce change control on the reused code to avoid changes made in one context from breaking other parts of the system, resulting in paralysis for the reusers who need the reused code to change. The reusers eventually duplicate the reused code to their own branch where they can evolve it, leading to duplication. Until someone gets the urge to remove the duplication.

The dilemma happens in the small and in the large, but the trade-offs are different.

What we should do less is to reuse immature code with little novel value. The cost of the extra coupling often far outweighs the benefit of reuse.

Posted in English, Software Development | 1 Comment

The madness of layered architecture

I once visited a team that had fifteen layers in their code. That is: If you wanted to display some data in the database in a web page, that data passed through 15 classes in the application. What did these layers do? Oh, nothing much. They just copied data from one object to the next. Or sometimes the “access object layer” would perform a check that objects were valid. Or perhaps the check would be done in the “boundary object layer”. It varied, depending on which part of the application you looked.

Puzzled (and somewhat annoyed), I asked the team why they had constructed their application this way. The answer was simple enough: They had been told so by the expensive consultant who had been hired to advice on the organization’s architecture.

I asked the team what rationale the consultant had given. They just shrugged. Who knows?

Today, I often visit teams who have three to five layers in their code. When asked why, the response is usually the same: This is the advice they have been given. From a book, a video or a conference talk. And the rationale remains elusive or muddled at best.

Why do we construct layered applications?

There’s an ancient saying in the field of computing: Any problem in computer science can be solved by adding a layer of indirection.

Famously, this is the guiding principle behind our modern network stack. In web services SOAP performs method calls on top of HTTP. HTTP sends requests and receives responses on top of TCP. TCP streams data in two directions on top of IP. IP routes packets of bits through a network on top of physical protocols like Ethernet. Ethernet broadcasts packets of bits with a destination address to all computers on a bus.

Each layer performs a function that lets the higher layer abstract away the complexities of for example resending lost packets or routing packets through a globally interconnected network.

The analogy is used to argue for layers in enterprise application architecture.

But enterprise applications are not like network protocols. Every layer in most enterprise application operates at the same level of abstraction.

To pick on a popular example: John Papa’s video on Single Page Applications uses the following layers on the server side (and a separate set on the client side): Controllers, UnitOfWork, Repository, Factories and EntityFramework. So for example the AttendanceRepository property in CodeCamperUnitOfWork returns a AttendanceRepository to the AttendanceController, which calls GetBySessionId() method in AttendanceRepository layer, which finally calls DbSet.Where(ps => ps.SessionId == sessionId) on EntityFramework. And then there’s the RepositoryFactories layers. Whee!

And what does it all do? It filters an entity based on a parameter. Wat?!

(A hint that this is going off the rails is that discussion in the video presentation starts with the bottom and builds up to the controllers instead of outside in)

In a similar Java application, I have seen – and feel free to skip these tedious details – the SpeakersController.findByConference calls SpeakersService.findByConference, which calls SpeakersManager.findByConference, which calls SpeakersRepository.findByConference, which constructs a horrific JPAQL query which nobody can understand. JPA returns an @Entity which is mapped to the database, and the Repository, or perhaps the Manager, Service or Controller, or perhaps two or three of these, will transform from Speaker-class to another.

Why is this a problem?

The cost of code: A reasonable conjecture would be that the cost of developing and maintaining an application grows with the size of the application. Adding code without value is waste.

Single responsibility principle: In the above example, the SpeakerService will often contain all functionality associated with speakers. So if adding a speaker requires you to select a conference from a drop-down list, the SpeakerService will often have a findAllConferences method, so that SpeakersController doesn’t need to also have a dependency on ConferenceService. However, this makes the classes into functionality magnets. The symptom is low coherence: the methods of one class can be divided into distinct sets that are never used at the same time.

Dumb services: “Service” is a horrible name for a class – a service is a more or less coherent collection of functions. A more meaningful name would be a “repository” for a service that stores and retrieves objects, a Query is a service that selects objects based on a criteria (actually it’s a command, not a service), a Gateway is a service that communicates with another system, a ReportGenerator is a service that creates a report. Of course, the fact that a controller may have references to a repository, a report generator and a gateway should be quite normal if the controller fetches data from the database to generate a report for another system.

Multiple points of extension: If you have a controller that calls a service that calls a manager that calls a repository and you want to add some validation that the object you are saving is consistent, where would you add it? How much would you be willing to bet that the other developers on the team would give the same answer? How much would you be willing to bet that you would give the same answer in a few months?

Catering to the least common denominator: In the conference application we have been playing with, DaysController creates and returns the days available for a conference. The functionality needed for DaysController is dead simple. On the other hand TalksController has a lot more functionality. Even though these controllers have vastly different needs, they both get the same (boring) set of classes: A Controller, a UnitOfWork, a Repository. There is no reason the DaysController couldn’t use EntityFramework directly, other than the desire for consistency.

Most applications have a few functional verticals that contain the meat of the application and a lot of small supporting verticals. Treating them the same only creates more work and more maintenance effort.

So how can you fix it?

The first thing you must do is to build your application from the outside in. If your job is to return a set of objects, with .NET EntityFramework you can access the DbSet directly – just inject IDbSet in your controller. With Java JPA, you probably want a Repository with a finder method to hide the JPAQL madness. No service, manager, worker, or whatever is needed.

The second thing you must do is to grow your architecture. When you realize that there’s more responsibilities in your controller than deciding what to do with a user request, you must extract new classes. You may for example need a PdfScheduleGenerator to create a printable schedule for your conference. If you’re using .NET entity framework, you many want to create some LINQ extension methods on e.g. IEnumerable (which is extended by IDbSet)

The third and most important thing you must do is to give your classes names that reflect their responsibilities. A service should not just be a place to dump a lot of methods.

Every problem in computer science can be solved by adding a layer of indirection, but most problems in software engineering can be solved by removing a misplaced layer.

Let’s build leaner applications!

Posted in C#, English, Java, Software Development | Tagged | 17 Comments