Category Archives: Java

Posts containing Java code

Promises you can trust

JavaScript Promises provide a strong programming model for the future of JavaScript development.

So here I’m playing with promises.

First I need a bit of a package.json file:

Now I can write my first test (test/promises_test.js):

Notice that the “it” function takes a “done” function parameter to ensure that the test waits until the promise has been resolved. Remove the call to done() or to resolve() and the test will time out.

This test fails, but only because of a timeout: done is never called. Let’s improve the test.

Using “done()” instead of “then()” indicates that the promise chain is complete. If we haven’t dealt with errors, done will throw an exception. The test no longer times out, but fails well:

And we can fix it:

Lesson: Always end a promise chain with done().

If you want to keep these concerns apart, you can split the then and the done:

There is another shorthand for this in Mocha as well:

But what is a promise chain?

This is extra cool when we need multiple promises:

Notice that done is only called when ALL of the strings have had their length calculated (asynchronously).

This may seem weird at first, but is extremely helpful when dealing with object graphs:

Here, the save methods on dao.orderDao and orderLineDao both return promises. Our “savePurchaseOrder” function also returns a promise which is resolved when everything is saved. And everything happens asynchronously.

Okay, back to basics about promises.

Here, the second function to “done()” is called. We can use “fail()” as a shortcut:

But this is not so good. If the comparison fails, this test will time out! This is better:

Of course, we need unexpected events to be handled as well:

And of course: Something may fail in the middle of a chain:

The failure is automatically propagated to the first failure handler:

It took me a while to become really comfortable with Promises, but when I did, it simplified my JavaScript code quite a bit.

You can find the whole source code here. Also, be sure to check out Scott Sauyet’s slides on Functional JavaScript for more on promises, curry and other tasty functional stuff.

Thanks to my ex-colleague and fellow Exilee Sanath for the inspiration to write this article.

Posted in Code, English, Java | 1 Comment

Dead simple configuration

Whole frameworks have been written with the purpose of handling the configuration of your application. I prefer a simpler way.

If by configuration we mean “everything that is likely to vary between deploys“, it follows that we should try and keep configuration simple. In Java, the simplest option is the humble properties file. The downside of a properties file is that you have to restart your application when you want it to pick up changes. Or do you?

Here’s a simple method I’ve used on several projects:

The AppConfiguration class looks like this:

This reads the configuration file in an efficient way and updates the settings as needed. It supports environment variables and system properties as defaults. And it even gives a pretty good log of what’s going on.
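A minimal sketch of such a class – not the exact code from the gist linked below, but the same idea (the property-name conventions and logging details are simplified):

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;
import java.util.logging.Logger;

public class AppConfiguration {

    private static final Logger log = Logger.getLogger(AppConfiguration.class.getName());

    private final File file;
    private long lastLoaded = 0;
    private Properties properties = new Properties();

    public AppConfiguration(String filename) {
        this.file = new File(filename);
    }

    /** Returns the configured property, falling back to system properties and environment variables. */
    public String getProperty(String key, String defaultValue) {
        String value = getConfiguredProperty(key);
        if (value == null) value = System.getProperty(key);
        if (value == null) value = System.getenv(key.replace('.', '_').toUpperCase());
        if (value == null) {
            log.fine("Using default value for " + key);
            return defaultValue;
        }
        return value;
    }

    public String getRequiredProperty(String key) {
        String value = getProperty(key, null);
        if (value == null) throw new IllegalStateException("Missing configuration value " + key);
        return value;
    }

    private synchronized String getConfiguredProperty(String key) {
        // Reload the file only when it has changed on disk since the last read
        if (file.exists() && file.lastModified() > lastLoaded) {
            log.info("Reloading " + file);
            Properties newProperties = new Properties();
            try (FileInputStream input = new FileInputStream(file)) {
                newProperties.load(input);
            } catch (IOException e) {
                throw new RuntimeException("Failed to load " + file, e);
            }
            this.properties = newProperties;
            this.lastLoaded = file.lastModified();
        }
        return properties.getProperty(key);
    }
}

Since the file timestamp is checked on every read, you can edit the properties file on a running system and the new value is picked up on the next call to getProperty – no restart needed.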

For the full source code and a magic DataSource which updates automatically, see this gist: https://gist.github.com/jhannes/b8b143e0e5b287d73038

Enjoy!

Posted in Code, English, Java | Leave a comment

The lepidopterist’s curse: Playing with java.time

Pop quiz: What will be the output of this little program?
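Assume getHoursOfDay looks something like this (a sketch using java.time – the original listing may differ in details, but it shows the behaviour described below):

import java.time.Duration;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class HoursOfDay {

    /** Counts the clock hours between midnight on the given date and midnight the next day. */
    static long getHoursOfDay(LocalDate date, ZoneId zoneId) {
        ZonedDateTime startOfDay = date.atStartOfDay(zoneId);
        ZonedDateTime startOfNextDay = date.plusDays(1).atStartOfDay(zoneId);
        return Duration.between(startOfDay, startOfNextDay).toHours();
    }

    public static void main(String[] args) {
        System.out.println(getHoursOfDay(LocalDate.of(2014, 7, 15), ZoneId.of("Europe/Oslo")));
    }
}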

The answer is, like with most interesting questions, “it depends”. How can it depend? Well, let’s try a few examples:

  • getHoursOfDay(LocalDate.of(2014, 7, 15), ZoneId.of("Asia/Colombo")) returns 24, as expected.
  • getHoursOfDay(LocalDate.of(2014, 7, 15), ZoneId.of("Europe/Oslo")) also returns 24.
  • But here comes a funny version: getHoursOfDay(LocalDate.of(2014, 3, 30), ZoneId.of("Europe/Oslo")) returns 23! This is the day daylight saving time starts.
  • Correspondingly, getHoursOfDay(LocalDate.of(2014, 10, 26), ZoneId.of("Europe/Oslo")) returns 25.
  • And of course, down under, everything is upside down: getHoursOfDay(LocalDate.of(2014, 10, 5), ZoneId.of("Australia/Melbourne")) gives 23.
  • Except, of course, in Queensland: getHoursOfDay(LocalDate.of(2014, 10, 5), ZoneId.of("Australia/Queensland")) => 24.

Daylight saving hours: The bane of programmers!

Daylight saving hours were instituted with the stated purpose of improving worker productivity by providing more working hours with light. Numerous studies have failed to prove that it works as intended.

Instead, when I examined the history of daylight saving hours in Norway, it turns out that it was lobbied by a golfer and a butterfly collector (“lepidopterist”) so that they could better pursue their hobbies after working hours. Thus the name of this blog post.

Most of the time, you can ignore daylight saving hours. But when you can’t, it can really bite you in the behind. For example: What does the hour by hour production of a power plant look like on the day that changes from daylight saving hours to standard time? Another example given to me by a colleague: TV schedules. It turns out that some TV channels just can’t be bothered to show programming during the extra hour in the fall. Or they will show the same hour of programming twice.

The Joda-Time API and now the Java 8 time API java.time can help – if you use it correctly. Here is the code to display a table of values per hour:
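The gist of it is to iterate with ZonedDateTime instead of counting a fixed 24 hours. A sketch (the output format and the per-hour value are placeholders):

import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class HourlyTable {

    static void printHourlyValues(LocalDate date, ZoneId zoneId) {
        ZonedDateTime current = date.atStartOfDay(zoneId);
        ZonedDateTime startOfNextDay = date.plusDays(1).atStartOfDay(zoneId);
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("HH:mm xxx");
        // Stepping with plusHours on a ZonedDateTime gives 23, 24 or 25 rows as appropriate
        while (current.isBefore(startOfNextDay)) {
            System.out.println(current.format(formatter) + "  " + readValueFor(current));
            current = current.plusHours(1);
        }
    }

    // Placeholder for whatever per-hour value (production figures, schedule entries) is being displayed
    private static String readValueFor(ZonedDateTime hour) {
        return "...";
    }
}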

Given 2014/10/26 and Oslo, this prints:

And on 2014/3/30, it prints:

So, if you ever find yourself writing code like this: for (int hour=0; hour<24; hour++) doSomething(midnight.plusHours(hour)); you may want to reconsider! This code will (probably) break twice a year.

On the face of it, time is an easy concept. When you start looking into the details, there's a reason that the java.time library contains 20 classes (if you don't count the subpackages). When used correctly, time calculations are simple. When used incorrectly, time calculations look simple, but contain subtle bugs.

Next time, perhaps I should ruminate on the finer points of Week Numbers.

Posted in Code, English, Java | Leave a comment

The madness of layered architecture

I once visited a team that had fifteen layers in their code. That is: If you wanted to display some data from the database in a web page, that data passed through 15 classes in the application. What did these layers do? Oh, nothing much. They just copied data from one object to the next. Or sometimes the “access object layer” would perform a check that objects were valid. Or perhaps the check would be done in the “boundary object layer”. It varied, depending on which part of the application you looked at.

Puzzled (and somewhat annoyed), I asked the team why they had constructed their application this way. The answer was simple enough: They had been told to by the expensive consultant who had been hired to advise on the organization’s architecture.

I asked the team what rationale the consultant had given. They just shrugged. Who knows?

Today, I often visit teams who have three to five layers in their code. When asked why, the response is usually the same: This is the advice they have been given. From a book, a video or a conference talk. And the rationale remains elusive or muddled at best.

Why do we construct layered applications?

There’s an ancient saying in the field of computing: Any problem in computer science can be solved by adding a layer of indirection.

Famously, this is the guiding principle behind our modern network stack. In web services SOAP performs method calls on top of HTTP. HTTP sends requests and receives responses on top of TCP. TCP streams data in two directions on top of IP. IP routes packets of bits through a network on top of physical protocols like Ethernet. Ethernet broadcasts packets of bits with a destination address to all computers on a bus.

Each layer performs a function that lets the higher layer abstract away the complexities of for example resending lost packets or routing packets through a globally interconnected network.

The analogy is used to argue for layers in enterprise application architecture.

But enterprise applications are not like network protocols. Every layer in most enterprise applications operates at the same level of abstraction.

To pick on a popular example: John Papa’s video on Single Page Applications uses the following layers on the server side (and a separate set on the client side): Controllers, UnitOfWork, Repository, Factories and EntityFramework. So for example the AttendanceRepository property in CodeCamperUnitOfWork returns an AttendanceRepository to the AttendanceController, which calls the GetBySessionId() method in the AttendanceRepository layer, which finally calls DbSet.Where(ps => ps.SessionId == sessionId) on EntityFramework. And then there’s the RepositoryFactories layer. Whee!

And what does it all do? It filters an entity based on a parameter. Wat?!

(A hint that this is going off the rails is that the discussion in the video presentation starts at the bottom and builds up to the controllers instead of going outside in.)

In a similar Java application, I have seen – and feel free to skip these tedious details – a SpeakersController.findByConference that calls SpeakersService.findByConference, which calls SpeakersManager.findByConference, which calls SpeakersRepository.findByConference, which constructs a horrific JPAQL query that nobody can understand. JPA returns an @Entity which is mapped to the database, and the Repository, or perhaps the Manager, Service or Controller, or perhaps two or three of these, will transform it from the Speaker class into yet another class.

Why is this a problem?

The cost of code: A reasonable conjecture would be that the cost of developing and maintaining an application grows with the size of the application. Adding code without value is waste.

Single responsibility principle: In the above example, the SpeakerService will often contain all functionality associated with speakers. So if adding a speaker requires you to select a conference from a drop-down list, the SpeakerService will often have a findAllConferences method, so that SpeakersController doesn’t need to also have a dependency on ConferenceService. However, this makes the classes into functionality magnets. The symptom is low coherence: the methods of one class can be divided into distinct sets that are never used at the same time.

Dumb services: “Service” is a horrible name for a class – a service is just a more or less coherent collection of functions. More meaningful names exist: a Repository is a service that stores and retrieves objects, a Query is a service that selects objects based on criteria (actually it’s a command, not a service), a Gateway is a service that communicates with another system, a ReportGenerator is a service that creates a report. Of course, the fact that a controller may have references to a repository, a report generator and a gateway should be quite normal if the controller fetches data from the database to generate a report for another system.

Multiple points of extension: If you have a controller that calls a service that calls a manager that calls a repository and you want to add some validation that the object you are saving is consistent, where would you add it? How much would you be willing to bet that the other developers on the team would give the same answer? How much would you be willing to bet that you would give the same answer in a few months?

Catering to the least common denominator: In the conference application we have been playing with, DaysController creates and returns the days available for a conference. The functionality needed for DaysController is dead simple. On the other hand TalksController has a lot more functionality. Even though these controllers have vastly different needs, they both get the same (boring) set of classes: A Controller, a UnitOfWork, a Repository. There is no reason the DaysController couldn’t use EntityFramework directly, other than the desire for consistency.

Most applications have a few functional verticals that contain the meat of the application and a lot of small supporting verticals. Treating them the same only creates more work and more maintenance effort.

So how can you fix it?

The first thing you must do is to build your application from the outside in. If your job is to return a set of objects, with .NET EntityFramework you can access the DbSet directly – just inject IDbSet in your controller. With Java JPA, you probably want a Repository with a finder method to hide the JPAQL madness. No service, manager, worker, or whatever is needed.
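To illustrate how thin that can be, here is a hedged sketch of such a repository (Speaker is assumed to be a JPA @Entity; all the names are made up for the example):

import java.util.List;
import javax.persistence.EntityManager;

// The whole persistence "layer" for speakers: one class, one query, no service or manager in between
public class SpeakerRepository {

    private final EntityManager entityManager;

    public SpeakerRepository(EntityManager entityManager) {
        this.entityManager = entityManager;
    }

    public List<Speaker> findByConference(long conferenceId) {
        return entityManager
                .createQuery("select s from Speaker s where s.conference.id = :conferenceId", Speaker.class)
                .setParameter("conferenceId", conferenceId)
                .getResultList();
    }
}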

The second thing you must do is to grow your architecture. When you realize that there are more responsibilities in your controller than deciding what to do with a user request, you must extract new classes. You may for example need a PdfScheduleGenerator to create a printable schedule for your conference. If you’re using .NET Entity Framework, you may want to create some LINQ extension methods on e.g. IEnumerable (which is extended by IDbSet).

The third and most important thing you must do is to give your classes names that reflect their responsibilities. A service should not just be a place to dump a lot of methods.

Every problem in computer science can be solved by adding a layer of indirection, but most problems in software engineering can be solved by removing a misplaced layer.

Let’s build leaner applications!

Posted in C#, English, Java, Software Development | Tagged | 17 Comments

Why I stopped using Spring

My post on DZone about Humble Architects sparked somewhat of a controversy, especially around my disparaging comments regarding Spring and Dependency Injection Frameworks. In this post, I expand on why I stopped using Spring.

I was one of the earliest adopters of Spring in Norway. We developed a large system where we eventually had to start thinking about things like different mechanisms for reuse of XML configuration. Eventually, this evolved into @Autowired and component-scan, which took away the problem of huge configuration files, but in return reduced the ability to reason about the whole source code – instead isolating each developer on a very small island in the application.

The applications tended to blossom in complexity as either the culture, the tool, the documentation or something else made developers build unnecessary layer upon unnecessary layer.

Later, I tried to build applications without a Dependency Injection framework, but taking with me the lessons about when to use “new”, when to have a setter or a constructor argument and which types were good to use as dependencies and which created coupling to infrastructure.

So I found that some of the instincts that the DI container had given me made me improve the design, but at the same time, I found that when I removed the container, the solution became smaller (which is good!), easier to navigate and understand and easier to test.

This leaves me with a dilemma. I found that the cost of using the container is very high – it creates a force towards increasing complexity and size and reduced coherence. But at the same time, it taught me some good design skills as well.

In the end, creating a coherent, small system is to me of much higher value than to create one that is decoupled just for the sake of decoupling. Coherence and decoupling are opposing forces and I side with coherence.

At the same time, I found that the culture around dependency injection has a very strong preference for reuse. But reuse does introduce coupling. If module A reuses module B, module A and B are coupled. A change in B might affect A for better (a bug fix) or worse (an introduced bug). If the savings from the reuse are high, this is a trade-off worth making. If the savings are low – it is not.

So reuse and decoupling are opposing forces. I find myself siding with decoupling.

When there is a conflict, I value coherence over decoupling and decoupling over reuse. The culture where Spring is in use seems to have opposite values.

Posted in English, Java, Software Development | Leave a comment

Humble architects

Humility is not a very common trait with software architects. After having worked with a few awful architects and recently with a very pleasant one, I’ve compiled a few of my experiences in the way every architect loves: As a set of rules.

Rule 0: Don’t assume stupidity

It seems like some architects assume that developers, if left to their own devices, would behave like monkeys. In my experience, this is very rarely the case. The only situations where I’ve seen developers do something stupid are when they silently protest against an architect. If you follow this rule, the rest are details.

Rule 1: You may be wrong

When reviewing someone’s design idea, I prefer to try and ask questions that are honestly open. Maybe I think the developer has overlooked a critical fact, for example concurrency. There are a few different approaches to the situation:

  1. Architect: “You can’t do it that way, because it breaks the code guidelines”
  2. Architect: “You can’t do it that way, because it won’t be safe when there are several users”
  3. Architect: “Have you thought of how it will work with several users?”
  4. Architect: “How does your solution address the situation of several users?”

Dear architect: Please rate these approaches from the one least likely to the one most likely to produce the best possible system. (Tip: This is an easy task, even though many architects fail it routinely)

Rule 2: Be careful with technology

Every technology comes with a cost. Many technologies come with an infinitesimal benefit.

Here’s a list of technologies that I’ve experienced as having consistently higher cost than benefits, and thus will never use (if you don’t know them, don’t worry. The point is the number): JavaServer Pages, Java Server Faces, JAX-WS, Hibernate, Spring, EJB, Oracle SOA Server, IBM WebSphere, Wicket, Google Web Toolkit, Adobe Flex, JBoss jBPM, JMS (all implementations), JBoss.

Here’s a list of technologies that I happily use: JUnit, Jetty, Joda-time, Java Standard Edition.

Here’s a humble exchange that you may want to try and copy:

  • Architect: You should use technology X
  • Me: I’ve looked at technology X, and I don’t see how it will help me solve the business problem
  • Architect: What do you mean?
  • Me: Well, this is what we need to do: ….. And this is what technology X assumes: …. I don’t see how they match.
  • Architect: So what do you suggest using instead?
  • Me: Um… I think we can solve this with plain Java. As a matter of fact, I made a pretty good proof-of-concept yesterday evening.
  • Awesome architect: Cool. Let’s use it.

Rule 3: Consistency isn’t as important as you think

If I had a penny for every time I heard this….

  • Architect: “Yes, I know this way may seem clumsy, but you have to do it. You see, if you don’t, the system becomes inconsistent and impossible to maintain”

Granted, I don’t often work with maintenance, but I know that when I deal with any system, the most difficult part is understanding the business logic for that system. Whether system X (which has one set of business logic) and system Y (which has another) are consistent is very low on the list of things that make me lose sleep.

The fact that system X is horribly complex because it has a dozen layers to be consistent with system Y – now this is something that does make me want to pull out my hair. Different contexts have different trade-offs.

Oh, yes: Remember Rule 0? Assume that the developers in a given context are trying to create a good solution for that context.

Oh, yes, another thing: I’ve never seen something that was incomprehensibly complex in the small become maintainable just because it grew big.

Oh, yes, yet another thing: If a programmer really runs screaming away from the code because some of it had one style of curly braces and some had curly braces in another style, I’m going to lose all faith in humanity.

Rule 4: Bottom-up consistency beats top-down consistency

There is one way I’ve been able to create more consistency inside a system:

  • Create a reference application with an architecture that is easier to follow than to break. If you do this well, the developers will shake their head at the very idea of deviating from the architecture. Unless they really need to. In which case it’s okay.
  • Foster a culture of cross-pollination: Developers who see each other’s code have more consistent code than developers who just see their own code. Pair programming, code reviews and tech sharing sessions all foster cross-pollination.

Rule 5: Tactical reuse across systems is suboptimization

Reuse creates coupling. If system X and system Y reuse some functionality and system X needs that functionality to be modified, this will affect system Y. At the very least, the team working on system X must decide to make a private fork of the reused functionality, which means that it’s no longer really reused. At worst, system Y will get a bug because of a change in the reused functionality.

When you reuse across systems, what you reuse should either be stable (say: the Java SE platform, or something so stable that you definitely didn’t make it yourself) or strategic. By strategic reuse, I mean services that integrate information and not just duplicate functionality.

In other words: Reuse should either be use or integration. Duplication is your friend.

Rule 6: Separate between rules and dogma

There are three reasons to have a rule in any coding standard:

  • Unsafe: The code has a bug that will manifest under some (non-theoretical) circumstance
  • Incomprehensible: “I” don’t understand what’s going on
  • Heresy: The code is written in a style some person doesn’t like

Pop quiz: If you have a rule that says “all fields must have a JavaDoc comment”, is that a safety issue, a comprehensibility issue or a heresy issue? What if the standard uses this example:
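Picture something along these lines (an invented example):

/** The name. */
private String name;

/** The description. */
private String description;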

What about the rule that says “no newline before an opening curly brace”? What about the rule: “the style used for curly braces should be consistent”? Is it addressing Unsafe code, Incomprehensible code or Heresy?

We should be more concerned with writing appropriate code for the situation and less concerned with appeasing the Gods of Consistency.

Rule Omega: Be humble

In the years I’ve worked with software development, I’ve seen more harm done by software architects than help. As a professional role, I think we’d save money if we got rid of them (us). Even if we still paid their salaries.

When you’re in a profession that causes more harm than it prevents, you have two options: You can try and improve, or you can pray that nobody notices.

Posted in English, Java, Software Development | Leave a comment

Announcing EAXY: Making XML easier in Java

XML libraries in Java are a minefield. The amount of code required to manipulate and read XML is staggering, the risk of getting classpath problems with different libraries is substantial, and the handling of namespaces invites a lot of confusion and errors. The worst thing is that the situation doesn’t seem to improve.

A colleague made me aware of the JOOX library some time back. It’s a very good attempt to patch these problems. I found a few shortcomings with JOOX that made me want to explore alternatives and naturally I ended up writing my own library (as you do). I want the library to allow for Easy manipulation of XML, and in an episode of insufficient judgement, I named the library EAXY. It’s a really bad name, so I appreciate suggestions for improvement.

Here is what I set out to solve:

  • It should be easy to create fairly complex XML trees with Java code
  • It should be straightforward and fool-proof to use namespaces. (This is where JOOX failed me)
  • It should be easy to read values out of the XML structure.
  • It should be easy to work with existing XML documents in the file structure or classpath
  • The library should prefer throwing an exception over silently failing.
  • As a bonus, I wanted to make it even easier to deal with (X)HTML, by adding convenience functions for this.

1. Creating an XML document

An XML document is just a tree. How about aligning the tree with the Java syntax tree? For example – let’s say you wanted to programmatically construct some feedback on this article:

Each element (Xml.el) has a tag name and can nest other elements, attributes (Xml.attr) or text (Xml.text). If the element only contains text, we don’t even need to make the call to Xml.text. The syntax is optimized so that if you do a static import of Xml.* you can write code like this:
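A sketch of what that can look like. The el, attr and text factory methods are the ones described above; the package name org.eaxy, the Element return type and the element names are my assumptions:

import static org.eaxy.Xml.attr;
import static org.eaxy.Xml.el;
import static org.eaxy.Xml.text;

import org.eaxy.Element; // the package and type name are assumptions in this sketch

public class FeedbackExample {

    public static void main(String[] args) {
        // Nested el(...) calls mirror the nesting of the XML tree itself
        Element feedback =
                el("feedback",
                        el("title", text("Announcing EAXY")),
                        el("comment",
                                attr("author", "a reader"),
                                text("Nice library, unfortunate name.")));
        System.out.println(feedback);
    }
}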

2. Reading XML

Reading XML with Java code can be a challenge. The DOM API makes it extremely wordy to do anything at all. You can use XPath, but it can be a bit too much on the compact side, and when you do something wrong, the result is simply that you get an empty collection or a null value back. I think we can improve on this.

Consider the following:

I step down the XML tree structure and get all the recipient email addresses of the previous message. But wait – running this code returns an empty list. EAXY allows us to avoid scratching our heads over this:

Now I get the following exception:

As you can see, we misspelled “recipent” in the message. Let’s get back to this problem later, but for now, let’s work around it to create something meaningful:

Again, I think this is about as fluent as Java’s syntax allows.

3. Validation and namespaces

So, we had a message where one of the element names was misspelled. If you have an XSD document for the XML you’re using, you can validate the document against it. However, as you may be used to with Java XML libraries, the act of performing this validation is quite well hidden behind complex APIs. So I’ve provided a little help:

This reads the mailmessage.xsd from the classpath, which is the most common use case for me.

Of course, most schemas don’t refer to elements in the empty namespace. When using validation, it’s common that we have to construct elements in a specific namespace. In most Java libraries for dealing with XML, this is hard and easy to get wrong, especially when namespaces are mixed. I’ve made namespaces into a primary feature of the Eaxy library:

Notice that the “type” and the “role” attributes belong to different namespaces – a scenario that is especially hard to facilitate with other libraries.

4. Templating

Reading the XSD from the classpath inspired another usage: What if we have an XML document as a template in the classpath and then use Java code to manipulate this document? This would be especially handy for XHTML:

This code reads the file testdocument.html from the classpath, selects the element with id “peopleForm” and adds two input elements to it.

5. HTML convenience

In the code above, we set the type, name and value attributes of HTML input elements. These are among the most frequently used attributes in HTML manipulation. To make this easier, I’ve added some convenience methods to Eaxy:

A final case I wanted to optimize for is that of dealing with forms in HTML. Here’s some code that manipulates a form before it is sent to the user.

Here, I set the form contents directly. The code will throw an exception if a parameter name is misspelled, so it’s easy to ensure that you use it correctly.

Conclusion

I have five examples of how Eaxy can be used to do easily what’s hard to do with most XML libraries for Java: creating a document tree with pure Java code, reading and manipulating individual parts of the XML tree, using namespaces and validation, templating, and manipulating (X)HTML documents and forms.

Check it out! The library is not stable yet, but for an XML library, being unstable may not be a very risky situation, as most errors will be easy to detect long before production.

I hope that you may find it useful to try and use this library in your code to deal with XML and (X)HTML manipulation. I’m hoping for some users who can help me iron out the bugs and make Eaxy even more easy to use.

Oh, and do let me know if you come up with a better name.

Posted in Code, English, Java | Leave a comment

Having fun with Git

I recently read The Git Book. As I went through the Git Internals parts, it struck me how simple and elegant the structure of Git really is. I decided that I just had to create my own little library to work with Git repositories (as you do). I call the result Silly Jgit. In this article, I will be walking through the code.

This article is for you if you want to understand Git a bit deeper or perhaps even want to work directly with a Git repository in your favorite programming language. I will be walking through four topics: 1) reading a raw commit from a repository, 2) reading the tree hash of the root of a commit, 3) parsing the file list of a directory tree, and 4) reading the file contents from a subdirectory of a commit root.

Reading the head commit from a repository

The first thing we need to do in order to read the head commit is to find out which commit is the head of the repository. The .git/HEAD file is a plain text file that contains the name of a file in the .git/refs/heads directory. If you’ve checked out master, this will be .git/refs/heads/master. This file is a plain text file which contains a hash, that is: a 40 digit hexadecimal number. The hash can be converted to a filename of a Git Object under .git/objects. This file is a compressed file containing the commit information. Here’s the code to read it:
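A sketch of those steps (not the exact listing from silly-jgit): resolve HEAD to a ref, read the hash, and inflate the corresponding object file. A detached HEAD is not handled here:

import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.InflaterInputStream;

public class ReadHeadCommit {

    public static void main(String[] args) throws IOException {
        String gitDir = ".git";

        // .git/HEAD normally contains something like "ref: refs/heads/master"
        String head = new String(Files.readAllBytes(Paths.get(gitDir, "HEAD"))).trim();
        String ref = head.substring("ref: ".length());

        // The ref file contains the 40-digit hex hash of the head commit
        String hash = new String(Files.readAllBytes(Paths.get(gitDir, ref))).trim();

        // The hash maps to a zlib-compressed file under .git/objects/xx/yyyy...
        String objectPath = gitDir + "/objects/" + hash.substring(0, 2) + "/" + hash.substring(2);
        try (InputStream input = new InflaterInputStream(new FileInputStream(objectPath))) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = input.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            System.out.println(buffer.toString());
        }
    }
}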

Running this code produces the following output (notice that some of the spaces in the output are actually null bytes in the file):

Finding the directory tree of a commit

When we have the commit information, we can parse it to find the tree hash. The tree hash references another file under .git/objects which contains the index of the root directory of the files in the commit. In the example above, the tree hash is “c03265971361724e18e31cc83e5c60cd0e0f5754”. But before we read the tree hash, we have to read the object type (in this case a “commit”) and size (in this case 237).

Looking at the tree hash file is not as straightforward, however:

The next part of this article will show how to deal with this.

Parsing a directory tree

The tree file has what looks like a lot of garbage. But don’t panic. Just like with the commit object, the tree object starts with the type (“tree”) and the size (130). After this, it will list each file or directory. Each tree entry consists of the permissions (which also tell us whether this is a file or a directory), the file name and the hash of the entry, but this time as a binary number. We can read through the entries and find the file we want. We can then just print out the contents of this file:
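A sketch of that parsing loop – each entry is the octal mode, a space, the file name, a null byte and then 20 raw hash bytes (here it just prints the entries; looking up a single file’s hash works the same way):

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class TreeParser {

    /** Parses the body of a tree object (everything after the "tree <size>\0" header) and prints each entry. */
    static void printTreeEntries(byte[] treeBody) throws IOException {
        InputStream input = new ByteArrayInputStream(treeBody);
        int c;
        while ((c = input.read()) != -1) {
            // The octal mode runs up to the first space
            StringBuilder octalMode = new StringBuilder();
            while (c != ' ') {
                octalMode.append((char) c);
                c = input.read();
            }
            // The file name runs up to the null byte
            StringBuilder fileName = new StringBuilder();
            while ((c = input.read()) != 0) {
                fileName.append((char) c);
            }
            // The entry hash is 20 raw bytes, printed here as 40 hex digits
            StringBuilder hash = new StringBuilder();
            for (int i = 0; i < 20; i++) {
                hash.append(String.format("%02x", input.read()));
            }
            System.out.println(octalMode + " " + hash + " " + fileName);
        }
    }
}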

Here’s an example of a parsed directory listing. I have not shown the octalMode for each file, but it can be extremely useful for separating directories (whose octalMode starts with 0) from files:

Reading a file

This leads us to the end of our journey – how to read the contents of a file. Once we have the entries of a tree, it’s a simple matter of looking up the hash for a filename and parsing that file. As before, the file contents will start with the type (“blob” – which means “data”, I guess) and file size:

This prints the contents of our file. Obviously, if you want to find a file in a subdirectory, you’ll have to do a bit more work: parse another tree object, look up an entry in that object, and so on.

Conclusions

This blog post shows how in less than 50 lines of code, with no dependencies (but a small utility helper class), we can find the head commit of a git repository, parse the file listing of the root of the file tree for that commit and print out the contents of a file. The most difficult part was to discover that it was the InflaterInputStream and not Zip or Gzip that was needed to unpack a git object.

My silly-jgit project supports reading and writing commits, trees and hashes from .git/objects. This is just the core subset of the Git plumbing commands. Furthermore, just as I wrote the article, I noticed that git often packs objects into .git/objects/pack. This adds a totally new dimension that I haven’t dealt with before.

I hope that nobody is crazy enough to actually use my silly Git library for Java. But I do hope that this article gave you some feeling of Git mastery.

Posted in Code, English, Java, Technology | 2 Comments

Offensive programming

How to make your code more concise and well-behaved at the same time

Have you ever had an application that just behaved plain weird? You know, you click a button and nothing happens. Or the screen all of a sudden turns blank. Or the application gets into a “strange state” and you have to restart it for things to start working again.

If you’ve experienced this, you have probably been the victim of a particular form of defensive programming which I would like to call “paranoid programming”. A defensive person is guarded and reasoned. A paranoid person is afraid and acts in strange ways. In this article, I will offer an alternative approach: “Offensive” programming.

The cautious reader

What may such paranoid programming look like? Here’s a typical example in Java:
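Here is a reconstruction of the kind of code I mean – not the exact original listing, but it uses HttpURLConnection and a StringBuilder in the same way:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

public class ParanoidReader {

    public static String readUrlContents(String urlString) {
        StringBuilder builder = new StringBuilder();
        URL url = null;
        try {
            url = new URL(urlString);
        } catch (MalformedURLException e) {
            // swallowed: url stays null, so the next block blows up with a NullPointerException
            e.printStackTrace();
        }
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) url.openConnection();
            BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
            String line;
            while ((line = reader.readLine()) != null) {
                builder.append(line).append("\n");
            }
            reader.close();
        } catch (IOException e) {
            // swallowed: the method carries on and returns whatever the builder contains – an empty string
            e.printStackTrace();
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
        return builder.toString();
    }
}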

This code simply reads the contents of a URL as a string. A surprising amount of code to do a very simple task, but such is Java.

What’s wrong with this code? The code seems to handle all the possible errors that may occur, but it does so in a horrible way: It simply ignores them and continues. This practice is implicitly encouraged by Java’s checked exceptions (a profoundly bad invention), but other languages see similar behavior.

What happens if something goes wrong:

  • If the URL that’s passed in is an invalid URL (e.g. “http//..” instead of “http://…”), the following line runs into a NullPointerException: connection = (HttpURLConnection) url.openConnection();. At this point in time, the poor developer who gets the error report has lost all the context of the original error and we don’t even know which URL caused the problem.
  • If the web site in question doesn’t exist, the situation is much, much worse: The method will return an empty string. Why? The result of StringBuilder builder = new StringBuilder(); will still be returned from the method.

Some developers argue that code like this is good, because our application won’t crash. I would argue that there are worse things that could happen than our application crashing. In this case, the error will simply cause wrong behavior without any explanation. The screen may be blank, for example, but the application reports no error.

Let’s look at the code rewritten in an offensive way:
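A sketch of the rewritten version might look like this:

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

public class OffensiveReader {

    public static String readUrlContents(String urlString) throws IOException {
        // Any failure – bad URL, unreachable host, broken connection – propagates to the caller
        try (InputStream input = new URL(urlString).openStream()) {
            StringBuilder builder = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                builder.append((char) c);
            }
            return builder.toString();
        }
    }
}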

The throws IOException statement (necessary in Java, but no other language I know of) indicates that this method can fail and that the calling method must be prepared to handle this.

This code is more concise, and if there is an error, the user and the log will (presumably) get a proper error message.

Lesson #1: Don’t handle exceptions locally.

The protective thread

So how should this sort of error be handled? In order to do good error handling, we have to consider the whole architecture of our application. Let’s say we have an application that periodically updates the UI with the content of some URL.
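Something along these lines, where the task catches and logs any exception from the readUrlContents sketched above (the class and method names are mine, not from the original listing):

import java.util.Timer;
import java.util.TimerTask;

public class UiUpdater {

    public void start() {
        Timer timer = new Timer();
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                try {
                    updateUi(OffensiveReader.readUrlContents("http://example.com/status"));
                } catch (Exception e) {
                    // This update failed, but the timer lives on and will try again next period
                    logError(e);
                }
            }
        }, 0, 60_000);
    }

    private void updateUi(String content) { /* ... */ }

    private void logError(Exception e) { e.printStackTrace(); }
}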

This is the kind of thinking that we want! Most unexpected errors are unrecoverable, but we don’t want our timer to stop because of them, do we?

What would happen if we did?

First, a common practice is to wrap Java’s (broken) checked exceptions in RuntimeExceptions:
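For example, a hand-rolled wrapper around the readUrlContents sketched earlier (not any particular library):

import java.io.IOException;

public class UncheckedReader {

    public static String readUrlContentsUnchecked(String urlString) {
        try {
            return OffensiveReader.readUrlContents(urlString);
        } catch (IOException e) {
            // Preserve the original exception as the cause instead of swallowing it
            throw new RuntimeException(e);
        }
    }
}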

As a matter of fact, whole libraries have been written with little more value than hiding this ugly feature of the Java language.

Now, we could simplify our timer:
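Something like this – no try/catch inside the task any more (again using the names from my sketches above):

import java.util.Timer;
import java.util.TimerTask;

public class FragileUiUpdater {

    public void start() {
        Timer timer = new Timer();
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override
            public void run() {
                // An uncaught exception here cancels the whole Timer – no more updates
                updateUi(UncheckedReader.readUrlContentsUnchecked("http://example.com/status"));
            }
        }, 0, 60_000);
    }

    private void updateUi(String content) { /* ... */ }
}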

If we run this code with an erroneous URL (or the server is down), things go quite bad: We get an error message to standard error and our timer dies.

At this point, one thing should be apparent: This code retries whether there’s a bug that causes a NullPointerException or whether a server happens to be down right now.

While the second situation is good, the first one may not be: A bug that causes our code to fail every time will now be puking out error messages in our log. Perhaps we’re better off just killing the timer?

Lesson #2: Recovery isn’t always a good thing: You have to consider which errors are caused by the environment, such as a network problem, and which are caused by bugs that won’t go away until someone updates the program.

Are you really there?

Let’s say we have WorkOrders which have tasks on them. Each task is performed by some person. We want to collect the people who’re involved in a WorkOrder. You may have come across code like this:
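A reconstruction of that kind of code (WorkOrder, Task and Person are placeholder domain classes for the example):

import java.util.HashSet;
import java.util.Set;

public class PersonCollector {

    public static Set<Person> findResponsiblePersons(WorkOrder workOrder) {
        Set<Person> persons = new HashSet<>();
        if (workOrder != null) {
            if (workOrder.getTasks() != null) {
                for (Task task : workOrder.getTasks()) {
                    if (task != null && task.getAssignedTo() != null) {
                        persons.add(task.getAssignedTo());
                    }
                }
            }
        }
        return persons;
    }
}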

In this code, we don’t trust what’s going on much, do we? Let’s say that we were fed some rotten data. In that case, the code would happily chew over the data and return an empty set. We wouldn’t actually detect that the data didn’t adhere to our expectations.

Let’s clean it up:
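The same method with the paranoia stripped out (same placeholder classes as above):

public static Set<Person> findResponsiblePersons(WorkOrder workOrder) {
    Set<Person> persons = new HashSet<>();
    // If the work order is missing its tasks, or a task its person, this blows up right here – which is the point
    for (Task task : workOrder.getTasks()) {
        persons.add(task.getAssignedTo());
    }
    return persons;
}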

Whoa! Where did all the code go? All of a sudden, it’s easy to reason about and understand the code again. And if there is a problem with the structure of the work order we’re processing, our code will give us a nice crash to tell us!

Null checks are one of the most insidious sources of paranoid programming, and they breed very quickly. Imagine you got a bug report from production – the code just crashed with a NullPointerException (NullReferenceException for you C#-heads out there) in this code:

People are stressed! What do you do? Of course, you add another null check:

You compile the code and ship it. A little later, you get another report: There’s a null pointer exception in the following code:

And so it begins, the spread of the null checks through the code. Just nip the problem at the beginning and be done with it: Don’t accept nulls.

By the way, if you wonder whether we could make the parsing code accept null references and still keep it simple, we can. Let’s say that the example with the work order came from an XML file. In that case, my favorite way of solving it would be something like this:

Of course, this requires a more decent library than Java has been blessed with so far.

Lesson #3: Null checks hide errors and breed more null checks.

Conclusion

When trying to be defensive, programmers often end up being paranoid – that is, desperately pounding at the problems where they see them, instead of dealing with the root cause. An offensive strategy of letting your code crash and fixing it at the source will make your code cleaner and less error prone.

Hiding errors lets bugs breed. Blowing up the application in your face forces you to fix the real problem.

Posted in Code, English, Java | 9 Comments

A canonical web test

In order to smoke test web applications, I like to run end-to-end smoke tests that start the web server and drive a web browser to interact with the application. Here is how this may look:
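A sketch of such a test. The Person domain class, PersonRepository and the /persons page are placeholders for your own application, and the Jetty calls assume the Jetty 9 API:

import static org.junit.Assert.assertTrue;

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;
import org.eclipse.jetty.webapp.WebAppContext;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.htmlunit.HtmlUnitDriver;

public class PersonWebTest {

    @Test
    public void shouldDisplayInsertedPerson() throws Exception {
        // 1. Point the application at a test database (here via a system property the application reads)
        System.setProperty("myapp.db.url", "jdbc:h2:mem:myapp-test");

        // 2. Start Jetty on an arbitrary port and deploy the current web application
        Server server = new Server(0);
        server.setHandler(new WebAppContext("src/main/webapp", "/"));
        server.start();
        int port = ((ServerConnector) server.getConnectors()[0]).getLocalPort();

        // 3. Fire up HtmlUnit, a simulated web browser
        WebDriver browser = new HtmlUnitDriver();

        // 4. Insert an object into the database (a placeholder repository talking to the same test database)
        new PersonRepository().save(new Person("Johannes"));

        // 5. Navigate to the appropriate location in the application
        browser.get("http://localhost:" + port + "/persons");

        // 6. Verify that the inserted object is present on the appropriate page
        assertTrue(browser.findElement(By.id("persons")).getText().contains("Johannes"));

        server.stop();
    }
}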

This test is in the actual war module of the project. This is what it does:

  1. Configures the application to run towards a test database
  2. Starts up the web server Jetty on an arbitrary port (port = 0) and deploys the current web application into Jetty
  3. Fires up HtmlUnit which is a simulated web browser
  4. Inserts an object into the database
  5. Navigates to the appropriate location in the application
  6. Verifies that the inserted object is present on the appropriate page

This test requires org.eclipse.jetty:jetty-server, org.eclipse.jetty:jetty-webapp and org.seleniumhq.selenium:selenium-htmlunit-driver to be present in the classpath. When I use this technique, I often employ com.h2database:h2 as my database. H2 can run in-memory, so the database is fresh and empty for each test run. The test does not require you to install an application server, use some inane (sorry) Maven plugin or create any weird XML configuration. It doesn’t require that your application runs on Jetty in production or in the test environment – it works equally well for web applications that are deployed to Tomcat, JBoss or any other application server.

Please!

If you are developing a web application for any application server and you are using Maven, this trick has the potential to increase your productivity insanely. Stop what you’re doing and try it out:

  1. Add Jetty and HtmlUnit to your pom.xml
  2. Create a unit test that starts Jetty and navigates to the front page. Verify that the title is what you expect (assertEquals("My Web Application", browser.getTitle()))
  3. Try and run the test

Feel free to contact me if you run into any trouble.

Posted in English, Extreme Programming, Java, Software Development, Unit testing | Leave a comment