Category Archives: Java

Posts containing Java code

Having fun with Git

I recently read The Git Book. As I went through the Git Internals parts, it struck me how simple and elegant the structure of Git really is. I decided that I just had to create my own little library to work with Git repositories (as you do). I call the result Silly Jgit. In this article, I will be walking through the code.

This article is for you if you want to understand Git a bit deeper or perhaps even want to work directly with a Git repository in your favorite programming language. I will be walking through four topics: 1) Reading a raw commit from a repository, 2) Reading the tree hash of the root of a commit, 3) parsing the file list of a directory tree, and 4) Reading the file contents from a subdirectory of a commit root.

Reading the head commit from a repository

The first thing we need to do in order to read the head commit is to find out which commit is the head of the repository. The .git/HEAD file is a plain text file that contains the name of a file in the .git/refs/heads directory. If you’ve checked out master, this will be .git/refs/heads/master. This file is a plain text file which contains a hash, that is: a 40 digit hexadecimal number. The hash can be converted to a filename of a Git Object under .git/objects. This file is a compressed file containing the commit information. Here’s the code to read it:

Running this code produces the following output (notice that some of the spaces in the output are actually null bytes in the file):

Finding the directory tree of a commit

When we have the commit information, we can parse it to find the tree hash. The tree hash references another file under .git/objects which contains the index of the root directory of the files in the commit. In the example above, the tree hash is “c03265971361724e18e31cc83e5c60cd0e0f5754”. But before we read the tree hash, we have to read the object type (in this case a “commit”) and size (in this case 237).

Looking at the tree hash file is not as straight forward, however:

The next part of this article will show how to deal with this.

Parsing a directory tree

The tree file has what looks like a lot of garbage. But don’t panic. Just like with the commit object, the tree object starts with the type (“tree”) and the size (130). After this, it will list each file or directory. Each tree entry consists of permissions (which also tells us whether this is a file or a directory), the file name and the hash of the entry, but this time as a binary number. We can read through the entries and find the file we want. We can then just print out the contents of this file:

Here’s an example of a parsed directory listing. I have not showed the octalMode for each file, but this can be extremely useful to separate between directories (which octalMode starts with 0) and files:

Reading a file

This leads us to the end of our journey – how to read the contents of a file. Once we have the entries of a tree, it’s a simple matter of looking up the hash for a filename and parsing that file. As before, the file contents will start with the type (“blob” – which means “data”, I guess) and file size:

This prints the contents of our file. Obviously, if you want to find a file a subdirectory, you’ll have to do a bit more work: Parse another tree object and look and an entry in that object, etc.

Conclusions

This blog post shows how in less than 50 lines of code, with no dependencies (but a small utility helper class), we can find the head commit of a git repository, parse the file listing of the root of the file tree for that commit and print out the contents of a file. The most difficult part was to discover that it was the InflaterInputStream and not Zip or Gzip that was needed to unpack a git object.

My silly-jgit project supports reading and writing commits, trees and hashes from .git/objects. This is just the core subset of the Git plumbing commands. Furthermore, just as I wrote the article, I noticed that git often packs objects into .git/objects/pack. This adds a totally new dimension that I haven’t dealt with before.

I hope that nobody is crazy enough to actually use my silly Git library for Java. But I do hope that this article gave you some feeling of Git mastery.

Posted in Code, English, Java, Technology | 2 Comments

Offensive programming

How to make your code more concise and well-behaved at the same time

Have you ever had an application that just behaved plain weird? You know, you click a button and nothing happens. Or the screen all the sudden turns blank. Or the application get into a “strange state” and you have to restart it for things to start working again.

If you’ve experienced this, you have probably been the victim of a particular form of defensive programming which I would like to call “paranoid programming”. A defensive person is guarded and reasoned. A paranoid person is afraid and acts in strange ways. In this article, I will offer an alternative approach: “Offensive” programming.

The cautious reader

What may such paranoid programming look like? Here’s a typical example in Java:

This code simply reads the contents of a URL as a string. A surprising amount of code to do a very simple task, but such is Java.

What’s wrong with this code? The code seems to handle all the possible errors that may occur, but it does so in a horrible way: It simply ignores them and continues. This practice is implicitly encouraged by Java’s checked exceptions (a profoundly bad invention), but other languages see similar behavior.

What happens if something goes wrong:

  • If the URL that’s passed in is an invalid URL (e.g. “http//..” instead of “http://…”), the following line runs into a NullPointerException: connection = (HttpURLConnection) url.openConnection();. At this point in time, the poor developer who gets the error report has lost all the context of the original error and we don’t even know which URL caused the problem.
  • If the web site in question doesn’t exist, the situation is much, much worse: The method will return an empty string. Why? The result of StringBuilder builder = new StringBuilder(); will still be returned from the method.

Some developers argue that code like this is good, because our application won’t crash. I would argue that there are worse things that could happen than our application crashing. In this case, the error will simply cause wrong behavior without any explanation. The screen may be blank, for example, but the application reports no error.

Let’s look at the code rewritten in an offensive way:

The throws IOException statement (necessary in Java, but no other language I know of) indicates that this method can fail and that the calling method must be prepared to handle this.

This code is more concise and if there is an error, the user and log will (presumably) get a proper error message.

Lesson #1: Don’t handle exceptions locally.

The protective thread

So how should this sort of error be handled? In order to do good error handling, we have to consider the whole architecture of our application. Let’s say we have an application that periodically updates the UI with the content of some URL.

This is the kind of thinking that we want! Most unexpected errors are unrecoverable, but we don’t want our timer to stop because of it, do we?

What would happen if we did?

First, a common practice is to wrap Java’s (broken) checked exceptions in RuntimeExceptions:

As a matter of fact, whole libraries have been written with little more value than hiding this ugly feature of the Java language.

Now, we could simplify our timer:

If we run this code with an erroneous URL (or the server is down), things go quite bad: We get an error message to standard error and our timer dies.

At this point of time, one thing should be apparent: This code retries whether there’s a bug that causes a NullPointerException or whether a server happens to be down right now.

While the second situation is good, the first one may not be: A bug that causes our code to fail every time will now be puking out error messages in our log. Perhaps we’re better off just killing the timer?

Lesson #2: Recovery isn’t always a good thing: You have to consider errors are caused by the environment, such as a network problem, and what problems are caused by bugs that won’t go away until someone updates the program.

Are you really there?

Let’s say we have WorkOrders which has tasks on them. Each task is performed by some person. We want to collect the people who’re involved in a WorkOrder. You may have come across code like this:

In this code, we don’t trust what’s going on much, do we? Let’s say that we were fed some rotten data. In that case, the code would happily chew over the data and return an empty set. We wouldn’t actually detect that the data didn’t adhere to our expectations.

Let’s clean it up:

Whoa! Where did all the code go? All of the sudden, it’s easy to reason about and understand the code again. And if there is a problem with the structure of the work order we’re processing, our code will give us a nice crash to tell us!

Null checking is one of the most insidious sources of paranoid programming, and they breed very quickly. Image you got a bug report from production – the code just crashed with a NullPointerException (NullReferenceException for you C#-heads out there) in this code:

People are stressed! What do you do? Of course, you add another null check:

You compile the code and ship it. A little later, you get another report: There’s a null pointer exception in the following code:

And so it begins, the spread of the null checks through the code. Just nip the problem at the beginning and be done with it: Don’t accept nulls.

By the way, if you wonder if we could make the parsing code accepting of null references and still simple, we can. Let’s say that the example with the work order came from an XML file. In that case, my favorite way of solving it would be something like this:

Of course, this requires a more decent library than Java has been blessed with so far.

Lesson #3: Null checks hide errors and breed more null checks.

Conclusion

When trying to be defensive, programmers often end up being paranoid – that is, desperately pounding at the problems where they see them, instead of dealing with the root cause. An offensive strategy of letting your code crash and fixing it at the source will make your code cleaner and less error prone.

Hiding errors lets bugs breed. Blowing up the application in your face forces you to fix the real problem.

Posted in Code, English, Java | 11 Comments

A canonical web test

In order to smoke test web applications, I like to run-to-end smoke tests that start the web server and drives a web browser to interact with the application. Here is how this may look:

This test is in the actual war module of the project. This is what it does:

  1. Configures the application to run towards a test database
  2. Starts up the web server Jetty on an arbitrary port (port = 0) and deploys the current web application into Jetty
  3. Fires up HtmlUnit which is a simulated web browser
  4. Inserts an object into the database
  5. Navigates to the appropriate location in the application
  6. Verifies that inserted object is present on the appropriate page

This test requires org.eclipse.jetty:jetty-server, org.eclipse.jetty:jetty-webapp and org.seleniumhq.selenium:selenium-htmlunit-driver to be present in the classpath. When I use this technique, I often employ com.h2database:h2 as my database. H2 can run in-memory and so the database is fresh and empty for each test run. The test does not require you to install an application server, use some inane (sorry) Maven plugin or create any weird XML configuration. It doesn’t require that your application runs on Jetty in production or test environment – it work equally fine for Web applications that are deployed to Tomcat, JBoss or any other application server.

Please!

If you are developing a web application for any application server and you are using Maven, this trick has the potential to increase your productivity insanely. Stop what you’re doing and try it out:

  1. Add Jetty and HtmlUnit to your pom.xml
  2. Create a unit test that starts Jetty and navigates to the front page. Verify that the title is what you expect (assertEqual("My Web Application", browser.getTitle()))
  3. Try and run the test

Feel free to contact me if you run into any trouble.

Posted in English, Extreme Programming, Java, Software Development, Unit testing | Leave a comment

Om å løse alt bortsett fra det egentlige problemet

“Problemet med Java er at det krever så mange abstraksjoner. Factories, proxies, rammeverk…” Min samtalepartner gjenfortalte inntrykket han hadde av de Java-programmerende kollegene sine.

Jeg måtte innrømme at jeg kjente meg igjen. Kulturen rundt Java-programmering har noen sykdomstrekk. Kanskje det minst flatterende er fascinasjonen for komplekse teknologiske løsninger. Et gjennomsnittlig Java-prosjekt har rammeverk (Spring, Hibernate), tjenestebusser – gjerne flere (OSB, Camel, Mule), byggverktøy (Maven, Ant, Gradle), persisteringsverktøy (JPA, Hibernate), kodegeneratorer (JAXB, JAX-WS), meldingskøer (JMS), webrammeverk (JSF, Wicket, Spring-MVC) og applikasjonsservere (WebSphere, WebLogic, JBoss). Og det er bare starten. Hvor kommer denne impulsen fra?

Jeg har hørt to teorier om hvorfor Java-programmerere ender opp med en kompleks hverdag. Den ene skylder på sjefene, mens den andre skylder på programmererne. Jeg vil ta for meg begge og peke på en vei ut av jungelen.

Teori 1: Steak and strippers

Zed Shaw, mannen bak “The motherfucking manifesto for programming, motherfuckers” skylder på IT-sjefene. Han mener at selgerne fra teknologiske giganter tar med IT-sjefer ut på strippebuler og kjøper biffmiddager til dem. Og vips så kjøper IT-sjefen inn et ubrukelig verktøy som programmererne er tvunget til å bruke. (Zed påpeker at det vil være positivt med flere kvinnelige IT-sjefer, ettersom de i det minste ikke vil være like interessert i strippebulene)

Ref: Zed Shaw: Control and responsibility (http://zedshaw.com/essays/control_and_responsibility.html). Zed Shaw: Programming, motherfuckers (http://programming-motherfucker.com/). Zed Shaw: Video – Steak and Strippers (http://vimeo.com/2723800).

Argumentet er underholdende, men forklarer bare en del av problemet. Mange av verktøyene som står bak kompleksiteten i Java-prosjekter er åpen kildekode og selges typisk ikke av selgere med fete representasjonskontoer.

Teori 2: Blinkende lys

Jeg tror en viktigere teori er at programmerere er opptatt av fancy ting som blinker. Enkle ting som firmaet tjener penger på er kjedelige. Å sette seg ned og flytte data fra databasen og putte det i en webside er kjedelig for en programmerere. Å lære seg et nytt rammeverk som gjør den samme jobben på en fancy måte er spennende. Å fjerne teknologier og lage en enklere løsning er kjedelig. Å innføre et rammeverk som skal skru sammen teknologien er spennende.

Alle foretrekker å gjøre det de synes er spennende. Jeg vet det, for jeg har vært der selv.

Grunnleggende sett tror jeg at Java-programmere selv har skaffet seg de problemene de ofte klager over med komplekse teknologier.

Veien ut av villmarken

For meg var det viktigste skrittet ut av villmarken foredraget jeg hold på JavaZone for to år siden. I foredraget bygger jeg og Anders Karlsen en webapplikasjon i Java uten å bruke webrammeverk. (Innbild deg at du hører et ironisk gisp her)

Jeg har øvet meg på å løse de problemene jeg har i prosjekter med å bruke enklere teknologier. Slik vet jeg hva de komplekse løsningen gjør. Og så langt er det nesten ingen løsninger som ikke innfører mer problemer enn de fjerner. Det de gjør er at de flytter fokuset fra den oppgaven programmet egentlig skulle løse over til alle de teknologiske delene som man skal få til å fungere sammen.

Jeg tror ikke programmerere er bedre eller dårlige enn andre mennesker når det gjelder det. Men vi har alle en tendens til å bruke tiden vår til å finne spennende problemer å løse i stedet for å løse det vi egentlig skulle på en naiv og enkel måte. Min far sa det egentlig best: Du bør ikke bruke en kalkulator før du kan løse de samme regnestykkene på papir.

Posted in Java, Non-technical, Norsk, Software Development | 27 Comments

A jQuery inspired server side view model for Java

In HTML applications, jQuery has changed the way people thing about view rendering. Instead of an input or a text field in the view pulling data into it, the jQuery code pushes data into the view. How could this look in a server side situation like Java?

In this code example, I read an HTML template file from the classpath, set the value of an input field and append more data based on a template (also in the HTML file).

This is a simplified version of the HTML:

This is a third way from the alternatives of templated views like Velocity and JSP and from component models like JSF. In this model, the view, the model and the binding of the model variables to the view are all separate.

Disclaimer: In this example, I’ve used my still in pre-alpha XML library with the working name of Eaxy. You can get similiar results with libraries like jSoup and JOOX.

Caveat: I’ve never tried this on a grand scale. It’s an idea that compels me for three reasons: First, it’s very explicit. Nothing happens through @annotation, conventions or some special syntax in a template. Second, it’s very unit testable. There is nothing tying this code to running in a web application server. Finally, it’s easy to get to this code through incremental steps. I initially wrote the example application with code that embedded the HTML as strings in Java code and refactored to use the Java Query approach.

Could this approach be worth trying out more?

Posted in Code, English, Java | 3 Comments

A canonical Repository test

There are only so many ways to test that your persistence layer is implemented correctly or that you’re using an ORM correctly. Here’s my canonical tests for a repository (Java-version):

Very simple. The samplePerson test helper generates actually random people:

If your data has relationships with other entities, you may want to include those as well:

A simple and easy way to simplify your Repository testing.

(The tests use FEST assert 2 for the syntax. Look at FluentAssertions for a similar API in .NET)

(Yes, this is what some people would call an integration test. Personally, I can’t be bothered with this sort of classifications)

Posted in Code, English, Extreme Programming, Java, Unit testing | 5 Comments

Loud failures are better than silent, faulty behavior

Sometimes, small questions lead to big answers. Sometimes these answers are controversial. One such question is “What does this warning about serialVersionUID mean”? All the advice out there basically is for developers who don’t know what’s going on to write code that will ignore errors when something unexpected happens. In my view – this is exactly the wrong approach. The safe way to act is to make sure that your program crashes if you don’t have control.

Java programmers usually get this warning when they write code that looks like this:

On stackoverflow this question has an answer that is extensive, well-written, accepted and, in my opinion, wrong. In short, the answer just recommends to implement something like the following:

Let’s dig deeper.

The reason we get this warning is that HttpServlet implements the interface Serializable. This was a design flaw in the first version of the servlet-api, and now we’re stuck with it. A serialized object can be written to and read from byte streams, such as a file. Here is an example:

In this case, everything is fine. But let’s imagine that some time passes between the write and the read. For example, what if we try to read the file with the next version of the program. A version in which the Person class is changed? For example, what if we changed the implementation of Person to store the name as a single field fullName, instead of two fields firstName and lastName?

In this case, we would get an error message like the following:

In other words: If we had set the serialVersionUID, we could still have read the file. Now we’re stuck.

This is why the stackoverflow answer recommends putting a serialVersionUID field in the class.

But wait, there’s another option. Let’s say we found this problem when we tested if our new version was backwards compatible. Now, we could just cut and paste the serialVersionUID from the stack trace. If we do test our software, this is just as easy as putting the serialVersionUID there in the first place. So, let’s fix the Person class:

Voila! Problem fixed. Our code will run again. Here’s the output of printing the object:

Whoops! By forcing different versions of class Person to have the same serialVersionUID, the code now has a serious bug. FullName should never be null (especially since it’s final!). And what’s worse, the bug is a silent one. If we don’t examine the contents of System.out (in this case), we might not catch it before it escapes into the wild.

When you’re not sure, the correct behavior should be to fail, not to silently do the wrong thing.

TL;DR

If you omit a serialVersionUID field from your class, many changes will cause serialized objects to no longer be readable. Even for trivial changes.

Sadly, classes like HttpServlet, AbstractAction and JFrame which are meant to be subclassed implements Serializable, even though they are almost never serialized in practice. Adding serialVersionUID to these classes would only be noise.

The serialVersionUID field can be added to a class afterward if you actually want to read old objects in a new version of the program. This leaves you no worse off than if you added the serialVersionUID field in the first place.

If the old and the new version of the class are deeply incompatible, giving the class a serialVersionUID when you first create it will cause silent faulty behavior.

I prefer loud, failing behavior that is easy to detect during testing over quiet, faulty behavior that may escape into production. I think you do, too.

Posted in Code, English, Java | 5 Comments

Teaser: Bare-knuckle SOA

I’m working on this idea, and I don’t know if it appeals to you guys. I’d like your input on whether this is something to explore further.

Here’s the deal: I’ve encountered teams who, when working with SOA technologies have been dragged into the mud by the sheer complexity of their tools. I’ve only seen this in Java, but I’ve heard from some C# developers that they recognize the phenomenon there as well. I’d like to explore an alternative approach.

This approach requires more hard work than adding a WSDL (web service definition language. Hocus pocus) file to your project and automatically generating stuff. But it comes with added understanding and increased testability. In the end, I’ve experienced that this has made me able to complete my tasks quicker, despite the extra manual labor.

The purpose of this blog post (and if you like it, it’s expansions) is to explore a more bare-bones approach to SOA in general and to web services specifically. I’m illustrating these principles by using a concrete example: Let users be notified when their currency drops below a threshold relative to the US dollar. In order to make the service technologically interesting, I will be using the IP address of the subscriber to determine their currency.

Step 1: Create your active services by mocking external interactions

Mocking the activity of your own services can help you construct the interfaces that define your interaction with external services.

Teaser:

public class CurrencyPublisherTest {

    private SubscriptionRepository subscriptionRepository = mock(SubscriptionRepository.class);
    private EmailService emailService = mock(EmailService.class);
    private CurrencyPublisher publisher = new CurrencyPublisher();
    private CurrencyService currencyService = mock(CurrencyService.class);
    private GeolocationService geolocationService = mock(GeolocationService.class);

    @Test
    public void shouldPublishCurrency() throws Exception {
        Subscription subscription = TestDataFactory.randomSubscription();
        String location = TestDataFactory.randomCountry();
        String currency = TestDataFactory.randomCurrency();
        double exchangeRate = subscription.getLowLimit() * 0.9;

        when(subscriptionRepository.findPendingSubscriptions()).thenReturn(Arrays.asList(subscription));

        when(geolocationService.getCountryByIp(subscription.getIpAddress())).thenReturn(location);

        when(currencyService.getCurrency(location)).thenReturn(currency);
        when(currencyService.getExchangeRateFromUSD(currency)).thenReturn(exchangeRate);

        publisher.runPeriodically();

        verify(emailService).publishCurrencyAlert(subscription, currency, exchangeRate);
    }

    @Before
    public void setupPublisher() {
        publisher.setSubscriptionRepository(subscriptionRepository);
        publisher.setGeolocationService(geolocationService);
        publisher.setCurrencyService(currencyService);
        publisher.setEmailService(emailService);
    }
}

Spoiler: I’ve recently started using random test data generation for my tests with great effect.

The Publisher has a number of Services that it uses. Let us focus on one service for now: The GeoLocationService.

Step 2: Create a test and a stub for each service – starting with GeoLocationService

The top level test shows what we need from each external service. Informed by this and reading (yeah!) the WSDL for a service, we can test drive a stub for a service. In this example, we actually run the test using HTTP by starting Jetty embedded inside the test.

Teaser:

public class GeolocationServiceStubHttpTest {

    @Test
    public void shouldAnswerCountry() throws Exception {
        GeolocationServiceStub stub = new GeolocationServiceStub();
        stub.addLocation("80.203.105.247", "Norway");

        Server server = new Server(0);
        ServletContextHandler context = new ServletContextHandler();
        context.addServlet(new ServletHolder(stub), "/GeoService");
        server.setHandler(context);
        server.start();

        String url = "http://localhost:" + server.getConnectors()[0].getLocalPort();

        GeolocationService wsClient = new GeolocationServiceWsClient(url + "/GeoService");
        String location = wsClient.getCountryByIp("80.203.105.247");

        assertThat(location).isEqualTo("Norway");
    }
}

Validate and create the XML payload

This is the first “bare-knuckled” bit. Here, I create the XML payload without using a framework (the groovy “$”-syntax is courtesy of the JOOX library, a thin wrapper on top of the built-in JAXP classes):

I add the XSD (more hocus pocus) for the actual service to the project and code to validate the message. Then I start building the XML payload by following the validation errors.

Teaser:

public class GeolocationServiceWsClient implements GeolocationService {

    private Validator validator;
    private UrlSoapEndpoint endpoint;

    public GeolocationServiceWsClient(String url) throws Exception {
        this.endpoint = new UrlSoapEndpoint(url);
        validator = createValidator();
    }

    @Override
    public String getCountryByIp(String ipAddress) throws Exception {
        Element request = createGeoIpRequest(ipAddress);
        Document soapRequest = createSoapEnvelope(request);
        validateXml(soapRequest);
        Document soapResponse = endpoint.postRequest(getSOAPAction(), soapRequest);
        validateXml(soapResponse);
        return parseGeoIpResponse(soapResponse);
    }

    private void validateXml(Document soapMessage) throws Exception {
        validator.validate(toXmlSource(soapMessage));
    }

    protected Validator createValidator() throws SAXException {
        SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new Source[] {
              new StreamSource(getClass().getResource("/geoipservice.xsd").toExternalForm()),
              new StreamSource(getClass().getResource("/soap.xsd").toExternalForm()),
        });
        return schema.newValidator();
    }

    private Document createSoapEnvelope(Element request) throws Exception {
        return $("S:Envelope",
                $("S:Body", request)).document();
    }

    private Element createGeoIpRequest(String ipAddress) throws Exception {
        return $("wsx:GetGeoIP", $("wsx:IPAddress", ipAddress)).get(0);
    }

    private String parseGeoIpResponse(Element response) {
        // TODO
        return null;
    }

    private Source toXmlSource(Document document) throws Exception {
        return new StreamSource(new StringReader($(document).toString()));
    }
}

In this example, I get a little help (and a little pain) from the JOOX library for XML manipulation in Java. As XML libaries for Java are insane, I’m giving up with the checked exceptions, too.

Spoiler: I’m generally very unhappy with the handling of namespaces, validation, XPath and checked exceptions in all XML libraries that I’ve found so far. So I’m thinking about creating my own.

Of course, you can use the same approach with classes that are automatically generated from the XSD, but I’m not convinced that it really would help much.

Stream the XML over HTTP

Java’s built in HttpURLConnection is a clunky, but serviceable way to get the XML to the server (As long as you’re not doing advanced HTTP authentication).

Teaser:

public class UrlSoapEndpoint {

    private final String url;

    public UrlSoapEndpoint(String url) {
        this.url = url;
    }

    public Document postRequest(String soapAction, Document soapRequest) throws Exception {
        URL httpUrl = new URL(url);
        HttpURLConnection connection = (HttpURLConnection) httpUrl.openConnection();
        connection.setDoInput(true);
        connection.setDoOutput(true);
        connection.addRequestProperty("SOAPAction", soapAction);
        connection.addRequestProperty("Content-Type", "text/xml");
        $(soapRequest).write(connection.getOutputStream());

        int responseCode = connection.getResponseCode();
        if (responseCode != 200) {
            throw new RuntimeException("Something went terribly wrong: " + connection.getResponseMessage());
        }
        return $(connection.getInputStream()).document();
    }
}

Spoiler: This code should be expanded with logging and error handling and the validation should be moved into a decorator. By taking control of the HTTP handling, we can solve most of what people buy an ESB to solve.

Create the stub and parse the XML

The stub uses xpath to find the location in the request. It generates the response in much the same way as the ws client generated the request (not shown).

Spoiler: The stubs can be expanded to have a web page that lets me test my system without real integration to any external service.

Validate and parse the response

The ws client can now validate that the response from the stub complies with the XSD and parse the response. Again, this done using XPath. I’m not showing the code, as it’s just more of the same.

The real thing!

The code now verifies that the XML payload conforms to the XSD. This means that the ws client should now be usable with the real thing. Let’s write a separate test to check it:

Yay! It works! Actually, it failed the first time I tried it, as I didn’t have the correct country name for the IP address that I tested with.

This sort of point-to-point integration test is slower and less robust than my other unit tests. However, I don’t find make too big of a deal out of that fact. I filter the test from my Infinitest config and I don’t care much beyond that.

Fleshing out all the services

The SubscriptionRepository, CurrencyService and EmailService need to be fleshed out in the same way as the GeolocationService. However, since we know that we only need very specific interaction with each of these services, we don’t need to worry about everything that could possibly be sent or received as part of the SOAP services. As long as we can do the job that the business logic (CurrencyPublisher) needs, we’re good to go!

Demonstration and value chain testing

If we create web UI for the stubs, we can now demonstrate the whole value chain of this service to our customers. In my SOA projects, some of the services we depend on will only come online late in the project. In this case, we can use our stubs to show that our service works.

Spoiler: As I get tired of verifying that the manual value chain test works, I may end up creating a test that uses WebDriver to set up the stubs and verify that the test ran okay, just like I would in the manual test.

Taking the gloves off when fighting in an SOA arena

In this article, I’ve showed and hinted at more than half a dozen techniques to work with tests, http, xml and validation that don’t involve frameworks, ESBs or code generation. The approach gives the programmer 100% control over their place in the SOA ecosystem. Each of the areas have a lot more depth to explore. Let me know if you’d like to see it be explored.

Oh, and I’d also like ideas for better web services to use, as the Geolocated currency email is pretty hokey.

Posted in Code, English, Extreme Programming, Java | 9 Comments

How changing Java package names transformed my system architecture

Changing your perspective even a small amount can have profound effects on how you approach your system.

Let’s say you’re writing a web application in Java. In the system you deal with orders, customers and products. As a web application, your classes include staples like PersonController, PersonRepository, CustomerController and OrderService. How do you organize your classes into packages?

There are two fundamental ways to structure your packages. Either you can focus on the logical tiers, like com.brodwall.myapp.controllers, com.brodwall.myapp.domain or perhaps com.brodwall.myapp.services.customer. Or you can focus on the domain contexts, like com.brodwall.myapp.customer, com.brodwall.myapp.orders and com.brodwall.myapp.products. The first approach is by far the most prevalent. In my view, it’s also the least helpful.

Here are some ways your thinking changes if you structure your packages around domain concepts, rather than technological tiers:

First, and most fundamentally, your mental model will now be aligned with that of the users of your system. If you’re asked to implement a typical feature, it is now more likely to be focused around a strict subset of the packages of your system. For example, adding a new field to a form will at least affect the presentation logic, entity and persistence layer for the corresponding domain concept. If your packages are organized around tiers, this change will hit all over your system. In a word: A system organized around features, rather than technologies, have higher coherence. This technical term means that a large percentage of a the dependencies of a class are located close to that class.

Secondly, organizing around domain concepts will give you more options when your software grows. When a package contains tens of classes, you may want to split it up in several packages. The discussion can itself be enlightening. “Maybe we should separate out the customer address classes into a com.brodwall.myapp.customer.address package. It seems to have a bit of a life on its own.” “Yeah, and maybe we can use the same classes for other places we need addresses, such as suppliers?” “Cool, so com.brodwall.myapp.address, then?” Or maybe you decide that order status codes and payment status codes deserve to be in the “com.brodwall.myapp.order.codes” package.

On the other hand, what options do you have for splitting up com.brodwall.myapp.controllers? You could create subpackages for customer, orders and products, but these subpackages may only have one or possibly two classes each.

Finally, and perhaps most intriguingly, using domain concepts for packages allows you to vary the design according on a case by case basis. Maybe you really need a OrderService which coordinates the payment and shipping of an order, while ProductController only needs basic create-retrieve-update-delete functionality with a repository. A ProductService would just get in the way. If ProductService is missing from the com.brodwall.myapp.services package, this may be confusing or at the very least give you a nagging feeling that something is wrong. On the other hand, if there’s no Controller in the com.brodwall.myapp.product package, it doesn’t matter much.

Also, most systems have some good parts and some not-so-good parts. If your Services package is not working for you, there’s not much you can do. But if the Products package is rotten, you can throw it out and reimplement it without the whole system being thrown into a state of chaos.

By putting the classes needed to implement a feature together with each other and apart from the classes needed to implement other features, developers can be pragmatic and innovative when developing one feature without negatively affecting other features.

The flip side of this is that most developers are more comfortable with some technologies in the application and less comfortable with other technologies. Organizing around features instead of technologies force each developer to consider a larger set of technological challenges. Some programmers take this as a motivating challenge to learn, while others, it seems, would rather not have to learn something new.

If it were my money being spend to create features, I know what kind of developer I would want.

Trivial changes can have large effects. By organizing your software around features, you get a more coherent system that allows for growth. It may challenge your developers, but it drives down the number of hand-offs needed to implement a feature and it challenges the developers to improve the parts of the application they are working on.

See also my blog post on Architecture as tidying up.

Posted in English, Java | 15 Comments

The Architecture Spike Kata

Do you know how to apply coding practices the technology stack that you use on a daily basis? Do you know how the technology stack works? For many programmers, it’s easy enough to use test-driven development with a trivial example, but it can be very hard to know how to apply it to the problems you face every day in your job.

Java web+database applications are usually filled to the brim with technologies. Many of these are hard to test and many of these may not add value. In order to explore TDD and Java applications, I practiced the Java EE Spike Kata in 2010. Here’s a video of me and Anders Karlsen doing this kata at JavaZone 2010.

A similar approach is likely useful for programmers using any technology. Therefore, I give you: The rules of the Architecture Spike Kata.

The problem

Create a web application that lets users register a Person with names and search for people. The Person objects should be saved in a data store that is similar to the technology you use daily (probably a relational database). The goal is to get a spike working as quickly as possible, so in the first iteration, the Person entity should probably only contain one field. You can add more fields and refactor the application later.

The rules

The most important rules are Robert Martin‘s three rules of Test-driven development:

  • No code without test (that is, the code should never do something that isn’t required in order to get a test to pass)
  • Only enough test to get to red (that is, the tests should run, give an error message and that error message should correct)
  • Only enough code to get to green (that is, the tests should run and not give an error)
  • (My addition: Refactor on green without adding functionality)

Secondly, application should be driven from the outside in. That is, your first test should be a top-level acceptance test that tests through http and html. It’s okay to comment out or @Ignore this test after it has run red for the first time.

Lastly, you should not introduce any technology before the pain of not doing so is blinding. The first time you do the kata in a language, don’t use a web framework beyond the language minimum (in Java, this means Servlets, in node.js it’s require('http'), in Ruby it means Rack). Don’t use a Object-Relational Mapping framework. Don’t use a dependency injection framework. Most definitely don’t use an application generator like Rails scaffold, Spring Roo or Lift. These frameworks can be real time savers, but this kata is about understanding how the underlying technology works.

As a second iteration, use the technologies you use on a daily basis, but this time set up from scratch. For example, if your project uses Hibernate, try configuring the session factory by hand. By using frameworks in simplest way possible, you’ll both learn more about what they bring to the table and how to use them properly. For complex technology like Hibernate, there’s no substitute for deeper understanding.

What to expect

So far, I’ve only done the Architecture Spike Kata in Java. But on the other hand, I’ve done it around 50 times together with more than ten other developers. I’ve written about how to get started with the Java EE Spike Kata (in Norwegian) on my blog before.

This is what I’ve learned about working with web applications in Java:

  • Most Java web frameworks seem to harm more than they help
  • Hibernate is a bitch to set up, but once it’s working, it saves a lot of hassle
  • Using TDD with Hibernate helped me understand how to use Hibernate more effectively
  • I’ve stopped using dependency injection frameworks (but kept on using dependency injection as a pattern)
  • I have learned several ways to test web applications and database access independently and integrated
  • I no longer have to expend mental energy to write tests for full stack application

The first time I write this kata with another developer, it takes around 3 to 5 hours, depending on the experience level of my pair. After running through it a few times, most developers can complete the task in less than an hour.

We get better through practice, and the Architecture Spike Kata is a way to practice TDD with the technologies that you use daily and get a better understanding of what’s going on.

Posted in English, Extreme Programming, Java, Software Development | 1 Comment