A Brief Adventure with Universal Repositories and REST Web Services

Inspired by Per Mellqvist (and myself, to be fair), I wanted to explore the possibility of using a generic DAO or Repository interface for REST. Based on this simple idea, I was able to create a very cute and testable prototype of a full stack for REST-based Web Services. The most interesting aspect was creating a universal test case for Repositories.

This article shows how little code is required to implement and test a REST-based Web Service in Java, despite the horror of the Java HTTP client API. The source code can be downloaded from my Subversion repository. I also want to illustrate how to create black box tests that can be reused efficiently with different implementations of a Repository.

The generic Repository and its friend, the generic Repository Test

A Repository or Data Access Object allows a client program to Create, Retrieve, Update and Delete objects. A simple design that has served me well is something like this:

However, when you start implementing this interface, you will soon find that you need a little more functionality even in the general case. Here is a test case that shows one of my favorite patterns for black box testing of repositories:

Notice that I want the insertedObject to be equal to the updatedObject, but in order for this to make any sense, I need them to be different object instances. Most Repository implementations will have some sort of session cache, and the point of the writeChanges() method in this case is to flush this cache, so objects returned will be fresh. Having a session cache avoids a lot of confusion, so I like to test for it. That is what the final assertSame in testRetrieve verifies.

The second thing we need for a generic repository is a way to deal with unwanted aliasing. This is a particular issue with Object-Relational Mapping technology such as Hibernate or TopLink. If I change an object I got back from the Repository, but don’t call Repository.update, the object may still be changed in the database. This is because the Object-Relational Mapper will keep track of all changes in objects that it has returned. Here is a test case that illustrates the problem (and solution):

With discardChanges in place to ensure that we don’t write to the Repository when we don’t want to, the final Repository interface looks like so:

This interface proves to be both generic enough that we can construct a generic test case for it, and powerful enough that it can be implemented in a variety of ways.
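To make the idea of a generic test case concrete, here is a minimal sketch of how such a reusable black box test could be organised. It is not the code from my repository: the shape of Repository (untyped keys, no delete), the names RepositoryContract and MemoryRepositoryContract, and the use of plain checks instead of JUnit assertions are all assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Assumed shape of the interface, reduced to what the checks below need.
interface Repository<T> {
    Object insert(T object);
    T retrieve(Object key);
    void update(Object key, T object);
    void writeChanges();
    void discardChanges();
}

// Implementation-independent checks; a subclass only decides which Repository is tested.
abstract class RepositoryContract {
    protected abstract Repository<List<String>> createRepository();

    void testRetrieve() {
        Repository<List<String>> repository = createRepository();
        List<String> inserted = new ArrayList<>();
        inserted.add("first");
        Object key = repository.insert(inserted);
        repository.writeChanges(); // flush the session cache

        List<String> retrieved = repository.retrieve(key);
        check(retrieved.equals(inserted), "retrieved object equals inserted object");
        check(retrieved != inserted, "retrieved object is a fresh instance");
        check(repository.retrieve(key) == retrieved, "same instance within one session");
    }

    void testUnwantedAliasing() {
        Repository<List<String>> repository = createRepository();
        List<String> original = new ArrayList<>();
        original.add("first");
        Object key = repository.insert(original);
        repository.writeChanges();

        repository.retrieve(key).add("changed without update");
        repository.discardChanges();
        check(repository.retrieve(key).equals(original), "discarded change is not stored");
    }

    static void check(boolean condition, String description) {
        if (!condition) throw new AssertionError(description);
    }
}

// One possible subject: a map-backed repository with a built-in session cache.
// Simplification: update writes straight to storage, and unknown keys are not handled.
class MemoryRepositoryContract extends RepositoryContract {
    protected Repository<List<String>> createRepository() {
        return new Repository<List<String>>() {
            private final Map<Object, List<String>> stored = new HashMap<>();
            private final Map<Object, List<String>> session = new HashMap<>();

            public Object insert(List<String> object) {
                Object key = stored.size() + 1;
                stored.put(key, new ArrayList<>(object)); // storage keeps its own copy
                session.put(key, object);
                return key;
            }
            public List<String> retrieve(Object key) {
                return session.computeIfAbsent(key, k -> new ArrayList<>(stored.get(k)));
            }
            public void update(Object key, List<String> object) {
                stored.put(key, new ArrayList<>(object));
                session.put(key, object);
            }
            public void writeChanges() { session.clear(); }
            public void discardChanges() { session.clear(); }
        };
    }
}
```

A second implementation would get the same checks by adding another small subclass that overrides createRepository().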

The Memory Repository

Since the interface is so small, I decided to implement a pure in-memory variant of Repository using HashMaps as the internal storage. The code is trivial; the only interesting thing is the session cache on top of it.

As it turns out, the SessionCachedRepository can be used in other scenarios as well, and I will reuse it for the REST implementation of the repository. There are a few challenges to a generic SessionCachedRepository. Most importantly, because insert returns the key for the object, we need to have it write through to the server. In order to implement discardChanges, I save a list of inserted objects that can be removed manually:
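The following is a rough sketch of the mechanism described above, written for this text rather than copied from the downloadable source. The Repository signature, the copying MemoryRepository that stands in for the server, and the field names are assumptions. The parts that matter are that insert writes through to the delegate, and that discardChanges deletes whatever was inserted during the session.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

// Assumed shape of the Repository interface (untyped keys).
interface Repository<T> {
    Object insert(T object);
    T retrieve(Object key);
    void update(Object key, T object);
    void delete(Object key);
    void writeChanges();
    void discardChanges();
}

// Stand-in for the "server": a HashMap that only stores and hands out copies.
class MemoryRepository<T> implements Repository<T> {
    private final Map<Object, T> storage = new HashMap<>();
    private final UnaryOperator<T> copier;
    private long nextKey = 1;

    MemoryRepository(UnaryOperator<T> copier) { this.copier = copier; }

    public Object insert(T object) {
        Object key = nextKey++;
        storage.put(key, copier.apply(object));
        return key;
    }
    public T retrieve(Object key) {
        T stored = storage.get(key);
        return stored == null ? null : copier.apply(stored);
    }
    public void update(Object key, T object) { storage.put(key, copier.apply(object)); }
    public void delete(Object key) { storage.remove(key); }
    public void writeChanges() { }
    public void discardChanges() { }
}

class SessionCachedRepository<T> implements Repository<T> {
    private final Repository<T> delegate;
    private final Map<Object, T> cache = new HashMap<>();
    private final Map<Object, T> updated = new HashMap<>();
    private final Set<Object> deleted = new HashSet<>();
    private final List<Object> insertedKeys = new ArrayList<>();

    SessionCachedRepository(Repository<T> delegate) { this.delegate = delegate; }

    public Object insert(T object) {
        Object key = delegate.insert(object); // write through: the key comes from the delegate
        insertedKeys.add(key);                // remembered so discardChanges can undo it
        cache.put(key, object);
        return key;
    }
    public T retrieve(Object key) {
        if (deleted.contains(key)) return null;
        if (!cache.containsKey(key)) {
            T object = delegate.retrieve(key);
            if (object != null) cache.put(key, object);
        }
        return cache.get(key);
    }
    public void update(Object key, T object) {
        deleted.remove(key);
        cache.put(key, object);
        updated.put(key, object);
    }
    public void delete(Object key) {
        cache.remove(key);
        updated.remove(key);
        deleted.add(key);
    }
    public void writeChanges() {
        updated.forEach(delegate::update);
        deleted.forEach(delegate::delete);
        delegate.writeChanges();
        clearSession();
    }
    public void discardChanges() {
        insertedKeys.forEach(delegate::delete); // manually remove the written-through inserts
        delegate.discardChanges();
        clearSession();
    }
    private void clearSession() {
        cache.clear();
        updated.clear();
        deleted.clear();
        insertedKeys.clear();
    }
}
```

In this sketch only updates and deletes are buffered until writeChanges. An object that is changed but never passed to update is simply forgotten when the session is cleared, which is the behaviour the aliasing test asks for.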

(Note to self: Could I implement the key with a smart proxy to get around this problem?)

The REST repository

I implemented both the server and the client side of the REST Web Service with many of the same pieces. By using the same Repository interface on both sides of HTTP, I can easily exchange remote and local tests. The great advantage of this is that it is usually much easier to get the local part to work first, and then work on the server. Here is the client side Repository:

This code really shows off the beautiful simplicity of the REST scheme. Each method does an HTTP call to a URL that is determined by the ID, encodes the object as XML, and decodes the result as XML, too. The XML mapping can be as complex as you want, but it is independent of the REST stack. The only other complex part of this solution is the RESTHttpClient. I implemented this on top of Java’s poor HttpURLConnection in order to avoid dependencies. It’s a horrible API, so please don’t blame me for the horrible client code. :-(

There is one thing that should make you go “that’s funny” in the RESTClientRepository code, namely the insert method. Here, we get a String back and convert it to a URL. What is going on here?

With REST web services, the normal way of creating a new resource is to POST the resource. The server responds with 201 Created and puts the URL of the new resource in the “Location” header. I have taken a shortcut with regard to this in the RESTHttpClient.doPostAndReturnLocation. An interesting point here is that I can use the URL as key throughout the client side. As long as the code doesn’t get too curious about the actual class of the key, this works fine.
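For readers who have not seen the POST-then-Location exchange in code, here is a small sketch on top of java.net.HttpURLConnection. It only borrows the method name doPostAndReturnLocation from the text. The LocationPost class, the parameters and the error handling are assumptions, and the real RESTHttpClient may look quite different.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class LocationPost {
    // Hypothetical helper: POSTs an XML document and returns the URL of the created resource.
    static URL doPostAndReturnLocation(URL collection, String xml) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) collection.openConnection();
        try {
            connection.setRequestMethod("POST");
            connection.setDoOutput(true);
            connection.setRequestProperty("Content-Type", "application/xml; charset=UTF-8");
            try (OutputStream body = connection.getOutputStream()) {
                body.write(xml.getBytes(StandardCharsets.UTF_8));
            }

            int status = connection.getResponseCode();
            if (status != HttpURLConnection.HTTP_CREATED) {
                throw new IOException("Expected 201 Created, got " + status);
            }
            String location = connection.getHeaderField("Location");
            if (location == null) {
                throw new IOException("201 response without a Location header");
            }
            // Resolve against the request URL in case the server sent a relative reference.
            return new URL(collection, location);
        } finally {
            connection.disconnect();
        }
    }
}
```

The returned URL can then be handed back from insert and used as the key for later retrieve, update and delete calls.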

Now for the other side of the equation, the servlet:

Again, REST maps one to one with the servlet specification. The doPost mirrors what we saw on the client side. It returns 201, and sets the Location header. There is a minor snafu, though. I have been unable to find the full path to the servlet, so I had to paste it together from all the different parts of the request properties. I would very much appreciate input on how this should be done.

I have one more trick up my sleeve when it comes to the servlet. The code for validating the content of the request is usually littered all over the Servlet. In order to fix this, I use an unchecked exception and override HttpServlet.service to deal with it:

Who needs a web framework?! Hahaha!

Testing it all

I have created two implementations of the RESTHttpClient used by the RESTClientRepository. One is a bridge from the interface to HttpServletRequest and HttpServletResponse. This way, I can test the client and server connection in memory. This is very fast so it comes in quite handy. The code uses spring-mock MockHttpServletRequest and MockHttpServletResponse.

Here is the code that runs the http to servlet bridge test:

Notice how I wrap the RESTClientRepository in a SessionCachedRepository. This way, we can reuse the implementation of this code.

Of course, no test of Web Services would be complete without at least some traffic over the network! Following the recipe on how to run unit tests with Jetty, I start the servlet in Jetty and run the unit tests against it. This code implements my RESTHttpClient using java.net.HttpURLConnection. It’s a horrible class, but it’s usable (barely).

Conclusion

RepositoryTest tests a generic Repository interface with three implementations: in-memory, simulated servlets, and real HTTP traffic. The RepositoryTest has almost all the code, and there are only minimal overrides in the subclasses. This illustrates how to reuse tests in different contexts. The test is also a good example of how to test on a black-box level, focusing on the interesting behavior, and not the implementation.

RESTClientRepository and RESTRepositoryServlet implement the client and server-side parts of a full REST stack. I have kept the XML binding as simple as possible in this article, but in a later post, I will explore how to do more interesting stuff with the XML binding, including implementing lazy relationships over REST.

The source of the article is available under the Apache Software License. Feel free to use it any way you see fit. However: I will be extremely grateful if you drop me a line if you find it useful, or if there’s anything you’d like to see improved.

About Johannes Brodwall

Johannes is Principal Software Engineer in SopraSteria. In his spare time he likes to coach teams and developers on better coding, collaboration, planning and product understanding.
  • Jakob

    Hi Johannes,

    Check out the WP-Syntax plugin for WordPress. It will give you beautiful output for your code examples.

  • Mats

    Really elegant construction this.
    When trying to run it, I had to add 10 external jars. Makes me wonder
    if they're all really needed. Second, in the “trade” package there are
    unresolved references to the …rest.UrlMemoryRepository class.
    Still, really cool stuff.

  • Hi, Mats

    Yeah, as it turns out, Spring is a dependency hog, and the other dependencies stack up.

    If you're not doing it already, you really should use Maven to build this (and other) projects. It really helps with the dependency mess.

    Sorry about the UrlMemoryRepository. I am reworking the code, and it is in a bit of an intermediate state right now. If this is a blocker for you, I can prioritize fixing it. Let me know.

  • Mats

    No problem, I can manage. Just wanted to give feedback in case you weren't aware. And thanks for the Maven tip.

    Another thing is that the repository interface looks a lot like the Map interface. Makes me wonder about the differences, such as method names (should “retrieve” be called “get”?), the type parameters (wouldn't it be nicer with a key type parameter?), and whether it would be possible to go so far as to have Repository implement Map.

    Also I didn't quite like the “hack” with URLs being used interchangeably with Keys, or the “Long.valueOf(parts[1])” in RESTRepositoryServlet.getKey where Keys are suddenly assumed/restricted to be Longs.

    Not “complaining”, just trying to provide some humble feedback, hoping it might be of some interest.

  • Thanks for the good comments, Mats.

    I have indeed considered using the Map interface. My main issue is how to treat keys during inserts. The Map interface is not designed for key generation purposes. I've toyed with a few options, but found none that I like.

    Using URLs as keys is actually one of the neat things about REST. This will become more obvious if I get around to publishing work on more complex models.

    Regarding “get” versus “retrieve”: I chose “retrieve” because of the CRUD patterns (but I cowardly left “insert” as “create” to avoid the overloaded meaning of “insert”). Also, I wanted to remove the association with getters. But “get” might be a better name after all.

  • Mats

    When GETting, it's pretty clear that if an object contains a reference to another object, it is either included in the returned document, or it is referenced.
    When POSTing or PUTting, however, the situation is more complex. I don't think it matters whether you are creating or updating: what to do with direct attributes is clear, but not what to do with referenced objects. A referenced object might exist only on the client side, or it might be in the repository, but with different attributes. Do you have a clear vision of how to handle this?

  • Hi, Mats

    I'm thinking about GET as analogous to lazy loading proxies with Hibernate (or another persistence framework). This means that a GET request could return either a link or the embedded object. Using the same analogy, PUT and POST requests can be similar to cascading saves, I think. This means that the onus is on the client to ensure that the correct state is transferred.

    Does that make sense?
