Category Archives: SOA

On Integration: Why I enjoy working with databases

Status: This article is currently pretty dry. I’d like feedback on how to make it more eloquent.

In my previous blog post, I promised to write more about using databases as the main integration strategy. In the current post, I plan to cover maybe the most important question: “Why?”

Imagine an application where every time it wants to communicate with another system, it reads or writes to the database. For now, let’s ignore how this would work, and how it would evolve, which will be the subject of later posts. What advantages does this offer?

The alternative is usually to integrate with another system through a variety of means. In Java, the most common ones are Web Services, RMI, EJBs (which offer their own quirks in addition to those of RMI), sockets, and various tricks using the file system.

The most important issue to me is invariably productivity. When I work with databases, I can generally use Object-Relational Mapping tools, which are a very productive way of accessing database data from an application. RMI offers similar advantages, but you will have to build lazy loading on top of the domain model yourself if you want a rich model where the objects are interconnected. Web Services generally have some bindings to Java, but in my experience these are really inadequate. Either the Java side suffers (you are forced to have getters and setters, to use arrays instead of collections, or to use strings as the main data type), or the XML side suffers by having non-specific types (if you use collections). Sockets, of course, are very unproductive: they give up productivity for simplicity.
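To make concrete what "building lazy loading on top of the domain model" costs you, here is a minimal sketch of a hand-rolled lazy reference. The `Customer`/orders names are invented for illustration, and the supplier here just returns canned data where a real system would make a remote call or a query; an ORM generates this bookkeeping for every association for free.

```java
import java.util.List;
import java.util.function.Supplier;

// A hand-rolled lazy reference: loads its value on first access and caches it.
// An ORM generates this for you; over RMI you would write it yourself.
class Lazy<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) { this.loader = loader; }

    synchronized T get() {
        if (!loaded) {
            value = loader.get();   // e.g. a remote call or a database query
            loaded = true;
        }
        return value;
    }
}

class Customer {
    // Orders are not fetched until someone actually navigates to them.
    final Lazy<List<String>> orders =
        new Lazy<>(() -> List.of("order-1", "order-2"));  // stand-in for a real fetch
}
```

Multiply this by every association in a rich, interconnected domain model, and the productivity gap between an ORM and a remoting layer becomes obvious.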

The data managed by the remote service generally comes from a database anyway. This means that the data access code will have to be developed somewhere regardless, and a remoting layer will have to be developed in addition.

To maintain sustainable productivity, we need unit tests. Unit testing has proved hard for me to do well with both Web Services and RMI, and EJBs are of course out of the question. As my regular readers know, using a test database for standalone unit testing is quite simple. As an added bonus, tests that use the database will essentially have verified the integration. When I use a remoting protocol, I always run into strange problems very late in the test process.

Both unit testing and productivity benefit from the fact that dealing with databases is something we’ve done for a long time. The tools and techniques for doing so are very mature compared to other methods of integration.

Second, there is the problem of reliability. If you use a single database, everything you do happens within one transaction: either all work is committed, or it is rolled back. This vastly simplifies your logic if you care about correctness. For distributed systems, this is in theory solved by the two-phase commit protocol. However, my experience is that it adds so much complexity to a solution that the system can metaphorically collapse under its own weight. As a result, most solutions I’ve seen (and, I suspect, most solutions I haven’t) simply ignore this problem. This means that the odd resource error that occurs might very well have very unpredictable results.
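To make the complexity concrete, here is a deliberately naive toy two-phase commit coordinator in plain Java. The `Participant` interface and its methods are invented for illustration; even this toy version shows the failure handling you take on the moment you leave a single transaction, and it omits the durable logging and crash recovery a real coordinator needs.

```java
import java.util.List;

// Toy two-phase commit: every participant must vote yes in phase one,
// or the coordinator rolls everyone back in phase two.
interface Participant {
    boolean prepare();   // phase 1: "can you commit?"
    void commit();       // phase 2a: all voted yes
    void rollback();     // phase 2b: someone voted no
}

class Coordinator {
    // Returns true if the distributed transaction committed.
    static boolean run(List<Participant> participants) {
        for (Participant p : participants) {
            if (!p.prepare()) {                          // one "no" aborts everything
                participants.forEach(Participant::rollback);
                return false;
            }
        }
        // The window between the votes and the commits is where real 2PC
        // needs durable logs and recovery -- the complexity this toy omits.
        participants.forEach(Participant::commit);
        return true;
    }
}
```

With a single database, all of this collapses into one commit() call.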

A remote layer also introduces another place where things can go wrong. Many developers end up coding recovery routines to deal with these kinds of errors. In my experience, this is some of the most error-prone code you can write.

Third, performance-wise it is hard to beat the database. Most other methods will eventually hit the database anyway, and as a general rule, adding more steps to a solution seldom makes it faster. There are some issues with scalability, however, which I will address in a later post.

Last, and maybe most importantly, I have never seen a standard interface for dealing with remote services. Solutions generally end up with half a dozen or more different policies for accessing different back-end systems. One thing we can always be sure of, though: there will be a database among those back-end systems, no matter what else you have to talk to. Every extra communication mechanism you remove reduces the shoestring-and-paperclip factor of your system.

By using a single data source as the place for communicating with other systems, we reduce complexity and improve testability, performance and reliability.

I hope that in this post, I have demonstrated why, in an ideal world, you would want to use a single database as your primary integration mechanism. However, the world is rarely ideal. Database schemas change, more load is added than a single database can tackle, you have to understand a forest of database schemas, and some applications should not be allowed to access all the data. In my next blog post, I will talk about how to solve these problems without giving up the single-database vision. Stay tuned for evolution, scalability, security, reuse, and understandability.

Posted in SOA, Software Development | 4 Comments

On Integration: The vision of a single database

Before Web Services, there was CORBA. Before CORBA, there was DCOM. Before DCOM, there was RPC. Before RPC, there were BSD sockets. Before sockets, there were databases. And as it was in the beginning, so shall it too be in the end.

The only systematically successful integration strategy in the history of computing is the database. I have discovered more and more lately that integration using a database is well-defined (DDLs – a WSDL that works!), flexible (views and triggers can hide many old sins), well-supported (today, powerful Object-Relational Mapping tools should be de rigueur for any sensible project), and performant (sooner or later, you’re gonna hit the database anyway). Using modern database features, it can be made secure and scalable as well. In the end, databases are the best thing since, well, since databases.

I want to write a series of blog posts detailing strategies I use and explore to make database integration work. For now, let me just share my vision with you: one huge enterprise database that appears flat to any application using it, with every application in the enterprise sharing that single database instance. This vision has many practical issues in terms of performance, security, maintainability and understandability. I will spend the blog posts exploring these issues.

First, though: Where is integration using databases applicable, and how does it relate to the main buzzword of the day, Service Oriented Architecture (SOA)?

Database-based integration is only applicable for applications that are not distributed across multiple organizations or distributed widely within the same organization. This is what we can call “application-to-application” (A2A), as opposed to “business-to-business” (B2B). For B2B, technologies associated with SOA are still going to be your best bet. Also, I would not use database integration from desktop clients (“2-tier architecture”). I am not sure whether this is just because everyone has been so excited about the 3-tier architecture for so long. Maybe you could make it work. However, I don’t much care for desktop clients, so I will leave this subject to someone else.

However, when the “services” are just internal services between different parts of my application portfolio, I think integration at the database layer is greatly underused.

Posted in SOA, Software Development | 10 Comments

Why I Love SOA: Design Business-Related Services

What happens when a customer asks for a simple new bit of functionality? Do you have to execute changes on four different systems, test each in isolation and in combination, involve a separate testing, infrastructure and operations team?

If so, your architecture is probably not service oriented. In this post, I will examine the real meaning of coupling, and how it relates to SOA.


Posted in SOA | 8 Comments

Why I Hate SOA: Bad Ideas that Just won’t Die

When I see people after they have read about SOA or attended an SOA conference, there are a few ideas that seem to pop up repeatedly. I have even been guilty of using these ideas myself. These ideas were proven to be bad before SOA came around, and (some) SOA evangelists seem to think that SOA solved these problems. It did not; it just refused to learn from history. Some of these ideas work under some circumstances, but recent SOA-itis has caused them to be used in inappropriate contexts.


Posted in SOA, Software Development | Leave a comment

TSS JS Europe Recap


I just returned from The ServerSide JavaSymposium Europe. Great conference, with interesting tracks and good opportunities to get to know people. The conference was in Barcelona, which was interesting, because hardly anyone (taxi drivers and waiters included) understood English. It was the first time I’ve been in a place where I was totally unable to communicate verbally with the people around me, so it was a bit of an adventure. My only gripe about the location is the price of WiFi access in Spanish hotels.


In the tracks I attended, there seemed to be two recurring themes: Grid-technology and EJB 3. JPA looks like it is coming along nicely, and after discussing with spec lead Mike Keith some issues I’ve had with JPA, I am fairly convinced we will attempt a migration soon.

The grid-based talks were Terracotta, JGroups, GigaSpaces and Coherence (which I did not attend). I especially liked Nati Shalom’s view of the tierless architecture. GigaSpaces has Spring-like DAO support, which makes it feel as natural to use as JDBC. This lowers the barrier to entry quite a bit, but it brings forth a problem with space-based technologies: the level of abstraction is currently that of a DAO, not that of an ORM. This means that, as far as I could see, you will not have a rich domain model with references between your domain objects if you use spaces. You will also have to implement referential integrity yourself. (I hope to write up a little piece on creating lazy-loaded abstractions on top of tuplespaces in the future; this might help address the issue.)

A second problem with grid-based data is that of the fragmented cluster, or split brain. For those who do not know what split brain means, consider the following: a large grid usually has a failover mechanism that elects a new master if the old master is unavailable. Now, if the network is split down the middle, there will be two nodes that each think they are the master. Each half of the split brain will update the data, and when they reconnect, the data will have to be merged. Even with databases this is not a painless operation, but I expect that with a tuplespace it may well be unrecoverable.
Bela Ban was the only speaker I attended who covered fragmentation satisfactorily. I wish he had explored merging in more detail, though. Jonas Bonér seemed quite unprepared for the same question.
I don’t need split brain remerging to be perfect, but I need to be confident that the vendor has understood the issues and that the whole space won’t be corrupted when it gets back up again.
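To illustrate why remerging after a split brain is painful, here is a minimal simulation in plain Java. The `Replica` class and the account keys are invented stand-ins: two halves of a partitioned cluster each accept writes, and on reconnect any key written on both sides is a conflict that no generic merge can resolve safely.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Two replicas diverge during a network partition; merging detects conflicts.
class Replica {
    final Map<String, String> data = new HashMap<>();
    final Set<String> dirty = new HashSet<>();  // keys written during the partition

    void put(String key, String value) {
        data.put(key, value);
        dirty.add(key);
    }

    // Returns the keys both sides changed independently. For these there is
    // no safe automatic answer -- the heart of the split-brain problem.
    Set<String> conflictsWith(Replica other) {
        Set<String> conflicts = new HashSet<>(dirty);
        conflicts.retainAll(other.dirty);
        return conflicts;
    }
}
```

A database at least gives you transactions and constraints to lean on while resolving each conflict; a tuplespace that merges blindly can leave the whole space inconsistent.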

Outstanding sessions

  • John Davies: It was very interesting to hear John Davies speak. C24 has some outstanding technology within the banking sector, and I am really looking forward to seeing if my company can make use of it.
  • Kirk Pepperdine: The performance anti-patterns talk was very interactive and fun. Kirk gave out chocolate to attendees, which is always a good way to win me over.
  • Heinz Kabutz: I went to both of Heinz’ talks. He is a very entertaining speaker. Sadly, I knew most of what he talked about already.

Ideas of my own

With the help of people I talked to during the conference, a few new ideas have started germinating in my head. I hope to be writing more about this in the future:

  • The SOA battle: More voices critical of SOA should be heard. I may re-engage myself in this
  • Object-based queries: I hope to be working on a portable version of TopLink Expressions/Hibernate Queries. The query language is the last thing preventing persistence technology portability
  • Myopic Development: I had good results focus testing the term “myopic development” instead of “agile development”. I will write more on this later
  • Lazy loading: Lazy loading, like dynamic proxies, is not hard, but people are afraid of it. I’d like to see custom lazy-loading become more mainstream
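The point about dynamic proxies not being hard can be shown in plain JDK code. This is an illustrative sketch: the `Greeter` interface and `LazyProxies` helper are invented names, and the factory lambda stands in for whatever expensive load (a database hit, a remote call) you want to defer.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// An invented example interface, standing in for an expensive domain object.
interface Greeter {
    String greet();
}

// A JDK dynamic proxy that defers creating the real object until first use.
class LazyProxies {
    @SuppressWarnings("unchecked")
    static <T> T lazy(Class<T> type, Supplier<? extends T> factory) {
        InvocationHandler handler = new InvocationHandler() {
            private T target;  // the real object, created on the first call

            @Override
            public Object invoke(Object proxy, Method method, Object[] args)
                    throws Throwable {
                if (target == null) {
                    target = factory.get();  // load lazily, e.g. from a database
                }
                return method.invoke(target, args);
            }
        };
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type }, handler);
    }
}
```

The caller just sees a `Greeter`; nothing is constructed until the first method call goes through the proxy.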

Stay tuned for updates.

Posted in SOA, Software Development | 2 Comments

SOA and enterprise architecture

Roger Sessions has just published “A Better Path to Enterprise Architectures“. His main point is that large, centralized, big-bang enterprise architecture efforts fail. I could not agree more. Sessions gives some good arguments for why you would want to deliver incrementally. He calls this approach SOA.

This is a fairly common way of defining SOA – basically, SOA is another name for incremental deliveries. If this is SOA, I don’t hate SOA at all. However, I find that “SOA as incremental deliveries” fails to shine any new light on the subject. To paraphrase Sessions: if you have a $100 million system, you should at least split it up into 10 deliveries. Now, this is neither hard, unusual, nor especially helpful. The problem is, of course, that a $10M project is still very large. How can you split that up further? Finding good services without too much coupling gets harder as you move down to single-project scales. And so far, I have seen no description of SOA that sheds any light on this particular issue.

I have discovered that SOA services (at least XML-based protocols in Java) tend to be used for horizontal integration. That is, you insert a SOA layer as the interface to your data access logic. This is by definition going to be a very wide interface, and any change to the UI, the data services or the database is likely to require a coordinated change through the whole stack. SOA is not necessarily a good idea for horizontal integration. (There is one valid use: rich clients beyond the firewall.) SOA for vertical integration (for example, a webshop that integrates with a payment system) seems to be beneficial fairly frequently. However, I have not seen this as often in the wild as I’ve seen harmful horizontal integration.

I’d like to finish this post with a jab at a small comment in Sessions’ article: “Code Sharing—Many organizations believe that reuse is achieved through code sharing. It is somewhat amazing that this belief persists, despite decades of failure to achieve this result. The best way to reduce the amount of code that a given project needs is through delegation of functionality (as in the case of Web services), not through code sharing.” This is one of the most common myths about SOA. I would like to submit that code sharing has succeeded wildly. Look at reuse of Java open-source projects, especially after Maven entered the picture. Reuse through services, on the other hand, couples your whole service network together, making it hard to change anything without risking breaking something totally unrelated. With horizontal integration being the most common way to use SOA, the information schema is larger, and the coupling is even tighter. Code sharing is not without problems, but SOA does nothing to improve upon these problems.

Posted in SOA, Software Development | Leave a comment

SOA evolution

In my previous post, I talked about how I feel SOA encourages rigid design. Of course, in some situations, you may not really have a choice. When creating Business-to-Business (B2B) integration, interfaces will naturally be much more rigid. There is no way around it, SOA or SOA-not.

Ian Robinson recently published an article on Martin Fowler’s website titled Consumer-Driven Contracts: A Service Evolution Pattern. The article gives some very clever ideas about how to let the consumers drive the service definition instead of defining it on the provider side (this is related to Lean Software Development, by the way). The article also goes through some of the challenges of a contract-oriented interface.
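Robinson’s idea can be illustrated with a minimal consumer-side contract check in plain Java. This is a sketch, not Robinson’s actual implementation: the field names and the map-based “response” are invented stand-ins for whatever the real wire format is. The point is that each consumer asserts only the fields it actually uses, so the provider stays free to change everything else.

```java
import java.util.Map;
import java.util.Set;

// A consumer-driven contract: each consumer states the fields it depends on,
// and the provider may change anything outside that set without breaking it.
class ConsumerContract {
    private final Set<String> requiredFields;

    ConsumerContract(Set<String> requiredFields) {
        this.requiredFields = requiredFields;
    }

    // True if the provider's response still satisfies this consumer.
    boolean isSatisfiedBy(Map<String, ?> response) {
        return response.keySet().containsAll(requiredFields);
    }
}
```

Run one such check per consumer in the provider’s build, and a provider change breaks the build only when it touches a field some consumer has actually declared a need for.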

As I said, for B2B integration, contracts may be essential. For internal integration within an organisation, however, they seldom are. When I have the choice, I’d rather not incur the cost of developing and maintaining a contract. The cost in time spent on a contractual interface is relatively simple to see, but the cost in lost agility is substantial.

Posted in SOA, Software Development | Leave a comment

Why I hate SOA in less than 200 words.

At JavaZone 2005, I talked about “Why I hate SOA”. I found it hard then, and I have found it hard ever since, to express this sentiment concisely. I think I’ve finally got it!

One of the most common inefficiencies I discover in organizations is poorly designed boundaries. I find that people suffer when a boundary is too rigid, not when it is too loosely defined. Contractual interfaces create a “mine versus yours” mentality, where every problem beyond the boundary is not corrected but instead wrapped in even more code. Almost all such boundaries lead to wrappers, lots of extra crufty code, duplicated decisions and thus a rigid architecture.

When used for a stable interface, a contract-driven approach may be appropriate. However, when I create an interface contract, it takes me many attempts before I get it right. In our profession, a big problem is our tendency to focus too narrowly on our own small, and poorly divided, slice of the problem. SOA-thinking for in-house integration encourages this approach, instead of encouraging cooperation and “thinking inside a bigger box”.

Posted in SOA | 4 Comments

Architecture Astronauts

Joel Spolsky had a blog entry that seems eerily familiar: The Architecture Astronauts (in outer space)

I’m starting to see a new round of pure architecture astronautics: meaningless stringing-together of new economy buzzwords in an attempt to sound erudite.

I’ve seen the type, and I’m glad to say that we haven’t got any of those around. A bad architect can cause enormous damage.

Posted in Links, SOA, Software Development | Leave a comment