Archive for Software Development

Agile Release Pattern: Database migrations

As I release more frequently, I start to focus on automating the actual process of deploying a release. One of the most powerful steps of automating deployment is to automatically upgrade the database schema.

This technique first saw mainstream use with the Ruby-on-Rails framework. Today, there are several mature tools that will help you organize and execute database changes (Scala Migrations, Ruby-on-Rails Migrations, dbdeploy, Liquibase). And if none fit you perfectly, it’s easy to create your own.

In my current project, we have rolled our own solution for this:

  • All changes to the database are stored as SQL-files that are packaged into the deployment unit (in our case, a WAR-file). These files will usually contain statements like “ALTER TABLE ADD COLUMN” and “CREATE TABLE”. To get the files executed in the right order, we name the files with an increasing sequence number, like 012-add_payment_type_to_customer.sql.
  • Whenever the application is started, it looks for a table named “MIGRATIONS” in the database and creates it if it doesn’t exist.
  • At startup, the application looks through the list of migration files it has been packaged with and sees which file names don’t have an entry in the MIGRATIONS table.
  • The application executes all the scripts that haven’t been executed already. If any script fails to execute, it makes a note of the error in the MIGRATIONS table and refuses to start the application.
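
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from our project: the class name MigrationPlanner, the FILE_NAME column and the assumption that each file holds a single statement are all made up for the example, and recording failed scripts in the MIGRATIONS table is left out.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; all names are illustrative. */
final class MigrationPlanner {

    /** Returns the packaged scripts that have no entry in MIGRATIONS, in file-name order. */
    static List<String> pending(List<String> packagedFiles, Set<String> alreadyApplied) {
        List<String> result = new ArrayList<>();
        for (String file : packagedFiles) {
            if (!alreadyApplied.contains(file)) {
                result.add(file);
            }
        }
        // Zero-padded sequence prefixes (001-, 012-) make string order the execution order.
        Collections.sort(result);
        return result;
    }

    /** Runs one script and records it. The caller should abort startup if this throws. */
    static void apply(Connection connection, String fileName, String sql) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(sql);
        }
        // FILE_NAME is an assumed column name for the MIGRATIONS table.
        try (PreparedStatement insert =
                connection.prepareStatement("INSERT INTO MIGRATIONS (FILE_NAME) VALUES (?)")) {
            insert.setString(1, fileName);
            insert.executeUpdate();
        }
    }
}
```

At startup, the application would call pending(...) with the file names found in the WAR and the names read from MIGRATIONS, then call apply(...) for each result in order.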

We run the migration procedure every time we start up the application, whether it is in test or production. Even the JUnit tests that access the database will run any pending migrations before starting. The result is that any database change that we intend to roll out into production will at the very least be executed once on each developer’s private copy of the database, as well as once on the continuous integration server. By the time the migrations get executed in a controlled testing environment, we’re pretty confident that they work as intended.

Some migration tools use a more friendly (and portable) syntax than SQL DDL statements. Many allow for rollback of migrations. Most don’t automatically execute the pending migrations on application start, but require a separate command to execute them.

Your first step towards automating database migrations is to make sure that every change in the database is represented by some sort of script and that all these scripts are versioned with the rest of your code. From there, you can improve your process when you notice a step in the process that seems to involve too much work or risk.

Automating the deployment process will reduce the need for documentation and the opportunity for errors during one of the most critical times in the project. It is especially important to reduce the possibility of miscommunication and mistyping if the people responsible for deployment are in a separate organizational unit, which often seems to be the case. Make their job as easy as possible!


Are you an architect or just a freaking good developer?

A software architect who doesn’t care about what his system is supposed to do isn’t worth his salt. For the term “software architect” to hold any meaning at all, it must be to describe someone who understands what the customer needs and designs a system that is fit for this purpose.

Sometimes, however, people talk about “technical architects”. I have been guilty of falling into this category once or twice myself. The fallacy of the “technical software architect” is the belief that a large number of solutions will work independently of what problem the customer has.

This is ludicrous. The very meaning of architecture is to apply technology to a context. Leave out the context, and you have no basis to choose a technology.

This is not to say that you don’t need experts within technologies and design strategies used in the solution. If the customer wants to connect a number of old and new systems into a shared portal, an Enterprise Service Bus (or a Data Warehouse!) may be a good solution. And then you will need people who are experts on the technology in question.

However, I would not call such an expert an architect. If you’re really good in some technology, I have no problem calling you a freaking good developer. Heck, you can even put that on your business card. Just don’t think you’re an architect.


Agile release pattern: Feature-on/off-switch

If you want to release frequently, a problem you may encounter is that some features, even though functionally complete, don’t stand well on their own, but require other features to be valuable to the user. If you want to release the system in this state, you need a way to hide features. A Feature-on/off-switch is a simple idea for dealing with this.

A feature-on/off-switch is some mechanism to hide features in a system. It must be able to remove menu items concerning the feature and also prevent adventurous users from accessing the feature directly. It may range from something as crude as commenting out code (not recommended!) to enabling the feature based on a complex set of conditions (also not recommended).

I’ve encountered feature switches triggered by the following mechanisms:

  • A configuration file or configuration database table tells the system whether to turn the feature on or off.
  • The feature is turned on for users that have a specific role (typically something like BETA_TESTER).
  • The feature is turned on when the system is deployed as /foo-preview, but not when the system is deployed as /foo.
  • The feature is turned on after a specific date. This may seem weird, but was a potential solution when we were waiting for a release of another system and operations-freeze during summer was in effect.
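
A few of these triggers can be combined in one small class. The following is only a sketch under my own assumptions: the class name FeatureSwitch, its fields and the rule that any single trigger turns the feature on are invented for illustration, and “after a specific date” is read as “on or after”.

```java
import java.time.LocalDate;
import java.util.Set;

/** Hypothetical feature switch; every name here is made up for illustration. */
final class FeatureSwitch {
    private final boolean enabledInConfig;
    private final String requiredRole;   // e.g. "BETA_TESTER", or null for no role trigger
    private final LocalDate enabledFrom; // or null for no date trigger

    FeatureSwitch(boolean enabledInConfig, String requiredRole, LocalDate enabledFrom) {
        this.enabledInConfig = enabledInConfig;
        this.requiredRole = requiredRole;
        this.enabledFrom = enabledFrom;
    }

    /** The feature is on if any one of the configured triggers fires. */
    boolean isOn(Set<String> userRoles, LocalDate today) {
        if (enabledInConfig) {
            return true;
        }
        if (requiredRole != null && userRoles.contains(requiredRole)) {
            return true;
        }
        return enabledFrom != null && !today.isBefore(enabledFrom);
    }
}
```

The same isOn check should guard both the code that renders the menu item and the code that handles the request, so that a user who guesses the URL is still kept out.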

There are probably many more conditions you may use to trigger a feature-on/off-switch. Maybe some of my readers have good examples?


Generalized observation

As a general observation, it seems that when software architects try to solve general problems, they come up with horrible designs; when they solve specific problems, they come up with good designs.

Designs made without reference to a problem often become complex and not very fit for purpose when we’re solving specific problems. As a general rule, avoid generalizations.

Some examples:

  • An Enterprise Service Bus may create a big project and ongoing maintenance needs for something that turns out to be only a few simple integration points.
  • Splitting a system into generic reusable services may make it harder to understand and maintain.
  • A generalized security role model may be hard to understand when the only thing the system needed was an “is_admin?” toggle.
  • A complex role-based (“can create payment”) security model may hide control of authorization from the developers, making it harder to implement the usually more important data-based (“can access account number 5”) security model.
  • A general model for workflow transitions may make it harder to implement the specific workflow you need for a particular process. And it leads to endless discussions about the relationship between a workflow, a process and a process step.
  • Generalized test strategies are often vague and require a large number of test environments. In the end, they don’t contribute to increased quality.

In “No Silver Bullet”, Fred Brooks introduces the concepts of “essential complexity” and “accidental complexity.” The complexity from generalization is always “accidental” (that is: not inherently necessary). When you focus on solving the essential complexity (that is: the users’ problem) as efficiently as possible, you may find the complexity of your problem shrink by half or more.

Have you ever found yourself thinking, “what specific problem was I trying to solve again”? Then you’ve probably been down the dark road of generic thinking.


Unified task list: A requirement mirage?

When developing a system that people use in their day to day work, I often meet the following requirement: “A user should be able to see all tasks from each functional area on a single screen.” This requirement requires integration with all parts of the system, making it architecturally costly. Luckily, the requirement might often not be needed at all!

Everyone will tell you: The most powerful technique for dealing with a large problem is breaking it up in smaller problems. But we often end up breaking the problem in such a way that the parts will have to talk with each other. And in software development, integration is expensive and bug ridden. So I want to avoid it.

There are some areas that tempt me to integrate the parts more tightly:

  • I might want to share information about customers. This is usually a fairly benign integration point, as it’s read only.
  • I might want to reuse a service that looks similar to what I need. In a larger ecosystem, this often creates more noise, as other users of the service may be affected by any changes I need to make.
  • And there might be places where a user may want to see information from several systems. The most common one is the “unified task list”. This often results in each system having to let go of the life-cycle of its own processes, which may increase complexity a lot.

Of these, the “unified task list” is the one that’s at the same time most costly and most related to real requirements. While rejecting the other two can potentially make the project a bit more expensive, rejecting a unified task list means you’re not delivering what the user asked for.

Or did the user really ask for a unified task list? On my last project, we ended up having separate task lists for separate types of tasks. The users were indifferent about unifying these. As a matter of fact, we got the following responses from our user panel:

  • In one user’s organization, they only process forestry subsidies every few months, because there are so few and they want to see the total subsidy expenditure.
  • In another user’s organization, they process the same tasks every month, because there’s a lot more activity.
  • In almost all the organizations, the person dealing with forestry subsidies is a different person from the one dealing with road subsidies. They have their expertise in different areas, know the external parties for their separate fields and want separate task lists for their jobs.

Do you have a requirement for a “unified task list” in your project? Are you really sure the users really need it?


What is the right iteration length?

When picking iteration length for an agile project, there are mainly two forces that you have to balance. The rate of learning is proportional to the number of iterations, rather than the length of the project. This means that shorter iterations help you get better faster. But each iteration has some overhead with sprint reviews, retrospectives and planning. You don’t want this overhead to dominate the effort spent on the project.

For some reason, most projects I’ve seen with little experience in iterative development prefer three week iterations. Personally, I prefer two week iterations. Here is the breakdown:

  • Three week iterations: After three months, you’ve spent about 7% of your time on iteration meetings. You’ve had 4 opportunities to improve.
  • Two week iterations: After three months, you’ve spent about 10% of your time on iteration meetings. You’ve had 6 opportunities to improve.
  • One week iterations: After three months, you’ve spent about 20% of your time on iteration meetings. You’ve had 12 opportunities to improve.

Going from 93% to 90% efficiency for a 50% increase in learning seems like a good deal. Going from 90% to 80% efficiency for a 100% increase in learning, not so much.
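
The percentages above can be reproduced with a small calculation. This sketch assumes one working day of iteration meetings per iteration, five working days per week, and a three-month period of twelve weeks; those assumptions and all names are my own choices for the example.

```java
/** Hypothetical calculator for the iteration-length trade-off. */
final class IterationOverhead {
    static final int WORKING_DAYS_PER_WEEK = 5;
    // Assumption: review, retrospective and planning take one day per iteration.
    static final double MEETING_DAYS_PER_ITERATION = 1.0;

    /** Fraction of working time spent on iteration meetings. */
    static double overheadFraction(int iterationWeeks) {
        return MEETING_DAYS_PER_ITERATION / (iterationWeeks * WORKING_DAYS_PER_WEEK);
    }

    /** Number of completed iterations, each one an opportunity to improve. */
    static int iterationsIn(int projectWeeks, int iterationWeeks) {
        return projectWeeks / iterationWeeks;
    }
}
```

With these assumptions, overheadFraction gives about 0.07, 0.10 and 0.20 for three, two and one week iterations, and iterationsIn(12, ...) gives 4, 6 and 12 opportunities to improve.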

These numbers are of course greatly simplified. You might also consider:

  • With shorter iterations, the planning time may go down. But this takes practice – it doesn’t happen automatically.
  • With very short iterations, you may not have experienced enough to learn much from the retrospective. However, if you do a timeline in the retrospective and most of the things people remember happened in the last week, that may not be because the last week was the only time something significant happened.
  • You may consider different frequencies for different ceremonies. For example, on my current project we want to have demos with our power users. But they have to travel far to visit us. So we only have a full demo every four weeks. We plan every two weeks and have an internal review and retrospective every two weeks.

What’s the right iteration length for your project?


Getting started with pair programming

As it turns out, one of the least used practices of agile development is also one of the most powerful.

Up until the start of last year, I only worked sporadically with pair programming. Last year, I was lucky enough to be part of a team that used pair programming all the time. Now that I’ve experienced real pair programming, I never want to give it up.

Pair programming offers benefits to many stakeholders:

  • As a developer, you will have more fun at work. You will get to know your colleagues better and experience flow practically the whole day. You will be tired by the end of the day, but you will also feel like you’ve accomplished good work.
  • The team will have a higher quality code base that everyone is comfortable with.
  • As an architect or team lead, you will have a good way to contribute even if you only have a little time before a meeting. You will also have a better chance to influence the rest of the team, instead of just issuing edicts that nobody follows.
  • As the project manager, you will have a more flexible team. If someone gets sick, goes on vacation or moves to another project, there won’t be a big problem.
  • As the customer, you will get better quality code faster.

With these benefits in mind, why doesn’t everybody pair program? Well, it is unfamiliar, a little scary, and exhausting when you start out. Most developers are not used to having others watch them code, or to focusing on the task at hand the whole day.

Here are some techniques I’ve seen have effect for teams transitioning to pair programming:

  • Code dojos: Everyone on the team gets together and programs a sample program or a spike together. Two people sit at the keyboard, while the rest watch on a projector. Rotate pairs frequently. This lets everyone get comfortable with coding as a social activity.
  • Pair programming should be the norm, but allow for exceptions. If people only pair program occasionally, they end up not pair programming at all. If people are forced to pair program when they just need some time by themselves to think, they will not be happy pair programming.
  • The pair programming star: Write the names of the team members in a circle. Every time two people pair program, draw a line between their names. Keep the pair programming star in a visible location.
  • Facilities: The furniture can make it harder to get started pair programming. Consider using two mice, two keyboards and perhaps two monitors per PC to make it easier. Or use VNC for desktop sharing.
  • Give it time: Pair programming is exhausting when you first start doing it. It will take a while before people are comfortable with the new pace. But once they switch, they will never want to go back.

Resources

For more inspiration, see these presentations from the Smidig 2009 conference (in Norwegian):


Practicing Java EE

To get better, you have to practice. To get better at advanced things, you have to understand the basics well. To know why you use advanced tools, you have to try working without them. That is why I have spent the last few weeks repeatedly practicing building a very simple web application in Java. For the whole application, I started by writing the tests before the code that implements the functionality.

If you want to try the same exercise, this article contains some information to get you started. Start with the code below and follow the error messages. Leave a comment if you get stuck on an error message, and we will build up a FAQ.

The task

Solve the simplest possible problem that involves web pages and a database, using the simplest possible technology.

The task I have made consists of creating persons with a full name and searching for persons by name. To keep the task as small as possible, I have chosen to let persons have only one field: full name. The task takes about 2-3 hours without practice, and you can get it down to 60-90 minutes with training.

You can of course choose a different task, but whatever you choose: you learn more from repeating the same task several times than from doing one advanced task.

When I do the task, the most important thing I learn is to understand the error messages that guide me through the development. If you need help getting to the first error messages, see the rest of the article.

Step by step: The starting point

Even though I chose very simple technology for the implementation, I have chosen a larger set of libraries for writing the tests. I use the following when I write the tests:

  • JUnit 4.6
  • Jetty 6.1.22
  • HSqlDb 1.8.0.10
  • WebDriver-HtmlUnit 0.6.1039
  • Mockito 1.8.0
  • FEST-assert 1.2 (not required, but makes the tests sweeter)

The only technologies I have chosen for the implementation are Servlet-API 2.5 and Hibernate-Annotations 3.4.0.GA.

To save you from struggling with dependencies before you get started, I have made a pom.xml file that you can use as a starting point.

Web tests

To start the development, it is a good idea to have a test that starts on the outside of the application. Something like this:

  1. Start up the environment
  2. Add a person
  3. Search for the person

This is how you get started with a test that runs against a web application:

int SERVER_PICKS_PORT = 0;
org.mortbay.jetty.Server server = 
       new org.mortbay.jetty.Server(SERVER_PICKS_PORT);
server.addHandler(
       new org.mortbay.jetty.webapp.WebAppContext("src/main/webapp", "/"));
server.start();
 
int serverPort = server.getConnectors()[0].getLocalPort();
 
org.openqa.selenium.WebDriver browser =
       new org.openqa.selenium.htmlunit.HtmlUnitDriver();
browser.get("http://localhost:" + serverPort + "/");
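// Throws NoSuchElementException until the start page has a "Create person" link
browser.findElement(org.openqa.selenium.By.linkText("Create person"));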

This setup expects to find the web.xml file at src/main/webapp/WEB-INF/web.xml.

Functional test

A functional test defines the requirements of the application. It is a good idea to make functional tests as fast as at all possible, while still covering all the requirements. A functional test does not have to be an end-to-end test like the example above. This matters, because end-to-end tests are often very slow. Here are some examples of functional tests:

  • Show a page for creating new persons
  • Create a new person
  • Verify that the person’s name is given and does not contain illegal characters
  • Show a page for searching for persons
  • Show all persons if no search string is given
  • Search for the given search string

A functional test can look like this:

PersonServlet servlet = new PersonServlet();
 
HttpServletRequest req =
    org.mockito.Mockito.mock(HttpServletRequest.class);
HttpServletResponse resp =
    org.mockito.Mockito.mock(HttpServletResponse.class);
 
PersonDao personDao =
    org.mockito.Mockito.mock(PersonDao.class);
servlet.setPersonDao(personDao);
 
org.mockito.Mockito.when(req.getMethod())
    .thenReturn("POST");
org.mockito.Mockito.when(req.getPathInfo())
    .thenReturn("/create.html");
org.mockito.Mockito.when(req.getParameter("full_name"))
    .thenReturn("Johannes Brodwall");
 
StringWriter pageSource = new StringWriter();
org.mockito.Mockito.when(resp.getWriter())
    .thenReturn(new PrintWriter(pageSource));
 
servlet.service(req, resp);
 
org.mockito.Mockito.verify(personDao)
    .create(Person.withName("Johannes Brodwall"));
 
org.fest.assertions.Assertions.assertThat(pageSource.toString())
    .contains("Personen er opprettet");

Data access test

Hibernate simplifies database use a lot. But Hibernate is itself complex, and when you use it in more advanced ways, it deserves tests of its own. A typical test with Hibernate could be:

  1. Insert three persons into the database
  2. Search for part of the name of one of them
  3. Check that you get back exactly the one you expected

When I start with Hibernate, I write a test like this one and follow the error messages. Make sure to follow both the error messages in the log and the stack traces.

AnnotationConfiguration conf = new AnnotationConfiguration()
    .setProperty(Environment.URL, "jdbc:hsqldb:mem:persondaotest");
PersonDao dao = new HibernatePersonDao(conf.buildSessionFactory());
 
dao.create(Person.withName("foo"));
 
org.fest.assertions.Assertions.assertThat(dao.find(null))
    .containsExactly(Person.withName("foo"));

Follow the error messages from here.
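
One thing the error messages may not say clearly: both Mockito’s verify and FEST’s containsExactly compare objects with equals, so Person needs value equality for the tests to pass. Here is a minimal sketch of such a class. Only the withName factory appears in the tests; the field name and everything else are my assumptions, and the annotations and default constructor that Hibernate will ask for are deliberately left out.

```java
import java.util.Objects;

/** Hypothetical minimal Person with value equality; Hibernate mapping omitted. */
class Person {
    private final String fullName;

    private Person(String fullName) {
        this.fullName = fullName;
    }

    /** Factory method used by the tests. */
    static Person withName(String fullName) {
        return new Person(fullName);
    }

    String getFullName() {
        return fullName;
    }

    /** Two persons are equal when their full names are equal. */
    @Override
    public boolean equals(Object other) {
        return other instanceof Person
                && Objects.equals(fullName, ((Person) other).fullName);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(fullName);
    }

    /** A readable toString makes failed assertions easier to understand. */
    @Override
    public String toString() {
        return "Person<" + fullName + ">";
    }
}
```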

Integration

A very common way for the web server to hand over things like DataSources to the application is via JNDI. In Jetty, you can do this in the web test as follows:

org.hsqldb.jdbc.jdbcDataSource ds = new org.hsqldb.jdbc.jdbcDataSource();
ds.setDatabase("jdbc:hsqldb:mem:personwebtest");
ds.setUser("sa");
new org.mortbay.jetty.plus.naming.Resource("jdbc/primaryDs", ds);
 
// Start Jetty as shown above

Conclusion

Doing a small exercise like this is a good way to become aware of which habits you have and how long it actually takes you to do your tasks. You will find that writing tests before the code feels slower than what you think you are used to.

But if you are like me, you will also experience something else: When you try out the application for the first time (you can do this with Jetty, of course), chances are good that it will be fairly free of bugs and that debugging is largely unnecessary. I don’t know about you, but debugging is an activity I am happy to get rid of.


Tips for database migrations

A colleague asked me today for my top tips on database refactorings. Here was my answer:

  1. Have an organized structure where you carry out named migrations (à la Ruby-on-Rails migrations or dbdeploy). A common and well-functioning approach is to name scripts with a sequence number (001, 002, …) or a timestamp (20091124071300, …) and keep a table in the database that tracks what has been run.
  2. Use views and materialized views to support backward compatibility (NB: Oracle is very strong at this; other databases may struggle).
  3. If possible, make each migration backward compatible with one version of the software. This is easier to achieve the more frequently you release the software.
  4. Separate changes to the schema (for example: adding a column) from migration of data (for example: populating the column). The errors will typically be in the second of these, and it is easy to make transactional, while schema changes are not transactional in most databases.

Have I covered the most important points?


Effective Enterprise Java at Öredev

Just three weeks ago, I was asked to step in for Ted Neward to give a tutorial at Öredev on Effective Enterprise Java. As I did not have time to get the tutorial materials printed, I present them here on the web for the participants and others.

1. Effective Enterprise Java architecture in 2009

Since the Effective Enterprise Java book was written, many of the topics regarding transactions, concurrency and shared state have been resolved. Here are the basic guidelines of an enterprise application in Java as of 2009:

  1. All processing is triggered by an event, such as an http-request, a timer or an incoming message
  2. Each processing event is handled in an isolated scope, never touching the data of another processing event. All coordination of data happens through the data layer. This means that objects are either stateful and short-lived or stateless and immortal.
  3. Each processing event is either completed or aborted totally. Very few applications will benefit from trying to automatically recover from most problems.
  4. Inconsistent updates are resolved when transactions are committed, usually through optimistic locking.
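
To make the fourth guideline concrete, here is a small in-memory sketch of optimistic locking. With JPA or Hibernate you would normally get this behaviour from a @Version column; the VersionedStore class below is a hypothetical stand-in that only shows the version check, not how any particular framework implements it.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store; stands in for a table with a version column. */
final class VersionedStore {

    /** An immutable snapshot of a row as one processing event saw it. */
    static final class Row {
        final String value;
        final int version;

        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();

    /** Returns the current row, or null if the key has never been written. */
    synchronized Row read(String key) {
        return rows.get(key);
    }

    /**
     * Writes only if nobody else has committed since expectedVersion was read.
     * A key that has never been written counts as version 0.
     */
    synchronized boolean update(String key, String newValue, int expectedVersion) {
        Row current = rows.get(key);
        int currentVersion = (current == null) ? 0 : current.version;
        if (currentVersion != expectedVersion) {
            return false; // stale read: the caller should abort the whole processing event
        }
        rows.put(key, new Row(newValue, currentVersion + 1));
        return true;
    }
}
```

In line with the third guideline, a failed update is not retried automatically here; the processing event is simply aborted.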

Some things I told the attendees to consider: First, today most people consider EJBs to be more trouble than value (with the exception of Entity beans 3.0, which is JPA, which is really mostly Hibernate, which really isn’t very much EJB). Second, all triggers can be forged. We return to the second issue when we discuss security.

2. Web integration testing

I showed a practical demo using WebDriver and Jetty to perform web integration tests as JUnit tests. The remarkable things about this example are that it requires no installation of an app server (Jetty is installed as a Maven dependency), it requires no separate starting of an application server (Jetty can run embedded in the test) and it is very fast (Jetty starts up in about 200 milliseconds).

3. Hibernate integration testing

I showed a practical example of how to test a DAO implemented with Hibernate. The remarkable thing about this demonstration was that, again, no installation or startup is required (I use H2 as an in-memory database).

Hibernate is a power tool. I use the following analogy: If you’re building a tunnel and need to mine through a mountain, you want to use dynamite. If you want to remove a rock from your back yard, you may want to use dynamite. But if you don’t know what you’re doing, chances are you may blow your foot off.

Hibernate is like that dynamite. You need knowledge and safety measures to deal with it correctly, but when you do, it can save you a lot of effort. Creating JUnit tests for your Hibernate code is one such safety measure.

4. Security

Almost all the threats an application developer should be concerned with are in the same class, namely that of Injection attacks. An injection attack is when a client tricks another process into treating data as instructions. For example by using SQL meta-characters:

Little Bobby Tables

An important source of injection vulnerabilities is HTML injection, also known as Cross-Site Scripting (XSS).

In both situations, and in all others, there’s one important guideline: Data from the outside world should be considered “tainted”. Never use tainted data in unsafe ways. When reading input parameters, validate against malicious characters (but please don’t make poor “O’Reiley” unable to use your system). When writing HTML pages, always escape tainted data. When using tainted data during access to the database, whether with SQL, HQL or JPQL, always use PreparedStatement or query parameters and send in the data as parameters.
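
As an illustration of escaping tainted data when writing HTML, here is a minimal hand-rolled escaper. It is a sketch for element content and quoted attribute values only; the class name HtmlEscaper is made up, and in a real project you would probably rely on the escaping that your template or web framework provides.

```java
/** Hypothetical helper that escapes the five HTML meta-characters. */
final class HtmlEscaper {

    /** Returns the input with HTML meta-characters replaced by character references. */
    static String escape(String tainted) {
        StringBuilder out = new StringBuilder(tainted.length());
        for (int i = 0; i < tainted.length(); i++) {
            char c = tainted.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }
}
```

Note that escaping on output means “O’Reiley” can still be stored and displayed correctly; the apostrophe is only replaced at the moment it is written into the page.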

Another often overlooked exploit is request forgery, often used in combination with phishing attacks. To protect your users from request forgery, supply an authentication token as a hidden field with all forms. Or if you’re lazy: Make sure all operations have confirmation dialogs.
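
A token for the hidden field could be generated and checked roughly like this. The class name FormToken and the 32-byte token length are my own choices for the example, and storing the token in the user’s session is assumed rather than shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

/** Hypothetical helper for request forgery tokens. */
final class FormToken {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Creates an unguessable token to keep in the session and emit as a hidden field. */
    static String generate() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** Checks the submitted token against the session's token; a missing token never matches. */
    static boolean matches(String sessionToken, String submittedToken) {
        if (sessionToken == null || submittedToken == null) {
            return false;
        }
        // MessageDigest.isEqual compares whole arrays rather than stopping at the first difference.
        return MessageDigest.isEqual(
                sessionToken.getBytes(StandardCharsets.UTF_8),
                submittedToken.getBytes(StandardCharsets.UTF_8));
    }
}
```

The servlet would reject any POST where matches(...) returns false, before doing any other processing.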

5. Continuous Deployment

Continuous Deployment is the practice of rolling out a deployment to a server after every successful build on your Continuous Integration server. I described two ways of doing Continuous Deployment during the tutorial, but I will restrict this discussion to the more modern one.

Most teams doing continuous deployment use Maven or Ant to invoke the deployment tools of their respective application servers. Many application servers make this pretty hard, but the hardest part of the battle is finding out what command needs to be invoked. The Continuous Integration server can be configured to run this task.

After doing deployment, it is a good idea to run some sort of system level integration tests. Teams use replay of production data, load generators like JMeter and webcrawlers that validate HTML and CSS to do automated non-functional integration tests. If you keep your logs clean, you can actually gain quite a bit of confidence just by looking at the logs after applying simulated load to your system.

Some projects take this even further, by continuously deploying to production. Both IMVU and Flickr are known to practice this.

At any rate, the practice of doing continuous deployment should lead you to consider how to simplify your deployment and runtime configuration, which will result in an easier installation procedure into production, even if it’s not automated.

Summary

Effective Enterprise Java development has progressed a lot since 2004. Much of the emphasis now is on how to improve testing in enterprise Java applications. The way applications usually process data has stabilized as well, with most applications preferring each event to be processed in an isolated, transactional context with very little automated recovery.

In the end, Effective Enterprise Java is a lot simpler in 2009 than it was in 2004.

Material

  • My slides, including topics that we didn’t discuss as well as code for all the examples
  • The complete source code for one iteration of my Enterprise Java Kata, including a pom.xml file with all dependencies needed to get the tests running


This work is licensed under a Creative Commons Attribution 3.0 Unported License.