Monthly Archives: November 2009

Tips for database migrations

A colleague asked me today for my top tips on database refactorings. Here was my answer:

  1. Have an organized structure in which you run named migrations (à la Ruby on Rails migrations or dbdeploy). A common approach that works well is to name scripts with a sequence number (001, 002, …) or a timestamp (20091124071300, …) and keep a table in the database that tracks which scripts have been run.
  2. Use views and materialized views to support backward compatibility (NB: Oracle is very strong here; other databases may struggle).
  3. If possible, make each migration backward compatible with one version of the software. This is easier to achieve the more frequently you release the software.
  4. Separate schema changes (for example, adding a column) from data migration (for example, populating the column). The bugs will typically be in the second of these, and it is easy to make transactional, whereas schema changes are not transactional in most databases.
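
The bookkeeping in the first tip can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `MigrationRunner` and its method are hypothetical names, an in-memory set stands in for the tracking table in the database, and each script is reduced to a `Runnable`. It is not how dbdeploy or Rails implement it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: names and structure are illustrative only.
class MigrationRunner {
    // Stands in for the database table that records which scripts have been run.
    private final Set<String> applied = new LinkedHashSet<>();

    // Runs every script that has not been applied yet, in name order,
    // and returns the names of the scripts that were run this time.
    List<String> migrate(TreeMap<String, Runnable> scripts) {
        List<String> ran = new ArrayList<>();
        // TreeMap iterates in sorted key order, so zero-padded sequence numbers
        // or timestamps at the start of the name give the execution order.
        for (Map.Entry<String, Runnable> entry : scripts.entrySet()) {
            if (applied.contains(entry.getKey())) {
                continue; // already run against this database
            }
            entry.getValue().run();
            applied.add(entry.getKey()); // record only after the script succeeded
            ran.add(entry.getKey());
        }
        return ran;
    }
}
```

Running `migrate` twice with the same scripts does nothing the second time; only scripts added since the last run are executed.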

Have I covered the most important points?

Posted in Norsk, Software Development | 17 Comments

Why don’t we call our customers “clients”?

Lately I’ve been thinking a lot about how easy it is to lose sight of the goal of a project and instead focus on whatever means someone thought was a good starting point when the project was first conceived. And I think it all comes down to words.

The first years I was working in this business, I didn’t see any distinction between “the user” and “the customer”. Once I started seeing the distinction, I started to understand that the person who is going to use the system we’re developing is not the person who defines what the system should do, and neither of these is usually the person who pays me to develop the system. So I started distinguishing between the product owner (that is, the customer) and the end user. But the product owner often calls the person I call “end user” his “customer”. What’s going on here? Let’s check the dictionary:

CUSTOMER
Main Entry: cus·tom·er
Pronunciation: \ˈkəs-tə-mər\
Function: noun
1: one that purchases a commodity or service
2: an individual usually having some specified distinctive trait

CLIENT
Main Entry: cli·ent
Pronunciation: \ˈklī-ənt\
Function: noun
1: one that is under the protection of another : dependent
2a: a person who engages the professional advice or services of another
2b: customer
2c: a person served by or utilizing the services of a social agency
2d: a computer in a network that uses the services (as access to files or shared peripherals) provided by a server

I’ve seen suppliers approach their work by asking for a specification of a product to deliver and then trying to deliver something to that specification for payment. The mental model is that of a customer going to the grocery store asking for “eight pounds of CRM software”. My experience with organizations with this sort of mindset has always been unsatisfactory.

On the other hand, I’ve seen suppliers approach their work as an agent of the organization that pays them. “Our job is to enable someone else to do their job better.” This totally changes the way an organization deals with this relationship. The word “customer” may not be conducive to this sort of thinking. Instead, we should think of ourselves as agents acting on behalf of a client. As an agent, your responsibility is to enable your client. This includes helping your client to find better means of reaching their goal.

By the way, Wikipedia defines the word “agent” as “a person who is authorized to act on behalf of another (called the Principal or client) to create a legal relationship with a Third Party”. If the “third party” is the computer, then a good developer is an agent acting on their client’s behalf in dealings with the computer software.

Why doesn’t the software industry use the word “client” instead of “customer”?

Posted in English, Extreme Programming, Non-technical | 10 Comments

The Malmö Experiment: Estimation Techniques Shootout

At ØreDev I ran into Lasse Koskela. We started talking about estimation techniques, and we both felt that the dominant technique, relative estimation with planning poker, has been unchallenged for a very long time. We found ourselves wondering what the next big idea about estimation will be. After throwing a couple of ideas back and forth, we decided to invite people to a workshop comparing a few estimation techniques. We decided to call the workshop “The Malmö Experiment.”

The results of the experiment were interesting, but far from conclusive.

During the experiment, we gave the same set of requirements to three teams, each consisting of three estimators. Each team was told to use a different technique. We decided on the following techniques:

  • Planning poker: The purpose is to give all requirements a relative number (the meaning of these numbers will later be measured based on the output of the iterations). Each estimator has a deck of cards and chooses the card with the number he feels is appropriate for the current requirement. Everyone reveals their numbers at the same time to avoid anchoring. The team discusses and reestimates a requirement until their estimates converge.
  • Table spread estimation: This is one of the new techniques we proposed. Each requirement is written on a card. The estimators spread the cards along a large table according to the relative effort required per requirement. Numbers can be imposed later if desired.
  • Goldilocks estimation: The purpose is to restructure requirements until they all have roughly equal size. Instead of assigning a number to a requirement, the estimators pick one of three options: Too big (split up and estimate the parts again), Too small (merge with other requirements), or Just right. When all requirements have been split or merged into “Just Right” size, the estimation is complete.

All teams found their estimation techniques to be motivating, but the Table Spread and Goldilocks groups managed to complete the estimation much faster. The Table Spread estimation would obviously need more space if we had a lot of requirements, while the Goldilocks estimation would generate a large number of requirements.

Based on these experiences, we propose the following experiment in a project:

  • Use Table Spread Estimation for release planning. This will encourage the team to keep the number of requirements low instead of planning in too much detail too far ahead. Since the table spread is quick, it can be redone every iteration.
  • Use Goldilocks Estimation for the next few upcoming iterations to split up the requirements into equal-sized items. This will generate a better set of work items. The shorter planning window will ensure that we won’t have an unmanageable number of requirements.

These are currently very rough ideas and we have no idea whether they will work as we expect. Let me know if you have any relevant experience or if you want more information.

Posted in English, Extreme Programming | 9 Comments

Effective Enterprise Java at Öredev

Just three weeks ago, I was asked to step in for Ted Neward to give a tutorial at Öredev on Effective Enterprise Java. As I did not have time to get the tutorial materials printed, I present them here on the web for the participants and others.

1. Effective Enterprise Java architecture in 2009

Since the Effective Enterprise Java book was written, many of the topics regarding transactions, concurrency and shared state have been resolved. Here are the basic guidelines of an enterprise application in Java as of 2009:

  1. All processing is triggered by an event, such as an http-request, a timer or an incoming message
  2. Each processing event is handled in an isolated scope, never touching the data of another processing event. All coordination of data happens through the data layer. This means that objects are either stateful and short-lived or stateless and immortal.
  3. Each processing event is either completed or aborted totally. Very few applications will benefit from trying to automatically recover from most problems.
  4. Inconsistent updates are resolved when transactions are committed, usually through optimistic locking.
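
To make guideline 4 concrete, here is a minimal sketch of optimistic locking, assuming a version number stored with each row. `VersionedStore` is a hypothetical in-memory stand-in for the data layer, not a Hibernate or JPA API; in a real application the check would typically be a version column that the persistence framework compares when the transaction commits.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a table whose rows carry a version number.
class VersionedStore {
    static final class Row {
        final String value;
        final int version;

        Row(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Row> rows = new HashMap<>();

    synchronized void insert(String key, String value) {
        rows.put(key, new Row(value, 0));
    }

    synchronized Row read(String key) {
        return rows.get(key);
    }

    // Succeeds only if nobody else has updated the row since it was read, similar to
    // "UPDATE ... SET version = version + 1 WHERE id = ? AND version = ?".
    synchronized boolean update(String key, String newValue, int expectedVersion) {
        Row current = rows.get(key);
        if (current == null || current.version != expectedVersion) {
            return false; // stale read: the caller aborts the whole processing event
        }
        rows.put(key, new Row(newValue, expectedVersion + 1));
        return true;
    }
}
```

If two processing events read the same row and both try to update it, the first update succeeds and the second returns `false`, which fits guideline 3: the losing event is aborted rather than repaired.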

Some things I told the attendees to consider: First, today most people consider EJBs to be more trouble than they are worth (with the exception of Entity beans 3.0, which is JPA, which is really mostly Hibernate, which really isn’t very much EJB). Second, all triggers can be forged. We return to the second issue when we discuss security.

2. Web integration testing

I showed a practical demo using WebDriver and Jetty to perform web integration tests as JUnit tests. The remarkable things about this example are that it requires no installation of an app server (Jetty is installed as a Maven dependency), it requires no separate starting of an application server (Jetty can run embedded in the test) and it is very fast (Jetty starts up in about 200 milliseconds).
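
The shape of such a test can be sketched with the JDK alone. Note the substitutions: this is not the demo from the tutorial. The JDK's built-in `HttpServer` stands in for embedded Jetty, a plain HTTP GET stands in for WebDriver, and the class name, path and page content are made up for illustration. What carries over is the idea that the test starts the server in-process on a free port, exercises it over HTTP, and stops it again.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Sketch only: JDK HttpServer instead of Jetty, plain HTTP GET instead of WebDriver.
class EmbeddedServerCheck {
    // Starts a server on a free port, fetches one page, stops the server,
    // and returns the response body.
    static String fetchGreeting() {
        try {
            // Port 0 asks the operating system for any free port.
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/hello", exchange -> {
                byte[] body = "<h1>Hello</h1>".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                HttpURLConnection connection = (HttpURLConnection)
                        URI.create("http://127.0.0.1:" + port + "/hello").toURL().openConnection();
                try (InputStream in = connection.getInputStream()) {
                    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
            } finally {
                server.stop(0); // always shut down, even if the request failed
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a JUnit test, the body of `fetchGreeting` would be split into setup, an assertion on the fetched page, and teardown.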

3. Hibernate integration testing

I showed a practical example of how to test a DAO implemented with Hibernate. The remarkable thing about this demonstration was that, again, no installation or startup is required (I use H2 as an in-memory database).

Hibernate is a power tool. I use the following analogy: If you’re building a tunnel and need to mine through a mountain, you want to use dynamite. If you want to remove a rock from your back yard, you may want to use dynamite. But if you don’t know what you’re doing, chances are you may blow your foot off.

Hibernate is like that dynamite. You need knowledge and safety measures to deal with it correctly, but when you do, it can save you a lot of effort. Creating JUnit tests for your Hibernate code is one such safety measure.

4. Security

Almost all the threats an application developer should be concerned with are in the same class, namely that of injection attacks. An injection attack is when a client tricks another process into treating data as instructions, for example by using SQL meta-characters:

Little Bobby Tables

An important source of injection vulnerabilities is HTML injection, also known as Cross-Site Scripting (XSS).

In both situations, and in all others, there’s one important guideline: Data from the outside world should be considered “tainted”. Never use tainted data in unsafe ways. When reading input parameters, validate against malicious characters (but please don’t make poor “O’Reiley” unable to use your system). When writing HTML pages, always escape tainted data. When using tainted data during access to the database, always use PreparedStatement and send in the data as parameters; with HQL or JPQL, use query parameters in the same way.
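
A minimal sketch of the two output-side rules, escaping on the way into HTML and binding on the way into SQL. The class name, the `users` table and its `name` column are hypothetical. The escaping method is hand-rolled only to show the principle; in a real project I would expect the templating library to provide it.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch only: the class, the "users" table and its "name" column are hypothetical.
class TaintedData {
    // Escapes the characters that let tainted text break out of HTML element
    // content or a quoted attribute value.
    static String escapeHtml(String tainted) {
        StringBuilder out = new StringBuilder(tainted.length());
        for (int i = 0; i < tainted.length(); i++) {
            char c = tainted.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // The tainted name travels as a bound parameter, never as part of the SQL text,
    // so quotes and semicolons in it are treated as data.
    static boolean userExists(Connection connection, String taintedName) throws SQLException {
        try (PreparedStatement statement =
                 connection.prepareStatement("SELECT 1 FROM users WHERE name = ?")) {
            statement.setString(1, taintedName);
            try (ResultSet result = statement.executeQuery()) {
                return result.next();
            }
        }
    }
}
```

Neither method rejects the apostrophe in a name like “O’Reiley”: the name is stored and displayed intact, it just never gets the chance to act as markup or SQL.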

Another often overlooked exploit is request forgery, often used in combination with phishing attacks. To protect your users from request forgery, supply an authentication token as a hidden field with all forms. Or if you’re lazy: Make sure all operations have confirmation dialogs.
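
The token approach can be sketched as follows. `FormTokens` and its methods are hypothetical names; storing the token in the user's session and rendering it as a hidden form field are assumed to be handled by whatever web framework is in use.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

// Sketch only: session storage and rendering of the hidden field are left
// to the web framework.
class FormTokens {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Creates an unguessable token to store in the session and emit as a hidden field.
    static String newToken() {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    // Accepts a form post only if the submitted token matches the session's token.
    static boolean isValid(String sessionToken, String submittedToken) {
        if (sessionToken == null || submittedToken == null) {
            return false;
        }
        // MessageDigest.isEqual avoids revealing where the first mismatch occurs.
        return MessageDigest.isEqual(
                sessionToken.getBytes(StandardCharsets.UTF_8),
                submittedToken.getBytes(StandardCharsets.UTF_8));
    }
}
```

A forged request from another site can make the browser send the user's cookies, but it cannot know the token, so `isValid` rejects it.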

5. Continuous Deployment

Continuous Deployment is the practice of rolling out a deployment to a server after every successful build on your Continuous Integration server. I described two ways of doing Continuous Deployment during the tutorial, but I will restrict this discussion to the more modern one.

Most teams doing continuous deployment use Maven or Ant to invoke the deployment tools of their respective application servers. Many application servers make this pretty hard, but the hardest part of the battle is finding out what command needs to be invoked. The Continuous Integration server can be configured to run this task.

After doing deployment, it is a good idea to run some sort of system level integration tests. Teams use replay of production data, load generators like JMeter and webcrawlers that validate HTML and CSS to do automated non-functional integration tests. If you keep your logs clean, you can actually gain quite a bit of confidence just by looking at the logs after applying simulated load to your system.

Some projects take this even further, by continuously deploying to production. Both IMVU and Flickr are known to practice this.

At any rate, the practice of doing continuous deployment should lead you to consider how to simplify your deployment and runtime configuration, which will result in an easier installation procedure into production, even if it’s not automated.

Summary

Effective Enterprise Java development has progressed a lot since 2004. Much of the emphasis now is on how to improve testing in enterprise Java applications. The way applications usually process data has stabilized as well, with most applications preferring each event to be processed in an isolated, transactional context with very little automated recovery.

In the end, Effective Enterprise Java is a lot simpler in 2009 than it was in 2004.

Material

  • My slides, including topics that we didn’t discuss as well as code for all the examples
  • The complete source code for one iteration of my Enterprise Java Kata, including a pom.xml file with all dependencies needed to get the tests running

Posted in English, Java, Software Development | Leave a comment