Thursday, September 18, 2014

Making WAR!

Objective

Find a build system that will let me build JAR and WAR files, import dependencies from a Maven repository, copy config files and dependencies, validate the environment and run dbupdate.  A very ordinary web application environment.

-

UPDATE: September 2021

Gradle is a lot more mature at this point and is the preferred way to go; everything below is in the context of 2014.

---


My development environment

At the time of writing

Kubuntu
git
JetBrains IntelliJ IDEA 13.x
Tomcat 8.x server
PostgreSQL 9.x


What the build system needs to do for me

1. Build WAR file that will be used in deployment (by Jenkins or similar)

2. Manage dependent files using Maven repository

3. Create a custom CATALINA_BASE directory, so the Tomcat install is not polluted with custom jars (which makes upgrading Tomcat non-trivial).  It holds project specific configuration such as:

  • DB connection pool
  • distributed session management (based on memcached or similar)
  • custom port (8080 is overused)
  • tomcat global jars
  • custom log4j configuration (with more info)

4. Reset database (drop database, create database, run bunch of SQL scripts using dbupdate)

5. Validate the environment (make sure the necessary applications and config files are in the right place).  This is useful when new people are getting set up on the environment, or to check that everything is running as expected

6. Run unit and integration tests

Contenders

  • Maven
  • Gradle
  • Ant with Ivy and python


Maven


If there were a scale of how rigid a build system is, Maven would be the most rigid.  It is either the Maven way or you are in for a world of hurt.  Anything custom needs to be written in Java as a plugin.  Once you define a project in Maven you can easily import it into IDEA and everything just works.  It will build the WAR file for you.  It will run unit tests.  It will manage dependencies.  When it works, it works very well.

The real problem came up when I tried to do things like copy files or execute external tasks: everything is a plugin with a very complicated XML configuration.  It took me almost an hour to get a simple file copy configured, and even then I couldn't get it to synchronize the files (copy only if newer).  I could have written a plugin to do that, but that is a lot more time messing with the build environment than I wanted to spend.
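To show how small the "copy only if newer" requirement is when scripted, here is a minimal Python sketch.  It is not what I ran at the time; the function names copy_if_newer and sync_tree are made up for illustration, and it assumes that comparing file modification times is a good enough definition of "newer".

```python
import shutil
from pathlib import Path
from typing import List


def copy_if_newer(src: Path, dst: Path) -> bool:
    """Copy src to dst only when dst is missing or older than src.

    Returns True if a copy happened, False if dst was already up to date.
    """
    if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
        return False
    dst.parent.mkdir(parents=True, exist_ok=True)
    # copy2 also copies the modification time, so an unchanged file
    # compares as up to date on the next run.
    shutil.copy2(src, dst)
    return True


def sync_tree(src_dir: Path, dst_dir: Path) -> List[Path]:
    """Mirror every file under src_dir into dst_dir, skipping up-to-date files.

    Returns the list of destination paths that were actually written.
    """
    copied = []
    for src in sorted(src_dir.rglob("*")):
        if src.is_file():
            dst = dst_dir / src.relative_to(src_dir)
            if copy_if_newer(src, dst):
                copied.append(dst)
    return copied
```

Calling sync_tree twice in a row copies everything the first time and nothing the second time, which is the behaviour I could not get out of the Maven configuration.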

Many of the things that I needed to do, such as dbupdate, creating a custom CATALINA_BASE and validating the environment, turned into calling Ant tasks to preserve sanity, and the XML config just grew in complexity.  I started to understand why some Maven POM files are borderline unreadable.

But I really liked the simplicity of how Maven fetched dependencies and built the WAR, and how well it worked with IDEA.  So I decided to write an Ant script that would call Maven to do builds and dependency updates, and use Ant/Python for everything else environment related.  But now I was managing 2 different build systems.  So I figured I'd try something that might simplify my life a bit.


Gradle

Gradle is near the middle of the build rigidity scale, leaning slightly towards Maven.

According to their site, they are the "next generation" of build systems, so there must be something to this claim.  I started a new project with the same requirements and was able to get dependencies configured and, using the WAR plugin, get the build done.  Seemed reasonable, but the Groovy syntax did not sit well with me: there were way too many ways to do the same thing, which made reading script samples from other people challenging.  Not a show stopper.

Copying files was also not difficult using the copy task, and once you define dependencies you can iterate through them and copy when you need.

Next I tried to incorporate dbupdate (or similar) and ran into a problem where the only solution was to define an Ant task and run it from there.  There are samples on stackoverflow.com, so I was able to get that running (here we have Gradle calling Ant, so I am back to supporting 2 build systems).  Now, resetting the database is a 3 part task: drop database, create database, run SQL scripts... in that order.  Gradle has a late addition for executing tasks sequentially, .mustRunAfter and .shouldRunAfter, which makes configuring dependent tasks overly complicated.  One of the main features of a build system should be doing things in a predictable order, and the Gradle way is very fragile:
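// dependsOn only says which tasks must run, not in what order
task resetDatabase(dependsOn: [dropDatabase, createDatabase, createChangelogTable, updateDatabase])
// the order has to be declared separately, one pair at a time
createDatabase.mustRunAfter dropDatabase
createChangelogTable.mustRunAfter createDatabase
updateDatabase.mustRunAfter createChangelogTable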
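For contrast, here is the kind of predictable ordering I mean, as a minimal Python sketch where the order of the list is the order of execution.  This is an illustration, not my actual setup: it uses the PostgreSQL command line tools dropdb, createdb and psql as a stand-in for dbupdate, and the function names and script names are made up.

```python
import subprocess
from typing import Callable, List, Sequence


def reset_commands(db_name: str, sql_scripts: Sequence[str]) -> List[List[str]]:
    """Build the reset steps as a plain list: drop, create, then each script."""
    commands = [
        ["dropdb", "--if-exists", db_name],
        ["createdb", db_name],
    ]
    for script in sql_scripts:
        # ON_ERROR_STOP makes psql exit with an error when a statement fails
        commands.append(
            ["psql", "-v", "ON_ERROR_STOP=1", "-d", db_name, "-f", script]
        )
    return commands


def run_in_order(
    commands: Sequence[List[str]],
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> None:
    """Run the commands strictly in list order.

    With the default runner, check_call raises on a non-zero exit code,
    so a failed step stops everything after it.
    """
    for command in commands:
        run(command)
```

Something like run_in_order(reset_commands("appdb", ["001_schema.sql", "002_data.sql"])) would then drop, create and populate the database in exactly that order, with nothing else to declare.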

Another issue came up when I imported the Gradle project into IDEA.  Since this is a web app, I need to mark some jars as provided (servlet-api, jsp-api, etc.), but there is no such thing in Gradle.  Runtime is as far as I could go, but that is not the same as provided: runtime means the jars need to be included at runtime but not for compiling, whereas provided means they are needed for compiling and the server supplies them at runtime.  This matters because there is a servlet spec that states that you can only have 1 copy of servlet-api.jar or Tomcat will complain on startup; this is to make sure your servlet API is consistent with the server.  There is a custom Gradle plugin for doing provided, but it did not work with IDEA.  Every time I changed dependencies, it would remove the Tomcat libraries from the project dependencies and the code would break.  To fix it I would have to open Project Settings and manually add a dependency on the Application Server libraries, which marked them as provided.  This got annoying very quickly, as it happened every time anything changed with dependencies.

Another minor issue was with the IDEA Gradle plugin and the Web facet: it created one entry for every directory in WEB-INF, which made for a very complicated configuration when all it really needed was a single WEB-INF to context root mapping.

I do think that many of the problems with Gradle were caused by the IDEA integration, but since that is my environment it affected my decision.

Ant

Ant is at the opposite end of the build rigidity scale from Maven: it is very flexible and imposes very little structure on the build process, which means I have to define a bit more up front.

Once you add the Ant script to IDEA it just runs it.  There is no dependency management by default (there is an Ivy plugin for IDEA, which I have not tried yet), but Apache Ivy uses Maven repositories and is very easy to integrate into Ant.  Pulling down dependencies and copying them into your project is an easy Ant task (not an implicit action as it is with Maven or Gradle).  Ant is explicit about everything, and sequential dependencies are easy to set up.  The only obvious issue was that I had to add WEB-INF/lib to .gitignore, since Ivy would pull all the jars into the source tree.

There are Ant tasks for almost everything, since it has been around for a long time.  dbupdate was a simple configuration.  Creating a custom CATALINA_BASE was also a simple task.  Checking the environment is where it gets weird with Ant, since you have to define tasks that set properties for anything you want to check, and then other tasks that check the values of those properties.  So after a little while Ant tasks get difficult to read.

I also had to write a few scripts in python to do environment validation.
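To give an idea of what such a script looks like, here is a minimal sketch rather than my actual code.  The function name validate_environment and the example tool and file names are placeholders; it only checks that the required executables are on the PATH and that the required config files exist.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def validate_environment(
    executables: Iterable[str], required_files: Iterable[str]
) -> List[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    for exe in executables:
        # shutil.which returns None when the command is not on the PATH
        if shutil.which(exe) is None:
            problems.append(f"executable not found on PATH: {exe}")
    for name in required_files:
        if not Path(name).is_file():
            problems.append(f"missing file: {name}")
    return problems
```

A wrapper could call something like validate_environment(["java", "git", "psql"], ["conf/server.xml"]), print each problem and exit with a non-zero status if the list is not empty, so Ant can fail the build on it.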

Summary

Maven Pro

  • Transparent once everything is where Maven expects it
  • Nice integration with IDEA
  • Archetype projects help you get started quickly

Maven Con

  • Very rigid
  • POM configuration files are very hard to read
  • POM files get very complicated very quickly
  • Custom operations are time consuming and not easy to do



Gradle Pro

  • Nice dependency management
  • Most things can be done programmatically
  • Supports WAR and JAR building via plugins

Gradle Con

  • IDEA plugin is quirky
  • No support for provided libraries
  • Sequential execution of tasks is tedious to configure



Ant Pro

  • Simple to work with and many tasks are already available built-in
  • Ivy dependency management is not intrusive
  • ant-contrib adds a lot of improvements


Ant Con

  • WAR and JAR building requires specifying everything manually


Conclusion

Maven was too much configuration work and I did not want to write plugins.  It's a nice build system if you don't mind the very rigid approach, but doing anything unplanned becomes a major undertaking.

Gradle lacks some features that are very important to me, and its IDEA integration is quirky (having to re-add the application server library whenever dependencies changed was very annoying, the Web Facet import was messy, etc).  This may eventually be resolved, but at the moment it feels hacky and requires you to be aware of the shortcomings and compensate for them.

I chose to go with Ant and Ivy for simplicity, support and maturity.  It provides me with fine grained control over the build process and stays out of my way while I develop; while initial configuration can be a bit time consuming, over time it averages out to be less maintenance.  The other two were too intrusive and dependency management was something that got more in the way than I wanted.  I want my build system to stay out of my way when developing, as it constitutes a very small amount of time-in-use overall.  If I have to be aware of it and work around (or within) it, then it means I am not as free or productive as I want to be.