Sunday, 2 March 2008

Groovy performance test

A few days ago I came across an article about Groovy (http://groovy.codehaus.org). It caught my interest, so I decided to go through a useful Groovy tutorial (you can find it here: http://www.asert.com/pubs/Groovy/Groovy.pdf). Groovy is a scripting language based on Java. You can write your code in a much simpler way using Groovy, and you can use legacy Java code from your Groovy code whenever you want to. Groovy adds closures, support for domain-specific languages and metaprogramming to the Java language, and it extends existing Java classes too (e.g. Groovy adds the method eachLine(Closure closure) to the standard java.io.File class). Groovy also has direct support for SQL, SOAP, web services, COM scripting, servlets and much more. Groovy seems to be very developer-oriented, so I decided to do some basic performance testing, because performance is the most important question for every developer (I hope at least :-)).

Before the testing itself, it is worth showing briefly how Groovy works, e.g. with collections. Take the following Groovy snippet:
names = ["Ted", "Fred", "Jed", "Ned"]
println names
shortNames = names.findAll{ it.size() <= 3 }
println shortNames.size()
shortNames.each{ println it }
You should notice that Groovy doesn’t force you to use semicolons. Now let’s move to the code itself. The first line declares and defines a list of four strings: “Ted”, “Fred”, “Jed” and “Ned”. The second line prints the list to the standard output. The third line uses a closure (finally something more interesting ;-)). It takes each value in the list ‘names’ and runs the closure { it.size() <= 3 } over it. ‘it’ in the closure refers to the closure’s implicit parameter. The 'findAll' method finds and returns all values in the ‘names’ list for which the closure returns ‘true’. So ‘shortNames’ is a list containing the values from ‘names’ whose length is less than or equal to 3. The fourth line prints the size of the list ‘shortNames’. Finally, the fifth line prints all the values in the list ‘shortNames’. The output is:
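For comparison, here is a rough sketch of what the 'findAll' call does, written in plain Java. The class name FindAllSketch and the helper findAll are made up for this illustration; this is not how Groovy implements the method internally, only the same filtering idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FindAllSketch {
    // Hypothetical helper: keeps every item for which the condition returns true,
    // which is the behaviour the text describes for Groovy's findAll.
    static <T> List<T> findAll(List<T> source, Predicate<T> condition) {
        List<T> matches = new ArrayList<>();
        for (T item : source) {
            if (condition.test(item)) {
                matches.add(item);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ted", "Fred", "Jed", "Ned");
        // The lambda plays the role of the closure { it.size() <= 3 }.
        List<String> shortNames = findAll(names, name -> name.length() <= 3);
        System.out.println(shortNames.size());
        for (String name : shortNames) {
            System.out.println(name);
        }
    }
}
```

The Groovy version needs one line for what takes a helper method here, which is the point of the example.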
["Ted", "Fred", "Jed", "Ned"]
3
Ted
Jed
Ned
I hope you enjoyed this short Groovy example. If you want to try it on your own, simply download and install Groovy, run GroovyConsole, paste the snippet above into it and run it.

Maybe you are wondering how Groovy works. Groovy source code is compiled to Java bytecode, just like Java source code. During compilation, the Groovy-specific constructs (the ones that are not plain Java) are routed through Groovy's own runtime classes to make them work. This indirection certainly has a performance impact on your application, and that’s the reason why I decided to do a performance test. Groovy has its own executable for running source code (groovy.exe on Windows), and the executable accepts (among others) the same arguments as Java’s executable (java.exe). I ran all the tests with the arguments -Xms64m and -Xmx512m to ensure that the memory conditions were the same.

The first feature I wanted to test was file access. The test creates a new file, writes the numbers 0 to 25000 into it (one number per line), closes the file, then opens it again, reads all the lines and prints the sum of the numbers to verify that everything went all right. The source code in Java is:
File file = new File("temp.txt");
file.createNewFile();
PrintStream printer = new PrintStream(file);
long before = System.currentTimeMillis();
for (int i = 0; i <= 25000; i++) {
    printer.println(i);
}
long after = System.currentTimeMillis();
System.out.println("Writing done in: " + (after - before));
printer.close();
long result = 0;
BufferedReader reader = new BufferedReader(new FileReader(file));
// restart the timer so the reading time does not include the writing time
before = System.currentTimeMillis();
for (int i = 0; i <= 25000; i++) {
    result += Integer.parseInt(reader.readLine());
}
after = System.currentTimeMillis();
System.out.println("Reading done in: " + (after - before));
reader.close();
System.out.println("Result: " + result);
file.delete();
And corresponding code in Groovy is:
file = new File("temp.txt")
before = System.currentTimeMillis()
for (i in 0..25000) file.append(i + "\n")
after = System.currentTimeMillis()
println "Writing done in: " + (after - before)
long result = 0
before = System.currentTimeMillis()
file.eachLine( { result += Integer.parseInt(it)} )
after = System.currentTimeMillis()
println "Result: " + result
file.delete()
println "Reading done in: " + (after - before)
The result was partially unexpected: while writing performance was much better in Java, reading performance was better in Groovy, even though I was using a BufferedReader in Java. Even setting the buffer size to a higher value had no effect. I tried to find the source code behind Groovy’s additions to the java.io.File class, but although Groovy is open source, I had no success. Could anyone help me please? You can see the test results below.
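One possible explanation for the slow writing, which I state as an assumption and have not checked against Groovy's sources: file.append(...) may open the file, write the text and close the file again on every single call, while the Java test keeps one PrintStream open for the whole loop. The following Java sketch compares the two strategies. The class name AppendCostSketch and both method names are invented for this illustration.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

class AppendCostSketch {
    // Reopens the file in append mode for every line (the assumed behaviour of file.append).
    static long writeReopening(File file, int lastNumber) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i <= lastNumber; i++) {
            try (FileWriter writer = new FileWriter(file, true)) { // true = append mode
                writer.write(i + "\n");
            }
        }
        return System.nanoTime() - start;
    }

    // Opens the file once and reuses a single buffered writer for all lines.
    static long writeOnce(File file, int lastNumber) throws IOException {
        long start = System.nanoTime();
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(file))) {
            for (int i = 0; i <= lastNumber; i++) {
                writer.write(i + "\n");
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        File reopened = File.createTempFile("reopen", ".txt");
        File single = File.createTempFile("single", ".txt");
        System.out.println("Reopening per line: " + writeReopening(reopened, 25000) / 1_000_000 + " ms");
        System.out.println("Single writer: " + writeOnce(single, 25000) / 1_000_000 + " ms");
        reopened.delete();
        single.delete();
    }
}
```

Both methods produce the same file content, so any difference in the printed times comes from the way the file is opened and closed.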

The second feature I wanted to test was collection access. I chose List (specifically ArrayList). The test creates a new collection, adds 100 000 values and then removes roughly 1000 of them from the front of the list. Removing a value shifts the remaining values to the left, so removing all 100 000 values would take a very long time. The source code in Java is:
List<String> stringList = new ArrayList<String>();
long before = System.currentTimeMillis();
for (int i = 0; i <= 100000; i++) {
    stringList.add(String.valueOf(i * i));
}
long after = System.currentTimeMillis();
System.out.println("Added in: " + (after - before));
before = System.currentTimeMillis();
for (int i = 0; i <= 1000; i++) {
    stringList.remove(i);
}
after = System.currentTimeMillis();
System.out.println("Removed in: " + (after - before));
And corresponding code in Groovy is:
def stringList = []
before = System.currentTimeMillis()
// same upper bound as in the Java test (100 000), so both versions add the same number of values
for (int i in 0..100000) stringList.add(String.valueOf(i*i))
after = System.currentTimeMillis()
println "Added in: " + (after - before) + " ms"
before = System.currentTimeMillis()
for (int i in 0..1000) stringList.remove(i)
after = System.currentTimeMillis()
println "Removed in: " + (after - before) + " ms"
The result didn’t surprise me at all: in both cases Java had much better performance than Groovy (adding values was even 65x faster). You can see the result below.
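One detail about the removal loops is worth spelling out. Because every removal shifts the remaining values to the left, calling remove(i) with an increasing i does not remove the values that were originally at the front one after another; it removes every other one. The Java and the Groovy loop behave the same way, so the comparison stays fair. A small sketch on a ten-element list shows the effect (the class RemoveShiftSketch and its methods are hypothetical names used only here):

```java
import java.util.ArrayList;
import java.util.List;

class RemoveShiftSketch {
    // Builds the list ["0", "1", ..., "size-1"].
    static List<String> numbers(int size) {
        List<String> list = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            list.add(String.valueOf(i));
        }
        return list;
    }

    // Same loop shape as in the test: remove index i for i = 0..lastIndex.
    static List<String> removeAscending(List<String> list, int lastIndex) {
        for (int i = 0; i <= lastIndex; i++) {
            list.remove(i);
        }
        return list;
    }

    // Removes the first `count` values by always removing index 0.
    static List<String> removeFirst(List<String> list, int count) {
        for (int i = 0; i < count; i++) {
            list.remove(0);
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeAscending(numbers(10), 3)); // prints [1, 3, 5, 7, 8, 9]
        System.out.println(removeFirst(numbers(10), 4));     // prints [4, 5, 6, 7, 8, 9]
    }
}
```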

The final feature I wanted to test was database access. I chose the 100% pure Java database HSQLDB, about which I’m going to write an article in the future. The test itself inserts 25 000 records into a simple table, then selects them and computes the sum. The source code in Java is:
try {
    Class.forName("org.hsqldb.jdbcDriver");
} catch (ClassNotFoundException e) {
    e.printStackTrace();
}
Connection dbConnection = null;
Statement statement = null;
try {
    dbConnection = DriverManager.getConnection("jdbc:hsqldb:hsql://localhost/mydb", "sa", "sa");
    statement = dbConnection.createStatement();
    statement.executeUpdate("DELETE FROM \"test\"");
    long before = System.currentTimeMillis();
    for (int i = 0; i <= 25000; i++) {
        statement.executeUpdate("INSERT INTO \"test\" VALUES (" + i + ", " + i + ")");
    }
    long after = System.currentTimeMillis();
    System.out.println("Data inserting: " + (after - before) + " ms");
    ResultSet data = statement.executeQuery("SELECT VALUE FROM \"test\"");
    long result = 0;
    before = System.currentTimeMillis();
    while (data.next()) {
        int value = data.getInt(1);
        result += value;
    }
    after = System.currentTimeMillis();
    System.out.println("Data selecting: " + (after - before) + " ms");
    data.close();
    statement.close();
    dbConnection.close();
    System.out.println("result: " + result);
} catch (SQLException e) {
    e.printStackTrace();
}
And corresponding code in Groovy is:
import groovy.sql.Sql

def sql = Sql.newInstance("jdbc:hsqldb:hsql://localhost/mydb", "sa", "sa", "org.hsqldb.jdbcDriver")
sql.executeUpdate("DELETE FROM \"test\"")
before = System.currentTimeMillis()
for (i in 0..25000) {
sql.executeUpdate("INSERT INTO \"test\" VALUES (${i}, ${i})")
}
after = System.currentTimeMillis()
println "Data inserting: " + (after - before) + " ms"
long result = 0;
before = System.currentTimeMillis()
sql.eachRow("select value from \"test\"") { result += it.value }
after = System.currentTimeMillis()
println "Data selecting: " + (after - before) + " ms"
println "Result: " + result
You can see that the code in Groovy is really much shorter and simpler, but what about the performance? Again Java had much better performance than Groovy; you can see the result below.

Conclusion

In the beginning I was really very curious how fast Groovy was going to be, and in the end I was very disappointed. Of course I expected lower performance than in Java, but I didn’t expect such a huge performance impact. On the other hand, Groovy is surely an appealing project that just needs some performance tuning. I can recommend trying Groovy, but I don’t recommend developing a high-performance application with it.

Sunday, 24 February 2008

Java date & time API vs. JODA

Java has a simple API for working with date and time. Many people find the API deficient and I’m one of them. Although I have been using the SimpleDateFormat and GregorianCalendar classes of the Java API for a long time, I sometimes found them very cumbersome and unsuitable for my task. Sometimes there were problems with parsing input text, and the problems couldn’t be solved directly using the Java API. The Java community is working on a new Java date and time API (see https://jsr-310.dev.java.net/), but I think it will take a while to be finished. A few weeks ago I came across the JODA API and tried it without delay. I was really very pleasantly surprised how useful and simple JODA is.

The Java API has in fact just two classes for working with date and time: java.text.SimpleDateFormat for parsing and formatting a date and time, and java.util.GregorianCalendar for manipulating them. Each class extends its own abstract parent, but that is not important as far as functionality goes. Creating an instance of SimpleDateFormat is simple; you pass a format string and a Locale instance and start using the instance. The most confusing thing for a beginner can be the leniency of the new instance. By default a new instance of SimpleDateFormat is lenient. That means you can pass a string with out-of-range values (for example day 32 or month 13) to its parse(…) method and you get a result without any exception, because the values are silently rolled over into the following month or year. If you forget to switch leniency off with setLenient(false), your application can later behave in unexpected ways.
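A short sketch of the leniency problem, using only the standard Java API. SimpleDateFormat is lenient unless told otherwise, so an impossible date is accepted and rolled over; with setLenient(false) the same input is rejected. The class LenientSketch and the helper roundTrip are names I made up for this example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

class LenientSketch {
    // Parses the text with the given leniency and formats the parsed date again.
    // Returns null when the input is rejected with a ParseException.
    static String roundTrip(String text, boolean lenient) {
        SimpleDateFormat format = new SimpleDateFormat("dd-MM-yyyy", Locale.ENGLISH);
        format.setLenient(lenient);
        try {
            return format.format(format.parse(text));
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Day 32 of month 13 in 2008 rolls over to 1 February 2009.
        System.out.println(roundTrip("32-13-2008", true));  // prints 01-02-2009
        // The strict formatter refuses the same input.
        System.out.println(roundTrip("32-13-2008", false)); // prints null
    }
}
```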

The JODA API (see http://joda-time.sourceforge.net/) has, in contrast to the Java API, many more classes (tens of them) for working with date and time. The JODA architecture contains Instants (a moment in the datetime continuum), Intervals (an interval of time from one instant to another instant), Durations (a length of time in milliseconds, without a start and an end), Periods (a length of time expressed in e.g. years, months, days and hours), Chronology (a calculation engine that supports the complex rules of a calendar system), TimeZones (they don’t need commentary, I hope) and then many tools to manipulate, parse and format date and time. The first thing we need to do is create a new Instant:
DateTime dateTime = new DateTime();
The dateTime instance holds the date and time at which it was instantiated (similarly to the Java API and its java.util.Date class). You should notice that a DateTime instance is immutable and can be shared among threads without any need for access synchronization. Need to get the last possible day in the current month? No problem:
int lastDay = dateTime.dayOfMonth().getMaximumValue();
Creating a Duration instance is simple and intuitive:
DateTime before = new DateTime();
Thread.sleep(30);
Interval interval = new Interval(before, new DateTime());
long timeInterval = interval.toDuration().getMillis();
The snippet above should produce a duration of 30 milliseconds, but the measured value is longer because of the time spent on thread scheduling. JODA uses formatter classes to format and parse date and time. Creating a custom formatter looks similar to the Java API:
DateTimeFormatter dtf = DateTimeFormat.forPattern("yyyy-MM-dd HH:mm:ss");
Then simply use the instance of the formatter to format a DateTime instance:
String formattedTime = dateTime.toString(dtf);
Need to create a DateTime in a different time zone (e.g. New York)? No problem in JODA:
DateTime newYork2 = new DateTime(DateTimeZone.forID("America/New_York"));
All supported time zone ID strings are located in the package org.joda.time.tz.data. Of course you can “change” the time zone of an existing DateTime instance (in fact you can’t change an existing DateTime instance, you have to create a new one):
DateTime newYorkTZ = dateTime.withZone(DateTimeZone.forID("America/New_York"));
Need to add 3 days to an existing DateTime instance? There’s nothing simpler:
DateTime add3Days = dateTime.dayOfMonth().addToCopy(3);
You can really do a great many things using the JODA API, and much more simply than using the Java API.
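The new date and time API mentioned at the beginning (JSR-310) later became the java.time package in Java 8. As a rough sketch only, here is how the JODA operations shown above might look there. The class JavaTimeSketch and its helper methods are invented for this example, and I use ZonedDateTime as the closest counterpart of JODA's DateTime, which is my own choice.

```java
import java.time.Duration;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

class JavaTimeSketch {
    // Last possible day of the month the given date falls into.
    static int lastDayOfMonth(ZonedDateTime dateTime) {
        return dateTime.toLocalDate().lengthOfMonth();
    }

    // Formats with the same pattern as the JODA formatter above.
    static String format(ZonedDateTime dateTime) {
        return dateTime.format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
    }

    // Same instant, seen in the New York time zone; returns a new object.
    static ZonedDateTime inNewYork(ZonedDateTime dateTime) {
        return dateTime.withZoneSameInstant(ZoneId.of("America/New_York"));
    }

    public static void main(String[] args) throws InterruptedException {
        ZonedDateTime before = ZonedDateTime.now();
        Thread.sleep(30);
        long elapsed = Duration.between(before, ZonedDateTime.now()).toMillis();
        System.out.println("Elapsed: " + elapsed + " ms");
        System.out.println("Last day of month: " + lastDayOfMonth(before));
        System.out.println("Formatted: " + format(before));
        System.out.println("New York: " + format(inNewYork(before)));
        System.out.println("Plus 3 days: " + format(before.plusDays(3)));
    }
}
```

As in JODA, the objects are immutable, so every "change" returns a new instance.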

One of my goals when I was preparing this post was to do a performance test and measure the performance of parsing and formatting in both APIs. I decided to do it in the following way: call the parse and format methods of each API a specific number of times (e.g. 100 000x) and measure the time spent using System.currentTimeMillis(). The following snippet shows how I measured the formatting performance of the Java API:
DateFormat javaFormatter = new SimpleDateFormat("dd-MM-yyyy HH:mm:ss");
Date date = new Date();
String result = null;
long beforeTest = System.currentTimeMillis();
for (int i = 0; i < testCount; i++) {
    // format(...) is the java.text.DateFormat method; print(...) belongs to JODA's formatter
    result = javaFormatter.format(date);
}
long afterTest = System.currentTimeMillis();
I got the following results:



You can see that while Java and JODA format a date and time at the same speed, parsing takes much less time in JODA.
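If you want to repeat the Java side of the measurement, the loop described above can be packaged roughly as follows. This is only a sketch: the class FormatTimingSketch and its methods are hypothetical names, and it differs from my test in two ways, namely it uses System.nanoTime() and runs one untimed warm-up pass first so that the JIT compiler has a chance to kick in before timing starts.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class FormatTimingSketch {
    static final String PATTERN = "dd-MM-yyyy HH:mm:ss";

    // Formats the same date testCount times and returns the elapsed nanoseconds.
    static long timeFormatting(int testCount) {
        SimpleDateFormat formatter = new SimpleDateFormat(PATTERN);
        Date date = new Date();
        String result = null;
        long start = System.nanoTime();
        for (int i = 0; i < testCount; i++) {
            result = formatter.format(date);
        }
        long elapsed = System.nanoTime() - start;
        // Use the result so the loop cannot be treated as dead code.
        if (testCount > 0 && result.length() != PATTERN.length()) {
            throw new IllegalStateException("unexpected output: " + result);
        }
        return elapsed;
    }

    // Parses the same text testCount times and returns the elapsed nanoseconds.
    static long timeParsing(int testCount) throws ParseException {
        SimpleDateFormat formatter = new SimpleDateFormat(PATTERN);
        String text = formatter.format(new Date());
        Date result = null;
        long start = System.nanoTime();
        for (int i = 0; i < testCount; i++) {
            result = formatter.parse(text);
        }
        long elapsed = System.nanoTime() - start;
        if (testCount > 0 && result == null) {
            throw new IllegalStateException("parsing returned no date");
        }
        return elapsed;
    }

    public static void main(String[] args) throws ParseException {
        int testCount = 100000;
        timeFormatting(testCount); // warm-up pass, result discarded
        timeParsing(testCount);    // warm-up pass, result discarded
        System.out.println("Formatting: " + timeFormatting(testCount) / 1_000_000 + " ms");
        System.out.println("Parsing: " + timeParsing(testCount) / 1_000_000 + " ms");
    }
}
```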