08 October 2007

Drools-Puzzle #2: And the Winner is...

Scott Reed (.... applause ...)

Dunnit! A dozen solutions to the second installment of the (now famous!) Drools puzzle contest arrived at my desk, originating from brave men spread all over the world (all right, at least from Australia, South-America, North-America and Europe).

I evaluated during a brief holiday in souther Turkey - without internet access and with only Drools 4.0.1 plus Eclipse 3.2 installed on my MacBook. That initially sounded all right for me - but wasn't. Read on why I failed to evaluate a few (gorgeously interesting) solutions with this setup!

C++ for highest performance

Like in puzzle #1, all rule-based solutions were outperformed by Dirk Farin's C++ version. Dirk: You're great. But we're using rules to factor business-logic out of java applications... If we ever need to build something ultra-performant, we'll surely remember your approx-one-nanosecond-solution. Btw: I took your source and could immediately build-and-run it on the Mac :-) Dirk used numbers instead of (symbolic) names, which makes his solution-output a little hard to read...

Although admirable in performance, Dirks' solution still is a C++ program... and I didn't let those count in the puzzle...

Standalone CLIPS

The next hurdle in evaluation came from Johan Lindberg from Sweden: He send a pure CLIPS solution, which wouldn't run in Drools either...

Johan seemed to be pleased with the puzzle and asked for more. Johan & Dirk - please send us Drools solutions next time :-)

Only Seeing is Believing

Like in any athletic competition I required the winning solution to run on my machine - during my evaluation. I didn't actually feel the urge to debug or modify submissions so they would/could run in my limited setup (see above). Although this may sound unfair ("but we're still using 3.x in our project"), I had no other chance.

Therefore, solutions without source-code (Carlos Bustamante), with requirements to "mvn install" additional drools modules (Geoffrey DeSmet) or with Drools 3.1 rule syntax didn't count either. Sorry guys - life can be tough. More on Geoffreys' solution later on...

Now there were 7 solutions from 6 participants left over to compare. It had already gotten hot in
the mediterranean hemisphere - I noticed other people jump into the pool, I smelled roasted garlic and delicious turkish mokka. But I kept evaluating...

Simple and Complex Rules

No participant bothered with any explanation, how or why his solutions works. They all did work correctly, by the way. But with several of them I still don't know why. The golden ideas of their creators, those shiny bits
of knowledge - buried in crusts of unexplained code. I hear people talking about self-explaining code, of the truth that only lies on code, never in documentation. Might hold for guru-readers, but not for humble-ones like myself. I need explanation, abstracted away from source-code. Even for our simple puzzle, several solutions spanned more than 100 lines of code - mostly without any docu.

For me, this is one lessons I (once again!) learned from this puzzle: Care for understandability.

All solutions gave correct results - but I evaluated only one (from John Kirchmeier) as "clean-and-simple", without fancy decoration, and gave it top-score in "Drools code understandability". Well done, John!

One case in particular gave me headaches. One participant (he'll know who I'm talking about), a real Drools-guru, supplied a solution based on the DroolsSolver approach (formerly known as Taseree). In my remote holiday location, without internet access, I couldn't even build it. He was the only one to submit a solution based on a scoring approach - his program creates just one (arbitrary) family, scores it and, if the score is not perfect, modifies the family... The result surely works - but at what price? The "business logic" of our simple riddle gets mixed up in six business rules, one scoring rule and a pretty complex (java) function to re-order the family relationships - naturally based on Generics, a FamilyMoveFactory and two Switcher-classes... Impressive stuff, I'm sure this approach will solve generations of extremely complex riddles - but simply overwhelmed me. Impressiveness doesn't count.

Another sting to another participant: 18 rules for this puzzle might be a little over-modularized...

Running for Gold

Now that I bashed a little on overly complex solution approaches (yep - rules can be kept simple yet correct!), what solutions remained in the race for the title? (Running on a MacBook Pro, Intel Dual-Core, OS-X 10.4.10, 2GB Ram, Eclipse 3.2 with Java 1.5, running exclusively Eclipse and Finder, using System.currentTimeMillis() for timing). I ran every solution 3 times and took the lowest values for the overall performance (that is: initialization plus rule-evaluation!).

  1. Scott Reed, 1596 overall, 37 for Drools

  2. Christiano Guiffrida, 1682 overall, 33ms for Drools

  3. Dan Berindei (first solution based on fact-retraction), 1720ms overall, 110 for Drools

  4. John Kirchmeier, 1760 overall, 53 for Drools

  5. Dan Berindei (second solution, without retract), 1812 overall, 65 for Drools

  6. Chris Barham, 1875ms overall solution time, 56ms for Drools.

  7. Reginaldo Delfino, 1956 overall, 77 for Drools

Scott - congratulations. Your solution outperformed the other Drools solutions. Please supply the next puzzle to the Drools community!

It was close: Christiano's and Dan's solutions were pretty fast too. My personal evaluation
on understandability: The fastest solutions score 4 out of 5 possible points.

With 1760 msec overall runtime, John Kirchmeier's clean-and-elegant solution provided an
excellent compromise between efficiency, maintainability and understandability. John: You're my moral winner, with a rule-understandablity-score of 5.

Download of submissions

I do not provide downloads for the submissions - maybe the participants can put their solutions
in the JBoss-Wiki and send me the link - I'll post those here!

Hints for the Future

As code quality remains to be very subjective, it cannot easily be evaluated by a single referee
in our little contest. Taking pure performance clearly identifies a winner - but as a software-architect
I'd definitely pick the solution with the lowest overall complexity and acceptable performance.

I propose that from now on each "puzzle-creator/judge" publishes her or his set of evaluation
criteria... (I failed in this respect - sorry!). Next time I'd even propose a naming convention plus a predefined "main" class structure:

long start = System.currentTimeMillis();
long begin_fact = System.currentTimeMillis();
long end = System.currentTimeMillis();

Thanx again for participating!