Open benchmarks for the win

Fri 7 November 2014

Geoffrey De Smet, OptaPlanner lead

Recently, there was some commotion on Twitter because a competitor heavily restricts publicizing benchmarks of their Solver as part of their license. That might seem harsh, but I can understand the sentiment: when a competitor publicizes a benchmark report comparing our product against their own, I know we’re gonna get screwed. Unlike single-product benchmarking, competitive benchmarking is inherently dishonest…

Competitive benchmarking for dummies

As a competitor, you can utilize several (obvious and not so obvious) means to prove your superiority over another Solver:

  • Publication bias

    • Pick a use case which is known to work well in your Solver.

    • Use datasets with a scale and granularity which are known to work well in your Solver.

    • If you’re really evil, benchmark multiple use cases and datasets in both Solvers and only retain those for which your Solver wins.

  • Expertise imbalance

    • Let one of your experts develop an implementation for both Solvers.

      • Motivation: like any other company, your company only employs experts in your own technology.

    • If he has years of recent experience in your technology, it’s unlikely he’s had time for any recent experience in the competitive technology.

      • So you’re effectively using your jockey on someone else’s horse.

  • Tweaking imbalance

    • Spend an equal amount of time on both implementations.

      • The use case is probably already implemented in your Solver (or straightforward to implement), so you can spend most of the time budget on tweaking it.

      • You’ll need to learn the competitor’s Solver first, so most of the time budget for that implementation goes into learning the technology, which leaves no room for tweaking.

  • Funding

    • There’s no need to explicitly set a desired outcome: your developer will know better than to bite the hand that feeds him.

Notice how these approaches don’t require any malice (except for the evil one): it’s normal to conduct a competitive benchmark like this…

Furthermore, you can make the competitive benchmark comparison look more objective, by sponsoring an academic research group to do the benchmark for you. Just make sure that’s a research group which has been happily using your technology for years and has little or no experience with the competition.

Marketing value

The marketing value of such a benchmark report should not be underestimated. These numbers, written in black and white, which clearly show the superiority of your Solver against another Solver, make a strong argument:

  • To close sales deals, when in direct competition with the other Solver.

  • To convince developers, researchers and students to learn and use your technology.

  • To build a strong, long-term reputation.

    • Benchmarks from the ’90s can still affect Google search results today, for example for "performance of Java vs C++".

    • Such information spreads virally, and counter claims might not.

Empirical evidence

Are all competitive benchmark reports lying? Yes, they are probably misrepresenting the truth.

Should we therefore restrict users from publicizing benchmarks on our Solver? No, of course not (even if our open source license would allow such conditions, which it does not).

Computer science - like any other science - is built on empirical evidence: the promise that any experiment I publish can be repeated by others independently. If we prevent people from publishing such repeated experiments, we undermine our science. In fact, the more people who report their benchmarks, the clearer our strengths and weaknesses show. Historically, this approach has already enabled us to diagnose and fix weaknesses, regardless of whether those were caused by our Solver or the user’s domain-specific implementation.

Therefore, OptaPlanner welcomes external benchmark reports. I believe in Open Science as strongly as I believe in Open Source. I do ask the courtesy of allowing public comments/feedback on a public report website, and of publicizing the details (such as the Solver configuration). If you use the OptaPlanner Benchmarker toolkit (which you will find convenient), simply share the benchmarker HTML report.

To run any of the benchmarks of the OptaPlanner Examples locally, simply run a *BenchmarkApp executable class, for example CloudBalancingBenchmarkApp. Notice how a small change in the *BenchmarkConfig.xml, such as switching score calculation from Easy Java to Drools or from Drools to Incremental Java, can have a serious effect on the results.
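
A benchmark can also be run programmatically through the PlannerBenchmarkFactory API. The minimal sketch below assumes a benchmark configuration XML resource on the classpath; the class name and resource path here are illustrative placeholders, not the exact ones shipped with the OptaPlanner Examples:

    import org.optaplanner.benchmark.api.PlannerBenchmark;
    import org.optaplanner.benchmark.api.PlannerBenchmarkFactory;

    public class CloudBalancingBenchmarkSketch {

        public static void main(String[] args) {
            // Load the benchmark configuration (solver configs, datasets, time limits)
            // from an XML resource on the classpath. This resource path is an assumption.
            PlannerBenchmarkFactory benchmarkFactory = PlannerBenchmarkFactory.createFromXmlResource(
                    "org/optaplanner/examples/cloudbalancing/benchmark/cloudBalancingBenchmarkConfig.xml");
            PlannerBenchmark benchmark = benchmarkFactory.buildPlannerBenchmark();
            // Runs every solver configuration on every dataset and writes the HTML report
            // into the benchmarkDirectory configured in the XML.
            benchmark.benchmark();
        }
    }

To compare score calculation approaches, switch the scoreDirectorFactory setting in each solverBenchmark of that config (for example an easyScoreCalculatorClass versus a scoreDrl versus an incrementalScoreCalculatorClass) and rerun; the HTML report then shows the difference side by side.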

In short: I like external benchmarks, but dislike competitive benchmarks, except for…

Independent research challenges

Can we compare fairly with our competition? Yes, through an independent research challenge.

Regularly, the academic community launches such challenges. Each challenge:

  • defines a real-world use case with real-world constraints

  • provides multiple, real-world datasets (half of which they keep hidden)

  • expects reproducible results within a specific time limit on specific hardware

  • gets worldwide participation from the academic and/or enterprise Operations Research community

  • benchmarks each contestant’s implementation on the same hardware in the same time limit to determine a winner

  • benchmarks those hidden datasets to counter overfitting and dataset recognition

It’s fair: each jockey rides his own horse. Most of the arguments against competitive benchmarking do not apply. And as an added bonus, we get to learn from and compare with the academic research community.

In the past, OptaPlanner has done well on these challenges, despite the limited weekend time we have to spend on them. In the last challenge, the ICON power scheduling challenge, we (Lukas, Matej and I) finished in 2nd place. A minority of the researchers still beat us (with their innovative algorithms in their experimental contraptions and massive time to tweak/build those), but it’s been years since a competitive Solver has beaten us.

Long term vision

Sharing our benchmarks and enabling others to easily reproduce them is part of a bigger vision: too many research papers (on metaheuristics and other optimization algorithms) are hard to reproduce. That’s the paradox in computer science research: to reproduce the findings of a research paper, all we really need is a computer and the code. We don’t need an expensive laboratory. Yet, in practice, the code is usually closed and the raw benchmark data is not accessible. It’s like everyone is scared of sharing the dirty secrets of their code and their benchmarks.

I believe that we - the worldwide optimization research community - need to create a benchmark repository: a centralized repository of benchmarks for every use case, for every dataset, for every algorithm, for every implementation version, for any amount of running time. That, together with a good statistical interface, will give us some real insight as to which optimization algorithms are good under which circumstances.

We - in OptaPlanner - are well on our way to building exactly that:

  • OptaPlanner Examples already implements 14 distinct use cases.

  • For each use case, we’re already benchmarking on many different optimization algorithms.

  • Our benchmarker HTML report already includes many useful statistics to analyse the raw benchmark data.

Join us :)

