
Monitor OptaPlanner solvers through Micrometer

Tue 12 October 2021
Christopher Chianelli, OptaPlanner developer

It’s 11 PM on Friday evening. Everything was working fine — until now. Suddenly services are failing left and right, and your boss wants to know why. One tool for reaching a diagnosis is a monitoring system. Let’s enable monitoring on the OptaPlanner nodes and see if we can diagnose this issue.

What is monitoring?

Monitoring is observing the quality of a service over time. Monitoring is similar to logging, except its output is more easily analyzed by a machine and can be aggregated across multiple nodes. The software being monitored outputs metrics: numerical measurements of some aspect of the software. The metrics are then recorded in a monitoring system, where they can be graphed, used to trigger alerts, and correlated with events.

For instance, you can monitor an OptaPlanner service. You can check how many solvers ran last night and how many are currently running. If you notice the count is higher than usual, then maybe the failures are caused by CPU starvation, since OptaPlanner uses most of the CPU while solving. You can also check how long each solver ran; if a solver ran abnormally long or short, there might be an anomaly that should be investigated. You might also want to check whether the solver threw any errors; perhaps the issue is caused by bad data being passed to the solver.

Connecting monitoring systems to OptaPlanner

OptaPlanner uses Micrometer to collect its metrics. Micrometer then sends the metrics to different monitoring systems through registries; every monitoring system supported by Micrometer has its own registry. In Quarkus, to connect OptaPlanner to Prometheus, add the Quarkus Micrometer Prometheus registry extension as a dependency in the pom.xml:

<dependency>
  <groupId>io.quarkus</groupId>
  <artifactId>quarkus-micrometer-registry-prometheus</artifactId>
</dependency>

In Spring Boot, add the Spring Boot actuator to your project along with the registry dependency:

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
  <groupId>io.micrometer</groupId>
  <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

and enable the metrics endpoint in application.properties:

management.endpoints.web.exposure.include=metrics,prometheus
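
With the actuator and the Prometheus registry on the classpath, Spring Boot typically exposes the metrics at /actuator/prometheus instead of Quarkus’s /q/metrics (used later in this post), so a Prometheus scrape configuration for a Spring Boot application would use that path. A quick check, assuming the default port:

# Illustrative check; adjust host and port to match your application.
curl http://localhost:8080/actuator/prometheus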

For information on how to connect Micrometer to other monitoring systems, visit the Micrometer documentation.

Monitoring the Solver

Now that we know what monitoring is and how it can be useful, let’s walk through an actual example. We’ll be running a modified version of the school timetabling quickstart with support for multitenancy. You can find the complete source code, along with scripts for running it, on the optaplanner-micrometer-blog GitHub page.
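
To give a sense of how the example drives OptaPlanner, a REST resource along the following lines submits one solver job per school, using the school id as the problem id. This is a minimal sketch assuming the quickstart’s TimeTable planning solution; the resource and repository shown here are hypothetical and may differ from the actual code in the repository.

// Illustrative sketch only: TimeTableResource and TimeTableRepository are hypothetical
// names and may differ from the actual code in the optaplanner-micrometer-blog repository.
import javax.inject.Inject;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;

import org.optaplanner.core.api.solver.SolverManager;

@Path("/timeTable")
public class TimeTableResource {

    // Hypothetical persistence helper for loading and storing one timetable per school.
    public interface TimeTableRepository {
        TimeTable findBySchoolId(Long schoolId);
        void save(TimeTable timeTable);
    }

    @Inject
    SolverManager<TimeTable, Long> solverManager;

    @Inject
    TimeTableRepository timeTableRepository;

    @POST
    @Path("/solve/{schoolId}")
    public void solve(@PathParam("schoolId") Long schoolId) {
        // Each school is its own planning problem; using the school id as the problem id
        // lets several solvers run concurrently, which is what the metrics below will show.
        solverManager.solveAndListen(schoolId,
                timeTableRepository::findBySchoolId,
                timeTableRepository::save);
    }
}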

Starting the application

Clone the example code:

git clone https://github.com/Christopher-Chianelli/optaplanner-micrometer-blog
cd optaplanner-micrometer-blog

Start the application in development mode:

mvn quarkus:dev

After the application has started, you can access it at http://localhost:8080. To see the metrics available for the application, visit http://localhost:8080/q/metrics.
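
The metrics are served in the Prometheus text format. An abridged, illustrative excerpt might look like this; the exact metric set and values depend on your OptaPlanner and Micrometer versions and on whether any solvers are running:

# Abridged, illustrative output - the real output contains many more metrics.
optaplanner_solver_solve_duration_seconds_active_count 2.0
optaplanner_solver_solve_duration_seconds_duration_sum 14.5
optaplanner_solver_solve_duration_seconds_max 9.2
optaplanner_solver_errors_total 0.0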

Example Application

Starting Prometheus

Prometheus has a prebuilt image on Docker Hub that can be used to run it locally. We’ll need to provide a configuration so that it scrapes our metrics endpoint.

Create the prometheus.yml file with the following text:

scrape_configs:
- job_name: local application
  scrape_interval: 1s
  metrics_path: /q/metrics
  static_configs:
  - targets:
    - localhost:8080

This configures Prometheus to scrape metrics from localhost:8080/q/metrics every second.

Start Prometheus with the preceding configuration:

docker run \
    --network host \
    --mount type=bind,source="$(pwd)/prometheus.yml",destination=/etc/prometheus/prometheus.yml,ro=true,relabel=shared \
    prom/prometheus

You can see the Prometheus UI by visiting http://localhost:9090.
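
To check that scraping works, you can enter a query in the expression box at the top of the Prometheus UI and press "Execute". For example, the standard up metric reports 1 for every target Prometheus can reach (the job label matches the job_name from prometheus.yml):

# 1 means the last scrape of the target succeeded, 0 means it failed.
up{job="local application"}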

Prometheus UI

Starting Grafana

Grafana provides a much more robust UI than Prometheus’s built-in one, with additional features. Grafana has a prebuilt image on Docker Hub that you can use to run it locally. Start it using the following command:

docker run --network host grafana/grafana

It might take a while to start. After it starts, visit http://localhost:3000 to see the Grafana UI. Log in with the username "admin" and the password "admin".

Connecting Grafana to Prometheus

  1. Click the Gear icon to go to the Configuration page.

  2. Click the "Add data source" button.

  3. Select "Prometheus".

  4. Enter "http://localhost:9090" for the URL field.

  5. (Optional) Set the Scrape Interval to be equal to the one set for Prometheus (1s).

  6. Click "Save & Test".

If it is set up correctly, a green textbox will appear above "Save & Test" with text "Data source is working".

Grafana Prometheus Configuration

Create a dashboard to monitor metrics

With all that setup, we can finally graph some metrics.

  1. Click the "+" icon on the left sidebar.

  2. Click "Add an empty panel".

  3. Beneath "A", in the text box to the right of "Metrics", enter "optaplanner_solver_solve_duration_seconds_active_count". This adds a graph for the number of active solvers. It might say "No data" if no solvers have been started yet.

  4. Click the clock icon in the top right, and select "Last 15 minutes" under "Relative time range". This makes the dashboard show data that occurred during the past 15 minutes.

Go to "http://localhost:8080" and start some solvers. Use the "School Id" selector to change schools, and click the "Solve" button to start solving the current school timetable.

The dashboard should display a graph similar to this one, depending on how many solvers were started:

Grafana Graph

Metrics available

Beside "optaplanner_solver_solve_duration_seconds_active_count", there are several other metrics available by default:

  • optaplanner_solver_errors_total: the total number of errors that occurred while solving since the start of measuring.

  • optaplanner_solver_solve_duration_seconds_max: run time of the longest-running currently active solver.

  • optaplanner_solver_solve_duration_seconds_duration_sum: the sum of each active solver’s solve duration. For example, if there are two active solvers, one running for three minutes and the other for one minute, the total solve time is four minutes.
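
These metrics can be combined into derived queries in Grafana or the Prometheus UI. For example (illustrative PromQL; the division returns no data while no solver is active):

# Average running time of the currently active solvers.
optaplanner_solver_solve_duration_seconds_duration_sum
  / optaplanner_solver_solve_duration_seconds_active_count

# Per-second rate of solver errors over the last five minutes.
rate(optaplanner_solver_errors_total[5m])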

In 8.12.0.Final and above, additional metrics can be configured by adding a <monitoring> section to the solver config:

<?xml version="1.0" encoding="UTF-8"?>
<solver xmlns="https://www.optaplanner.org/xsd/solver" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="https://www.optaplanner.org/xsd/solver https://www.optaplanner.org/xsd/solver/solver.xsd">
  <monitoring>
    <metric>BEST_SCORE</metric>
    <metric>CONSTRAINT_MATCH_TOTAL_BEST_SCORE</metric>
    <!-- ... -->
  </monitoring>
</solver>

For more information about OptaPlanner monitoring support, see the Monitoring section of the OptaPlanner documentation.

What next?

This tutorial covers the basics of what you can do with Grafana. Additional things you can do:

  • Create alerts that trigger whenever a certain condition is met

  • Perform transformations on queries

  • Visualize data in a variety of graphs and charts

Conclusion

Monitoring systems are a helpful tool for diagnosing issues and alerting us to them. OptaPlanner integrates with monitoring systems through Micrometer, providing useful metrics such as the active solver count. One example of a monitoring system is Prometheus, which scrapes metrics from an endpoint. Grafana is an analytics and visualization platform that lets us visualize data and create alerts when certain conditions are met. As always, the complete source code for this example is available on GitHub.


Tagged as: monitoring, production
