The Parallel Universe Blog

May 08, 2014

An Opinionated Guide to Modern Java, Part 2: Deployment, Monitoring & Management, Profiling and Benchmarking

By Ron

This is part 2 in a three-part series: part 1, part 3

Welcome to part 2 of the OGMJ. In part 1 we presented new Java language features, libraries and tools that make Java a much more lightweight development environment: new build tools, easier docs, expressive code and lightweight concurrency. In this post, we’ll go beyond the code to discuss Java operations, namely deployment, monitoring and management, profiling and benchmarking. Even though the examples will be in Java, most of what we discuss in this post is as relevant to the other JVM languages as it is to Java, the language.

But before we begin, I’d like to briefly go over some of the responses readers raised to the previous post, and clarify a couple of things. It turns out that the most contentious recommendation I made in part 1 was the build tool. I wrote, “the modern Java developer uses Gradle”. Some readers took issue with that, and made a good case for Maven. While I personally prefer Gradle’s nice DSL and the ability to use imperative code for uncommon build operations, I can understand the preference for the fully declarative Maven, even if it requires lots of plugins. The modern Java developer, then, might prefer Maven to Gradle. I would like to say, though, that in order to use Gradle one does not need to know Groovy, even if one wishes to do some non-standard stuff; I don’t. I’ve just learned a few useful Groovy expressions that I found in Gradle examples online.

Also, some readers took my use of JUnit and Guava in the example to mean I endorse them. Well, I do. Guava is a very useful library, and JUnit is a fine unit-test framework. TestNG is a fine unit-testing framework as well, but JUnit is so ubiquitous that there is little reason to choose something else, even if another library has some advantages. Also, the unit test example used Hamcrest matchers. One reader pointed me at AssertJ, which looks like a very nice alternative to Hamcrest.

It is important to understand that this guide is not intended to be comprehensive. There are so many good Java libraries out there that we can’t possibly explore them all. My intention is to give a taste of what’s possible with modern Java.

Some readers expressed their preference for short Javadoc comments that don’t necessarily fill out all of the fields in the Javadoc “standard form”. For example, this:

/**
 * Returns the result
 */
 int getResult();

is preferable to:

/**
 * This method returns the result.
 * @return the result
 */
 int getResult();

With that I wholeheartedly agree. My example simply demonstrated mixing Markdown with standard Javadoc taglets, and was not intended as a guideline.

Finally, a few words regarding Android. While Android can execute, via a series of transformations, code written in Java (and perhaps some other JVM languages), Android is not a JVM; in fact, both officially and in practice, Android is not Java (that’s the result of two multinational corporations being unable to reach a licensing agreement). Because Android is not Java, what we covered in part 1 may or may not apply to it, and because Android does not have a JVM, little if anything in this post applies to it.

Now back to our business.

Modern Java Packaging and Deployment

For those of you unfamiliar with the Java ecosystem: Java (or any JVM language) source files are compiled into .class files (essentially Java binaries), one for each class. The basic mechanism of packaging class files is bundling them (this is normally done by the build tool or IDE) into JAR (Java Archive) files, which are Java binary packages. JARs are just ZIP files containing class files, plus an additional manifest file describing the contents, and possibly containing other information about the distribution (the manifest file also contains the electronic signatures in signed JARs). If you package an application (as opposed to a library) in a JAR, the manifest can point to the app’s main class, in which case the application can be launched with the command java -jar app.jar; that’s called an executable JAR.
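To make this concrete, here’s a small, self-contained sketch using only JDK classes (the file name and the Main-Class value are illustrative) that writes a minimal executable-JAR manifest and reads it back the way java -jar would:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ManifestDemo {
    public static void main(String[] args) throws IOException {
        // An executable JAR is just a ZIP whose manifest names the main class
        Manifest mf = new Manifest();
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        mf.getMainAttributes().put(Attributes.Name.MAIN_CLASS, "jmodern.Main");

        File jar = File.createTempFile("app", ".jar");
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar), mf)) {
            out.putNextEntry(new JarEntry("jmodern/")); // .class entries would go here
        }

        // Read the manifest back, as `java -jar` does to find the entry point
        try (JarFile jf = new JarFile(jar)) {
            System.out.println("Main-Class: "
                    + jf.getManifest().getMainAttributes().getValue(Attributes.Name.MAIN_CLASS));
        }
    }
}
```

In practice the build tool writes the manifest for you; this just shows that there is no magic in the format.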

Java libraries are packaged into JARs, and then deployed into Maven repositories (those are used by practically all JVM build tools – not just Maven). Maven repositories manage binaries’ versioning and dependencies (when you request a library from a Maven repository, you can ask for all its transitive dependencies as well)1. Open source JVM libraries are often hosted at the Central Repository, or other similar public repositories, and organizations manage their own private Maven repositories with tools like Artifactory or Nexus. You can even host your own Maven repository on GitHub. But Maven repos are normally accessed (by your build tool) at build time, and normally host libraries rather than executables.

Java web applications have traditionally been run in application servers, or servlet containers. Those containers can run multiple web applications, loading and unloading apps on demand. Java web applications are deployed to servlet containers in WAR (Web Archive) files, which are really JARs whose contents are arranged in some standard way, and contain additional configurations. But, as we’ll see in part 3 of this post series, as far as modern Java is concerned, Java application servers are dead.

Java desktop applications are often packaged and deployed as platform specific binaries, bundled along with their own JVM. The JDK contains a tool to do just that (here’s how to use it in NetBeans), and a third-party tool called Packr provides similar functionality. This mechanism is great for desktop apps and games, but not what we want for server software: the packages tend to be very big and require installation by the user. In addition, because they bundle a copy of the JVM, they cannot be patched with security and performance upgrades.

What we want is a simple, lightweight, fire-and-forget packaging and deployment tool for server-side code. Preferably we would like to take advantage of executable JARs’ simplicity and platform independence. But executable JARs have several deficiencies. Every library is usually packaged into its own JAR, and merging all dependencies into a single JAR might cause collisions, especially with packaged resources (non-class files). Also, a native library can’t just be dropped into the JAR, and, perhaps most importantly, configuring the JVM (setting heap sizes etc.) falls to the user, and must be done at the command line. Tools like Maven’s Shade plugin or Gradle’s Shadow plugin solve the collision issue, and One-Jar also supports native libraries, but both might interfere with the application in subtle ways, and neither solves the JVM configuration problem. Gradle can bundle the application in a ZIP and generate OS-specific launch scripts to configure the JVM, but that approach requires installation, and we can go much more lightweight than that. Also, with a powerful, ubiquitous resource like Maven repositories at our disposal, it would be a shame not to take advantage of them.

This blog post series is meant to be about how easy and fun it is to work with modern Java (without sacrificing any of its power), but when I looked for a fun, easy and lightweight way to package, distribute and deploy server-side Java apps, I came up empty-handed. And so Capsule2 was born (if you know of any alternatives, please let me know).

Capsule uses the nice platform independence of executable JARs – but without their deficiencies – and (optionally) combines it with the power and convenience of Maven repositories. A capsule is a JAR that contains all or some of the Capsule project’s classes, and a manifest with deployment configuration options. When launched (with a simple java -jar app.jar), the capsule will do all or some of the following: extract the JAR into a cache directory, download and cache Maven dependencies, find an appropriate JVM installation, and configure and run the application in a new JVM process.

Now let’s take Capsule for a spin. We’ll begin with our JModern project that we created in part 1. This is our build.gradle file:

apply plugin: 'java'
apply plugin: 'application'

sourceCompatibility = '1.8'

mainClassName = 'jmodern.Main'

repositories {
    mavenCentral()
}

configurations {
    quasar
}

dependencies {
    compile "co.paralleluniverse:quasar-core:0.5.0:jdk8"
    compile "co.paralleluniverse:quasar-actors:0.5.0"
    quasar "co.paralleluniverse:quasar-core:0.5.0:jdk8"

    testCompile 'junit:junit:4.11'
}

run {
    jvmArgs "-javaagent:${configurations.quasar.iterator().next()}"
}

and here’s our jmodern.Main class:

package jmodern;

import co.paralleluniverse.fibers.Fiber;
import co.paralleluniverse.strands.Strand;
import co.paralleluniverse.strands.channels.Channel;
import co.paralleluniverse.strands.channels.Channels;

public class Main {
    public static void main(String[] args) throws Exception {
        final Channel<Integer> ch = Channels.newChannel(0);

        new Fiber<Void>(() -> {
            for (int i = 0; i < 10; i++) {
                Strand.sleep(100);
                ch.send(i);
            }
            ch.close();
        }).start();

        new Fiber<Void>(() -> {
            Integer x;
            while((x = ch.receive()) != null)
                System.out.println("--> " + x);
        }).start().join(); // join waits for this fiber to finish
    }
}

To test if our program is working correctly, we’ll try a gradle run.

Now, let’s package it into a capsule. In the build file, we’ll add a capsule configuration. Then, we’ll add the following line to our dependencies:

capsule "co.paralleluniverse:capsule:0.3.1"

There are two basic ways to create a capsule (although you can mix them both). The first is to embed all of the dependencies in the capsule, and the second is to let the capsule download them when first launched. We’ll try the first approach – the “full” capsule – first. We’ll add the following to the bottom of our build file:

task capsule(type: Jar, dependsOn: jar) {
    archiveName = "jmodern-capsule.jar"

    from jar // embed our application jar
    from { configurations.runtime } // embed dependencies

    from(configurations.capsule.collect { zipTree(it) }) { include 'Capsule.class' } // we just need the single Capsule class

    manifest {
        attributes(
            'Main-Class'  : 'Capsule',
            'Application-Class' : mainClassName,
            'Min-Java-Version' : '1.8.0',
            'JVM-Args' : run.jvmArgs.join(' '), // copy JVM args from the run task
            'System-Properties' : run.systemProperties.collect { k,v -> "$k=$v" }.join(' '), // copy system properties
            'Java-Agents' : configurations.quasar.iterator().next().getName()
        )
    }
}

Now let’s build the capsule with gradle capsule, and run it:

java -jar build/libs/jmodern-capsule.jar

Because it’s a capsule with embedded dependencies, Capsule will extract the JAR into a cache directory (.capsule/apps/jmodern.Main in the user’s home directory) the first time it’s run, and then launch a new JVM, configured according to the capsule’s manifest. If you want to see exactly what Capsule is doing, preface -jar with -Dcapsule.log=verbose. If you have a Java 7 installation, you can try launching the capsule under Java 7 (by setting the JAVA_HOME environment variable to Java 7’s home directory). Even though it’s launched under Java 7, because the capsule specifies a minimum Java version of 8 (or 1.8, which is the same thing), the capsule will find the Java 8 installation and use it to run our app.

Now for the second approach. We’ll create a capsule with external dependencies. To make the capsule creation easier, we’ll first add a function to our build file (you don’t need to understand it; a Gradle plugin will make this a lot easier – contributions are welcome, BTW – but for now we’ll create the capsule “manually”):

// converts Gradle dependencies to Capsule dependencies
def getDependencies(config) {
    return config.getAllDependencies().collect {
        def res = it.group + ':' + it.name + ':' + it.version +
            (!it.artifacts.isEmpty() ? ':' + it.artifacts.iterator().next().classifier : '')
        if(!it.excludeRules.isEmpty()) {
            res += "(" + it.excludeRules.collect { it.group + ':' + it.module }.join(',') + ")"
        }
        return res
    }
}

Then we’ll change the capsule task in the build file to read:

task capsule(type: Jar, dependsOn: classes) {
    archiveName = "jmodern-capsule.jar"
    from sourceSets.main.output // this way we don't need to extract
    from { configurations.capsule.collect { zipTree(it) } }

    manifest {
        attributes(
            'Main-Class'  :   'Capsule',
            'Application-Class'   : mainClassName,
            'Extract-Capsule' : 'false', // no need to extract the capsule
            'Min-Java-Version' : '1.8.0',
            'JVM-Args' : run.jvmArgs.join(' '),
            'System-Properties' : run.systemProperties.collect { k,v -> "$k=$v" }.join(' '),
            'Java-Agents' : getDependencies(configurations.quasar).iterator().next(),
            'Dependencies': getDependencies(configurations.runtime).join(' ')
        )
    }
}

Now let’s build the new capsule with gradle capsule, and run it again:

java -jar build/libs/jmodern-capsule.jar

The first time it’s run, the capsule will download all of our project’s dependencies into a cache directory, where they will be shared by other capsules using them. Instead of listing the dependencies in the JAR manifest, you can place your project’s pom file into the capsule’s root (especially useful if you’re using Maven as your build tool). See the Capsule docs for details.

Finally, because this post is applicable to any JVM language, here’s a tiny project packaging a Node.js app in a capsule. The app uses Project Avatar, which allows running Node.js-like JavaScript applications on the JVM3. It consists of this source code:

var http = require('http');

var server = http.createServer(function (request, response) {
  response.writeHead(200, {"Content-Type": "text/plain"});
  response.end("Hello World\n");
});
server.listen(8000);
console.log("Server running at http://127.0.0.1:8000/");

And two Gradle build files: one creates a “full” capsule (with embedded dependencies), and the other packages a capsule with external dependencies. This example demonstrates a capsule with native library dependencies. To build the capsule, run

gradle -b build1.gradle capsule

for a full capsule, or:

gradle -b build2.gradle capsule

for a capsule with external dependencies (the project includes a Gradle wrapper, so you don’t even need Gradle installed to build it; simply type ./gradlew instead of gradle to build).

To run:

java -jar build/libs/hello-nodejs.jar

Project Jigsaw, scheduled for inclusion in Java 9, is intended to fix Java deployment and address a host of other issues, enabling things like stripped-down JVM distributions and reduced startup time (this is an interesting talk about Jigsaw). In the meantime, Capsule is a lean and quite satisfactory solution for modern Java packaging and deployment. Capsule is stateless and installation-free.

Logging

Before we get into Java’s more advanced monitoring features, let’s get logging out of the way. Java is known to have a bazillion – give or take – logging libraries, on top of the one built into the JDK. Don’t think about that too much. If you need logging, use SLF4J as the logging API, period. It has become the de-facto logging standard, and it has bindings for virtually all logging engines. Once you use SLF4J, you can leave the choice of a logging engine for later (you can even pick an engine at deployment time). SLF4J chooses a logging engine at runtime, based on whatever relevant JARs are included as dependencies. Most libraries now use SLF4J, and if one of your dependencies doesn’t, SLF4J lets you pipe calls to other logging libraries back to SLF4J, and from there to your engine of choice. Speaking of choosing a logging engine: if your needs are simple, pick the JDK’s java.util.logging. For heavy-duty, high-performance logging, pick Log4j 2 (unless you feel really tied to some other logging engine).

Let’s add logging to our app. To our dependencies, we’ll add:

compile "org.slf4j:slf4j-api:1.7.7"    // the SLF4J API
runtime "org.slf4j:slf4j-jdk14:1.7.7"  // SLF4J binding for java.util.logging

If we run gradle dependencies, we see that our app’s dependencies, in turn, depend on Log4j, which we don’t want for the purpose of this demonstration. To exclude it, add the following line to build.gradle’s configurations section:

all*.exclude group: "org.apache.logging.log4j", module: "*"

Finally, we’ll add some logging to our code:

package jmodern;

import co.paralleluniverse.fibers.Fiber;
import co.paralleluniverse.strands.Strand;
import co.paralleluniverse.strands.channels.Channel;
import co.paralleluniverse.strands.channels.Channels;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Main {
    static final Logger log = LoggerFactory.getLogger(Main.class);

    public static void main(String[] args) throws Exception {
        final Channel<Integer> ch = Channels.newChannel(0);

        new Fiber<Void>(() -> {
            for (int i = 0; i < 100000; i++) {
                Strand.sleep(100);
                log.info("Sending {}", i); // log something
                ch.send(i);
                if (i % 10 == 0)
                    log.warn("Sent {} messages", i + 1); // log something
            }
            ch.close();
        }).start();

        new Fiber<Void>(() -> {
            Integer x;
            while ((x = ch.receive()) != null)
                System.out.println("--> " + x);
        }).start().join(); // join waits for this fiber to finish
    }
}

If you now run the app (gradle run), you’ll see the log statements printed to the standard output (that’s the default; we’re not going to get into configuring log files – refer to your logging engine docs for that). Both “info” and “warn” logs are printed by default. The logging level can be set in the log configuration (which, again, we’re not going to do now), or at runtime, as we’ll soon see.

Monitoring and Management with jcmd and jstat

The JDK includes several command line monitoring and management tools, but here we’ll only briefly cover a couple: jcmd and jstat.

In order to play with them, we’re going to have to make our app not terminate so quickly, so change the for loop limit in the first fiber from 10 to, say, 1000000, and run it in a terminal with gradle run. In another terminal window, run jcmd. If your JDK is installed correctly and jcmd is on your path, you’ll see something like this:

22177 jmodern.Main
21029 org.gradle.launcher.daemon.bootstrap.GradleDaemon 1.11 /Users/pron/.gradle/daemon 10800000 86d63e7b-9a18-43e8-840c-649e25c329fc -XX:MaxPermSize=256m -XX:+HeapDumpOnOutOfMemoryError -Xmx1024m -Dfile.encoding=UTF-8
22182 sun.tools.jcmd.JCmd

It’s a list of all currently running JVM processes. Now, run:

jcmd jmodern.Main help

You will see a list of jcmd commands that the particular JVM process supports. Let’s try:

jcmd jmodern.Main Thread.print

This will print out the current stack trace for all threads running in the JVM. Now try:
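For the curious, roughly the same data that jcmd’s Thread.print shows can be gathered from inside the process via the JDK’s ThreadMXBean; a minimal sketch:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Prints the name, state and a partial stack trace of every live thread,
// much like jcmd's Thread.print does externally
public class ThreadDump {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (ThreadInfo ti : threads.dumpAllThreads(false, false)) // skip locked monitors/synchronizers
            System.out.print(ti); // ThreadInfo.toString() formats one thread's dump
    }
}
```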

jcmd jmodern.Main PerfCounter.print

This will print out a long list of various JVM performance counters (you’ll have to Google for their meaning). You can now try some of the other commands (like GC.class_histogram).

jstat is like top for the JVM, only it displays information about GC and JIT activity. Suppose our app’s pid is 95098 (you can find it by running jcmd or jps in the shell). Running

jstat -gc 95098 1000

will print GC information every 1000 milliseconds. It will look like this:

 S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT
80384.0 10752.0  0.0   10494.9 139776.0 16974.0   148480.0   125105.4    ?      ?        65    1.227   8      3.238    4.465
80384.0 10752.0  0.0   10494.9 139776.0 16985.1   148480.0   125105.4    ?      ?        65    1.227   8      3.238    4.465
80384.0 10752.0  0.0   10494.9 139776.0 16985.1   148480.0   125105.4    ?      ?        65    1.227   8      3.238    4.465
80384.0 10752.0  0.0   10494.9 139776.0 16985.1   148480.0   125105.4    ?      ?        65    1.227   8      3.238    4.465

The columns show the current capacities and utilizations of the various GC regions (the survivor spaces, eden, and the old and permanent generations), along with GC counts and cumulative GC times. To learn more about what each means, see the jstat documentation.

Both jcmd and jstat can connect to JVMs running on remote machines; see their documentation for details (jcmd, jstat).

Monitoring and Management with JMX

One of the JVM’s greatest strengths is how it exposes every single detail of its operation – and that of the standard libraries – for runtime monitoring and management. JMX (Java Management Extensions) is a runtime management and monitoring standard. JMX specifies simple Java objects, called MBeans, that expose monitoring and management operations of the JVM itself, the JDK libraries, and any JVM application. JMX also specifies standard ways of connecting to JVM instances – either locally or remotely – to interact with the MBeans. In fact, jcmd gets much of its information via JMX. We will see how to write our own MBeans in the next section, but let’s first see how we can examine the baked-in ones.

With our app running in one terminal, run jvisualvm (included as part of the JDK) in another. This will launch VisualVM. Before we start playing with it, we need to install some plugins. Go to Tools->Plugins and pick Available Plugins. For the purpose of our demonstration we only need VisualVM-MBeans, but you might as well install all of them (except maybe VisualVM-Glassfish and BTrace Workbench). Now pick jmodern.Main in the left pane, and choose the Monitor tab. You’ll see something like this:

VisualVM Monitor tab

The Monitor tab takes information about the running JVM exposed as JMX MBeans and displays it graphically. We can also manually examine those MBeans (and many more): by choosing the MBeans tab (which will be available only after installing the VisualVM-MBeans plugin), we can examine and interact with all MBeans registered on our JVM instance. The one used in the heap plot, for example, is found under java.lang/Memory (double-click the attribute value in order to expand it):

VisualVM memory MBean
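The same Memory MBean can also be read directly in code through the platform MBean server, using only JDK classes (a minimal sketch; the ObjectName, attribute name and keys are part of the standard java.lang:type=Memory MBean):

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

public class JmxQuery {
    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();

        // The same MBean VisualVM plots, addressed by its JMX ObjectName
        ObjectName memory = new ObjectName("java.lang:type=Memory");
        CompositeData heap = (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");

        System.out.println("heap used: " + heap.get("used") + " of max: " + heap.get("max"));
    }
}
```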

Now let’s pick the java.util.logging/Logging MBean. The LoggerNames attribute in the right pane will list all registered loggers, including the one we’ve added to our code, jmodern.Main (double-click the attribute value in order to expand it).

VisualVM logging MBean

MBeans let us not only inspect monitoring values, but also set them and invoke various management operations. Pick the Operations tab (in the right pane, next to the Attributes tab). We can now change the logging level at runtime via the JMX MBean. In the setLoggerLevel operation, fill in two values: jmodern.Main in the first, and WARNING (the new logging level) in the second, as in the screenshot below:

VisualVM change logging level

Now, when you click the setLoggerLevel button, the “info” log messages will no longer be displayed. If you set the level to SEVERE, both kinds of log messages will stop appearing. VisualVM automatically generates this simple GUI without any effort on the part of the developer creating the MBean.
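The same operation can be invoked programmatically, too; here’s a minimal sketch against the standard java.util.logging:type=Logging MBean (the logger name matches our example; any registered logger would work):

```java
import java.lang.management.ManagementFactory;
import java.util.logging.Logger;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class SetLogLevel {
    // keep a strong reference so the logger isn't garbage-collected
    static final Logger log = Logger.getLogger("jmodern.Main");

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName logging = new ObjectName("java.util.logging:type=Logging");

        // equivalent to filling in the two setLoggerLevel fields in VisualVM
        server.invoke(logging, "setLoggerLevel",
                new Object[]{"jmodern.Main", "WARNING"},
                new String[]{"java.lang.String", "java.lang.String"});

        String level = (String) server.invoke(logging, "getLoggerLevel",
                new Object[]{"jmodern.Main"},
                new String[]{"java.lang.String"});
        System.out.println("level: " + level);
    }
}
```

This works here because our SLF4J binding (slf4j-jdk14) routes log calls to java.util.logging, whose loggers this MBean manages.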

We can allow VisualVM (and other JMX consoles) to access our app remotely by adding some system properties. To do that, we add the following lines to our build file’s run section:

systemProperty "com.sun.management.jmxremote", ""
systemProperty "com.sun.management.jmxremote.port", "9999"
systemProperty "com.sun.management.jmxremote.authenticate", "false"
systemProperty "com.sun.management.jmxremote.ssl", "false"

(in production, you’d naturally want to enable security).

As we’ve seen, in addition to MBean inspection, VisualVM also has custom monitoring views, some relying on JMX for data and some on other means: it monitors thread states and current stack traces for all threads, it provides insights into the GC and general memory usage, it takes and analyzes heap dumps and core dumps, and much, much more. VisualVM is one of the most important tools in the modern Java developer’s toolbox. Here’s a screenshot of some advanced monitoring information provided by VisualVM’s traces plugins:

VisualVM Traces

A modern Java developer might sometimes prefer a CLI to a nice GUI. A nice project called jmxterm provides a CLI for JMX MBeans. Unfortunately, it does not yet support Java 7 and 8, but the developer says it will soon (if not, we will release a fix; we already have a working fork).

One thing is certain, though. The modern Java developer likes REST APIs (if for no other reason than that they’re ubiquitous and easy to build web GUIs for). While the JMX standard supports a few different local and remote connectors, it does not yet include an HTTP connector (it’s supposed to in Java 9). However, a beautiful project called Jolokia fills that void, and gives us RESTful access to our MBeans. Let’s give it a try. Merge the following into your build.gradle:

configurations {
    jolokia
}

dependencies {
    runtime "org.jolokia:jolokia-core:1.2.1"
    jolokia "org.jolokia:jolokia-jvm:1.2.1:agent"
}

run {
    jvmArgs "-javaagent:${configurations.jolokia.iterator().next()}=port=7777,host=localhost"
}

(the fact that Gradle requires a new configuration for each dependency used as a Java agent annoys me to no end. Dear Gradle team: please, please make it easier!).

We also want to have Jolokia in our capsule, so we’ll change the Java-Agents attribute in capsule task to read:

'Java-Agents' : getDependencies(configurations.quasar).iterator().next() +
                " ${getDependencies(configurations.jolokia).iterator().next()}=port=7777,host=localhost",

Run the application with gradle run, or with the capsule (gradle capsule; java -jar build/libs/jmodern-capsule.jar), and point your browser at http://localhost:7777/jolokia/version. If Jolokia is working properly, you’ll get a JSON response. Now, to examine our app’s heap usage, do:

curl http://localhost:7777/jolokia/read/java.lang:type\=Memory/HeapMemoryUsage

To change our logger’s log level, you can do:

curl http://localhost:7777/jolokia/exec/java.util.logging:type\=Logging/setLoggerLevel\(java.lang.String,java.lang.String\)/jmodern.Main/WARNING

Jolokia has a very nice HTTP API that can use both GET and POST operations, and it also allows for secure access. For more information, consult the (excellent) Jolokia documentation.

An HTTP API opens the door to web management consoles. The Jolokia site has a demo of a Cubism GUI for JMX MBeans. Another project, JBoss’s hawtio, uses the Jolokia HTTP API to construct a full-featured, browser-based monitoring and management console for JVM applications. Aside from being a browser app, hawtio differs from VisualVM in that it is intended as a continuous monitoring/management tool for production code, while VisualVM is more of a troubleshooting tool.

Writing Your Own MBeans

Writing and registering your own MBeans is easy:

package jmodern;

import co.paralleluniverse.fibers.Fiber;
import co.paralleluniverse.strands.Strand;
import co.paralleluniverse.strands.channels.*;
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.MXBean;
import javax.management.ObjectName;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class Main {
    static final Logger log = LoggerFactory.getLogger(Main.class);

    public static void main(String[] args) throws Exception {
        final AtomicInteger counter = new AtomicInteger();
        final Channel<Object> ch = Channels.newChannel(0);

        // create and register MBean
        ManagementFactory.getPlatformMBeanServer().registerMBean(new JModernInfo() {
            @Override
            public void send(String message) {
                try {
                    ch.send(message);
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }

            @Override
            public int getNumMessagesReceived() {
                return counter.get();
            }
        }, new ObjectName("jmodern:type=Info"));

        new Fiber<Void>(() -> {
            for (int i = 0; i < 100000; i++) {
                Strand.sleep(100);
                log.info("Sending {}", i); // log something
                ch.send(i);
                if (i % 10 == 0)
                    log.warn("Sent {} messages", i + 1); // log something
            }
            ch.close();
        }).start();

        new Fiber<Void>(() -> {
            Object x;
            while ((x = ch.receive()) != null) {
                counter.incrementAndGet();
                System.out.println("--> " + x);
            }
        }).start().join(); // join waits for this fiber to finish

    }

    @MXBean
    public interface JModernInfo {
        void send(String message);
        int getNumMessagesReceived();
    }
}

We’ve added an MBean that lets us monitor the number of messages received by the second fiber, and also exposes a send operation that will slip a message into the channel. When we run the app, we can now see our monitored property in VisualVM:

JModern MBean 1

plot it (by double clicking the value):

JModern MBean 2

and use our MBean operation in the Operations tab to sneak a message into the channel:

JModern MBean operation

Easy Health and Performance Monitoring with Metrics

Metrics is a cool, modern library for easy performance and health monitoring, built by Coda Hale back when he was at Yammer. It contains common metrics collection and reporting classes like histograms, timers, counters, gauges, etc. Let’s take it out for a spin.

We’re not going to be using Jolokia now, so you can take all of the Jolokia stuff out of the build file. Instead, you’ll need to add the following dependency:

compile "com.codahale.metrics:metrics-core:3.0.2"

Metrics can report its metrics via JMX MBeans, write them to CSV files, expose them via a RESTful interface, or publish them to Graphite or Ganglia. We will just be reporting to JMX (although we’ll see Metrics’s HTTP reporting in part 3, when we discuss Dropwizard). This will be our new Main class:

package jmodern;

import co.paralleluniverse.fibers.Fiber;
import co.paralleluniverse.strands.Strand;
import co.paralleluniverse.strands.channels.*;
import com.codahale.metrics.*;
import static com.codahale.metrics.MetricRegistry.name;
import java.util.concurrent.ThreadLocalRandom;
import static java.util.concurrent.TimeUnit.*;

public class Main {
    public static void main(String[] args) throws Exception {
        final MetricRegistry metrics = new MetricRegistry();
        JmxReporter.forRegistry(metrics).build().start(); // starts reporting via JMX

        final Channel<Object> ch = Channels.newChannel(0);

        new Fiber<Void>(() -> {
            Meter meter = metrics.meter(name(Main.class, "messages" , "send", "rate"));
            for (int i = 0; i < 100000; i++) {
                Strand.sleep(ThreadLocalRandom.current().nextInt(50, 500)); // random sleep
                meter.mark(); // measures event rate

                ch.send(i);
            }
            ch.close();
        }).start();

        new Fiber<Void>(() -> {
            Counter counter = metrics.counter(name(Main.class, "messages", "received"));
            Timer timer = metrics.timer(name(Main.class, "messages", "duration"));

            Object x;
            long lastReceived = System.nanoTime();
            while ((x = ch.receive()) != null) {
                final long now = System.nanoTime();
                timer.update(now - lastReceived, NANOSECONDS); // creates duration histogram
                lastReceived = now;
                counter.inc(); // counts

                System.out.println("--> " + x);
            }
        }).start().join(); // join waits for this fiber to finish

    }
}

In this example we’ve used a Metrics counter (which counts things), a meter (measures rate of events) and a timer (produces a histogram of time durations). Now let’s run the app and fire up VisualVM. Here are our metrics (the screenshot shows the timer metric):

Metrics report

Profiling

Profiling an application is the key to meeting our performance requirements. Only by profiling can we know which parts of the code affect overall execution speed, and concentrate our optimization efforts on them, and on them only. Java has always had good profilers, both built into IDEs and as separate products. But as a result of merging parts of the JRockit JVM code into HotSpot, Java recently gained its most precise and lightweight profiler yet. While the tools (or, rather, the combination of two tools) we will explore in this section are not open-source, we discuss them because they are included in the standard Oracle JDK, and are free to use in your development environment (but require a commercial license for use on code deployed to production).

We will begin with a test program. It’s our original demo with some artificial work added:

package jmodern;

import co.paralleluniverse.fibers.Fiber;
import co.paralleluniverse.strands.Strand;
import co.paralleluniverse.strands.channels.*;
import com.codahale.metrics.*;
import static com.codahale.metrics.MetricRegistry.name;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Collectors;
import static java.util.concurrent.TimeUnit.*;

public class Main {
    public static void main(String[] args) throws Exception {
        final MetricRegistry metrics = new MetricRegistry();
        JmxReporter.forRegistry(metrics).build().start(); // starts reporting via JMX

        final Channel<Object> ch = Channels.newChannel(0);

        new Fiber<Void>(() -> {
            Meter meter = metrics.meter(name(Main.class, "messages", "send", "rate"));
            for (int i = 0; i < 100000; i++) {
                Strand.sleep(ThreadLocalRandom.current().nextInt(50, 500)); // random sleep
                meter.mark();

                ch.send(i);
            }
            ch.close();
        }).start();

        new Fiber<Void>(() -> {
            Counter counter = metrics.counter(name(Main.class, "messages", "received"));
            Timer timer = metrics.timer(name(Main.class, "messages", "duration"));

            Object x;
            long lastReceived = System.nanoTime();
            while ((x = ch.receive()) != null) {
                final long now = System.nanoTime();
                timer.update(now - lastReceived, NANOSECONDS);
                lastReceived = now;
                counter.inc();

                double y = foo(x);
                System.out.println("--> " + x + " " + y);
            }
        }).start().join();
    }

    static double foo(Object x) { // do crazy work
        if (!(x instanceof Integer))
            return 0.0;

        double y = (Integer)x % 2723;
        for(int i=0; i<10000; i++) {
            String rstr = randomString('A', 'Z', 1000);
            y *= rstr.matches("ABA") ? 0.5 : 2.0;
            y = Math.sqrt(y);
        }
        return y;
    }

    public static String randomString(char from, char to, int length) {
        return ThreadLocalRandom.current().ints(from, to + 1).limit(length)
                .mapToObj(x -> Character.toString((char)x)).collect(Collectors.joining());
    }
}

The function foo does some nonsensical computations, so don’t try to make sense of them. When you run the application (gradle run), you may notice a warning from Quasar that one of the fibers is consuming an inordinate amount of CPU time. To figure out what’s going on, we need to profile.

The profiler we’ll use is very precise and has very low overhead. It’s made of two components. The first, Java Flight Recorder, is baked into the HotSpot VM. It can record many different JVM events into very efficient buffers. It can be triggered to start and stop recording with jcmd, but we will control it with the second tool, Java Mission Control (or JMC), which is included in the JDK installation. JMC was the analog of VisualVM in the JRockit VM, and it performs many of the same functions (like MBean inspection), only it is far uglier. But we will be using JMC for its ability to control Java Flight Recorder and analyze its recordings (I hope Oracle eventually moves this functionality into the much prettier VisualVM).
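For reference, recordings can also be driven entirely from the command line with jcmd (a sketch; substitute your process id for `<pid>`, and check `jcmd <pid> help JFR.start` for the options your JDK version supports):

```
jcmd <pid> VM.unlock_commercial_features
jcmd <pid> JFR.start name=rec duration=60s filename=recording.jfr
jcmd <pid> JFR.check
jcmd <pid> JFR.stop name=rec
```

This is handy on production machines (license permitting) where no GUI is available.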

Flight Recorder has to be enabled when the program is launched (it won’t record anything yet and won’t affect performance), so we’ll stop the execution, and add this line to build.gradle’s run section:

jvmArgs "-XX:+UnlockCommercialFeatures", "-XX:+FlightRecorder"

The UnlockCommercialFeatures flag is necessary because Flight Recorder is a commercial feature, but it’s free for use in development. Now let’s launch the program again.

In another terminal, let’s fire up Mission Control with the jmc command. In the left pane, right-click jmodern.Main, and choose “Start Flight Recording…”. In the wizard window’s “Event settings” dropdown, pick “Profiling - on server”, and click “Next >” (not “Finish”!).

JMC Start Recording

In the next screen, check the “Heap Statistics” and “Allocation Profiling” boxes, and click “Finish”.

JMC Start Recording 2

JMC will now wait for one minute until Flight Recorder concludes the recording. It will then open the recording file for analysis, at which point you can safely terminate our application.

The “Code” section’s Hot Methods tab immediately uncovers the randomString method as the culprit, responsible for almost 90% of the program’s execution time.

JMC Hot Methods

The Memory section’s Garbage Collection tab shows the heap usage over the duration of the recording:

JMC Heap

And the GC Times tab shows the durations of the garbage collector’s collections:

JMC GC

We can also examine allocation behavior:

JMC Allocations

and the heap’s contents:

JMC Objects

Java Flight Recorder also has an (unsupported) API that lets us record our own application events in JFR’s recordings.

Advanced Topic: Profiling and Debugging with Byteman

Like last time, we will conclude this installment with advanced topics. First up is profiling and debugging with Byteman. As we mentioned in part 1, one of the JVM’s most powerful features is the ability to dynamically load code at runtime (which goes far beyond loading dynamic libraries in native applications). Not only that: the JVM gives us the ability to transform, and re-transform, already running code.
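The mechanism underlying such tools is the java.lang.instrument API: an agent registers a ClassFileTransformer, which gets a chance to rewrite every class’s bytecode as it is loaded (and, with retransformation, after it has been loaded). Here is a minimal sketch (the class name is mine; a real agent like Byteman rewrites classfileBuffer with a bytecode library instead of just logging):

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class LoggingTransformer implements ClassFileTransformer {
    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain pd, byte[] classfileBuffer) {
        System.out.println("loading " + className);
        return null; // returning null tells the JVM to keep the bytecode unchanged
    }

    // called by the JVM before main() when launched with -javaagent:agent.jar
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer(), true); // true = allow retransformation
    }
}
```

To actually install it, the class is packaged in a jar whose manifest names it in a Premain-Class attribute.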

A useful tool that takes advantage of this ability is Byteman, by JBoss. Byteman allows us to inject tracing, debugging and profiling code into a running application. It is included here as an advanced topic because its support for Java 7, let alone 8, is a bit shaky, and may require fixes (to Byteman, that is). The project is actively developed, but lagging behind. We will therefore use Byteman on very basic code.

This will be our main class:

package jmodern;

import java.util.concurrent.ThreadLocalRandom;

public class Main {
    public static void main(String[] args) throws Exception {
        for (int i = 0;; i++) {
            System.out.println("Calling foo");
            foo(i);
        }
    }

    private static String foo(int x) throws InterruptedException {
        long pause = ThreadLocalRandom.current().nextInt(50, 500);
        Thread.sleep(pause);
        return "aaa" + pause;
    }
}

foo simulates calling a service which may take some unknown time to complete.

Next, we’ll merge the following into our build file:

configurations {
    byteman
}

dependencies {
  byteman "org.jboss.byteman:byteman:2.1.4.1"
}

run {
    jvmArgs "-javaagent:${configurations.byteman.iterator().next()}=listener:true,port:9977"
    // remove the quasar agent
}

If you want to test Byteman with a capsule, you can change the Java-Agents attribute in the build file to read:

'Java-Agents' : "${getDependencies(configurations.byteman).iterator().next()}=listener:true,port:9977",

Now, we’ll download Byteman here (we can’t rely only on the dependencies because we’re going to use some of Byteman’s command line tools), unzip the archive, and set the environment variable BYTEMAN_HOME to point to Byteman’s directory.

Now, launch the app in one terminal:

gradle run

It will print something like this:

Calling foo
Calling foo
Calling foo
Calling foo
Calling foo

We want to know how long each call to foo takes, but we’ve forgotten to measure and log that. We’ll use Byteman to insert that log while the program is running.

Start an editor, and create the file jmodern.btm in the project’s directory:

RULE trace foo entry
CLASS jmodern.Main
METHOD foo
AT ENTRY
IF true
DO createTimer("timer")
ENDRULE

RULE trace foo exit
CLASS jmodern.Main
METHOD foo
AT EXIT
IF true
DO traceln("::::::: foo(" + $1 + ") -> " + $! + " : " + resetTimer("timer") + "ms")
ENDRULE

These are Byteman rules, which we will apply to the program. While the program is still running in one terminal, open another, and type:

$BYTEMAN_HOME/bin/bmsubmit.sh -p 9977 jmodern.btm

Our app will now start printing something like this in the first terminal:

Calling foo
::::::: foo(152) -> aaa217 : 217ms
Calling foo
::::::: foo(153) -> aaa281 : 281ms
Calling foo
::::::: foo(154) -> aaa282 : 283ms
Calling foo
::::::: foo(155) -> aaa166 : 166ms
Calling foo
::::::: foo(156) -> aaa160 : 161ms

To see which rules are applied:

$BYTEMAN_HOME/bin/bmsubmit.sh -p 9977

Finally, to unload our Byteman script:

$BYTEMAN_HOME/bin/bmsubmit.sh -p 9977 -u

And now our injected log messages are gone!
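For comparison, the timing that the two rules injected is roughly what we would have written by hand inside foo (a sketch; `TimedMain` is a hypothetical name, and Byteman’s createTimer/resetTimer helpers are approximated with System.nanoTime()):

```java
import java.util.concurrent.ThreadLocalRandom;

public class TimedMain {
    static String foo(int x) {
        long start = System.nanoTime(); // what "AT ENTRY ... createTimer" does
        long pause = ThreadLocalRandom.current().nextInt(50, 500);
        try {
            Thread.sleep(pause);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        String result = "aaa" + pause;
        // what "AT EXIT ... resetTimer" does; $1 is the argument, $! the return value
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("::::::: foo(" + x + ") -> " + result + " : " + elapsedMs + "ms");
        return result;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++)
            foo(i);
    }
}
```

The point of Byteman, of course, is that we got this output without touching the source, recompiling, or restarting the process.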

Byteman is an extremely powerful tool, made possible by the JVM’s flexible code transformations. You can use it to examine variables, log events (either to the standard output or to a file), insert delays and more. You can even easily add your own Byteman actions (say, if you want to log events using your logging engine). For more information, please refer to the Byteman documentation.

Advanced Topic: Benchmarking with JMH

Advances in both hardware architecture and compiler technology have made benchmarking the only viable way to reason about code performance. Modern CPUs and modern compilers are so clever (see this excellent talk by Cliff Click on modern hardware) that creating a mental performance profile of our code – like game programmers did until the end of the ’90s – is damn-near impossible, even for a program written in C; heck, even for a program written in assembly. On the other hand, the cleverness of compilers and CPUs is exactly what makes micro-benchmarking (benchmarking small snippets of code) so hard, as execution speed is very dependent on context (for example, it is affected by the state of the CPU cache, which, in turn, is affected by what other threads are doing). Microbenchmarking JVM programs is doubly tricky because, as I mentioned in part 1, the JVM’s JIT is a profile-guided optimizing compiler⁴, which is very much affected by the context in which the code is run. Therefore, code in a microbenchmark can be much faster, or much slower, than the same code in the context of a larger program.
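To make those pitfalls concrete, here is a naive hand-rolled microbenchmark of the kind JMH exists to replace (a sketch of what not to do; the class name is mine):

```java
public class NaiveBench {
    // A naive "benchmark": no warmup phase, no forked JVM, no blackhole for results.
    static long naiveNanosPerOp(int iters) {
        long start = System.nanoTime();
        for (int i = 0; i < iters; i++)
            Math.sqrt(i); // result unused; the JIT may eliminate the call entirely
        return (System.nanoTime() - start) / iters;
    }

    public static void main(String[] args) {
        // The first run measures interpreted code; later runs measure JIT-compiled
        // (and possibly dead-code-eliminated) code, so the numbers can differ wildly.
        for (int run = 0; run < 3; run++)
            System.out.println(naiveNanosPerOp(1_000_000) + " ns/op");
    }
}
```

Whatever numbers this prints, they tell us little about how Math.sqrt behaves inside a real program.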

JMH is an open-source Java benchmarking harness by Oracle that runs code snippets in just the right way, so you can truly reason about their performance (see this great talk (slides) by Aleksey Shipilev, JMH’s main author, about Java performance benchmarking). There is another, older tool called Caliper, made by Google to serve the same purpose as JMH, but it is far cruder, and might even give wrong results. Don’t use it.

We will take JMH for a spin right away, but first, the usual warning regarding microbenchmarks: premature optimization is the root of all evil. There is no point in benchmarking two algorithms or data structures to find out that one is 100 times faster than the other if the algorithm accounts for a grand total of 1% of your app’s execution time; even making that algorithm run infinitely fast would only save you 1%. Benchmark only after you’ve profiled your application and determined which bits would generate the most gain if accelerated.
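The arithmetic behind that warning is Amdahl’s law: if a fraction p of the runtime is accelerated by a factor s, the overall speedup is only 1 / ((1 - p) + p / s). A quick sketch (the class name is mine):

```java
public class Amdahl {
    // Overall speedup when a fraction p of total runtime is accelerated by factor s
    static double speedup(double p, double s) {
        return 1.0 / ((1.0 - p) + p / s);
    }

    public static void main(String[] args) {
        // Accelerating a 1%-of-runtime algorithm infinitely buys almost nothing:
        System.out.println(speedup(0.01, Double.POSITIVE_INFINITY)); // ~1.01
        // Accelerating the 90% hot path by just 2x nearly halves the runtime:
        System.out.println(speedup(0.90, 2.0)); // ~1.82
    }
}
```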

As always, we’ll begin with the build file. Add these to your dependencies:

testCompile 'org.openjdk.jmh:jmh-core:0.8'
testCompile 'org.openjdk.jmh:jmh-generator-annprocess:0.8'

and put this bench task at the bottom of the build file:

task bench(type: JavaExec, dependsOn: [classes, testClasses]) {
    classpath = sourceSets.test.runtimeClasspath // we'll put jmodern.Benchmark in the test directory
    main = "jmodern.Benchmark";
}

Finally, we’ll put our benchmark code in src/test/java/jmodern/Benchmark.java. I mentioned ’90s game programmers before, and to show that some of their techniques still work, we’ll benchmark a “standard” inverse square-root computation against the fast inverse square root algorithm, famously misattributed to John Carmack:

package jmodern;

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.profile.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.parameters.TimeValue;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class Benchmark {
    public static void main(String[] args) throws Exception {
        new Runner(new OptionsBuilder()
                .include(Benchmark.class.getName() + ".*")
                .forks(1)
                .warmupTime(TimeValue.seconds(5))
                .warmupIterations(3)
                .measurementTime(TimeValue.seconds(5))
                .measurementIterations(5)
                .build()).run();
    }

    private double x = 2.0; // prevent constant folding

    @GenerateMicroBenchmark
    public double standardInvSqrt() {
        return 1.0/Math.sqrt(x);
    }

    @GenerateMicroBenchmark
    public double fastInvSqrt() {
        return invSqrt(x);
    }

    static double invSqrt(double x) {
        double xhalf = 0.5d * x;
        long i = Double.doubleToLongBits(x);
        i = 0x5fe6ec85e7de30daL - (i >> 1);
        x = Double.longBitsToDouble(i);
        x = x * (1.5d - xhalf * x * x);
        return x;
    }
}

By the way, like the Checker Framework we discussed in part 1, JMH uses an annotation processor. Unlike Checker, JMH does it right, so you get automatic integration with all IDEs. Here, for example, is what happens in NetBeans if you forget to include the @State annotation:

JMH IDE integration

To run the benchmarks, type gradle bench at the shell. I got these results:

Benchmark                       Mode   Samples         Mean   Mean error    Units
j.Benchmark.fastInvSqrt         avgt        10        2.708        0.019    ns/op
j.Benchmark.standardInvSqrt     avgt        10       12.824        0.065    ns/op

Nice, but keep in mind that fast-inv-sqrt is a rough approximation, good only to a few decimal places.
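How rough? We can check by scanning the relative error against Math.sqrt (a sketch; the class and method names are mine, and invSqrt is the same routine as in the benchmark):

```java
public class InvSqrtError {
    static double invSqrt(double x) { // fast inverse square root, double variant
        double xhalf = 0.5d * x;
        long i = Double.doubleToLongBits(x);
        i = 0x5fe6ec85e7de30daL - (i >> 1); // the "magic constant" bit trick
        x = Double.longBitsToDouble(i);
        x = x * (1.5d - xhalf * x * x);    // one Newton-Raphson refinement step
        return x;
    }

    static double worstRelError(double lo, double hi, double step) {
        double worst = 0.0;
        for (double x = lo; x < hi; x += step) {
            double exact = 1.0 / Math.sqrt(x);
            worst = Math.max(worst, Math.abs(invSqrt(x) - exact) / exact);
        }
        return worst;
    }

    public static void main(String[] args) {
        // With a single Newton step the worst relative error stays well under 1%
        System.out.printf("worst relative error: %.5f%n", worstRelError(0.01, 100.0, 0.01));
    }
}
```

Adding a second Newton step would shrink the error further, at the cost of some of the speed advantage.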

Here’s another example; this one also reports the time spent garbage-collecting and gives a crude method-stack profile:

package jmodern;

import java.util.*;
import java.util.concurrent.*;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.profile.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.parameters.TimeValue;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class Benchmark {
    public static void main(String[] args) throws Exception {
        new Runner(new OptionsBuilder()
                .include(Benchmark.class.getName() + ".*")
                .forks(2)
                .warmupTime(TimeValue.seconds(5))
                .warmupIterations(3)
                .measurementTime(TimeValue.seconds(5))
                .measurementIterations(5)
                .addProfiler(GCProfiler.class)    // report GC time
                .addProfiler(StackProfiler.class) // report method stack execution profile
                .build()).run();
    }

    @GenerateMicroBenchmark
    public Object arrayList() {
        return add(new ArrayList<>());
    }

    @GenerateMicroBenchmark
    public Object linkedList() {
        return add(new LinkedList<>());
    }

    static Object add(List<Integer> list) {
        for (int i = 0; i < 4000; i++)
            list.add(i);
        return list;
    }
}

Here is an example of the profiling info printed by JMH:

Iteration   3: 33783.296 ns/op
          GC | wall time = 5.000 secs,  GC time = 0.048 secs, GC% = 0.96%, GC count = +97
             |
       Stack |  96.9%   RUNNABLE jmodern.generated.Benchmark_arrayList.arrayList_AverageTime_measurementLoop
             |   1.8%   RUNNABLE java.lang.Integer.valueOf
             |   1.3%   RUNNABLE java.util.Arrays.copyOf
             |   0.0%            (other)
             |

JMH is a very rich framework. Unfortunately, it is a little thin on documentation, but it does have a rather nice tutorial (which also demonstrates the pitfalls of naive Java microbenchmarking) written as a series of code samples. You can also read this good introductory post by Nitsan Wakart.

So, What Have We Learned So Far?

In this post we’ve covered some of the best tools for JVM management, monitoring and profiling. The JVM is very serious about providing deep insight into its operation, which, in addition to its great performance, is the main reason I wouldn’t replace the JVM with any other technology as the platform for a heavy-duty, long-running, server-side app. We’ve also seen how powerful the JVM is when it comes to modifying running code with tools like Byteman.

We’ve also introduced Capsule, a lightweight, single-file, stateless, installation-free deployment package for JVM apps. It optionally supports automatic upgrades – for the entire app, or just a library dependency – via a public, or organizational Maven repository.

In part 3, we will discuss writing lightweight, scalable HTTP services with Dropwizard and Comsat, Web Actors, and dependency injection with JSR-330.

Discuss on Hacker News

  1. Maven repositories also usually store not just the binaries, but additional artifacts like sources and Javadocs; those are downloaded by IDEs to allow stepping into third-party code and to provide inline documentation for library calls.

  2. While Capsule is – for the time being, at least – managed by Parallel Universe, I hope it becomes a community-led project.

  3. I know some of you are thinking that running Node.js code on the JVM is like taking a chocolate-smeared four-year-old from school in a brand new Ferrari, but I’m not passing judgement.

  4. For more information about how the JVM works, you can watch these talks: A JVM does that? by Cliff Click, and JVM Mechanics – A Peek Under the Hood by Gil Tene)
