Speeding up JAXB with caching and pooling - with benchmarks

[Image: two office workers. One digs through a mountain of paper every time, while the other uses a neatly organized file cabinet labeled Cache, representing non-cached and cached JAXB instances.]

What is JAXB? #

JAXB stands for Java Architecture for XML Binding. It is a Java API that lets developers convert Java objects into XML and vice versa: XML data is mapped to Java objects, and Java objects can be converted to XML data.

XML has a schema definition (XSD) that defines the structure of the XML data. JAXB uses this schema to generate Java classes that represent the XML data. This allows developers to work with Java objects instead of dealing with XML directly.

A class is typically generated for each element in the XML schema, including the root element.

The JAXBContext #

JAXBContext is the main entry point for JAXB. It is created using the newInstance() method, which takes the package name of the generated classes. The resulting JAXBContext instance is used to create Marshaller and Unmarshaller objects, which convert Java objects to XML and vice versa:

  • createMarshaller(): This method creates a Marshaller object that can be used to convert Java objects to XML.
  • createUnmarshaller(): This method creates an Unmarshaller object that can be used to convert XML data to Java objects.

So a typical JAXB workflow is as follows:

  1. Create a JAXBContext instance using the newInstance() method.
  2. Create a Marshaller object using the createMarshaller() method.
  3. Use the Marshaller object to convert Java objects to XML.
  4. Save the XML data to a file or send it over the network.

More information about JAXB can be found in the JAXB documentation.

Caching JAXBContext #

JAXBContext.newInstance() is slow #

Creating a new JAXBContext instance is expensive. The cost adds up quickly if instances are created in a loop or by many threads in a multi-threaded environment.

Note: The XSD schema used in the benchmark is fairly large, with 20,000 lines and more than 40 generated classes. This ensures that creating the JAXBContext instance takes a measurable amount of time, as the code was run on a powerful machine with 128GB of RAM and 24 cores.

Let's examine the performance of creating a new JAXBContext instance in a loop, and compare it to using a cached instance of JAXBContext.

No Caching #

The following code creates a new JAXBContext instance for each iteration of the loop:

public long noCacheRun(int maxRuns) throws JAXBException {
   long startTime = System.currentTimeMillis();

   for (int i = 0; i < maxRuns; i++) {
       JAXBContext jaxbContext = JAXBContext.newInstance("dev.programmerpulse.jaxb");
       jaxbContext.createMarshaller();
   }

   long endTime = System.currentTimeMillis();
   return endTime - startTime;
}

The noCacheRun() method creates a new JAXBContext instance for each iteration of the loop, and uses the instance to create a Marshaller object.

Caching #

The cached version of the code creates a single instance of JAXBContext and reuses it for each iteration of the loop:

private static JAXBContext jaxbContextCache;

public long runWithCache(int maxRuns) throws JAXBException {
    long startTime = System.currentTimeMillis();

    for (int i = 0; i < maxRuns; i++) {
        jaxbContextCache.createMarshaller();
    }

    long endTime = System.currentTimeMillis();
    return endTime - startTime;
}

public static void main(String[] args) throws Exception {
    jaxbContextCache = JAXBContext.newInstance("dev.programmerpulse.jaxb");
    // ...
}
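
The static field above is assigned once in main(). When the context is needed from several places or threads, one option is to create it lazily and memoize it per context path. The sketch below uses ConcurrentHashMap.computeIfAbsent() for that, which applies the factory at most once per key. To keep it runnable with only the JDK, the expensive object is a stand-in built by a hypothetical factory function; in real code the factory would wrap JAXBContext.newInstance(contextPath) and convert the checked JAXBException. The names ContextCacheSketch, get() and countCreations() are illustrative and not part of JAXB.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Caches one expensive object per key; T stands in for JAXBContext. */
final class ContextCacheSketch<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> factory;

    ContextCacheSketch(Function<String, T> factory) {
        this.factory = factory;
    }

    /** Returns the cached value, creating it at most once per key. */
    T get(String contextPath) {
        return cache.computeIfAbsent(contextPath, factory);
    }

    /** Demo helper: counts factory calls when several threads ask for the same key. */
    static int countCreations(int threads, int lookupsPerThread) {
        AtomicInteger creations = new AtomicInteger();
        ContextCacheSketch<Object> cache = new ContextCacheSketch<>(path -> {
            creations.incrementAndGet(); // stand-in for the slow newInstance() call
            return new Object();
        });
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < lookupsPerThread; i++) {
                    cache.get("dev.programmerpulse.jaxb");
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return creations.get();
    }

    public static void main(String[] args) {
        System.out.println("factory calls: " + countCreations(8, 1000));
    }
}
```

However many threads and lookups are used, the factory should run only once for the single key, which is the property the cached benchmark relies on.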

Cached vs Non-Cached Results #

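The table below shows that the cached version of the code is significantly faster than the non-cached version. The non-cached time grows linearly, at roughly 22 to 25 ms per new JAXBContext, while the cached run stays under 1 ms in every case.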

Iterations  Non-Cached (ms)  Cached (ms)
1           25               <1
10          238              <1
50          1162             <1
100         2342             <1
200         4638             <1
500         11440            <1
1000        22967            <1
2000        44790            <1
5000        109631           <1

This test is not a typical use case, since code rarely creates a JAXBContext instance many times within a single flow.

However, you may be using JAXB in a multi-threaded environment, where each thread creates its own JAXBContext instance. In that case, sharing one cached instance avoids paying the creation cost repeatedly and avoids holding duplicate contexts in memory.

[Chart: "Non-Cached Times". x-axis: iterations (1 to 5000); y-axis: time in milliseconds (0 to 120000). Plots the Non-Cached column of the table above.]

JAXBContext instances are thread-safe

JAXBContext instances are thread-safe, so you can safely share a single instance across multiple threads. However, the Marshaller and Unmarshaller objects are not thread-safe, so you should create a new instance of these objects for each thread.

See also Performance and thread-safety.
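
One way to follow this rule is to share the thread-safe object and hand each thread its own non-thread-safe helper through ThreadLocal. The sketch below is JDK-only: StringBuilder stands in for Marshaller (neither is safe to share between threads). With JAXB, the initializer would call createMarshaller() on the shared JAXBContext and wrap the checked JAXBException. PerThreadSketch, current() and distinctInstances() are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class PerThreadSketch {
    // StringBuilder stands in for Marshaller: each thread lazily gets its own instance.
    private static final ThreadLocal<StringBuilder> PER_THREAD =
            ThreadLocal.withInitial(StringBuilder::new);

    /** Returns the calling thread's own instance, creating it on first use. */
    static StringBuilder current() {
        return PER_THREAD.get();
    }

    /** Starts the given number of threads and counts the distinct instances they received. */
    static int distinctInstances(int threads) {
        // StringBuilder does not override equals/hashCode, so this set compares by identity.
        Set<StringBuilder> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                seen.add(current());
                seen.add(current()); // second call returns the same object for this thread
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct instances: " + distinctInstances(4));
    }
}
```

A ThreadLocal keeps its value for as long as the thread lives, so with long-lived worker threads this effectively holds one helper per thread; a pool is an alternative when the number of helpers should be bounded independently of the number of threads.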

Pooling the Marshaller and Unmarshaller #

Since the Marshaller and Unmarshaller objects are not thread-safe, reusing the same (un)marshaller across multiple threads at the same time can lead to corrupted output or unpredictable errors.

Instead of caching a single Marshaller or Unmarshaller, you can use a pool. It is a more complex solution, as your code needs to ensure that each pooled object is used by only one thread at a time and is always returned to the pool.

Pool Manager #

Let's create a simple pool manager that will manage Marshaller objects. The pool manager creates a fixed number of Marshaller objects and stores them in a queue. When a thread needs a Marshaller object, it takes one from the queue. When it is done with the Marshaller object, it returns it to the queue.

package dev.programmerpulse;

import jakarta.xml.bind.JAXBContext;
import jakarta.xml.bind.Marshaller;

import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;

public enum JaxbPoolManager {
    INSTANCE;

    private final JAXBContext jaxbContext;
    public static final int POOL_SIZE = 10;
    private final BlockingDeque<Marshaller> pool = new LinkedBlockingDeque<>();

    JaxbPoolManager() {
        try {
            jaxbContext = JAXBContext.newInstance("dev.programmerpulse.jaxb");
            for (int i = 0; i < POOL_SIZE; i++) {
                Marshaller marshaller = jaxbContext.createMarshaller();
                pool.add(marshaller);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public Marshaller getMarshaller() {
        try {
            return pool.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    public void returnMarshaller(Marshaller marshaller) {
        if (marshaller != null) {
            pool.offer(marshaller);
        }
    }
}

The getMarshaller() method is a blocking call, since take() blocks until a Marshaller object is available in the pool. This is useful behaviour: the thread simply waits, rather than continuously calling getMarshaller() until a Marshaller object becomes available.
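
If waiting forever is not acceptable, the same queue type also offers poll(timeout, unit), which gives up after a deadline instead of blocking indefinitely. A minimal JDK-only sketch follows, with a generic element type standing in for Marshaller; TimedPoolSketch, tryBorrow() and giveBack() are hypothetical names and not part of JaxbPoolManager.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

/** A pool whose borrow operation waits for a bounded time; T stands in for Marshaller. */
final class TimedPoolSketch<T> {
    private final BlockingDeque<T> pool;

    TimedPoolSketch(List<T> items) {
        this.pool = new LinkedBlockingDeque<>(items);
    }

    /** Waits up to timeoutMillis for an item; returns empty if none became available. */
    Optional<T> tryBorrow(long timeoutMillis) {
        try {
            // poll() returns null when the timeout elapses with the pool still empty
            return Optional.ofNullable(pool.poll(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }

    /** Puts a borrowed item back into the pool. */
    void giveBack(T item) {
        pool.offer(item);
    }

    public static void main(String[] args) {
        TimedPoolSketch<String> pool = new TimedPoolSketch<>(List.of("m1"));
        Optional<String> first = pool.tryBorrow(50);  // the pool holds one item
        Optional<String> second = pool.tryBorrow(50); // the pool is empty, so this times out
        System.out.println(first.isPresent() + " " + second.isPresent());
        first.ifPresent(pool::giveBack);
        System.out.println(pool.tryBorrow(50).isPresent());
    }
}
```

The caller then has to decide what an empty result means, for example failing the request or falling back to creating a temporary Marshaller.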

Without Pooling #

The following code creates a new Marshaller object for each iteration of the loop:

private long runWithoutPool(int maxRuns) throws JAXBException {

    long startTime = System.nanoTime();
    for (int i = 0; i < maxRuns; i++) {
        try {
            // create marshaller
            Marshaller marshaller = jaxbContext.createMarshaller();

            // Simulate some work with the marshaller
            XyzRequest request = new XyzRequest();
            request.setSystemId("abc" + i);

            StringWriter sw = new StringWriter();
            marshaller.marshal(request, sw);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    long endTime = System.nanoTime();
    return endTime - startTime;
}

The jaxbContext instance is created once in the main method as we are interested in the performance of creating the Marshaller object.

With Pooling #

The following code uses the pool manager to get a Marshaller object from the pool:

private long runWithPool(int maxRuns) {
    ExecutorService executorService = Executors.newFixedThreadPool(10);

    long startTime = System.nanoTime();
    for (int i = 0; i < maxRuns; i++) {
        final int id = i;
        executorService.execute(() -> {
            Marshaller marshaller = null;
            try {
                // get marshaller from the pool
                marshaller = JaxbPoolManager.INSTANCE.getMarshaller();

                // Simulate some work with the marshaller
                XyzRequest request = new XyzRequest();
                request.setSystemId("abc" + id);

                StringWriter sw = new StringWriter();
                marshaller.marshal(request, sw);
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                // ensure marshaller is returned to the pool
                if (marshaller != null) {
                    JaxbPoolManager.INSTANCE.returnMarshaller(marshaller);
                }
            }
        });
    }

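    // Note: the timer stops once all tasks have been submitted;
    // tasks still queued or running at this point are not included.
    long endTime = System.nanoTime();
    long elapsedTime = endTime - startTime;
    executorService.shutdown();
    return elapsedTime;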
}
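
In runWithPool() above, endTime is read as soon as the loop has submitted its tasks, so work that is still queued or running in the executor is not part of the measurement. A hedged sketch of how the whole run could be timed instead: call shutdown() and wait with awaitTermination() before reading the clock. The task is passed in as a Runnable so the example needs only the JDK; ExecutorTimingSketch and timeTasks() are hypothetical names, and the one-minute limit is an arbitrary assumption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ExecutorTimingSketch {

    /** Runs taskCount tasks on a fixed pool and returns the nanoseconds until all have finished. */
    static long timeTasks(int taskCount, int threads, Runnable task) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int i = 0; i < taskCount; i++) {
            executor.execute(task);
        }
        executor.shutdown(); // accept no new tasks; submitted ones still run
        try {
            // Block until every submitted task has completed (or the limit is hit).
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // A short sleep stands in for the marshalling work.
        Runnable sleepy = () -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        long nanos = timeTasks(10, 2, sleepy);
        System.out.println("elapsed ms: " + TimeUnit.NANOSECONDS.toMillis(nanos));
    }
}
```

Timed this way, the result covers both submitting and executing the tasks, which makes it directly comparable to a single-threaded loop that does the same work inline.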

Always ensure that the Marshaller object is returned to the pool

To ensure that the Marshaller object is returned to the pool, a finally block is used. This returns the Marshaller object even if an exception is thrown. Otherwise, every failure would leak one Marshaller object, and once all of them have leaked the pool is empty and getMarshaller() blocks indefinitely.
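
One way to make the return hard to forget is to hide the borrow and the return behind a single method that takes the work as a callback. The sketch below rests on stated assumptions: the pool is generic so that it runs with only the JDK (T would be Marshaller), and BorrowingPoolSketch, withResource() and available() are hypothetical names that are not part of JaxbPoolManager.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** A pool that borrows and returns items on the caller's behalf; T stands in for Marshaller. */
final class BorrowingPoolSketch<T> {
    private final BlockingQueue<T> pool;

    BorrowingPoolSketch(List<T> items) {
        this.pool = new LinkedBlockingQueue<>(items);
    }

    /** Borrows an item, applies the action, and always returns the item to the pool. */
    <R> R withResource(Function<T, R> action) {
        T item;
        try {
            item = pool.take(); // blocks until an item is available
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a pooled item", e);
        }
        try {
            return action.apply(item);
        } finally {
            pool.offer(item); // runs on success and on exception
        }
    }

    /** Number of items currently waiting in the pool. */
    int available() {
        return pool.size();
    }

    public static void main(String[] args) {
        BorrowingPoolSketch<StringBuilder> pool =
                new BorrowingPoolSketch<>(List.of(new StringBuilder()));
        try {
            pool.withResource(sb -> {
                throw new IllegalArgumentException("simulated failure");
            });
        } catch (IllegalArgumentException expected) {
            // the item was still returned by the finally block
        }
        System.out.println("available after failure: " + pool.available());
    }
}
```

Because Marshaller.marshal() throws the checked JAXBException, real code would likely need its own functional interface that declares the exception rather than java.util.function.Function.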

Pooling vs Non-Pooling Results #

Here are the results. Times are in nanoseconds.

Iterations  Non-Pooling (ns)  Pooling (ns)
10000       102,529,731       1,320,463
20000       80,457,435        2,670,014
30000       107,754,668       2,798,352
40000       133,273,640       1,461,262
50000       156,965,485       1,878,108
60000       193,966,052       2,133,248
70000       218,009,450       2,051,565
80000       257,850,855       2,342,317
90000       281,000,376       2,556,567
100000      323,297,950       2,933,371

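So again, there is a significant performance improvement when using a pool of Marshaller objects. The pooled version stays below 3 ms, while the non-pooled version takes between 80 and 323 ms, which is roughly two orders of magnitude slower. Two caveats apply when reading these numbers: runWithPool() stops its timer once the tasks have been submitted, so work still running in the executor may not be included, and the pooled run uses 10 threads while the non-pooled run uses one.

Further tuning is possible, such as changing the pool size. The tests held each Marshaller object only briefly because they did no real work; an application that holds a Marshaller object for longer may need a larger pool to keep threads from waiting.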

[Chart: "Pooled vs Non-Pooled". x-axis: iterations (10000 to 100000); y-axis: time in nanoseconds (0 to 350000000). Plots the Non-Pooling and Pooling columns of the table above.]

Conclusion #

In conclusion, caching and pooling can significantly improve the performance of JAXB code in Java applications.

Points to note are:

  • JAXBContext instances are thread-safe.
  • Cache the JAXBContext instance.
  • Marshaller and Unmarshaller objects are not thread-safe.
  • Use a pool of Marshaller and Unmarshaller objects.