Speeding up JAXB with caching and pooling - with benchmarks
What is JAXB? #
JAXB stands for Java Architecture for XML Binding. It is a Java API that lets developers convert Java objects into XML and vice versa: XML data is mapped to Java objects, and Java objects can be converted to XML data.
XML data can have a schema definition (XSD) that defines its structure. JAXB uses this schema to generate Java classes that represent the XML data, so developers can work with Java objects instead of dealing with XML directly.
A class is typically generated for each element in the XML schema, including the root element.
The JAXBContext #
JAXBContext is the main entry point for JAXB. It is created using the newInstance() method, which takes the package name of the generated classes. The resulting JAXBContext instance can then be used to create Marshaller and Unmarshaller objects for converting Java objects to XML and vice versa.
- createMarshaller(): This method creates a Marshaller object that can be used to convert Java objects to XML.
- createUnmarshaller(): This method creates an Unmarshaller object that can be used to convert XML data to Java objects.
So a typical JAXB workflow is as follows:
- Create a JAXBContext instance using the newInstance() method.
- Create a Marshaller object using the createMarshaller() method.
- Use the Marshaller object to convert Java objects to XML.
- Save the XML data to a file or send it over the network.
More information about JAXB can be found in the JAXB documentation.
Caching JAXBContext #
JAXBContext.newInstance() is slow #
Creating a new JAXBContext instance is expensive. The cost adds up quickly if you create multiple instances in a loop or in a multi-threaded environment.
Note: The XSD schema used in the benchmarking is a fairly large schema with 20,000 lines and more than 40 generated classes. This is so that creating the JAXBContext instance takes a measurable amount of time, as the code was run on a powerful machine with 128GB of RAM and 24 cores.
Let's examine the performance of creating a new JAXBContext instance in a loop, and compare it to using a cached instance of JAXBContext.
No Caching #
The following code creates a new JAXBContext instance for each iteration of the loop:
public long noCacheRun(int maxRuns) throws JAXBException {
    long startTime = System.currentTimeMillis();
    for (int i = 0; i < maxRuns; i++) {
        // Build a brand-new context on every iteration (the expensive step)
        JAXBContext jaxbContext = JAXBContext.newInstance("dev.programmerpulse.jaxb");
        jaxbContext.createMarshaller();
    }
    long endTime = System.currentTimeMillis();
    return endTime - startTime;
}
The noCacheRun() method creates a new JAXBContext instance for each iteration of the loop, and uses the instance to create a Marshaller object.
Caching #
The cached version of the code creates a single instance of JAXBContext and reuses it for each iteration of the loop:
// Created once in main() and reused by every iteration
private static JAXBContext jaxbContextCache;

public long runWithCache(int maxRuns) throws JAXBException {
    long startTime = System.currentTimeMillis();
    for (int i = 0; i < maxRuns; i++) {
        // Only the Marshaller is created here; the context is shared
        jaxbContextCache.createMarshaller();
    }
    long endTime = System.currentTimeMillis();
    return endTime - startTime;
}

public static void main(String[] args) throws Exception {
    jaxbContextCache = JAXBContext.newInstance("dev.programmerpulse.jaxb");
    // ...
}
Cached vs Non-Cached Results #
The table below shows that the cached version of the code is significantly faster than the non-cached version.
| Iterations | Non-Cached (ms) | Cached (ms) |
|---|---|---|
| 1 | 25 | <1 |
| 10 | 238 | <1 |
| 50 | 1162 | <1 |
| 100 | 2342 | <1 |
| 200 | 4638 | <1 |
| 500 | 11440 | <1 |
| 1000 | 22967 | <1 |
| 2000 | 44790 | <1 |
| 5000 | 109631 | <1 |
This test is not a typical use case, as it is unlikely that code would create a JAXBContext instance multiple times within a single flow.
However, you may be using JAXB in a multi-threaded environment where multiple threads each create JAXBContext instances. In that case, sharing a cached JAXBContext avoids the repeated creation cost shown above, and fewer duplicate contexts have to be allocated and discarded.
JAXBContext instances are thread-safe
JAXBContext instances are thread-safe, so you can safely share a single instance across multiple threads. However, the Marshaller and Unmarshaller objects are not thread-safe, so you should create a new instance of these objects for each thread.
See also Performance and thread-safety.
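Sharing one context across threads can be sketched with a small lazily populated cache. The example below is a minimal, JDK-only illustration: `ContextCache` and the string stand-in for the context are hypothetical names, not part of JAXB. In a real application the factory would call JAXBContext.newInstance() and wrap the checked JAXBException in an unchecked one.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper: builds each expensive, thread-safe value once per key. */
public class ContextCache<V> {
    private final Map<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> factory;

    public ContextCache(Function<String, V> factory) {
        this.factory = factory;
    }

    /** Returns the cached value for the key, invoking the factory only on first use. */
    public V get(String key) {
        return cache.computeIfAbsent(key, factory);
    }

    public static void main(String[] args) {
        AtomicInteger builds = new AtomicInteger();
        // Stand-in factory (assumption): a JAXB version would create a JAXBContext here
        ContextCache<String> contexts = new ContextCache<>(pkg -> {
            builds.incrementAndGet();
            return "context-for-" + pkg;
        });
        for (int i = 0; i < 1000; i++) {
            contexts.get("dev.programmerpulse.jaxb");
        }
        System.out.println("factory calls: " + builds.get()); // prints "factory calls: 1"
    }
}
```

Keying the cache by package name is only one possible design; an application with a single set of generated classes can simply hold the context in a static final field.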
Pooling the Marshaller and Unmarshaller #
Since the Marshaller and Unmarshaller objects are not thread-safe, reusing the same (un)marshaller across multiple threads at the same time can lead to race conditions.
Rather than caching a single Marshaller or Unmarshaller, you can use a pool of them. It is a more complex solution, as your code needs to ensure that each pooled object is used by only one thread at a time and is always returned to the pool.
Pool Manager #
Let's create a simple pool manager that will manage Marshaller objects. The pool manager will create a fixed number of Marshaller objects and store them in a queue. When a thread needs a Marshaller object, it will take one from the queue. When it is done with the Marshaller object, it will return it to the queue.
package dev.programmerpulse;

import jakarta.xml.bind.JAXBContext;
import jakarta.xml.bind.Marshaller;
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;

public enum JaxbPoolManager {
    INSTANCE;

    public static final int POOL_SIZE = 10;

    private final JAXBContext jaxbContext;
    private final BlockingDeque<Marshaller> pool = new LinkedBlockingDeque<>();

    JaxbPoolManager() {
        try {
            // Create the context once, then pre-fill the pool with marshallers
            jaxbContext = JAXBContext.newInstance("dev.programmerpulse.jaxb");
            for (int i = 0; i < POOL_SIZE; i++) {
                Marshaller marshaller = jaxbContext.createMarshaller();
                pool.add(marshaller);
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Blocks until a marshaller is available
    public Marshaller getMarshaller() {
        try {
            return pool.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    // Hands a borrowed marshaller back to the pool
    public void returnMarshaller(Marshaller marshaller) {
        if (marshaller != null) {
            pool.offer(marshaller);
        }
    }
}
The getMarshaller() method is a blocking call, since take() blocks until a Marshaller object is available in the pool. This is useful behaviour: the thread simply waits for a Marshaller object instead of continuously calling getMarshaller() until one becomes available.
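The blocking behaviour can be seen in isolation with a JDK-only sketch. Here strings stand in for Marshaller objects, and the `borrow()` helper is a hypothetical variant that uses the timed poll() so the caller gives up after a bounded wait instead of blocking indefinitely. It is an illustration of the queue semantics, not a change to the pool manager above.

```java
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.TimeUnit;

public class BlockingTakeDemo {
    /** Borrows an item, waiting at most timeoutMillis; returns null if none became free. */
    static <T> T borrow(BlockingDeque<T> pool, long timeoutMillis) {
        try {
            return pool.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return null;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingDeque<String> pool = new LinkedBlockingDeque<>();
        pool.add("marshaller-1"); // stand-in for a real Marshaller

        String first = borrow(pool, 100);  // available immediately
        String second = borrow(pool, 100); // pool is empty, so this times out
        System.out.println(first + ", " + second); // prints "marshaller-1, null"

        pool.offer(first);               // hand the item back
        System.out.println(pool.take()); // returns at once: prints "marshaller-1"
    }
}
```

A timed wait trades simplicity for the need to handle the "nothing available" case, so whether it is preferable to take() depends on how the calling code should react under load.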
Without Pooling #
The following code creates a new Marshaller object for each iteration of the loop:
private long runWithoutPool(int maxRuns) throws JAXBException {
    long startTime = System.nanoTime();
    for (int i = 0; i < maxRuns; i++) {
        try {
            // create marshaller
            Marshaller marshaller = jaxbContext.createMarshaller();
            // Simulate some work with the marshaller
            XyzRequest request = new XyzRequest();
            request.setSystemId("abc" + i);
            StringWriter sw = new StringWriter();
            marshaller.marshal(request, sw);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    long endTime = System.nanoTime();
    return endTime - startTime;
}
The jaxbContext instance is created once in the main method as we are interested in the performance of creating the Marshaller object.
With Pooling #
The following code uses the pool manager to get a Marshaller object from the pool:
private long runWithPool(int maxRuns) {
    ExecutorService executorService = Executors.newFixedThreadPool(10);
    long startTime = System.nanoTime();
    for (int i = 0; i < maxRuns; i++) {
        final int id = i;
        executorService.execute(() -> {
            Marshaller marshaller = null;
            try {
                // get marshaller from the pool
                marshaller = JaxbPoolManager.INSTANCE.getMarshaller();
                // Simulate some work with the marshaller
                XyzRequest request = new XyzRequest();
                request.setSystemId("abc" + id);
                StringWriter sw = new StringWriter();
                marshaller.marshal(request, sw);
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                // ensure marshaller is returned to the pool
                if (marshaller != null) {
                    JaxbPoolManager.INSTANCE.returnMarshaller(marshaller);
                }
            }
        });
    }
    long endTime = System.nanoTime();
    long elapsedTime = endTime - startTime;
    executorService.shutdown();
    return elapsedTime;
}
Always ensure that the Marshaller object is returned to the pool
To ensure that the Marshaller object is returned to the pool, a finally block is used, so the object goes back even if an exception is thrown. Otherwise, every failed task would leak a Marshaller object, and once all of them have leaked the pool is empty and getMarshaller() blocks indefinitely.
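One way to make the return step hard to forget is to hide the borrow and return inside a single helper method. The sketch below is JDK-only and uses hypothetical names (`SafePool`, `withItem()`); in a JAXB application the pooled type would be Marshaller, and the work function would call marshal().

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Hypothetical generic pool whose callers cannot forget to return an item. */
public class SafePool<T> {
    private final BlockingQueue<T> items;

    public SafePool(List<T> initialItems) {
        this.items = new LinkedBlockingQueue<>(initialItems);
    }

    /** Borrows an item, applies the work, and always returns the item in finally. */
    public <R> R withItem(Function<T, R> work) {
        T item;
        try {
            item = items.take(); // blocks until an item is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a pooled item", e);
        }
        try {
            return work.apply(item);
        } finally {
            items.offer(item); // runs on success and on failure
        }
    }

    public int available() {
        return items.size();
    }

    public static void main(String[] args) {
        SafePool<StringBuilder> pool = new SafePool<>(List.of(new StringBuilder()));
        try {
            pool.withItem(sb -> { throw new IllegalStateException("marshal failed"); });
        } catch (IllegalStateException expected) {
            // the failure propagates, but the item is already back in the pool
        }
        System.out.println("available: " + pool.available()); // prints "available: 1"
    }
}
```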
Pooling vs Non-Pooling Results #
Here are the results. Times are in nanoseconds.
| Iterations | Non-Pooling (ns) | Pooling (ns) |
|---|---|---|
| 10000 | 102,529,731 | 1,320,463 |
| 20000 | 80,457,435 | 2,670,014 |
| 30000 | 107,754,668 | 2,798,352 |
| 40000 | 133,273,640 | 1,461,262 |
| 50000 | 156,965,485 | 1,878,108 |
| 60000 | 193,966,052 | 2,133,248 |
| 70000 | 218,009,450 | 2,051,565 |
| 80000 | 257,850,855 | 2,342,317 |
| 90000 | 281,000,376 | 2,556,567 |
| 100000 | 323,297,950 | 2,933,371 |
So again, there is a significant performance improvement when using a pool of Marshaller objects. The pooled version stays below 3 ms, while the non-pooled version takes roughly 80 to 320 ms, which is around two orders of magnitude slower.
Further tuning is possible, such as increasing the pool size. The tests did no real work and so held each Marshaller object only briefly, whereas longer-running tasks would keep Marshaller objects checked out for longer.
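When reading these numbers, note that runWithPool() reads the end time right after the tasks are submitted, before the executor has necessarily finished running them. If the goal is to include task completion in the measurement, one option is to shut the executor down and wait for termination before stopping the clock. The following is a minimal, JDK-only sketch under that assumption; `TimedExecutorRun` and `timeTasks()` are hypothetical names, and a counter stands in for the marshalling work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class TimedExecutorRun {
    /** Runs the task maxRuns times on a fixed pool; returns elapsed ns including task completion. */
    static long timeTasks(int maxRuns, int threads, Runnable task) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        long startTime = System.nanoTime();
        for (int i = 0; i < maxRuns; i++) {
            executor.execute(task);
        }
        executor.shutdown(); // stop accepting new tasks; queued tasks still run
        try {
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                executor.shutdownNow();
                throw new IllegalStateException("Tasks did not finish within the timeout");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        }
        return System.nanoTime() - startTime;
    }

    public static void main(String[] args) {
        AtomicInteger completed = new AtomicInteger();
        long elapsed = timeTasks(10_000, 10, completed::incrementAndGet);
        System.out.println("completed " + completed.get() + " tasks in " + elapsed + " ns");
    }
}
```

Measured this way, the figure also includes thread start-up and scheduling overhead, so it is not directly comparable with the single-threaded runWithoutPool() timing either; running both variants through the same executor would give a closer comparison.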
Conclusion #
In conclusion, caching and pooling can significantly improve performance in Java applications.
Points to note are:
- JAXBContext instances are thread-safe.
- Cache the JAXBContext instance.
- Marshaller and Unmarshaller objects are not thread-safe.
- Use a pool of Marshaller and Unmarshaller objects.