QuickBuffers is a Java implementation of Google's Protocol Buffers that has been developed for low latency use cases in zero-allocation environments. It has no external dependencies, and the API follows Protobuf-Java where feasible to simplify migration.
The main highlights are:
- Allocation-free in steady state. All parts of the API are mutable and reusable.
- No reflection. GraalVM native images and R8/ProGuard obfuscation are supported out of the box
- Faster encoding and decoding speed
- Smaller code size than protobuf-javalite
- Built-in JSON marshalling compliant with the proto3 mapping
- Improved order for optimized sequential memory access
- Optional accessors as an opt-in feature (java8)
QuickBuffers passes all proto2 conformance tests and is compatible with all Java versions from 6 through 20 as well as Android. Proto3 messages can be generated and are wire compatible, but the behavioral differences have not been implemented so far because some proto3 design decisions have kept us from using it. Current limitations include:
- Services are not implemented
- Extensions are embedded directly into the extended message, so support is limited to generation time.
- Well-known proto3 types such as Timestamp and Duration are not special cased in JSON marshalling
- Unsigned integer types are JSON encoded as signed integer numbers
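To illustrate the last limitation: Java has no unsigned primitives, so a uint32 is held in a signed int, and values above 2^31-1 appear negative when printed as a signed number. The following is a minimal stand-alone sketch that uses only the JDK. The class and method names are hypothetical and not part of QuickBuffers; it only shows how a consumer could recover the intended value.

```java
final class UnsignedJsonSketch {

    // What a signed JSON encoder prints for a uint32 that is stored in an int
    static String signedJson(int stored) {
        return Integer.toString(stored);
    }

    // How a consumer can recover the intended unsigned value (Java 8+)
    static long recoverUint32(int stored) {
        return Integer.toUnsignedLong(stored);
    }

    public static void main(String[] args) {
        int stored = (int) 4294967295L; // uint32 max shares its bit pattern with -1
        System.out.println(signedJson(stored));
        System.out.println(recoverUint32(stored));
    }
}
```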
In order to use QuickBuffers you need to generate messages and add the corresponding runtime dependency. The runtime can be found at the Maven coordinates below.
<properties>
  <quickbuf.version>1.2</quickbuf.version>
  <quickbuf.options>indent=4,allocation=lazy,extensions=embedded</quickbuf.options>
</properties>
<dependency>
  <groupId>us.hebi.quickbuf</groupId>
  <artifactId>quickbuf-runtime</artifactId>
  <version>${quickbuf.version}</version>
</dependency>
The message generator protoc-gen-quickbuf is set up as a plugin for the protocol buffers compiler protoc. You can install one of the pre-built packages and run:
protoc-quickbuf --quickbuf_out=<options>:<outputDir> <protoFiles>
or use a protoc-gen-quickbuf-${version}-${arch}.exe plugin binary with an absolute path:
protoc --plugin=protoc-gen-quickbuf=${exePath} --quickbuf_out=<options>:<outputDir> <protoFiles>
or build messages in Maven using the protoc-jar-maven-plugin:
<!-- Downloads protoc w/ plugin and generates messages -->
<!-- Default settings expect .proto files to be in src/main/protobuf -->
<plugin>
  <groupId>com.github.os72</groupId>
  <artifactId>protoc-jar-maven-plugin</artifactId>
  <version>3.11.4</version>
  <executions>
    <execution>
      <phase>generate-sources</phase>
      <goals>
        <goal>run</goal>
      </goals>
      <configuration>
        <protocVersion>3.21.12</protocVersion>
        <outputTargets>
          <outputTarget>
            <type>quickbuf</type>
            <pluginArtifact>us.hebi.quickbuf:protoc-gen-quickbuf:${quickbuf.version}</pluginArtifact>
            <outputOptions>${quickbuf.options}</outputOptions>
          </outputTarget>
        </outputTargets>
      </configuration>
    </execution>
  </executions>
</plugin>
The generator features several options that can be supplied as a comma-separated list. The default values are marked bold.
Option | Value | Description |
---|---|---|
indent | 2, 4, 8, tab | sets the indentation in generated files |
replace_package | (pattern)=replacement | replaces the Java package of the generated messages to avoid name collisions with messages generated by --java_out. |
input_order | quickbuf, number, none | improves decoding performance when parsing messages that were serialized in a known order. number matches protobuf-java, and none disables this optimization (not recommended). |
output_order | quickbuf, number | number matches protobuf-java serialization to pass conformance tests that require binary equivalence (not recommended). |
store_unknown_fields | false, true | generates code to retain unknown fields that were encountered during parsing. This allows messages to be routed without losing information, even if the schema is not fully known. Unknown fields are stored in binary form and are ignored in equality checks. |
enforce_has_checks | false, true | throws an exception when accessing fields that were not set |
allocation | eager, lazy, lazymsg | changes the allocation strategy for nested types. eager allocates up-front and results in fewer runtime-allocations, but it may be wasteful and prohibits recursive type declarations. lazy waits until the field is actually needed. lazymsg acts lazy for nested messages, and eager for everything else. |
extensions | disabled, embedded | embedded adds extensions from within a single protoc call directly to the extended message. This requires extensions to be known at generation time. Some plugins may do a separate request per file, so it may require an import to combine multiple files. |
java8_optional | false, true | creates tryGet methods that are short for return hasField() ? Optional.of(getField()) : Optional.empty() . Requires a runtime with Java 8 or higher. |
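
As an illustration of the java8_optional row, the sketch below hand-writes the accessor shape that the option describes. It is not generated code: the class OptionalAccessorSketch and its name field are hypothetical, and only java.util.Optional from the JDK is used.

```java
import java.util.Optional;

final class OptionalAccessorSketch {
    private boolean hasName;
    private String name = "";

    OptionalAccessorSketch setName(String value) {
        this.name = value;
        this.hasName = true; // setting a value also sets the has flag
        return this;
    }

    boolean hasName() {
        return hasName;
    }

    String getName() {
        return name;
    }

    // Shorthand for the has/get pair, in the spirit of the tryGet accessors
    Optional<String> tryGetName() {
        return hasName() ? Optional.of(getName()) : Optional.empty();
    }

    public static void main(String[] args) {
        OptionalAccessorSketch msg = new OptionalAccessorSketch();
        System.out.println(msg.tryGetName().isPresent());
        System.out.println(msg.setName("quickbuf").tryGetName().get());
    }
}
```

Note that an Optional is allocated on each call in this sketch, so such accessors are a convenience rather than part of an allocation-free path.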
We tried to keep the public API as close to Google's protobuf-java as possible, so most use cases should require very few changes. The Java related file options are all supported and behave the same way.
// .proto definition
message RootMessage {
  optional string text = 1;
  optional NestedMessage nested_message = 2;
  repeated Person people_list = 3;
}
message NestedMessage {
  optional double value = 1;
}
message Person {
  optional uint32 id = 1;
  optional string name = 2;
}
The main difference is that there are no extra builder classes and that all message contents are mutable. The getMutable() accessors set the has flag and provide access to the nested references.
// Use fluent-style to set values
RootMessage msg = RootMessage.newInstance()
    .setText("Hello World");
// Use getMutable() to set nested messages
msg.getMutableNestedMessage()
    .setValue(1.0);
// Write repeated values into the internally allocated list
RepeatedMessage<Person> people = msg.getMutablePeopleList().reserve(4);
for (int i = 0; i < 4; i++) {
    Person person = people.next()
        .setId(i)
        .setName("person " + i);
}
Messages can be read from a ProtoSource and written to a ProtoSink. newInstance instantiates optimized implementations for accessing contiguous blocks of memory such as byte[] and ByteBuffer. Reads and writes do not modify the ByteBuffer state, so positions and limits need to be set manually if needed.
// Convenience wrappers
byte[] buffer = msg.toByteArray();
RootMessage result = RootMessage.parseFrom(buffer);
assertEquals(result, msg);
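
The note about ByteBuffer state can be illustrated without the library. The sketch below assumes a writer that uses absolute puts, which is one way to leave a buffer's position and limit untouched; writeAbsolute is a hypothetical stand-in and not the QuickBuffers sink. The caller then adjusts the limit by hand so that a reader sees exactly the written bytes.

```java
import java.nio.ByteBuffer;

final class ByteBufferStateSketch {

    // Stand-in for a sink: writes with absolute puts and reports the byte count
    static int writeAbsolute(ByteBuffer target, byte[] payload) {
        int start = target.position();
        for (int i = 0; i < payload.length; i++) {
            target.put(start + i, payload[i]); // absolute put, position is unchanged
        }
        return payload.length;
    }

    // Writes three bytes and then exposes exactly those bytes to a reader
    static ByteBuffer writeAndLimit() {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        int written = writeAbsolute(buffer, new byte[]{1, 2, 3});
        // position is still 0 and limit is still 16, so the caller adjusts them
        buffer.limit(buffer.position() + written);
        return buffer;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = writeAndLimit();
        System.out.println(buffer.position() + " " + buffer.remaining());
    }
}
```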
The internal state can be reset with the setInput and setOutput methods. ProtoMessage::getSerializedSize sets an internally cached size, so it should always be called before serialization if there were any changes.
// Reusable objects
byte[] buffer = new byte[512];
ProtoSink sink = ProtoSink.newArraySink();
ProtoSource source = ProtoSource.newArraySource();
// Stream messages
for (int i = 0; i < 100; i++) {
    int length = msg.getSerializedSize();
    msg.writeTo(sink.setOutput(buffer, 0, length));
    result.clearQuick().mergeFrom(source.setInput(buffer, 0, length));
}
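
Why the size should be recomputed after changes can be shown with a toy model. The sketch below is not the QuickBuffers implementation: CachedSizeSketch is a hypothetical message with a single non-negative int32 field (field number 1), and it assumes that setters leave the cached size untouched, so the cache is stale until getSerializedSize() is called again.

```java
final class CachedSizeSketch {
    private int value;           // hypothetical non-negative int32 field, number 1
    private int cachedSize = -1; // -1 means the size was never computed

    CachedSizeSketch setValue(int value) {
        this.value = value; // deliberately does not update cachedSize
        return this;
    }

    // Recomputes and caches the size: 1 tag byte plus the varint payload
    int getSerializedSize() {
        cachedSize = 1 + varintSize(value);
        return cachedSize;
    }

    // What a serializer relying on the cache would see
    int getCachedSize() {
        return cachedSize;
    }

    // Number of bytes needed to varint-encode a non-negative value
    static int varintSize(int v) {
        int size = 1;
        while ((v & ~0x7F) != 0) {
            size++;
            v >>>= 7;
        }
        return size;
    }

    public static void main(String[] args) {
        CachedSizeSketch msg = new CachedSizeSketch().setValue(1);
        msg.getSerializedSize();
        msg.setValue(300); // needs one more byte than before
        System.out.println("stale: " + msg.getCachedSize());
        System.out.println("fresh: " + msg.getSerializedSize());
    }
}
```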
Additionally, there are (non-optimized) convenience wrappers for InputStream, OutputStream, and ByteBuffer.
ProtoSink.newInstance(new ByteArrayOutputStream());
ProtoSource.newInstance(new ByteArrayInputStream(bytes));
Keep in mind that mutability comes at the cost of thread-safety, so contents should be cloned with ProtoMessage::clone or copied with ProtoMessage::copyFrom before being passed to another thread.
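
A minimal sketch of such a hand-off is shown below. Sample is a hypothetical stand-in for a generated message and only mimics the copyFrom idea; the point is that the worker receives its own copy, so the producer can keep reusing its instance.

```java
final class HandOffSketch {

    // Hypothetical stand-in for a generated mutable message
    static final class Sample {
        private int id;

        Sample setId(int id) {
            this.id = id;
            return this;
        }

        int getId() {
            return id;
        }

        Sample copyFrom(Sample other) {
            this.id = other.id;
            return this;
        }
    }

    // Returns the id observed by the worker thread
    static int handOff() {
        Sample shared = new Sample().setId(1);
        Sample snapshot = new Sample().copyFrom(shared); // private copy for the worker
        int[] seen = new int[1];
        Thread worker = new Thread(() -> seen[0] = snapshot.getId());
        worker.start();
        shared.setId(2); // the producer keeps mutating its own instance
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(handOff());
    }
}
```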
Direct Source/Sink
Depending on platform support for sun.misc.Unsafe, the DirectSource and DirectSink implementations allow working with off-heap memory. This is intended for reducing unnecessary memory copies when working with direct NIO buffers. Besides not needing to copy data, there is no performance benefit compared to working with heap arrays.
// Write to direct buffer
ByteBuffer directBuffer = ByteBuffer.allocateDirect(msg.getSerializedSize());
ProtoSink directSink = ProtoSink.newDirectSink();
msg.writeTo(directSink.setOutput(directBuffer));
directBuffer.limit(directSink.getTotalBytesWritten());
// Read from direct buffer
ProtoSource directSource = ProtoSource.newDirectSource();
RootMessage result = RootMessage.parseFrom(directSource.setInput(directBuffer));
assertEquals(msg, result);
JSON Source/Sink
ProtoMessages also support reading from and writing to JSON as specified in the proto3 mapping.
// Set some contents
RootMessage msg = RootMessage.newInstance();
msg.setText("π QuickBuffers \uD83D\uDC4D");
msg.getMutablePeopleList().next()
    .setId(0)
    .setName("First Name");
msg.getMutablePeopleList().next()
    .setId(1)
    .setName("Last Name");
// Print as prettified json
System.out.println(msg);
The default toString method for all messages returns prettified json. The above prints:
{
  "text": "π QuickBuffers 👍",
  "peopleList": [{
    "id": 0,
    "name": "First Name"
  }, {
    "id": 1,
    "name": "Last Name"
  }]
}
More fine-grained control is exposed via the JsonSink and JsonSource interfaces.
// json options
JsonSink sink = JsonSink.newInstance()
    .setPrettyPrinting(false)
    .setWriteEnumsAsInts(false)
    .setPreserveProtoFieldNames(false);
// use ProtoMessage::writeTo or JsonSink::writeMessage to serialize the contents
msg.writeTo(sink.clear());
RepeatedByte bytes = sink.getBytes();
// use ProtoMessage::parseFrom or JsonSource::parseMessage to parse the contents
RootMessage result = JsonSource.newInstance(bytes)
    .setIgnoreUnknownFields(true)
    .parseMessage(RootMessage.getFactory());
Parts can be combined to convert an incoming protobuf stream to outgoing JSON and vice versa:
msg.clearQuick()
    .mergeFrom(protoSource.setInput(input))
    .writeTo(jsonSink.clear());
The default implementation encodes the minimal representation accepted by the protobuf spec, i.e., floating point numbers do not append a trailing zero, and long integers are encoded without quotes. Alternative implementations based on GSON and Jackson can be found in the quickbuf-compat artifact.
Note that the built-in JsonSink has been optimized quite a bit, but the JsonSource is very barebones due to a lack of an internal use case for JSON decoding.
The project can be built with mvn package using JDK 8 through JDK 20. mvn clean package --projects generator,runtime -am omits building the benchmarks.
Note that the package goal is always required, and that mvn clean test is not sufficient. This limitation is introduced by the plugin mechanism of protoc, which exchanges information with plugins via protobuf messages on stdin and stdout. Using stdin makes it comparatively easy to get schema information, but it is quite difficult to set up unit tests and debug plugins during development. To enable standard tests, the parser module contains a tiny protoc plugin that stores the raw request from stdin inside a file that can be loaded during testing and development of the actual generator plugin. This makes the generator module dependent on the packaged output of the parser module.
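
The idea behind such a request-capturing plugin can be sketched with the JDK alone. The code below is an illustration and not the parser module's actual implementation; the class RequestDumpSketch and its methods are hypothetical. A real plugin would pass System.in to dumpRequest, whereas the sketch uses in-memory bytes so that it runs stand-alone.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class RequestDumpSketch {

    // Copies the raw request bytes to a file and returns the number of bytes stored
    static long dumpRequest(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Stores the bytes in a temporary file and reads them back, as a test would
    static byte[] roundTrip(byte[] rawRequest) {
        try {
            Path file = Files.createTempFile("request", ".bin");
            try {
                dumpRequest(new ByteArrayInputStream(rawRequest), file);
                return Files.readAllBytes(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A real plugin would pass System.in instead of in-memory bytes
        byte[] stored = roundTrip(new byte[]{0x0A, 0x03, 'a', 'b', 'c'});
        System.out.println(stored.length);
    }
}
```

Once the request is available as a file, the generator can be exercised from ordinary unit tests without launching protoc.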
All nested object types such as message or repeated fields have getField() and getMutableField() accessors. Both return the same internal storage object, but getField() should be considered read-only. Once a field is cleared, it should also no longer be modified.
All primitive values generate the same accessors and behavior as Protobuf-Java's Builder classes:
// .proto
message SimpleMessage {
  optional int32 primitive_value = 1;
}
// simplified generated code
public final class SimpleMessage {
    public SimpleMessage setPrimitiveValue(int value);
    public SimpleMessage clearPrimitiveValue();
    public boolean hasPrimitiveValue();
    public int getPrimitiveValue();
    private int primitiveValue;
}
Nested message types are allocated internally. The recommended way to set nested message content is by accessing the internal store with getMutableNestedMessage(). Setting content using setNestedMessage(NestedMessage.newInstance()) copies the data, but does not change the internal reference.
// .proto
message NestedMessage {
  optional int32 primitive_value = 1;
}
message RootMessage {
  optional NestedMessage nested_message = 1;
}
// simplified generated code
public final class RootMessage {
    public RootMessage setNestedMessage(NestedMessage value); // copies contents to internal message
    public RootMessage clearNestedMessage(); // clears has bit as well as the backing object
    public boolean hasNestedMessage();
    public NestedMessage getNestedMessage(); // internal message -> treat as read-only
    public NestedMessage getMutableNestedMessage(); // internal message -> may be modified until has state is cleared
    private final NestedMessage nestedMessage = NestedMessage.newInstance();
}
RootMessage msg = RootMessage.newInstance();
// (1) setting nested values via 'set' (does a data copy!)
msg.setNestedMessage(NestedMessage.newInstance().setPrimitiveValue(0));
// (2) modify the internal store directly (recommended)
msg.getMutableNestedMessage().setPrimitiveValue(0);
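
The difference between the two approaches can be modeled in a few lines. The sketch below is a simplified, hypothetical model (the Root and Nested classes are illustrative, not generated code) of a final internal reference that is copied into by the setter and handed out by the mutable getter.

```java
final class NestedCopySketch {

    static final class Nested {
        private int value;

        Nested setValue(int value) {
            this.value = value;
            return this;
        }

        int getValue() {
            return value;
        }

        Nested copyFrom(Nested other) {
            this.value = other.value;
            return this;
        }
    }

    static final class Root {
        private final Nested nested = new Nested(); // allocated once, never replaced
        private boolean hasNested;

        // Copies the contents; the external object is not retained
        Root setNested(Nested value) {
            nested.copyFrom(value);
            hasNested = true;
            return this;
        }

        // Sets the has flag and exposes the internal object for in-place edits
        Nested getMutableNested() {
            hasNested = true;
            return nested;
        }

        Nested getNested() {
            return nested;
        }

        boolean hasNested() {
            return hasNested;
        }
    }

    public static void main(String[] args) {
        Nested external = new Nested().setValue(1);
        Root root = new Root().setNested(external);
        external.setValue(2); // later changes to the external object are not visible
        System.out.println(root.getNested().getValue());
        System.out.println(root.getNested() == external);
    }
}
```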
String types are internally stored as Utf8String that are lazily parsed and can be set with CharSequence. Since Java String objects are immutable, there are additional access methods to allow for decoding characters into a reusable StringBuilder instance, as well as for using a custom Utf8Decoder that can implement interning.
// .proto
message SimpleMessage {
  optional string optional_string = 2;
}
// simplified generated code
public final class SimpleMessage {
    public SimpleMessage setOptionalString(CharSequence value);
    public SimpleMessage clearOptionalString(); // sets length = 0
    public boolean hasOptionalString();
    public String getOptionalString(); // lazily converted string
    public Utf8String getOptionalStringBytes(); // internal representation -> treat as read-only
    public Utf8String getMutableOptionalStringBytes(); // internal representation -> may be modified until has state is cleared
    private final Utf8String optionalString = Utf8String.newEmptyInstance();
}
// Get characters
SimpleMessage msg = SimpleMessage.newInstance().setOptionalString("my-text");
StringBuilder chars = new StringBuilder();
msg.getOptionalStringBytes().getChars(chars); // chars now contains "my-text"
Repeated scalar fields work mostly the same as String fields, but the internal array() can be accessed directly if needed. Repeated messages and object types provide a next() method that adds one element and provides a mutable reference to it.
// .proto
message SimpleMessage {
  repeated double repeated_double = 42;
}
// simplified generated code
public final class SimpleMessage {
    public SimpleMessage addRepeatedDouble(double value); // adds one value
    public SimpleMessage addAllRepeatedDouble(double... values); // adds N values
    public SimpleMessage clearRepeatedDouble(); // sets length = 0
    public boolean hasRepeatedDouble();
    public RepeatedDouble getRepeatedDouble(); // internal store -> treat as read-only
    public RepeatedDouble getMutableRepeatedDouble(); // internal store -> may be modified
    private final RepeatedDouble repeatedDouble = RepeatedDouble.newEmptyInstance();
}
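
How a next() style container can stay allocation-free in steady state is sketched below. This is an assumption about the general technique and not the library's RepeatedMessage implementation: ReusableListSketch is a hypothetical class that keeps its elements when cleared and hands them out again. In this sketch the caller is responsible for resetting a recycled element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class ReusableListSketch<T> {
    private final List<T> pool = new ArrayList<>(); // elements survive clear()
    private final Supplier<T> factory;
    private int length = 0;

    ReusableListSketch(Supplier<T> factory) {
        this.factory = factory;
    }

    // Adds one element and returns a mutable reference to it
    T next() {
        if (length == pool.size()) {
            pool.add(factory.get()); // allocates only while the pool is growing
        }
        return pool.get(length++);
    }

    T get(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return pool.get(index);
    }

    int length() {
        return length;
    }

    // Resets the length but keeps the pooled elements for reuse
    ReusableListSketch<T> clear() {
        length = 0;
        return this;
    }

    public static void main(String[] args) {
        ReusableListSketch<StringBuilder> list = new ReusableListSketch<>(StringBuilder::new);
        StringBuilder first = list.next().append("a");
        list.clear();
        StringBuilder reused = list.next();
        reused.setLength(0); // caller resets the recycled element
        System.out.println(first == reused);
    }
}
```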
No reflection is used, so none of the fields need to be preserved or special cased. However, ProGuard may warn about missing methods when obfuscating against an older runtime. This is related to an intentional workaround, so the warnings can be disabled by adding the line below to the proguard.conf file. R8 should automatically pick it up from the bundled config file.
-dontwarn us.hebi.quickbuf.JdkMethods
Many internals and large parts of the generated API are based on Protobuf-Java. The encoding of floating point numbers during JSON serialization is based on Schubfach [Giu2020]. Many other JSON parts were inspired by dsl-json, jsoniter, and jsoniter-scala.