0x26res/ptars

ptars

Protobuf to Arrow, using Rust

Example

Take a protobuf:

message SearchRequest {
  string query = 1;
  int32 page_number = 2;
  int32 result_per_page = 3;
}

And convert serialized messages directly to pyarrow.RecordBatch:

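from ptars import HandlerPool

# SearchRequest is the Python class generated by protoc from the message above;
# import it from your generated *_pb2 module.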

messages = [
    SearchRequest(
        query="protobuf to arrow",
        page_number=0,
        result_per_page=10,
    ),
    SearchRequest(
        query="protobuf to arrow",
        page_number=1,
        result_per_page=10,
    ),
]
payloads = [message.SerializeToString() for message in messages]

pool = HandlerPool([SearchRequest.DESCRIPTOR.file])
handler = pool.get_for_message(SearchRequest.DESCRIPTOR)
record_batch = handler.list_to_record_batch(payloads)

The resulting record batch contains:

query              page_number  result_per_page
protobuf to arrow  0            10
protobuf to arrow  1            10
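
Conceptually, this conversion pivots a list of row-oriented messages into one column per field. The sketch below illustrates that pivot with plain Python dictionaries standing in for decoded messages. `rows_to_columns` is a hypothetical helper written for illustration only; it is not part of ptars, which performs the conversion in Rust and produces Arrow arrays.

```python
def rows_to_columns(rows, field_names):
    """Pivot row-oriented records into one list per field (columnar layout)."""
    columns = {name: [] for name in field_names}
    for row in rows:
        for name in field_names:
            # In this sketch, a missing key is stored as None.
            columns[name].append(row.get(name))
    return columns


# Plain dicts standing in for the two SearchRequest messages above.
rows = [
    {"query": "protobuf to arrow", "page_number": 0, "result_per_page": 10},
    {"query": "protobuf to arrow", "page_number": 1, "result_per_page": 10},
]
columns = rows_to_columns(rows, ["query", "page_number", "result_per_page"])
print(columns["page_number"])  # [0, 1]
```

Each key of `columns` corresponds to one column of the record batch shown above.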

You can also convert a pyarrow.RecordBatch back to serialized protobuf messages:

import pyarrow as pa

array: pa.BinaryArray = handler.record_batch_to_array(record_batch)
messages_back: list[SearchRequest] = [
    SearchRequest.FromString(s.as_py()) for s in array
]
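
For intuition about what the serialized payloads in these examples contain, here is a minimal sketch that hand-encodes a `SearchRequest` in the protobuf wire format. It assumes proto3 semantics, where fields left at their default value are not written, and it only handles non-negative integers. `encode_varint` and `encode_search_request` are hypothetical helpers for illustration; in practice `SerializeToString()` does this for you.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_search_request(query: str, page_number: int, result_per_page: int) -> bytes:
    """Hand-encode SearchRequest; each tag byte is (field_number << 3) | wire_type."""
    out = bytearray()
    if query:
        data = query.encode("utf-8")
        out += b"\x0a" + encode_varint(len(data)) + data  # field 1, length-delimited
    if page_number:
        out += b"\x10" + encode_varint(page_number)  # field 2, varint
    if result_per_page:
        out += b"\x18" + encode_varint(result_per_page)  # field 3, varint
    return bytes(out)


payload = encode_search_request("protobuf to arrow", 1, 10)
print(payload.hex())
```

Under the proto3 assumption, the first message in the example (with `page_number=0`) encodes two bytes shorter than the second, because the zero-valued field is omitted.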

Benchmark against protarrow

Ptars is a Rust implementation of protarrow, which is written in pure Python. Compared with protarrow, it is:

  • About 2.5 times faster when converting from proto to Arrow.
  • About 3 times faster when converting from Arrow to proto.

---- benchmark 'to_arrow': 2 tests ----
Name (time in ms)        Mean          
---------------------------------------
protarrow_to_arrow     9.4863 (2.63)   
ptars_to_arrow         3.6009 (1.0)    
---------------------------------------

---- benchmark 'to_proto': 2 tests -----
Name (time in ms)         Mean          
----------------------------------------
protarrow_to_proto     20.8297 (3.20)   
ptars_to_proto          6.5013 (1.0)    
----------------------------------------
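
The ratios in parentheses are each mean divided by the fastest mean in its group. As a quick check, the short sketch below recomputes them from the timings reported above; the dictionary layout is just an illustrative choice.

```python
# Mean timings in milliseconds, copied from the benchmark output above.
mean_ms = {
    "to_arrow": {"protarrow": 9.4863, "ptars": 3.6009},
    "to_proto": {"protarrow": 20.8297, "ptars": 6.5013},
}

# Speedup of ptars relative to protarrow, rounded to two decimals.
speedups = {
    name: round(t["protarrow"] / t["ptars"], 2) for name, t in mean_ms.items()
}
print(speedups)  # {'to_arrow': 2.63, 'to_proto': 3.2}
```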
