Note: This article is a work in progress. If anything needs correcting, please let me know by leaving a comment.
Originally this comparison included a look at JSON. Although JSON is small, lightweight, fast to transmit and easily serialized and deserialized, it is disqualified because it has no built-in way to version objects. Both Protobuf and Thrift support a form of data versioning, so clients and servers can keep working without being upgraded even after the protocol has changed. This is handy when rolling out a new protocol version, since there’s no need to orchestrate a massive update across services before flipping the switch. Existing services simply act on the parts of the data they understand, while retaining and passing through whatever other data is attached to the object they’re manipulating. Eishay Smith has a great demo of this functionality using Protobuf.
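To make the pass-through idea concrete, here is a minimal sketch in Python. It uses a made-up tagged encoding (a field number, a length, then the payload), not the actual Protobuf or Thrift wire format, and the field numbers `NAME` and `EMAIL` and the function `v1_uppercase_name` are hypothetical names chosen for illustration. The point is only that a reader which keeps every field it decodes, including ones it has no meaning for, can re-encode a newer message without losing data.

```python
import struct

HEADER = ">HI"  # toy header: 2-byte field number, 4-byte payload length
HEADER_SIZE = struct.calcsize(HEADER)


def encode(fields):
    """Encode {field_number: payload_bytes} as a sequence of tagged records."""
    out = bytearray()
    for number, payload in sorted(fields.items()):
        out += struct.pack(HEADER, number, len(payload)) + payload
    return bytes(out)


def decode(data):
    """Decode every record, whether or not the caller knows the field number."""
    fields = {}
    offset = 0
    while offset < len(data):
        number, length = struct.unpack_from(HEADER, data, offset)
        offset += HEADER_SIZE
        fields[number] = data[offset:offset + length]
        offset += length
    return fields


# Hypothetical schema: version 1 only knows NAME, version 2 adds EMAIL.
NAME, EMAIL = 1, 2


def v1_uppercase_name(data):
    """An 'old' service: changes the field it understands, keeps the rest."""
    fields = decode(data)
    fields[NAME] = fields[NAME].upper()
    return encode(fields)  # unknown fields are written back untouched


v2_message = encode({NAME: b"alice", EMAIL: b"alice@example.com"})
result = decode(v1_uppercase_name(v2_message))
```

Here `result` still carries the `EMAIL` field even though the version 1 service never heard of it. Plain JSON can be handled the same way by convention, but nothing in the format assigns stable field identities for you.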
|Feature||Thrift||Protocol Buffers|
|Language Bindings||Java, C++, Python, C#, Cocoa, Erlang, Haskell, OCaml, Perl, PHP, Ruby, Smalltalk||Java, C++, Python|
|Primitive Types||bool, byte, 16/32/64-bit integers, double, string, byte sequence, map<t1,t2>, list<t>, set<t>||bool, 32/64-bit integers, float, double, string, byte sequence, “repeated” properties act like lists|
|Composite Type Extensions||No||Yes|
Thrift pros:
- More languages supported out of the box
- Richer data structures than Protobuf (e.g. Map and Set)
- Includes an RPC implementation for services
Protocol Buffers pros:
- Slightly faster than Thrift when using "optimize_for = SPEED"
- Serialized objects slightly smaller than Thrift thanks to a more compact encoding
- Better documentation
- API a bit cleaner than Thrift
Thrift cons:
- Good examples are hard to find
- Missing/incomplete documentation
Protocol Buffers cons:
- .proto can define services, but no RPC implementation is provided (although stubs are generated for you)
Much of this table was originally compiled by Stuart Sierra but has been edited to include additional information relevant to my own requirements.
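One reason Protobuf messages tend to be small is that integers are written as base-128 "varints": seven bits of the value per byte, with the high bit signalling that more bytes follow. The sketch below reimplements that idea in plain Python for non-negative integers. The function names `encode_varint` and `decode_varint` are my own, and this is an illustration of the technique rather than the Protobuf library's code.

```python
def encode_varint(value):
    """Encode a non-negative int using 7 bits per byte, least significant first."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data):
    """Return (value, bytes_consumed) for the varint at the start of data."""
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, index + 1
        shift += 7
    raise ValueError("truncated varint")
```

With this scheme `encode_varint(1)` takes one byte and `encode_varint(300)` takes two, whereas a fixed-width 32-bit integer always takes four. The trade-off is that large values cost more: a value needing all 32 bits takes five bytes.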
Thrift and Protocol Buffers are both great choices, and there seems to be no clear winner between them. Since they have more in common than they have in conflict, it really comes down to the needs of your specific application.
I’d choose Protocol Buffers over Thrift if:
- You’re only using Java, C++ or Python. Experimental support for other languages is being developed by third parties but is generally not considered ready for production use
- You already have an RPC implementation
- On-the-wire data size is crucial
- The lack of any real documentation is scary to you
I’d choose Thrift over Protocol Buffers if:
- You need a language other than Java, C++ or Python (see above)
- You need additional data structures like Map and Set
- You want a full client/server RPC implementation built-in
- You’re a Rock Star programmer who doesn’t need documentation or examples
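If Map and Set are the only things pulling you toward Thrift, it may be worth noting that they can be approximated on top of list-like "repeated" fields. The sketch below models this with plain Python dataclasses rather than generated Protobuf classes. `Entry`, `Inventory`, `counts_as_dict` and `tags_as_set` are hypothetical names for illustration, and the "last entry wins" rule for duplicate keys is an assumption you would have to settle on yourself.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Entry:
    """One key/value pair; a map becomes a repeated list of these."""
    key: str
    value: int


@dataclass
class Inventory:
    """Hypothetical message that only has list-like (repeated) fields."""
    counts: List[Entry] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)


def counts_as_dict(msg: Inventory) -> Dict[str, int]:
    """Rebuild a map from the repeated entries; later duplicates win."""
    return {entry.key: entry.value for entry in msg.counts}


def tags_as_set(msg: Inventory) -> Set[str]:
    """Rebuild a set by dropping duplicate list items."""
    return set(msg.tags)


msg = Inventory(
    counts=[Entry("apple", 1), Entry("pear", 2), Entry("apple", 3)],
    tags=["fruit", "fresh", "fruit"],
)
```

The cost is that uniqueness and lookup semantics live in your own code instead of in the IDL, which is the convenience Thrift's native containers give you.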