Thrift vs Protocol Buffers vs JSON

Monday, 01 June 2009

Note: This article is a work in progress. If there is anything that needs correcting, please let me know by leaving a comment.

Originally this comparison included a look at JSON. Although JSON is small, lightweight, fast to transmit and easily serialized/de-serialized, it is disqualified simply on the basis that there is no built-in way to allow versioning of objects. Both Protobuf and Thrift allow some type of data versioning so that both clients and servers can continue to work without being upgraded, even if the protocol has changed. This is handy when rolling out a new protocol since there’s no need to orchestrate a massive protocol update across services before flipping the switch on a new protocol. Existing services will simply act on the parts of the data they understand while retaining and passing-through whatever other data is associated with the object they’re manipulating. Eishay Smith has a great demo of this functionality using Protobuf.
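To make the pass-through idea concrete, here is a minimal Python sketch. It is a toy model, not the wire format or API of either library: records are plain dicts keyed by numeric field tags, and the schema names (`V1_FIELDS`, `V2_FIELDS`) and helper functions are made up for illustration. An old service that only knows the v1 schema edits the fields it understands and hands everything else back untouched.

```python
# Toy illustration only: a record is a dict mapping numeric field tags to values.
# Neither Thrift nor Protobuf uses this exact representation.

V1_FIELDS = {1: "name", 2: "email"}              # schema known to an old service
V2_FIELDS = {1: "name", 2: "email", 3: "phone"}  # newer schema adds tag 3


def decode(record, known_fields):
    """Split a tagged record into fields this reader understands and the rest."""
    known = {known_fields[tag]: value
             for tag, value in record.items() if tag in known_fields}
    unknown = {tag: value
               for tag, value in record.items() if tag not in known_fields}
    return known, unknown


def encode(known, unknown, known_fields):
    """Rebuild the tagged record, re-attaching unknown fields untouched."""
    tag_of = {name: tag for tag, name in known_fields.items()}
    record = {tag_of[name]: value for name, value in known.items()}
    record.update(unknown)
    return record


# A new client sends a v2 record through a service that only knows v1.
incoming = {1: "Ada", 2: "ada@example.com", 3: "555-0100"}
known, unknown = decode(incoming, V1_FIELDS)
known["email"] = "ada@example.org"   # the old service edits a field it understands
outgoing = encode(known, unknown, V1_FIELDS)

# A v2 reader further downstream still sees the phone number.
final, leftover = decode(outgoing, V2_FIELDS)
print(final)
```

The point of the sketch is that fields are identified by stable numeric tags rather than by position, so a reader can skip a tag it does not recognise without losing it.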

The Comparison

| Feature | Thrift | Protobuf |
| --- | --- | --- |
| Language Bindings | Java, C++, Python, C#, Cocoa, Erlang, Haskell, OCaml, Perl, PHP, Ruby, Smalltalk | Java, C++, Python |
| Primitive Types | bool, byte, 16/32/64-bit integers, double, string, byte sequence, map<t1,t2>, list<t>, set<t> | bool, 32/64-bit integers, float, double, string, byte sequence, “repeated” properties act like lists |
| Enumerations | Yes | Yes |
| Constants | Yes | No |
| Composite Type | Struct | Message |
| Exception Handling | Yes | No |
| Documentation | Lacking | Good |
| License | Apache | BSD-style |
| Compiler | C++ | C++ |
| RPC Interfaces | Yes | Yes |
| RPC Implementation | Yes | No |
| Composite Type Extensions | No | Yes |
| Data Versioning | Yes | Yes |
| Pros | More languages supported out of the box; richer data structures than Protobuf (e.g.: Map and Set); includes RPC implementation for services | Slightly faster than Thrift when using "optimize_for = SPEED"; serialized objects slightly smaller than Thrift due to more aggressive data compression; better documentation; API a bit cleaner than Thrift |
| Cons | Good examples are hard to find; missing/incomplete documentation | .proto can define services, but no RPC implementation is defined (although stubs are generated for you) |

Much of this table was originally compiled by Stuart Sierra but has been edited to include additional information relevant to my own requirements.
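One piece of the size difference noted in the table is how integers are written. Protobuf encodes integers as base-128 varints, so small values take fewer bytes. The sketch below compares that against a fixed-width 32-bit encoding, which is my understanding of what Thrift's default binary protocol uses; treat that part as an assumption. The function names are mine and this is not either library's actual code.

```python
import struct


def encode_varint(n):
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: this is the last byte
            return bytes(out)


def decode_varint(data):
    """Decode a single varint from the start of a byte string."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


def encode_fixed32(n):
    """Encode a signed 32-bit integer as four big-endian bytes."""
    return struct.pack(">i", n)


for value in (1, 300, 70000):
    print(value, len(encode_varint(value)), len(encode_fixed32(value)))
```

With this scheme 300 becomes the two bytes 0xAC 0x02, while the fixed-width form always costs four bytes. For values that need all 32 bits the varint is actually longer (five bytes), so the saving depends on the data.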

Thrift and Protocol Buffers are both great choices, and there seems to be no clear winner between them. Since they have more in common than they do in conflict, the decision really comes down to real-world, application-specific needs.

I’d choose Protocol Buffers over Thrift if:

  • You’re only using Java, C++ or Python. Experimental support for other languages is being developed by third parties but is generally not considered ready for production use
  • You already have an RPC implementation
  • On-the-wire data size is crucial
  • The lack of any real documentation is scary to you

I’d choose Thrift over Protocol Buffers if:

  • Your language requirements are anything but Java, C++ or Python. See above.
  • You need additional data structures like Map and Set
  • You want a full client/server RPC implementation built-in
  • You’re a Rock Star programmer that doesn’t need documentation or examples
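On the Map and Set point: if you go with Protobuf anyway, the usual workaround is to model these with list-like ("repeated") fields and convert on each side. Here is a rough Python sketch of that idea. `Entry` and `Inventory` are hypothetical stand-ins for generated message classes, not anything Protobuf actually produces.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Entry:
    """Hypothetical key/value pair message."""
    key: str
    value: int


@dataclass
class Inventory:
    """Hypothetical message that only has list-like fields."""
    counts: List[Entry] = field(default_factory=list)  # stands in for map<string, i32>
    tags: List[str] = field(default_factory=list)      # stands in for set<string>


def from_native(counts: Dict[str, int], tags: Set[str]) -> Inventory:
    """Flatten a dict and a set into list fields (sorted for stable output)."""
    return Inventory(
        counts=[Entry(k, v) for k, v in sorted(counts.items())],
        tags=sorted(tags),
    )


def to_native(msg: Inventory) -> Tuple[Dict[str, int], Set[str]]:
    """Rebuild the dict and set on the receiving side."""
    return {e.key: e.value for e in msg.counts}, set(msg.tags)


msg = from_native({"widgets": 3, "gadgets": 7}, {"fragile", "heavy"})
print(to_native(msg))
```

The catch is that nothing in the message itself enforces unique keys or unique set members. That check becomes your code's job, whereas Thrift gives you the container types directly.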
