Lab 8: Protobuf and gRPC
Due May 5th at 8pm
Introduction
We’ve learned how applications can use socket system calls to communicate with other computers on the network. In particular, a computer can act as either a server or a client for a given connection. The socket system calls help clients connect to servers, and they help servers accept connections. But you as the programmer still have to write all of the logic to handle a connection and decode client requests on the server, as well as the logic to encode client requests into bytes on the client! Building distributed systems would be much easier if we could use a library for this “boilerplate” code, and focus our efforts on implementing the actual service we’re trying to build.
This lab will teach you the basics of Protobuf and gRPC, two libraries designed to make programming of networked services and distributed systems easier. Protobuf (short for “Protocol Buffers”) is a wire format, meaning that it specifies how to encode requests into bytes to send over the network, while gRPC is a code generator for remote procedure call (RPC) code that matches a high-level API specification. Protobuf and gRPC are often coupled together, but they’re actually separate frameworks that make client-server communication easier in different ways.
The lab will demonstrate why these libraries are useful, and prepare you for future assignments that use them.
Protobufs
Protobufs are a message format similar to JSON. Unlike JSON, which is a human-readable text format, Protobufs are encoded into a space-efficient binary representation, so fewer bytes need to be sent over the network. Protobuf messages can represent nested data structures containing primitive values, strings, enums (similar to C enums), and other Protobuf messages. They are encoded by the sender, sent over the network, and decoded by the receiver.
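To get a feel for the space savings, here is a minimal sketch of how a single integer field can be encoded in the Protobuf binary format: one tag byte built from the field number and wire type, followed by the value as a base-128 "varint". The helper names `append_varint` and `encode_int_field` are our own for illustration and are not part of the Protobuf library, and the JSON key `id` in the note below is likewise just an example.

```cpp
#include <cstdint>
#include <string>

// Append `value` as a base-128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last one.
void append_varint(std::string& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Encode one integer field: a tag (field number << 3 | wire type 0),
// followed by the value itself, both written as varints.
std::string encode_int_field(std::uint32_t field_number, std::uint64_t value) {
    std::string out;
    append_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | 0);
    append_varint(out, value);
    return out;
}
```

With this sketch, `encode_int_field(1, 150)` produces the three bytes `08 96 01`, whereas the JSON text `{"id":150}` takes ten bytes.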
What makes Protobufs convenient is that you only write a high-level description of the data you want to encode, and then the Protobuf compiler generates much of the encode/decode logic for you, in the language(s) of your choice. So for example, you could have a C++ server talk to clients written in Java, Go, or Python. As long as both the server and client are using the generated functions to encode/decode the Protobuf messages, they can understand each other!
gRPC
gRPC is a framework to create servers that allow clients to interact with them via a remote procedure call (RPC) interface. You define your server API in a file, and gRPC generates server and client-side code for you. The generated client code is complete, and you use it to make any API request to the server. However, the generated server code is incomplete! Since your server API is just an interface, gRPC leaves the implementation of your API functions up to you, but it does practically everything else for you.
Although gRPC generates server and client code that can read requests and send responses, it needs to base this around a message format. gRPC supports many message formats, but generates different code depending on your choice. If you want your server to use JSON for sending/receiving messages, this will generate different code than if you chose Protobufs or XML instead. The most common message format used with gRPC is Protobufs.
Assignment installation
First, ensure that your repository has a handout remote. Type:
$ git remote show handout
If this reports an error, run:
$ git remote add handout https://github.com/csci1310/cs131-s20-labs.git
Then run:
$ git pull
$ git pull handout master
This will merge our Lab 8 stencil code with your previous work. If you have any “conflicts” from Lab 7, resolve them before continuing further. Run git push to save your work back to your personal repository.
Protobuf by Example
Protobufs (“protos”) are defined in .proto files, whose syntax follows the Protocol Buffer definition language. Let’s look at a simple example that defines a message containing info about a person:
syntax = "proto3";

message Person {
  string name = 1;
  string email = 2;
  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }
  message PhoneNumber {
    string number = 1;
    // proto3 has no [default = ...] option; an unset enum reads as its zero value (MOBILE).
    PhoneType type = 2;
  }
  repeated PhoneNumber phone = 3;
}
Here, a Person message contains up to three fields: a name, an email, and an array of phone numbers. Note that each field has a type, a name, and a unique ID associated with it. Most primitive types like bool, int32, double, and string are supported. Arrays are supported with the repeated keyword, and maps are supported, too. Messages can also be nested, as the PhoneNumber message is nested inside Person.
Still, Protobufs are designed to be simple. In fact, the entire Protobuf language description fits on a single page. Each message is a struct-like data container. Importantly, it is not a class that allows you to add custom methods or logic!
Once you’ve defined your messages, you can run the Protobuf compiler and have it generate data classes for your messages in whatever language(s) you wish. These classes include simple getters and setters, as well as functions to encode and decode the message.
So, for example, if your chosen language is C++, running the compiler on the above .proto file will generate a Person class. You can then use this class in your application to populate, serialize, and retrieve Person protos. You might then write some code like this:
Person p;
p.set_name("John Doe");
p.set_email("jdoe@example.com");
std::string person_str = p.SerializeAsString();
This creates a person, sets their name and email, and gets the compact binary encoding of this Person as a string. Then, if you sent this encoding to someone else, they could decode it:
std::string person_str = "xxxxxx";  // bytes received from the sender
Person person;
// ParseFromString takes the string by reference and returns false on malformed input.
if (person.ParseFromString(person_str)) {
    printf("Name: %s\n", person.name().c_str());
    printf("Email: %s\n", person.email().c_str());
}
Another important thing about Protobufs is that they are designed to be both forwards- and backwards-compatible. Each field is optional and has a unique ID. To see how this achieves compatibility with prior or future versions, let’s add an address field to our Person message:
message Person {
  string name = 1;
  string email = 2;
  ...
  repeated PhoneNumber phone = 3;
  string address = 4;
}
Now suppose your server uses this updated version, but some of your clients do not. This is not an issue. Protobuf messages are roughly encoded as
FieldID1 Data1
FieldID2 Data2
...
FieldIDn DataN
where the “FieldID” consists of the field number and the field’s wire type. What’s more, if a field is empty or not set, it is not encoded. So in reality the encoding might look like
FieldID3 Data3
FieldID7 Data7
FieldID11 Data11
If a client using the old version of our Person proto sends our server a message, then it will be missing the new address field that we added. But the address field – just like any other field – is optional, and the server will need to check whether it’s present.
On the other hand, if the server sends the client a Person message with an address, the old client code will simply ignore this new address field while parsing the message.
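The compatibility argument above can be sketched in a few lines of C++. This is a simplified, hand-written imitation of the wire format that handles string fields only, not the code that the Protobuf compiler generates. The `OldPerson` struct and all helper names are hypothetical. Each field is written as a tag (field number and wire type), a length, and the bytes; a decoder that only knows fields 1 and 2 skips over any other field number it encounters.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the *old* Person message: it only knows
// about field 1 (name) and field 2 (email).
struct OldPerson {
    std::string name;
    std::string email;
};

// Append `value` as a base-128 varint (7 bits per byte, high bit = "more").
void append_varint(std::string& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Append a string field: tag (field number << 3 | wire type 2),
// then the byte length, then the bytes. Empty fields are not encoded.
void append_string_field(std::string& out, std::uint32_t field_number,
                         const std::string& text) {
    if (text.empty()) return;
    append_varint(out, (static_cast<std::uint64_t>(field_number) << 3) | 2);
    append_varint(out, text.size());
    out += text;
}

// Read a varint starting at `pos`; returns false on truncated input.
bool read_varint(const std::string& in, std::size_t& pos, std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
        std::uint8_t byte = static_cast<std::uint8_t>(in[pos++]);
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return true;
    }
    return false;
}

// Decode using the old schema. Unknown field numbers are skipped.
bool parse_old_person(const std::string& in, OldPerson& person) {
    std::size_t pos = 0;
    while (pos < in.size()) {
        std::uint64_t tag = 0, length = 0;
        if (!read_varint(in, pos, tag)) return false;
        if ((tag & 7) != 2) return false;  // this sketch handles strings only
        if (!read_varint(in, pos, length)) return false;
        if (length > in.size() - pos) return false;
        std::string payload = in.substr(pos, static_cast<std::size_t>(length));
        pos += static_cast<std::size_t>(length);
        switch (tag >> 3) {
            case 1: person.name = payload; break;
            case 2: person.email = payload; break;
            default: break;  // e.g. address = 4: unknown to the old schema, ignored
        }
    }
    return true;
}
```

If a newer sender encodes fields 1 and 4, `parse_old_person` still recovers the name, leaves the email empty because it was never sent, and silently skips the address.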
gRPC By Example
In gRPC, a client application can directly call a function on a server application on a different machine as if it were a local function, making it easier for you to create distributed applications and services. gRPC is based around the idea of defining a service. This requires you to specify the methods that are available for clients to call remotely, alongside their parameters and return types.
On the server side, the server implements this interface and runs a gRPC server to handle client calls. On the client side, the generated client code exposes the same methods as the server.
To better understand gRPC, let’s expand our Person example from earlier. Imagine we wanted to create an AddressBook service in the same .proto file. This service will allow you to add a contact, and search for a person in the AddressBook:
service AddressBook {
  rpc AddContact(Person) returns (Empty) {}
  rpc Search(Name) returns (People) {}
}
message Person {
  string name = 1;
  string email = 2;
  ...
}
message People {
  repeated Person people = 1;
}
message Name {
  string name = 1;
}
message Empty {}
Notice that our gRPC service definition is like an interface, with each method’s parameter and return types specified as Protobufs. Our AddContact RPC, for instance, takes a Person message as an input and returns an Empty message. RPC methods must have only one input argument and only one return type. Even if you don’t need an input/return type, you must provide one!
The gRPC Code You Should (And Shouldn’t) Write
Compiling our proto file generates client and server code. The specific code depends on the output language; here we use C++. Our client stub will look like the following:
class Stub final : public StubInterface {
 public:
  ...
  ::grpc::Status AddContact(::grpc::ClientContext* context, const ::protos::Person& request, ::protos::Empty* response) override;
  ::grpc::Status Search(::grpc::ClientContext* context, const ::protos::Name& request, ::protos::People* response) override;
  ...
};
This client stub is a fully-functional client. It implements the StubInterface, and its methods match those in our service definition. For instance, the AddContact method takes in a Person request, and returns an Empty response. It also requires a ClientContext for additional request metadata, and returns a status code, which is like an HTTP response code. Successful requests always return with status OK; other options include INVALID_ARGUMENT, DEADLINE_EXCEEDED, PERMISSION_DENIED, etc.
Our server code, on the other hand, is incomplete. The following server code is generated:
class Service : public ::grpc::Service {
 public:
  Service();
  virtual ~Service();
  virtual ::grpc::Status AddContact(::grpc::ServerContext* context, const ::protos::Person* request, ::protos::Empty* response);
  virtual ::grpc::Status Search(::grpc::ServerContext* context, const ::protos::Name* request, ::protos::People* response);
};
Here our AddContact and Search methods are virtual. Virtual methods are a C++ feature that lets a base class declare an interface whose behavior a subclass overrides. The generated Service class contains no useful implementation of these methods (the generated defaults merely return an UNIMPLEMENTED status), so you as the server developer have to provide the implementation.
The Service class does not define a functional server. For us to make one, we need to make a subclass that implements the virtual methods. Everything other than that is already done for us, or can be configured to meet our needs – from the networking code, to async, security, and even load balancing!
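The subclassing pattern can be sketched without any gRPC code at all. Everything below is a hand-written stand-in: `StatusCode`, `Person`, `Empty`, `AddressBookService`, and `AddressBookImpl` are simplified, hypothetical types that only imitate the shape of the generated classes (the real methods also take a `ServerContext` and return `::grpc::Status`). The base class plays the role of the generated Service, whose default method bodies we assume report "unimplemented", and the subclass supplies the real logic.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for the gRPC and Protobuf types (illustration only).
enum class StatusCode { OK, UNIMPLEMENTED };
struct Person {
    std::string name;
    std::string email;
};
struct Empty {};

// Plays the role of the generated Service base class.
class AddressBookService {
public:
    virtual ~AddressBookService() = default;
    // Default body: there is no real logic here until a subclass overrides it.
    virtual StatusCode AddContact(const Person* request, Empty* response) {
        (void)request;
        (void)response;
        return StatusCode::UNIMPLEMENTED;
    }
};

// Our subclass: the only part the server developer writes.
class AddressBookImpl final : public AddressBookService {
public:
    StatusCode AddContact(const Person* request, Empty* response) override {
        (void)response;  // Empty carries no data
        contacts_.push_back(*request);
        return StatusCode::OK;
    }
    std::size_t size() const { return contacts_.size(); }

private:
    std::vector<Person> contacts_;
};
```

Calling `AddContact` through an `AddressBookService*` that points at an `AddressBookImpl` runs the overriding method, which is the same dispatch the gRPC runtime relies on when it invokes your subclass.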
Once we’ve implemented our server logic, how can our server and client talk to one another? From the client-side, we can use our Stub class to connect to the server and send it an AddContact RPC:
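std::string srv_addr = "xxxx";
// CreateChannel returns a shared_ptr to the channel, not a Channel object.
std::shared_ptr<grpc::Channel> c = grpc::CreateChannel(srv_addr, grpc::InsecureChannelCredentials());
std::unique_ptr<AddressBook::Stub> client(AddressBook::NewStub(c));
grpc::ClientContext ctx;
Empty empty;
Person p;
p.set_name("Jim Bob");
p.set_email("jim@bob.com");
grpc::Status status = client->AddContact(&ctx, p, &empty);
if (!status.ok()) {
    fprintf(stderr, "AddContact failed: %s\n", status.error_message().c_str());
}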
The resulting code is simple, but quite powerful. In the call to AddContact, a lot of code runs behind the scenes, on both the client and the server. Most of that code was generated for us by gRPC and Protobuf. On the client side, it almost seems as though we simply called AddContact on some local AddressBook class. However, our client’s AddContact function actually sets in motion a sequence of events:
1. It encodes our Person message and sends it to the server over the network.
2. Our AddressBook server fetches and decodes the message.
3. The server executes the AddContact implementation on an underlying AddressBook instance that has the virtual methods implemented.
4. The server encodes the response and sends it back to the client over the network.
5. The client parses the response and returns from AddContact.
The only code that you need to write is the client code (shown above), and the logic that runs in step 3.
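To make the sequence of events concrete, here is a toy, single-process imitation of it. It is a sketch under heavy assumptions: the "network" is just a function call, the encoding is a made-up `name\nemail` string rather than Protobuf, and `AddressBookServer`, `AddressBookStub`, `encode`, and `decode` are hypothetical names that do not come from gRPC. The comments mark which of the five steps each line corresponds to.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Person {
    std::string name;
    std::string email;
};

// Toy encoding ("name\nemail") standing in for Protobuf serialization.
std::string encode(const Person& p) { return p.name + "\n" + p.email; }

Person decode(const std::string& bytes) {
    std::size_t split = bytes.find('\n');
    if (split == std::string::npos) return Person{bytes, ""};
    return Person{bytes.substr(0, split), bytes.substr(split + 1)};
}

// Server side: in real gRPC, everything except step 3 is done for you.
class AddressBookServer {
public:
    std::string handle_add_contact(const std::string& request_bytes) {
        Person p = decode(request_bytes);  // step 2: decode the request
        contacts_.push_back(p);            // step 3: run the implementation
        return "OK";                       // step 4: encode the response
    }
    const std::vector<Person>& contacts() const { return contacts_; }

private:
    std::vector<Person> contacts_;
};

// Client side: looks like a local call, but goes through encode/decode.
class AddressBookStub {
public:
    explicit AddressBookStub(AddressBookServer& server) : server_(server) {}

    bool AddContact(const Person& p) {
        std::string request_bytes = encode(p);  // step 1: encode and "send"
        std::string response_bytes = server_.handle_add_contact(request_bytes);
        return response_bytes == "OK";          // step 5: parse the response
    }

private:
    AddressBookServer& server_;  // stands in for the network connection
};
```

From the caller’s point of view, `stub.AddContact(person)` is an ordinary method call; the encoding, transport, and decoding all happen inside the stub and the server’s handler.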
You’re now ready to start working with Protobufs and gRPC! 
Task
Please use the course VM to complete this lab.
In this lab, you’ll call functions that Protobuf has generated for you. Although the naming conventions for these functions will differ across programming languages, in C++ they are as follows.
Given a message
message Foo {
string x = 1;
bool y = 2;
repeated string z = 3;
}
the Protobuf compiler will generate a Foo class with the following methods:
std::string Foo::x(): gets the value of the x field. This returns a C++ string.
Foo::set_x(std::string val): sets the x field to be the string val.
bool Foo::y(): gets the value of the y field. This returns a bool.
Foo::set_y(bool val): sets the y field to be the desired value.
For the repeated field z, the generated functions are more complex; they are documented in the Protobuf C++ generated code guide.
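As a rough guide to what working with a repeated field looks like, the sketch below uses a hand-written `Foo` class that mimics the accessor names the Protobuf compiler generates for C++ (`z_size()`, `z(i)`, `add_z(...)`, `clear_z()`). The class body is our own stand-in backed by a `std::vector`, not real generated code, and `join_z` is a hypothetical helper.

```cpp
#include <string>
#include <vector>

// Hand-written stand-in that imitates the generated accessor names.
// The real class is produced by the Protobuf compiler from the .proto file.
class Foo {
public:
    const std::string& x() const { return x_; }
    void set_x(const std::string& val) { x_ = val; }

    bool y() const { return y_; }
    void set_y(bool val) { y_ = val; }

    int z_size() const { return static_cast<int>(z_.size()); }  // number of entries
    const std::string& z(int index) const { return z_[index]; } // entry at index
    void add_z(const std::string& val) { z_.push_back(val); }   // append an entry
    void clear_z() { z_.clear(); }                              // remove all entries

private:
    std::string x_;
    bool y_ = false;
    std::vector<std::string> z_;
};

// Walk over a repeated field using the size/index accessors.
std::string join_z(const Foo& foo) {
    std::string out;
    for (int i = 0; i < foo.z_size(); ++i) {
        if (i > 0) out += ", ";
        out += foo.z(i);
    }
    return out;
}
```

The same loop shape (ask for the size, then index into the field) should carry over to the generated class, though it is worth checking the generated header for the exact signatures.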
Task: For this lab, you’ll create a simple todo list application.
You’ll write the entire .proto file, and some of the client and server code.
- Run setup.sh to install Protobuf and gRPC. This may take a while (~20 minutes).
- Complete the todo.proto file.
- Complete the remaining functions in todo_client.cc and todo_server.cc.
Compile and Run
This lab uses the CMake build tool, which many larger C++ projects use. CMake auto-generates Makefiles for you, which is handy when your project consists of many files.
To build the lab for the first time, run:
$ cmake .
$ make clean all
Afterwards, you can just run make clean all.
Note: The lab will not build until you’ve correctly completed the .proto file.
To run the lab, first open up a server with:
$ ./todo_server
To get a client to connect to it, in a separate terminal run:
$ ./todo_client
Handin instructions
Turn in your code by pushing your git repository to github.com/csci1310/cs131-s20-labs-YOURNAME.git.
Then, head to the grading server. On the “Labs” page, use the “Lab 8 checkoff” button to check off your lab.