Best Practices for Writing Protobuf

This page contains the best practices that are used in VSETH to write proto files.

The requirement level keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" used in this document are to be interpreted as described in RFC 2119.

General

In general, VSETH follows the Google API Design Guide. Because that guide is quite extensive, this page summarizes and highlights its most important guidelines. To manage APIs, we use SerVIS, which enforces some constraints that may differ from the Google guide and standard conventions, most importantly:

  • Organization: Each API has one proto file with one service.
  • Versioning: There is always exactly one proto file, which might contain experimental features (RPCs, fields, messages). Features are introduced or removed through a defined process, so breaking changes cannot happen spontaneously. It is therefore essential to design APIs to be extensible, because breaking changes can sometimes only be made through a cumbersome process.

(MUST) Use Google.Protobuf.WellKnownTypes

Google provides a collection of predefined Proto Types for several general use cases, such as an empty message (google.protobuf.Empty). Instead of redefining these in every proto file, you MUST use these types.

You can find the list of types here.

Example

You MUST NOT define an EmptyMessage in every proto, like:

BAD Example
service PizzaService {
	rpc ListPizzas(EmptyMessage) returns (ListPizzaResponse);
}

// This is a *COUNTER* example, do not use!
message EmptyMessage {}

You MUST use the provided types, like:

Good Example
import "google/protobuf/empty.proto";

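// The service is named PizzaService so that it does not clash with
// the Pizza message used as the request type.
service PizzaService {
	rpc DeletePizzas(Pizza) returns (google.protobuf.Empty);
}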

Why?

Don't repeat yourself (DRY) is a principle of software development aimed at reducing the repetition of software patterns, replacing them with abstractions, or using data normalization to avoid redundancy. Following this principle, the predefined types should be reused wherever possible.

(MUST) Standard Fields

Some fields appear in many messages, for example:

  • A Timestamp when the object represented by the message was created or updated
  • Fields used for pagination

For these fields, you MUST follow the naming convention defined by the Google API Guide.

Example

You MUST NOT use different names for fields defined in the naming convention, such as:

BAD Example
service PizzaService {
	rpc GetPizza(GetPizzaRequest) returns (Pizza);
}

message Pizza {
	// This is a *COUNTER* example, do not use!
	google.protobuf.Timestamp created_on = 1;
}

You MUST use the names for fields defined in the naming convention, such as:

Good Example
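import "google/protobuf/timestamp.proto";

service PizzaService {
	rpc GetPizza(GetPizzaRequest) returns (Pizza);
}

message Pizza {
	// Standard field name from the Google API Design Guide.
	google.protobuf.Timestamp create_time = 1;
}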

Why?

Consistency.

(SHOULD) Message Structure

We recommend putting metadata fields (e.g., paging-related fields) at the end of the message.

(SHOULD) Standard Methods

The Google API Design Guide defines the notion of Standard Methods. Whenever possible, you SHOULD use these standard methods to work with resources:

| Method | Example | Description |
|--------|---------|-------------|
| List | ListBooks(ListBooksRequest) returns (ListBooksResponse) | Used to return a potentially filtered list of resources. It is also used for standard search queries, which filter a selection. List should always |
| Get | GetBook(GetBookRequest) returns (Book) | Used to retrieve a single object by its URL / ID. |
| Create | CreateBook(CreateBookRequest) returns (Book) | The CreateBookRequest must contain the resource (i.e., Book). Fields that a client cannot set must be documented as "Output only." |
| Update | UpdateBook(UpdateBookRequest) returns (Book) | The UpdateBookRequest message contains the resource (i.e., Book) and potentially further parameters. |
| Delete | DeleteBook(DeleteBookRequest) returns (google.protobuf.Empty) | The delete method should return the empty message if the resource is deleted. If a resource only gets marked as deleted or some other state change occurs, it should return the resource. The method should be idempotent in its effect: the first call returns a success response, and subsequent calls return NOT_FOUND. |
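
The standard methods in the table above can be sketched in one proto file as follows. This is only an illustration: the package name example.library, the service name Library, and the fields of Book and of the request messages are assumptions made for this sketch and are not prescribed by this page.

```protobuf
syntax = "proto3";

// Hypothetical package name, chosen for this sketch only.
package example.library;

import "google/protobuf/empty.proto";

service Library {
  rpc ListBooks(ListBooksRequest) returns (ListBooksResponse);
  rpc GetBook(GetBookRequest) returns (Book);
  rpc CreateBook(CreateBookRequest) returns (Book);
  rpc UpdateBook(UpdateBookRequest) returns (Book);
  rpc DeleteBook(DeleteBookRequest) returns (google.protobuf.Empty);
}

message Book {
  // The relative resource name (e.g., shelves/shelf1/books/book2).
  string name = 1;
  string title = 2;
}

message ListBooksRequest {
  // The relative resource name of the parent (e.g., shelves/shelf1).
  string parent = 1;
}

message ListBooksResponse {
  repeated Book books = 1;
}

message GetBookRequest {
  string name = 1;
}

message CreateBookRequest {
  string parent = 1;
  // The resource to create.
  Book book = 2;
}

message UpdateBookRequest {
  // The resource carrying the new values.
  Book book = 1;
}

message DeleteBookRequest {
  string name = 1;
}
```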

(SHOULD) Typical Custom Methods

Some custom methods occur frequently. They should be implemented as follows:

| Method | Example | Description |
|--------|---------|-------------|
| Cancel | CancelBackup(CancelBackupRequest) returns (CancelBackupResponse) | For longer operations, you might want a cancel method that cancels a task provided its identifier. |
| Batch | BatchGetBooks(BatchGetBooksRequest) returns (BatchGetBooksResponse) | There might be a need to execute a method on a batch of objects to increase the performance. |
| Move | MoveBook(MoveBookRequest) returns (Book) | Move an entity from one parent to another one. |
| Search | SearchBooks(SearchBooksRequest) returns (SearchBooksResponse) | While basic filter search queries are performed using List, Search should be used for more advanced searches where a search query is provided. |
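
As a rough sketch, the Batch and Move methods could look like this. The package name, the service name, and all field names (names, destination_parent) are assumptions for illustration only.

```protobuf
syntax = "proto3";

// Hypothetical package name, chosen for this sketch only.
package example.library;

service Library {
  rpc BatchGetBooks(BatchGetBooksRequest) returns (BatchGetBooksResponse);
  rpc MoveBook(MoveBookRequest) returns (Book);
}

message Book {
  string name = 1;
}

message BatchGetBooksRequest {
  // The resource names of the books to retrieve in one call.
  repeated string names = 1;
}

message BatchGetBooksResponse {
  repeated Book books = 1;
}

message MoveBookRequest {
  // The resource name of the book to move.
  string name = 1;
  // The resource name of the new parent (assumed field name).
  string destination_parent = 2;
}
```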

(MUST) Naming Conventions

  • correct American English
  • message names: UpperCamelCase
  • fields: lower_snake_case
  • enum fields: ALL_UPPER_SNAKE_CASE
  • use well-known abbreviations
    • config (configuration)
    • id (identifier)
    • spec (specification)
    • stats (statistics)

Message Names

The request and response messages of an RPC should be named <RPC Name>Request and <RPC Name>Response. The exceptions are RPCs that return a single entity (e.g., Book), in which case that entity message should be returned, and RPCs that return google.protobuf.Empty.

Example

You MUST NOT use arbitrary message names:

BAD Example
service PizzaService {
	rpc ListPizza(PizzaIDs) returns (PizzaList);
}

// This is a *COUNTER* example, do not use!
message PizzaIDs {
	repeated string id = 1;
}

message PizzaList {
	repeated Pizza pizzas = 1;
}

You MUST use consistent message names for requests and responses:

Good Example
service PizzaService {
	rpc GetPizza(GetPizzaRequest) returns (Pizza);
}

message Pizza {
	google.protobuf.Timestamp create_time = 1;
}

Why?

Besides consistency, this allows further fields to be added in the future. The specification should always be extensible to prevent breaking changes in the future. 
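
To illustrate the extensibility argument, here is a sketch of a List RPC that follows the naming rule. The package name, service name, and field names are assumptions for this example. Because the response is a dedicated message and not a bare list, a field such as next_page_token can be added later without breaking existing clients.

```protobuf
syntax = "proto3";

// Hypothetical package name, chosen for this sketch only.
package example.pizza;

service PizzaService {
  rpc ListPizzas(ListPizzasRequest) returns (ListPizzasResponse);
}

message Pizza {
  string name = 1;
}

message ListPizzasRequest {
  repeated string ids = 1;
}

message ListPizzasResponse {
  repeated Pizza pizzas = 1;
  // Added in a later revision; existing clients simply ignore it.
  string next_page_token = 2;
}
```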

(MUST) Enum Default Value

Every enum must define a value 0, which is also the default value. If there is a common default, it should be assigned the value 0 (e.g., ENGLISH in a Language enum). If there is no common default, the 0 value should be named <ENUM_NAME>_UNSPECIFIED. Keep in mind that the 0 value applies whenever the field is not set and that it cannot be removed later; when in doubt, use <ENUM_NAME>_UNSPECIFIED instead of a meaningful default.

Example

Good Example
enum Pizza {
	PIZZA_UNSPECIFIED = 0;
	MARGHERITA = 1;
	NAPOLI = 2;
}

Why?

The 0 value is used whenever no value is set, and it cannot be removed in the future without a breaking change.
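
The two cases can be contrasted in a short sketch. The Language enum follows the example mentioned above; the PaymentMethod enum and the package name are hypothetical and only serve as an illustration.

```protobuf
syntax = "proto3";

// Hypothetical package name, chosen for this sketch only.
package example.enums;

// There is a common default, so it takes the value 0.
enum Language {
  ENGLISH = 0;
  GERMAN = 1;
  FRENCH = 2;
}

// There is no sensible default, so the 0 value is <ENUM_NAME>_UNSPECIFIED.
enum PaymentMethod {
  PAYMENT_METHOD_UNSPECIFIED = 0;
  CASH = 1;
  CARD = 2;
}
```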

(MAY) Resource View

Sometimes there is more than one useful subset of a resource. In that case, you MAY define a view enum that lets the client choose which representation the server returns.

Example

Good Example
message ListBooksRequest {
  string name = 1;

  // Specifies which parts of the book resource should be returned
  // in the response.
  BookView view = 2;
}

enum BookView {
  // Server responses only include author, title, ISBN and unique book ID.
  // The default value.
  BASIC = 0;

  // Full representation of the book is returned in server responses,
  // including contents of the book.
  FULL = 1;
}

(SHALL) Document

  • Output only: An entity (e.g., Book) might contain some output-only (read-only) fields (e.g., create_time). Such fields shall be documented as `Output only.`. If a client sets an output-only field in a request, the server must accept the request and ignore the value.
  • Required: A required field shall be documented as `Required.`; the service should not expect any other field to be populated.
  • Input only: The field can be provided in a request but is not populated on retrieval. Such fields shall be documented as `Input only.`.

Example

Good Example
message Table {
  // Required. The resource name of the table.
  string name = 1;
  // Input only. Whether to dry run the table creation.
  bool dryrun = 2;
  // Output only. The timestamp when the table was created. Assigned by
  // the server.
  google.protobuf.Timestamp create_time = 3;
  // The display name of the table.
  string display_name = 4;
}

(SHOULD) Pagination

Methods such as List or Search should provide pagination using the following attributes:

| Name | Type | Description |
|------|------|-------------|
| page_token | string | The pagination token (e.g., 2 for the 2nd page) in the request. |
| page_size | int32 | The size of the page in the request. |
| total_size | int32 | The total count of items in the list irrespective of pagination in the response. |
| next_page_token | string | The token for the next page in the response (if this is empty, there are no further pages). |

Pagination should always be provided for any method that might need it in the future, because adding it later changes the API's behavior.
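
A minimal sketch of a paginated List method using the four fields from the table above. The package name, the Book message, and the parent field are assumptions for this example. The paging fields are placed at the end of each message, in line with the Message Structure recommendation.

```protobuf
syntax = "proto3";

// Hypothetical package name, chosen for this sketch only.
package example.library;

message Book {
  string name = 1;
}

message ListBooksRequest {
  string parent = 1;
  // The maximum number of books to return on one page.
  int32 page_size = 2;
  // The next_page_token of the previous response; empty for the first page.
  string page_token = 3;
}

message ListBooksResponse {
  repeated Book books = 1;
  // Empty if there are no further pages.
  string next_page_token = 2;
  // The total count of books irrespective of pagination.
  int32 total_size = 3;
}
```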

(SHOULD) File Structure

You should adhere to the file structure defined in: https://cloud.google.com/apis/design/file_structure

Especially:

  • The applicable sections should appear in the following order: syntax, package, import, and option.
  • The RPC request and response message definitions should be in the same order as the corresponding methods. Each request message must precede its corresponding response message if any.
  • A parent resource must be defined before its child resource(s).
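
The ordering rules above can be sketched as a file skeleton. The package name, the java_multiple_files option, and the Shelf/Book resources are assumptions chosen only to make the sketch complete.

```protobuf
// 1. syntax
syntax = "proto3";

// 2. package (hypothetical name)
package example.library;

// 3. imports
import "google/protobuf/empty.proto";

// 4. options
option java_multiple_files = true;

service Library {
  rpc GetShelf(GetShelfRequest) returns (Shelf);
  rpc DeleteBook(DeleteBookRequest) returns (google.protobuf.Empty);
}

// The parent resource is defined before its child resource.
message Shelf {
  string name = 1;
}

message Book {
  string name = 1;
}

// Request messages follow in the same order as the methods above.
message GetShelfRequest {
  string name = 1;
}

message DeleteBookRequest {
  string name = 1;
}
```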


(SHOULD) Filtering

  • Do not follow AIP-160 for filtering. Even though Google recommends it, we consider writing the necessary parsers too cumbersome and writing code generators for every language too complex.
  • Write custom filter messages that consist of the fields of the resource type being queried.
  • For dates (timestamps) or numerical values, you can add further sub-messages that allow specifying a range.
  • Use field masks to specify which fields are to be considered when filtering.
  • Have sort_by as a separate field in the list request. Use a message-specific enum to define the different options.

    Good Example
    message Table {
      // Required. The resource name of the table.
      string name = 1;
      // Input only. Whether to dry run the table creation.
      bool dryrun = 2;
      // Output only. The timestamp when the table was created. Assigned by
      // the server.
      google.protobuf.Timestamp create_time = 3;
      // The display name of the table.
      string display_name = 4;

      message Filter {
        string name = 1;
        DateRange create_time = 2;
        string display_name = 3;
        // Specifies which of the filter fields above are considered.
        google.protobuf.FieldMask filter_mask = 4;

        message DateRange {
          oneof from {
            bool unlimited_from_range = 1;
            google.protobuf.Timestamp timestamp_from = 2;
          }
          oneof until {
            bool unlimited_until_range = 3;
            google.protobuf.Timestamp timestamp_until = 4;
          }
        }
      }

      enum SortBy {
        SORT_BY_UNSPECIFIED = 0;
        NAME = 1;
        CREATE_TIME = 2;
        DISPLAY_NAME = 3;
      }
    }

    message ListTablesRequest {
      Table.Filter filter = 1;
      Table.SortBy sort_by = 2;
      int32 page_size = 3;
      string page_token = 4;
    }

VSETH Specific

(MUST NOT) Use Unsigned Integer Types

Unsigned integer types (uint32, uint64, fixed32, fixed64) must not be used, as Java does not support them properly. The requirement that an integer must be non-negative should instead be documented in the field comment (e.g., The accuracy of the latitude and longitude coordinates, in meters. Must be non-negative.)
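
A small sketch of this rule, using a signed type together with the documented constraint. The package, message, and field names are assumptions for illustration.

```protobuf
syntax = "proto3";

// Hypothetical package name, chosen for this sketch only.
package example.geo;

message Location {
  double latitude = 1;
  double longitude = 2;
  // The accuracy of the latitude and longitude coordinates, in meters.
  // Must be non-negative.
  int32 accuracy_meters = 3;
}
```

The server still has to validate the value, since the type itself no longer rules out negative numbers.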

(SHOULD) Use vseth.type.*

VSETH defines custom types that should be used when applicable. If both a google.type.X and a vseth.type.X exist, you should use vseth.type.X. For example, vseth.type.Money uses a simplified concept of money that ignores sub-cent amounts in favor of a simpler type.

(SHOULD NOT) Use Field Masks

To support partial responses, you should use resource views instead of FieldMasks.

Why?

This way, there are no constant strings in the code, and enum values can be used instead.

Other (to integrate)

  • Standard Fields of Resources
    The complete list can be found here: https://cloud.google.com/apis/design/standard_fields

    | Name | Type | Description |
    |------|------|-------------|
    | name | string | The relative resource name (e.g., shelves/shelf1/books/book2). |
    | parent | string | The relative resource name of the parent object (e.g., shelves/shelf1). |
    | display_name | string | The display name of an entity. To support multilingual fields, the BCP-47 language code (e.g., de-CH, en-US) should be provided under the language_code label, and the entity has to be requested in a specified language. |
    | title | string | The formal version of display_name. Differs from the display_name if the display_name is not formal. |
    | description | string | One or more paragraphs of text description of an entity. |
    | create_time | Timestamp | The creation timestamp of an entity. |
    | update_time | Timestamp | The last update timestamp of an entity. |
    | delete_time | Timestamp | The deletion timestamp of an entity. |
    | expire_time | Timestamp | The expiration timestamp of an entity. |
    | start_time | Timestamp | The timestamp marking the beginning of some time period. |
    | end_time | Timestamp | The timestamp marking the end of some time period or operation. |
    | deleted | bool | Indicates if a resource has been deleted. |
    | validation_only | bool | If true, it indicates that the given request should only be validated, not executed. |
    | request_id | string | A unique string ID. It is used to detect duplicated requests, so that requests can be made idempotent. |
    | update_mask | FieldMask | It is used for Update request messages for performing a partial update on a resource. This mask is relative to the resource, not to the request message. |

  • List and Search

    | Name | Type | Description |
    |------|------|-------------|
    | filter | string | The standard filter parameter for List methods. |
    | query | string | The same as the filter field, but applied to a Search method. |
    | order_by | string | Specifies the result ordering of List requests. |
  • Errors
    • Each RPC can return an error with a status code and a message. A client should act on the status code. Error messages are meant for developers; error messages shown to users should be derived from the error code.
    • Error propagation
      • Hide implementation details.
      • When you receive an error from a service you depend on, you should return an INTERNAL error to your caller.