Protocol Buffers (协议缓冲) 之 Language Guide

Assigning Tags
As you can see, each field in the message definition has a unique numbered tag. These tags are used to identify your fields in the message binary format, and should not be changed once your message type is in use. Note that tags with values in the range 1 through 15 take one byte to encode. Tags in the range 16 through 2047 take two bytes. So you should reserve the tags 1 through 15 for very frequently occurring message elements. Remember to leave some room for frequently occurring elements that might be added in the future.
The smallest tag number you can specify is 1, and the largest is 229 - 1, or 536,870,911. You also cannot use the numbers 19000 though 19999 (FieldDescriptor::kFirstReservedNumber through FieldDescriptor::kLastReservedNumber), as they are reserved for the Protocol Buffers implementation - the protocol buffer compiler will complain if you use one of these reserved numbers in your .proto.

For historical reasons, repeated fields of basic numeric types aren't encoded as efficiently as they could be. New code should use the special option [packed=true] to get a more efficient encoding. For example:
repeated int32 samples = 4 [packed=true];

谨慎使用required描述符，因为在以后的扩展中，很难去掉该字段。建议全部使用optional和repeated来实现。

Adding Comments
To add comments to your .proto files, use C/C++-style // syntax.

int32, sint32, int64, sint64
sint32, sint64支持负数
更多：

.proto Type	Notes	C++ Type	Java Type
double		double	double
float		float	float
int32	Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead.	int32	int
int64	Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead.	int64	long
uint32	Uses variable-length encoding.	uint32	int^[1]
uint64	Uses variable-length encoding.	uint64	long^[1]
sint32	Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s.	int32	int
sint64	Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s.	int64	long
fixed32	Always four bytes. More efficient than uint32 if values are often greater than 2²⁸.	uint32	int^[1]
fixed64	Always eight bytes. More efficient than uint64 if values are often greater than 2⁵⁶.	uint64	long^[1]
sfixed32	Always four bytes.	int32	int
sfixed64	Always eight bytes.	int64	long
bool		bool	boolean
string	A string must always contain UTF-8 encoded or 7-bit ASCII text.	string	String
bytes	May contain any arbitrary sequence of bytes.	string	ByteString

Optional Fields And Default Values
If the default value is not specified for an optional element, a type-specific default value is used instead: for strings, the default value is the empty string. For bools, the default value is false. For numeric types, the default value is zero. For enums, the default value is the first value listed in the enum's type definition.

Enumerations
You can define enums within a message definition, as in the above example, or outside – these enums can be reused in any message definition in your .proto file. You can also use an enum type declared in one message as the type of a field in a different message, using the syntax MessageType.EnumType.

Importing Definitions
In the above example, the Result message type is defined in the same file as SearchResponse – what if the message type you want to use as a field type is already defined in another .proto file?
You can use definitions from other .proto files by importing them. To import another .proto's definitions, you add an import statement to the top of your file:
import "myproject/other_protos.proto";
The protocol compiler searches for imported files in a set of directories specified on the protocol compiler command line using the -I/--import_path flag. If no flag was given, it looks in the directory in which the compiler was invoked.

Updating A Message Type
If an existing message type no longer meets all your needs. It's very simple to update message types without breaking any of your existing code. Just remember the following rules:
Don't change the numeric tags for any existing fields.
Any new fields that you add should be optional or repeated. This means that any messages serialized by code using your "old" message format can be parsed by your new generated code, as they won't be missing any required elements. You should set up sensible default values for these elements so that new code can properly interact with messages generated by old code. Similarly, messages created by your new code can be parsed by your old code: old binaries simply ignore the new field when parsing. However, the unknown fields are not discarded, and if the message is later serialized, the unknown fields are serialized along with it – so if the message is passed on to new code, the new fields are still available. Note that preservation of unknown fields is currently not available for Python.
Non-required fields can be removed, as long as the tag number is not used again in your updated message type (it may be better to rename the field instead, perhaps adding the prefix "OBSOLETE_", so that future users of your .proto can't accidentally reuse the number).
A non-required field can be converted to an extension and vice versa, as long as the type and number stay the same.
int32, uint32, int64, uint64, and bool are all compatible – this means you can change a field from one of these types to another without breaking forwards- or backwards-compatibility. If a number is parsed from the wire which doesn't fit in the corresponding type, you will get the same effect as if you had cast the number to that type in C++ (e.g. if a 64-bit number is read as an int32, it will be truncated to 32 bits).
sint32 and sint64 are compatible with each other but are not compatible with the other integer types.
string and bytes are compatible as long as the bytes are valid UTF-8.
Embedded messages are compatible with bytes if the bytes contain an encoded version of the message.
fixed32 is compatible with sfixed32, and fixed64 with sfixed64.
Changing a default value is generally OK, as long as you remember that default values are never sent over the wire. Thus, if a program receives a message in which a particular field isn't set, the program will see the default value as it was defined in that program's version of the protocol. It will NOT see the default value that was defined in the sender's code.

Extensions
a.proto:
message Foo {
// ...
extensions 100 to 199;
}
b.proto:
import "a.proto"
extend Foo {
optional int32 bar = 126;
}
Similarly, the Foo class defines templated accessors HasExtension(), ClearExtension(), GetExtension(), MutableExtension(), and AddExtension(). All have semantics matching the corresponding generated accessors for a normal field.

Defining Services
RPC (Remote Procedure Call) 远程过程调用

FAQ
1 问题：执行protoc.exe产生的代码编译出错。
描述：当跨目录生成代码时（用到import "xxx/aaa.proto";），执行protoc.exe --cpp_out=. test.proto，产生的代码.cc里有::protobuf_AddDesc....，这个函数中总是多了个xxx（即那个目录名），导致编译失败。
解决：同一个项目里执行protoc.exe的目录不要改变，与该项目的Makefile在同一个目录下。
如项目在E:\workspace\test\qt\test2，那么执行：protoc.exe -I=/e/workspace/test/qt/test2/ --cpp_out=/e/workspace/test/qt/test2/ /e/workspace/test/qt/test2/protobuf/personalmain/LPersonalMainCategory.proto

posted on 2011-01-26 09:19 seahouse 阅读(1406) 评论(1) 编辑收藏引用所属分类: 数据

# re: Protocol Buffers (协议缓冲) 之 Language Guide 回复 更多评论

protobuf_AddDesc...我也遇到了

2012-05-23 10:24 | 秒大刀

刷新评论列表

只有注册用户登录后才能发表评论。
【推荐】100%开源！大型工业跨平台软件C++源码提供，建模，组态！

相关文章: sqlite命令 Protocol Buffers (协议缓冲) 之 Language Guide Protocol Buffers (协议缓冲) 简单使用 Protocol Buffers (协议缓冲) 介绍及安装

网站导航: 博客园 IT新闻 BlogJava 博问 Chat2DB 管理

# re: Protocol Buffers (协议缓冲) 之 Language Guide 回复 更多评论

海屋