[Free translation] Berkeley DB Documentation - C++ Getting Started Guide - Section 1.2 - Berkeley DB Concepts
Translator's preface (reposted: an introduction to Berkeley DB):
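Berkeley DB is an open-source embedded database library developed by Sleepycat Software in the United States. It provides applications with scalable, high-performance, transaction-protected data management services, and it exposes a concise function-call API for storing, retrieving, and managing data.
It is a toolkit in the classic C-library style, offering programmers a broad set of functions, and it is designed to give application developers industrial-strength database services. Its main characteristics are as follows: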
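Embedded: It is linked directly into the application and runs in the same address space. Database operations therefore require no inter-process communication, whether between computers on a network or between processes on the same machine.
Berkeley DB provides API bindings for many programming languages, including C, C++, Java, Perl, Tcl, Python, and PHP, and all database operations take place inside the library. Multiple processes, or multiple threads within one process, can use a database at the same time as if each were using it alone. Low-level services such as locking, transaction logging, shared buffer management, and memory management are handled transparently by the library.
Portable: It runs on nearly all UNIX and Linux systems and their variants, on Windows, and on a number of embedded real-time operating systems. It runs on both 32-bit and 64-bit systems and has been adopted in many high-end Internet servers, desktop computers, handheld computers, set-top boxes, network switches, and other application areas. Once Berkeley DB is linked into an application, end users are generally unaware that a database system is present at all.
Scalable: This shows in several ways. The database library itself is compact (less than 300 KB of text space), yet it can manage databases of up to 256 TB. It supports high concurrency, with thousands of users able to operate on the same database simultaneously. Berkeley DB has a small enough footprint to run on tightly constrained embedded systems, and it can also use gigabytes of memory and terabytes of disk space on high-end servers.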
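Berkeley DB is a better fit for embedded applications than relational or object-oriented databases, for two reasons:
(1) Because the database library runs in the same address space as the application, database operations require no inter-process communication. The cost of communicating between processes on one machine, or between machines on a network, is far higher than the cost of a function call.
(2) Because Berkeley DB uses one set of API calls for all operations, there is no query language to parse and no execution plan to generate, which greatly improves runtime efficiency.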
Main text
Berkeley DB Documentation -- C++ Getting Started Guide
Berkeley DB Concepts
Before continuing, it is useful to describe some of the larger concepts that you will encounter when building a DB application.
Conceptually, DB databases contain records. Logically, each record represents a single entry in the database. Each record contains two pieces of information: a key and its data. This manual will on occasion refer to a record's key or a record's data when it is necessary to speak about one or the other portion of a database record.
Because of the key/data pairing used for DB databases, they are sometimes thought of as a two-column table. However, data (and sometimes keys, depending on the access method) can hold arbitrarily complex data. Frequently, C structures and other such mechanisms are stored in the record. This effectively turns a 2-column table into a table with n columns, where n-1 of those columns are provided by the structure's fields.
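As a rough illustration of the "n columns" idea, the following sketch uses only the C++ standard library and does not call the DB API. The `Account` struct and the `pack` and `unpack` helpers are hypothetical names invented for this example. It shows a plain struct being copied into an opaque byte buffer, which is the kind of value a record's data can hold, and then rebuilt from those bytes.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical record layout: each field plays the role of one "column".
struct Account {
    int    id;
    double balance;
    char   owner[16];
};

// Raw byte copies are only safe for trivially copyable types.
static_assert(std::is_trivially_copyable<Account>::value,
              "Account must be trivially copyable to be stored as raw bytes");

// Copy the struct into an opaque byte buffer, as a record's data would be stored.
std::vector<unsigned char> pack(const Account& a) {
    std::vector<unsigned char> bytes(sizeof(Account));
    std::memcpy(bytes.data(), &a, sizeof(Account));
    return bytes;
}

// Rebuild the struct from bytes that were produced by pack().
Account unpack(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() == sizeof(Account));  // reject buffers of the wrong size
    Account a;
    std::memcpy(&a, bytes.data(), sizeof(Account));
    return a;
}
```

Bytes copied this way are tied to the compiler's struct layout and the machine's byte order, so this approach suits data that is written and read by the same build of the same program.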
Note that a DB database is very much like a table in a relational database system in that most DB applications use more than one database (just as most relational databases use more than one table).
Unlike relational systems, however, a DB database contains a single collection of records organized according to a given access method (BTree, Queue, Hash, and so forth). In a relational database system, the underlying access method is generally hidden from you.
In any case, DB applications are frequently designed so that a single database stores a specific type of data (just as in a relational database system, a single table holds entries containing a specific set of fields). Because most applications are required to manage multiple kinds of data, a DB application will often use multiple databases.
For example, consider an accounting application. This kind of application may manage data based on bank accounts, checking accounts, stocks, bonds, loans, and so forth. An accounting application will also have to manage information about people, banking institutions, customer accounts, and so on. In a traditional relational database, all of these different kinds of information would be stored and managed using a (probably very) complex series of tables. In a DB application, all of this information would instead be divided out and managed using multiple databases.
DB applications can efficiently use multiple databases using an optional mechanism called an environment. For more information, see Environments.
You interact with most DB APIs using special structures that contain pointers to functions. These callbacks are called methods because they look so much like a method on a C++ class. The variable that you use to access these methods is often referred to as a handle. For example, to use a database you will obtain a handle to that database.
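The "structure that contains pointers to functions" pattern can be sketched as follows. This is a toy written with the standard library only. `ToyHandle`, `toy_put`, `toy_get`, and `make_handle` are hypothetical names and are not part of the DB API. The sketch only shows why calls made through such a structure read like method calls on a handle.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical handle type: a struct whose members are pointers to functions.
struct ToyHandle {
    std::map<std::string, std::string> records;  // stand-in for real storage

    // The "methods": function pointers that receive the handle explicitly.
    void (*put)(ToyHandle*, const std::string&, const std::string&);
    bool (*get)(ToyHandle*, const std::string&, std::string*);
};

static void toy_put(ToyHandle* h, const std::string& key, const std::string& data) {
    assert(h != nullptr);
    h->records[key] = data;
}

static bool toy_get(ToyHandle* h, const std::string& key, std::string* out) {
    assert(h != nullptr && out != nullptr);
    auto it = h->records.find(key);
    if (it == h->records.end()) return false;  // no such record
    *out = it->second;
    return true;
}

// Plays the role of "obtaining a handle": wires the function pointers up.
ToyHandle make_handle() {
    ToyHandle h;
    h.put = toy_put;
    h.get = toy_get;
    return h;
}
```

With this in place, a caller writes `h.put(&h, "apple", "red")` and `h.get(&h, "apple", &out)`, which looks much like invoking methods on an object.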
Retrieving a record from a database is sometimes called getting the record because the method that you use to retrieve records is called get(). Similarly, storing database records is sometimes called putting the record because you use the put() method to do this.
When you store, or put, a record to a database using its handle, the record is stored according to whatever sort order is in use by the database. Sorting is mostly performed based on the key, but sometimes the data is considered too. If you put a record using a key that already exists in the database, then the existing record is replaced with the new data. However, if the database supports duplicate records (that is, records with identical keys but different data), then the new record is stored as a duplicate record and any existing records are not overwritten.
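The two put behaviors described above can be modeled with standard containers. This is an analogy only, not the DB API: `std::map` stands in for a database without duplicate support, `std::multimap` for one with it, and `put_unique` and `put_dup` are hypothetical helper names. Both containers keep their entries sorted by key.

```cpp
#include <cassert>
#include <map>
#include <string>

// No duplicate support: putting an existing key replaces its data.
void put_unique(std::map<std::string, std::string>& db,
                const std::string& key, const std::string& data) {
    db[key] = data;
    assert(db.count(key) == 1);  // still exactly one record for this key
}

// Duplicate support: the new record is stored next to the existing ones.
void put_dup(std::multimap<std::string, std::string>& db,
             const std::string& key, const std::string& data) {
    db.insert({key, data});
}
```

Putting the key "k" twice through `put_unique` leaves one record holding the newer data, while doing the same through `put_dup` leaves two records under "k".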
If a database supports duplicate records, then you can use a database handle to retrieve only the first record in a set of duplicate records.
In addition to using a database handle, you can also read and write data using a special mechanism called a cursor. Cursors are essentially iterators that you can use to walk over the records in a database. You can use cursors to iterate over a database from the first record to the last, and from the last to the first. You can also use cursors to seek to a record. In the event that a database supports duplicate records, cursors are the only way you can access all the records in a set of duplicates.
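Since cursors are described as iterators, standard C++ iterators give a reasonable mental model. The sketch below is an analogy built on `std::multimap` and does not use the DB cursor API. `ToyDb`, `forward`, `backward`, and `duplicates` are hypothetical names. It covers the three uses mentioned above: walking first to last, walking last to first, and seeking to a key to read every record in its set of duplicates.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for a database that supports duplicate records.
using ToyDb = std::multimap<std::string, std::string>;

// Walk from the first record to the last, collecting each record's data.
std::vector<std::string> forward(const ToyDb& db) {
    std::vector<std::string> out;
    for (auto it = db.begin(); it != db.end(); ++it) out.push_back(it->second);
    return out;
}

// Walk from the last record to the first.
std::vector<std::string> backward(const ToyDb& db) {
    std::vector<std::string> out;
    for (auto it = db.rbegin(); it != db.rend(); ++it) out.push_back(it->second);
    return out;
}

// Seek to a key, then collect every record in its set of duplicates.
std::vector<std::string> duplicates(const ToyDb& db, const std::string& key) {
    std::vector<std::string> out;
    auto range = db.equal_range(key);
    for (auto it = range.first; it != range.second; ++it) {
        assert(it->first == key);  // every record in the range shares the key
        out.push_back(it->second);
    }
    return out;
}
```

Seeking to a key that is not present yields an empty range here, so `duplicates` returns an empty vector.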
Finally, DB provides a special kind of database called a secondary database. Secondary databases serve as an index into normal databases (called primary databases to distinguish them from secondaries). Secondary databases are interesting because DB records can hold complex data types, but seeking to a given record is performed only based on that record's key. If you want to be able to seek to a record based on some piece of information that is not the key, then you enable this through the use of secondary databases.
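The relationship between a primary database and a secondary index can be sketched with two standard containers. This is a simplified model, not the DB API. `Person`, `ToyStore`, `by_city`, and `find_by_city` are hypothetical names, and the sketch keeps the index in sync by hand. The primary maps an id to a record, and the secondary maps a field inside the record back to the primary key.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Person {
    std::string name;
    std::string city;  // the non-key field we want to search by
};

struct ToyStore {
    std::map<int, Person> primary;            // primary: id -> record
    std::multimap<std::string, int> by_city;  // secondary: city -> primary key

    // Store a record and keep the secondary index consistent with it.
    void put(int id, const Person& p) {
        auto old = primary.find(id);
        if (old != primary.end()) {
            // Replacing a record: drop the index entry for its old city.
            auto range = by_city.equal_range(old->second.city);
            for (auto it = range.first; it != range.second; ++it) {
                if (it->second == id) { by_city.erase(it); break; }
            }
        }
        primary[id] = p;
        by_city.insert({p.city, id});
    }

    // Seek by a non-key field: look in the index, then fetch from the primary.
    std::vector<Person> find_by_city(const std::string& city) const {
        std::vector<Person> out;
        auto range = by_city.equal_range(city);
        for (auto it = range.first; it != range.second; ++it) {
            assert(primary.count(it->second) == 1);  // index points at a live record
            out.push_back(primary.at(it->second));
        }
        return out;
    }
};
```

A lookup by city never scans the primary. It reads the matching primary keys from the index and fetches only those records.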