You are viewing the version of this documentation from Perl 5.8.6.

NAME

DB_File - Perl5 access to Berkeley DB version 1.x

SYNOPSIS

use DB_File;

[$X =] tie %hash,  'DB_File', [$filename, $flags, $mode, $DB_HASH] ;
[$X =] tie %hash,  'DB_File', $filename, $flags, $mode, $DB_BTREE ;
[$X =] tie @array, 'DB_File', $filename, $flags, $mode, $DB_RECNO ;

$status = $X->del($key [, $flags]) ;
$status = $X->put($key, $value [, $flags]) ;
$status = $X->get($key, $value [, $flags]) ;
$status = $X->seq($key, $value, $flags) ;
$status = $X->sync([$flags]) ;
$status = $X->fd ;

# BTREE only
$count = $X->get_dup($key) ;
@list  = $X->get_dup($key) ;
%list  = $X->get_dup($key, 1) ;
$status = $X->find_dup($key, $value) ;
$status = $X->del_dup($key, $value) ;

# RECNO only
$a = $X->length;
$a = $X->pop ;
$X->push(list);
$a = $X->shift;
$X->unshift(list);
@r = $X->splice(offset, length, elements);

# DBM Filters
$old_filter = $db->filter_store_key  ( sub { ... } ) ;
$old_filter = $db->filter_store_value( sub { ... } ) ;
$old_filter = $db->filter_fetch_key  ( sub { ... } ) ;
$old_filter = $db->filter_fetch_value( sub { ... } ) ;

untie %hash ;
untie @array ;

DESCRIPTION

DB_File is a module which allows Perl programs to make use of the facilities provided by Berkeley DB version 1.x (if you have a newer version of DB, see "Using DB_File with Berkeley DB version 2 or greater"). It is assumed that you have a copy of the Berkeley DB manual pages at hand when reading this documentation. The interface defined here mirrors the Berkeley DB interface closely.

Berkeley DB is a C library which provides a consistent interface to a number of database formats. DB_File provides an interface to all three of the database types currently supported by Berkeley DB.

The file types are:

DB_HASH

This database type allows arbitrary key/value pairs to be stored in data files. This is equivalent to the functionality provided by other hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM. Remember though, the files created using DB_HASH are not compatible with any of the other packages mentioned.

A default hashing algorithm, which will be adequate for most applications, is built into Berkeley DB. If you do need to use your own hashing algorithm it is possible to write your own in Perl and have DB_File use it instead.
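
As a minimal sketch of supplying your own hashing algorithm: the hash key of a DB_File::HASHINFO object (or of $DB_HASH) takes a reference to a Perl sub. The function simple_hash below is a hypothetical illustration, not a recommended algorithm, and it assumes the sub should return an integer that fits in 32 bits. The filename is undef so that the example does not create a file.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;   # O_CREAT, O_RDWR

# Hypothetical hash function: multiply-and-add over the bytes of the key,
# masked to 32 bits.
sub simple_hash {
    my ($data) = @_;
    my $h = 0;
    $h = ($h * 33 + $_) & 0xFFFFFFFF for unpack 'C*', $data;
    return $h;
}

my $info = DB_File::HASHINFO->new;
$info->{hash} = \&simple_hash;

my %h;
tie %h, 'DB_File', undef, O_CREAT|O_RDWR, 0666, $info
    or die "Cannot create database: $!";

$h{apple}  = 'red';
$h{banana} = 'yellow';
my $apple = $h{apple};   # looked up through simple_hash
untie %h;
```

A database created with a custom hash function should be reopened with the same function, otherwise lookups cannot be expected to find existing keys.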

DB_BTREE

The btree format allows arbitrary key/value pairs to be stored in a sorted, balanced tree (a B-tree).

As with the DB_HASH format, it is possible to provide a user defined Perl routine to perform the comparison of keys. By default, though, the keys are stored in lexical order.

DB_RECNO

DB_RECNO allows both fixed-length and variable-length flat text files to be manipulated using the same key/value pair interface as in DB_HASH and DB_BTREE. In this case the key will consist of a record (line) number.
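
The following sketch shows one way a text file might be handled through a tied array. The temporary file and its three lines are made up for the illustration; with the default settings each line is assumed to become one record, with the trailing newline removed.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;
use File::Temp qw(tempfile);

# Write a small text file to work on (temporary, removed at exit).
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "first\nsecond\nthird\n";
close $fh or die "close: $!";

my @lines;
tie @lines, 'DB_File', $path, O_RDWR, 0666, $DB_RECNO
    or die "Cannot open $path: $!";

my $count  = scalar @lines;   # number of records (lines)
my $second = $lines[1];       # the tied array is indexed from 0
$lines[1]  = 'SECOND';        # replace the second record
untie @lines;                 # close the database
```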

Using DB_File with Berkeley DB version 2 or greater

Although DB_File is intended to be used with Berkeley DB version 1, it can also be used with version 2, 3 or 4. In this case the interface is limited to the functionality provided by Berkeley DB 1.x. Anywhere the version 2 or greater interface differs, DB_File arranges for it to work like version 1. This feature allows DB_File scripts that were built with version 1 to be migrated to version 2 or greater without any changes.

If you want to make use of the new features available in Berkeley DB 2.x or greater, use the Perl module BerkeleyDB instead.

Note: The database file format has changed multiple times in Berkeley DB version 2, 3 and 4. If you cannot recreate your databases, you must dump any existing databases with either the db_dump or the db_dump185 utility that comes with Berkeley DB. Once you have rebuilt DB_File to use Berkeley DB version 2 or greater, your databases can be recreated using db_load. Refer to the Berkeley DB documentation for further details.

Please read "COPYRIGHT" before using version 2.x or greater of Berkeley DB with DB_File.

Interface to Berkeley DB

DB_File allows access to Berkeley DB files using the tie() mechanism in Perl 5 (for full details, see "tie()" in perlfunc). This facility allows DB_File to access Berkeley DB files using either an associative array (for DB_HASH & DB_BTREE file types) or an ordinary array (for the DB_RECNO file type).

In addition to the tie() interface, it is also possible to access most of the functions provided in the Berkeley DB API directly. See "THE API INTERFACE".
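
As a rough sketch of the direct API, the object returned by tie can be used to call the methods listed in the SYNOPSIS. The keys and values here are made up; the example relies on put, get and seq returning 0 on success, and on the R_FIRST and R_NEXT constants that DB_File exports.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;

my %h;
my $X = tie %h, 'DB_File', undef, O_CREAT|O_RDWR, 0666, $DB_BTREE
    or die "Cannot create database: $!";

# put() and get() return 0 on success.
$X->put('pear',  'green') == 0 or die "put failed";
$X->put('apple', 'red')   == 0 or die "put failed";

my $value;
$X->get('apple', $value) == 0 or die "get failed";

# Walk the records in key order with seq().
my ($k, $v) = ('', '');
my @seen;
for (my $st = $X->seq($k, $v, R_FIRST); $st == 0; $st = $X->seq($k, $v, R_NEXT)) {
    push @seen, "$k=$v";
}

undef $X;   # drop the object reference before untie
untie %h;
```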

Opening a Berkeley DB Database File

Berkeley DB uses the function dbopen() to open or create a database. Here is the C prototype for dbopen():

DB*
dbopen (const char * file, int flags, int mode, 
        DBTYPE type, const void * openinfo)

The parameter type is an enumeration which specifies which of the three interface methods (DB_HASH, DB_BTREE or DB_RECNO) is to be used. Depending on which of these is actually chosen, the final parameter, openinfo, points to a data structure which allows tailoring of the specific interface method.

This interface is handled slightly differently in DB_File. Here is an equivalent call using DB_File:

tie %array, 'DB_File', $filename, $flags, $mode, $DB_HASH ;

The filename, flags and mode parameters are the direct equivalent of their dbopen() counterparts. The final parameter $DB_HASH performs the function of both the type and openinfo parameters in dbopen().

In the example above $DB_HASH is actually a pre-defined reference to a hash object. DB_File has three of these pre-defined references. Apart from $DB_HASH, there is also $DB_BTREE and $DB_RECNO.
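
To make the call above concrete, here is a small sketch that creates a DB_HASH file, stores one pair, and reopens the file to read it back. The file name fruit.db and the temporary directory are assumptions made for the illustration; the O_* flags come from the Fcntl module.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;
use File::Temp qw(tempdir);

my $dir      = tempdir(CLEANUP => 1);
my $filename = "$dir/fruit.db";   # hypothetical file name

my %fruit;
tie %fruit, 'DB_File', $filename, O_CREAT|O_RDWR, 0666, $DB_HASH
    or die "Cannot open $filename: $!";
$fruit{apple} = 'red';
untie %fruit;

# Reopen read-only to confirm that the data persisted.
tie %fruit, 'DB_File', $filename, O_RDONLY, 0666, $DB_HASH
    or die "Cannot reopen $filename: $!";
my $colour = $fruit{apple};
untie %fruit;
```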

The keys allowed in each of these pre-defined references are limited to the names used in the equivalent C structure. So, for example, the $DB_HASH reference will only allow keys called bsize, cachesize, ffactor, hash, lorder and nelem.

To change one of these elements, just assign to it like this:

$DB_HASH->{'cachesize'} = 10000 ;

The three predefined variables $DB_HASH, $DB_BTREE and $DB_RECNO are adequate for most applications. If you do need to create extra instances of these objects, constructors are available for each file type.

Here are examples of the constructors and the valid options available for DB_HASH, DB_BTREE and DB_RECNO respectively.

$a = new DB_File::HASHINFO ;
$a->{'bsize'} ;
$a->{'cachesize'} ;
$a->{'ffactor'};
$a->{'hash'} ;
$a->{'lorder'} ;
$a->{'nelem'} ;

$b = new DB_File::BTREEINFO ;
$b->{'flags'} ;
$b->{'cachesize'} ;
$b->{'maxkeypage'} ;
$b->{'minkeypage'} ;
$b->{'psize'} ;
$b->{'compare'} ;
$b->{'prefix'} ;
$b->{'lorder'} ;

$c = new DB_File::RECNOINFO ;
$c->{'bval'} ;
$c->{'cachesize'} ;
$c->{'psize'} ;
$c->{'flags'} ;
$c->{'lorder'} ;
$c->{'reclen'} ;
$c->{'bfname'} ;

The values stored in the hashes above are mostly the direct equivalent of their C counterparts. Like their C counterparts, all are set to default values, which means you don't have to set all of the values when you only want to change one. Here is an example:

$a = new DB_File::HASHINFO ;
$a->{'cachesize'} =  12345 ;
tie %y, 'DB_File', "filename", $flags, 0777, $a ;

A few of the options need extra discussion here. When used, the C equivalent of the keys hash, compare and prefix store pointers to C functions. In DB_File these keys are used to store references to Perl subs. Below are templates for each of the subs:

    sub hash
    {
        my ($data) = @_ ;
        ...
        # return the hash value for $data
        return $hash ;
    }

    sub compare
    {
        my ($key1, $key2) = @_ ;
        ...
        # return  0 if $key1 eq $key2
        #        -1 if $key1 lt $key2
        #         1 if $key1 gt $key2
        return $cmp ;   # one of -1, 0 or 1
    }

    sub prefix
    {
        my ($key1, $key2) = @_ ;
        ...
        # return number of bytes of $key2 which are
        # necessary to determine that it is greater than $key1
        return $bytes ;
    }

See "Changing the BTREE sort order" for an example of using the compare template.
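
As one possible filling-in of the compare template, the sketch below orders keys numerically instead of lexically. The sub name numeric_compare and the sample keys are made up for the illustration; it assumes all keys look like numbers.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;

# Compare keys as numbers rather than as strings.
sub numeric_compare {
    my ($key1, $key2) = @_;
    return $key1 <=> $key2;
}

my $info = DB_File::BTREEINFO->new;
$info->{compare} = \&numeric_compare;

my %h;
tie %h, 'DB_File', undef, O_CREAT|O_RDWR, 0666, $info
    or die "Cannot create database: $!";

$h{$_} = "value $_" for 10, 9, 100;
my @keys = keys %h;   # 9, 10, 100 rather than the lexical 10, 100, 9
untie %h;
```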

If you are using the DB_RECNO interface and you intend making use of bval, you should check out "The 'bval' Option".

Default Parameters

It is possible to omit some or all of the final 4 parameters in the call to tie and let them take default values. As DB_HASH is the most common file format used, the call:

tie %A, "DB_File", "filename" ;

is equivalent to:

tie %A, "DB_File", "filename", O_CREAT|O_RDWR, 0666, $DB_HASH ;

It is also possible to omit the filename parameter, so the call:

tie %A, "DB_File" ;

is equivalent to:

tie %A, "DB_File", undef, O_CREAT|O_RDWR, 0666, $DB_HASH ;
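
A short sketch of the all-defaults form follows. With undef in place of the filename, DB_File creates the database in memory rather than on disk, so the hash below (the name %scratch is made up) behaves like a temporary DB_HASH database that disappears when it is untied.

```perl
use strict;
use warnings;
use DB_File;

my %scratch;
tie %scratch, 'DB_File'
    or die "Cannot create database: $!";

$scratch{answer} = 42;
$scratch{name}   = 'example';
my $n      = scalar keys %scratch;   # number of stored pairs
my $answer = $scratch{answer};
untie %scratch;
```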

See