Data API
NOTE: This page describes how to use the Data API with the synchronous Bigtable client. Examples for using the Data API with the async client can be found in the Getting Started Guide.
After creating a Table and some
column families, you are ready to store and retrieve data.
Cells vs. Columns vs. Column Families
As explained in the table overview, tables can have many column families.
As described below, a table can also have many rows which are specified by row keys.
Within a row, data is stored in a cell. A cell simply has a value (as bytes) and a timestamp. The number of cells in each row can be different, depending on what was stored in each row.
Each cell lies in a column (not a column family). A column is really just a more specific modifier within a column family. A column can be present in every column family, in only one or anywhere in between.
Within a column family there can be many columns. For example, within the column family
foo we could have columns bar and baz. These would typically be represented as foo:bar and foo:baz.
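The hierarchy described above can be pictured with plain Python containers. The following is only a conceptual sketch, not how the client library stores data; the names new_row and column_names and the sample timestamps are made up for illustration.

```python
from collections import defaultdict

# Conceptual model only: column family -> column -> list of cells.
# Each cell is a (timestamp_micros, value) pair. All names are illustrative.
def new_row():
    return defaultdict(lambda: defaultdict(list))

row = new_row()
row["foo"][b"bar"].append((1_000_000, b"first value"))
row["foo"][b"bar"].append((2_000_000, b"second value"))  # same column, newer cell
row["foo"][b"baz"].append((1_000_000, b"other column"))

def column_names(row):
    """List the columns in 'family:column' form, e.g. 'foo:bar'."""
    return sorted(
        f"{family}:{column.decode()}"
        for family, columns in row.items()
        for column in columns
    )

print(column_names(row))  # ['foo:bar', 'foo:baz']
```

Note that the column foo:bar holds two cells here, which differ only in timestamp and value.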
Modifying Data
Since data is stored in cells, which are stored in rows, we
use the metaphor of a row in classes that are used to modify
(write, update, delete) data in a
Table.
Direct vs. Conditional vs. Append
There are three ways to modify data in a table, described by the MutateRow, CheckAndMutateRow and ReadModifyWriteRow API methods.
The direct way is via MutateRow which involves simply adding, overwriting or deleting cells. The DirectRow class handles direct mutations.
The conditional way is via CheckAndMutateRow. This method first checks if some filter is matched in a given row, then applies one of two sets of mutations, depending on if a match occurred or not. (These mutation sets are called the “true mutations” and “false mutations”.) The ConditionalRow class handles conditional mutations.
The append way is via ReadModifyWriteRow. This simply appends (as bytes) or increments (as an integer) data in a presumed existing cell in a row. The AppendRow class handles append mutations.
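To make the difference between the conditional and append styles concrete, here is a small in-memory simulation of their semantics. It does not call Bigtable. The helper names (check_and_mutate, append_value, increment_value) are hypothetical, and treating a missing cell as empty bytes or zero is an assumption of this sketch.

```python
import struct

# In-memory stand-in for one row: maps (family, column) -> latest value as bytes.
# This mimics the semantics described above; it does not talk to Bigtable.

def check_and_mutate(row, predicate, true_mutations, false_mutations):
    """Apply one of two mutation sets depending on whether the predicate matches."""
    matched = predicate(row)
    for key, value in (true_mutations if matched else false_mutations):
        row[key] = value
    return matched

def append_value(row, key, suffix):
    """Append bytes to a cell; a missing cell is treated as empty (assumption)."""
    row[key] = row.get(key, b"") + suffix

def increment_value(row, key, amount):
    """Read the cell as a big-endian signed 64-bit integer and add to it."""
    current = struct.unpack(">q", row[key])[0] if key in row else 0
    row[key] = struct.pack(">q", current + amount)

row = {("foo", b"bar"): b"hello"}

# Conditional: the predicate plays the role of the row filter.
matched = check_and_mutate(
    row,
    predicate=lambda r: ("foo", b"bar") in r,
    true_mutations=[(("foo", b"seen"), b"yes")],
    false_mutations=[(("foo", b"seen"), b"no")],
)

# Append: modify existing data rather than overwrite it.
append_value(row, ("foo", b"bar"), b" world")
increment_value(row, ("foo", b"count"), 5)
increment_value(row, ("foo", b"count"), -2)
```

After this runs, foo:seen holds b"yes" because the predicate matched, foo:bar holds b"hello world", and foo:count decodes to 3.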
Row Factory
A single factory can be used to create any of the three row types.
To create a DirectRow:
row = table.row(row_key)
Unlike the string values used previously, the row key must
be bytes.
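If row keys start out as text, they can be encoded before being handed to the factory. A minimal sketch, assuming UTF-8 is an acceptable encoding for your keys; the helper to_row_key and the sample key are hypothetical and not part of the client library.

```python
def to_row_key(key):
    """Return the key as bytes, encoding text as UTF-8 (an illustrative choice)."""
    if isinstance(key, bytes):
        return key
    if isinstance(key, str):
        return key.encode("utf-8")
    raise TypeError(f"row key must be bytes or str, got {type(key).__name__}")

row_key = to_row_key("user#1234")
print(row_key)  # b'user#1234'
```

Whatever encoding is chosen, it should be applied consistently, since rows are looked up by the exact bytes of their key.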
To create a ConditionalRow,
first create a RowFilter and
then pass it to the factory:
cond_row = table.row(row_key, filter_=filter_)
To create an AppendRow:
append_row = table.row(row_key, append=True)
Building Up Mutations
In all three cases, a set of mutations (or two sets) is built up on a row before being sent off in a batch via:
row.commit()
Direct Mutations
Direct mutations can be added via one of four methods:
set_cell() allows a single value to be written to a column:
row.set_cell(column_family_id, column, value,
             timestamp=timestamp)
If the timestamp is omitted, the current time on the Google Cloud
Bigtable server will be used when the cell is stored.
The value can either be bytes or an integer, which will be converted to bytes as a signed 64-bit integer.
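The integer conversion can be reproduced with the standard struct module. This sketch assumes big-endian byte order for the signed 64-bit encoding; the helper names int_to_cell_value and cell_value_to_int are illustrative and not part of the client library.

```python
import struct

def int_to_cell_value(value):
    """Pack an integer as 8 bytes: signed 64-bit, big-endian (assumed byte order)."""
    return struct.pack(">q", value)

def cell_value_to_int(raw):
    """Decode 8 bytes written by int_to_cell_value back into an integer."""
    return struct.unpack(">q", raw)[0]

encoded = int_to_cell_value(1)
print(encoded)                     # b'\x00\x00\x00\x00\x00\x00\x00\x01'
print(cell_value_to_int(encoded))  # 1
```

A value read back from such a cell arrives as these 8 raw bytes, so the reader needs to decode it in the same way.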
delete_cell() deletes all cells (i.e. for all timestamps) in a given column:
row.delete_cell(column_family_id, column)
Remember, this only happens in the row we are using.
If we only want to delete cells from a limited range of time, a
TimestampRange can be used:
row.delete_cell(column_family_id, column,
                time_range=time_range)
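The effect of a time-limited delete on the cells of one column can be sketched in plain Python. This is a simulation, not the library's implementation. It assumes the range includes its start and excludes its end, and that a missing bound means unbounded on that side; the function cells_after_delete and the sample data are hypothetical.

```python
from datetime import datetime, timezone

def cells_after_delete(cells, start=None, end=None):
    """Return the cells that survive a delete over [start, end).

    cells is a list of (timestamp, value) pairs; None means unbounded.
    """
    def in_range(ts):
        return (start is None or ts >= start) and (end is None or ts < end)

    return [(ts, value) for ts, value in cells if not in_range(ts)]

cells = [
    (datetime(2024, 1, 1, tzinfo=timezone.utc), b"old"),
    (datetime(2024, 6, 1, tzinfo=timezone.utc), b"mid"),
    (datetime(2024, 12, 1, tzinfo=timezone.utc), b"new"),
]

remaining = cells_after_delete(
    cells,
    start=datetime(2024, 3, 1, tzinfo=timezone.utc),
    end=datetime(2024, 9, 1, tzinfo=timezone.utc),
)
print([value for _, value in remaining])  # [b'old', b'new']
```

With no bounds given, every cell falls in the range, which matches the behavior of delete_cell() without a time_range.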