NAME
KiokuDB::Backend::DBI - DBI backend for KiokuDB
SYNOPSIS
    my $dir = KiokuDB->connect(
        "dbi:mysql:foo",
        user     => "blah",
        password => "moo",
        columns  => [
            # specify extra columns for the 'entries' table
            # in the same format you pass to DBIC's add_columns
            name => {
                data_type   => "varchar",
                is_nullable => 1, # probably important
            },
        ],
    );

    $dir->search({ name => "foo" }); # SQL::Abstract
DESCRIPTION
This backend for KiokuDB leverages existing DBI accessible databases.
The schema is based on two tables, "entries" and "gin_index"
(the latter is only used if a Search::GIN extractor is specified).
The "entries" table has two main columns, "id" and
"data" (currently in JSPON format, in the future the format will be
pluggable), and additional user specified columns.
The user specified columns are extracted from inserted objects using a callback
(or just copied for simple scalars), allowing SQL where clauses to be used for
searching.
COLUMN EXTRACTIONS
The columns are specified using a DBIx::Class::ResultSource instance.
One additional column info parameter is used, "extract", which is
called as a method on the inserted object with the column name as the only
argument. The return value from this callback will be used to populate the
column.
If the column extractor is omitted then the column will contain a copy of the
entry data key of the same name, if it is a plain scalar. Otherwise the column
will be "NULL".
These columns are only used for lookup purposes; only "data" is
consulted when loading entries.
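As a rough illustration of the calling convention described above, the following sketch mimics how a column value might be derived. It does not use KiokuDB itself: the "Person" class, the "column_value" helper, and the "name_length" column are hypothetical names invented for this example, and the backend's real internals may differ.

```perl
use strict;
use warnings;

# Hypothetical class standing in for an object stored in KiokuDB.
package Person {
    sub new  { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub name { $_[0]{name} }
}

# Column definitions in the same shape as the "columns" option.
my @columns = (
    # No extractor: the value is copied from the key of the same name.
    name => { data_type => "varchar", is_nullable => 1 },

    # With an extractor: called as a method on the object,
    # with the column name as the only argument.
    name_length => {
        data_type   => "integer",
        is_nullable => 1,
        extract     => sub {
            my ( $obj, $column ) = @_;
            return length $obj->name;
        },
    },
);

# Simplified stand-in for what happens on insert (an assumption,
# not the backend's actual code).
sub column_value {
    my ( $obj, $column, $info ) = @_;
    if ( my $extract = $info->{extract} ) {
        return $obj->$extract($column);
    }
    my $value = $obj->{$column};
    return ref($value) ? undef : $value;    # non-scalars become NULL
}

my %spec   = @columns;
my $person = Person->new( name => "foo", friends => [] );

my %row;
$row{$_} = column_value( $person, $_, $spec{$_} ) for keys %spec;
```

Here "name" is copied because it is a plain scalar, "name_length" comes from the callback, and a reference such as "friends" would yield an undefined value, which corresponds to "NULL" in the column.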
DBIC INTEGRATION
This backend is layered on top of DBIx::Class::Storage::DBI and reuses
DBIx::Class::Schema for DDL.
Because of this, objects from a DBIx::Class::Schema can refer to objects in the
KiokuDB entries table, and vice versa.
For more details see DBIx::Class::Schema::KiokuDB.
SUPPORTED DATABASES
This driver has been tested with MySQL 5 (4.1 should be the minimal supported
version), SQLite 3, and PostgreSQL 8.3.
The SQL code is reasonably portable and should work with most databases. Binary
column support is required when using the Storable serializer.
Transactions
For reasons of performance and ease of use, database vendors ship with read
committed transaction isolation by default.
This means that read locks are not acquired when data is fetched from the
database, allowing it to be updated by another writer. If the current
transaction then updates the value, the other writer's change will be silently
overwritten.
IMHO this is a much bigger problem when the data is unstructured. This is
because data is loaded and fetched in potentially smaller chunks, increasing
the risk of phantom reads.
Unfortunately, enabling truly isolated transaction semantics means that
"txn_commit" may fail due to lock contention, forcing you to
repeat your transaction. Arguably this is more correct than "read
committed", which can lead to race conditions.
Enabling repeatable read or serializable transaction isolation prevents
transactions from interfering with each other, by ensuring all data reads are
performed with a shared lock.
For more information on isolation see
<http://en.wikipedia.org/wiki/Isolation_(computer_science)>
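Since a commit may fail under stricter isolation, applications usually wrap the transaction in a retry loop. The following is a minimal sketch of that idea in plain Perl; "retry_txn" is a hypothetical helper, not part of KiokuDB, and it retries on any exception, whereas real code should only retry errors that the driver reports as lock contention or serialization failures.

```perl
use strict;
use warnings;

# Run $txn (a code reference) up to $max_attempts times,
# retrying whenever it dies.
sub retry_txn {
    my ( $txn, $max_attempts ) = @_;
    my $error;
    for my $attempt ( 1 .. $max_attempts ) {
        my @result;
        my $ok = eval { @result = $txn->($attempt); 1 };
        return @result if $ok;
        $error = $@;
        # A real application should inspect $error here and rethrow
        # anything that is not a lock contention failure.
    }
    die "transaction failed after $max_attempts attempts: $error";
}

# Simulated transaction that fails twice before succeeding.
my $calls = 0;
my ($outcome) = retry_txn(
    sub {
        $calls++;
        die "simulated lock contention\n" if $calls < 3;
        return "committed";
    },
    5,
);
```

With a real directory handle, the code reference passed to "retry_txn" would typically wrap a call such as "$dir->txn_do(sub { ... })", so that the whole unit of work is repeated from the start.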
SQLite
SQLite provides serializable isolation by default.
<http://www.sqlite.org/pragma.html#pragma_read_uncommitted>
MySQL
MySQL provides read committed isolation by default.
Serializable isolation can be enabled by default by changing the
"transaction-isolation" global variable; see
<http://dev.mysql.com/doc/refman/5.1/en/set-transaction.html#isolevel_serializable>.
PostgreSQL
PostgreSQL provides read committed isolation by default.
Repeatable read or serializable isolation can be enabled by setting the default
transaction isolation level, or by using the "SET TRANSACTION" SQL
statement.
See <http://www.postgresql.org/docs/8.3/interactive/transaction-iso.html> and
<http://www.postgresql.org/docs/8.3/interactive/runtime-config-client.html#GUC-DEFAULT-TRANSACTION-ISOLATION>.
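One way this might be wired up from Perl is to issue the statement each time a connection is made, through the "on_connect_call" attribute described under ATTRIBUTES. The fragment below is only a sketch under assumptions: the DSN and credentials are placeholders, it assumes "on_connect_call" accepts the same "do_sql" entries as DBIx::Class::Storage::DBI, and it uses PostgreSQL's session-wide "SET SESSION CHARACTERISTICS AS TRANSACTION" form because a plain "SET TRANSACTION" only affects the current transaction.

```perl
use strict;
use warnings;
use KiokuDB;

# Placeholder DSN and credentials; adjust for your database.
my $dir = KiokuDB->connect(
    "dbi:Pg:dbname=foo",
    user            => "blah",
    password        => "moo",

    # Assumption: overriding on_connect_call replaces the default list,
    # which is harmless here since the defaults target MySQL and SQLite.
    on_connect_call => [
        [
            do_sql =>
              "SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL SERIALIZABLE"
        ],
    ],
);
```

Setting "default_transaction_isolation" in the server or role configuration achieves the same effect without any change to the application.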
ATTRIBUTES
- schema
- Created automatically.
This is the DBIx::Class::Schema object that is used for schema deployment,
connectivity and transaction handling.
- connect_info
- An array reference whose contents are passed to
"connect" in DBIx::Class::Schema.
If omitted, it will be created from the attributes "dsn", "user",
"password" and "dbi_attrs".
- dsn
- user
- password
- dbi_attrs
- Convenience attributes for connecting using "connect"
in KiokuDB.
Used in "connect_info"'s builder.
- columns
- Additional columns, see "COLUMN EXTRACTIONS".
- serializer
- KiokuDB::Serializer. Coerces from a string, too:
KiokuDB->connect("dbi:...", serializer => "storable");
Defaults to KiokuDB::Serializer::JSON.
- create
- If true, the backend checks whether the tables exist and
deploys the schema if they do not.
Defaults to false.
- extract
- An optional Search::GIN::Extract used to create the
"gin_index" entries.
Usually Search::GIN::Extract::Callback.
- schema_hook
- A hook that is called on the backend object as a method
with the schema as the argument just before connecting.
If you need to modify the schema in some way (adding indexes or constraints)
this is where it should be done.
- for_update
- If true (the default), all select statements are
issued with a "FOR UPDATE" modifier on MySQL, Postgres and
Oracle.
This is highly recommended because these databases provide low isolation
guarantees as configured out of the box, and highly interlinked graph
databases are much more susceptible to corruption from a lack of
transactional isolation than normalized relational databases.
- sqlite_sync_mode
- If this attribute is set and the underlying database is
SQLite, then "PRAGMA synchronous=..." will be issued with this
value.
Can be "OFF", "NORMAL" or "FULL" (SQLite's
default), or 0, 1, or 2.
See <http://www.sqlite.org/pragma.html#pragma_synchronous>.
- mysql_strict
- If true (the default), sets MySQL's strict mode.
This is HIGHLY recommended, or you may enjoy some of MySQL's more
interesting features, like automatic data loss when the columns are too
narrow.
See <http://dev.mysql.com/doc/refman/5.0/en/server-sql-mode.html> and
DBIx::Class::Storage::DBI::mysql for more details.
- on_connect_call
- See DBIx::Class::Storage::DBI.
This attribute is constructed based on the values of
"mysql_strict" and "sqlite_sync_mode", but may be
overridden if you need more control.
- dbic_attrs
- See DBIx::Class::Storage::DBI.
Defaults to
{ on_connect_call => $self->on_connect_call }
- batch_size
- SQL statements that deal with entries are run in batches of
the size given by "batch_size". If it is not provided, each statement
runs as a single batch.
This works around the SQLite limitation that lists can only hold 999 elements
at a time. "batch_size" is set to 999 by default if the
driver in use is SQLite.
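The batching idea behind "batch_size" can be sketched in plain Perl as follows. This is an illustration only: "batches" is a hypothetical helper, and the SQL string is an example of a query with an "IN" list rather than the statement the backend actually issues.

```perl
use strict;
use warnings;

# Split a list of IDs into chunks of at most $batch_size elements.
# A false $batch_size means "everything in a single batch".
sub batches {
    my ( $batch_size, @ids ) = @_;
    return unless @ids;
    return [@ids] unless $batch_size;
    my @batches;
    push @batches, [ splice @ids, 0, $batch_size ] while @ids;
    return @batches;
}

my @sizes;
for my $batch ( batches( 999, 1 .. 2500 ) ) {
    # One placeholder per ID, so no list exceeds the batch size.
    my $placeholders = join ", ", ("?") x @$batch;
    my $sql = "SELECT id, data FROM entries WHERE id IN ($placeholders)";

    # A real program would now execute $sql with @$batch as bind values.
    push @sizes, scalar @$batch;
}
```

With 2500 IDs and a batch size of 999, the loop runs three times, and no single statement carries more than 999 bind values.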
METHODS
See KiokuDB::Backend and the various roles for more info.
- deploy
- Calls "deploy" in DBIx::Class::Schema.
Deployment to MySQL requires that you specify something like:
$dir->backend->deploy({ producer_args => { mysql_version => 4 } });
because MySQL versions before 4 did not have support for boolean types, and
the schema emitted by SQL::Translator will not work with the queries
used.
- drop_tables
- Drops the "entries" and "gin_index"
tables.
TROUBLESHOOTING
I get "unexpected end of string while parsing JSON string"
You are probably using MySQL, which comes with a helpful data compression
feature: when your serialized objects are larger than the maximum size of a
"BLOB" column, MySQL will simply shorten them for you.
Why "BLOB" defaults to 64k, and how on earth someone would consider
silent data truncation a sane default I could never fathom, but nevertheless
MySQL does allow you to disable this by setting the "strict" SQL
mode in the configuration.
To resolve the actual problem (though this obviously won't repair your lost
data), alter the entries table so that the "data" column uses the
nonstandard "LONGBLOB" datatype.
VERSION CONTROL
<http://github.com/nothingmuch/kiokudb-backend-dbi>
AUTHOR
Yuval Kogman <nothingmuch@woobling.org>
COPYRIGHT
Copyright (c) 2008, 2009 Yuval Kogman, Infinity Interactive. All
rights reserved. This program is free software; you can redistribute
it and/or modify it under the same terms as Perl itself.