NAME¶
MongoDB::Examples - Some examples of MongoDB syntax
VERSION¶
version v0.705.0.0
MAPPING SQL TO MONGODB¶
For developers familiar with SQL, the following chart should help you see how
many common SQL queries could be expressed in MongoDB.
These are Perl-specific examples of translating SQL queries to MongoDB's query
language. To see the JavaScript (or other languages') mappings, see
<
http://dochub.mongodb.org/core/sqlToMongo>.
In the following examples, $db is a MongoDB::Database object which was retrieved
by using "get_database". See MongoDB::MongoClient for more.
- "CREATE TABLE USERS (a Number, b Number)"
-
Implicit, can be done explicitly.
- "INSERT INTO USERS VALUES(1,1)"
-
$db->get_collection( 'users' )->insert( { a => 1, b => 1 } );
- "SELECT a,b FROM users"
-
$db->get_collection( 'users')->find( { } )->fields( { a => 1, b => 1 });
- "SELECT * FROM users"
-
$db->get_collection( 'users' )->find;
- "SELECT * FROM users WHERE age=33"
-
$db->get_collection( 'users' )->find( { age => 33 } )
- "SELECT a,b FROM users WHERE age=33"
-
$db->get_collection( 'users' )->find( { age => 33 } )->fields( { a => 1, b => 1 });
- "SELECT * FROM users WHERE age=33 ORDER BY name"
-
$db->get_collection( 'users' )->find( { age => 33 } )->sort( { name => 1 } );
- "<SELECT * FROM users WHERE age"33>>
-
$db->get_collection( 'users' )->find( { age => { '$gt' => 33 } } );
- "<SELECT * FROM users WHERE age<33">
-
$db->get_collection( 'users' )->find( { age => { '$lt' => 33 } } );
- "SELECT * FROM users WHERE name LIKE "%Joe%""
-
$db->get_collection( 'users' )->find( { name => qr/Joe/ } );
- "SELECT * FROM users WHERE name LIKE "Joe%""
-
$db->get_collection( 'users' )->find( {name => qr/^Joe/ } );
- "<SELECT * FROM users WHERE age"33 AND
age<=40>>
-
$db->get_collection( 'users' )->find( { age => { '$gt' => 33, '$lte' => 40 } } );
- "SELECT * FROM users ORDER BY name DESC"
-
$db->get_collection( 'users' )->find->sort( { name => -1 } );
- "CREATE INDEX myindexname ON users(name)"
-
$db->get_collection( 'users' )->ensure_index( { name => 1 } );
- "CREATE INDEX myindexname ON users(name,ts DESC)"
-
$db->get_collection( 'users' )->ensure_index( Tie::IxHash->new( name => 1, ts => -1 ) );
In this example, we must use Tie::IxHash to preserve the ordering of the
arguments to "ensureIndex".
- "SELECT * FROM users WHERE a=1 and b='q'"
-
$db->get_collection( 'users' )->find( {a => 1, b => "q" } );
- "SELECT * FROM users LIMIT 10 SKIP 20"
-
$db->get_collection( 'users' )->find->limit(10)->skip(20);
- "SELECT * FROM users WHERE a=1 or b=2"
-
$db->get_collection( 'users' )->find( { '$or' => [ {a => 1 }, { b => 2 } ] } );
- "SELECT * FROM users LIMIT 1"
-
$db->get_collection( 'users' )->find->limit(1);
- "EXPLAIN SELECT * FROM users WHERE z=3"
-
$db->get_collection( 'users' )->find( { z => 3 } )->explain;
- "SELECT DISTINCT last_name FROM users"
-
$db->run_command( { distinct => "users", key => "last_name" } );
- "SELECT COUNT(*y) FROM users"
-
$db->get_collection( 'users' )->count;
- "<SELECT COUNT(*y) FROM users where age " 30>>
-
$db->get_collection( 'users' )->find( { "age" => { '$gt' => 30 } } )->count;
- "SELECT COUNT(age) from users"
-
$db->get_collection( 'users' )->find( { age => { '$exists' => 1 } } )->count;
- "UPDATE users SET a=1 WHERE b='q'"
-
$db->get_collection( 'users' )->update( { b => "q" }, { '$set' => { a => 1 } } );
- "UPDATE users SET a=a+2 WHERE b='q'"
-
$db->get_collection( 'users' )->update( { b => "q" }, { '$inc' => { a => 2 } } );
- "DELETE FROM users WHERE z="abc""
-
$db->get_database( 'users' )->remove( { z => "abc" } );
DATABASE COMMANDS¶
If you do something in the MongoDB shell and you would like to translate it to
Perl, the best way is to run the function in the shell without parentheses,
which will print the source. You can then generally translate the source into
Perl fairly easily.
For example, suppose we want to use "db.foo.validate" in Perl. We
could run:
> db.foo.validate
function (full) {
var cmd = {validate:this.getName()};
if (typeof full == "object") {
Object.extend(cmd, full);
} else {
cmd.full = full;
}
var res = this._db.runCommand(cmd);
if (typeof res.valid == "undefined") {
res.valid = false;
var raw = res.result || res.raw;
if (raw) {
var str = "-" + tojson(raw);
res.valid = !(str.match(/exception/) || str.match(/corrupt/));
var p = /lastExtentSize:(\d+)/;
var r = p.exec(str);
if (r) {
res.lastExtentSize = Number(r[1]);
}
}
}
return res;
}
Thus, we can translate the important parts into Perl:
$db->run_command( { validate => "foo" } );
Find-and-modify¶
The find-and-modify command is similar to update (or remove), but it will return
the modified document. It can be useful for implementing queues or locks.
For example, suppose we had a list of things to do, and we wanted to remove the
highest-priority item for processing. We could do a "find" in
MongoDB::Collection and then a "remove" in MongoDB::Collection, but
that wouldn't be atomic (a write could occur between the query and the
remove). Instead, we can use find and modify.
my $next_task = $db->run_command({
findAndModify => "todo",
sort => {priority => -1},
remove => 1
});
This will atomically find and pop the next-highest-priority task.
See <
http://www.mongodb.org/display/DOCS/findAndModify+Command> for more
details on find-and-modify.
AGGREGATION¶
The aggregation framework is MongoDB's analogy for SQL GROUP BY queries, but
more generic and more powerful. An invocation of the aggregation framework
specifies a series of stages in a pipeline to be executed in order by the
server. Each stage of the pipeline is drawn from one of the following
so-called "pipeline operators": $project, $match, $limit, $skip,
$unwind, $group, $sort, and $geoNear.
The aggregation framework is the preferred way of performing most aggregation
tasks. New in version 2.2, it has largely obviated mapReduce
(<
http://docs.mongodb.org/manual/reference/command/mapReduce/#dbcmd.mapReduce>),
and group
(<
http://docs.mongodb.org/manual/reference/command/group/#dbcmd.group>).
See the MongoDB aggregation framework documentation for more information
(<
http://docs.mongodb.org/manual/aggregation/>).
$match and $group¶
The $group pipeline operator is used like GROUP BY in SQL. For example, suppose
we have a number of local businesses stored in a "business"
collection. If we wanted to find the number of coffeeshops in each
neighborhood, we could do:
my $out = $db->get_collection('business')->aggregate(
[
{'$match' => {'type' => 'coffeeshop'}},
{'$group' => {'_id' => '$neighborhood', 'num_coffeshops' => {'$sum' => 1}}}
]
);
The SQL equivalent is "SELECT neighborhood, COUNT(*) FROM business GROUP BY
neighborhood WHERE type = 'coffeeshop'". After executing the above
aggregation query, $out will contain an array of result documents such as the
following:
[
{
'_id' => 'Soho',
'num_coffeshops' => 23
},
{
'_id' => 'Chinatown',
'num_coffeshops' => 14
},
{
'_id' => 'Upper East Side',
'num_coffeshops' => 10
},
{
'_id' => 'East Village',
'num_coffeshops' => 87
}
]
Note that "aggregate" in MongoDB::Collection takes an array reference
as an argument. Each element of the array is document which specifies a stage
in the aggregation pipeline. Here our aggregation query consists of a $match
phase followed by a $group phase. Use $match to filter the documents in the
collection prior to aggregation. The "_id" field in the $group stage
specifies the key to group by; the "$" in '$neighborhood' indicates
that we are referencing the name of a key. Finally, we use the $sum operator
to add one for every document in a particular neighborhood. There are other
operators, such as $avg, $max, $min, $push, and $addToSet, which can be used
in the $group phase and work much like $sum.
$project and $unwind¶
Now let's look at a more complex example of the aggregation framework that makes
use of the $project and $unwind pipeline operators. Suppose we have a
collection called 'courses' which contains information on college courses. An
example document in the collection looks like this:
{
'_id' => 'CSCI0170',
'name' => 'Computer Science 17',
'description' => 'An Integrated Introduction to Computer Science',
'instructor_id' => 29823498,
'instructor_name' => 'A. Greenwald',
'students' => [
{ 'student_id' => 91736114, 'student_name' => 'D. Storch' },
{ 'student_id' => 89100891, 'student_name' => 'J. Rassi' }
]
}
We wish to generate a report containing one document per student that indicates
the courses in which each student is enrolled. The following call to
"aggregate" in MongoDB::Collection will do the trick:
my $out = $db->get_collection('courses')->aggregate([
{'$unwind' => '$students'},
{'$project' => {
'_id' => 0,
'course' => '$_id',
'student_id' => '$students.student_id',
}
},
{'$group' => {
'_id' => '$student_id',
'courses' => {'$addToSet' => '$course'}
}
}
]);
The output documents will each have a student ID number and an array of the
courses in which that student is enrolled:
[
{
'_id' => 91736114,
'courses' => ['CSCI0170', 'CSCI0220', 'APMA1650', 'HIST1230']
}
{
'_id' => 89100891,
'courses' => ['CSCI0170', 'CSCI1670', 'CSCI1690']
}
]
The $unwind stage of the aggregation query "peels off" elements of the
courses array one-by-one and places them in their own documents. After this
phase completes, there is a separate document for each (course, student) pair.
The $project stage then throws out unnecessary fields and keeps the ones we
are interested in. It also pulls the student ID field out of its subdocument
and creates a top-level field with the key "student_id". Last, we
group by student ID, using $addToSet in order to add the unique courses for
each student to the "courses" array.
$sort, $skip, and $limit¶
The $sort, $skip, and $limit pipeline operators work much like their companion
methods in MongoDB::Cursor. Returning to the previous students and courses
example, suppose that we were particularly interested in the student with the
ID that is numerically third-to-highest. We could retrieve the course list for
that student by adding $sort, $skip, and $limit phases to the pipeline:
my $out = $db->get_collection('courses')->aggregate([
{'$unwind' => '$students'},
{'$project' => {
'_id' => 0,
'course' => '$_id',
'student_id' => '$students.student_id',
}
},
{'$group' => {
'_id' => '$student_id',
'courses' => {'$addToSet' => '$course'}
}
},
{'$sort' => {'_id' => -1}},
{'$skip' => 2},
{'$limit' => 1}
]);
Group¶
In addition to the aggregation framework, MongoDB offers a few special commands
for common aggregation tasks: "group", "distinct", and
"count".
Returning to the coffeeshop example, the same result could be obtained using
"group" with the following code:
my $reduce = <<REDUCE;
function(doc, prev) {
if (doc.type == "coffeeshop") {
prev["num coffeeshops"]++;
}
}
REDUCE
my $result = $db->run_command({group => {
'ns' => "business",
'key' => {"neighborhood" => 1},
'initial' => {"num coffeeshops" => 0},
'$reduce' => MongoDB::Code->new(code => $reduce)
Modern code should generally prefer the $group aggregation pipeline operator to
the "group" database command.
Distinct¶
The distinct command returns all values for a given key in a collection. For
example, suppose we had a collection with the following documents
("_id" value ignored):
{ 'name' => 'a', code => 1 }
{ 'name' => 'b', code => 1 }
{ 'name' => 'c', code => 2 }
{ 'name' => 'd', code => "3" }
If we wanted to see all of values in the "code" field, we could run:
my $result = $db->run_command([
"distinct" => "collection_name",
"key" => "code",
"query" => { }
]);
Notice that the arguments are in an array, to ensure that their order is
preserved. You could also use a Tie::IxHash.
"query" is an optional argument, which can be used to only run
"distinct" on specific documents. It takes a hash (or Tie::IxHash or
array) in the same form as "find($query)" in MongoDB::Collection.
Running "distinct" on the above collection would give you:
{
'ok' => '1',
'values' => [
1,
2,
"3"
]
};
MapReduce¶
For some special purpose aggregation tasks, the aggregation framework may not be
sufficient. In this case, the database server can execute special MapReduce
jobs written in JavaScript. Be warned: MapReduce is generally slower than the
aggregation framework and should be avoided unless your application requires
the flexibility that it provides.
This example counts the number of occurrences of each tag in a collection. Each
document contains a "tags" array that contains zero or more strings.
my $map = <<MAP;
function() {
this.tags.forEach(function(tag) {
emit(tag, {count : 1});
});
}
MAP
my $reduce = <<REDUCE;
function(prev, current) {
result = {count : 0};
current.forEach(function(item) {
result.count += item.count;
});
return result;
}
REDUCE
my $cmd = Tie::IxHash->new("mapreduce" => "foo",
"map" => $map,
"reduce" => $reduce);
my $result = $db->run_command($cmd);
See the MongoDB documentation on MapReduce for more information
(<
http://docs.mongodb.org/manual/core/map-reduce>).
QUERYING¶
Nested Fields¶
MongoDB allows you to store deeply nested structures and then query for fields
within them using
dot-notation. For example, suppose we have a users
collection with documents that look like:
{
"userId" => 12345,
"address" => {
"street" => "123 Main St",
"city" => "Springfield",
"state" => "MN",
"zip" => "43213"
}
}
If we want to query for all users from Springfield, we can do:
my $cursor = $users->find({"address.city" => "Springfield"});
This will search documents for an "address" field that is a
subdocument and a "city" field within the subdocument.
UPDATING¶
Positional Operator¶
In MongoDB 1.3.4 and later, you can use positional operator, "$", to
update elements of an array. For instance, suppose you have an array of user
information and you want to update a user's name.
A sample document in JavaScript:
{
"users" : [
{
"name" : "bill",
"age" : 60
},
{
"name" : "fred",
"age" : 29
},
]
}
The update:
$coll->update({"users.name" => "fred"}, {'users.$.name' => "george"});
This will update the array so that the element containing "name" =>
"fred" now has "name" => "george".
AUTHORS¶
- •
- David Golden <david.golden@mongodb.org>
- •
- Mike Friedman <friedo@mongodb.com>
- •
- Kristina Chodorow <kristina@mongodb.org>
- •
- Florian Ragwitz <rafl@debian.org>
COPYRIGHT AND LICENSE¶
This software is Copyright (c) 2014 by MongoDB, Inc..
This is free software, licensed under:
The Apache License, Version 2.0, January 2004