tags:
- "#database"
- "#mongodb-notes"
Notes On MongoDB
References:
* MongoDB in 30 minutes (old): https://www.youtube.com/watch?v=pWbMrx5rVBE
* Also: https://www.youtube.com/watch?v=-56x56UppqQ
* MongoDB cheat sheet for simple use cases: https://gist.github.com/bradtraversy/f407d642bdc3b31681bc7e56d95485b6
* MongoDB crash course 2022, Traversy Media: https://www.youtube.com/watch?v=2QQGWYe7IDU
* MongoDB schema design best practices: https://www.youtube.com/watch?v=leNCfU5SYR8
* 6 rules of thumb for MongoDB schema design: https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-1
* Schema design best practices (slides): https://speakerdeck.com/joekarlsson/mongodb-schema-design-best-practices
* Design patterns summary: https://www.mongodb.com/blog/post/building-with-patterns-a-summary
* Change Streams (video): https://www.youtube.com/watch?v=9LA7_CSyZb8
* Change Streams (What's new in MongoDB 3.6 slides): https://www.slideshare.net/mongodb/whats-new-in-mongodb-36-83345782
* SQL to MongoDB converter online: https://www.site24x7.com/tools/sql-to-mongodb.html
* Jake's schema design (slides): https://www.slideshare.net/mongodb/jakes-schema-design-houston-mug-20150311
* MongoDB query performance: https://studio3t.com/knowledge-base/articles/mongodb-query-performance/
* Aggregation cheatsheet: https://www.practical-mongodb-aggregations.com/appendices/cheatsheet.html
$ mkdir -p /data/db # Default data dir
$ mongod
$ mongo
MongoDB shell version v5.0.6
connecting to: mongodb://127.0.0.1:27017/
MongoDB server version: 5.0.6
================
Warning: the "mongo" shell has been superseded by "mongosh",
which delivers improved usability and compatibility.
The "mongo" shell has been deprecated and will be removed in
an upcoming release.
================
> db
test
> show databases
admin 0.000GB
config 0.000GB
dev 0.462GB
local 0.000GB
> use local
> show tables # Tables and Collections are synonymous
> show collections
startup_log
> use newdb # switches to (and lazily creates) the database; there is no separate create-db command
> show databases # will not show empty databases.
> db.posts.insert({
    title: 'Post One',
    body: 'Body of post one',
    category: 'News',
    tags: ['news', 'events'],
    user: {
      name: 'John Doe',
      status: 'author'
    },
    date: Date()
  })
> db.posts.insertMany([
    {
      title: 'Post Two',
      body: 'Body of post two',
      category: 'Technology',
      date: Date()
    },
    {
      title: 'Post Three',
      body: 'Body of post three',
      category: 'News',
      date: Date()
    },
    {
      title: 'Post Four',
      body: 'Body of post four',
      category: 'Entertainment',
      date: Date()
    }
  ])
# Collections are created on-the-fly.
> db.myNewCollection.insert( { "name" : "Thava", "msg" : "Hello!" })
WriteResult({ "nInserted" : 1 })
# You can create a capped collection using the createCollection method:
> db.createCollection("mySecondCollection", {
    capped : true,     // fixed-size collection with insertion-order retention
    size : 2097152,    // max collection size in bytes (here 2MB); this field is required
    max : 2 })         // max document count
# Note: If you insert more documents than the max count, only the latest docs remain;
# the oldest ones are removed (overwritten) to make room.
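# For example, a quick sketch of the overwrite behavior (field name n is just illustrative):
> db.mySecondCollection.insert({ n: 1 })
> db.mySecondCollection.insert({ n: 2 })
> db.mySecondCollection.insert({ n: 3 })
> db.mySecondCollection.find()  # only { n: 2 } and { n: 3 } remain; { n: 1 } was overwritten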
> db.myNewCollection.insertMany( [ { json1 }, {json2} ... ])
# List documents ...
> db.myNewCollection.insert( { "name" : "Thava2", "msg" : "Hello2!" })
> db.myNewCollection.find()
{ "_id" : ObjectId("6229c5d2f7b6acf25e81c6aa"), "name" : "Thava", "msg" : "Hello!" }
{ "_id" : ObjectId("6229c922f7b6acf25e81c6ab"), "name" : "Thava2", "msg" : "Hello2!" }
> db.myNewCollection.find().pretty() # Pretty print.
# Filter condition with operators such as $eq, $ne, $gt, $lt, $lte, $gte has this syntax...
> db.myNewCollection.find( { name: { $eq : 'Thava' } })
> db.myNewCollection.distinct( 'name' )
[ "Thava", "Thava2" ]
> db.myNewCollection.find({name: /ava/} )
... both 2 documents displayed ...
> db.myNewCollection.distinct( 'name' ).length # distinct returns an array, so use .length
2
# To replace the document with an entirely new set of attributes, do this:
> db.myNewCollection.update({name : 'Thava'}, {newName: 'New Me', planet: 'earth'});
# Note: The existing attribute "name" is gone after the update -- the whole document
# (except _id) gets replaced!!!
# To update specific field(s), use following form:
> db.myNewCollection.update({name : 'Thava'}, {$set: {age: 20}})
# Update does nothing if there is no match. However, if you specify {upsert:1},
# then it inserts a new document when there is no match!
# Following creates a new record with { name: 'Thava', age: 20 } !!!
> db.myNewCollection.update({name : 'Thava'}, {$set: {age: 20}}, {upsert:1})
WriteResult({
"nMatched" : 0,
"nUpserted" : 1,
"nModified" : 0,
"_id" : ObjectId("622e0cce6abf50ff9f577040")
})
> db.myNewCollection.find()
{ "_id" : ObjectId("622a436dc78724a63a99fe95"), "name" : "Thava2", "msg" : "Second Hello!" }
{ "_id" : ObjectId("622a4ae8c78724a63a99fe96"), "name" : "Thava5", "msg" : "Hello5!" }
{ "_id" : ObjectId("622e0cce6abf50ff9f577040"), "name" : "Thava", "age" : 20 }
# Following creates a new record with { age: 20 } only, since a regex condition cannot be copied into the upserted document!
> db.myNewCollection.update({name : /Nothing/}, {$set: {age: 20}}, {upsert:1})
# Delete a field from document ...
# Note that 1 in {age:1} is a dummy value which is ignored.
> db.myNewCollection.update({name : 'Thava'}, {$unset: {age:1}})
# If you want multi-delete to occur ...
> db.myNewCollection.update({name : 'Thava'}, // Filter condition
{$unset: {age:1}}, // Which field(s) to unset
{multi: true}); // Can we delete more than 1 record?
> db.myNewCollection.insert( { "name" : "Thava2", "msg" : "Duplicate Hello22!", age: 22 })
> db.myNewCollection.insert( { "name" : "Thava2", "msg" : "Duplicate Hello25!", age: 25 })
> db.myNewCollection.aggregate([{$group: {_id: "$name", total_by_name: {$sum:1} }}])
{ "_id" : "Thava2", "total_by_name" : 3 }
{ "_id" : "Thava5", "total_by_name" : 1 }
> db.myNewCollection.aggregate([{$group: {_id: "$name", ages: {$push: "$age"}} }])
{ "_id" : "Thava5", "ages" : [ ] }
{ "_id" : "Thava2", "ages" : [ 22, 25 ] }
> db.myNewCollection.aggregate([{$group: {_id: "$name", min_age: {$min: "$age"}} }])
{ "_id" : "Thava5", "min_age" : null }
{ "_id" : "Thava2", "min_age" : 22 }
# Other Aggregate operations: $first (pickup first value), $last, $avg, $max, $addToSet, etc.
# Pipeline operations: $project, $match, $group, $sort, $skip, $limit, $unwind
# The $unwind operation is a reverse group-by: it takes an array field and outputs one document per array element.
#
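# For example, a minimal $unwind sketch using the posts collection from above:
> db.posts.aggregate([ { $unwind: "$tags" } ])
# Result: one output document per tag, e.g. 'Post One' yields one doc with tags: "news"
# and another with tags: "events".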
# Text Search!
# A collection can only have one text search index, but that index can cover multiple fields.
# Atlas Search supports multiple text indexes per collection.
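# The $text query below requires a text index to exist (one is created in the indexing
# section later); a minimal sketch:
> db.posts.createIndex({ title: 'text' })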
db.posts.find({
  $text: {
    $search: "\"Post O\""
  }
})
# Remove document ...
> db.myNewCollection.remove({name: "Thava"});
> db.myNewCollection.find({});
# Drop the whole collection: all documents and indexes.
> db.myNewCollection.drop();
# To drop database itself ...
> db.dropDatabase()
> show dbs
> db
> use acme
> db.dropDatabase()
db.createCollection('posts')
show collections
# (Insert the sample posts as shown in the insert/insertMany examples above.)
db.posts.find()
db.posts.find().pretty()
db.posts.find({ category: 'News' })
# asc
db.posts.find().sort({ title: 1 }).pretty()
# desc
db.posts.find().sort({ title: -1 }).pretty()
db.posts.find().count()
db.posts.find({ category: 'news' }).count()
db.posts.find({ category: 'News' }).count()  // Filter is case sensitive
db.posts.find({ category: /^New/ }).count()  // An anchored regex can reuse an existing index on category.
db.posts.find().limit(2).pretty()
db.posts.find().limit(2).sort({ title: 1 }).pretty()
db.posts.find().forEach(function(doc) {
print("Blog Post: " + doc.title)
})
db.posts.findOne({ category: 'News' })
db.posts.find({ title: 'Post One' }, {
title: 1,
author: 1
})
db.posts.update({ title: 'Post Two' },
{
title: 'Post Two',
body: 'New body for post 2',
date: Date()
},
{
upsert: true, // Insert if not found
})
db.posts.update({ title: 'Post Two' },
{
$set: {
body: 'Body for post 2',
category: 'Technology'
}
})
db.posts.update({ title: 'Post Two' },
{
$inc: {
likes: 5
}
})
::
db.posts.update({ title: 'Post Two' }, { $rename: { likes: 'views' } })
db.posts.remove({ title: 'Post Four' })
db.posts.update({ title: 'Post One' },
{
$set: {
comments: [
{
body: 'Comment One',
user: 'Mary Williams',
date: Date()
},
{
body: 'Comment Two',
user: 'Harry White',
date: Date()
}
]
}
})
::
db.posts.find({
  comments: {
    $elemMatch: {
      user: 'Mary Williams'
    }
  }
})
# Create an ascending index on a string field
db.posts.createIndex({ title: 1})
# Create a text index ...
db.posts.createIndex({ title: 'text' }, {name: "title_idx", default_language: "english"})
# Create a Compound text index on title and body...
db.posts.createIndex({ title: 'text', body : 'text' }, {sparse: true})
# The second-arg options include:
#   unique: true,
#   sparse: true,
#   expireAfterSeconds: 3600,
#   hidden: true   // If true, the index is hidden from the query planner
When you create a regular index on a string field, it indexes the entire string value. It is mostly useful for exact matches on single-word strings (like a username for logins).
A text index, on the other hand, tokenizes and stems the content of the field.
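For example, a rough sketch of the difference, assuming the text index on title created above (English stemming maps "posts" to "post"):
db.posts.find({ title: 'Post One' })            // regular index: exact whole-string match only
db.posts.find({ $text: { $search: 'posts' } })  // text index: stemmed word match anywhere in the field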
db.posts.find({
  $text: {
    $search: "body post"  // Searches the compound text index for either word.
  }
})
# Full-text indexes are by default case- and diacritic-insensitive.
db.posts.find({
  $text: {
    $search: "\"Post O\""   // A quoted string inside the string means exact phrase search.
    // $caseSensitive: true       // Makes the text search case sensitive. Default is false.
    // $diacriticSensitive: true  // Default is false.
  }
})
// Find documents matching either word "body" or "post", but not containing "News":
db.posts.find( { $text: { $search: "body post -News" } } )
// Sort results by search score
db.posts.find(
{ $text: { $search: "body post " } }, // Search either one word
{ score: { $meta: "textScore" } } // Project score field. id is auto included.
).sort( { score: { $meta: "textScore" } } ) // {score:-1} means descending.
// Here score to be treated as special meta type.
::
db.posts.find({ views: { $gt: 2 } })
db.posts.find({ views: { $gte: 7 } })
db.posts.find({ views: { $lt: 7 } })
db.posts.find({ views: { $lte: 7 } })
See https://stackoverflow.com/questions/1863399/mongodb-is-it-possible-to-make-a-case-insensitive-query
When you create the index, specify a collation with strength 1 or 2 (the default 3 is case sensitive). A regex search is slow and not collation-aware, so it cannot do an indexed case-insensitive search.
::
db.cities.createIndex(
  { city: 1 },
  { collation: { locale: 'en', strength: 2 } }
  // strength: 1 - compare base chars only; 2 - also compare diacritics; 3 - also case (default)
);
// The query must specify the same collation to use this index:
db.cities.find({ city: "new york" }).collation({ locale: 'en', strength: 2 })
An inefficient way (which may not even always work) to do a case-insensitive search is:
db.cities.find({ city: /^new york$/i })
The above does a full collection scan and cannot use an index.
Best fit for:
* Big Data
* User Data Management
* Mobile and Social Infrastructure
Question: How to use indexes and joins?
MongoDB can run as a single instance, but it is typically run as a cluster.
In MongoDB, "cluster" can refer to two different architectures:
* a replica set, or
* a sharded cluster
A replica set has a single primary node (which takes the writes) and a couple of secondary nodes (which can serve reads).
A sharded cluster runs mongos as the router; mongos uses external config servers to understand the cluster layout and routes each query to the right shard, where each shard is a replica set (a set of mongod processes).
Usual stuff ...
Synopsis:
$ mongod --port 27017 --dbpath "D:\set up\mongodb\data" --replSet rs0 # Invoke Primary Node
$ mongo
> rs.initiate()
> rs.add("mongod1.net:27017"); # Add another secondary member
> db.isMaster() # Is this DB instance master or secondary ?
Sharding is a higher-level construct on top of replication. In a sharded cluster there are one or more shards, and each shard is a replica set.
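A minimal sketch of the shell helpers involved (database, collection, and shard key names are assumptions):
> sh.enableSharding("mydb")                         # enable sharding for a database
> sh.shardCollection("mydb.users", { user_id: 1 })  # shard a collection on a shard key
> sh.status()                                       # inspect shards and chunk distribution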
Other usual stuff ...
Application ----- mongos (sharding router) ---1:N--- mongod (single server or replica set)
A replica set has a read-write primary and read-only secondaries.
Replica Primary (mongod) ===1:3====== Read-Only Secondary (mongod)
Using JavaScript code:
const { MongoClient } = require("mongodb");
// "mongodb+srv://sample-hostname:27017/?maxPoolSize=20&w=majority";
const uri = "mongodb://localhost:27017";
// Create a new MongoClient
const client = new MongoClient(uri);
async function run() {
try {
// Connect the client to the server
await client.connect();
// Establish and verify connection
await client.db("admin").command({ ping: 1 });
console.log("Connected successfully to server");
} finally {
// Ensures that the client will close when you finish/error
await client.close();
}
}
run().catch(console.dir);
const database = client.db("sample_mflix");
const movies = database.collection("movies");
// Create an ascending index on the "type" and "genre" fields
// in the "movies" collection.
// *** The order of keys matters for compound indexes! JavaScript preserves object key insertion order these days. ***
const result = await movies.createIndex({ type: 1}); // Simple Index type "Ascending"
const result = await movies.createIndex({ type: 1, genre: 1 }); // Compound String Index
console.log(`Index created: ${result}`);
// To create text index ...
const result = await movies.createIndex({ type: "text"}); // Simple Text Index
// Wildcard text index will include all string fields in the index ...
const result = await movies.createIndex({ "$**": "text"}); // Wildcard Text Index
// Compound Index involving Ascending Key and Text Index Key ...
const result = await movies.createIndex({genre:1, "$**": "text"}); // Compound: ascending key + wildcard text
// The following is an example of a query that would be covered by the index created above.
const query = { type: "movie", genre: "Drama" };
const sort = { type: 1, genre: 1 };
const projection = { type: 1, genre: 1 };
const cursor = movies
.find(query)
.sort(sort)
.project(projection);
Creating Text indexes:
const database = client.db("sample_mflix");
const movies = database.collection("movies");
// Create a text index on the "fullplot" field in the
// "movies" collection.
const result = await movies.createIndex({ fullplot: "text" }, { default_language: "english" });
console.log(`Index created: ${result}`);
The following is an example of a query that would be covered by the index created above. Note that the sort is omitted because text indexes do not contain sort order:
const query = { $text: { $search: "java coffee shop" } };
const projection = { fullplot: 1 };
const cursor = movies
.find(query)
.project(projection);
Note: Normal field names cannot start with $, to avoid conflicts with operators like $text, etc.
Using mongo client:
# The _id field is indexed automatically; for any other field, create an index explicitly:
> db.myNewCollection.createIndex({KEY:1});
>
> db.myNewCollection.createIndex({"name":1}); # Create asc index on top level "name" field.
> db.myNewCollection.createIndex({"name":1, "age": -1}); # name asc, age desc index.
>
You can create index on Array fields and sub-document fields as well:
> db.users.createIndex({"tags":1}, {name: "tags_idx"}) // tags is an array field.
> db.users.find({tags:"cricket"})
> db.users.find({tags:"cricket"}).explain()
// Indexing Sub-Document Fields
> db.users.createIndex({"address.city":1,"address.state":1,"address.pincode":1})
> db.users.find({"address.city":"Los Angeles"})
> db.users.find({"address.city":"Los Angeles","address.state":"California"})
> db.users.find({"address.city":"LosAngeles","address.state":"California",
"address.pincode":"123"})
> db.users.getIndexes()
> db.users.dropIndex("tags_idx") // Drop an index by name.
Indexing can't be used (effectively) in queries which use:
* negation operators like $ne, $nin, $not
* arithmetic operators like $mod
* the $where clause (arbitrary JavaScript)
* unanchored regular expressions (e.g. /abc/ rather than /^abc/)
An ObjectId is a 12-byte BSON type having the following structure:
* a 4-byte timestamp (seconds since the Unix epoch)
* a 5-byte random value (unique to the machine and process)
* a 3-byte incrementing counter, initialized to a random value
Note: BSON is an extension of JSON with additional (binary) data types.
To generate a new ObjectId use the following code:
// Generate new object Id
> newObjectId = ObjectId()
ObjectId("5349b4ddd2781d08c09890f3")
// You can also provide a 12-byte id −
> myObjectId = ObjectId("5349b4ddd2781d08c09890f4")
> ObjectId("5349b4ddd2781d08c09890f4").getTimestamp()
ISODate("2014-04-12T21:49:17Z")
> newObjectId.str
5349b4ddd2781d08c09890f3
By default, collections can hold heterogeneous documents: each document can have different fields, and the same field can hold different data types. From version 3.2, you can enforce schema validation. (May benefit from some sort of controlled schema rules.)
db.createCollection("students", {
validator: {
$jsonSchema: {
bsonType: "object",
required: [ "name", "year", "major", "address" ],
properties: {
name: {
bsonType: "string",
description: "must be a string and is required"
},
major: {
enum: [ "Math", "English", "Computer Science", "History", null ],
description: "can only be one of the enum values and is required"
}, ...
}
}
}
})
See Also:
* db.createCollection() with the validator option
* Validation for an existing collection: the collMod command with the validator option.
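A minimal collMod sketch for adding a validator to an existing collection (the rule shown is illustrative):
db.runCommand({
  collMod: "students",
  validator: { $jsonSchema: {
    bsonType: "object",
    required: [ "name" ],
    properties: { name: { bsonType: "string" } }
  } },
  validationLevel: "moderate"  // don't re-check existing invalid documents on update
})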
Writes to a single document are always atomic. Writes across multiple documents, however, need the transactions API to be atomic.
// For a replica set, include the replica set name and a seedlist of the members in the URI string; e.g.
// const uri = 'mongodb://mongodb0.example.com:27017,mongodb1.example.com:27017/?replicaSet=myRepl'
// For a sharded cluster, connect to the mongos instances; e.g.
// const uri = 'mongodb://mongos0.example.com:27017,mongos1.example.com:27017/'
const client = new MongoClient(uri);
await client.connect();
// Prereq: Create collections.
await client
.db('mydb1')
.collection('foo')
.insertOne({ abc: 0 }, { writeConcern: { w: 'majority' } });
await client
.db('mydb2')
.collection('bar')
.insertOne({ xyz: 0 }, { writeConcern: { w: 'majority' } });
// Step 1: Start a Client Session
const session = client.startSession();
// Step 2: Optional. Define options to use for the transaction
const transactionOptions = {
readPreference: 'primary',
readConcern: { level: 'local' },
writeConcern: { w: 'majority' }
};
// Step 3: Use withTransaction to start a transaction, execute the callback, and commit (or abort on error)
// Note: The callback for withTransaction MUST be async and/or return a Promise.
try {
await session.withTransaction(async () => {
const coll1 = client.db('mydb1').collection('foo');
const coll2 = client.db('mydb2').collection('bar');
// Important:: You must pass the session to the operations
await coll1.insertOne({ abc: 1 }, { session });
await coll2.insertOne({ xyz: 999 }, { session });
}, transactionOptions);
} finally {
await session.endSession();
await client.close();
}
There are aggregation stages and operators (a.k.a. expressions).
Examples of aggregation stages:
For unsharded collection, you can use helper methods like distinct directly::
db.coll.distinct("x")
For sharded collection, use the aggregation pipeline with the $group stage instead::
db.coll.aggregate([
  { $group: { _id: null, distinctValues: { $addToSet: "$x" } } },
  { $project: { _id: 0 } }
])
The pipeline returns a cursor to a document:: { "distinctValues" : [ 2, 3, 1 ] }
The aggregation JavaScript object is of class AggregationCursor:
cursor = db.coll.aggregate([]);
console.dir(cursor)      // Can inspect the object
typeof(cursor)           // 'object'
cursor.constructor       // [class AggregationCursor extends AbstractCursor]
cursor.constructor.name  // 'AggregationCursor'
To get results from aggregate cursor, do one of the following:
- cursor.forEach( (doc) => console.log(doc), (err) => console.log('Error: ', err ));
- let results = cursor.toArray()
- var myCursor = collection.aggregate(pipeline);
  while (myCursor.hasNext()) { print(tojson(myCursor.next())); }
Map-reduce is deprecated. Use $accumulator and $function instead. Note that Atlas free clusters do not have the security.javascriptEnabled parameter enabled; this is a server-side configuration and cannot be changed through any API calls.
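A minimal $function sketch (the sales collection and price field are assumptions; requires server-side JavaScript to be enabled):
db.sales.aggregate([
  { $project: {
      discounted: {
        $function: {
          body: function(price) { return price * 0.9; },  // arbitrary JS run per document
          args: [ "$price" ],
          lang: "js"
        }
      }
  } }
])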
Conversion operators: $convert, $toBool, $toDate, $dateFromString, ... Example:
{ $convert:
{
input: "$price",
to: "decimal",
onError: "Error",
onNull: Decimal128("0")
}
}
Write output to a new collection: { $out: "<output-collection>" }
Also: { $out: { db: "<output-db>", coll: "<output-collection>" } }
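For example, a sketch materializing per-category counts of the posts collection (the output collection name is an assumption):
db.posts.aggregate([
  { $group: { _id: "$category", n: { $sum: 1 } } },
  { $out: "posts_by_category" }  // $out must be the last stage; it replaces the target collection
])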
const collection = db.collection('inventory');
const changeStream = collection.watch();
changeStream.on('change', next => {
// process next document
});
// Or use iterator style ...
const changeStreamIterator = collection.watch();
const next = await changeStreamIterator.next();
> db.collection.mapReduce(
     function() { emit(key, value); },                 // map function
     function(key, values) { return reduceFunction; }, // reduce function
     {
       out: collection,
       query: document,
       sort: document,
       limit: number
     }
  )
$ mongodump      # Back up all databases into the ./dump/ directory (relative to cwd).
$ mongodump --host <hostname> --port <port> --out <backup_dir> --db <db_name> --collection <collection>
$ mongorestore   # Restore from the ./dump/ directory.
::
$ mongostat
$ mongotop
Consider one person having multiple addresses:
Person <---1:N---- Address
::

> db.myNewCollection.insert( { "name" : "Thava10",
      "address": [ { line1: "Bharathi Street",  pin: 627001 },
                   { line1: "Bharathi2 Street", pin: 627002 } ] });
> db.myNewCollection.find({"name": "Thava10"});
# The second arg of find selects a subset of fields (a projection). Here "name" does not appear in the result:
> db.myNewCollection.find({"name": "Thava10"}, {"address":1});
{ "_id" : ObjectId("622adea7ad8473e2f3e91c9d"),   // "name" field missing in output
  "address" : [ ... ] }
Embed array of addresses in Person record like below:
{
"_id":ObjectId("52ffc33cd85242f436000001"),
"name": "John Smith",
"address": [ {
"line1": "22 A, Indiana Apt",
"pin": 123456
}, ...
]
}
Or we can reference addresses stored in a separate collection by their ids:
{
   "_id":ObjectId("52ffc33cd85242f436000001"),
   "name": "John Smith",
   "address_ids": [
      ObjectId("..."),
      ObjectId("...")
   ]
}
> db.myNewCollection.update({name : 'Thava10'},
{$set: {address_ids: [100, 200, 300]}},
{multi: true});
> let result = db.myNewCollection.findOne({"name": /Thava10/},{"address_ids":1})
> db.myAddress.insertMany([{_id: 100, line1 : 'Addr100 Line1', pin: 100001},
{_id: 200, line1 : 'Addr200 Line1', pin: 200001},
{_id: 300, line1 : 'Addr300 Line1', pin: 300001}
]);
{ "acknowledged" : true, "insertedIds" : [ 100, 200, 300 ] }
> let addresses = db.myAddress.find({"_id":{"$in":result["address_ids"]}})
> addresses
{ "_id" : 100, "line1" : "Addr100 Line1", "pin" : 100001 }
{ "_id" : 200, "line1" : "Addr200 Line1", "pin" : 200001 }
{ "_id" : 300, "line1" : "Addr300 Line1", "pin" : 300001 }
We use DBRefs when the references may point to different collections; e.g. an address may point to entries in a home_address collection or an office_address collection. The referenced collection name is thus not static but a per-document dynamic value.
There are three fields in DBRefs:
* $ref − This field specifies the collection of the referenced document
* $id − This field specifies the _id field of the referenced document
* $db − This optional field points to the database of the referenced document
Consider a sample user document having DBRef field address as shown in the code snippet:
{
"_id":ObjectId("53402597d852426020000002"),
"address": {
"$ref": "address_home",
"$id": ObjectId("534009e4d852427820000002"),
"$db": "tutorialspoint"},
"contact": "987654321",
"dob": "01-01-1991",
"name": "Tom Benzamin"
}
To look up the references:
> let user = db.users.findOne({"name":"Tom Benzamin"})
> let dbRef = user.address
> db[dbRef.$ref].findOne({"_id":(dbRef.$id)})
Note: $ref, $id and $db are not special keywords in MongoDB; they are just the convention used to implement the DBRef concept.
Covered queries can be answered from the index alone, without even reading the documents on disk.
* Indexes are typically in RAM, so this is fast.
* Compound indexes are used for these.
* You can exclude the _id field from the result by specifying { _id: 0 } in the projection. This may allow an index to be a covering index for the query.
Example:
> db.users.createIndex({gender:1,user_name:1})
> db.users.find({gender:"M"},{user_name:1,_id:0}) // Covered query
> db.users.find({gender:"M"},{user_name:1}) // Not a covered query since lookup id needed.
> db.users.find({gender:"M"},{user_name:1,_id:0}).explain()
{
"cursor" : "BtreeCursor gender_1_user_name_1", // For TableScan, it is like "BasicCursor *"
"isMultiKey" : false,
"n" : 1, // Total docs matched!
"nscannedObjects" : 0, // Total docs really scanned.
"nscanned" : 1, // Total docs OR index looked up.
"nscannedObjectsAllPlans" : 0,
"nscannedAllPlans" : 1, "scanAndOrder" : false,
"indexOnly" : true, // <===== Note this.
"nYields" : 0, "nChunkSkips" : 0, "millis" : 0,
"indexBounds" : {
"gender" : [ [ "M", "M" ] ],
"user_name" : [ [ { "$minElement" : 1 }, { "$maxElement" : 1 } ] ]
}
}
> db.users.find({gender:"M"},{user_name:1,_id:0}).hint({gender:1,user_name:1})
> db.users.find({gender:"M"},{user_name:1,_id:0}).hint({gender:1,user_name:1}).explain()
Also See: https://emptysqua.re/blog/optimizing-mongodb-compound-indexes/ (rule of thumb: an index pays off most when it is highly selective, e.g. at least 90%).
Views:
db.createView(
  "<viewName>",
  "<source>",
  [<pipeline>],
  { "collation" : { <collation> } }
)
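For example, a sketch of a view over the posts collection (the view name is an assumption):
db.createView("newsPosts", "posts", [ { $match: { category: "News" } } ])
db.newsPosts.find()  // runs the underlying pipeline on demand; views are read-only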
Technically, a materialized view is a collection, usually created by a $merge stage of a pipeline.
On-Demand Materialized View:
updateMonthlySales = function(startDate) {
  db.bakesales.aggregate( [
    { $match: { date: { $gte: startDate } } },
    { $group: { _id: { $dateToString: { format: "%Y-%m", date: "$date" } },
                sales_quantity: { $sum: "$quantity" },
                sales_amount: { $sum: "$amount" } } },
    { $merge: { into: "monthlybakesales", whenMatched: "replace" } }
  ] );
};
updateMonthlySales(new ISODate("1970-01-01"));
After the initial run, the monthlybakesales contains the following documents;
i.e. db.monthlybakesales.find().sort( { _id: 1 } ) returns the following:
{ "_id" : "2018-12", "sales_quantity" : 41, "sales_amount" : NumberDecimal("506") }
{ "_id" : "2019-01", "sales_quantity" : 86, "sales_amount" : NumberDecimal("896") }
db.bakesales.insertMany( [
{ date: new ISODate("2019-01-28"), item: "Cake - Chocolate", quantity: 3, amount: new NumberDecimal("90") },
...
{ date: new ISODate("2019-02-03"), item: "Cake - Red Velvet", quantity: 5, amount: new NumberDecimal("100") }
] )
updateMonthlySales(new ISODate("2019-01-01"));
// Note Jan month aggregation updated (replaced) and Feb added.
{ "_id" : "2018-12", "sales_quantity" : 41, "sales_amount" : NumberDecimal("506") }
{ "_id" : "2019-01", "sales_quantity" : 102, "sales_amount" : NumberDecimal("1142") }
{ "_id" : "2019-02", "sales_quantity" : 15, "sales_amount" : NumberDecimal("284") }
The $merge stage (used above) writes the pipeline output into the target collection, merging with existing documents according to whenMatched.
In the simplest case, with a single condition, you can use:
db.bakesales.aggregate( [ { $match: { date: { $gte: startDate } } } ] )
db.inventory.find( { status: "A", qty: { $lt: 30 } } ) // Implicit And Condition
db.inventory.find( { // Combination of And and OR
status: "A",
$or: [ { qty: { $lt: 30 } }, { item: /^p/ } ]
} )
For complex conditions you need $expr:
db.testexample.find({ $expr: { $eq: ["$dateFieldYear", "$$dateFieldYear"] } })  // query operator
db.testexample.aggregate({ $match: { $expr: { $eq: ["$dateFieldYear", { "$year": "$dateField" }] } } })
The $match query syntax is identical to the read operation query syntax; i.e. $match does not accept raw aggregation expressions. To include aggregation expression in $match, use a $expr query expression.
"$<field>" is equivalent to "$$CURRENT.<field>" where the CURRENT is a system variable that defaults to the root of the current object, unless stated otherwise in specific stages. $$CURRENT and $$ROOT is same by default.
An operator acts as an update operator when it is used in an update operation:
db.scores.updateOne( { _id: 1 }, { $min: { lowScore: 150 } } )
Note that here $min is not an aggregation operator but an update operator like $set -- it sets lowScore to min(lowScore, 150).
A pipeline stage is different from an operator expression. E.g. $match is a pipeline stage, $and is a query operator, and $sum is an aggregation expression. But note that $count is not an aggregation expression but a pipeline stage; i.e. { $count: 'myTotal' } at the top level creates a new field holding the total count.
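A quick sketch using the posts collection:
db.posts.aggregate([
  { $match: { category: 'News' } },
  { $count: 'myTotal' }  // emits a single document: { myTotal: <number of matching docs> }
])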
Examples of Aggregation operator but not a query operator:
$first, $addToSet, $concat, etc.
Examples where same operator exists as Query operator also as Aggregation Operator:
$and (query Operator): db.inventory.find( { $and: [ { price: { $ne: 1.99 } },
{ price: { $exists: true } } ] } )
$and (aggregation) :
db.inventory.aggregate(
[
{
$project:
{
item: 1,
result: { $and: [ { $gt: [ "$qty", 100 ] }, { $lt: [ "$qty", 250 ] } ] }
}
}
]
)
BI Connector is used to translate SQL syntax to mongo DB queries. To see how it maps json array and subdocuments and array of subdocuments to new tables, See this link: https://www.mongodb.com/blog/post/bi-connector-mapping-document-structures-to-relational-schema-part-1
Consider a collection called bands which contains entries like:
{
"_id": ObjectId("5bfabde76f280102ddf27969"),
"band": "Slayer",
"formation": { "year": ISODate("1982-01-01T00:00:00Z"), "city": "Los Angeles" },
"popular_albums": [ "Show No Mercy!", "Seasons in the Abyss" ],
"members": [ { "name": "Tom Araya", "instrument": "Bass/Vocals" }, ... ]
}
mysql> show tables;
+-----------------------------------------+
| Tables_in_sql |
+-----------------------------------------+
| bands |
| bands_members |
| bands_popular_albums |
+-----------------------------------------+
Note: Both arrays became new tables. Subdocuments are just flattened.
MongoDB Atlas is the cloud offering of MongoDB.
Atlas Search is an Atlas-only feature.
Regular MongoDB supports only one text index per collection, but Atlas Search supports multiple search indexes.
You can search by phrase or by words. The words condition is neither a strict "AND" nor "OR": it takes all the words into account and sorts the output by relevance.
It automatically stems the keywords.
Note that $search is a pipeline stage (an Atlas-only feature) and $text is a query operator (a core MongoDB indexing feature).
For partial word matches you can use a wildcard query:
db.collection.aggregate([
  { $search: {
      wildcard: {
        query: '*tex*',
        path: { 'wildcard': '*' },
        allowAnalyzedField: true
      }
  } }
]);
Phrase search involves quoting inside string e.g. '"My String"'
When you combine phrase search and word search, you should use compound operator:
db.collection.aggregate([
{
$search: {
'compound': {
'should': [
{
'phrase': {
'path': '<field_path>',
'query': '[<first_phrase>, <second_phrase>]'
}
},
{
'text': {
'path': '<field_path>',
'query': '[<first_word>, <second_word>]'
}
}
]
}
}
}
])
Atlas Search is MongoDB's full-text search feature built on Apache Lucene. It allows you to perform advanced text search operations on your MongoDB data without needing a separate search engine.
Compared to MongoDB's traditional text indexes:
* Functionality: Atlas Search adds fuzzy matching, autocomplete, highlighting, faceting and fine-grained relevance scoring; traditional text indexes offer only basic tokenized/stemmed word and phrase matching.
* Performance: Atlas Search maintains Lucene inverted indexes kept in sync with the data, which generally handle complex full-text workloads better.
* Setup and maintenance: Atlas Search requires using MongoDB Atlas (cloud); traditional text indexes can be used in any MongoDB deployment (including self-hosted).
* Query syntax: Atlas Search uses the $search aggregation stage; traditional text search uses the $text query operator.
* Language support: Atlas Search provides many language-specific analyzers; traditional text indexes support a fixed set of languages via default_language.