Install MongoDB
Introduction to MongoDB
Documents
A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.
Collections
MongoDB stores documents in collections. Collections are analogous to tables in relational databases. Unlike a table, however, a collection does not require its documents to have the same schema.
In MongoDB, documents stored in a collection must have a unique _id field that acts as a primary key.
Import Example Dataset
Python Driver (PyMongo)
Insert Data with PyMongo
If you attempt to add documents to a collection that does not exist, MongoDB will create the collection for you.
from datetime import datetime
# Insert one restaurant document; db is an existing PyMongo database handle.
result = db.restaurants.insert_one(
    {
        "address": {
            "street": "2 Avenue",
            "zipcode": "10075",
            "building": "1480",
            "coord": [-73.9557413, 40.7720266]
        },
        "borough": "Manhattan",
        "cuisine": "Italian",
        "grades": [
            {
                "date": datetime.strptime("2014-10-01", "%Y-%m-%d"),
                "grade": "A",
                "score": 11
            },
            {
                "date": datetime.strptime("2014-01-16", "%Y-%m-%d"),
                "grade": "B",
                "score": 17
            }
        ],
        "name": "Vella",
        "restaurant_id": "41704620"
    }
)
The operation returns an InsertOneResult object, which includes an attribute inserted_id that contains the _id of the inserted document. Access the inserted_id attribute: result.inserted_id
The ObjectId of your inserted document will differ from the one shown.
Find or Query Data with PyMongo
db.coll.find(filter=None, projection=None, skip=0, limit=0, no_cursor_timeout=False, cursor_type=CursorType.NON_TAILABLE, sort=None, allow_partial_results=False, oplog_replay=False, modifiers=None, manipulate=True)
projection (optional): A list of field names that should be returned in the result document, or a mapping specifying the fields to include or exclude. If projection is a list, "_id" will always be returned. Use a mapping to exclude fields from the result (e.g. projection={'_id': False}).
Specify Equality Conditions
{ <field1>: <value1>, <field2>: <value2>, … }
If the <field> is in an embedded document or an array, use dot notation to access the field.
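To make the dot notation concrete, here is a minimal plain-Python sketch of how a dotted path can be resolved against a document. It is an illustration only, not MongoDB's implementation; the helper name resolve and the sample document are assumptions made for this example.

```python
def resolve(document, path):
    """Return every value reachable at a dot-notation path.

    Arrays are traversed element by element, so "grades.grade" yields the
    grade of each embedded document in the grades array.
    """
    values = [document]
    for part in path.split("."):
        next_values = []
        for value in values:
            if isinstance(value, dict) and part in value:
                next_values.append(value[part])
            elif isinstance(value, list):
                # "<array>.<index>" form: pick one element by position.
                if part.isdigit() and int(part) < len(value):
                    next_values.append(value[int(part)])
                # Otherwise look for the field inside each embedded document.
                for item in value:
                    if isinstance(item, dict) and part in item:
                        next_values.append(item[part])
        values = next_values
    return values


# Hypothetical sample document, shaped like the restaurants data.
restaurant = {
    "address": {"zipcode": "10075"},
    "grades": [{"grade": "A", "score": 11}, {"grade": "B", "score": 17}],
}

zipcodes = resolve(restaurant, "address.zipcode")
grades = resolve(restaurant, "grades.grade")
first_score = resolve(restaurant, "grades.0.score")
# An equality condition on an array field matches if any element matches.
matches_b = "B" in grades
```

An equality filter such as {"grades.grade": "B"} behaves like the last line: the document matches when at least one resolved value equals "B".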
Query by a Top Level Field
cursor = db.restaurants.find({"borough": "Manhattan"})
Query by a Field in an Embedded Document
cursor = db.restaurants.find({"address.zipcode": "10075"})
Query by a Field in an Array
The following queries for documents whose grades array contains an embedded document with a field grade equal to "B".
cursor = db.restaurants.find({"grades.grade": "B"})
Specify Conditions with Operators
{ <field1>: { <operator1>: <value1> } }
cursor = db.restaurants.find({"grades.score": {"$gt": 30}})
Comparison: $eq $gt $gte $lt $lte $ne $in $nin
Logical: $or $and $not $nor
Element: $exists (Matches documents that have the specified field.) $type (Selects documents if a field is of the specified type.)
Evaluation: $mod (Performs a modulo operation on the value of a field and selects documents with a specified result.) $regex $text $where
Geospatial: $geoWithin $geoIntersects $near $nearSphere
Array: $all $elemMatch $size
Bitwise: $bitsAllSet $bitsAnySet $bitsAllClear $bitsAnyClear
Comments: $comment
Projection Operators: $ $elemMatch $meta $slice
Logical AND
You can specify a logical conjunction (AND) for a list of query conditions by separating the conditions with a comma in the conditions document.
cursor = db.restaurants.find({"cuisine": "Italian", "address.zipcode": "10075"})
Logical OR
cursor = db.restaurants.find({"$or": [{"cuisine": "Italian"}, {"address.zipcode": "10075"}]})
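The difference between the implicit AND and $or can be sketched in plain Python. This is a simplified emulation that only handles equality on top-level fields; the helper matches, the flat zipcode field, and the sample documents (including the restaurant named Roma) are assumptions for illustration.

```python
def matches(document, conditions):
    """Check flat equality conditions, with optional "$or" support."""
    for key, expected in conditions.items():
        if key == "$or":
            # At least one of the listed condition documents must match.
            if not any(matches(document, branch) for branch in expected):
                return False
        elif document.get(key) != expected:
            # Implicit AND: every comma-separated condition must hold.
            return False
    return True


# Hypothetical, flattened sample documents.
docs = [
    {"name": "Vella", "cuisine": "Italian", "zipcode": "10075"},
    {"name": "Juni", "cuisine": "American (New)", "zipcode": "10075"},
    {"name": "Roma", "cuisine": "Italian", "zipcode": "11201"},
]

and_names = [d["name"] for d in docs
             if matches(d, {"cuisine": "Italian", "zipcode": "10075"})]
or_names = [d["name"] for d in docs
            if matches(d, {"$or": [{"cuisine": "Italian"}, {"zipcode": "10075"}]})]
```

The AND filter keeps only documents satisfying both conditions, while the $or filter keeps any document satisfying at least one of them.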
Sort Query Results
To specify an order for the result set, append the sort() method to the query. Pass to the sort() method a list of (field, direction) pairs naming the field(s) to sort by and the corresponding sort type, e.g. pymongo.ASCENDING for ascending and pymongo.DESCENDING for descending.
import pymongo
cursor = db.restaurants.find().sort([
    ("borough", pymongo.ASCENDING),
    ("address.zipcode", pymongo.ASCENDING)
])
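As a rough mental model, a multi-key sort orders by the first key and breaks ties with the next one. The plain-Python sketch below reproduces that ordering on a small in-memory list; the sample documents are made up, and this says nothing about how the server executes the sort.

```python
# Hypothetical sample documents.
docs = [
    {"name": "c", "borough": "Queens", "address": {"zipcode": "11101"}},
    {"name": "a", "borough": "Manhattan", "address": {"zipcode": "10075"}},
    {"name": "b", "borough": "Manhattan", "address": {"zipcode": "10003"}},
]

# Both keys ascending: compare (borough, zipcode) tuples.
ordered = sorted(docs, key=lambda d: (d["borough"], d["address"]["zipcode"]))
names = [d["name"] for d in ordered]

# Mixed directions (borough ascending, zipcode descending): sort by the
# last key first, then rely on Python's stable sort for the earlier key.
by_zip_desc = sorted(docs, key=lambda d: d["address"]["zipcode"], reverse=True)
mixed = sorted(by_zip_desc, key=lambda d: d["borough"])
mixed_names = [d["name"] for d in mixed]
```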
find_one(filter=None, *args, **kwargs)
find_one_and_delete(filter, projection=None, sort=None, **kwargs)
>>> db.test.find_one_and_delete(
...     {'x': 1}, sort=[('_id', pymongo.DESCENDING)])
{u'x': 1, u'_id': 2}
The projection option can be used to limit the fields returned.
>>> db.test.find_one_and_delete({'x': 1}, projection={'_id': False})
{u'x': 1}
find_one_and_replace(filter, replacement, projection=None, sort=None, return_document=ReturnDocument.BEFORE, **kwargs)
>>> db.test.find_one_and_replace({'x': 1}, {'y': 1})
find_one_and_update(filter, update, projection=None, sort=None, return_document=ReturnDocument.BEFORE, **kwargs)
>>> db.test.find_one_and_update(
...     {'_id': 665}, {'$inc': {'count': 1}, '$set': {'done': True}})
The upsert option can be used to create the document if it doesn't already exist.
>>> db.example.find_one_and_update(
...     {'_id': 'userid'},
...     {'$inc': {'seq': 1}},
...     projection={'seq': True, '_id': False},
...     upsert=True,
...     return_document=ReturnDocument.AFTER)
>>> db.test.find_one_and_update(
...     {'done': True},
...     {'$set': {'final': True}},
...     sort=[('_id', pymongo.DESCENDING)])
Update Data with PyMongo
update_one(filter, update, upsert=False, bypass_document_validation=False, collation=None)
update_many(filter, update, upsert=False, bypass_document_validation=False, collation=None)
result = db.restaurants.update_one(
{"name": "Juni"},
{
"$set": {
"cuisine": "American (New)"
},
"$currentDate": {"lastModified": True}
}
)
The operation returns an UpdateResult object that reports the count of documents matched and modified.
Attributes:
result.matched_count
result.modified_count
replace_one(filter, replacement, upsert=False, bypass_document_validation=False, collation=None)
>>> for doc in db.test.find({}):
... print(doc)
...
{u'x': 1, u'_id': ObjectId('54f4c5befba5220aa4d6dee7')}
>>> result = db.test.replace_one({'x': 1}, {'y': 1})
>>> result.matched_count
1
>>> result.modified_count
1
>>> for doc in db.test.find({}):
... print(doc)
...
{u'y': 1, u'_id': ObjectId('54f4c5befba5220aa4d6dee7')}
The replace_one operation returns an UpdateResult object, which has matched_count and modified_count attributes.
Update Operators
Fields:
$inc
$mul
$rename
$setOnInsert
$unset Removes the specified field from a document.
$min Only updates the field if the specified value is less than the existing field value.
$max
$currentDate
Operators:
$ Acts as a placeholder to update the first element that matches the query condition in an update.
$addToSet Adds elements to an array only if they do not already exist in the set.
$pop Removes the first or last item of an array.
$pullAll Removes all matching values from an array.
$pull Removes all array elements that match a specified query.
$push
Modifiers
$each Modifies the $push and $addToSet operators to append multiple items for array updates.
$slice Modifies the $push operator to limit the size of updated arrays.
$sort Modifies the $push operator to reorder documents stored in an array.
$position Modifies the $push operator to specify the position in the array to add elements.
Bitwise:
$bit Performs bitwise AND, OR, and XOR updates of integer values.
Isolation:
$isolated Modifies the behavior of a write operation to increase the isolation of the operation.
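To show what a few of the field operators above do to a document, here is a small plain-Python emulation covering $set, $inc, $unset, and $min on top-level fields. The helper apply_update and the sample document are assumptions for illustration; the real operators run on the server and handle many more cases.

```python
def apply_update(document, update):
    """Apply a small subset of update operators to a copy of a document."""
    result = dict(document)
    for field, value in update.get("$set", {}).items():
        result[field] = value
    for field, amount in update.get("$inc", {}).items():
        # A missing field is treated as starting from zero.
        result[field] = result.get(field, 0) + amount
    for field in update.get("$unset", {}):
        result.pop(field, None)
    for field, value in update.get("$min", {}).items():
        # Only update when the new value is less than the existing one.
        if field not in result or value < result[field]:
            result[field] = value
    return result


# Hypothetical document and update specification.
before = {"_id": 665, "count": 2, "draft": True, "low": 10}
after = apply_update(before, {
    "$set": {"done": True},
    "$inc": {"count": 1},
    "$unset": {"draft": ""},
    "$min": {"low": 7},
})
```

The update specification has the same shape as the one passed to update_one() earlier: each operator maps to a document of field and value pairs.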
Remove Data with PyMongo
delete_one(filter, collation=None)
delete_many(filter, collation=None)
Both methods return an instance of DeleteResult.
>>> db.test.count({'x': 1})
3
>>> result = db.test.delete_one({'x': 1})
>>> result.deleted_count
1
>>> db.test.count({'x': 1})
2
>>> db.test.count({'x': 1})
3
>>> result = db.test.delete_many({'x': 1})
>>> result.deleted_count
3
>>> db.test.count({'x': 1})
0
The operation returns a DeleteResult, which reports the number of documents removed in its deleted_count attribute.
Remove All Documents
To remove all documents from a collection, pass an empty conditions document {} to the delete_many() method.
The remove all operation only removes the documents from the collection. The collection itself, as well as any indexes for the collection, remain.
Drop a Collection
db.restaurants.drop()
To remove all documents, it may be more efficient to drop the entire collection, including the indexes, and then recreate the collection and rebuild the indexes.
The following two calls are equivalent:
>>> db.foo.drop()
>>> db.drop_collection("foo")
Atomicity
In MongoDB, a write operation is atomic on the level of a single document, even if the operation modifies multiple embedded documents within a single document.
When a single write operation modifies multiple documents, the modification of each document is atomic, but the operation as a whole is not atomic and other operations may interleave. However, you can isolate a single write operation that affects multiple documents using the $isolated operator.
Transaction-Like Semantics
Since a single document can contain multiple embedded documents, single-document atomicity is sufficient for many practical use cases. For cases where a sequence of write operations must operate as if in a single transaction, you can implement a two-phase commit in your application.
However, two-phase commits can only offer transaction-like semantics. Using two-phase commit ensures data consistency, but it is possible for applications to return intermediate data during the two-phase commit or rollback.
Data Aggregation with PyMongo
aggregate(pipeline, **kwargs)
db.collection.aggregate([<stage1>, <stage2>, …])
Group Documents by a Field and Calculate Count
{
    "result" : [
        {
            "_id" : "w3cschool.cc",
            "count" : 2
        },
        {
            "_id" : "Neo4j",
            "count" : 1
        }
    ],
    "ok" : 1
}
Filter and Group Documents
Use the $match stage to filter documents. $match uses the MongoDB query syntax. Then the $group stage groups the matching documents by the address.zipcode field and uses the $sum accumulator to calculate the count.
cursor = db.restaurants.aggregate(
[
{"$match": {"borough": "Queens", "cuisine": "Brazilian"}},
{"$group": {"_id": "$address.zipcode", "count": {"$sum": 1}}}
]
)
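The $match then $group pipeline above can be pictured as a filter followed by a count per key. The following plain-Python sketch reproduces that logic on an in-memory list; the sample restaurants and zipcodes are invented for illustration, and the real pipeline runs on the server.

```python
from collections import Counter

# Hypothetical sample data shaped like the restaurants collection.
restaurants = [
    {"borough": "Queens", "cuisine": "Brazilian", "address": {"zipcode": "11368"}},
    {"borough": "Queens", "cuisine": "Brazilian", "address": {"zipcode": "11106"}},
    {"borough": "Queens", "cuisine": "Brazilian", "address": {"zipcode": "11368"}},
    {"borough": "Queens", "cuisine": "Italian", "address": {"zipcode": "11368"}},
    {"borough": "Bronx", "cuisine": "Brazilian", "address": {"zipcode": "10451"}},
]

# $match stage: keep only the documents that satisfy the filter.
matched = [r for r in restaurants
           if r["borough"] == "Queens" and r["cuisine"] == "Brazilian"]

# $group stage: group by address.zipcode and add 1 per document ($sum: 1).
counts = Counter(r["address"]["zipcode"] for r in matched)
results = [{"_id": zipcode, "count": n} for zipcode, n in counts.items()]
```

Each entry in results has the same shape as a document produced by the $group stage: the grouping key in _id and the accumulated count.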
Pipeline stages:
$collStats
$project
$match
$redact
$limit
$skip
$unwind
$group
$sample
$sort
$geoNear
$lookup
$out
$indexStats
$facet
$bucket
$bucketAuto
$sortByCount
$addFields
$replaceRoot
$count
$graphLookup
Indexes with PyMongo
Indexes can support the efficient execution of queries. Without indexes, MongoDB must perform a collection scan, i.e. inspect every document in the collection. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.
Create a Single-Field Index:
create_index(keys, **kwargs)
import pymongo
db.restaurants.create_index([("cuisine", pymongo.ASCENDING)])
The method returns the name of the created index.
Create a compound index:
The method returns the name of the created index.
create_indexes(indexes)
>>> from pymongo import IndexModel, ASCENDING, DESCENDING
>>> index1 = IndexModel([("hello", DESCENDING),
... ("world", ASCENDING)], name="hello_world")
>>> index2 = IndexModel([("goodbye", DESCENDING)])
>>> db.test.create_indexes([index1, index2])
["hello_world", "goodbye_-1"]
drop_index(index_or_name)
drop_indexes() # Drops all indexes on this collection.
reindex() # Rebuilds all indexes on this collection.
list_indexes()
index_information()
MONGODB MANUAL
Introduction to MongoDB
Databases and Collections
Databases
use myDB
If a database does not exist, MongoDB creates the database when you first store data for that database. As such, you can switch to a non-existent database and perform the following operation in the mongo shell:
The insertOne() operation creates both the database myNewDB and the collection myNewCollection1 if they do not already exist.
Collections
If a collection does not exist, MongoDB creates the collection when you first store data for that collection.
MongoDB provides the db.createCollection() method to explicitly create a collection with various options, such as setting the maximum size or the document validation rules. If you are not specifying these options, you do not need to explicitly create the collection since MongoDB creates new collections when you first store data for the collections.
Create View
To remove a view, use the db.collection.drop() method on the view.
Documents
MongoDB stores data records as BSON documents.
Field Names
Documents have the following restrictions on field names:
- The field name _id is reserved for use as a primary key; its value must be unique in the collection, is immutable, and may be of any type other than an array.
- The field names cannot start with the dollar sign ($) character.
- The field names cannot contain the dot (.) character.
- The field names cannot contain the null character.
Dot Notation
Arrays
"<array>.<index>"
{
...
contribs: [ "Turing machine", "Turing test", "Turingery" ],
...
}
To specify the third element in the contribs array, use the dot notation “contribs.2”.
Embedded Documents
{
...
name: { first: "Alan", last: "Turing" },
contact: { phone: { type: "cell", number: "111-222-3333" } },
...
}
To specify the field named last in the name field, use the dot notation "name.last".
To specify the number in the phone document in the contact field, use the dot notation "contact.phone.number".
The _id Field
In MongoDB, each document stored in a collection requires a unique _id field that acts as a primary key.
Other Uses of the Document Structure
Update Specification Documents
{
<operator1>: { <field1>: <value1>, ... },
<operator2>: { <field2>: <value2>, ... },
...
}
Documents
MongoDB stores data records as BSON documents. BSON is a binary representation of JSON documents, though it contains more data types than JSON. A record in MongoDB is a document, which is a data structure composed of field and value pairs. MongoDB documents are similar to JSON objects. The values of fields may include other documents, arrays, and arrays of documents.
Dot Notation
MongoDB uses the dot notation to access the elements of an array and to access the fields of an embedded document.
“<array>.<index>”
To specify the third element in the contribs array, use the dot notation “contribs.2”.
“<embedded document>.<field>”
To specify the number in the phone document in the contact field, use the dot notation "contact.phone.number".
The _id Field
- By default, MongoDB creates a unique index on the _id field during the creation of a collection. Applications often use an ObjectId as the _id value.
- The _id field is always the first field in the documents. If the server receives a document that does not have the _id field first, then the server will move the field to the beginning. If a document is inserted without an _id field, the driver generates an ObjectId for it.
- The _id field may contain values of any BSON data type, other than an array.
Query Filter Documents
Query filter documents use <field>:<value> expressions to specify equality conditions and query operator expressions.
Update Specification Documents
Update specification documents use update operators to specify the data modifications to perform on specific fields during a db.collection.update() operation.
Index Specification Documents
Index specification documents define the field to index and the index type:
BSON Types
BSON supports the following data types as values in documents.
ObjectId
If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for the _id field.
In the mongo shell, you can access the creation time of the ObjectId using the ObjectId.getTimestamp() method.
Sorting on an _id field that stores ObjectId values is roughly equivalent to sorting by creation time.
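The creation time is recoverable because the first four bytes of an ObjectId hold a timestamp in seconds since the Unix epoch. The plain-Python sketch below decodes it from the 24-character hex form; the helper name objectid_timestamp is an assumption, and the ObjectId value is the one that appears in the replace_one example earlier in these notes.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex):
    """Decode the creation time embedded in a 24-character ObjectId hex string."""
    # The first 4 bytes (8 hex digits) are seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = objectid_timestamp("54f4c5befba5220aa4d6dee7")
```

Because the timestamp occupies the leading bytes, ObjectIds generated later compare greater than earlier ones at one-second granularity, which is why sorting on _id approximates sorting by creation time.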
Comparison/Sort Order
When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to highest:
Numeric Types
MongoDB treats some types as equivalent for comparison purposes. For instance, numeric types undergo conversion before comparison.
Strings
Binary Comparison
Collation
{
   locale: <string>,
   caseLevel: <boolean>,
   caseFirst: <string>,
   strength: <int>,
   numericOrdering: <boolean>,
   alternate: <string>,
   maxVariable: <string>,
   backwards: <boolean>
}
Arrays
With arrays, a less-than comparison or an ascending sort compares the smallest element of arrays, and a greater-than comparison or a descending sort compares the largest element of the arrays.
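The array rule can be mimicked in plain Python by choosing the sort key according to direction: the smallest element for an ascending sort and the largest for a descending sort. The sample documents below are made up, and the sketch only covers arrays of numbers.

```python
# Hypothetical documents whose field "a" holds an array of numbers.
docs = [
    {"_id": 1, "a": [5, 9]},
    {"_id": 2, "a": [1, 8]},
    {"_id": 3, "a": [7]},
]

# Ascending sort: each array is represented by its smallest element.
ascending = [d["_id"] for d in sorted(docs, key=lambda d: min(d["a"]))]

# Descending sort: each array is represented by its largest element.
descending = [d["_id"] for d in
              sorted(docs, key=lambda d: max(d["a"]), reverse=True)]
```

Note that the descending order is not simply the ascending order reversed, because a different element of each array decides its position.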
Dates and Timestamps
Date objects sort before Timestamp objects. Previously Date and Timestamp objects sorted together.
Non-existent Fields
The comparison treats a non-existent field like an empty value: a sort on the field a in documents { } and { a: null } would treat the documents as equivalent in sort order.
BinData
MongoDB sorts BinData in the following order:
- First, the length or size of the data.
- Then, by the BSON one-byte subtype.
- Finally, by the data, performing a byte-by-byte comparison.
MongoDB Extended JSON
JSON can only represent a subset of the types supported by BSON.
Strict mode. Strict mode representations of BSON types conform to the JSON RFC. Any JSON parser can parse these strict mode representations as key/value pairs; however, only the MongoDB internal JSON parser recognizes the type information conveyed by the format.
mongo Shell mode. The MongoDB internal JSON parser and the mongo shell can parse this mode.
Input in Strict Mode
- REST Interfaces
- mongoimport
- --query option of various MongoDB tools
- MongoDB Compass
Input in mongo Shell Mode
- REST Interfaces
- mongoimport
- --query option of various MongoDB tools
- mongo shell
Output in Strict mode
mongoexport and REST and HTTP Interfaces output data in Strict mode.
Output in mongo Shell Mode
bsondump outputs in mongo Shell mode.
BSON Data Types and Associated Representations
| BSON Type | Strict Mode | mongo Shell Mode |
|---|---|---|
| data_binary | { "$binary": "<bindata>", "$type": "<t>" } | BinData ( <t>, <bindata> ) |
| data_date | { "$date": "<date>" } | new Date ( <date> ) |
| data_timestamp | { "$timestamp": { "t": <t>, "i": <i> } } | Timestamp( <t>, <i> ) |
| Regular Expression | { "$regex": "<sRegex>", "$options": "<sOptions>" } | /<jRegex>/<jOptions> |
| data_oid | { "$oid": "<id>" } | ObjectId( "<id>" ) |
| data_ref | { "$ref": "<name>", "$id": "<id>" } | DBRef("<name>", "<id>") |
| data_undefined | { "$undefined": true } | undefined |
| data_minkey | { "$minKey": 1 } | MinKey |
| data_maxkey | { "$maxKey": 1 } | MaxKey |
| data_numberlong | { "$numberLong": "<number>" } | NumberLong( "<number>" ) |
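As a rough illustration of the strict-mode column, the sketch below converts two kinds of values into strict-mode structures and serializes them with the standard json module. The class ObjectIdHex and the function to_strict are hypothetical stand-ins written for this example, and the sketch assumes that integers outside the 32-bit range are the ones stored as 64-bit values; it is not the encoder used by the MongoDB tools.

```python
import json


class ObjectIdHex:
    """Hypothetical stand-in for an ObjectId, holding only its hex string."""

    def __init__(self, hex_string):
        self.hex = hex_string


def to_strict(value):
    """Convert a few Python values to strict-mode Extended JSON structures."""
    if isinstance(value, dict):
        return {key: to_strict(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_strict(item) for item in value]
    if isinstance(value, ObjectIdHex):
        return {"$oid": value.hex}
    # Assumption: integers outside the 32-bit range are 64-bit values.
    if (isinstance(value, int) and not isinstance(value, bool)
            and not -2**31 <= value <= 2**31 - 1):
        return {"$numberLong": str(value)}
    return value


doc = {
    "_id": ObjectIdHex("54f4c5befba5220aa4d6dee7"),
    "views": 5000000000,
    "tags": ["a"],
}
text = json.dumps(to_strict(doc), sort_keys=True)
```

The resulting text is ordinary JSON that any parser can read as key/value pairs, while the $oid and $numberLong wrappers carry the type information.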
Developers
MongoDB CRUD Operations
Create Operations
Read Operations
Update Operations
Delete Operations
Bulk Write
PyMongo
Installing PyMongo
Tutorial
connect on the default host and port.
specify host and port
use uri format
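The original connection snippets did not survive here, so as a hedged stand-in the sketch below only builds a connection string with the standard library. The helper build_uri and the sample credentials are assumptions for illustration; 27017 is MongoDB's default port, and a username or password containing reserved characters needs to be percent-escaped before it goes into the URI.

```python
from urllib.parse import quote_plus


def build_uri(host="localhost", port=27017, username=None, password=None):
    """Build a mongodb:// connection string, escaping any credentials."""
    credentials = ""
    if username is not None and password is not None:
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    return f"mongodb://{credentials}{host}:{port}/"


default_uri = build_uri()
# Hypothetical credentials containing characters that must be escaped.
auth_uri = build_uri(username="app user", password="p@ss/word")
```

With PyMongo installed, a string like default_uri would typically be passed to pymongo.MongoClient() to open the connection.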