Python MongoDB

Tutorial 53 of 65 · pythondeck.com Python course

The official pymongo driver exposes MongoDB documents as Python dicts. Collections are schema-less and support rich query and aggregation pipelines.

MongoDB stores flexible BSON documents—great for evolving schemas, content catalogs, and event logs where rigid tables fight product velocity.

PyMongo and Motor (async) are the native drivers; understanding documents, indexes, and aggregation pipelines unlocks efficient queries.

Document databases shine when read patterns cluster on one entity; they punish unbounded growth of embedded arrays and ad hoc joins across collections.

Documents & collections: JSON-like dicts; _id is the primary key and cannot be changed after insert.

CRUD: insert_one, find with filters, update operators ($set, $inc).

Indexes: single-field, compound, text; inspect query plans with explain().

Aggregation: $match, $group, $lookup for server-side analytics.

Schema design: embed vs reference, chosen from access patterns.

Transactions: multi-document ACID transactions, which require a replica set or sharded cluster.
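As a rough illustration of the transactions item above, the sketch below moves stock between two products inside one multi-document transaction. The `stock` field and the `move_stock` helper are hypothetical, and the code assumes the client is connected to a replica set; on a standalone server the transaction would be rejected.

```python
def move_stock(client, from_name, to_name, qty):
    """Move `qty` units between two products atomically.

    `stock` is a hypothetical field used only for this illustration.
    """
    products = client["shop"]["products"]
    # Both start_session() and start_transaction() are context managers:
    # the transaction commits on a clean exit and aborts if an exception escapes.
    with client.start_session() as session:
        with session.start_transaction():
            products.update_one(
                {"name": from_name}, {"$inc": {"stock": -qty}}, session=session
            )
            products.update_one(
                {"name": to_name}, {"$inc": {"stock": qty}}, session=session
            )
```

With a replica set available, a call might look like `move_stock(MongoClient("mongodb://localhost:27017/?replicaSet=rs0"), "Pen", "Pencil", 5)`, where the replica set name `rs0` is also an assumption.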

Design around query patterns: embed one-to-few relationships; reference unbounded arrays. The aggregation framework replaces many MapReduce-era jobs. Change streams feed real-time consumers. For strict relational integrity across many entities, RDBMS may still fit.
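To make the embed-versus-reference choice concrete, here is a small sketch using plain Python dicts. The `customers` and `orders` collections and all field names are invented for illustration; they are not part of the tutorial's `shop` data.

```python
# Embedded: a one-to-few relationship lives inside the parent document,
# so one read returns the customer together with their addresses.
customer_embedded = {
    "_id": 1,
    "name": "Ada",
    "addresses": [
        {"label": "home", "city": "London"},
        {"label": "work", "city": "Cambridge"},
    ],
}

# Referenced: orders can grow without bound, so each order is its own
# document that points back to the customer through customer_id.
customer = {"_id": 1, "name": "Ada"}
orders = [
    {"_id": 101, "customer_id": 1, "total": 49},
    {"_id": 102, "customer_id": 1, "total": 59},
]

# Joining the two server-side takes a $lookup stage, run on `customers`.
orders_per_customer = [
    {"$match": {"_id": 1}},
    {"$lookup": {
        "from": "orders",              # collection to join
        "localField": "_id",           # field on customers
        "foreignField": "customer_id", # field on orders
        "as": "orders",                # output array field
    }},
]
```

The pipeline would be passed to `db.customers.aggregate(orders_per_customer)`; the embedded form needs no join at all, which is the trade-off the paragraph above describes.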

Atlas or self-hosted: tune write concern and read preference for consistency vs latency. BSON types include Decimal128—convert carefully at API boundaries.
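One way to express these trade-offs is through connection-string options. The sketch below builds two URIs, one favouring durability and one favouring read latency, and adds a small helper for rendering Decimal128 prices. The helper names `build_uri` and `price_to_json`, the host, and the specific timeout and staleness values are assumptions chosen for the example.

```python
from urllib.parse import urlencode


def build_uri(host="localhost:27017", **options):
    """Return a mongodb:// URI carrying the given connection-string options."""
    query = urlencode(options)
    return f"mongodb://{host}/?{query}" if query else f"mongodb://{host}"


# Favour durability: wait for a majority of nodes and the journal, up to 5 s.
DURABLE_URI = build_uri(w="majority", journal="true", wtimeoutMS=5000)

# Favour read latency: allow secondaries that lag by at most 120 s.
FAST_READ_URI = build_uri(readPreference="secondaryPreferred",
                          maxStalenessSeconds=120)


def price_to_json(value):
    """Render a price as a string so no precision is lost in JSON."""
    # bson.decimal128.Decimal128 exposes to_decimal(); plain numbers do not.
    if hasattr(value, "to_decimal"):
        value = value.to_decimal()
    return str(value)
```

Either URI can be handed to `MongoClient(...)`. Sending prices as strings is only one possible convention at the API boundary; the point is to avoid a silent round trip through float.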

Schema validation rules at collection level catch bad documents at insert time—pair with application validators for defense in depth.
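A collection-level rule of this kind might look like the following sketch. The schema itself (required `name` and `price`, non-negative price) and the helper `create_products` are illustrative assumptions, not rules taken from the tutorial.

```python
# $jsonSchema validator: documents must carry a non-empty name and a
# non-negative numeric price.
PRODUCT_SCHEMA = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name", "price"],
        "properties": {
            "name": {"bsonType": "string", "minLength": 1},
            "price": {
                "bsonType": ["int", "long", "double", "decimal"],
                "minimum": 0,
            },
        },
    }
}


def create_products(db):
    """Create `products` with the validator attached.

    create_collection raises if the collection already exists.
    """
    return db.create_collection(
        "products",
        validator=PRODUCT_SCHEMA,
        validationAction="error",  # reject invalid writes instead of logging
    )
```

With this in place, an insert such as `{"name": "Pen"}` (no price) should be rejected by the server, while the application-side validator catches the same problem earlier and with a friendlier message.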

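Time-series collections can expire old measurements automatically through expireAfterSeconds; on Atlas, Online Archive can move older data to cheaper storage when retention policies allow.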
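For reference, creating a time-series collection with a retention window could be sketched as below. The collection name `events`, the field names `ts` and `source`, and the 30-day default are assumptions made for the example.

```python
def create_events(db, retention_days=30):
    """Create a time-series collection whose documents expire automatically."""
    return db.create_collection(
        "events",
        timeseries={
            "timeField": "ts",         # required: the datetime of each measurement
            "metaField": "source",     # optional: label that identifies the series
            "granularity": "minutes",  # hint about how often data arrives
        },
        # Documents older than this many seconds are removed by the server.
        expireAfterSeconds=retention_days * 24 * 60 * 60,
    )
```

Each inserted document would then need a `ts` value of type `datetime`, for example `{"ts": datetime.now(timezone.utc), "source": "web", "value": 1}`.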

Common pitfalls:

Unindexed queries on large collections, which scan every document.

Unbounded embedded arrays that push a document past the 16 MB limit.

Treating MongoDB's flexible schema as an excuse for inconsistent field names in production.

Ignoring write concern and losing writes that had not yet replicated when the primary fails over.
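The first pitfall can be checked from code. The sketch below inspects the output of `explain()` for a `COLLSCAN` stage, which indicates a full collection scan. The helper names are hypothetical, and the exact shape of the explain document varies between server versions, so the search simply walks every nested value rather than relying on one layout.

```python
def find_stages(plan):
    """Recursively collect every "stage" name found in an explain document."""
    stages = []
    if isinstance(plan, dict):
        if "stage" in plan:
            stages.append(plan["stage"])
        for value in plan.values():
            stages.extend(find_stages(value))
    elif isinstance(plan, list):
        for item in plan:
            stages.extend(find_stages(item))
    return stages


def scans_whole_collection(collection, query):
    """Return True if the winning plan for `query` contains a COLLSCAN stage."""
    explanation = collection.find(query).explain()
    winning = explanation["queryPlanner"]["winningPlan"]
    return "COLLSCAN" in find_stages(winning)
```

If `scans_whole_collection(db.products, {"price": {"$gte": 50}})` returns True, adding an index such as `db.products.create_index([("price", 1)])` and checking again is a reasonable next step.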

Practices that help:

Review indexes in code reviews alongside the queries they serve.

Use projection to return only needed fields; cap default find limits.

Validate documents at API ingress (pydantic/jsonschema).

Back up with mongodump or cloud snapshots; test restores quarterly.

Cap batch insert sizes. ordered=False keeps inserting after a failed document, so use it only when partial success (for example, skipped duplicate keys) is acceptable.
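Two of these practices, projection with a capped limit and bounded batch inserts, can be sketched as follows. The helper names and the default sizes (50 results, 1000 documents per batch) are assumptions for illustration.

```python
def cheap_names(collection, max_price, limit=50):
    """Return product names at or below max_price, fetching only `name`."""
    cursor = collection.find(
        {"price": {"$lte": max_price}},
        {"name": 1, "_id": 0},  # projection: include name, drop _id
    ).limit(limit)              # never return an unbounded result set
    return [doc["name"] for doc in cursor]


def insert_in_batches(collection, docs, batch_size=1000):
    """Insert docs in fixed-size chunks and return how many were inserted."""
    inserted = 0
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        # ordered=False lets the server continue past a failing document;
        # pymongo still raises BulkWriteError afterwards if any write failed.
        result = collection.insert_many(batch, ordered=False)
        inserted += len(result.inserted_ids)
    return inserted
```

A caller that accepts duplicate-key failures would wrap `insert_in_batches` in `try/except pymongo.errors.BulkWriteError` and inspect the exception's `details`; this sketch leaves that policy to the application.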

Re-read the examples below with these ideas in mind; change variable names and inputs to match your own project.

The program below connects to a local MongoDB server and inserts two documents. Read the comments, run the code, then change names or values and inspect the collection to see what was stored.

# Example: Connect / insert
# Run in the REPL or save as a .py file and execute with python.
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017")  # assumes a server on the default port
db = client["shop"]  # databases and collections are created lazily on first write
db.products.insert_many([
    {"name": "Tkinter Pro", "price": 49},
    {"name": "Kivy Pro",    "price": 59},
])

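This sample queries the products collection with a comparison operator. Paste it into the REPL or save it as a .py file before you continue to the next block.

# Example: Query
# Run in the REPL or save as a .py file and execute with python.
from pymongo import MongoClient
db = MongoClient("mongodb://localhost:27017")["shop"]  # same database as the insert example
# $gte matches prices greater than or equal to 50
for p in db.products.find({"price": {"$gte": 50}}):
    print(p)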

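Here is a hands-on illustration of aggregation. Predict the result first; only then execute the snippet and compare the output with what you expected.

# Example: Aggregation
# Run in the REPL or save as a .py file and execute with python.
from pymongo import MongoClient
db = MongoClient("mongodb://localhost:27017")["shop"]  # same database as the insert example
# Grouping on _id None folds every document into a single group
pipeline = [
    {"$group": {"_id": None, "avg": {"$avg": "$price"}}}
]
print(list(db.products.aggregate(pipeline)))  # one document holding the average price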
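Building on the single-stage pipeline above, a slightly larger sketch filters first and then computes several statistics in one pass. The function name `price_stats_pipeline` and the output field names are hypothetical; the stages and accumulators are standard aggregation operators.

```python
def price_stats_pipeline(min_price):
    """Build a pipeline that summarises products priced at or above min_price."""
    return [
        # Filter early so later stages see fewer documents (and can use an index).
        {"$match": {"price": {"$gte": min_price}}},
        # One group for everything that survived the filter.
        {"$group": {
            "_id": None,
            "count": {"$sum": 1},
            "avg": {"$avg": "$price"},
            "max": {"$max": "$price"},
        }},
    ]
```

It would be run with `list(db.products.aggregate(price_stats_pipeline(50)))`, which yields at most one summary document.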

The program below inserts one document and reads it back with find_one. Read the comments on each line, run the code, then change names or values to see how the output shifts.

# PyMongo talks BSON documents to MongoDB
# pip install pymongo  # install driver
from pymongo import MongoClient  # high-level client
client = MongoClient("mongodb://localhost:27017")  # connection URI
db = client["shop"]  # database handle
col = db["products"]  # collection
col.insert_one({"name": "Pen", "price": 1.5})  # insert document
doc = col.find_one({"name": "Pen"})  # query
print(doc["price"])  # 1.5

This sample updates, filters, and deletes documents in a small, runnable script. Paste it into the REPL or save it as a .py file before you continue to the next block.

# Filters use Mongo query operators
from pymongo import MongoClient  # pymongo
col = MongoClient()["shop"]["products"]  # collection shortcut
col.update_one({"name": "Pen"}, {"$set": {"price": 2.0}})  # patch field
cursor = col.find({"price": {"$gte": 2}})  # range query
print([d["name"] for d in cursor])  # names meeting filter
col.delete_many({"price": {"$lt": 1}})  # remove cheap items
print(col.count_documents({}))  # remaining count
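The `$inc` operator mentioned earlier combines naturally with an upsert. The following sketch counts sales per product, creating the document on first use; the `sold` field and the `record_sale` helper are hypothetical additions to the `products` data.

```python
def record_sale(collection, name, qty=1):
    """Add qty to the product's `sold` counter, creating the document if absent."""
    return collection.update_one(
        {"name": name},             # match by product name
        {"$inc": {"sold": qty}},    # atomic increment; a missing field starts at qty
        upsert=True,                # insert {"name": name, "sold": qty} if no match
    )
```

Called as `record_sale(col, "Pen", 3)`, it returns an `UpdateResult`; its `upserted_id` is set only when a new document was created.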
