Introduction
In MongoDB and Mongoose, managing relationships between documents is an important part of designing a scalable and efficient data model. Unlike relational databases, which use tables and foreign keys, MongoDB lets you choose between embedding related data inside a document and referencing data stored in separate collections. Each approach has its own advantages and disadvantages, and the right choice depends on the structure and requirements of your application.
In this guide, we will discuss when to embed data versus when to reference it, along with practical examples to help you effectively model relationships in Mongoose.
Understanding Relationships in MongoDB
MongoDB is a NoSQL database that offers flexibility in managing data relationships. You can design relationships in MongoDB in two main ways:
- Embedding: Store related data inside the parent document.
- Referencing: Store related data in a separate collection and link the documents using references (ObjectIds).
The choice between embedding and referencing depends on data access patterns, document size, consistency requirements, and how often the related data is updated.
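To make the two options concrete, the sketch below shows roughly what the stored data might look like in each case, using plain JavaScript objects. The field names and string IDs are invented for illustration; in a real database MongoDB would generate ObjectIds.

```javascript
// Embedding: the related data lives inside the parent document.
const embeddedPost = {
  _id: "post1", // hypothetical ID; MongoDB would generate an ObjectId
  title: "Hello",
  comments: [
    { user: "ana", message: "Nice post" },
    { user: "ben", message: "Thanks for sharing" }
  ]
};

// Referencing: the parent stores only the IDs of documents kept elsewhere.
const referencedPost = {
  _id: "post1",
  title: "Hello",
  comments: ["comment1", "comment2"]
};
const commentsCollection = [
  { _id: "comment1", user: "ana", message: "Nice post" },
  { _id: "comment2", user: "ben", message: "Thanks for sharing" }
];

// Reading embedded comments needs no extra lookup.
const embeddedMessages = embeddedPost.comments.map((c) => c.message);

// Reading referenced comments needs a second lookup by ID.
const referencedMessages = referencedPost.comments.map(
  (id) => commentsCollection.find((c) => c._id === id).message
);
```

Both reads end up with the same messages; the difference is whether a second lookup is needed to get them.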
1. Embedding Documents
Embedding stores related data directly in the parent document as a nested object or array. This is ideal for data that is frequently accessed together and has a one-to-one or one-to-many relationship.
Benefits of embedding
- Single-document access: All the data lives in one document, reducing the need for joins or additional queries.
- Atomic operations: A write to a single document is atomic, so a parent and its embedded data are updated together, which makes data consistency easier to maintain.
- Fast reads: Because related data is stored together, one read returns everything and avoids additional database calls.
Limitations of embedding
- Document size limit: MongoDB documents are limited to 16 MB. Large nested arrays can quickly approach this limit.
- Duplication on update: When the same embedded data is copied into several documents, every copy must be updated, which can lead to inconsistencies.
- Limited flexibility: Embedded documents are harder to query on their own, especially as the data grows over time.
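As a rough illustration of the size limit, the following sketch estimates how close a document is to 16 MB. It is an approximation only: it measures the JSON text rather than the BSON encoding MongoDB actually stores, and the helper names and the 50% threshold are assumptions chosen for this example.

```javascript
// MongoDB's maximum document size (16 MB).
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024;

// Rough estimate: byte length of the JSON text.
// BSON sizes differ, so treat the result as a ballpark figure.
function approximateSizeInBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Flags documents whose estimated size exceeds a fraction of the limit.
// The default threshold of 0.5 is an arbitrary choice for this sketch.
function isNearSizeLimit(doc, threshold = 0.5) {
  return approximateSizeInBytes(doc) > MAX_DOCUMENT_BYTES * threshold;
}
```

A check like this could run before appending to an embedded array, as an early warning that the data may be better stored in its own collection.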
Example: Embedding comments in a blog post
In a blog application, you might embed comments directly in the post document, because each comment belongs to exactly one post.
const mongoose = require("mongoose");
const commentSchema = new mongoose.Schema({
  user: { type: String, required: true },
  message: { type: String, required: true },
  date: { type: Date, default: Date.now }
});
const postSchema = new mongoose.Schema({
  title: { type: String, required: true },
  content: { type: String, required: true },
  comments: [commentSchema] // Embedding comments in the post
});
const Post = mongoose.model("Post", postSchema);
In this setup, each post document contains an array of comments, which makes it easy to retrieve the post along with its comments in a single query.
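Because the comments live inside the post, adding or removing one is a single-document update. The sketch below shows one way this might look using the `$push` and `$pull` update operators. The helper names are hypothetical, and the model is passed in as a parameter (an assumption made here so the functions can be read and tried without a database connection).

```javascript
// `Post` is expected to be a Mongoose model with an embedded `comments` array.
function addComment(Post, postId, comment) {
  // $push appends to the embedded array. Only one document is written,
  // so MongoDB applies the change atomically.
  return Post.updateOne({ _id: postId }, { $push: { comments: comment } });
}

function removeComment(Post, postId, commentId) {
  // $pull removes array elements matching the condition. Mongoose gives
  // each embedded subdocument its own _id by default.
  return Post.updateOne(
    { _id: postId },
    { $pull: { comments: { _id: commentId } } }
  );
}
```

With a connected model, a call might look like `await addComment(Post, postId, { user: "ana", message: "Nice post" })`.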
2. Referencing Documents
Referencing (or normalization) stores related data in separate collections and uses ObjectIds to link them. This is ideal for large datasets or when related data needs to be queried independently.
Benefits of referencing
- Data reusability: Shared data, such as user profiles, only needs to be stored once and can be referenced by multiple documents.
- Smaller documents: Referencing keeps each document small and avoids large nested structures.
- Scalability: References provide flexibility as data grows and let you manage many-to-many relationships more efficiently.
Limitations of referencing
- Multiple queries: Retrieving related data requires additional queries or populate operations, which can increase response time.
- Consistency challenges: Related documents are updated separately, which can leave the data temporarily or permanently inconsistent.
- Complexity: References make it more complex to manage relationships and ensure data integrity.
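The "multiple queries" cost can be seen in a manual lookup. The sketch below fetches a post and then its comments in two steps. `Post` and `Comment` stand for any two Mongoose models where the post stores an array of comment IDs; the function name is hypothetical, and the models are passed as parameters as an assumption for this example.

```javascript
async function getPostWithComments(Post, Comment, postId) {
  // Query 1: the post itself, which holds only comment IDs.
  const post = await Post.findById(postId);
  if (!post) return null;

  // Query 2: the comment documents those IDs point to.
  const comments = await Comment.find({ _id: { $in: post.comments } });

  return { post, comments };
}
```

Each extra collection involved adds another round trip, which is the trade-off to weigh against smaller, independently queryable documents.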
Example: Referencing authors and comments in a blog post
In more complex setups, you might store authors and comments in separate collections and reference them from the post document.
const mongoose = require("mongoose");
const authorSchema = new mongoose.Schema({
  name: String,
  bio: String
});
const commentSchema = new mongoose.Schema({
  // Assumes a "User" model is registered elsewhere in the application
  userId: { type: mongoose.Schema.Types.ObjectId, ref: "User" },
  message: String,
  date: { type: Date, default: Date.now }
});
const postSchema = new mongoose.Schema({
  title: String,
  content: String,
  author: { type: mongoose.Schema.Types.ObjectId, ref: "Author" },
  comments: [{ type: mongoose.Schema.Types.ObjectId, ref: "Comment" }]
});
const Author = mongoose.model("Author", authorSchema);
const Comment = mongoose.model("Comment", commentSchema);
const Post = mongoose.model("Post", postSchema);
In this setup, the Post model references the Author and Comment documents using ObjectIds. This approach is best for applications that may need to access or update comments and authors independently of the posts.
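With referenced documents, each related document is created in its own collection and only its `_id` is stored on the parent. The following is a minimal sketch of how that might look. The helper names are hypothetical, and the models are passed in as parameters (an assumption made so the example does not depend on a live connection).

```javascript
async function createPostWithAuthor(Author, Post, authorData, postData) {
  const author = await Author.create(authorData);
  // Store only the author's _id on the post; the author document
  // stays in its own collection.
  const post = await Post.create({
    ...postData,
    author: author._id,
    comments: []
  });
  return { author, post };
}

async function addCommentToPost(Comment, Post, postId, commentData) {
  const comment = await Comment.create(commentData);
  // Two separate writes are involved, so unlike the embedded case
  // this is not atomic unless it runs inside a transaction.
  await Post.updateOne({ _id: postId }, { $push: { comments: comment._id } });
  return comment;
}
```

The second helper illustrates the consistency trade-off: if the update to the post fails after the comment is created, the comment exists but is not linked.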
Populating references
You can use Mongoose's populate method to fetch referenced data.
// Run inside an async function
const post = await Post.findById(postId)
  .populate("author")
  .populate("comments");
This query retrieves the post along with the full author and comment documents, providing a complete view of the post, its author, and its comments.
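Populating everything can return more data than a page needs. The sketch below shows how the same query might be narrowed by selecting fields and limiting the populated comments. The function name, the selected fields, and the limit of 20 are assumptions for illustration; the model is passed in as a parameter for the same reason.

```javascript
function findPostForDisplay(Post, postId) {
  return Post.findById(postId)
    // The second argument selects which author fields to load.
    .populate("author", "name")
    // The object form accepts per-path options such as sort and limit.
    .populate({
      path: "comments",
      select: "message date userId",
      options: { sort: { date: -1 }, limit: 20 }
    })
    // lean() returns plain objects instead of full Mongoose documents.
    .lean();
}
```

The function returns the query itself, so a caller would `await findPostForDisplay(Post, postId)` to run it.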
Hybrid approach: combining embedding and referencing
In complex applications, you may need a hybrid approach, where you embed certain data and reference others. For example, in an e-commerce application, you might embed product details in an order but reference customer details.
Example: Embedding products and referencing the customer in an order
const productSchema = new mongoose.Schema({
  productId: mongoose.Schema.Types.ObjectId,
  name: String,
  price: Number,
  quantity: Number
});
const orderSchema = new mongoose.Schema({
  customer: { type: mongoose.Schema.Types.ObjectId, ref: "Customer" },
  products: [productSchema], // Embedding products
  orderDate: { type: Date, default: Date.now }
});
const Order = mongoose.model("Order", orderSchema);
In this example:
- The customer is referenced because a customer may have many orders and can be queried independently.
- Products are embedded within the order because they are directly related to each specific order.
This approach keeps frequently accessed data together while allowing more complex relationships to be managed separately.
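One common reason to embed product details in an order is to keep a snapshot of the name and price as they were at purchase time. The sketch below builds such embedded items from catalog data. The helper names and the shape of `quantitiesById` are hypothetical, chosen only to match the fields of the product schema above.

```javascript
// Builds the embedded `products` array for an order.
// `catalogProducts` are product objects from the catalog;
// `quantitiesById` maps a product _id to the quantity ordered.
function buildOrderItems(catalogProducts, quantitiesById) {
  return catalogProducts
    .filter((p) => quantitiesById[p._id] > 0)
    .map((p) => ({
      productId: p._id, // link back to the catalog entry
      name: p.name,     // copied, so later catalog edits do not alter the order
      price: p.price,
      quantity: quantitiesById[p._id]
    }));
}

// Sums price * quantity over the embedded items.
function orderTotal(items) {
  return items.reduce((sum, item) => sum + item.price * item.quantity, 0);
}
```

The resulting array could be passed as the `products` field when creating an order, alongside the referenced customer ID.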
Conclusion
Managing relationships in Mongoose with embedding and referencing gives you flexibility in designing efficient data models in MongoDB. Embedding works well for closely related data and one-to-many relationships, while referencing is better for large or frequently updated related data and for many-to-many relationships.
By understanding the strengths and limitations of each approach, and by considering factors such as data access patterns, document size, and update frequency, you can design a MongoDB schema that is both scalable and efficient. Apply these strategies in your Mongoose applications to manage relationships effectively and create robust, maintainable data models.









