Embedding vs. Referencing in MongoDB

Introduction

In MongoDB and Mongoose, managing relationships between documents is an important part of designing a scalable and efficient data model. Unlike relational databases, which use tables and foreign keys, MongoDB lets you choose between embedding related data inside a document and referencing data stored in other collections. Each approach has its own advantages and disadvantages, and the right choice depends on the structure and requirements of your application.

In this guide, we will discuss when to embed data versus when to reference it, along with practical examples to help you effectively model relationships in Mongoose.

Understanding Relationships in MongoDB

MongoDB is a NoSQL database that offers flexibility in managing data relationships. You can design relationships in MongoDB in two main ways:

Embedding: Store related data inside the parent document.
Referencing: Store related data in a separate collection and link them using references (ObjectIds).

The choice between embedding and referencing depends on data access patterns, document size, consistency requirements, and how often the related data is updated.
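To make the two options concrete, here is a minimal sketch of how the same data might look in each style. The field names and id strings are hypothetical, and plain JavaScript objects and a Map stand in for stored documents and a collection. In a real database the ids would be ObjectIds.

```javascript
// Embedded: the address lives inside the user document.
const embeddedUser = {
  _id: "user1",
  name: "Sara",
  address: { city: "Berlin", zip: "10115" }
};

// Referenced: the address is its own document, and the user stores only its id.
const address = { _id: "addr1", city: "Berlin", zip: "10115" };
const referencedUser = { _id: "user1", name: "Sara", address: address._id };

// A Map stands in for the separate addresses collection.
const addressesById = new Map([[address._id, address]]);

// Reading the city takes one lookup when embedded and two when referenced.
const cityEmbedded = embeddedUser.address.city;
const cityReferenced = addressesById.get(referencedUser.address).city;

console.log(cityEmbedded, cityReferenced);
```

The extra lookup in the referenced version is the cost you pay for being able to share and update the address independently.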

1. Embedding Documents

Embedding involves storing related data directly in the parent document as a nested object or array. This is ideal for data that is frequently accessed together and has a one-to-one or one-to-many relationship.

Benefits of embedding

Single-document access: All data lives in one document, reducing the need for joins or additional requests.
Atomic operations: A write to a single document is atomic, so the parent and its embedded data are updated together, which makes consistency easier to maintain.
Fast reads: Because related data is stored together, retrieving it avoids additional database calls.

Limitations of embedding

Document size limit: MongoDB documents are limited to 16 MB. Large nested arrays can quickly approach this limit.
Duplicated updates: When the same data is embedded in several documents, every copy must be updated, which can lead to inconsistencies.
Limited flexibility: Embedded documents are harder to query on their own, especially as the data grows over time.
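As a rough illustration of the size limit, the sketch below estimates how large a document with an embedded array is. It is an approximation only: it measures the JSON text, not the actual BSON encoding MongoDB stores, and the helper names and the 50% threshold are assumptions made for this example.

```javascript
const MAX_BSON_BYTES = 16 * 1024 * 1024; // MongoDB's per-document limit (16 MB)

// Rough estimate only: JSON length is a proxy, not the exact BSON size.
function approximateSizeBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Hypothetical helper: true when the estimate exceeds a chosen fraction of the limit.
function isNearSizeLimit(doc, threshold = 0.5) {
  return approximateSizeBytes(doc) > MAX_BSON_BYTES * threshold;
}

// A post with 1,000 small embedded comments is still far below the limit.
const post = {
  title: "Hello",
  comments: Array.from({ length: 1000 }, (_, i) => ({
    user: `u${i}`,
    message: "Nice post!"
  }))
};

console.log(approximateSizeBytes(post), isNearSizeLimit(post));
```

A check like this can serve as an early warning that an embedded array has grown enough to be worth moving into its own collection.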

Example: Embedding comments in a blog post

In a blog application, you may embed comments directly in a post document, since each comment belongs to exactly one post.

const mongoose = require("mongoose");
const commentSchema = new mongoose.Schema({
  user: { type: String, required: true },
  message: { type: String, required: true },
  date: { type: Date, default: Date.now }
});
const postSchema = new mongoose.Schema({
  title: { type: String, required: true },
  content: { type: String, required: true },
  comments: [commentSchema] // Embedding comments in the post
});
const Post = mongoose.model("Post", postSchema);

In this setup, each post document contains an array of comments, which makes it easy to retrieve the post along with its comments in a single query.
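Adding a comment to an embedded array can be sketched as a single update. The helper name `addComment` is hypothetical, and the model is passed in as a parameter so the function does not depend on a live connection. `updateOne` and the `$push` operator are the standard Mongoose and MongoDB calls.

```javascript
// Appends one comment to the post's embedded `comments` array.
// `Post` is the model from the embedding example, passed in as an argument.
async function addComment(Post, postId, comment) {
  // $push appends to the array; the write touches a single document,
  // so MongoDB applies it atomically.
  return Post.updateOne({ _id: postId }, { $push: { comments: comment } });
}
```

A call might look like `await addComment(Post, postId, { user: "amy", message: "Great article" })`. Because only one document is written, no transaction is needed to keep the post and its comments in step.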

2. Referencing Documents

Referencing (or normalization) stores related data in separate collections and uses ObjectIds to link them. This is ideal for large datasets or when related data needs to be queried independently.

Benefits of referencing

Data reusability: Shared data, such as user profiles, only needs to be stored once and can be referenced by multiple documents.
Smaller documents: Referencing keeps each document small and avoids large nested structures.
Scalability: References provide flexibility as data grows and enable you to manage many-to-many relationships more efficiently.

Limitations of referencing

Multiple queries: Retrieving related data requires additional queries or populate operations, which can increase response time.
Consistency challenges: Separate documents must be updated individually, potentially leading to data inconsistencies.
Complexity: References introduce more complexity to manage relationships and ensure data integrity.

Example: Referencing authors and comments in a blog post

In more complex setups, you might store authors and comments in separate collections and reference them in the post document.

const mongoose = require("mongoose");
const authorSchema = new mongoose.Schema({
  name: String,
  bio: String
});
const commentSchema = new mongoose.Schema({
  // Assumes a "User" model is registered elsewhere in the application
  userId: { type: mongoose.Schema.Types.ObjectId, ref: "User" },
  message: String,
  date: { type: Date, default: Date.now }
});
const postSchema = new mongoose.Schema({
  title: String,
  content: String,
  author: { type: mongoose.Schema.Types.ObjectId, ref: "Author" }, // One author per post
  comments: [{ type: mongoose.Schema.Types.ObjectId, ref: "Comment" }] // Array of comment ids
});
const Author = mongoose.model("Author", authorSchema);
const Comment = mongoose.model("Comment", commentSchema);
const Post = mongoose.model("Post", postSchema);

In this setup, the Post model references the Author and Comment documents using ObjectIds. This approach is best for applications that may need to access or update comments and authors independently of the posts.
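To see what referencing costs at read time, here is a minimal sketch that resolves the comment references by hand. The function name is hypothetical, and the `Post` and `Comment` models are passed in as parameters. It uses only `findById`, `find`, `lean`, and the `$in` operator.

```javascript
// Loads a post and the comment documents it references, using two queries.
async function fetchPostWithComments(Post, Comment, postId) {
  // Query 1: the post, whose `comments` field holds only ObjectIds.
  const post = await Post.findById(postId).lean();
  if (!post) return null;

  // Query 2: the comment documents those ids point to.
  const comments = await Comment.find({ _id: { $in: post.comments } }).lean();

  // Return a copy of the post with the ids replaced by full documents.
  return { ...post, comments };
}
```

Note that `$in` does not guarantee that the comments come back in the same order as the ids stored on the post, so sort them explicitly if order matters.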

Populating references

You can use Mongoose's populate method to fetch referenced data.

// Run inside an async function; postId is the _id of the post to load.
const post = await Post.findById(postId)
  .populate("author")
  .populate("comments");

This query retrieves the post along with the full author and comment documents, providing a complete view of the post, its author, and its comments.
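When the full documents are not needed, populate also accepts an options object. The sketch below is one possible variation, not the only way to do it: the function name, the chosen fields, and the limit of ten are assumptions for illustration. The `path`, `select`, and `options` keys and the `lean` and `exec` calls are part of Mongoose's query API.

```javascript
// Loads a post with a trimmed-down author and its ten most recent comments.
// `Post` is the model from the referencing example, passed in as an argument.
async function loadPostSummary(Post, postId) {
  return Post.findById(postId)
    .populate({ path: "author", select: "name" }) // only the author's name
    .populate({
      path: "comments",
      select: "message date", // skip fields the page does not show
      options: { sort: { date: -1 }, limit: 10 } // newest first, at most ten
    })
    .lean() // return plain objects instead of Mongoose documents
    .exec();
}
```

Selecting only the needed fields and capping the number of populated comments keeps the extra queries small, which softens the main cost of referencing.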

Hybrid approach: combining embedding and referencing

In complex applications, you may need a hybrid approach, where you embed some data and reference other data. For example, in an e-commerce application, you might embed product details in an order but reference customer details.

Example: Embedding products and referencing the customer in an order

const mongoose = require("mongoose");
const productSchema = new mongoose.Schema({
  productId: mongoose.Schema.Types.ObjectId, // Id of the catalog product this line came from
  name: String,
  price: Number,
  quantity: Number
});
const orderSchema = new mongoose.Schema({
  customer: { type: mongoose.Schema.Types.ObjectId, ref: "Customer" }, // Referencing the customer
  products: [productSchema], // Embedding products
  orderDate: { type: Date, default: Date.now }
});
const Order = mongoose.model("Order", orderSchema);

In this example:

The customer is referenced because a customer may have many orders and can be queried independently.
Products are embedded within the order because they are directly related to each specific order.

This approach keeps frequently accessed data together while allowing more complex relationships to be managed separately.
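One practical effect of embedding the products is that an order can be processed without touching any other collection. As a minimal sketch, the hypothetical helper below totals an order from its embedded line items. A plain object with made-up values stands in for an order document.

```javascript
// Sums price * quantity over the products embedded in one order document.
function orderTotal(order) {
  return order.products.reduce((sum, p) => sum + p.price * p.quantity, 0);
}

const order = {
  customer: "customer-id-placeholder", // would be an ObjectId in a real document
  products: [
    { name: "Keyboard", price: 50, quantity: 1 },
    { name: "Cable", price: 5, quantity: 3 }
  ]
};

console.log(orderTotal(order)); // 65
```

Because the name and price are copied into the order, the total stays the same even if the catalog price of a product changes later.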

Conclusion

Managing relationships in Mongoose through embedding and referencing provides flexibility for designing efficient data models in MongoDB. Embedding works well for closely related data and one-to-many relationships, while referencing is better for large or frequently updated related data and for many-to-many relationships.

By understanding the strengths and limitations of each approach, and by considering factors such as data access patterns, document size, and update frequency, you can design a MongoDB schema that is both scalable and efficient. Implement these strategies in your Mongoose applications to effectively manage relationships and create robust, maintainable data models.

Post written by: Hadi Bahadori
