
One frequently asked question when it comes to MongoDB is “How do I structure my schema in MongoDB for my application?” The honest answer is, it depends. Does your application do more reads than writes? What data needs to be together when read from the database? What performance considerations are there? How large are the documents? How large will they get? How do you anticipate your data will grow and scale?

All of these questions, and more, factor into how one designs a database schema in MongoDB. It has been said that MongoDB is schemaless. In fact, schema design is very important in MongoDB. The hard fact is that most performance issues we’ve found trace back to poor schema design.

Over the course of this blog post series, we’ll take a look at twelve common Schema Design Patterns that work well in MongoDB. We hope this series will establish a common methodology and vocabulary you can use when designing schemas. These patterns serve as “building blocks” in schema planning, so that the process becomes more methodology than art.

MongoDB uses a document data model. This model is inherently flexible, allowing data models to support your application needs. That flexibility can also lead to schemas being more complex than they should be. When thinking of schema design, we should be thinking of performance, scalability, and simplicity.

Let’s start our exploration into schema design with a look at what can be thought of as the base for all patterns, the Polymorphic Pattern. This pattern is utilized when we have documents that have more similarities than differences. It’s also a good fit for when we want to keep documents in a single collection.

The Polymorphic Pattern

When all documents in a collection are of similar, but not identical, structure, we call this the Polymorphic Pattern. As mentioned, the Polymorphic Pattern is useful when we want to access (query) information from a single collection. Grouping documents together based on the queries we want to run (instead of separating the object across tables or collections) helps improve performance.

Imagine that our application tracks professional sports athletes across all different sports.

We still want to be able to access all of the athletes in our application, but the attributes of each athlete are very different. This is where the Polymorphic Pattern shines. In the example below, we store data for athletes from two different sports in the same collection. The data stored about each athlete does not need to be the same even though the documents are in the same collection.

Polymorphic Design Pattern with Common Fields

Professional athlete records have some similarities, but also some differences. With the Polymorphic Pattern, we are easily able to accommodate these differences. If we were not using the Polymorphic Pattern, we might have a collection for Bowling Athletes and a collection for Tennis Athletes. When we wanted to query on all athletes, we would need to do a time-consuming and potentially complex join. Instead, since we are using the Polymorphic Pattern, all of our data is stored in one Athletes collection and querying for all athletes can be accomplished with a simple query.
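A minimal sketch of what such a collection might look like, using plain Python dictionaries to stand in for MongoDB documents. The athletes, field names, and values are all hypothetical, and the `find` helper is only a stand-in for a real query, assuming simple equality matching.

```python
# Plain dictionaries stand in for documents in one "athletes" collection.
# Every name, field, and value here is hypothetical.
athletes = [
    {
        "sport": "bowling",
        "athlete_name": "Example Bowler",
        "career_earnings": 100_000,
        "perfect_games": 3,            # bowling-specific field
    },
    {
        "sport": "tennis",
        "athlete_name": "Example Tennis Player",
        "career_earnings": 250_000,
        "career_titles": 12,           # tennis-specific fields
        "plays": "left-handed",
    },
]


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# One query over one collection returns every athlete, whatever the sport.
all_names = [doc["athlete_name"] for doc in find(athletes)]

# Filtering on a common field narrows the result to a single sport.
tennis_names = [doc["athlete_name"] for doc in find(athletes, sport="tennis")]

# The fields shared by every document are what make cross-sport queries easy.
common_fields = set.intersection(*(set(doc) for doc in athletes))
```

With a real driver, the two lookups would roughly correspond to calling `find({})` and `find({"sport": "tennis"})` on the collection, with no join required.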

This design pattern can flow into embedded sub-documents as well. In the above example, Martina Navratilova didn’t just compete as a single player, so we might want to structure her record as follows:

Polymorphic Design Pattern with sub-documents
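One way such a record might be laid out, again sketched with Python dictionaries. The `events` array, its `type` field, and the `career_titles` field are assumed names, and the statistics are deliberately left as `None` placeholders rather than real career figures.

```python
# Hypothetical layout: one sub-document per kind of event inside "events".
# Statistic values are None placeholders, not real career figures.
tennis_player = {
    "sport": "tennis",
    "athlete_name": "Martina Navratilova",
    "events": [
        {"type": "singles", "career_titles": None},
        {"type": "doubles", "career_titles": None},
    ],
}

# A document from another sport simply omits the "events" array.
bowler = {"sport": "bowling", "athlete_name": "Example Bowler"}


def event_types(doc):
    """List the event types in a document; empty if it has no "events" array."""
    return [event["type"] for event in doc.get("events", [])]
```

Both documents can live in the same collection; code that reads them only needs to tolerate the sub-document array being absent.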

From an application development standpoint, when using the Polymorphic Pattern we’re going to look at specific fields in the document or sub-document to be able to track differences. We’d know, for example, that a tennis player might be involved with different events, while an athlete in a different sport may not be. This will typically require different code paths in the application code based on the information in a given document. Or, perhaps, different classes or subclasses are written to handle the differences between tennis, bowling, soccer, and rugby players.
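The subclass approach could be sketched as follows. This is an illustration under assumed names: the `sport` field is taken to be the discriminator, and `Athlete`, `TennisPlayer`, `CLASS_BY_SPORT`, and `load` are hypothetical, not part of any MongoDB API.

```python
from dataclasses import dataclass, field


@dataclass
class Athlete:
    """Fallback class for sports that need no special handling."""
    name: str
    sport: str

    @classmethod
    def from_doc(cls, doc):
        return cls(name=doc["athlete_name"], sport=doc["sport"])

    def summary(self):
        return f"{self.name} ({self.sport})"


@dataclass
class TennisPlayer(Athlete):
    """Tennis documents may carry an "events" array of sub-documents."""
    events: list = field(default_factory=list)

    @classmethod
    def from_doc(cls, doc):
        return cls(
            name=doc["athlete_name"],
            sport=doc["sport"],
            events=list(doc.get("events", [])),
        )

    def summary(self):
        kinds = ", ".join(event["type"] for event in self.events) or "no events"
        return f"{self.name} (tennis: {kinds})"


# Map the discriminator value to the class that understands that shape.
CLASS_BY_SPORT = {"tennis": TennisPlayer}


def load(doc):
    """Pick a class from the document's "sport" field and build an instance."""
    return CLASS_BY_SPORT.get(doc.get("sport"), Athlete).from_doc(doc)


tennis = load({
    "sport": "tennis",
    "athlete_name": "Example Tennis Player",
    "events": [{"type": "singles"}, {"type": "doubles"}],
})
bowling = load({"sport": "bowling", "athlete_name": "Example Bowler"})
```

Keeping the mapping in one dictionary means supporting a new sport is a matter of adding a subclass and one entry, while unknown sports still load through the base class.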

Sample Use Case

One example use case of the Polymorphic Pattern is Single View applications. Imagine working for a company that, over the course of time, acquires other companies, each with its own technology and data patterns. Each acquired company may have many databases, each modeling “insurances with their customers” in a different way, and you want to integrate all of those systems into one. Merging these different systems into a unified SQL schema is costly and time-consuming.

MetLife was able to leverage MongoDB and the Polymorphic Pattern to build their single view application in a few months. Their Single View application aggregates data from multiple sources into a central repository allowing customer service, insurance agents, billing, and other departments to get a 360° picture of a customer. This has allowed them to provide better customer service at a reduced cost to the company. Further, using MongoDB’s flexible data model and the Polymorphic Pattern, the development team was able to innovate quickly to bring their product online.

A Single View application is one use case of the Polymorphic Pattern. It also works well for things like product catalogs where a bicycle has different attributes than a fishing rod. Our athlete example could easily be expanded into a more full-fledged content management system and utilize the Polymorphic Pattern there.

Conclusion

The Polymorphic Pattern is used when documents have more similarities than they have differences. Typical use cases for this type of schema design would be:

Single View applications
Content management
Mobile applications
A product catalog

The Polymorphic Pattern provides an easy-to-implement design that allows for querying across a single collection and is a starting point for many of the design patterns we’ll be exploring in upcoming posts. The next pattern we’ll discuss is the Attribute Pattern.

If you have questions, please leave a comment below.

This post was originally published on the MongoDB Blog.

In a previous post on Schema Design, I mentioned a book on the subject that I hadn’t, at the time, read. After hearing The Little Mongo DB Schema Design Book by Christian Kvalheim mentioned elsewhere, I thought I would see what it was all about. The book was published in May of 2015. Even though it is a bit old, the coverage of schema design is still relevant.

MongoDB Overview

Kvalheim starts off the book with a quick introduction to MongoDB and some basic principles of schema design before moving into some examples of data modeling patterns. I thought his discussion of One-To-One, One-To-Many, and Many-To-Many data models was well done. He used some good examples of blogs and users to explain the concepts in an easy-to-follow fashion.

Kvalheim moves on from there to give an overview of the storage engines available in MongoDB, specifically the MMAP and WiredTiger storage engines. This provides nice coverage for those using older, pre-3.2 instances of MongoDB, as well as those who have opted to upgrade to more recent versions. At the time of this writing, version 3.6 is the most current.

After the discussion on storage engines, we are provided with information on indexes and sharding concepts before diving into specifics about schema design itself.

Schema Design Patterns

Once we move into the design pattern section of the book, Kvalheim does a nice job of breaking each design option down. He follows a consistent format for each pattern, discussing the unique aspects of typical data modeling patterns. He showcases their operations and provides recommendations for indexing, scaling, and performance implications.

The examples are done very well and provide some great coverage of a wide variety of use cases for data storage. Some example schema designs covered are:

Time Series
Account Transactions
Internationalization
Shopping Carts
Reservations

In total, eleven distinct design concepts are explored.

Improvements

One of the downsides to print books about technology topics is the speed at which the information changes. There are indeed a lot of installations of MongoDB using versions before version 3.2. Being a user of MongoDB after 3.2, I found the discussions of the MMAP storage engine to be less relevant than they were in 2015.

There were a few typesetting issues in this schema design book, but I didn’t find those to be too troubling. They generally just required rereading a sentence a time or two to grasp its meaning.

Wrap Up on the Schema Design Book

Overall, I found this book to be a great resource for schema design. Definitely an excellent addition to one’s library for application development when using MongoDB as a database. There are some features in post-2015 releases of MongoDB that assist developers and database administrators in schema management as well. Document validation was introduced in version 3.2. Version 3.6 extended the validation process with schema validation.
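To give a feel for the schema validation added in 3.6, here is a small sketch of a validator written as a Python dictionary in the `$jsonSchema` form. The field names are assumptions carried over from an imagined athletes collection, and `missing_required` is a hypothetical local helper that only mimics the `required` check; real enforcement happens on the server.

```python
# A validator in the $jsonSchema form used by MongoDB 3.6 and later.
# The field names are hypothetical.
athlete_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["sport", "athlete_name"],
        "properties": {
            "sport": {"bsonType": "string"},
            "athlete_name": {"bsonType": "string"},
            "events": {"bsonType": "array"},
        },
    }
}


def missing_required(doc, validator=athlete_validator):
    """Local illustration only: list the required fields a document lacks."""
    required = validator["$jsonSchema"]["required"]
    return [name for name in required if name not in doc]
```

With PyMongo, a dictionary like this would typically be supplied as the `validator` option when creating the collection. Only the listed fields are constrained, so documents remain free to carry sport-specific extras.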

Follow me on Twitter @kenwalger to get the latest updates on my postings. Or better yet, sign up for the email list to get updates in your mailbox!

There are a few MongoDB specific terms in this post. I created a MongoDB Dictionary skill for the Amazon Echo line of products. Check it out and you can say “Alexa, ask MongoDB for the definition of a document?” and get a helpful response.



Building With Patterns: The Polymorphic Pattern