Book Review: The Little Mongo DB Schema Design Book

In a previous post on schema design, I mentioned a book on the subject that I hadn’t, at the time, read. After hearing The Little Mongo DB Schema Design Book by Christian Kvalheim mentioned elsewhere, I thought I would see what it was all about. The book was published in May of 2015. Even though it is a bit old, its coverage of schema design is still relevant.

MongoDB Overview

Kvalheim starts off the book with a quick introduction to MongoDB and some basic principles of schema design before moving into some examples of data modeling patterns. I thought his discussion of One-To-One, One-To-Many, and Many-To-Many data models was well done. He used some good examples of blogs and users to explain the concepts in an easy-to-follow fashion.
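
To make the One-To-Many idea concrete, here is a small sketch of my own (not taken from the book) showing the two usual ways of modeling a blog post and its comments: embedding and linking. The field names, such as post_id, are hypothetical and chosen just for illustration.

```javascript
// One-to-many by embedding: comments live inside the blog post document.
const embeddedPost = {
  _id: 1,
  title: "Schema design notes",
  comments: [
    { author: "ann", text: "Nice post" },
    { author: "bob", text: "Thanks for sharing" }
  ]
};

// One-to-many by linking: each comment is its own document and points back
// to the post through a (hypothetical) post_id field.
const linkedPost = { _id: 1, title: "Schema design notes" };
const linkedComments = [
  { _id: 101, post_id: 1, author: "ann", text: "Nice post" },
  { _id: 102, post_id: 1, author: "bob", text: "Thanks for sharing" }
];

// With linking, reading a post's comments takes a second lookup.
const commentsForPost = linkedComments.filter(
  (comment) => comment.post_id === linkedPost._id
);
```

Embedding keeps everything in one read, while linking keeps the post document small when the number of comments can grow without bound.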

Kvalheim moves on from there to an overview of the storage engines available in MongoDB, specifically the MMAP and WiredTiger storage engines. This provides nice coverage for those using older, pre-3.2 instances of MongoDB, as well as those who have opted to upgrade to more recent versions. At the time of this writing, version 3.6 is the most current.

After the discussion of storage engines, we are given information on indexes and sharding concepts before diving into the specifics of schema design itself.

Schema Design Patterns

Once we move into the design pattern section of the book, Kvalheim does a nice job of breaking each design option down. He follows a consistent format for each pattern: he discusses its unique aspects, showcases its operations, and provides recommendations for indexing, scaling, and performance.

The examples are done very well and provide some great coverage of a wide variety of use cases for data storage. Some example schema designs covered are:

  • Time Series
  • Account Transactions
  • Internationalization
  • Shopping Carts
  • Reservations

In total, eleven distinct design concepts are explored.

Improvements

One of the downsides of print books about technology topics is the speed at which the information changes. There are indeed a lot of MongoDB installations running versions before 3.2. As a user of MongoDB versions after 3.2, however, I found the discussions of the MMAP storage engine to be less relevant than they would have been in 2015.

There were a few typesetting issues in the book, but I didn’t find them too troubling. They generally just required rereading a sentence a time or two to grasp its meaning.

Wrap Up on the Schema Design Book

Overall, I found this book to be a great resource for schema design. It is definitely an excellent addition to one’s library for application development when using MongoDB as a database. There are also some features in post-2015 releases of MongoDB that assist developers and database administrators with schema management. Document validation was introduced in version 3.2, and version 3.6 extended it with schema validation.


Follow me on Twitter @kenwalger to get the latest updates on my postings. Or better yet, sign up for the email list to get updates in your mailbox!

There are a few MongoDB specific terms in this post. I created a MongoDB Dictionary skill for the Amazon Echo line of products. Check it out and you can say “Alexa, ask MongoDB for the definition of a document?” and get a helpful response.

Schema Validation in MongoDB 3.6

MongoDB 3.6 brings lots of great new features. I’ve already covered Change Streams and Retryable Writes in previous posts. This post covers schema validation, a feature which expands upon the document validation introduced in MongoDB 3.2.

There are times when enforcing a strict data structure and content is required, even with the flexible schema and document data model that MongoDB provides. Schema validation makes it possible to define a prescribed document structure for each collection. If one tries to insert or update a document which does not conform to that structure, the operation can be rejected. The rules for document structure are based on the JSON Schema draft specification.
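
As a rough sketch of how the two generations of validation differ, the snippet below expresses similar rules first as a 3.2-style query expression and then with $jsonSchema. The collection names recipes_query_style and recipes_schema_style are hypothetical, and the two validators are only approximately equivalent. The db calls run only inside the mongo shell.

```javascript
// MongoDB 3.2 style: the validator is an ordinary query expression.
const queryValidator = {
  name: { $type: "string" },
  servings: { $gte: 0 }
};

// MongoDB 3.6 style: roughly the same rules expressed with $jsonSchema.
const schemaValidator = {
  $jsonSchema: {
    bsonType: "object",
    required: ["name", "servings"],
    properties: {
      name: { bsonType: "string" },
      servings: { bsonType: ["int", "double"], minimum: 0 }
    }
  }
};

// `db` only exists inside the mongo shell, so guard the server calls.
if (typeof db !== "undefined") {
  db.createCollection("recipes_query_style", { validator: queryValidator });
  db.createCollection("recipes_schema_style", { validator: schemaValidator });
}
```

The $jsonSchema form is more verbose, but it describes the shape of the whole document in one place, including nested arrays and objects.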

Schema Validation

Let’s take a quick look at how this works in action before diving into a discussion of feature benefits.

Imagine a collection of food recipes. Each recipe will have a recipe name, number of servings, cooking method, ingredients, and list of instructions. For the ingredients, we want to enforce that there is a numerical quantity, a measure of quantity, the ingredient name, and an optional value for any prep work on the ingredient, such as “peeled” or “brunoise”. The options in this example are not all-inclusive; they simply serve as illustrations. I hope my former culinary colleagues forgive me.

We’ll begin with creating a recipes collection and assigning the schema validation rules with the validator option and the new $jsonSchema operator.

db.createCollection( "recipes",
{
  validator: 
    {
      $jsonSchema:
        {
          bsonType: "object",
          required: ["name", "servings", "ingredients"],
          additionalProperties: false,
          properties:
            {
              _id: {},
              name: {
                bsonType: "string",
                description: "'name' is required and is a string"
                    },
              servings: {
                bsonType: ["int", "double"],
                minimum: 0,
                description: "'servings' is required and must be a number greater than or equal to zero."
                    },
              cooking_method: {
                enum: ["broil", "grill", "roast", "bake", "saute", "pan-fry", "deep-fry", "poach", "simmer", "boil", "steam", "braise", "stew"],
                description: "'cooking_method' is optional but, if used, must be one of the listed options."
                    },
              ingredients: 
              {
                bsonType: ["array"],
                minItems: 1,
                maxItems: 50,
                items: {
                  bsonType: ["object"],     
                  required: ["quantity", "measure", "ingredient"],
                  additionalProperties: false,
                  description: "'ingredients' must contain the stated fields.",
                  properties: 
                  {
                    quantity: {
                      bsonType: ["double", "decimal"],
                      description: "'quantity' is required and is of double or decimal type"
                            },
                    measure: {
                      enum: ["tsp", "Tbsp", "cup", "ounce", "pound",  "each"],
                      description: "'measure' is required and can only be one of the given enum values"
                            },
                    ingredient: {
                      bsonType: "string",
                      description: "'ingredient' is required and is a string"
                            },
                    format: {
                      bsonType: "string",
                      description: "'format' is an optional field of type string"
                            }
                  }
              }
           }
        }
     }
   }
})

Our validator can include many more rules that are beyond the scope of this introduction to schema validation. However, even this small example shows how powerful the feature is. Let’s look at a couple of example documents to see what happens when we try to insert them into the collection.

Sample Inserts
db.recipes.insertOne({name: "Chocolate Sponge Cake Filling", 
servings: 4, 
ingredients: [{quantity: 7, measure: "ounce", ingredient: "bittersweet chocolate", format: "chopped"}, 
{quantity: 2, measure: "cup", ingredient: "heavy cream"}
]})

This insert works since it covers all of the required fields in their proper format. It also doesn’t include any invalid extra fields which would be prohibited since we have additionalProperties: false set. If we were to try to insert the following document, however, we would get an error.

db.recipes.insertOne({name: "Chocolate Sponge Cake Filling", 
servings: 4, 
ingredients: [{quantity: 7, measure: "ounce", ingredient: "bittersweet chocolate", format: "chopped"}, 
{quantity: 2, measure: "cup", ingredient: "heavy cream"}],
directions: "Boil cream and pour over chocolate. Stir until chocolate is melted."
})

Since we added a directions field to our recipe document, the insert fails because additionalProperties: false is set. I should add an optional directions field to the schema validation rules to allow for that, as directions are indeed important.
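
One possible way to make that change, sketched under the assumption that the recipes collection already has the validator created earlier: read the current validator, add a directions property, and apply the result with the collMod command. The helper name withDirections is my own and not part of MongoDB. The db calls run only inside the mongo shell.

```javascript
// Return a copy of a $jsonSchema validator with an optional string field
// named "directions" added to its top-level properties.
// (withDirections is a hypothetical helper, not a MongoDB function.)
function withDirections(validator) {
  const schema = validator.$jsonSchema;
  return {
    $jsonSchema: Object.assign({}, schema, {
      properties: Object.assign({}, schema.properties, {
        directions: {
          bsonType: "string",
          description: "'directions' is an optional field of type string"
        }
      })
    })
  };
}

// `db` only exists inside the mongo shell, so guard the server calls.
if (typeof db !== "undefined") {
  // Fetch the validator currently attached to the collection.
  const current = db.getCollectionInfos({ name: "recipes" })[0].options.validator;
  // collMod replaces the whole validator, so send the full updated schema.
  db.runCommand({ collMod: "recipes", validator: withDirections(current) });
}
```

Because directions is not added to the required list, existing recipes without it remain valid.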

Schema Validation Benefits

Even with these basic examples, I think it is clear that schema validation is a great new enhancement. The flexibility of a dynamic schema in MongoDB can now easily be paired with data governance controls over an entire collection. While there are lots of practical benefits from this, here are some specific benefits.

  1. Application logic simplification. With a strict definition of what a collection looks like, there isn’t a need for the application to provide those guarantees itself and handle the associated errors.
  2. Control of data. Client applications must adhere to the rules set forth by the collection. There is no need to worry about data being inserted or updated with incorrect field names or data types.
  3. Governmental compliance. Applications and data models in a variety of industries and locales require data to be stored in specific formats. The European Union’s General Data Protection Regulation (GDPR) is one example.

Wrap Up

With MongoDB 3.6 schema validation, administrators have tunable controls over the database. Documents that don’t conform to a prescribed set of conditions can either be rejected or be written anyway with a message logged about the violation. The feature also brings the ability to query based on the schema definition, allowing, for example, a search for all documents that don’t conform to the schema.
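
Both of those controls can be sketched briefly. The schema below is a deliberately trimmed-down stand-in for the full recipes schema, used only to keep the example short. The db calls run only inside the mongo shell.

```javascript
// A trimmed-down schema for illustration; in practice this would be the
// full schema used when the recipes collection was created.
const recipeSchema = {
  bsonType: "object",
  required: ["name", "servings", "ingredients"]
};

// collMod command that keeps validation on but logs a warning instead of
// rejecting the write. The default validationAction is "error".
const warnOnly = {
  collMod: "recipes",
  validationAction: "warn"
};

// Query filter matching documents that do NOT satisfy the schema.
const nonConforming = { $nor: [{ $jsonSchema: recipeSchema }] };

// `db` only exists inside the mongo shell, so guard the server calls.
if (typeof db !== "undefined") {
  db.runCommand(warnOnly);
  printjson(db.recipes.find(nonConforming).toArray());
}
```

Switching to "warn" can be handy while rolling out a new schema, since existing applications keep working while the log and the $nor query show which documents still need attention.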

Schema validation is another exciting feature in the 3.6 release of MongoDB. I would urge you to download it and try it out. I’d bet you’ll be as excited about it as I am.


There are a few MongoDB specific terms in this post. I created a MongoDB Dictionary skill for the Amazon Echo line of products. Check it out and you can say “Alexa, ask MongoDB for the definition of a document?” and get a helpful response. I also created a culinary skill for the Echo if you’d like to update your culinary knowledge as well.