Schema Validation in MongoDB 3.6

MongoDB 3.6 brings lots of great new features with the new release. I’ve already covered Change Streams and Retryable Writes in previous posts. This post will cover a feature which expands upon the document validation feature from MongoDB 3.2, schema validation. Schema validation allows for teams to define a prescribed document structure for each collection.

There are times when enforcing strict data structures and content are required, even with the flexible schema and document data model that MongoDB provides. Schema validation allows for the ability to define a prescribed document structure for each collection. If one tries to insert or update a document which does not conform to the applied structure the operation can be rejected. The rules for document structure are based on the JSON schema draft specification.

Schema Validation

Let’s take a quick look at how this works in action before diving into a discussion of feature benefits.

Imagine a collection of food recipes. Each recipe will have a recipe name, number of servings, cooking method, ingredients, and list of instructions. For the ingredients, we want to enforce that there is a numerical quantity, a measure of quantity, the ingredient name, and an optional value for any prep work on the ingredient, such as “peeled” or “brunoise“.  For our example here my options are not all-inclusive for options but simply serve as examples. I hope my former culinary colleagues forgive me.

We’ll begin with creating a recipes collection and assigning the schema validation rules with the validator option and the new $jsonSchema operator.

db.createCollection( "recipes",
{
  validator: 
    {
      $jsonSchema:
        {
          bsonType: "object",
          required: ["name", "servings", "ingredients"],
          additionalProperties: false,
          properties:
            {
              _id: {},
              name: {
                bsonType: "string",
                description: "'name' is required and is a string"
                    },
              servings: {
                bsonType: ["int", "double"],
                minimum: 0,
                description: "'servings' is required and must be an integer greater than zero."
                    },
              cooking_method: {
                 enum: ["broil", "grill", "roast", "bake", "saute", "pan-fry", "deep-fry", "poach", "simmer", "boil", "steam", "braise", "stew"],
              description: "'cooking_method' is optional but, if used, must be one of the listed options."
                    },
              ingredients: 
              {
                bsonType: ["array"],
                minItems: 1,
                maxItems: 50,
                items: {
                  bsonType: ["object"],     
                  required: ["quantity", "measure", "ingredient"],
                  additionalProperties: false,
                  description: "'ingredients' must contain the stated fields.",
                  properties: 
                  {
                    quantity: {
                      bsonType: ["double", "decimal"],
                      description: "'quantity' is required and is of double or decimal type"
                            },
                    measure: {
                      enum: ["tsp", "Tbsp", "cup", "ounce", "pound",  "each"],
                      description: "'measure' is required and can only be one of the given enum values"
                            },
                    ingredient: {
                      bsonType: "string",
                      description: "'ingredient' is required and is a string"
                            },
                    format: {
                      bsonType: "string",
                      description: "'format' is an optional field of type string"
                            }
                  }
              }
           }
        }
     }
   }
})

Our validator can include many more rules that are beyond the scope of this introduction to schema validation. However, we can see how powerful this feature is in this small example. Let’s look at some example documents to insert as well to see what would happen if we try to insert a document into the collection.

Sample Inserts
db.recipes.insertOne({name: "Chocolate Sponge Cake Filling", 
servings: 4, 
ingredients: [{quantity: 7, measure: "ounce", ingredient: "bittersweet chocolate", format: "chopped"}, 
{quantity: 2, measure: "cup", ingredient: "heavy cream"}
]})

This insert works since it covers all of the required fields in their proper format. It also doesn’t include any invalid extra fields which would be prohibited since we have additionalProperties: false set. If we were to try to insert the following document, however, we would get an error.

db.recipes.insertOne({name: "Chocolate Sponge Cake Filling", 
servings: 4, 
ingredients: [{quantity: 7, measure: "ounce", ingredient: "bittersweet chocolate", format: "chopped"}, 
{quantity: 2, measure: "cup", ingredient: "heavy cream"}],
directions: "Boil cream and pour over chocolate. Stir until chocolate is melted."
})

Since we added in a directions field into our recipe document, the insert will fail with additionalProperties: false set. I should add in an optional directions field into the schema validation to allow for that as directions are indeed important.

Schema Validation Benefits

Even with these basic examples, I think it is clear that schema validation is a great new enhancement. The flexibility of a dynamic schema in MongoDB can now easily be paired with data governance controls over an entire collection. While there are lots of practical benefits from this, here are some specific benefits.

  1. Application logic simplification. With a strictly defined definition of what a collection looks like, there isn’t a need for the application to handle the guarantees and associated errors.
  2. Control of data. Client applications must adhere to the rules set forth by the collection. No need to worry about data being updated or inserted that has incorrect field names or data types.
  3. Governmental compliance. There are applications and data models in a variety of industries and locales that require data to be stored in specific formats. For example, the EU General Data Protection Regulation in the European Union.

Wrap Up

With MongoDB 3.6 schema validation, administrators have tunable controls over the database. Documents that don’t conform to a prescribed set of conditions can be rejected or still written and a message logged about the action. It also brings with it the ability to query based on the schema definition to allow, for example, a search on all documents that don’t conform to the schema.

Schema validation is another exciting feature in the 3.6 release of MongoDB. I would urge you to download it and try it out. I’d bet you’ll be as excited about it as I am.


Follow me on Twitter @kenwalger to get the latest updates on my postings. Or better yet, sign up for the email list to get updates in your mailbox!

There are a few MongoDB specific terms in this post. I created a MongoDB Dictionary skill for the Amazon Echo line of products. Check it out and you can say “Alexa, ask MongoDB for the definition of a document?” and get a helpful response. I also created a culinary skill for the Echo if you’d like to update your culinary knowledge as well.

Facebooktwittergoogle_plusredditlinkedinmail