NoSQL Archives | Page 21 of 26 | Blog of Ken W. Alger

There are many considerations when running queries in MongoDB. A helpful tool in the mongo shell when running a find() operation is the explain() method. In this blog post, I’ll take a look at some of the options for explain() and what the results mean.

explain()

As discussed in a previous post on indexing in MongoDB, we can use the explain() method to learn about the selected query plan. This allows for an examination of the performance of a given query. It can be used in the following manner:

db.collection.find().explain()

The information generated can be used to see which index is being used for a query, whether the query is a covered query, and which servers in a sharded collection the query is run against, to name a few.

Three different verbosity modes can be utilized to determine the amount of information provided.

Verbosity modes

queryPlanner – the query provided in the find() method is put through the query optimizer to find the most efficient plan. Details of this “winning plan” are returned, but the query itself is not run in this mode. As a result, execution figures such as query time are not measured. This is the default mode of explain().
executionStats – when running in this mode, the query optimizer is run and the query is fully executed. The information returned details what actually happened during that specific query.
allPlansExecution – as the name might suggest, this mode returns information about all of the query plans the optimizer considered. While the winning plan is executed and statistics returned for it, other candidate plan information is returned as well.
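To make the three modes concrete, here is a minimal sketch of building the document for the explain database command with a chosen verbosity. The helper name buildExplainCommand is my own for illustration and is not part of MongoDB; the command shape (explain, verbosity) is the one the explain command accepts.

```javascript
// Verbosity modes accepted by explain(), in increasing order of detail.
const VERBOSITY_MODES = ["queryPlanner", "executionStats", "allPlansExecution"];

// Hypothetical helper (not a MongoDB API): builds the document for the
// `explain` database command wrapping a find on the given collection.
function buildExplainCommand(collection, filter, verbosity = "queryPlanner") {
  if (!VERBOSITY_MODES.includes(verbosity)) {
    throw new Error("Unknown verbosity mode: " + verbosity);
  }
  return {
    explain: { find: collection, filter: filter },
    verbosity: verbosity,
  };
}

const cmd = buildExplainCommand("users", { username: "User_9" }, "executionStats");
// In the mongo shell, db.runCommand(cmd) should give the same kind of output as
// db.users.find({ username: "User_9" }).explain("executionStats").
```

Rejecting unknown mode names up front is a design choice of this sketch; it just avoids sending a typo to the server.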

The variety of information these different modes provide can be extremely useful. Let’s take a look at some returned results of explain() and walk through what they show.

Results

For this example, I will use a test database for a blog. The database contains two collections, users and articles, and is running on a single, unsharded machine. Each collection has roughly 550,500 documents and is not indexed beyond the index for _id.

Let’s start by looking at what gets returned from a query for a single username, and take a look at some of the bits and pieces of information provided.

db.users.find( { "username": "User_9"} ).explain()

The parsedQuery section is the query we are exploring. The query stage provides a description of the type of operation that occurred for the winning plan.

Operation Types

COLLSCAN – indicates a collection scan occurred for the query, meaning that the query looked at each document to get the results
IXSCAN – indicates an index was used for the query
FETCH – for retrieving documents
SHARD_MERGE – the result of merging data from shards

The stage is a tree structure and can have multiple child stages. The direction of the query shows whether the query was performed in forward or reverse order. The serverInfo section displays information on the server the query was run against and includes, in the version key, the version of the MongoDB database. If the collection was in a sharded environment, each accessed shard would be listed in the serverInfo.
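Since the stages form a tree, a small recursive walk can list them. The sketch below is only an illustration: the two plan objects are made-up samples shaped like the winningPlan section of explain() output (the index name username_1 is hypothetical), and listStages is my own helper name.

```javascript
// Made-up sample: what an unindexed lookup's winning plan roughly looks like.
const collScanPlan = {
  stage: "COLLSCAN",
  filter: { username: { $eq: "User_9" } },
  direction: "forward",
};

// Made-up sample: a FETCH stage fed by an IXSCAN child stage.
const indexedPlan = {
  stage: "FETCH",
  inputStage: {
    stage: "IXSCAN",
    indexName: "username_1", // hypothetical index name
    direction: "forward",
  },
};

// Walks the stage tree depth-first and returns the stage names in order.
// Child stages appear under `inputStage` (one child) or `inputStages` (several).
function listStages(plan) {
  if (!plan) return [];
  const children = [];
  if (plan.inputStage) children.push(plan.inputStage);
  if (Array.isArray(plan.inputStages)) children.push(...plan.inputStages);
  return [plan.stage].concat(...children.map(listStages));
}

console.log(listStages(collScanPlan)); // [ 'COLLSCAN' ]
console.log(listStages(indexedPlan));  // [ 'FETCH', 'IXSCAN' ]
```

In the mongo shell, the same function could be given the queryPlanner.winningPlan field of a real explain() result.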

When the command is run using the “executionStats” verbosity mode:

db.users.find({ "username": "User_9"} ).explain("executionStats")

additional information is provided as a result of the query being run on the data.

Here we see, among other things, the time the query took to run, along with how many documents were returned, nReturned, and how many documents were examined by the database, totalDocsExamined. As mentioned in my post on indexing, ideally these two numbers should be very close to the same value.
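As a rough way to put a number on that comparison, the sketch below divides totalDocsExamined by nReturned. The helper name is my own, and the sample figures are made up to resemble an unindexed lookup over the whole collection; they are not measured results.

```javascript
// Hypothetical helper: documents examined per document returned.
// A value near 1 means the query touched little more than what it returned.
function docsExaminedPerReturned(executionStats) {
  const { nReturned, totalDocsExamined } = executionStats;
  if (nReturned === 0) {
    // Nothing returned: either nothing was examined, or all work was wasted.
    return totalDocsExamined === 0 ? 1 : Infinity;
  }
  return totalDocsExamined / nReturned;
}

// Made-up figures: one match found by scanning every document.
const sampleStats = { nReturned: 1, totalDocsExamined: 550500 };

console.log(docsExaminedPerReturned(sampleStats)); // 550500
```

In the mongo shell, the input would be the executionStats field of the document returned by explain("executionStats"). A large ratio like this one is a hint that the queried field may benefit from an index.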

Wrap Up

There is a lot of information available when using the explain() method. It provides some great information about how queries are actually being run and gives an indication as to where a collection can benefit from an index. It should be your first stop when examining slow queries before moving onto other MongoDB tools.

There are a lot of MongoDB specific terms in this post. I created a MongoDB Dictionary skill for the Amazon Echo line of products. Check it out and you can say “Alexa, ask MongoDB what is an index?” and get a helpful response.

Follow me on Twitter @kenwalger to get the latest updates on my postings.

I have discussed one of the GUI tools MongoDB offers, Compass, previously. Sometimes, however, using the command line interface (CLI) is required. MongoDB provides some very helpful CLI tools. Let’s have a quick look at what is included in the MongoDB installation package, and what the tools and files do.

Package Components

There are two main types of files which come included in the MongoDB download: process/service files and tools. The process files, which are the core components of the MongoDB system, include the following:

Process

mongod, which is the core database process
mongos, which controls and routes queries in a sharded environment
mongo, which is the interactive, JavaScript-based MongoDB shell

Service

In the Windows download there are some additional files for running and configuring MongoDB as a Windows Service.

mongod.exe, the core database
mongos.exe, the sharded environment controller

These are the main applications you will find yourself using most often to get the server up and running (mongod) and interacting with the server in the mongo shell (mongo). It is possible to get some server information with shell database methods such as serverStatus() and stats(). However, there are some CLI tools which offer much more detailed information for us.

CLI Tools

There are a couple of different buckets into which the CLI tools fall: import/export tools and diagnostic tools. Let’s take a closer look:

Import/Export Tools

As with most databases, having a way to bring data into and out of the database is extremely useful. MongoDB is no different. Being a document database doesn’t mean that it can’t provide a way to utilize structured data when needed, or a way to export its rich document data for use elsewhere.

MongoDB stores data on disk in BSON format and allows for the importing (restoration) and exporting (dumping) of files in this format with the following CLI tools.

mongodump generates BSON dump files from a running mongod server.
mongorestore allows for the restoration of those dump files.

There is also an included application, bsondump, which will convert the BSON dump files into JSON files. Recall that BSON is a binary form of JSON and includes some important features such as data typing, for example date, integer, long, double, and decimal.
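To see why that typing matters, here is a small sketch in plain JavaScript (nothing MongoDB-specific is assumed) of what happens when typed values pass through ordinary JSON text: a date comes back as a string, and a 64-bit integer beyond the range a double can represent exactly loses precision.

```javascript
// A date survives a trip through plain JSON only as a string.
const original = { created: new Date(0) };
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(typeof roundTripped.created); // "string"
console.log(roundTripped.created);        // "1970-01-01T00:00:00.000Z"

// A "long" value above 2^53 cannot be held exactly by a JSON number in JavaScript.
const parsedLong = JSON.parse("9007199254740993");
console.log(parsedLong === 9007199254740992); // true: the last digit was lost
```

BSON avoids both problems by storing an explicit type alongside each value.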

For working with formats other than BSON, MongoDB provides support for importing and exporting data in JSON, CSV, or TSV format. These can be especially useful when bringing in established relational data of many types.

mongoimport brings the JSON, CSV, or TSV formatted data into a running MongoDB database
mongoexport, yep you guessed it, exports the database data.

CLI Tools for Diagnosis

This is where the real workhorses come in for examining the health of your MongoDB server environment. The provided tools allow for the examination of the current operation of a MongoDB server. One can also look at, and capture, network traffic or manage LDAP configurations.

mongostat is great for obtaining an overview of the status of a running mongod or mongos instance. For example, it can provide information regarding the number of inserts, queries, updates, or deletes, and lots more.
[Image: mongostat results while doing large inserts]
mongotop looks at the time spent reading and writing data at a collection level. It provides a high-level view of where MongoDB is spending its time.
[Image: mongotop results while doing inserts]
mongoperf checks disk performance
mongoreplay is pretty cool. It allows for, among other things, the capture of commands sent to a MongoDB instance and the ability to replay them in a different environment. This is very handy for testing and troubleshooting.
mongoldap allows for testing LDAP configuration options against LDAP server(s).

Other Tools

The last tool which comes in the MongoDB package is mongofiles which allows for interaction with GridFS objects. GridFS allows for the storage of files larger than the BSON limit of 16MB per document.
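As a back-of-the-envelope illustration, the sketch below checks whether a file is larger than the 16MB document limit and estimates how many GridFS chunks it would occupy. The function names are my own, and the 255 KB chunk size is assumed to be the default; a different chunk size can be passed in.

```javascript
const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024; // the 16MB per-document limit
const DEFAULT_GRIDFS_CHUNK_BYTES = 255 * 1024;    // assumed default GridFS chunk size

// Rough check: a file larger than the document limit cannot be stored
// as a single document and is a candidate for GridFS.
function needsGridFS(fileSizeBytes) {
  return fileSizeBytes > MAX_BSON_DOCUMENT_BYTES;
}

// Estimates how many chunks GridFS would split a file into.
function gridFSChunkCount(fileSizeBytes, chunkBytes = DEFAULT_GRIDFS_CHUNK_BYTES) {
  return Math.ceil(fileSizeBytes / chunkBytes);
}

console.log(needsGridFS(20 * 1024 * 1024)); // true
console.log(gridFSChunkCount(255 * 1024 + 1)); // 2
```

The check ignores the overhead of field names and other document contents, so it is only a first approximation.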

Conclusion

Graphical User Interfaces (GUI) are great, but sometimes you will find it necessary to use a CLI tool to really understand what is going on with your system. MongoDB provides a great assortment of tools to do just that. I would recommend having a look at some of them to expand your personal tool-kit.