Monetary Data Type Storage in MongoDB

Monetary data is one of the most frequently stored kinds of information in a database. Storing it poses a challenge, though, because we have to decide exactly which value to store. One option is to store the amount strictly as a numeric value. If an item costs $12.99, we could store the value 12.99 and designate the currency as USD.

Troubles with Monetary Values

Storing the amount as a double, however, often leads to rounding and precision issues. Binary floating point cannot represent most decimal fractions exactly, so a value such as 13.4999999999 might end up stored as 13.5000000000. These small errors add up across many calculations and can pose real problems over the long run.
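To make the problem concrete, here is a minimal sketch in plain JavaScript (the language of the mongo shell). It does not touch MongoDB at all; it only shows how ordinary double arithmetic behaves, and the amounts are made up for illustration.

```javascript
// Binary doubles cannot represent most decimal fractions exactly.
const sum = 0.1 + 0.2;
console.log(sum === 0.3); // false
console.log(sum);         // 0.30000000000000004

// The error accumulates over repeated additions.
let total = 0;
for (let i = 0; i < 10; i++) {
  total += 0.1; // ten charges of ten cents each
}
console.log(total === 1); // false
```

Rounding for display hides the problem on screen, but the stored value is still off.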

Another option is to store the value as a string which, while maintaining precision, makes it harder to do calculations on the data. A third common method is to store the value in the smallest unit of the currency. Here in the United States, for example, a dollar is worth one hundred cents. We could therefore store the price of an $11.99 item as the integer 1199 and divide by a scale factor of 100 to get back to 11.99.
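A rough sketch of the cents approach might look like the following. The helper names toCents and formatCents are hypothetical, and the sketch assumes non-negative amounts with at most two decimal places. It parses the amount from a string so that no floating-point conversion is involved.

```javascript
// Parse a decimal string such as "11.99" into integer cents without using floats.
// Assumes a non-negative amount with at most two decimal places.
function toCents(amount) {
  const match = /^(\d+)(?:\.(\d{1,2}))?$/.exec(amount);
  if (!match) {
    throw new Error(`Unsupported amount: ${amount}`);
  }
  const dollars = Number(match[1]);
  const cents = Number((match[2] || "").padEnd(2, "0")); // "9" -> 90, "" -> 0
  return dollars * 100 + cents;
}

// Convert integer cents back into a display string such as "11.99".
function formatCents(cents) {
  const dollars = Math.floor(cents / 100);
  const remainder = String(cents % 100).padStart(2, "0");
  return `${dollars}.${remainder}`;
}

const price = toCents("11.99");      // 1199
console.log(formatCents(price * 3)); // 35.97
```

All of the arithmetic happens on integers, so the total for three items is exact. The cost is that every reader and writer of the data has to know about the scale factor.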

Storing Monetary Values

As mentioned above, we can store data in a variety of ways. In general, however, two basic approaches are taken.

The first approach is to store the value that is displayed to customers ($11.99 in our example) as a string, and to also store an approximate value as a float. Something along the lines of:

{
    product_name: "fidget spinner",
    price: { value: "11.99", float_value: 11.99000000000, currency: "USD" }
}

That certainly works, but it still has the potential for rounding errors, and there are two different values that must be kept in sync on every update. Wouldn’t it be great if there were a way to simply store the exact decimal value in the database? Fortunately, there is.

NumberDecimal in MongoDB

One of the features added in version 3.4 of MongoDB is support for the NumberDecimal data type. This data type holds 128-bit decimal-based floating-point values and is specifically designed for applications that need to store monetary or other high-precision values. NumberDecimal is how the mongo shell represents the BSON Decimal128 type. Since MongoDB stores data in BSON format, it allows us to model monetary data in our database with ease.

Now, for our fidget spinner product, we can model our data using NumberDecimal and take advantage of its features.

{
    product_name: "fidget spinner",
    price: { value: NumberDecimal("11.99"), currency: "USD" }
}

With NumberDecimal we no longer need a scale factor, either. Monetary data is stored on the server in a mathematically useful form, which allows calculations to be made on the server using MongoDB’s aggregation pipeline. Doing calculations on the server means less network traffic, which can ultimately lead to better-performing applications.
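As a sketch of what such a server-side calculation could look like, the pipeline below totals order value per currency. It assumes documents shaped like the fidget spinner example, stored in a hypothetical products collection, with a hypothetical integer quantity field added to each document. The pipeline itself is just a plain JavaScript array, so it can be built and inspected anywhere.

```javascript
// Hypothetical pipeline: total value per currency, computed on the server.
// Assumes each document has price.value (NumberDecimal), price.currency,
// and a hypothetical integer `quantity` field.
const pipeline = [
  {
    $group: {
      _id: "$price.currency",
      total: { $sum: { $multiply: ["$price.value", "$quantity"] } }
    }
  }
];

// In the mongo shell, against a hypothetical `products` collection:
//   db.products.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```

Because price.value is a decimal, the $multiply and $sum results stay decimal as well, so no scale factor or client-side rounding is needed.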

Wrap Up

I would highly recommend using NumberDecimal as your data type when storing monetary data in MongoDB. It is another reason to upgrade to version 3.4. If you haven’t upgraded yet, it might be a good time to do so. Or even better, check out Atlas, MongoDB’s DBaaS offering.

There are several MongoDB specific terms in this post. I created a MongoDB Dictionary skill for the Amazon Echo line of products. Check it out and you can say “Alexa, ask MongoDB what is BSON?” and get a helpful response.


Follow me on Twitter @kenwalger to get the latest updates on my postings.


MongoDB Storage Engine Journaling

I came across a question the other day about journaling in MongoDB: specifically, how it is handled in the different supported storage engines and whether it is necessary to use. There was an interesting discussion on this topic, so I thought I would put together some thoughts and explanations.

To start with, some questions arise. What exactly is a MongoDB journal? Why is journaling important? For the sake of this post, everything here relates to 64-bit builds of mongod and is based on version 3.4 of the database.

What is Journaling?

Much like one uses a journal to record thoughts and daily events, MongoDB uses a journal to ensure data integrity. This is accomplished through writing data first to the journal files and then to the core data files. In the event of an untimely server shutdown, the data can be restored to a consistent state.

This is accomplished through MongoDB’s write operation durability guarantee. If your mongod process stops in an unexpected manner, data from the journal will be used to re-apply the write operations when it is restarted. When journaling is enabled, MongoDB creates a subdirectory named journal for the journal data. It resides under the dbPath directory and contains the write-ahead logs.

Since each different storage engine in MongoDB implements crash resiliency and data persistence slightly differently, let’s see how journaling is utilized.

Storage Engine Implementations

There are three different storage engines that are predominantly used with MongoDB: MMAPv1, WiredTiger, and In-Memory. They each have their own strengths and weaknesses. Those differences are beyond the scope of this post, but I would like to look at how journaling is implemented in each.

MMAPv1

Starting in version 3.2 of MongoDB, MMAPv1 is no longer the default storage engine. However, it is still in use and in certain circumstances is a better option. Therefore, it is still good to understand how journaling works with this storage engine in its default configuration.

In a nutshell, when a write command is issued, the operation is applied to an internal private view, then written to the journal. Once the data has been updated in the journal the changes are applied to an internal shared view and then written to disk.

In MMAPv1, the journal is updated every 100 milliseconds in batches called group commits. Data is written to the data files, though, only every 60 seconds, in the process of flushing the shared view to disk. Depending on the quantity and availability of system memory, the flushing of data may occur more often.

Where then does the importance of the journal come in? Well, in the case of an unexpected shutdown of the mongod process, the journal can be used to restore the data. Without journaling on a standalone server, a more lengthy and involved repair process is required.

On systems using a properly configured replica set, a server without a journal can often be recovered more simply by resyncing from another member than by running the repair process. It is still not as clean as with journaling enabled, however.

WiredTiger

The WiredTiger storage engine takes a different approach to write operation durability. WiredTiger uses checkpoints in conjunction with a journal. These checkpoints allow data to be recovered from the last checkpoint.

When a write operation is called, a snapshot is taken of the data. When data is written to disk (every 60 seconds by default), the data is written across all data files and becomes durable. This becomes a new checkpoint and can be used as a recovery point.

This allows WiredTiger to be recovered from the last checkpoint without a journal. Pretty slick. However, if an unexpected shutdown occurs between checkpoints and journaling is disabled, any data written since the last checkpoint will be lost. The journal in WiredTiger, therefore, utilizes a write-ahead log similar to MMAPv1 between checkpoints for data durability.

So journaling and replica sets are still important pieces of a server environment when using WiredTiger. Journaling is just implemented in a slightly different way than in MMAPv1.
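The interplay between checkpoints and the journal can be sketched with a toy model. To be clear, this is not how WiredTiger is actually implemented. The ToyStore class and the simulate function are hypothetical names for a deliberately simplified in-memory illustration of the idea: writes are logged first, checkpoints make a durable snapshot, and recovery replays the log on top of the last checkpoint.

```javascript
// Toy model only: illustrates checkpoint + journal recovery.
// This is NOT WiredTiger's actual implementation.
class ToyStore {
  constructor(journalEnabled) {
    this.journalEnabled = journalEnabled;
    this.memory = {};     // in-memory state, lost on a crash
    this.checkpoint = {}; // last durable snapshot on "disk"
    this.journal = [];    // write-ahead log on "disk"
  }

  write(key, value) {
    if (this.journalEnabled) {
      this.journal.push({ key, value }); // log the write first
    }
    this.memory[key] = value;            // then apply it
  }

  takeCheckpoint() {
    this.checkpoint = { ...this.memory };
    this.journal = []; // entries covered by the checkpoint are no longer needed
  }

  crashAndRecover() {
    this.memory = { ...this.checkpoint };        // start from the last checkpoint
    for (const { key, value } of this.journal) { // replay journaled writes
      this.memory[key] = value;
    }
  }
}

function simulate(journalEnabled) {
  const store = new ToyStore(journalEnabled);
  store.write("a", 1);
  store.takeCheckpoint();
  store.write("b", 2); // written after the checkpoint
  store.crashAndRecover();
  return store.memory;
}

console.log(simulate(true));  // { a: 1, b: 2 }
console.log(simulate(false)); // { a: 1 }
```

With the journal enabled, the write made after the checkpoint survives the crash. Without it, the store comes back consistent but only as of the last checkpoint.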

In-Memory

For those who are running an Enterprise version of MongoDB, there is a storage engine that stores data in memory. Because the data is stored in memory, it is non-persistent. The concept of a journal does not apply in this situation.

Study Question

I have seen questions similar to “Why is the journal unnecessary for WiredTiger” listed in various study guides. As we have learned, it is indeed not required for data consistency, at least not in the same fashion as it is for MMAPv1. That being said, I might argue that “unnecessary” is a bit of a misleading word. WiredTiger’s data consistency model is just different from that of MMAPv1. Journaling may not be “necessary” perhaps, but I wouldn’t run a system without it.

Wrap Up

All of these details of journaling can be a lot to think about and potentially manage. This is one of the great advantages of MongoDB Atlas: these internal matters are handled for you. If you are running and/or managing a MongoDB server, it is a best practice to leave journaling on for data integrity. Further, it is recommended to run your system as a replica set at a minimum, as data recovery is often simplified even more.

There are several MongoDB specific terms in this post. I created a MongoDB Dictionary skill for the Amazon Echo line of products. Check it out and you can say “Alexa, ask MongoDB for the definition of a journal?” and get a helpful response.


