{"id":554,"date":"2017-08-24T08:31:13","date_gmt":"2017-08-24T15:31:13","guid":{"rendered":"https:\/\/www.kenwalger.com\/blog\/?p=554"},"modified":"2017-08-24T17:33:29","modified_gmt":"2017-08-25T00:33:29","slug":"data-durability-mongodb","status":"publish","type":"post","link":"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/data-durability-mongodb\/","title":{"rendered":"Data Durability in MongoDB"},"content":{"rendered":"<p>When designing a database we want to make sure the data that we want to be stored actually <strong>gets<\/strong> stored. Data durability is a key factor in applications. \u00a0On local servers and test environments, this typically isn&#8217;t a huge issue. We can pretty easily tell when and if our environment crashes. What happens though as our system grows? What happens when we move to a distributed environment with many different pieces to our application puzzle?<\/p>\n<p>In addition to many of the <a href=\"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/mongodb-performance-issues\/\">performance<\/a> considerations that <a href=\"https:\/\/www.mongodb.com\">MongoDB<\/a> has improved upon in recent versions, data durability is another improvement. It is also a topic for which previous versions of the product took some heat. Let&#8217;s take a look at some ways in which we can design our applications around the idea of data durability.<\/p>\n<h3>Data Durability<\/h3>\n<p>There are two scenarios we need to consider when dealing with data durability: reads and writes. Let&#8217;s take a look at each of them. In doing so we&#8217;ll see some ways to ensure our system is doing what we intend it to be doing.<\/p>\n<h4>Writes<\/h4>\n<p>Write operations in MongoDB follow a pretty clear path, at least in theory. 
From an application, they get sent to the <a href=\"https:\/\/docs.mongodb.com\/manual\/reference\/glossary\/#term-primary\">primary<\/a> server of a <a href=\"https:\/\/docs.mongodb.com\/manual\/reference\/glossary\/#term-replica-set\">replica set<\/a>. The data goes into an in-memory store and the <a href=\"https:\/\/docs.mongodb.com\/manual\/reference\/glossary\/#term-oplog\">oplog<\/a>. At this point, the server, by default, sends back an &#8220;okey dokey&#8221; to the application.<\/p>\n<p>Notice that I haven&#8217;t mentioned anything about writing data to disk yet. It hasn&#8217;t happened yet. The primary then writes the data to the <a href=\"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/mongodb-storage-engine-journaling\/\">journal<\/a> file and then to disk. The secondaries are writing data to the disk during this process as well, at some point after the data has been written to the journal.<\/p>\n<p>This can be all well and good in many situations, as we are talking about small time frames between the application getting an &#8220;okay&#8221; and the data being persisted to disk. But there is still some latency there. Should something go wrong with the distributed system during that time, extra steps have to be taken to reconcile the data the application thinks is there (it did get a confirmation, after all) with what actually took place with the disk writes.<\/p>\n<p>I&#8217;m not going to go into the background of what\u00a0<em>actually<\/em> goes on behind the scenes during an unexpected shutdown or failure. It is a bit beyond the scope of this particular post. I will show how to instruct MongoDB to wait to send our &#8220;okey dokey&#8221; signal to the application until the data is indeed on disk.<\/p>\n<h6>Write Acknowledgment<\/h6>\n<p>MongoDB has provided the functionality to set a level of acknowledgment for write operations with the <a href=\"https:\/\/docs.mongodb.com\/manual\/reference\/write-concern\/\">write concern<\/a> options. 
There are a few different options available for us here.<\/p>\n<ul>\n<li>We can request an acknowledgment that data has been written to a specific number of servers in a replica set with the <code>w<\/code> option.<\/li>\n<li>The\u00a0<code>j<\/code> option requests acknowledgment of the data being written to the journal. This is a boolean value.<\/li>\n<li>There is also a <code>wtimeout<\/code> option\u00a0which, as the name might lead you to deduce, sets a timeout, in milliseconds, for the acknowledgment to occur.<\/li>\n<\/ul>\n<p>With the <code>w<\/code> option, we can choose to tell MongoDB a specific number of servers that must confirm the write operation. Or, there is a handy &#8220;majority&#8221; option that allows for the write acknowledgment to occur when a majority of the data-bearing members of the replica set have performed the write.<\/p>\n<p>If, for example, we want to insert a document in the <a href=\"https:\/\/docs.mongodb.com\/getting-started\/shell\/client\/\">mongo shell<\/a> and wait for a response from two members of our replica set with a two-and-a-half-second timeout period, we could do the following:<\/p>\n<pre><span class=\"nx\">db<\/span><span class=\"p\">.<\/span><span class=\"nx\">blogs<\/span><span class=\"p\">.<\/span><span class=\"nx\">insert<\/span><span class=\"p\">(<\/span>\n   <span class=\"p\">{<\/span> <span class=\"nx\">title<\/span><span class=\"o\">:<\/span> <span class=\"s2\">\"Data Durability in MongoDB\"<\/span><span class=\"p\">,<\/span> <span class=\"nx\">length<\/span><span class=\"o\">:<\/span> <span class=\"mi\">1099<\/span><span class=\"p\">,<\/span> <span class=\"nx\">topic<\/span><span class=\"o\">:<\/span> <span class=\"s2\">\"MongoDB\"<\/span> <span class=\"p\">},<\/span>\n   <span class=\"p\">{<\/span> <span class=\"nx\">writeConcern<\/span><span class=\"o\">:<\/span> <span class=\"p\">{<\/span> <span class=\"nx\">w<\/span><span class=\"o\">:<\/span> <span class=\"mi\">2<\/span><span class=\"p\">,<\/span> 
<span class=\"nx\">wtimeout<\/span><span class=\"o\">:<\/span> <span class=\"mi\">2500<\/span> <span class=\"p\">}<\/span> <span class=\"p\">}<\/span>\n<span class=\"p\">)<\/span><\/pre>\n<h6>Uses<\/h6>\n<p>So why not just always set a strict write concern? The main reason is latency. The more servers that must respond with an &#8220;okay&#8221;, the longer it will take for the application to get that response. In a distributed environment, the physical servers may be located all over a given country, or around the world. It is a trade-off between responsiveness for the application and data durability.<\/p>\n<p>A good compromise between application performance and data durability is to set <code>w: 2<\/code> for your write concern. For writes that\u00a0<strong>must<\/strong> be acknowledged, however, choose <code>w: \"majority\"<\/code>.<\/p>\n<h4>Reads<\/h4>\n<p>What about reading our data and making sure that our application has the most recent data? How can we prevent dirty reads, that is, reads that occur during the time frame between the in-memory storage of the data and the actual writing of the data to disk, and that might be affected by a <a href=\"https:\/\/docs.mongodb.com\/manual\/core\/replica-set-rollbacks\/\">rollback<\/a> if a failure occurs?<\/p>\n<p>Similar to write concern, MongoDB offers, as of version 3.2, a\u00a0<em>read concern<\/em>. Based on our knowledge of write concern, we can extrapolate that <a href=\"https:\/\/docs.mongodb.com\/manual\/reference\/read-concern\/index.html\">read concern<\/a> allows us to specify\u00a0which data to return from a replica set. 
There are three options we can choose when selecting a read concern level.<\/p>\n<ul>\n<li><code>local<\/code> &#8211; this default setting returns the instance&#8217;s most recent data, with no guarantee that the data has been written to a majority of the replica set members, so it may still be rolled back.<\/li>\n<li><code>majority<\/code> &#8211; returns data that has been written to a majority of the data-bearing members of the replica set.<\/li>\n<li><code>linearizable<\/code> &#8211; returns data that reflects all successful writes issued with a write concern of\u00a0&#8220;majority&#8221; <em>and<\/em>\u00a0acknowledged prior to the start of the read operation. Linearizable was introduced in <a href=\"https:\/\/www.kenwalger.com\/blog\/nosql\/new-version-new-features\/\">version 3.4<\/a> and is another great feature of that release.<\/li>\n<\/ul>\n<p>Dirty reads may seem like a huge concern. In practice, however, we want to design our application to properly handle the write operations so that we can negate these concerns. There are times, though, such as reading passwords, when making sure we are reading the most recent and durable data is critical.<\/p>\n<h3>Wrap Up<\/h3>\n<p>MongoDB continues to listen to the community and address the concerns (no pun intended) of its users. The data durability issues of old shouldn&#8217;t be a reason to not give MongoDB a try.<\/p>\n<p>There is also a great talk from <a href=\"https:\/\/explore.mongodb.com\/?page=0&amp;prevItm=349629841&amp;prevCol=648867&amp;ts=71115539\">MongoDB World 2017<\/a> by <a href=\"https:\/\/www.linkedin.com\/in\/adkomyagin\">Alex Komyagin<\/a> on\u00a0<em><a href=\"https:\/\/explore.mongodb.com\/vidyard-all-players\/mongodb-world-presentations-regency-c-alex-komyagin-6-21-2017\">ReadConcern and WriteConcern<\/a><\/em>. I would recommend having a look at that talk for additional information and use cases.<\/p>\n<hr \/>\n<p>There are several MongoDB-specific terms in this post. 
I created a <a href=\"https:\/\/www.echoskillstore.com\/MongoDB-Dictionary\/45103\">MongoDB Dictionary<\/a> skill for the <a href=\"https:\/\/www.amazon.com\/gp\/product\/B01DFKC2SO\/ref=as_li_tl?ie=UTF8&amp;camp=1789&amp;creative=9325&amp;creativeASIN=B01DFKC2SO&amp;linkCode=as2&amp;tag=kenwalgersite-20&amp;linkId=f9e513223de2525a72b95cf9561db55b\" rel=\"noopener noreferrer\">Amazon Echo<\/a>\u00a0line of products. Check it out and you can say &#8220;Alexa, ask MongoDB what is a document?&#8221; and get a helpful response.<\/p>\n<hr \/>\n<p><em>Follow me on Twitter <a href=\"https:\/\/www.twitter.com\/kenwalger\">@kenwalger<\/a> to get the latest updates on my postings.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>When designing a database we want to make sure the data that we want to be stored actually gets stored. Data durability is a key factor in applications. \u00a0On local servers and test environments, this typically isn&#8217;t a huge issue. 
We can pretty easily tell when and if our environment crashes. What happens though as &hellip; <a href=\"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/data-durability-mongodb\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Data Durability in MongoDB&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":563,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"pmpro_default_level":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[4],"tags":[825,827,826],"yst_prominent_words":[824,123,834,99,830,792,833,793,831,829,87,822,832,814,805,821,803,828,801,818],"class_list":["post-554","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mongodb","tag-data-durability","tag-read-concern","tag-write-concern","pmpro-has-access"],"jetpack_featured_media_url":"https:\/\/www.kenwalger.com\/blog\/wp-content\/uploads\/2017\/08\/Data-Durability-e1503620587637.png","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8lx70-8W","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/554","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/comments
?post=554"}],"version-history":[{"count":7,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/554\/revisions"}],"predecessor-version":[{"id":562,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/554\/revisions\/562"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/media\/563"}],"wp:attachment":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/media?parent=554"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/categories?post=554"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/tags?post=554"},{"taxonomy":"yst_prominent_words","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/yst_prominent_words?post=554"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}