{"id":899,"date":"2018-11-20T08:27:15","date_gmt":"2018-11-20T16:27:15","guid":{"rendered":"https:\/\/www.kenwalger.com\/blog\/?p=899"},"modified":"2018-11-17T13:17:56","modified_gmt":"2018-11-17T21:17:56","slug":"reducing-need-etl-mongodb-charts","status":"publish","type":"post","link":"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/reducing-need-etl-mongodb-charts\/","title":{"rendered":"Reducing the Need for ETL with MongoDB Charts"},"content":{"rendered":"<p>Databases as we know them have been around for over 40 years. When they first came about, businesses would often keep data in separate systems and separate formats. There were a variety of reasons for these decisions. One side effect of these separate data stores is the need to combine them before any data analysis can be performed. This led to the long-standing practice of ETL, or Extract, Transform, Load.<\/p>\n<p>ETL is a process to <i>extract<\/i> data from a starting data source, <i>transform<\/i> the data in some fashion, then <i>load<\/i> it into another data store. It sounds simple enough, but in fact there is a lot of work going on under the covers, with many steps and decisions to navigate. These additional steps reduce the speed at which we can get meaningful insights from our data. 
Further, they rely on many assumptions about transforming data into what is assumed to be the correct format for later consumption &#8211; without knowing very much about the business questions to be asked of this data down the road.<\/p>\n<h4>From Data Warehouses to the Cloud<\/h4>\n<p>Traditionally, enterprise applications have relied on performing ETL operations to move data into an enterprise data warehouse (EDW).<\/p>\n<figure><figcaption><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" data-attachment-id=\"900\" data-permalink=\"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/reducing-need-etl-mongodb-charts\/attachment\/etl_visual\/\" data-orig-file=\"https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2018\/11\/ETL_Visual.png?fit=5037%2C3880&amp;ssl=1\" data-orig-size=\"5037,3880\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"ETL_Visual\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2018\/11\/ETL_Visual.png?fit=300%2C231&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2018\/11\/ETL_Visual.png?fit=840%2C647&amp;ssl=1\" class=\"aligncenter size-medium wp-image-900\" src=\"https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2018\/11\/ETL_Visual.png?resize=300%2C231&#038;ssl=1\" alt=\"Typical ETL Architecture\" width=\"300\" height=\"231\" srcset=\"https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2018\/11\/ETL_Visual.png?resize=300%2C231&amp;ssl=1 300w, 
https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2018\/11\/ETL_Visual.png?resize=768%2C592&amp;ssl=1 768w, https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2018\/11\/ETL_Visual.png?resize=1024%2C789&amp;ssl=1 1024w, https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2018\/11\/ETL_Visual.png?resize=1200%2C924&amp;ssl=1 1200w, https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2018\/11\/ETL_Visual.png?w=1680&amp;ssl=1 1680w, https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2018\/11\/ETL_Visual.png?w=2520&amp;ssl=1 2520w\" sizes=\"auto, (max-width: 300px) 85vw, 300px\" \/><\/figcaption><\/figure>\n<p>Creating a successful data warehouse can be a long, complicated, and expensive process. One of the technologies created to help with the process is <a href=\"https:\/\/hadoop.apache.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Apache Hadoop<\/a>. Hadoop allows for the processing of massive amounts of data on commodity hardware with open source technologies. However, instead of simplifying matters, the ETL and data warehousing landscape has only become more complex and cumbersome, and the proliferation of tools, combined with maturity and adoption issues, has only increased the cost. Further, according to <a href=\"https:\/\/twitter.com\/nheudecker\/status\/928720268662530048\" target=\"_blank\" rel=\"noopener noreferrer\">Gartner analyst Nick Heudecker<\/a>, 85% of big data projects fail, mostly due to the complexity of the process itself.<\/p>\n<p>With the transition to the cloud many organizations are undertaking, ETL becomes even more complicated from a meaningful and timely data analytics standpoint. Moving data from one source to another takes time. Now there are hidden data transfer and compute costs, as well as latencies, to navigate. 
While some meaningful analytics can be performed on stale data, most modern analytics need to be as close to real-time as possible.<\/p>\n<h4>Issues with ETL<\/h4>\n<p>A few of the problems that we are faced with when setting up ETL processes are:<\/p>\n<ol type=\"1\">\n<li><b>Latency &amp; Downtime<\/b> &#8211; There is an inherent cost of moving data from point A to point B. Forty years ago, when ETL started, we were working with megabytes of data and did not need \u201cinstant\u201d access. Today we\u2019re dealing with terabytes or petabytes of data and need real-time insight from that data. Moving data across the network isn\u2019t free. On a 100BASE-T network, transferring one gigabyte of data takes roughly 100 seconds. At that rate, a terabyte takes roughly 100,000 seconds, or more than 27 hours. All of this assumes a dedicated network that isn\u2019t used by other applications. As ETL demands grow, data could easily be stale by many hours. We used to be able to schedule these transfers during \u201cdowntime\u201d at midnight. However, in today\u2019s global world, users are always online somewhere, demanding instant access and insight. Downtime is simply no longer acceptable, and latency has become the new downtime. Should suppliers on one side of the world suffer from poor performance just so executives on the other side of the world have up-to-date dashboards in the morning?<\/li>\n<li><b>Storage is cheap, labor is expensive<\/b> &#8211; Data warehouses started at a point in time in which storage was expensive. In 1981, one gigabyte of data storage cost about $290,000. Today that cost is under $0.10. It was, therefore, important to transform and compress as much data as possible before storing it to save costs. As storage costs have decreased, labor costs have gone in the opposite direction. Having a good database administrator to design, manage, and maintain your data warehouse and ETL path is expensive. 
Storing raw data is frequently seen as a more economically viable choice.<\/li>\n<li><b>ETL is hard<\/b> &#8211; ETL takes planning. Lots of it. And not just for your current load of data, but for what might happen to the load down the road. Additionally, ETL scripts can get long and complex. Bringing in data from a variety of sources, looping over them, and adding logging, error handling, and configuration are just the start. Determining how the data needs to be transformed can be complex and fragile. What happens if data stored today as a string gets changed down the road? The process breaks and adjustments need to be made. Do you ever wonder why the first answer out of a DBA\u2019s mouth is an emphatic \u201cNo!\u201d when asked if something can be changed? One \u201csimple\u201d change can mean changing dozens or hundreds of lines of code. For these reasons and more, ETL requires planning for current and future data needs, loads, and shape.<\/li>\n<li><b>Are developers the right people to build the ETL pipeline?<\/b> &#8211; Developers are great at many things; however, knowing about data storage and ETL pipelines often isn\u2019t one of them. ETL design and implementation are typically best done by data engineers. While a developer may be able to get data through an ETL pipeline and into a data warehouse, it often isn\u2019t done in the most efficient manner. Specialized data engineers should be responsible for these tasks. If you don\u2019t have them on your team, this is another cost of ETL.<\/li>\n<li><b>Maintenance headaches<\/b> &#8211; As the size and complexity of data, applications, and analytics requirements grow, so does ETL maintenance. Keeping up with changes in data velocity, formats, connections, and features takes time. 
Many of these challenges may not be considered at the start of a project, but they lead to long-term maintenance needs.<\/li>\n<\/ol>\n<h4>Use MongoDB Charts to Avoid the Headache of ETL<\/h4>\n<p>Companies today still have data in a variety of systems. In certain instances, ETL is the only way to visualize and analyze your data. Or, perhaps, you\u2019ve explored ETL but haven\u2019t taken the steps needed to get your data ready for analysis because it\u2019s overwhelming.<\/p>\n<p>If you use MongoDB as your database, the need for ETL procedures has been dramatically reduced with the introduction of <a href=\"https:\/\/www.mongodb.com\/products\/charts\" target=\"_blank\" rel=\"noopener noreferrer\">MongoDB Charts<\/a>, now in beta. MongoDB Charts natively understands the MongoDB Document Model, allowing for the rapid creation of visualizations over your data.<\/p>\n<p>With MongoDB Charts you can connect to your MongoDB server, assign user authorization policies to your reports, and easily generate visualization dashboards. With over a dozen different chart variations to choose from, stunning visualizations are just a few clicks away.<\/p>\n<p>MongoDB Charts allows for data to be visualized without performing ETL operations, saving valuable time and resources. You don\u2019t need to write any code or rely on third-party tools. Further, you still get to leverage the richness of the Document Model.<\/p>\n<h4>Conclusion<\/h4>\n<p>For those situations in which you want to quickly access your MongoDB data, <a href=\"https:\/\/www.kenwalger.com\/blog\/nosql\/visualizing-data-mongodb-charts\/\">MongoDB Charts<\/a> is a terrific option. If you\u2019re in a situation that requires multiple data sources to be analyzed, we offer the MongoDB <a href=\"https:\/\/www.mongodb.com\/download-center\/bi-connector\" target=\"_blank\" rel=\"noopener noreferrer\">Connector for Business Intelligence<\/a>. 
If you are doing advanced analytics with <a href=\"https:\/\/spark.apache.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Apache Spark<\/a>, we have an option for that as well with the MongoDB <a href=\"https:\/\/www.mongodb.com\/products\/spark-connector\" target=\"_blank\" rel=\"noopener noreferrer\">Connector for Apache Spark<\/a>.<\/p>\n<p>For many roles in an organization, MongoDB Charts is a great tool for analyzing your data. There\u2019s no need to go through the pain of the ETL process. It is the fastest way to build visualizations over your MongoDB data, wherever it\u2019s stored: on-premises or in the cloud hosted by <a href=\"https:\/\/www.mongodb.com\/cloud\/atlas\" target=\"_blank\" rel=\"noopener noreferrer\">MongoDB Atlas<\/a>. Give it a try today!<\/p>\n<hr \/>\n<p><em>This post was originally published on the <a href=\"https:\/\/www.mongodb.com\/blog\/post\/reducing-the-need-for-etl-with-mongodb-charts\">MongoDB Blog<\/a>.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Databases as we know them have been around for over 40 years. When they first came about, businesses would often keep data in separate systems and separate formats. There were a variety of reasons for these decisions. One of the side effects of these separate data stores is the need to combine together to be &hellip; <a href=\"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/reducing-need-etl-mongodb-charts\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Reducing the Need for ETL with MongoDB Charts&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"pmpro_default_level":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[17,4],"tags":[1418,1400],"yst_prominent_words":[1416,1392,1415,1413,1406,1410,286,1404,1412,1414,1409,1419,87,1369,331,489,1408,1411,1405,1407],"class_list":["post-899","post","type-post","status-publish","format-standard","hentry","category-data-visualization","category-mongodb","tag-etl","tag-mongodb-charts","pmpro-has-access"],"jetpack_featured_media_url":"","jetpack
_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8lx70-ev","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/899","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/comments?post=899"}],"version-history":[{"count":3,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/899\/revisions"}],"predecessor-version":[{"id":904,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/899\/revisions\/904"}],"wp:attachment":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/media?parent=899"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/categories?post=899"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/tags?post=899"},{"taxonomy":"yst_prominent_words","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/yst_prominent_words?post=899"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}