{"id":652,"date":"2017-10-24T19:20:05","date_gmt":"2017-10-25T02:20:05","guid":{"rendered":"https:\/\/www.kenwalger.com\/blog\/?p=652"},"modified":"2018-01-11T07:59:07","modified_gmt":"2018-01-11T15:59:07","slug":"using-r-mongodb","status":"publish","type":"post","link":"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/using-r-mongodb\/","title":{"rendered":"Using R with MongoDB"},"content":{"rendered":"<h4><span style=\"color: #ff0000;\"><strong>NOTE: MongoDB 3.6 has new R language support. See my <a href=\"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/new-r-driver-option-mongodb-3-6\/\">other blog post<\/a> for the latest information.<\/strong><\/span><\/h4>\n<p>The R <a href=\"https:\/\/www.r-project.org\/\">programming language<\/a> is a powerful language used for statistical computing. In statistical computing, the data being explored frequently comes from a database. R excels at working with data in tables and matrices and at joining columns and rows together. This seems like a great fit for SQL databases, but what about a NoSQL database like <a href=\"http:\/\/www.mongodb.com\">MongoDB<\/a>? Can R analyze a MongoDB <a href=\"https:\/\/www.kenwalger.com\/blog\/nosql\/document-model\/\">document<\/a> as easily as a SQL table?<\/p>\n<p>Well, this post would be pretty short if the answer was &#8220;No&#8221;, right? So let&#8217;s take a look at how to pull data into R from a MongoDB <a href=\"https:\/\/docs.mongodb.com\/master\/core\/databases-and-collections\/\">collection<\/a>. Then we&#8217;ll take a brief look at examining our data.<\/p>\n<h3>Setting Up<\/h3>\n<p>While there are <a href=\"https:\/\/plugins.jetbrains.com\/plugin\/6632-r-language-support\">plugins<\/a> available for a variety of IDEs, such as those by <a href=\"https:\/\/www.jetbrains.com\/\">JetBrains<\/a>, it is pretty common to use <a href=\"https:\/\/www.rstudio.com\/\">RStudio<\/a> when working with R. 
Somewhere along the line, I picked up a &#8220;scores&#8221; database in MongoDB that we&#8217;ll use as our sample data. I&#8217;ve posted it <a href=\"http:\/\/kenwalger.com\/public_html\/assets\/sample-data.zip\">here for download<\/a>.<\/p>\n<p>We can easily import the data into our MongoDB database using <a href=\"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/importing-data-mongoimport\/\">mongoimport<\/a>. In my case, I put it into a database called <code>kenblog<\/code> and a collection called <code>scores<\/code>. Pretty creative, eh? Here&#8217;s what a sample document in the collection looks like:<\/p>\n<pre>{\n   \"_id\" : ObjectId(\"5627207b33ff2cf40effc25e\"),\n   \"student\" : 2,\n   \"type\" : \"quiz\",\n   \"score\" : 74\n}\n<\/pre>\n<p>There are 1,787 records in our collection with the type of assignment being either <code>quiz<\/code>, <code>essay<\/code>, or <code>exam<\/code>.\u00a0Let&#8217;s see how we can access our data with R.<\/p>\n<p>First, we need to get and load our package for interfacing MongoDB with R. For this example I&#8217;ll be using <a href=\"https:\/\/cran.r-project.org\/web\/packages\/RMongo\/index.html\">RMongo<\/a>, but there is another package available, <a href=\"https:\/\/www.r-project.org\/nosvn\/pandoc\/rmongodb.html\">rmongodb<\/a>. Sadly it doesn&#8217;t look like there has been much in the way of current activity with either package&#8217;s GitHub repositories. Aside from that we can still connect and do some queries.<\/p>\n<h3>Connecting R to MongoDB<\/h3>\n<p>We need to bring in our package and establish our connection:<\/p>\n<pre>require(RMongo)\n\nmongo &lt;- mongoDbConnect('kenblog', 'localhost', 27017)\n<\/pre>\n<p>In the <code>mongoDbConnect<\/code> method, we have options for the name of the database, server name, and port number to which we want to connect.<\/p>\n<p>Next, we will want to send a query. 
For this example, let&#8217;s get only the <code>exam<\/code> data from our <code>scores<\/code> collection. We can use the <code>dbGetQuery<\/code> method for this, which takes a connection object, the collection name, and the query.<\/p>\n<pre>examQuery &lt;- dbGetQuery(mongo, 'scores', \"{'type': 'exam'}\")\n<\/pre>\n<p>This loads in all of the records from our <code>scores<\/code> collection of type <code>exam<\/code>. Let&#8217;s select just the <code>score<\/code> column of our exam results, which gives us a one-column data frame.<\/p>\n<pre>exam_scores &lt;- examQuery[c('score')]\n<\/pre>\n<p>Nice! Now we can utilize some of the power of R to do some data analysis. Let&#8217;s get a simple summary of our data with <code>summary(exam_scores)<\/code>:<\/p>\n<pre>     score       \n Min.   : 60.00  \n 1st Qu.: 72.00  \n Median : 79.00  \n Mean   : 79.45  \n 3rd Qu.: 86.00  \n Max.   :100.00 \n<\/pre>\n<p>Neat. I realize that this particular example could be computed using MongoDB&#8217;s powerful <a href=\"https:\/\/docs.mongodb.com\/manual\/aggregation\/\">aggregation framework<\/a>. However, there are times when using outside resources and languages, like R, for processing is called for.<\/p>\n<h3>Wrap Up<\/h3>\n<p>Connecting to MongoDB from R is pretty straightforward using the RMongo package. However, many of the new features that MongoDB has implemented in the last few years have not been included in the community R drivers. Further, as of this post, there isn&#8217;t an &#8220;official&#8221; R driver supported by MongoDB.<\/p>\n<p>R is a great statistical language and can definitely be used to query and analyze MongoDB collections. If you are using R in your work today, MongoDB is a definite option for storing your data to be analyzed.<\/p>\n<hr \/>\n<p>Follow me on Twitter <a href=\"https:\/\/www.twitter.com\/kenwalger\">@kenwalger<\/a> to get the latest updates on my postings.<\/p>\n<p>There are a few MongoDB-specific terms in this post. 
I created a <a href=\"https:\/\/www.echoskillstore.com\/MongoDB-Dictionary\/45103\">MongoDB Dictionary<\/a> skill for the <a href=\"https:\/\/www.amazon.com\/gp\/product\/B01DFKC2SO\/ref=as_li_tl?ie=UTF8&amp;camp=1789&amp;creative=9325&amp;creativeASIN=B01DFKC2SO&amp;linkCode=as2&amp;tag=kenwalgersite-20&amp;linkId=f9e513223de2525a72b95cf9561db55b\" rel=\"noopener noreferrer\">Amazon Echo<\/a>\u00a0line of products. Check it out and you can say &#8220;Alexa, ask MongoDB for the definition of a\u00a0document?&#8221; and get a helpful response.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>NOTE: MongoDB 3.6 has a new R Language support. 
See my other blog post for the latest information. The R programming language is a powerful language used for statistical computing. When working with statistical computing it is frequently the case that the data being explored will come from a database. Some of the powers that &hellip; <a href=\"https:\/\/www.kenwalger.com\/blog\/nosql\/mongodb\/using-r-mongodb\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Using R with MongoDB&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":654,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"pmpro_default_level":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[4],"tags":[1052],"yst_prominent_words":[1207,117,99,104,298,360,345,978,756,87,768,1051,722,799,115,963,1047,1043,1044,570],"class_list":["post-652","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-mongodb","tag-r","pmpro-has-access"],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/www.kenwalger.com\/blog\/wp-content\/uploads\/2017\/10\/feature.png?fit=125%2C125&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8lx70-aw","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/652","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/b
log\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/comments?post=652"}],"version-history":[{"count":7,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/652\/revisions"}],"predecessor-version":[{"id":772,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/652\/revisions\/772"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/media\/654"}],"wp:attachment":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/media?parent=652"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/categories?post=652"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/tags?post=652"},{"taxonomy":"yst_prominent_words","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/yst_prominent_words?post=652"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}