{"id":1347,"date":"2026-04-07T09:19:00","date_gmt":"2026-04-07T16:19:00","guid":{"rendered":"https:\/\/www.kenwalger.com\/blog\/?p=1347"},"modified":"2026-03-17T10:21:23","modified_gmt":"2026-03-17T17:21:23","slug":"searching-physical-objects-data-indexing","status":"publish","type":"post","link":"https:\/\/www.kenwalger.com\/blog\/data-engineering\/searching-physical-objects-data-indexing\/","title":{"rendered":"The Backyard Quarry, Part 4: Searching a Pile of Rocks"},"content":{"rendered":"<p>By this point, the Backyard Quarry has a schema, a capture process, and a growing collection of records.<\/p>\n<p>Each rock has:<\/p>\n<ul>\n<li>metadata<\/li>\n<li>images<\/li>\n<li>possibly a 3D model<\/li>\n<\/ul>\n<p>In theory, everything is organized.<\/p>\n<p>In practice, it quickly becomes difficult to find anything.<\/p>\n<h2>The First Search Problem<\/h2>\n<p>With a handful of rocks, you can rely on memory.<\/p>\n<p>You remember roughly where things are.<\/p>\n<p>You recognize shapes and colors.<\/p>\n<p>But as the dataset grows, that breaks down.<\/p>\n<p>You start asking questions like:<\/p>\n<ul>\n<li>Which rocks are under 5 pounds?<\/li>\n<li>Which ones are suitable for landscaping?<\/li>\n<li>Where did that smooth gray stone go?<\/li>\n<\/ul>\n<p>At that point, you\u2019re no longer dealing with a pile.<\/p>\n<p>You\u2019re dealing with a dataset.<\/p>\n<p>And datasets need to be searchable.<\/p>\n<h2>Filtering by Metadata<\/h2>\n<p>The most straightforward approach is to use structured queries.<\/p>\n<p>If we have metadata like weight, color, and classification, we can filter directly.<\/p>\n<p>Conceptually:<\/p>\n<pre><code class=\"language-sql\">SELECT *\nFROM rocks\nWHERE weight_lb &lt; 5\nAND color = 'gray'\nAND rock_class &lt;= 'Class 2'\n<\/code><\/pre>\n<p>This works well for clearly defined attributes.<\/p>\n<p>It\u2019s predictable.<\/p>\n<p>It\u2019s efficient.<\/p>\n<p>And it\u2019s the foundation of most data systems.<\/p>\n<h2>The Role of 
Classification<\/h2>\n<p>This is where the Quarry Taxonomy starts to pay off.<\/p>\n<p>Instead of requiring precise measurements, we can use categories:<\/p>\n<ul>\n<li>Pebble Class<\/li>\n<li>Hand Sample<\/li>\n<li>Landscaping Rock<\/li>\n<li>Wheelbarrow Class<\/li>\n<li>Engine Block Class<\/li>\n<\/ul>\n<p>This allows for simpler queries:<\/p>\n<ul>\n<li><em>\u201cShow me everything below Wheelbarrow Class\u201d<\/em><\/li>\n<li><em>\u201cExclude Engine Block Class entirely\u201d<\/em><\/li>\n<\/ul>\n<p>Classification reduces complexity.<\/p>\n<p>It turns continuous values into discrete groups.<\/p>\n<p>This is a common pattern in real-world systems.<\/p>\n<h2>When Metadata Isn\u2019t Enough<\/h2>\n<p>Structured queries work well when you know exactly what you\u2019re looking for.<\/p>\n<p>But sometimes you don\u2019t.<\/p>\n<p>Sometimes the question looks more like:<\/p>\n<p><em>Find rocks that look like this one.<\/em><\/p>\n<p>Or:<\/p>\n<p><em>Find something similar to the smooth stone I saw earlier.<\/em><\/p>\n<p>At that point, metadata alone isn\u2019t enough.<\/p>\n<p>We need another way to compare objects.<\/p>\n<h2>Similarity and Representation<\/h2>\n<p>Images and 3D models contain information that isn\u2019t captured in simple fields like color or weight.<\/p>\n<p>To use that information, we need to represent it in a comparable way.<\/p>\n<p>One approach is to generate embeddings \u2014 numerical representations of images or shapes.<\/p>\n<p>Conceptually:<\/p>\n<ul>\n<li>each rock image \u2192 vector representation<\/li>\n<li>similar images \u2192 vectors close together<\/li>\n<li>dissimilar images \u2192 vectors further apart<\/li>\n<\/ul>\n<p>This allows for similarity search.<\/p>\n<p>Instead of filtering by attributes, we search by resemblance.<\/p>\n<h2>A Different Kind of Query<\/h2>\n<p>With similarity search, queries look different.<\/p>\n<p>Instead of:<\/p>\n<pre><code class=\"language-plaintext\">color = 'gray'\nweight &lt; 
5\n<\/code><\/pre>\n<p>We might have:<\/p>\n<pre><code class=\"language-plaintext\">find nearest neighbors to this image\n<\/code><\/pre>\n<p>This shifts the system from exact matching to approximate matching.<\/p>\n<p>It\u2019s less precise.<\/p>\n<p>But often more useful.<\/p>\n<h2>A Familiar Pattern<\/h2>\n<p>At this point, the <strong>Backyard Quarry<\/strong> starts to resemble systems used in:<\/p>\n<ul>\n<li>image search engines<\/li>\n<li>product recommendation systems<\/li>\n<li>digital asset management platforms<\/li>\n<li>AI-powered retrieval systems<\/li>\n<\/ul>\n<p>The objects are different.<\/p>\n<p>The pattern is the same.<\/p>\n<p>Store data.<\/p>\n<p>Index it.<\/p>\n<p>Provide multiple ways to retrieve it.<\/p>\n<h2>Combining Approaches<\/h2>\n<p>In practice, the most useful systems combine both methods.<\/p>\n<p>Structured filtering:<\/p>\n<ul>\n<li>weight<\/li>\n<li>class<\/li>\n<li>location<\/li>\n<\/ul>\n<p>Similarity search:<\/p>\n<ul>\n<li>appearance<\/li>\n<li>shape<\/li>\n<li>texture<\/li>\n<\/ul>\n<p>Together, they provide flexibility.<\/p>\n<p>You can narrow down the dataset and then explore it.<\/p>\n<h2>The Cost of Search<\/h2>\n<p>Search doesn\u2019t come for free.<\/p>\n<p>It introduces:<\/p>\n<ul>\n<li>indexing overhead<\/li>\n<li>additional storage<\/li>\n<li>preprocessing steps<\/li>\n<li>more complex queries<\/li>\n<\/ul>\n<p>And like everything else in the Quarry system, these tradeoffs become more significant as the dataset grows.<\/p>\n<h2>The Realization<\/h2>\n<p>At this point, something interesting becomes clear.<\/p>\n<p>The hard part isn\u2019t collecting rocks.<\/p>\n<p>It isn\u2019t even modeling them.<\/p>\n<p>The hard part is making the data usable.<\/p>\n<p>And usability, in most systems, comes down to one thing:<\/p>\n<p>Search.<\/p>\n<h2>What Comes Next<\/h2>\n<p>With data captured and searchable, the next step is to zoom out.<\/p>\n<p>What we\u2019ve built so far is more than just a rock 
catalog.<\/p>\n<p>It\u2019s a small example of a larger idea.<\/p>\n<p>In the next post, we\u2019ll look at that idea more directly:<\/p>\n<p>Digital twins.<\/p>\n<p>Because once you can represent, store, and search objects, you\u2019ve taken the first step toward building systems that mirror the physical world.<\/p>\n<p>And somewhere in the process, it becomes clear that even a pile of rocks benefits from thoughtful indexing.<\/p>\n<p>Which is not something I expected to say when this started.<\/p>\n<h2>The Rock Quarry Series<\/h2>\n<ul>\n<li><a href=\"https:\/\/www.kenwalger.com\/blog\/software-engineering\/the-backyard-quarry-turning-rocks-into-data\/\">The Backyard Quarry: Turning Rocks Into Data<\/a><\/li>\n<li><a href=\"https:\/\/www.kenwalger.com\/blog\/data-engineering\/designing-a-schema-for-physical-objects\">The Backyard Quarry, Part 2: Designing a Schema for Physical Objects<\/a><\/li>\n<li><a href=\"https:\/\/www.kenwalger.com\/blog\/data-engineering\/capturing-physical-objects-data-pipeline\">The Backyard Quarry, Part 3: Capturing the Physical World<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>By this point, the Backyard Quarry has a schema, a capture process, and a growing collection of records. Each rock has: metadata images possibly a 3D model In theory, everything is organized. In practice, it quickly becomes difficult to find anything. The First Search Problem With a handful of rocks, you can rely on memory. &hellip; <a href=\"https:\/\/www.kenwalger.com\/blog\/data-engineering\/searching-physical-objects-data-indexing\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;The Backyard Quarry, Part 4: Searching a Pile of 
Rocks&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"pmpro_default_level":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1739,1738],"tags":[1736,1713],"yst_prominent_words":[810],"class_list":["post-1347","post","type-post","status-publish","format-standard","hentry","category-data-engineering","category-software-engineering","tag-data-engineering","tag-system-design","pmpro-has-access"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8lx70-lJ","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/1347","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/comments?post=1347"}],"version-history":[{"count":2,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/1347\/revisions"}],"predecessor-version":[{"id":1349,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/1347\/revisions\/1349"}],"wp:attachment":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/media?parent=1347"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/
wp-json\/wp\/v2\/categories?post=1347"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/tags?post=1347"},{"taxonomy":"yst_prominent_words","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/yst_prominent_words?post=1347"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}