The Backyard Quarry, Part 2: Designing a Schema for Physical Objects

In the first post of this series we set the stage for the Backyard Quarry project.

Once you decide every rock in the yard should have a record, the next question appears immediately:

What exactly should we record?

It’s a deceptively simple question. And like most simple questions in engineering, it opens the door to a surprisingly large number of decisions.

The First Attempt

The most straightforward approach is to keep things minimal.

Each rock gets an identifier and a few attributes.

Something like:

rock_id
size
price

At first glance, this seems reasonable.

We can identify the rock. We can describe it in some vague way. We can assign a price.

But this model breaks down almost immediately.

“Size” is ambiguous. Is that weight? Volume? Longest dimension? All of the above?

Two rocks of the same “size” might behave very differently when you try to move them.

And more importantly, this model doesn’t capture anything about the rock beyond its most basic characteristics.

It’s enough to sell a rock.

It’s not enough to understand one.

Expanding the Model

To make the system more useful, we need to be more explicit.

A slightly richer model might look like this:

rock_id
weight_lb
length_cm
width_cm
height_cm
color
rock_type
location_found
status

Now we’re getting somewhere.

We can distinguish between rocks that look similar but behave differently.

We can track where each rock came from.

We can start to answer questions like:

  • How many rocks do we have in a given area?
  • What size distribution does the dataset have?
  • Which rocks are suitable for different uses?

This is the point where the rock pile starts to feel less like a random collection and more like a dataset.
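As a rough sketch, the richer model maps onto a small Python dataclass. The field names come from the list above; the sample records and the helper `count_by_location` are made up for illustration and are not part of the actual Backyard Quarry system.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Rock:
    """One record per physical rock, using the fields listed above."""
    rock_id: str
    weight_lb: float
    length_cm: float
    width_cm: float
    height_cm: float
    color: str
    rock_type: str
    location_found: str
    status: str


def count_by_location(rocks):
    """Answer 'how many rocks do we have in a given area?'."""
    return Counter(rock.location_found for rock in rocks)


# Hypothetical sample data, for illustration only.
SAMPLE_ROCKS = [
    Rock("rock_001", 0.2, 3.0, 2.0, 1.5, "gray", "granite", "north fence", "cataloged"),
    Rock("rock_002", 45.0, 40.0, 30.0, 25.0, "tan", "sandstone", "north fence", "collected"),
    Rock("rock_003", 12.5, 22.0, 18.0, 10.0, "gray", "basalt", "garden bed", "cataloged"),
]

counts = count_by_location(SAMPLE_ROCKS)
heaviest = max(SAMPLE_ROCKS, key=lambda rock: rock.weight_lb)
```

With explicit fields, questions about area or size become one-line queries instead of guesses about what “size” meant.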

The Object Data Model

At a higher level, what we’re really doing is separating a physical object into a few distinct components.

Figure: A simple model for representing a physical object as structured data and associated assets. The diagram shows a physical rock represented as a digital record with metadata, images, and a 3D model.

Each rock has:

  • metadata describing its properties
  • images representing its appearance
  • optionally, a 3D model capturing its shape

This separation turns out to be important.

Metadata is small, structured, and easy to query.

Images and 3D models are large, unstructured assets that need to be stored and referenced.

Keeping those concerns separate is a pattern that shows up in many real-world systems.
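One way to sketch that separation: the record holds the small, queryable fields plus references to the large files, while the files themselves live somewhere else. The names `AssetRef` and `RockRecord`, and the example path, are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AssetRef:
    """Pointer to a large file kept outside the metadata store."""
    kind: str        # "image" or "model_3d"
    uri: str         # where the bytes live (file path, bucket key, ...)
    size_bytes: int


@dataclass
class RockRecord:
    """Small structured metadata plus references to the big assets."""
    rock_id: str
    weight_lb: float
    images: List[AssetRef] = field(default_factory=list)
    model_3d: Optional[AssetRef] = None  # the 3D model is optional


# Hypothetical record: one photo, no 3D model yet.
record = RockRecord(
    rock_id="rock_001",
    weight_lb=12.5,
    images=[AssetRef("image", "assets/rock_001/front.jpg", 2_400_000)],
)
```

The record stays a few hundred bytes no matter how large the photos get, which is what keeps it cheap to query.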

The Identity Problem

Once the schema starts to take shape, another question appears.

How do we uniquely identify a rock?

There are a few options:

  • sequential IDs (rock_001, rock_002)
  • UUIDs
  • physical tags attached to rocks
  • some form of image-based identification

For a small backyard dataset, almost anything works.

But the choice matters more as the system grows.

Sequential IDs are easy to read but require coordination.

UUIDs are globally unique but harder to work with manually.

Physical tags introduce a connection between the digital record and the real-world object.

Even in a simple system, identity becomes a design decision.
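The coordination tradeoff is easy to see in a small sketch. The helper names below are hypothetical. The sequential version needs to know every ID issued so far, while the UUID version needs nothing but Python's standard `uuid` module.

```python
import uuid


def next_sequential_id(existing_ids):
    """Return the next rock_NNN id; requires the ids already issued."""
    highest = max((int(rid.split("_")[1]) for rid in existing_ids), default=0)
    return f"rock_{highest + 1:03d}"


def new_uuid_id():
    """Return a random UUID string; no shared counter is needed."""
    return str(uuid.uuid4())


seq_id = next_sequential_id(["rock_001", "rock_002"])  # "rock_003"
random_id = new_uuid_id()
```

If two people catalog rocks at the same time without sharing the list of existing IDs, the sequential scheme can hand out the same ID twice. The UUID scheme avoids that at the cost of IDs nobody wants to write on a tag by hand.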

Classification: The Quarry Taxonomy

At some point, it becomes useful to introduce categories.

Originally this was just a convenience.

But like many things in this project, it quickly became something more formal.

A simple classification system might look like this:

Class 0 — Pebble
Class 1 — Hand Sample
Class 2 — Landscaping Rock
Class 3 — Wheelbarrow Class
Class 4 — Engine Block Class
Class 5 — Heavy Machinery Class

Each class roughly corresponds to how the rock is handled.

This turns out to be surprisingly useful.

Instead of asking for exact dimensions, we can filter by class:

  • “Show me all Pebble Class rocks”
  • “Exclude anything above Wheelbarrow Class”

In other words, we’ve introduced a derived attribute: something computed from the underlying data rather than assigned by hand.

This is exactly how classification systems evolve in real datasets.
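Here is a minimal sketch of class as a derived attribute, assuming it is computed from weight alone. The weight thresholds are invented for illustration; the post does not define where one class ends and the next begins.

```python
from bisect import bisect_right

# Hypothetical weight limits in pounds separating classes 0..5.
# A rock weighing exactly a limit falls into the higher class.
CLASS_LIMITS_LB = [1, 10, 50, 150, 500]
CLASS_NAMES = [
    "Pebble",
    "Hand Sample",
    "Landscaping Rock",
    "Wheelbarrow Class",
    "Engine Block Class",
    "Heavy Machinery Class",
]


def rock_class(weight_lb):
    """Derive the class number (0-5) from the stored weight."""
    return bisect_right(CLASS_LIMITS_LB, weight_lb)


# "Exclude anything above Wheelbarrow Class" (class 3).
weights = [0.2, 45.0, 120.0, 300.0, 2000.0]
manageable = [w for w in weights if rock_class(w) <= 3]
```

Because the class is computed rather than stored, changing a threshold reclassifies every rock at once without touching the underlying records.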

Thinking About Lifecycle

Rocks don’t change much physically, but their role in the system does.

A rock might move through states like:

collected
cataloged
listed_for_sale
sold

Tracking this lifecycle introduces another dimension to the data.

Now we’re not just modeling objects.

We’re modeling objects over time.

Even in a simple system, state and transitions begin to matter.
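A small state machine makes the transitions explicit. This sketch assumes a rock only moves forward through the four states listed above, one step at a time. That rule, the `advance` helper, and the timestamped history list are illustrative assumptions rather than the project's actual design.

```python
from datetime import datetime, timezone

# Assumed rule: a rock only moves forward, one step at a time.
ALLOWED_TRANSITIONS = {
    "collected": {"cataloged"},
    "cataloged": {"listed_for_sale"},
    "listed_for_sale": {"sold"},
    "sold": set(),
}


def advance(history, new_status):
    """Append (new_status, timestamp) to history if the move is allowed."""
    current_status = history[-1][0]
    if new_status not in ALLOWED_TRANSITIONS.get(current_status, set()):
        raise ValueError(f"cannot move from {current_status!r} to {new_status!r}")
    history.append((new_status, datetime.now(timezone.utc)))
    return history


# Each rock carries its own history of (status, timestamp) pairs.
history = [("collected", datetime.now(timezone.utc))]
advance(history, "cataloged")
```

Keeping the full history, rather than only the latest status, is what lets the system answer questions such as how long a rock sat in the catalog before it was listed.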

The Tradeoffs

At this point, the schema is already doing useful work.

But it’s also clear that there’s no perfect design.

Every decision involves tradeoffs:

  • more fields vs simplicity
  • normalized structure vs ease of use
  • flexibility vs consistency

The goal isn’t to design the perfect schema on the first try.

The goal is to design something that can evolve.

Because as soon as we start capturing real data, we’ll learn what we got wrong.

What Comes Next

With a basic schema in place, the next challenge becomes obvious.

We know what we want to store.

Now we need to figure out how to capture it.

In the next post, we’ll look at how to turn a physical rock into images, measurements, and potentially a 3D model — and how that process introduces its own set of constraints.

Because it turns out that collecting data from the physical world is rarely as clean as designing a schema on paper.

The Backyard Quarry: Turning Rocks Into Data

Another round of tech layoffs rolled through the industry recently, and I was one of the people caught in it.

If you’ve worked in tech for any length of time, you know the routine that follows. Update the résumé. Reach out to contacts. Scroll job boards. Try to figure out which technologies the market is currently excited about and which ones have quietly drifted into irrelevance.

After a few days of that cycle, I found myself spending more time outside than in front of a laptop.

One afternoon, walking around the yard, I noticed something interesting.

My backyard contains a surprisingly large dataset.

Rocks.

Figure: Sample of the rocks from the Backyard Quarry used for the dataset.

Lots of rocks.

Some are the size of peas. Others are roughly the size of a car engine. A few fall somewhere in the unsettling range between “wheelbarrow recommended” and “this probably requires heavy machinery.”

Naturally, I had the same thought many people eventually have when staring at a large pile of rocks:

I could probably sell these.

A small stand near the road. A few piles sorted by size. Maybe a sign that says “Landscaping Rock.” It’s not exactly a venture-backed startup, but stranger side businesses have existed.

Unfortunately, engineers have a well-known weakness.

We rarely do things the simple way.

If I was going to sell rocks, I wasn’t just going to pile them on a table.

I was going to build a system.

The Dataset

The moment you start thinking about the rocks as inventory, a familiar set of questions appears.

How many rocks are there?

What kinds?

Which ones are small decorative stones and which ones fall firmly into what I’ve started calling Engine Block Class?

Like many real-world datasets, this one has significant variability.

Some objects are a few grams. Others weigh enough to require careful lifting technique and a brief internal conversation about life choices.

At a glance, the dataset looks chaotic. But underneath the chaos are patterns.

Different sizes. Different shapes. Different colors. Different geological types. Some rocks are smooth river stones. Others are jagged fragments that look like they escaped from a small landslide.

If you squint a little, you start to see the outlines of something familiar to anyone who works with data systems.

A collection of physical objects that could be represented as structured records.

The Engineer’s Curse

In theory, selling rocks is simple.

  • Step one: collect rocks.
  • Step two: put them in a pile.
  • Step three: wait for someone to stop their car and decide they want landscaping material.

But once you start thinking about it from an engineering perspective, the questions multiply.

Should each rock have an identifier?

Should there be photographs?

Should the system track weight or dimensions?

What about classification?

It’s probably useful to distinguish between Pebble Class rocks and Wheelbarrow Class rocks.

And what about the really large ones, the ones that are clearly in the Engine Block Class, which itself appears to span everything from motorcycle engine scale to something closer to a semi-truck?

Once you start thinking about these questions, the simple rock pile begins to look like something else entirely.

A catalog.

A dataset.

A system waiting to happen.

From Rocks to Records

What if every rock had a record?

Something simple at first.

An identifier. A few attributes. Maybe a photo.

Conceptually, it might look like this:

rock_id
weight
dimensions
color
rock_type
location_found
status

Each rock in the yard becomes a digital object — a structured record representing something in the physical world.

In other words, each rock now has a digital twin.

That might sound slightly ridiculous in the context of landscaping stones, but the idea is surprisingly powerful.

Across many industries, organizations are trying to solve exactly this problem: how to connect messy physical reality with structured digital systems.

Manufacturers track machine parts.

Museums catalog artifacts.

Farmers track crops.

Logistics companies track inventory moving through warehouses.

In each case, the challenge is similar.

A physical object exists somewhere in the world.

We want to represent it in a way that software systems can understand.

The Backyard Quarry

At this point the rock pile had acquired a new name.

The Backyard Quarry.

Partly as a joke, and partly because it captured the spirit of the project. What started as a casual observation had turned into a small experiment in data modeling, object cataloging, and system design.

The dataset might be small.

The objects might be rocks.

But the underlying questions are surprisingly rich.

How do you represent physical objects in software?

How do you capture information about them?

How do you search and organize the resulting data?

And how do these systems scale when the number of objects grows from a few dozen to thousands — or millions?

What Comes Next

Over the next few posts in this series, I’m going to explore those questions by building a small system around the Backyard Quarry.

We’ll look at things like:

  • designing a schema for physical objects
  • capturing images and measurements
  • generating 3D models using photogrammetry
  • building ingestion pipelines
  • indexing and searching the dataset

All starting from a simple collection of rocks.

The world has no shortage of complicated engineering problems.

Sometimes the best place to explore them is somewhere simpler.

Like a pile of rocks in the backyard.

And if you happen to need a carefully documented specimen from the Backyard Quarry, inventory is currently available.

Shipping, however, may exceed the value of the rock itself.
