What I’ve Been Building: Systems, AI, and Real-World Data

Over the past several weeks, I’ve been spending a lot of time thinking about systems.

Some of that thinking has taken the form of writing.

If you’ve come across any of my recent posts, they might seem like they cover very different topics:

  • cataloging rocks in a backyard
  • building AI systems using MCP
  • working with documents, images, and real-world data

At first glance, they don’t appear to have much in common.

But they’re all exploring the same underlying idea.

The Common Thread

Across all of these posts, the focus has been on a specific kind of problem:

How do we turn messy, real-world inputs into structured, usable systems?

That problem shows up in many different forms.

Sometimes the input is physical:

  • objects
  • artifacts
  • environments

Sometimes it’s digital:

  • documents
  • images
  • logs

Sometimes it’s dynamic:

  • motion
  • behavior
  • sensor data

But the challenge is the same.

The input is unstructured.

The system needs structure.

The Backyard Quarry

One way I explored this idea was through a small project I called the Backyard Quarry.

It started with a simple observation:

There are a lot of rocks in the yard.

From there, the problem evolved into something more interesting:

  • how to represent physical objects as data
  • how to capture images and measurements
  • how to build pipelines around that data
  • how to search and organize it
  • how to think about digital twins

What began as a small experiment became a way to explore system design in a constrained, tangible setting.

MCP and AI Systems

In parallel, I’ve been writing about building AI systems using MCP.

On the surface, this looks very different.

Instead of rocks, the inputs are:

  • documents
  • APIs
  • models
  • agent workflows

But the structure is familiar.

  • inputs are ingested
  • processed
  • transformed
  • routed
  • used by applications

The system still needs to handle:

  • variability
  • scale
  • imperfect data
  • orchestration

Different inputs.

Same patterns.

From Objects to Systems

One of the more useful realizations in working through these ideas is this:

The problem is rarely about the individual object.
It’s about the system that handles many objects over time.

Whether the object is:

  • a rock
  • a document
  • a sensor reading

The questions become:

  • how is it represented?
  • how does it enter the system?
  • how is it transformed?
  • how is it stored?
  • how is it retrieved?

These are system-level questions.

A Shared Architecture

Across these different domains, a common architecture begins to emerge.

Diagram showing how raw inputs are captured, processed, structured, indexed, and used by applications in a data system.
A common pattern for transforming real-world inputs into usable systems.

The labels change depending on the domain.

But the structure remains consistent.

Why This Matters

Understanding this pattern makes it easier to approach new problems.

Instead of starting from scratch each time, you can ask:

  • Where does the data come from?
  • How does it enter the system?
  • What transformations are required?
  • How will it be used?

This reduces complexity.

It also makes systems more predictable.

What I’m Interested In

Going forward, I’m particularly interested in systems that sit at the boundary between:

  • the physical world and digital systems
  • unstructured inputs and structured data
  • human workflows and automated processes

That includes areas like:

  • digital archiving
  • photogrammetry and 3D capture
  • AI-assisted analysis
  • systems that track objects or behavior over time

These problems are messy.

Which is part of what makes them interesting.

A Continuing Exploration

The posts I’ve been writing are not meant to be definitive.

They’re part of an ongoing exploration.

A way to think through problems in public.

And occasionally, a way to use a slightly unusual example — like a pile of rocks — to make broader ideas easier to see.

If You’re Interested

If any of this resonates, you might find these useful:

The Backyard Quarry Series

A systems-focused look at modeling and working with physical objects starting with Turning Rocks Into Data.

MCP and AI Systems

A technical exploration of building agent-based systems and data pipelines. I’d suggest starting with The End of Glue Code: Why MCP is the USB-C Moment for AI Systems.

More to come.

And if nothing else, it turns out that even a backyard can be a good place to think about system design.

Facebooktwitterredditlinkedinmail

The Backyard Quarry, Part 5: Digital Twins for Physical Objects

At this point in the Backyard Quarry project, something subtle has happened.

We started with a pile of rocks.

We now have:

  • a schema
  • a capture process
  • stored images
  • searchable metadata
  • classification
  • lifecycle states

Each rock has a record.

Each record represents something in the physical world.

And that leads to a useful observation.

We’re no longer just cataloging rocks.

We’re building digital representations of them.

What Is a Digital Twin?

In simple terms, a digital twin is:

A structured digital representation of a physical object.

That representation can include:

  • identity
  • properties
  • visual data
  • state
  • history

In the context of the Quarry, a rock’s digital twin might look like:

rock_id: QRY-042
weight_lb: 12.3
dimensions_cm: 18 x 10 x 7
color: gray
rock_type: granite
status: for_sale
images: [rock_042_1.jpg, rock_042_2.jpg]
model: rock_042.obj

It’s not the rock itself.

But it’s a useful abstraction of it.

More Than Just Metadata

At first glance, a digital twin might look like a simple database record.

But there’s an important difference.

A well-designed digital twin combines multiple types of data:

  • structured metadata (easy to query)
  • unstructured assets (images, models)
  • derived attributes (classification, embeddings)
  • state over time

It’s not just describing the object.

It’s enabling interaction with it through software.

The Time Dimension

One of the most important aspects of a digital twin is that it can change over time.

Even a rock — which is about as static as objects get — has a lifecycle in the system:

collected → cataloged → listed_for_sale → sold

Each transition adds context.

Now we’re not just storing a snapshot.

We’re tracking a history.

This becomes much more important in other domains.

Where This Shows Up

The interesting part is that this pattern isn’t unique to rocks.

It appears in many different systems.

Manufacturing

  • digital twins of machine parts
  • tracking condition and usage
  • linking physical components to system data

Museums and Archives

  • artifacts with metadata, images, provenance
  • digitized collections
  • searchable historical records

Agriculture

  • crops tracked over time
  • environmental data
  • growth and yield metrics

Healthcare and Motion

  • human movement captured as data
  • gait analysis
  • rehabilitation tracking

This last one starts to look a lot like something else entirely.

From Objects to Systems

What the Backyard Quarry demonstrates, in a small way, is that once you:

  • represent objects as data
  • capture their properties
  • store and index them

you’ve created the foundation for a larger system.

The digital twin becomes a building block.

And systems are built from collections of these building blocks.

The Abstraction Layer

A useful way to think about digital twins is as an abstraction layer.

They sit between:

Diagram showing how physical objects are captured and represented as digital twins with metadata, assets, and application layers.
Digital twins act as a bridge between physical objects and software systems.

Applications don’t interact with rocks directly.

They interact with the representation of rocks.

That layer enables:

  • search
  • analytics
  • visualization
  • automation

Without it, everything remains manual and unstructured.

The Limits of the Model

Of course, digital twins are not perfect representations.

They are approximations.

Some properties are easy to capture.

Others are difficult or impossible.

Even in the Quarry:

  • weight is approximate
  • dimensions are imprecise
  • visual data depends on lighting
  • 3D models may be incomplete

The goal isn’t perfect fidelity.

It’s usefulness.

The Real Insight

At this point, the Backyard Quarry starts to feel less like a joke and more like a small version of a much larger idea.

Many modern systems are built around digital twins.

Not because the concept is new.

But because we now have the tools to make it practical:

  • cheap sensors
  • high-resolution cameras
  • scalable storage
  • machine learning

The pattern has existed for a long time.

The difference is that we can now implement it at scale.

What Comes Next

So far, the Quarry system works at a small scale.

A handful of rocks.

A manageable dataset.

But what happens when the number of objects grows?

When the dataset becomes:

  • hundreds
  • thousands
  • or millions

The next post explores that question.

Because designing a system for a small dataset is one thing.

Designing a system that scales is something else entirely.

And somewhere along the way, it becomes clear that a pile of rocks is enough to illustrate ideas that show up across entire industries.

Yet another surprise in this Backyard Quarry journey.

The Rock Quarry Series

Facebooktwitterredditlinkedinmail