What I’ve Been Building: Systems, AI, and Real-World Data

Over the past several weeks, I’ve been spending a lot of time thinking about systems.

Some of that thinking has taken the form of writing.

If you’ve come across any of my recent posts, they might seem like they cover very different topics:

  • cataloging rocks in a backyard
  • building AI systems using MCP
  • working with documents, images, and real-world data

At first glance, they don’t appear to have much in common.

But they’re all exploring the same underlying idea.

The Common Thread

Across all of these posts, the focus has been on a specific kind of problem:

How do we turn messy, real-world inputs into structured, usable systems?

That problem shows up in many different forms.

Sometimes the input is physical:

  • objects
  • artifacts
  • environments

Sometimes it’s digital:

  • documents
  • images
  • logs

Sometimes it’s dynamic:

  • motion
  • behavior
  • sensor data

But the challenge is the same.

The input is unstructured.

The system needs structure.

The Backyard Quarry

One way I explored this idea was through a small project I called the Backyard Quarry.

It started with a simple observation:

There are a lot of rocks in the yard.

From there, the problem evolved into something more interesting:

  • how to represent physical objects as data
  • how to capture images and measurements
  • how to build pipelines around that data
  • how to search and organize it
  • how to think about digital twins

What began as a small experiment became a way to explore system design in a constrained, tangible setting.

MCP and AI Systems

In parallel, I’ve been writing about building AI systems using MCP.

On the surface, this looks very different.

Instead of rocks, the inputs are:

  • documents
  • APIs
  • models
  • agent workflows

But the structure is familiar.

  • inputs are ingested
  • processed
  • transformed
  • routed
  • used by applications

The system still needs to handle:

  • variability
  • scale
  • imperfect data
  • orchestration

Different inputs.

Same patterns.

From Objects to Systems

One of the more useful realizations in working through these ideas is this:

The problem is rarely about the individual object.
It’s about the system that handles many objects over time.

Whether the object is:

  • a rock
  • a document
  • a sensor reading

The questions become:

  • how is it represented?
  • how does it enter the system?
  • how is it transformed?
  • how is it stored?
  • how is it retrieved?

These are system-level questions.

A Shared Architecture

Across these different domains, a common architecture begins to emerge.

Figure: Diagram showing how raw inputs are captured, processed, structured, indexed, and used by applications in a data system. A common pattern for transforming real-world inputs into usable systems.

The labels change depending on the domain.

But the structure remains consistent.
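To make the pattern a little more concrete, here is a minimal sketch of the stages named in the diagram (capture, process, structure, index, use). Everything in it is an assumption made for illustration: the `Record` fields, the function names, and the two sample inputs are hypothetical and are not taken from the Backyard Quarry or MCP posts.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Record:
    """Structured form of one raw input (all field names are illustrative)."""
    record_id: int
    source: str
    text: str
    tags: list[str] = field(default_factory=list)


def capture(raw_inputs):
    """Capture: attach an id to each (source, raw text) pair."""
    return [(i, source, raw) for i, (source, raw) in enumerate(raw_inputs, start=1)]


def process(captured):
    """Process: normalize whitespace and case so later stages see uniform text."""
    return [(i, source, " ".join(raw.split()).lower()) for i, source, raw in captured]


def structure(processed):
    """Structure: turn each cleaned input into a Record with simple word tags."""
    return [
        Record(i, source, text, tags=sorted(set(text.split())))
        for i, source, text in processed
    ]


def index(records):
    """Index: map each tag to the ids of the records that carry it."""
    lookup = defaultdict(list)
    for record in records:
        for tag in record.tags:
            lookup[tag].append(record.record_id)
    return dict(lookup)


# Use: two made-up inputs from different sources flow through the same stages.
raw = [("yard", "  Grey   granite rock "), ("scanner", "Granite countertop INVOICE")]
records = structure(process(capture(raw)))
tag_index = index(records)
print(tag_index["granite"])  # ids of every record tagged "granite"
```

A rock description and a scanned invoice go through identical stages here; only the content of each stage would change in a real system.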

Why This Matters

Understanding this pattern makes it easier to approach new problems.

Instead of starting from scratch each time, you can ask:

  • Where does the data come from?
  • How does it enter the system?
  • What transformations are required?
  • How will it be used?

This reduces complexity.

It also makes systems more predictable.

What I’m Interested In

Going forward, I’m particularly interested in systems that sit at the boundary between:

  • the physical world and digital systems
  • unstructured inputs and structured data
  • human workflows and automated processes

That includes areas like:

  • digital archiving
  • photogrammetry and 3D capture
  • AI-assisted analysis
  • systems that track objects or behavior over time

These problems are messy.

Which is part of what makes them interesting.

A Continuing Exploration

The posts I’ve been writing are not meant to be definitive.

They’re part of an ongoing exploration.

A way to think through problems in public.

And occasionally, a way to use a slightly unusual example — like a pile of rocks — to make broader ideas easier to see.

If You’re Interested

If any of this resonates, you might find these useful:

The Backyard Quarry Series

A systems-focused look at modeling and working with physical objects, starting with Turning Rocks Into Data.

MCP and AI Systems

A technical exploration of building agent-based systems and data pipelines. I’d suggest starting with The End of Glue Code: Why MCP is the USB-C Moment for AI Systems.

More to come.

And if nothing else, it turns out that even a backyard can be a good place to think about system design.


Get Cooking with ChatGPT

ChatGPT from OpenAI has been getting a lot of publicity recently. If you haven’t heard of it, the name is short for Chat Generative Pre-trained Transformer. What does that mean? It’s a chatbot built on a large language model that has been fine-tuned with supervised and reinforcement learning. There has been much discussion about the accuracy and usability of ChatGPT.

ChatGPT Models

OpenAI’s API currently offers three different model families for response generation. They have different capabilities and come with different pricing. The families are:

  • GPT-3, for understanding and generating natural language.
  • Codex, for understanding and generating software code.
  • Content filter, for determining whether text may be sensitive or unsafe.

Since I work in the software industry, the Codex model intrigues me, but I have yet to see examples of auto-generated code that does exactly what I’m trying to accomplish. Having previous experience in the culinary industry, I thought it would be interesting to try ChatGPT in that arena instead, using the GPT-3 model. Let’s see what sort of recipes ChatGPT can generate and how true to the classic recipes they are.

Within the GPT-3 family, four model options are available: text-ada-001, text-babbage-001, text-curie-001, and text-davinci-003. They sit along a speed/cost scale, from the fast and inexpensive text-ada-001 to the most capable but slower and more expensive text-davinci-003. While fine-tuning these models is an option, it goes beyond the scope of this post. With that in mind, let’s look at what these different models can generate for us “out of the box.”

The Recipe

In classical French cuisine, there are five mother sauces on which other sauces are based. To test out ChatGPT’s recipe generation, let’s look at one of these classic sauces: Béchamel. Béchamel is a simple sauce, from a classic ingredient standpoint, in that it contains butter, all-purpose flour, milk, salt, and pepper.

Let’s write a brief Python application that we’ll use to access the OpenAI API and see what responses we get from ChatGPT.

First, we’ll need to get an account with OpenAI and an API key. As a best practice to keep your API key safe, using an API vault, such as Pangea’s Vault, is a great idea. At a bare minimum, assign the key to an environment variable so you can access it in your application without saving it in your code file.
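As a minimal sketch of the environment-variable approach, the helper below reads the key and fails with a readable message when it is missing. The function name `load_api_key` is hypothetical, and the default variable name simply mirrors the one used in the script later in this post; neither is part of the OpenAI library.

```python
import os


def load_api_key(var_name="OPENAPI_APIKEY"):
    """Return the API key stored in the given environment variable.

    Raises a RuntimeError with a readable message instead of a bare KeyError.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable before running.")
    return key
```

Calling `load_api_key()` once at startup surfaces a missing key immediately rather than partway through the first API request.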

Second, we’ll need to install the openai package using a package management tool such as pip or pipenv. I prefer pipenv, so running the following command will get that installed:

pipenv install openai

Now, in a Python file, chatgpt-python.py, we’ll make a “How do I make bechamel sauce?” request to OpenAI using the four different GPT-3 models:

import os

import openai

# Read the API key from the environment instead of hard-coding it
openai.api_key = os.environ['OPENAPI_APIKEY']

# The model engines to compare
model_engines = ["text-ada-001", "text-babbage-001", "text-curie-001", "text-davinci-003"]

# The question passed into the API
prompt = "How do I make bechamel sauce?"

for model_engine in model_engines:

    # Response generation
    # (openai.Completion is the interface in pre-1.0 versions of the openai package)
    generator = openai.Completion.create(
        engine=model_engine,  # (REQUIRED) - the model to use
        prompt=prompt,  # (OPTIONAL) - the prompt that's passed to the API
        max_tokens=1024,  # (OPTIONAL) - the max # of tokens to generate
        n=1,  # (OPTIONAL) - how many completions to generate
        stop=None,  # (OPTIONAL) - stop word string or array.
        temperature=0.5,  # (OPTIONAL) - the randomness of the generated response. Number between 0 and 2
    )

    # Each completion's text lives on the corresponding entry in choices
    generated_response = generator.choices[0].text
    print("Response for " + model_engine + ": ")
    print(generated_response)

In the response generator, we set several options: the model engine, the prompt, the maximum number of tokens to generate (default is 16), how many completions are generated for each prompt (default is 1), any stop sequences that should end further token generation, and the temperature of the response.

What is the temperature? It defines the randomness (coherency) of the generated response. Lower numbers here make the responses more focused and higher numbers generate more random output. We’ll explore this in a bit. For now, here are the responses I got from the above script. Remember that we’re looking for an answer to the question of “How do I make béchamel sauce?”

text-ada-001 (0.50 second response)

There is no one definitive way to make bechamel sauce. However, most likely you will need to use grated cheese, milk, and butter.

Hmmm, that doesn’t really seem helpful, and while there are sauces based on a béchamel that include cheese, the base sauce doesn’t.

text-babbage-001 (0.61 second response)

Place a pot of water on the stove to boil. Add a cup of sugar and cook until the sugar has dissolved. Add a cup of milk and cook until the milk has boiled off. Pour the sauce into a saucepan and stir in the bechamel sauce ingredients. Cook over low heat until thickened.

Again, not spectacularly useful: sugar has no place in a classic béchamel. Also, what is a “pot of water” exactly? 1 cup, 1 quart, 5 gallons? I’m not sure what “until the milk has boiled off” means in this context either. And then there’s a very circular instruction for how to make béchamel sauce: “stir in the bechamel sauce ingredients.”

text-curie-001 (0.84 second response)

Bechamel sauce is a simple sauce made from milk and flour. Heat milk in a saucepan over medium heat until it starts to simmer. Whisk in flour until a smooth paste forms. Cook, stirring constantly, until the sauce thickens. Serve hot.

Getting closer here, but still not quite there. Using straight flour to thicken milk doesn’t create a great flavor. However, even without measurements, you could generate something here, right? Even with 100 gallons of milk being heated to a simmer you know that you must add in enough flour until a “smooth paste forms.” Still, far from a classical béchamel sauce.

text-davinci-003 (7.3 second response)

Bechamel sauce is a classic French sauce made from a roux of butter and flour, and a liquid, usually milk. Here is a basic recipe:

Ingredients:

-3 tablespoons butter
-3 tablespoons all-purpose flour
-2 cups milk
-Salt and freshly ground pepper
-Pinch of ground nutmeg

Instructions:

1. Melt the butter in a saucepan over medium heat.
2. Add the flour and whisk until the mixture is smooth and bubbling.
3. Gradually whisk in the milk, whisking constantly to avoid lumps.
4. Continue to whisk until the sauce thickens, about 5 minutes.
5. Season with salt, pepper, and nutmeg.
6. Serve warm.

This is a recipe that one could work with. I’d have a few nit-picks with the instructions, but overall, it makes sense. Personally, I wouldn’t add the nutmeg and it isn’t explicitly classical, but I’ve seen it done.

Temperature Adjustment

As mentioned earlier, the temperature parameter adjusts the “coherency” of the generated output. You can set that parameter to a number between 0 and 2. Recall that the above script ran with a temperature parameter of 0.5. The output was, while not always helpful, at least coherent.
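To get a feel for why the setting matters, here is a small sketch of the textbook formulation of temperature: raw token scores are divided by the temperature before being turned into probabilities. This illustrates the general technique and is not a description of OpenAI’s internals. The candidate words and their scores are made up, and the function name is hypothetical. The sketch also rejects a temperature of 0, even though the API accepts it, because dividing by zero is undefined here.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, dividing by temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical scores for three candidate next words (made-up numbers).
candidates = ["butter", "flour", "documents"]
logits = [4.0, 3.0, 0.5]

for temperature in (0.2, 0.5, 1.75):
    probs = softmax_with_temperature(logits, temperature)
    summary = ", ".join(f"{word}={p:.3f}" for word, p in zip(candidates, probs))
    print(f"temperature={temperature}: {summary}")
```

At low temperatures nearly all of the probability lands on the top-scoring word, while at higher temperatures unlikely words such as “documents” get a noticeably larger share.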

The output from the text-davinci-003 model was useful. Let’s see what happens if we alter the temperature parameter value.

temperature=1.75

Bechamel is one of the staples in employing tha classic “mother tastes” that understands fundamental techniques, which are included bases like Bechamel felt se as Milk, simple sugars /mutable faster blend together all seeds again serve holes along veggies traditional

That’s just the introduction; the rest of the “recipe” isn’t much better…

To make béchamel classic:

1 MELT 2 tablespoons worthwhile normal core puns international info and work users tub back stainless stre ‘tub fully restricted sized documents allowed uses principle-friendly views data

… and it goes on but becomes less coherent.

temperature=0.2

Let’s go in the other direction and reduce the temperature parameter value. Here are the results:

Bechamel sauce is a classic French white sauce made with butter, flour, and milk.

Ingredients:

– 2 tablespoons butter
– 2 tablespoons all-purpose flour
– 2 cups milk
– Salt and pepper to taste

Instructions:

1. In a medium saucepan, melt the butter over medium heat.
2. Add the flour and whisk until the mixture is smooth and bubbly.
3. Gradually add the milk, whisking constantly until the mixture is smooth.
4. Bring the mixture to a boil, then reduce the heat and simmer for 5 minutes, stirring occasionally.
5. Season with salt and pepper to taste.
6. Serve over your favorite dish. Enjoy!

Another great béchamel recipe, and this time without nutmeg!
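All of the comparisons above were done by eye. A rough way to automate the first pass, assuming a plain keyword match is good enough, is to check which of the classic ingredients never show up in a response. The function name and the matching approach are my own illustration and not part of the OpenAI library. Substring matching is crude: “salt” would also match “unsalted,” for example.

```python
# The classic béchamel ingredients listed earlier in this post
CLASSIC_BECHAMEL = ("butter", "flour", "milk", "salt", "pepper")


def missing_ingredients(response_text, expected=CLASSIC_BECHAMEL):
    """Return the expected ingredients that never appear in the response text."""
    lowered = response_text.lower()
    return [item for item in expected if item not in lowered]


# The text-ada-001 response quoted above
ada_response = (
    "There is no one definitive way to make bechamel sauce. However, most likely "
    "you will need to use grated cheese, milk, and butter."
)
print(missing_ingredients(ada_response))  # ['flour', 'salt', 'pepper']
```

A check like this only flags omissions. It would not catch the sugar in the text-babbage-001 response or judge whether the instructions make sense.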

I don’t believe you. You’re an impasta!

As we’ve seen, using OpenAI and ChatGPT can have some interesting results. You might wind up with some imposter recipes, or you might wind up with some winners. In my opinion, that is true of AI-generated content in general right now. Could you build an entire application strictly based on results from ChatGPT? Perhaps. Individual results may vary though.

As the models get better, get more use, and “learn” more, things might change. And recipe creation is just one small example of this technology. We see folks attempting to use it in areas such as courtrooms, health care, insurance, and more. In comparison, culinary uses seem safe. Just based on the varying results of a béchamel sauce recipe, I think we have a ways to go with this technology.
