The Forensic Team: Architecting Multi-Agent Handoffs with MCP

Why One LLM Isn’t Enough—And How to Build a Specialized Agentic Workforce

In my last post, we explored the “Zero-Glue” architecture of the Model Context Protocol (MCP). We established that standardizing how AI “talks” to data via an MCP Server is the “USB-C moment” for AI infrastructure.

But once you have the pipes, how do you build the engine?

In 2026, the answer is no longer “one giant system prompt.” Instead, it’s Functional Specialization. Today, we’re building a Multi-Agent Forensic Team: a group of specialized Python agents that use our TypeScript MCP Server to perform deep-dive archival audits.

The “Context Fatigue” Problem

Early agent architectures relied on a single LLM handling everything:

  • retrieve data
  • reason about it
  • run tools
  • write the final output

Even with large context windows, this approach quickly hits a reasoning ceiling.

A single agent juggling too many tools often suffers from:

  1. Tool Confusion
    Choosing the wrong function when multiple tools are available.
  2. Logic Drift
    Losing track of the objective during multi-step reasoning.
  3. Latency and Cost
    Sequential reasoning loops increase response time and token usage.

The solution is functional specialization.

Instead of one overloaded agent, we build a team of focused agents coordinated by a supervisor.

Before diving into the multi-agent design, it helps to understand where the agents live in the MCP stack.

Figure 1. The MCP architecture stack: agents reason about tasks while MCP standardizes access to tools, resources, and enterprise data.

Layered architecture diagram of an MCP-based AI system showing applications, agent orchestration, the Model Context Protocol layer, tools and resources, and underlying data systems.

The Architecture: A Polyglot Powerhouse

One of MCP’s strengths is that it decouples tools from orchestration.

This allows each layer of the system to use the language best suited for the job.

In our case:

  • The “Hands” (TypeScript)
    Our MCP server handles data access and tool execution with strong typing.
  • The “Brain” (Python)
    A Python orchestrator manages reasoning and agent coordination using frameworks like LangGraph or PydanticAI.

Because both layers communicate through MCP, the language boundary disappears.

Multi-Agent MCP Architecture

Diagram showing a multi-agent architecture using the Model Context Protocol (MCP) with a Python supervisor agent coordinating Librarian and Analyst agents that access tools through a TypeScript MCP server connected to an archive database.
Multi-agent MCP architecture: a Python supervisor coordinates specialized agents that access tools through a shared MCP server.

Each agent communicates with tools through the MCP server, not directly with the data source.

The Forensic Team Roles:

Role | Agent Identity | Primary Responsibility | MCP Tools Used
Supervisor | The Orchestrator | Receives request, manages state, and handles handoffs. | list_tools, list_resources
Librarian | The Researcher | Gathers historical facts and archival metadata. | find_book_in_master_bibliography
Analyst | The Forensic Tech | Compares observed data against metadata to find flaws. | audit_artifact_consistency

How It Works: Glue-Free Agent Handoffs

The beauty of MCP is the Transport Layer. Our Python client connects to the TypeScript server via stdio. It doesn’t care that the server is written in Node.js; it only cares about the protocol.
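
To make "it only cares about the protocol" concrete: MCP messages are JSON-RPC 2.0, and over stdio each message travels as a single line of JSON. The sketch below builds the kind of request a client sends to invoke a tool. The tool name comes from the roles table above; the `title` and `author` argument names and the sample values are assumptions, since the server's actual input schema is not shown in this post.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the payload must not
    # contain raw newlines; json.dumps escapes them by default.
    return json.dumps(message)


line = build_tool_call(
    1,
    "find_book_in_master_bibliography",
    {"title": "Moby-Dick", "author": "Herman Melville"},  # assumed argument names
)
```

Nothing in that line of JSON reveals which language produced it, which is why a Python client and a Node.js server can sit on either end of the pipe.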

  1. Spawning the Sub-process
    In our orchestrator.py, we define how to “wake up” the TypeScript server. Notice how we point Python directly at the Node.js build:
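from mcp import StdioServerParameters

# SERVER_ENTRY and PROJECT_ROOT are path constants defined elsewhere in orchestrator.py.
def get_server_params() -> StdioServerParameters:
    # This is the bridge: Python spawning a Node.js process
    return StdioServerParameters(
        command="node",
        args=[str(SERVER_ENTRY)],  # Points to our TS /build/index.js
        cwd=str(PROJECT_ROOT),     # Run the server from the project root
    )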
  2. The Functional Handoff
    Because MCP tools expose strict schemas, the agents can pass structured results between each other without custom translation layers.

The Supervisor doesn’t manually parse JSON or remap fields.

Instead it simply chains the outputs:

# 1. Librarian: pull book details
librarian_result = await librarian_agent(session, title, author)

# 2. Analyst: audit for discrepancies (using Librarian's data)
analyst_result = await analyst_agent(
    session, book_page_id, book_standard, observed
)
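
The agent bodies are not shown in this post, so here is a minimal, self-contained sketch of the same chain. Everything in it is illustrative: `FakeSession` is a hypothetical stand-in that returns plain dicts (a real MCP client session returns richer result objects), and the `page_id`, `standard`, and `observed` field names are assumptions about what the two tools exchange.

```python
import asyncio
from typing import Any


class FakeSession:
    """Hypothetical stand-in for an MCP client session with canned results."""

    async def call_tool(self, name: str, arguments: dict[str, Any]) -> dict[str, Any]:
        if name == "find_book_in_master_bibliography":
            return {"page_id": "pg-001", "standard": {"pages": 320}}
        if name == "audit_artifact_consistency":
            standard, observed = arguments["standard"], arguments["observed"]
            flaws = sorted(k for k in standard if observed.get(k) != standard[k])
            return {"page_id": arguments["page_id"], "discrepancies": flaws}
        raise ValueError(f"unknown tool: {name}")


async def librarian_agent(session, title: str, author: str) -> dict[str, Any]:
    return await session.call_tool(
        "find_book_in_master_bibliography", {"title": title, "author": author}
    )


async def analyst_agent(session, book_page_id, book_standard, observed) -> dict[str, Any]:
    return await session.call_tool(
        "audit_artifact_consistency",
        {"page_id": book_page_id, "standard": book_standard, "observed": observed},
    )


async def supervisor(session, title: str, author: str, observed: dict) -> dict[str, Any]:
    librarian_result = await librarian_agent(session, title, author)
    # The handoff: Librarian output becomes Analyst input, field for field.
    return await analyst_agent(
        session, librarian_result["page_id"], librarian_result["standard"], observed
    )


report = asyncio.run(
    supervisor(FakeSession(), "Moby-Dick", "Herman Melville", {"pages": 318})
)
```

The supervisor only forwards fields by name; it never reshapes them, which is the point of the handoff.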

Why This Wins in the Enterprise:

Auditability

You can track exactly what each agent saw and what conclusions it produced.
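
One way to get that trail, sketched under the assumption that each agent is an async Python function: wrap it in a decorator that records what went in and what came out. The `audited` decorator, the `AUDIT_LOG` list, and the `lookup` agent are hypothetical names for illustration, not part of the repo.

```python
import asyncio
import functools

AUDIT_LOG: list[dict] = []


def audited(agent_name: str):
    """Record the inputs and output of each successful call to an async agent."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            result = await func(*args, **kwargs)
            AUDIT_LOG.append(
                {"agent": agent_name, "args": args, "kwargs": kwargs, "result": result}
            )
            return result

        return wrapper

    return decorator


@audited("librarian")
async def lookup(title: str) -> dict:
    # Hypothetical agent body; a real one would call an MCP tool.
    return {"title": title, "found": True}


asyncio.run(lookup("Moby-Dick"))
```

This version logs only calls that return normally; recording failures as well would need a try/except around the awaited call.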

Security

Agent permissions can be scoped by tool access.
The Librarian may only read archives, while the Analyst writes forensic reports.
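
A minimal sketch of tool-level scoping, assuming the orchestrator checks an allow-list before forwarding any call to the MCP server. The role-to-tool mapping mirrors the roles table above; the `authorize` helper and the lowercase role keys are hypothetical.

```python
ROLE_TOOLS: dict[str, frozenset[str]] = {
    "supervisor": frozenset({"list_tools", "list_resources"}),
    "librarian": frozenset({"find_book_in_master_bibliography"}),
    "analyst": frozenset({"audit_artifact_consistency"}),
}


def authorize(role: str, tool: str) -> None:
    """Raise PermissionError unless the role is allowed to call the tool."""
    allowed = ROLE_TOOLS.get(role, frozenset())  # unknown roles get no tools
    if tool not in allowed:
        raise PermissionError(f"{role!r} may not call {tool!r}")
```

Calling `authorize("librarian", "audit_artifact_consistency")` raises `PermissionError`, so a misrouted request fails before it ever reaches the server.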

Maintainability

Each agent owns a single responsibility.
If the forensic logic changes, only the Analyst agent needs to be updated.

Scaling to the “AI Mesh”

By using MCP as the backbone, you’ve built more than an app; you’ve built a System of Intelligence. Any new tool you add to your TypeScript server is instantly “discoverable” by your Python team. You are no longer writing “Glue Code”; you are orchestrating a digital workforce.

The MCP server becomes the shared capability layer for your entire AI system.
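
Discovery can be sketched in a few lines. In the MCP Python SDK, a client session's `list_tools()` call returns a result whose `tools` entries each carry a `name`; the helper below relies only on that shape. `FakeDiscoverySession` is a hypothetical test double so the example runs without a server.

```python
import asyncio
from types import SimpleNamespace


async def discover_tool_names(session) -> list[str]:
    """Return the names of all tools the connected MCP server advertises."""
    result = await session.list_tools()
    return [tool.name for tool in result.tools]


class FakeDiscoverySession:
    """Test double that mimics the shape of a list_tools() response."""

    async def list_tools(self):
        names = ["find_book_in_master_bibliography", "audit_artifact_consistency"]
        return SimpleNamespace(tools=[SimpleNamespace(name=n) for n in names])


tool_names = asyncio.run(discover_tool_names(FakeDiscoverySession()))
```

Because the agents ask the server what it offers at runtime, a tool added on the TypeScript side shows up in this list without any change to the Python code.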

📚 The “Zero-Glue” Series
– Post 1: The End of Glue Code: Why MCP is the USB-C Moment for AI
– Post 2: The Forensic Team: Architecting Multi-Agent Handoffs – You are here
– Post 3: From Cloud to Laptop: Running MCP Agents with SLMs – Coming Soon
– Post 4: Enterprise Governance: Scaling MCP with Oracle 26ai – Coming Soon

Explore the Code:

The full multi-agent orchestrator is now live in the /examples folder of the repo:
👉 MCP Forensic Analyzer – Multi-Agent Example

Up Next in the Series:

Next week, we go small. We’re moving the “Forensic Team” out of the cloud and onto your laptop. We’ll explore Edge AI and how to run this entire stack using Small Language Models (SLMs) like Phi-4—no $10,000 GPU required.


Flask Blood Glucose Tracker

My oldest daughter was diagnosed with Type 1 Diabetes at the age of two. Technology has come a long way in terms of tracking blood sugar levels, but I thought I would start a Python web application to do so. I’ll be using the Flask web framework for the project and, since I’m not a marketing or product naming genius, will call the project Flask Blood Glucose Tracker. I’m certainly open to other, more catchy, names.

This is somewhat of a different post for me, in that I’ll be walking through the creation of an actual application. The application will mostly be a minimum viable product (MVP), though. It should still be a good tutorial on how to integrate some concepts and technologies. In this first post, I’ll cover some of the project specifications and the features I’d like to include; I’ll work on implementing them in future posts.

Application for Diabetes

Diabetics have to check their blood glucose levels frequently, typically using a blood sugar monitor. These checks involve getting a small amount of blood from a finger (or arm) prick and having the monitor test it. The monitor then returns a measurement of the amount of glucose in the blood, in milligrams per deciliter (mg/dL). These readings are the numbers we want to record in our application.

It is also important to know whether the reading was taken before or after eating, and the time of day it was taken. The application will need to accommodate that as well. The readings themselves, along with the time of day and relation to meal time, all factor into the amount of insulin that needs to be injected.

Throughout this and subsequent posts, I will do my best to explain diabetes-specific terms in as user-friendly a way as possible. I am basing many of the numbers that I use, ranges of “good and bad”, etc. on managing my daughter’s diabetes over the last 14 years and the countless doctor appointments I’ve attended. As a disclaimer, I am not a medical professional. Please check with your physician about specifics with diabetes. There is a Diabetes For Dummies book which provides a decent overview as well.

Flask Blood Glucose Tracker Application Features

There are already a lot of excellent products on the market for keeping track of one’s blood sugar levels. With that in mind, this application is going to be fairly simple to start. I would like to build it with growth in mind, however, so building a REST API into the project seems like a good idea. My basic feature list to start with is:

  • Register for the BGT site, with email confirmation.
  • Log in and out, with access that depends on role:
    • Patient: access and edit their own data.
    • Physician: get a list of all current patient records.
    • Administrator: site maintenance.
  • Input blood sugar levels with date and time of reading and indication of before or after a meal or snack.
  • Display the data in a table with averages.
  • Display the data in a chart or visual format.
  • Typical CRUD operations for the data.
  • REST API to expose patient blood sugars in a secure fashion.
  • Data is stored in the cloud for accessibility and ease of database maintenance.

For the reporting features, highlighting high and low blood sugar levels in the report would be helpful. Since what counts as “good” and “bad” differs for each individual, I’ll include fields where each user can set their own thresholds.
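
Here is a rough sketch of how per-user thresholds could drive that highlighting. The `TargetRange` class, the `classify` function, and the label names are hypothetical, and the default bounds are placeholders for illustration rather than recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetRange:
    """Per-patient bounds in mg/dL; the defaults are placeholders only."""

    low: int = 70
    high: int = 180


def classify(reading_mg_dl: int, target: TargetRange) -> str:
    """Label a reading as 'low', 'in_range', or 'high' for report highlighting."""
    if reading_mg_dl < target.low:
        return "low"
    if reading_mg_dl > target.high:
        return "high"
    return "in_range"


# A patient whose physician has set a tighter range than the placeholder default.
custom = TargetRange(low=80, high=150)
labels = [classify(value, custom) for value in (65, 120, 210)]
```

Keeping the bounds on the user rather than in the reporting code means the report logic stays the same for every patient.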

Application Stack

I have already mentioned that this application will be built with Flask. For the data store, I will use MongoDB. To keep in line with the feature request of storing data in the cloud, MongoDB offers a Database as a Service (DBaaS) called Atlas.

I haven’t quite figured out yet where I’ll ultimately host this application, perhaps Heroku? Or maybe on my own server.

Application Libraries

When it comes to libraries, there are a lot of choices. Here’s what I’ll be using, which will also be included in a requirements.txt file.

  • Flask version 0.12.2
  • Flask-Login, 0.4.0
  • Flask-PyMongo, 0.5.1
  • Bokeh, 0.12.6
  • Jinja2, 2.9.6
  • pandas, 0.20.2

There are other libraries that will be used as well, but those are the main ones of interest.

I’m also more of a fan of Zurb Foundation than Twitter Bootstrap, so I’ll be using that for styling.

Pages and Routes

Web Pages

To start with, we need a way for a user to log in, enter their personal data, enter a new blood sugar record, edit their records, and view their information in both a tabular format and a chart format.

We’ll need the following pages, at least to start.

  • Index
  • Registration
  • Login/Logout Page
  • Profile page, login required
  • Records page (create, read, update, delete), login required
  • Chart page, login required

For a Physician we would want to be able to:

  • Display all of their patients, login required

For an Administrator, we want to be able to:

  • Manage users (patients and physicians) without being able to see patient medical data.

REST API Routes

From the API I want to expose the ability to securely read and write (GET and POST) data to a patient’s record. This will make it easier to, for example, write a mobile application to connect to our data. Or, with the advances in blood sugar monitors, perhaps automatically update our application with readings from a device.
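
Whatever the routes end up looking like, a POST handler will need to check incoming data before writing it. Below is a framework-agnostic sketch of that check. The field names (`value_mg_dl`, `meal_relation`, `taken_at`) and the allowed meal values are assumptions for illustration; authentication is out of scope here.

```python
from datetime import datetime

MEAL_RELATIONS = {"before_meal", "after_meal"}


def validate_reading(payload: dict) -> dict:
    """Return a cleaned reading, or raise ValueError describing the problem."""
    value = payload.get("value_mg_dl")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError("value_mg_dl must be a positive integer")
    relation = payload.get("meal_relation")
    if relation not in MEAL_RELATIONS:
        raise ValueError(f"meal_relation must be one of {sorted(MEAL_RELATIONS)}")
    try:
        taken_at = datetime.fromisoformat(payload["taken_at"])
    except (KeyError, TypeError, ValueError) as exc:
        raise ValueError("taken_at must be an ISO 8601 timestamp") from exc
    return {"value_mg_dl": value, "meal_relation": relation, "taken_at": taken_at}


reading = validate_reading(
    {"value_mg_dl": 112, "meal_relation": "before_meal", "taken_at": "2017-07-01T07:30:00"}
)
```

A route would call this on the parsed request body and return a 400-level response when it raises, so a mobile client or a monitoring device gets a clear error instead of silently storing bad data.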

Document Model

Since I will be using MongoDB to store data for this application, I’ll be leveraging the document model. This offers a lot of flexibility in how data is stored, among other benefits. I would encourage you to read my blog post on the document model if you are not familiar with it.

To start with, the basic data we want to capture and model will be as follows:

BGT Sample Document
Sample Patient Record document

In looking at this sample document, the groups field will keep track of values such as patient, physician, and admin. I have chosen to implement the postal_code and MRN values as strings instead of integers to accommodate alphanumeric values.
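
For readers who cannot see the image, here is a hypothetical sketch of what such a document might look like as a Python dict, which is how PyMongo represents documents. Only `groups`, `postal_code`, and `MRN` are named in the text above; every other field name and all of the values are invented for illustration.

```python
patient_record = {
    "first_name": "Jane",              # hypothetical field
    "last_name": "Doe",                # hypothetical field
    "email": "jane.doe@example.com",   # hypothetical field
    "groups": ["patient"],             # could also hold "physician" or "admin"
    "postal_code": "K1A 0B1",          # a string, so letters and spaces are fine
    "MRN": "A1234567",                 # a string, so alphanumeric IDs are fine
}


def has_role(document: dict, role: str) -> bool:
    """Check whether a user document lists the given role in its groups field."""
    return role in document.get("groups", [])
```

Storing `groups` as an array lets one user hold more than one role without any schema change.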

Through the course of developing this application, we will see the flexibility of the document model in action. We’ll see how we can utilize some of the features from MongoDB’s aggregation pipeline to handle our data processing as well.

Wrap Up

I have outlined a nice project here which will utilize several different bits of programming. In the next few posts then, I’ll cover how to implement all of these features into an MVP application. I would definitely enjoy receiving feedback, so please leave comments below.

 


Follow me on Twitter @kenwalger to get the latest updates on my postings.
