{"id":1532,"date":"2026-05-13T07:53:12","date_gmt":"2026-05-13T14:53:12","guid":{"rendered":"https:\/\/www.kenwalger.com\/blog\/?p=1532"},"modified":"2026-04-28T08:26:09","modified_gmt":"2026-04-28T15:26:09","slug":"feature-freshness-designing-pipelines-that-keep-up-with-the-world","status":"publish","type":"post","link":"https:\/\/www.kenwalger.com\/blog\/ai\/feature-freshness-designing-pipelines-that-keep-up-with-the-world\/","title":{"rendered":"Feature Freshness: Designing Pipelines That Keep Up With the World"},"content":{"rendered":"<p>In the <a href=\"https:\/\/www.kenwalger.com\/blog\/ai\/when-your-ai-pipeline-grows-up-infrastructure-thinking-for-real-time-inference-at-scale\">previous post<\/a>, we identified three categories of pressure that expose architectural weaknesses when AI pipelines scale: load variability, data velocity, and index drift. This post is about data velocity \u2014 specifically, the feature freshness problem.<\/p>\n<p>The core question is deceptively simple: <strong>how old is the data your model is reasoning about when it makes a prediction?<\/strong><\/p>\n<p>For some workloads, a few hours of staleness is harmless. For others, a few minutes can meaningfully degrade prediction quality. And for a growing class of real-time applications \u2014 fraud detection, dynamic pricing, live personalization \u2014 the answer has to be measured in seconds.<\/p>\n<p>Getting feature freshness right is primarily an architectural problem, not a modeling problem. The model doesn&#8217;t control how fresh its inputs are. The pipeline does.<\/p>\n<hr \/>\n<h2>Why Features Go Stale (And Why It Matters)<\/h2>\n<p>A feature is a representation of something that happened in the world: a user clicked something, a transaction was attempted, an inventory level changed. That event occurred at a specific moment in time. The feature value derived from it has a half-life \u2014 a window during which it accurately represents reality.<\/p>\n<p>When the pipeline can&#8217;t deliver features fast enough, the model receives a picture of the world that&#8217;s already out of date. For stationary signals \u2014 a user&#8217;s age, a product&#8217;s category \u2014 staleness is irrelevant. But for behavioral signals \u2014 recent purchase history, session activity, account velocity \u2014 staleness is a direct hit to prediction quality.<\/p>\n<p>Consider fraud detection. A model trained to catch account takeover attempts needs to know what the account has done in the last few minutes, not the last few hours. A batch pipeline refreshing features every two hours is structurally incapable of catching a credential-stuffing attack that executes in 20 minutes. The model isn&#8217;t wrong. The data is wrong.<\/p>\n<p>The same dynamic plays out across recommendation systems (a user&#8217;s interest signal from three hours ago is not the same as their interest signal right now), dynamic pricing (demand changes faster than hourly batch cycles can track), and content moderation (viral spread happens in minutes).<\/p>\n<p><strong>Freshness is a system property, not a model property.<\/strong> Which means the solution lives in the pipeline.<\/p>\n<hr \/>\n<h2>The Two Pipeline Architectures<\/h2>\n<h3>Batch Pipelines: Simple, Reliable, and Structurally Limited<\/h3>\n<p>A batch pipeline computes features on a schedule. A job runs every hour (or every day, or on-demand), reads from a source of truth, computes aggregations and transformations, and writes the results to a feature store for the model to consume at inference time.<\/p>\n<p>Batch pipelines are operationally mature. The tooling is well-understood \u2014 Spark, dbt, Airflow \u2014 and the failure modes are predictable. When a batch job fails, you know about it immediately and you can rerun it. They&#8217;re also cost-efficient: compute runs when you schedule it, not continuously.<\/p>\n<p>Their limitation is structural. The minimum freshness a batch pipeline can deliver is bounded by the job interval. An hourly job delivers features that are, at best, a few minutes old and, at worst, nearly an hour old. For workloads that need sub-minute freshness, no amount of operational optimization changes this fundamental constraint.<\/p>\n<p>Batch pipelines are the right answer when your features don&#8217;t change faster than your batch interval, or when the cost of staleness is low. They&#8217;re the wrong answer when your model depends on recent behavioral signals.<\/p>\n<h3>Streaming Pipelines: Fresh, Continuous, and More Complex to Operate<\/h3>\n<p>A streaming pipeline processes events as they arrive. Rather than computing features on a schedule, it reacts to each event in the source stream \u2014 a user action, a transaction, a sensor reading \u2014 and updates the relevant feature values immediately.<\/p>\n<p>The result is features that are seconds old rather than minutes or hours old. For workloads where that difference matters, streaming is the only viable architecture.<\/p>\n<p>The tradeoff is operational complexity. Streaming systems \u2014 typically built on Kafka for transport and Flink or Spark Structured Streaming for processing \u2014 have more moving parts than batch pipelines. Failures are harder to reason about: what happens to in-flight events when a processing node goes down? How do you handle out-of-order events? How do you test a streaming job end-to-end without a production-like event stream?<\/p>\n<p>These aren&#8217;t reasons to avoid streaming. They&#8217;re reasons to be intentional about when you adopt it, and to invest properly in the operational infrastructure when you do.<\/p>\n<hr \/>\n<h2>The Practical Answer: Lambda Architecture<\/h2>\n<p>Most production systems that need real-time ML don&#8217;t need <em>all<\/em> of their features to be fresh in real time. They need <em>some<\/em> features \u2014 typically behavioral signals \u2014 to be fresh, while relying on batch computation for historical aggregates and slowly-changing dimensions.<\/p>\n<p>This is the insight behind the Lambda architecture pattern, which has become the most widely deployed approach for production ML feature pipelines.<\/p>\n<p>The architecture has two parallel processing paths:<\/p>\n<ul>\n<li><strong>The batch layer<\/strong> computes features over the full historical dataset on a regular schedule. It&#8217;s authoritative, accurate, and complete \u2014 but slow. Features like &#8220;total purchases in the last 90 days&#8221; or &#8220;average session duration over the last 6 months&#8221; live here.<\/p>\n<\/li>\n<li><strong>The speed layer<\/strong> processes the real-time event stream continuously. It computes recent-window features \u2014 &#8220;purchases in the last 5 minutes,&#8221; &#8220;pages viewed in this session&#8221; \u2014 and writes them to the online store with low latency. It covers the gap that the batch layer can&#8217;t.<\/p>\n<\/li>\n<\/ul>\n<p>At serving time, the feature store merges values from both layers. The model sees a unified view: historically-grounded aggregates from the batch layer combined with freshly-computed behavioral signals from the speed layer.<\/p>\n<pre><code>Event Stream \u2500\u2500\u25ba Speed Layer \u2500\u2500\u25ba Online Store \u2500\u2500\u2510\n                                                 \u251c\u2500\u2500\u25ba Model Inference\nHistorical Data \u25ba Batch Layer \u2500\u2500\u25ba Online Store \u2500\u2500\u2518\n<\/code><\/pre>\n<p>The Lambda pattern isn&#8217;t free of complexity \u2014 maintaining two processing paths means two codebases, two sets of failure modes, and the challenge of keeping the definitions consistent between layers. But it&#8217;s a well-understood tradeoff, and the operational complexity is manageable once the architecture is established.<\/p>\n<hr \/>\n<h2>The Staleness Trap: Training-Serving Skew<\/h2>\n<p>No discussion of feature freshness is complete without addressing training-serving skew \u2014 arguably the most dangerous and hardest-to-detect failure mode in real-time ML pipelines.<\/p>\n<p>The problem occurs when the features used to train a model don&#8217;t match the features the model sees at inference time. Not because of a bug, exactly, but because of a subtle mismatch in how features are computed across the two contexts.<\/p>\n<p>The most common cause: <strong>future leakage during training<\/strong>.<\/p>\n<p>When you train a model on historical data, you need to be careful about which features were actually <em>knowable<\/em> at the moment of each training example. If you join feature values carelessly, you can accidentally include information that wasn&#8217;t available yet at the time the label was generated \u2014 what&#8217;s called &#8220;looking into the future.&#8221;<\/p>\n<p>Here&#8217;s a simplified illustration of why this matters:<\/p>\n<pre><code class=\"language-python\"># Naive approach \u2014 likely leaking future data\ntraining_data = events.join(features, on=\"user_id\")\n# Problem: 'features' contains values computed AFTER the event occurred\n\n# Point-in-time correct approach\ntraining_data = events.join(\n    features,\n    on=\"user_id\",\n    how=\"point_in_time\",\n    event_timestamp_col=\"event_time\",\n    feature_timestamp_col=\"feature_created_at\"\n)\n# Only features that existed BEFORE event_time are joined\n<\/code><\/pre>\n<p>The naive join looks correct. The training pipeline runs without errors. The model trains successfully. But the model has learned from a dataset that includes signals it will never have access to at inference time. The result is a model that performs better in offline evaluation than in production \u2014 sometimes dramatically better \u2014 with no obvious explanation.<\/p>\n<p>Point-in-time correct feature retrieval is the solution. It ensures that for each training example, only feature values that were computed <em>before<\/em> that example&#8217;s timestamp are used. Most mature feature store implementations provide this as a first-class operation.<\/p>\n<p>If yours doesn&#8217;t, it&#8217;s worth treating that as a gap to close \u2014 especially if your team has ever looked at a model&#8217;s offline metrics and wondered why production performance didn&#8217;t match.<\/p>\n<hr \/>\n<h2>Backfill Capability: The Feature You Don&#8217;t Think About Until You Need It<\/h2>\n<p>When you retrain a model \u2014 which you will, regularly \u2014 you need training data. That means you need historical feature values: what did the features look like for each training example at the time it was generated?<\/p>\n<p>Batch pipelines handle this naturally. The historical data is already there.<\/p>\n<p>Streaming pipelines are a different story. By definition, streaming features are computed in real time and written to an online store optimized for low-latency point reads. Unless you&#8217;ve explicitly designed for it, there&#8217;s no historical record of what those features looked like at any given moment in the past.<\/p>\n<p>Teams that discover this gap tend to discover it in a painful way: they&#8217;ve built a great real-time feature pipeline, the model is performing well, they want to retrain \u2014 and they realize they have no training data that reflects the streaming features their production model depends on.<\/p>\n<p>Designing for backfill from the start means:<\/p>\n<ul>\n<li><strong>Logging feature values at serving time<\/strong> \u2014 capturing what features were actually served for each prediction, along with timestamps. This creates a training dataset that exactly reflects production serving conditions.<\/li>\n<li><strong>Maintaining a feature log in the offline store<\/strong> \u2014 writing streaming feature values to a durable, queryable store as they&#8217;re computed, not just to the online serving store.<\/li>\n<li><strong>Defining features declaratively<\/strong> \u2014 so that the same transformation logic can be applied to historical data during a backfill run, rather than embedding it in a stateful streaming job that can&#8217;t be easily replayed.<\/li>\n<\/ul>\n<p>The teams that get this right tend to be the ones who thought about retraining before they thought about deployment. The teams that struggle are the ones who optimized for inference first and treated retraining as a future problem.<\/p>\n<hr \/>\n<h2>Feature Reuse: The Organizational Dimension<\/h2>\n<p>One aspect of feature pipelines that rarely gets enough attention in architecture discussions is the organizational cost of feature redundancy.<\/p>\n<p>In most data science organizations that have grown organically, the same feature \u2014 a user&#8217;s 30-day purchase total, for example \u2014 is computed independently by multiple teams for multiple models. Each team owns their own pipeline. Each pipeline uses a slightly different definition. The results are close, but not identical.<\/p>\n<p>This creates several categories of problems:<\/p>\n<ul>\n<li><strong>Compute waste<\/strong>: The same aggregation is being run multiple times against the same source data.<\/li>\n<li><strong>Definitional drift<\/strong>: When the source data schema changes, some pipelines get updated and others don&#8217;t. Features with the same name start returning different values.<\/li>\n<li><strong>Cross-model inconsistency<\/strong>: Two models that should share the same user signal are actually seeing different values, making it impossible to reason clearly about why their predictions diverge.<\/li>\n<\/ul>\n<p>A centralized feature store with a shared feature registry addresses this by making features first-class, named, versioned artifacts \u2014 not private implementation details of individual model pipelines. Teams can discover existing features before building new ones, reuse definitions with confidence, and consume the same computed values rather than running redundant jobs.<\/p>\n<p>This is as much a governance and process problem as a technical one. The technical infrastructure makes reuse possible; the organizational practices make it happen.<\/p>\n<hr \/>\n<h2>Designing for Freshness: A Decision Framework<\/h2>\n<p>Before choosing a pipeline architecture, answer these questions:<\/p>\n<p><strong>1. What is the maximum acceptable feature age at inference time?<\/strong><br \/>\nIf the answer is hours, batch may be sufficient. If it&#8217;s minutes, you need a fast batch cycle or light streaming. If it&#8217;s seconds, you need full streaming.<\/p>\n<p><strong>2. Which features are freshness-sensitive?<\/strong><br \/>\nNot all features need to be fresh. Identify the behavioral signals that lose value quickly, and design the streaming path around those specifically.<\/p>\n<p><strong>3. Can you enforce point-in-time correctness in training?<\/strong><br \/>\nIf not, your offline evaluation metrics are unreliable. Fix this before you trust any model performance numbers.<\/p>\n<p><strong>4. Have you designed for backfill?<\/strong><br \/>\nIf you can&#8217;t reconstruct historical feature values for retraining, your streaming pipeline is missing a critical capability.<\/p>\n<p><strong>5. Is feature logic shared or siloed?<\/strong><br \/>\nIf multiple teams are computing the same features independently, the organizational cost will compound over time.<\/p>\n<p>Answering these questions honestly surfaces the gaps that will cause problems at scale. The architecture choices that follow from them are usually straightforward. The hard part is asking before you&#8217;re in production.<\/p>\n<hr \/>\n<p>In the next post, we&#8217;ll move downstream from the pipeline to the feature store itself \u2014 the operational hub that sits between feature computation and model inference, and where consistency and latency collide at scale.<\/p>\n<h3>When Your AI Pipeline Grows Up Series<\/h3>\n<ul>\n<li><a href=\"https:\/\/www.kenwalger.com\/blog\/ai\/when-your-ai-pipeline-grows-up-infrastructure-thinking-for-real-time-inference-at-scale\">Real Time AI at Scale<\/a><\/li>\n<li>Feature Freshness &#8211; <em>This Post.<\/em><\/li>\n<li>Feature Store &#8211; <em>Coming Soon.<\/em><\/li>\n<\/ul>\n<a class=\"synved-social-button synved-social-button-share synved-social-size-48 synved-social-resolution-single synved-social-provider-facebook nolightbox\" data-provider=\"facebook\" target=\"_blank\" rel=\"nofollow\" title=\"Share on Facebook\" href=\"https:\/\/www.facebook.com\/sharer.php?u=https%3A%2F%2Fwww.kenwalger.com%2Fblog%2Fwp-json%2Fwp%2Fv2%2Fposts%2F1532&#038;t=Feature%20Freshness%3A%20Designing%20Pipelines%20That%20Keep%20Up%20With%20the%20World&#038;s=100&#038;p&#091;url&#093;=https%3A%2F%2Fwww.kenwalger.com%2Fblog%2Fwp-json%2Fwp%2Fv2%2Fposts%2F1532&#038;p&#091;images&#093;&#091;0&#093;=https%3A%2F%2Fwww.kenwalger.com%2Fblog%2Fwp-content%2Fuploads%2F2026%2F04%2Fblog-of-ken-w.-alger-69f0d0be1ad2d.png&#038;p&#091;title&#093;=Feature%20Freshness%3A%20Designing%20Pipelines%20That%20Keep%20Up%20With%20the%20World\" style=\"font-size: 0px;width:48px;height:48px;margin:0;margin-bottom:5px;margin-right:5px\"><img loading=\"lazy\" decoding=\"async\" alt=\"Facebook\" title=\"Share on Facebook\" class=\"synved-share-image synved-social-image synved-social-image-share\" width=\"48\" height=\"48\" style=\"display: inline;width:48px;height:48px;margin: 0;padding: 0;border: none;box-shadow: none\" src=\"https:\/\/www.kenwalger.com\/blog\/wp-content\/plugins\/social-media-feather\/synved-social\/image\/social\/regular\/96x96\/facebook.png\" \/><\/a><a class=\"synved-social-button synved-social-button-share synved-social-size-48 synved-social-resolution-single synved-social-provider-twitter nolightbox\" data-provider=\"twitter\" target=\"_blank\" rel=\"nofollow\" title=\"Share on Twitter\" href=\"https:\/\/twitter.com\/intent\/tweet?url=https%3A%2F%2Fwww.kenwalger.com%2Fblog%2Fwp-json%2Fwp%2Fv2%2Fposts%2F1532&#038;text=Hey%20check%20this%20out\" style=\"font-size: 0px;width:48px;height:48px;margin:0;margin-bottom:5px;margin-right:5px\"><img loading=\"lazy\" decoding=\"async\" alt=\"twitter\" title=\"Share on Twitter\" class=\"synved-share-image synved-social-image synved-social-image-share\" width=\"48\" height=\"48\" style=\"display: inline;width:48px;height:48px;margin: 0;padding: 0;border: none;box-shadow: none\" src=\"https:\/\/www.kenwalger.com\/blog\/wp-content\/plugins\/social-media-feather\/synved-social\/image\/social\/regular\/96x96\/twitter.png\" \/><\/a><a class=\"synved-social-button synved-social-button-share synved-social-size-48 synved-social-resolution-single synved-social-provider-reddit nolightbox\" data-provider=\"reddit\" target=\"_blank\" rel=\"nofollow\" title=\"Share on Reddit\" href=\"https:\/\/www.reddit.com\/submit?url=https%3A%2F%2Fwww.kenwalger.com%2Fblog%2Fwp-json%2Fwp%2Fv2%2Fposts%2F1532&#038;title=Feature%20Freshness%3A%20Designing%20Pipelines%20That%20Keep%20Up%20With%20the%20World\" style=\"font-size: 0px;width:48px;height:48px;margin:0;margin-bottom:5px;margin-right:5px\"><img loading=\"lazy\" decoding=\"async\" alt=\"reddit\" title=\"Share on Reddit\" class=\"synved-share-image synved-social-image synved-social-image-share\" width=\"48\" height=\"48\" style=\"display: inline;width:48px;height:48px;margin: 0;padding: 0;border: none;box-shadow: none\" src=\"https:\/\/www.kenwalger.com\/blog\/wp-content\/plugins\/social-media-feather\/synved-social\/image\/social\/regular\/96x96\/reddit.png\" \/><\/a><a class=\"synved-social-button synved-social-button-share synved-social-size-48 synved-social-resolution-single synved-social-provider-linkedin nolightbox\" data-provider=\"linkedin\" target=\"_blank\" rel=\"nofollow\" title=\"Share on Linkedin\" href=\"https:\/\/www.linkedin.com\/shareArticle?mini=true&#038;url=https%3A%2F%2Fwww.kenwalger.com%2Fblog%2Fwp-json%2Fwp%2Fv2%2Fposts%2F1532&#038;title=Feature%20Freshness%3A%20Designing%20Pipelines%20That%20Keep%20Up%20With%20the%20World\" style=\"font-size: 0px;width:48px;height:48px;margin:0;margin-bottom:5px;margin-right:5px\"><img loading=\"lazy\" decoding=\"async\" alt=\"linkedin\" title=\"Share on Linkedin\" class=\"synved-share-image synved-social-image synved-social-image-share\" width=\"48\" height=\"48\" style=\"display: inline;width:48px;height:48px;margin: 0;padding: 0;border: none;box-shadow: none\" src=\"https:\/\/www.kenwalger.com\/blog\/wp-content\/plugins\/social-media-feather\/synved-social\/image\/social\/regular\/96x96\/linkedin.png\" \/><\/a><a class=\"synved-social-button synved-social-button-share synved-social-size-48 synved-social-resolution-single synved-social-provider-mail nolightbox\" data-provider=\"mail\" rel=\"nofollow\" title=\"Share by email\" href=\"mailto:?subject=Feature%20Freshness%3A%20Designing%20Pipelines%20That%20Keep%20Up%20With%20the%20World&#038;body=Hey%20check%20this%20out:%20https%3A%2F%2Fwww.kenwalger.com%2Fblog%2Fwp-json%2Fwp%2Fv2%2Fposts%2F1532\" style=\"font-size: 0px;width:48px;height:48px;margin:0;margin-bottom:5px\"><img loading=\"lazy\" decoding=\"async\" alt=\"mail\" title=\"Share by email\" class=\"synved-share-image synved-social-image synved-social-image-share\" width=\"48\" height=\"48\" style=\"display: inline;width:48px;height:48px;margin: 0;padding: 0;border: none;box-shadow: none\" src=\"https:\/\/www.kenwalger.com\/blog\/wp-content\/plugins\/social-media-feather\/synved-social\/image\/social\/regular\/96x96\/mail.png\" \/><\/a>","protected":false},"excerpt":{"rendered":"<p>In the previous post, we identified three categories of pressure that expose architectural weaknesses when AI pipelines scale: load variability, data velocity, and index drift. This post is about data velocity \u2014 specifically, the feature freshness problem. The core question is deceptively simple: how old is the data your model is reasoning about when it &hellip; <a href=\"https:\/\/www.kenwalger.com\/blog\/ai\/feature-freshness-designing-pipelines-that-keep-up-with-the-world\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Feature Freshness: Designing Pipelines That Keep Up With the World&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":1557,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"pmpro_default_level":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[1669],"tags":[],"yst_prominent_words":[1061,90,290,420,967],"class_list":["post-1532","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","pmpro-has-access"],"jetpack_featured_media_url":"https:\/\/www.kenwalger.com\/blog\/wp-content\/uploads\/2026\/04\/blog-of-ken-w.-alger-69f0d0be1ad2d.png","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8lx70-oI","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/1532","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/comments?post=1532"}],"version-history":[{"count":4,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/1532\/revisions"}],"predecessor-version":[{"id":1551,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/1532\/revisions\/1551"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/media\/1557"}],"wp:attachment":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/media?parent=1532"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/categories?post=1532"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/tags?post=1532"},{"taxonomy":"yst_prominent_words","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/yst_prominent_words?post=1532"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}