{"id":1529,"date":"2026-05-06T08:45:00","date_gmt":"2026-05-06T15:45:00","guid":{"rendered":"https:\/\/www.kenwalger.com\/blog\/?p=1529"},"modified":"2026-05-06T13:40:00","modified_gmt":"2026-05-06T20:40:00","slug":"when-your-ai-pipeline-grows-up-infrastructure-thinking-for-real-time-inference-at-scale","status":"publish","type":"post","link":"https:\/\/www.kenwalger.com\/blog\/ai\/when-your-ai-pipeline-grows-up-infrastructure-thinking-for-real-time-inference-at-scale\/","title":{"rendered":"When Your AI Pipeline Grows Up: Infrastructure Thinking for Real-Time Inference at Scale"},"content":{"rendered":"<p>There&#8217;s a familiar arc in AI development. A team builds a model, wires up a pipeline, and ships it. It works. In the demo, it&#8217;s fast. Features arrive cleanly, predictions feel fresh, vector search returns sensible results. Everyone is happy.<\/p>\n<p>Then production happens.<\/p>\n<p>Latencies spike unpredictably. Features arrive stale. The vector index that performed beautifully at 100K records starts degrading at 10M. The system that hummed in development begins to wheeze under real load. The model hasn&#8217;t changed. The accuracy metrics still look fine. But the <em>system<\/em> is struggling \u2014 and accuracy is no longer the only thing that matters.<\/p>\n<p>This post is about what comes after model accuracy: the infrastructure concerns that determine whether your real-time AI actually works in production at scale.<\/p>\n<hr \/>\n<h2>The Gap Between Dev and Prod<\/h2>\n<p>Most ML pipelines are designed around a happy-path assumption: data is clean, features are fresh, requests arrive at a manageable pace, and the compute resources you provisioned are enough. These assumptions hold in development. They rarely hold at scale.<\/p>\n<p>The production environment introduces three categories of pressure that expose architectural weaknesses:<\/p>\n<p><strong>1. Load variability.<\/strong> Traffic is never flat. 
Real-world AI workloads spike \u2014 product launches, viral events, end-of-quarter reporting rushes, user behavior patterns tied to time zones. A pipeline that performs at P50 doesn&#8217;t guarantee acceptable behavior at P99. And P99 is where your users live when things go wrong.<\/p>\n<p><strong>2. Data velocity.<\/strong> Features go stale. The world changes faster than batch refresh cycles. For recommendation systems, fraud detection, personalization engines, and anything that depends on recent behavioral signals, a feature value that&#8217;s 15 minutes old can be meaningfully worse than one that&#8217;s 15 seconds old. The gap between feature generation and model consumption is a direct contributor to prediction quality degradation.<\/p>\n<p><strong>3. Index drift.<\/strong> Vector search is not a set-it-and-forget-it operation. As your embedding space grows and evolves \u2014 new documents, updated products, revised knowledge bases \u2014 the indices that power semantic search require continuous maintenance. Approximate Nearest Neighbor (ANN) indices in particular degrade in relevance and response time as the data distribution shifts underneath them.<\/p>\n<p>Understanding these three pressures is the first step toward designing systems that survive them.<\/p>\n<hr \/>\n<h2>What &#8220;Real-Time&#8221; Actually Requires<\/h2>\n<p>&#8220;Real-time AI&#8221; is an overloaded term. Before you can design for it, you need to be precise about what it means in your context. 
There are at least three meaningful tiers:<\/p>\n<ul>\n<li><strong>Near-real-time (seconds to minutes):<\/strong> Acceptable for many analytics, batch recommendation refreshes, and reporting use cases.<\/li>\n<li><strong>Low-latency (sub-second):<\/strong> Required for interactive recommendation, search, and user-facing personalization.<\/li>\n<li><strong>Streaming real-time (milliseconds):<\/strong> Required for fraud detection, financial trading signals, and reactive safety systems.<\/li>\n<\/ul>\n<p>Each tier demands different architectural choices. A feature store that works beautifully for near-real-time refreshes may be completely unsuited for millisecond-latency inference. The first architectural question to ask isn&#8217;t <em>&#8220;how do we get features?&#8221;<\/em> \u2014 it&#8217;s <em>&#8220;what does real-time actually mean for this workload?&#8221;<\/em><\/p>\n<p>Once you&#8217;ve answered that, you can reason about the pipeline design.<\/p>\n<hr \/>\n<h2>The Three Architectural Pillars<\/h2>\n<h3>1. Feature Freshness: Designing for the Speed of Your Signal<\/h3>\n<p>The feature pipeline is where most latency and staleness problems originate. There are two broad architectures:<\/p>\n<p><strong>Batch feature pipelines<\/strong> compute features on a schedule \u2014 hourly, daily, or on-demand \u2014 and write them to a feature store. They&#8217;re operationally simple and cost-efficient. They&#8217;re also structurally incapable of delivering fresh signals for low-latency workloads.<\/p>\n<p><strong>Streaming feature pipelines<\/strong> compute features continuously as events arrive, using frameworks like Apache Kafka, Apache Flink, or Spark Structured Streaming. They&#8217;re more complex to build and operate, but they&#8217;re the only viable path when your model needs to reason about what happened in the last 30 seconds.<\/p>\n<p>The practical reality is that most production systems need both. 
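<\/p>\n<p>That combination can be seen in miniature below. This is a minimal Python sketch, with hypothetical in-memory stand-ins (<code>BATCH_FEATURES<\/code>, <code>STREAM_FEATURES<\/code>, and the feature names) rather than any real feature-store API: batch aggregates form the base of the serving-time feature vector, and streaming signals are layered on only while they are fresh.<\/p>

```python
from datetime import datetime, timedelta, timezone

# Illustrative in-memory stand-ins for the two feature paths; in a
# real system these would be an offline store and an online store.
BATCH_FEATURES = {
    'user_42': {'purchases_90d': 17, 'avg_order_value': 31.50},
}
STREAM_FEATURES = {
    'user_42': {'clicks_30s': 4, 'updated_at': datetime.now(timezone.utc)},
}

def feature_vector(entity_id, max_stream_age=timedelta(seconds=60)):
    # Start from the slow-moving batch aggregates.
    features = dict(BATCH_FEATURES.get(entity_id, {}))
    # Layer streaming signals on top, but only while they are fresh;
    # past the tolerance, fall back to an explicit default rather
    # than silently serving a stale value.
    stream = STREAM_FEATURES.get(entity_id)
    fresh = stream and datetime.now(timezone.utc) - stream['updated_at'] <= max_stream_age
    features['clicks_30s'] = stream['clicks_30s'] if fresh else 0
    return features

print(feature_vector('user_42'))
```

<p>The freshness check is the important part: past the tolerance, the streaming value falls back to an explicit default instead of being served stale.<\/p>\n<p>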
A <em>Lambda architecture<\/em> pattern \u2014 combining batch for historical aggregates with streaming for real-time signals \u2014 gives you the freshness of streaming where it matters without abandoning the reliability and richness of batch-computed features.<\/p>\n<p>Key design decisions in feature pipelines:<\/p>\n<ul>\n<li><strong>Point-in-time correctness<\/strong>: Features used for training must reflect what the system would have known at the moment of prediction \u2014 not values computed with hindsight. Failure to enforce this introduces training-serving skew, one of the most insidious sources of silent model degradation.<\/li>\n<li><strong>Backfill capability<\/strong>: Can your streaming pipeline reconstruct historical features when you retrain? Architectures that can&#8217;t backfill trade away long-term flexibility for short-term simplicity.<\/li>\n<li><strong>Feature reuse<\/strong>: The same feature \u2014 a user&#8217;s 7-day purchase count, for example \u2014 is often needed by multiple models. Centralizing feature computation prevents redundant infrastructure and inconsistent definitions across teams.<\/li>\n<\/ul>\n<hr \/>\n<h3>2. The Feature Store: Consistency and Latency at Scale<\/h3>\n<p>A feature store is the operational hub of a real-time ML system. It serves as the bridge between feature computation (where data scientists live) and model inference (where production systems live). Getting its design right has outsized consequences.<\/p>\n<p>The central tension in feature store design is between <strong>consistency<\/strong> and <strong>latency<\/strong>. Achieving both simultaneously at scale is genuinely hard.<\/p>\n<p><strong>The dual-store pattern<\/strong> is the most widely adopted solution. It separates storage into two layers:<\/p>\n<ul>\n<li>An <strong>online store<\/strong> \u2014 typically an in-memory or low-latency key-value store \u2014 serves features at inference time. Reads must be fast, often sub-millisecond. 
The tradeoff is cost: fast storage is expensive, so online stores typically hold only the most recent feature values.<\/li>\n<li>An <strong>offline store<\/strong> \u2014 typically a columnar data warehouse \u2014 serves training pipelines, batch scoring, and historical analysis. Reads are slower but the storage cost is orders of magnitude lower.<\/li>\n<\/ul>\n<p>A write path synchronizes values between the two stores as new features are computed.<\/p>\n<p><strong>Consistency pitfalls to design against:<\/strong><\/p>\n<ul>\n<li><strong>Training-serving skew<\/strong>: If the offline store and online store derive features differently \u2014 even slightly \u2014 your model is trained on data that doesn&#8217;t match what it sees in production. This is silent and difficult to detect.<\/li>\n<li><strong>Schema drift<\/strong>: Features evolve. Adding a new feature, changing a transformation, or retiring a deprecated one all require careful version management. Feature stores without explicit schema governance accumulate technical debt that eventually manifests as production incidents.<\/li>\n<li><strong>Cold start<\/strong>: When a new entity (a new user, a new product) arrives with no feature history, what does the model see? Null-handling and default value strategy belong in the feature store design, not as afterthoughts.<\/li>\n<\/ul>\n<p><strong>Access pattern design:<\/strong><\/p>\n<p>Feature retrieval for inference often involves batch point lookups \u2014 fetching dozens of feature values for a single entity across multiple feature groups simultaneously. The data model and indexing strategy of your online store must be optimized for this access pattern, not for the range scans and aggregations that suit an offline analytical store.<\/p>\n<hr \/>\n<h3>3. Vector Search at Scale: Maintaining Performance Under Continuous Change<\/h3>\n<p>Vector databases and ANN search have moved from research curiosity to production infrastructure in a remarkably short time. 
They&#8217;re now central to RAG (Retrieval-Augmented Generation) pipelines, semantic search, recommendation systems, and multimodal applications. And they introduce a class of operational problems that most teams underestimate.<\/p>\n<p><strong>The index degradation problem<\/strong><\/p>\n<p>ANN indices \u2014 HNSW, IVF, and their variants \u2014 are built for approximate search speed, not for correctness under mutation. They&#8217;re typically optimized at build time for a specific data distribution. As you add, update, and delete vectors continuously, several things happen:<\/p>\n<ul>\n<li><strong>Recall degrades<\/strong>: The approximation quality drops as the index structure diverges from the actual data distribution.<\/li>\n<li><strong>Latency increases<\/strong>: More nodes are traversed during search as the graph structure becomes less optimal.<\/li>\n<li><strong>Tombstone accumulation<\/strong>: Deleted vectors that aren&#8217;t fully purged create phantom results and slow index traversal.<\/li>\n<\/ul>\n<p>The naive solution \u2014 periodic full index rebuilds \u2014 introduces its own problems: rebuild latency, resource contention during the rebuild window, and the risk of serving stale or inconsistent results during transitions.<\/p>\n<p>More sophisticated approaches include:<\/p>\n<ul>\n<li><strong>Incremental indexing<\/strong>: Adding new vectors to the live index rather than rebuilding from scratch, trading some approximation quality for operational continuity.<\/li>\n<li><strong>Segment-based architectures<\/strong>: Maintaining multiple smaller index segments that are merged periodically, similar to how LSM-tree databases manage compaction. 
Fresh vectors land in small, easily-rebuilt segments; cold vectors live in stable, large segments.<\/li>\n<li><strong>Recall monitoring<\/strong>: Treating recall as an operational metric \u2014 not just a benchmark number \u2014 and triggering maintenance operations when it drops below acceptable thresholds.<\/li>\n<\/ul>\n<p><strong>Filtering and hybrid search<\/strong><\/p>\n<p>Production vector search is rarely pure semantic similarity. Real workloads layer metadata filters on top of vector similarity: find the most relevant product <em>in a user&#8217;s country<\/em>, find the most similar document <em>within a specific category<\/em>, find semantically related customers <em>above a revenue threshold<\/em>.<\/p>\n<p>Pre-filtering and post-filtering strategies have meaningfully different performance and correctness profiles. Pre-filtering (restricting the candidate set before ANN search) is faster but can miss relevant results if the filter is highly selective. Post-filtering (running ANN search broadly, then applying filters) is more complete but wastes compute. 
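<\/p>\n<p>The contrast is easy to demonstrate in miniature. The sketch below substitutes a brute-force scan over a toy in-memory corpus for a real ANN index, and the corpus, field names, and <code>overfetch<\/code> factor are all illustrative, but the candidate-set logic is the same:<\/p>

```python
# Toy in-memory corpus standing in for a vector index; a real ANN
# index replaces the brute-force scan, but the candidate-set logic
# contrasted below is the same.
CORPUS = [
    {'id': 'a', 'vec': [1.0, 0.0], 'country': 'US'},
    {'id': 'b', 'vec': [0.9, 0.1], 'country': 'DE'},
    {'id': 'c', 'vec': [0.8, 0.2], 'country': 'US'},
    {'id': 'd', 'vec': [0.0, 1.0], 'country': 'US'},
]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pre_filtered(query, country, k=2):
    # Restrict the candidate set first, then rank only what survives.
    candidates = [doc for doc in CORPUS if doc['country'] == country]
    ranked = sorted(candidates, key=lambda doc: dot(query, doc['vec']), reverse=True)
    return [doc['id'] for doc in ranked[:k]]

def post_filtered(query, country, k=2, overfetch=3):
    # Rank broadly first, then filter; the overfetch factor is the
    # compute spent on candidates that the filter may discard.
    ranked = sorted(CORPUS, key=lambda doc: dot(query, doc['vec']), reverse=True)
    return [doc['id'] for doc in ranked[:k * overfetch] if doc['country'] == country][:k]

print(pre_filtered([1.0, 0.0], 'US'))                # -> ['a', 'c']
print(post_filtered([1.0, 0.0], 'US'))               # -> ['a', 'c']
print(post_filtered([1.0, 0.0], 'US', overfetch=1))  # -> ['a'] (missed 'c')
```

<p>With too little overfetch, the post-filtering path silently drops relevant results \u2014 the kind of behavior worth validating against your real query distribution.<\/p>\n<p>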
The right approach depends on your data distribution and selectivity characteristics \u2014 and it needs to be a deliberate architectural choice, not a default.<\/p>\n<hr \/>\n<h2>A Framework for Evaluating Your Pipeline<\/h2>\n<p>Before committing to architectural decisions, it&#8217;s worth stress-testing your current or planned design against these questions:<\/p>\n<p><strong>On feature freshness:<\/strong><br \/>\n&#8211; What is the maximum acceptable age of each feature at inference time?<br \/>\n&#8211; Do you have a streaming path for high-velocity signals?<br \/>\n&#8211; Is training-serving skew actively monitored?<\/p>\n<p><strong>On the feature store:<\/strong><br \/>\n&#8211; Can you retrieve all features for a single inference request in a single round-trip?<br \/>\n&#8211; Is your schema versioned and your transformation logic reproducible?<br \/>\n&#8211; What happens when a new entity arrives with no feature history?<\/p>\n<p><strong>On vector search:<\/strong><br \/>\n&#8211; Do you track recall as a production metric?<br \/>\n&#8211; How do you handle index updates without full rebuilds?<br \/>\n&#8211; Is your filtering strategy validated against your actual query distribution?<\/p>\n<p><strong>On the system as a whole:<\/strong><br \/>\n&#8211; What is your P99 latency SLA, and have you load-tested to it?<br \/>\n&#8211; Where are your single points of failure?<br \/>\n&#8211; Can you replay or backfill features and embeddings if a component fails?<\/p>\n<p>These aren&#8217;t hypothetical questions. Each one corresponds to a category of production incident that real teams have encountered when real-time AI systems scaled beyond their original design envelope.<\/p>\n<hr \/>\n<h2>The Shift in Mindset<\/h2>\n<p>Scaling real-time AI infrastructure requires a shift in how engineering teams think about the problem.<\/p>\n<p>In early development, the model is the system. Accuracy is the primary metric. 
Everything else is scaffolding.<\/p>\n<p>At scale, the <em>pipeline<\/em> is the system. The model is one component \u2014 important, but dependent on everything that surrounds it. Latency, freshness, consistency, and recall become first-class engineering concerns, tracked with the same rigor as model performance metrics.<\/p>\n<p>The teams that make this transition successfully are the ones that start treating their feature pipelines, feature stores, and vector indices not as data infrastructure afterthoughts, but as the production systems they actually are \u2014 with SLAs, observability, capacity planning, and failure modes worth designing against from the start.<\/p>\n<p>Real-time AI at scale is harder than it looks. But it&#8217;s not mysterious. The problems are identifiable, the architectural patterns are well-understood, and the path forward is clear once you&#8217;re asking the right questions.<\/p>\n<hr \/>\n<p><em>This post is part of an ongoing series on building production-grade AI systems. 
If you found this useful, consider sharing it with a teammate who&#8217;s hitting these problems for the first time.<\/em><\/p>\n<h3>When Your AI Pipeline Grows Up<\/h3>\n<ul>\n<li>Real Time AI At Scale \u2013 <em>This Post.<\/em><\/li>\n<li>Feature Freshness \u2013 <em>Coming 13 May 2026<\/em><\/li>\n<li>Feature Store &#8211; <em>Coming 20 May 2026<\/em><\/li>\n<li>Vector Search &#8211; <em>Coming 27 May 2026<\/em><\/li>\n<li>Operations &#8211; <em>Coming 3 June 2026<\/em><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>There&#8217;s a familiar arc in AI development. A team builds a model, wires up a pipeline, and ships it. It works. In the demo, it&#8217;s fast. Features arrive cleanly, predictions feel fresh, vector search returns sensible results. Everyone is happy. Then production happens. Latencies spike unpredictably. Features arrive stale. The vector index that performed beautifully &hellip; <a href=\"https:\/\/www.kenwalger.com\/blog\/ai\/when-your-ai-pipeline-grows-up-infrastructure-thinking-for-real-time-inference-at-scale\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;When Your AI Pipeline Grows Up: Infrastructure Thinking for Real-Time Inference at Scale&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":1558,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"pmpro_default_level":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[1669],"tags":[],"yst_prominent_words":[476,1061,90,290,715],"class_list":["post-1529","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai","pmpro-has-access"],"jetpack_featured_media_url":"https:\/\/www.kenwalger.com\/blog\/wp-content\/uploads\/2026\/04\/blog-of-ken-w.-alger-69f0d0d42c27f.png","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8lx70-oF","jetpack-related-posts":[],"_links":{"self":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/1529","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/
www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/comments?post=1529"}],"version-history":[{"count":6,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/1529\/revisions"}],"predecessor-version":[{"id":1600,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/posts\/1529\/revisions\/1600"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/media\/1558"}],"wp:attachment":[{"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/media?parent=1529"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/categories?post=1529"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/tags?post=1529"},{"taxonomy":"yst_prominent_words","embeddable":true,"href":"https:\/\/www.kenwalger.com\/blog\/wp-json\/wp\/v2\/yst_prominent_words?post=1529"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}