Pattern catalog
Every Fowler-style pattern across the series, 76 in all.
BM25 retrieval
Retrieve documents based on lexical overlap between query and document, with scoring that accounts for term frequency saturation and document length normalization in a way that pro…
Cross-encoder reranking
Apply joint query-document attention to rank candidate documents with much higher precision than independent embedding similarity, accepting higher per-document cost in exchange fo…
Dense vector retrieval (HNSW, IVF)
Retrieve documents based on semantic similarity by encoding queries and documents into a shared embedding space and finding nearest neighbors in that space, with approximate algori…
Federated multi-index search
Search across multiple separate indexes — different content types, different domains, different geographical or organizational boundaries — and combine the results into a coherent …
Intent-based query routing
Route each incoming query to the most appropriate retrieval pipeline based on classified intent (navigational, informational, conversational, transactional), producing better per-q…
Phrase and proximity matching
Boost or restrict matches based on the proximity and ordering of query terms within documents, capturing phrase semantics that bag-of-words scoring loses.
Query result caching strategies
Reduce query latency and infrastructure cost by caching results at appropriate granularity: full result sets, intermediate retrieval candidates, filter results, analyzer outputs, e…
Reciprocal Rank Fusion (RRF)
Combine ranked results from multiple retrieval methods (lexical, dense, sparse-learned) into a single ranked list, using only the rank positions rather than the raw scores, produci…
Search engineering communities and references
Provide pointers to the active sources of search engineering knowledge across academic research, practitioner literature, vendor documentation, and community gatherings.
Sparse-learned retrieval (SPLADE, BGE-sparse)
Retrieve documents using sparse vector representations — where each dimension corresponds to a vocabulary term — with learned weights that include implicit term expansion, bridging…
Two-stage retrieve-and-rerank
Apply expensive ranking methods (LTR, cross-encoder rerankers, personalization) to a small candidate set produced by cheap first-stage retrieval, achieving better top-K quality tha…
User and session context injection
Adjust retrieval and ranking based on signals available at query time — user history, current session context, locale, device, time of day — to produce results more relevant to the…
Weighted hybrid scoring
Combine scores from multiple retrieval methods using explicit per-method weights, supporting per-query-type tuning and integration with learned ranking models that need calibrated …
Context-aware query understanding
Adjust query understanding outputs — intent classification, entity linking, synonym expansion — based on context signals about the user, session, and environment, producing per-use…
Edit-distance and phonetic spell correction
Identify and correct misspelled query tokens by finding nearby dictionary entries via Levenshtein/Damerau-Levenshtein distance or phonetic encoding, recovering queries that would o…
Intent classification across rule, ML, and LLM approaches
Classify each query into intent classes with confidence scores, supporting downstream routing decisions and providing features for ranking models.
Named entity recognition and entity linking for search
Extract structured entity information (brands, products, categories, attributes, locations) from natural-language queries and link the extracted entities to IDs in the company's ca…
Resources for tracking query understanding discipline
Provide pointers to the active sources of query understanding knowledge across NLP, IR, ML, and production practice.
Stop words, query reformulation, and query reduction
Improve retrieval quality by transforming the user's query — removing low-value tokens, reformulating into structured forms, reducing overly-long queries — in ways that improve mat…
Synonym management and query expansion strategies
Expand queries (or documents) with related terms so matches succeed despite vocabulary mismatch between user queries and document content, using a combination of manual, learned, a…
The Lucene-style analyzer chain
Process query and document text into matchable tokens using a configurable chain of character-level filters, tokenization, and per-token transformations, with the same chain applie…
Blue/green reindexing with index aliases
Evolve indices (schema changes, analyzer changes, embedding model changes) without downtime by building a new index version alongside the live one, reindexing all data, and atomica…
Chunking strategies for production vector retrieval
Choose a chunking strategy appropriate to the document type and retrieval needs, producing chunks that maximize retrieval quality at acceptable index size and indexing cost.
LLM-based attribute extraction at index time
Extract structured signals from raw document content using LLM-based processing at index time, producing fields that retrieval can filter on and ranking can use as features.
Multi-modal embedding for cross-modal search
Index documents that combine text, images, audio, or video by extracting embeddings from each modality and storing them as separate vector fields, supporting retrieval that matches…
Production embedding strategies and multi-vector schemas
Choose embedding model, content representation, and field structure to produce high-quality vector representations that retrieval can use effectively across the diverse queries the…
Production schema design with sub-fields and multi-mode matching
Design a document schema where each field's type, analyzer, and storage decisions support the specific query behaviors the system needs to handle, using sub-field patterns to suppo…
Resources for tracking indexing and document engineering discipline
Provide pointers to the active sources of indexing knowledge across IR, NLP, ML, RAG, and production practice.
Symmetric and asymmetric index-time analysis
Apply analyzer chains at index time to produce the tokens that retrieval will match against, with deliberate choices about whether to use the same chain at query time (symmetric) o…
BM25 family in production depth
Apply BM25 and its production variants to score query-document pairs in ways that work as standalone first-stage retrieval scoring and as input features to learning-to-rank models.
Cross-encoder reranking in production
Apply transformer-based cross-encoder scoring to a small candidate set to produce substantially higher top-K quality than feature-based LTR can achieve, accepting the higher comput…
Feature engineering and ablation methodology
Build a feature set that contributes meaningfully to ranking quality, validate each feature's value through ablation, and manage the feature pipeline at production scale with consi…
LambdaMART and gradient-boosted decision tree LTR
Train a learned ranking model from labeled training data that combines many features (50–500 typical) into per-document scores optimized for ranking metrics (NDCG, MAP) rather than…
Late-interaction models (ColBERT family)
Bridge the cost-quality gap between bi-encoder retrieval (fast but lower quality) and cross-encoder reranking (high quality but expensive) by pre-computing document representations…
Maximal Marginal Relevance (MMR) and diversification
Produce ranked result lists that balance relevance to the query against diversity of results, addressing the failure mode where pure-relevance ranking surfaces clusters of similar …
Multi-objective ranking with weighted combination and business rules
Produce final ranking that balances relevance with other objectives (freshness, diversity, business goals) through explicit weighting that can be tuned per query class and validate…
Personalization features in ranking pipelines
Adjust ranking based on context features that capture who the user is, what they've done recently, and their current environment, producing per-user-per-query ranking that outperfo…
Pointwise, pairwise, and listwise loss functions
Choose the right framing of the ranking problem as a machine learning task: pointwise (regression per document), pairwise (preference classification per pair), or listwise (loss ov…
Resources for tracking ranking and relevance discipline
Provide pointers to the active sources of ranking and relevance knowledge: foundational texts, academic and industry venues, practitioner writing, open-source tools, communities.
Vector similarity scoring
Score query-document pairs by similarity in a learned embedding space, where queries and documents are encoded as dense vectors and similarity captures semantic relationships beyon…
A/B testing for search
Measure whether a candidate search system produces better real-user outcomes than the current system by splitting production traffic and comparing per-user metrics with statistical…
Click models for bias correction (PBM, Cascade, DBN)
Model the probability that a user clicks a result as a function of the result's relevance and its position (and other presentation features), so that observed clicks can be decompo…
Custom business metrics for search
Measure search quality through metrics that map directly to business outcomes — revenue, conversion, task completion, satisfaction — rather than only through academic proxy metrics…
Explicit expert labeling
Produce high-quality relevance judgments by using assessors who understand the domain, the relevance definition, and the edge cases, accepting higher cost in exchange for higher qu…
Golden query sets and continuous evaluation
Detect search quality regressions automatically by running curated query sets against the current system frequently (daily or per-deployment) and alerting when metrics fall outside…
Implicit signals and click-based judgments
Extract relevance signal from production user behavior at scale, accepting that the signal is biased and requires modeling to interpret correctly, in exchange for judgment volume t…
Interleaving (TDI and successors)
Compare two ranking systems with much higher statistical efficiency than A/B testing by having each user effectively serve as their own experiment — seeing results from both system…
Judgment list construction and pooling
Build a judgment list that supports reliable offline evaluation: representative queries that cover the production query distribution, document pools that capture the candidates any…
LLM-as-judge for relevance labeling
Generate relevance judgments at scale using LLMs as automated assessors, accepting model-specific biases in exchange for low cost and high throughput, with explicit validation agai…
MAP, MRR, and P@K: the alternative offline metrics
Apply the right metric for cases where NDCG isn't the best fit: MRR for known-item search, MAP for exhaustive retrieval, P@K for simpler interpretability, ERR for user-stopping mod…
NDCG and discounted gain metrics
Score a ranked result list by combining the relevance grades of its results with a position discount that rewards relevant results appearing higher, normalized to enable comparison…
Resources for tracking search evaluation discipline
Provide pointers to the active sources of search evaluation knowledge: foundational texts, academic and industry conferences, practitioner blogs, tools, communities.
A/B testing for search changes with power calculation and guardrails
Convert proposed search changes into shipped improvements (or learned-from failures) via the discipline of controlled experimentation, with statistical rigor that distinguishes rea…
Index health monitoring and indexing pipeline observability
Maintain operational visibility into the indexing pipeline — throughput, latency, freshness, completeness, error rates — so that indexing issues are caught and fixed before they de…
Low-CTR investigation methodology
Diagnose why users aren't clicking returned results, tracing the failure to the specific pipeline component responsible — retrieval, ranking, query understanding, or presentation —…
Multi-signal regression detection and alerting
Detect search-quality regressions promptly through automated monitoring of offline quality, online behavior, and operational metrics — with alert thresholds tuned to balance false …
Pipeline tracing and change correlation for root cause analysis
Move from a fired regression alert to a confirmed root cause efficiently by tracing the search pipeline for affected queries, correlating regression timing with recent changes, and…
Production query log schema and standard analytical views
Capture production search events with sufficient detail and enrichment to support all downstream operational analyses — zero-result investigation, regression detection, A/B test ev…
Resources for tracking search operations discipline
Provide pointers to the active sources of operational knowledge across search, SRE, and data engineering.
The zero-result investigation cycle
Convert zero-result query reports into a steady stream of small fixes — spell correction tweaks, synonym additions, entity recognition adjustments, content gap identifications — th…
Conversational search UX patterns with answer synthesis and citation
Provide conversational answer experiences that satisfy informational and analytical queries directly while preserving the user's ability to verify sources and explore further.
Hybrid autocomplete with query suggestions, instant results, and personalization
Build an autocomplete component that meets sub-100ms latency requirements, blends multiple suggestion sources appropriately for the workload, handles keyboard and screen reader int…
Mobile-specific search UX patterns and responsive design
Adapt search UX patterns to mobile constraints — small screens, touch input, slower networks, different user contexts — while maintaining the affordances that make search useful.
Production faceted navigation with URL state, dynamic counts, and accessibility
Provide users with refinement controls that narrow large result sets through structured attributes, with URL state for bookmarkability, dynamic counts for guidance, and accessibili…
Resources for tracking search UX discipline
Provide pointers to the active sources of search UX knowledge across design, accessibility, and emerging conversational interfaces.
Result card design with query-aware snippets and highlighting
Present each ranked result as a card whose visual structure communicates relevance through query-aware snippets, highlighting, and prominent display of the user-relevant metadata, …
Spell correction and query suggestion UX patterns
Surface query understanding outputs as user-facing affordances that improve search outcomes without removing user agency over their query intent.
The empty state hierarchy and graceful failure patterns
Convert search failure modes into useful user interactions by acknowledging the failure clearly, offering alternative paths forward, and preserving user agency.
Hybrid retrieval with Reciprocal Rank Fusion (RRF)
Combine the recall of lexical matching, the semantic understanding of vector search, and the optional LLM-augmented signals into a unified retrieval result that's better than any s…
LLM query rewriting with conversation context
Transform raw user queries into queries that produce better retrieval, particularly handling pronoun resolution, context dependencies, and the gap between conversational language a…
LLM-as-judge for relevance and faithfulness evaluation
Provide judgment signal at scale by using an LLM to assess relevance of retrieved passages, faithfulness of synthesized answers, and citation correctness — with appropriate calibra…
Operational patterns for production LLM-augmented search
Extend traditional search operational practice (Vol 6) to handle the new operational concerns LLM augmentation introduces: variable per-query cost, latency tails, drift, vendor dep…
RAG synthesis with grounded citation
Generate natural-language answers that satisfy informational queries directly while preserving the user's ability to verify each claim against source passages through cited refere…
Resources for tracking LLM-augmented search
Provide pointers to the active sources of LLM-augmented search knowledge across research, practitioner writing, vendor documentation, and tooling.
Semantic chunking and indexed summarization for RAG
Prepare documents for retrieval-augmented use by chunking them into semantically coherent pieces and generating summaries that capture each chunk's gist, enabling better embedding…
Two-stage retrieval with cross-encoder reranking
Lift retrieval quality by re-scoring the top-N candidates from cheap retrieval using a semantic model that's too expensive to run on every document but cheap enough for the candid…
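Worked illustration for the Reciprocal Rank Fusion (RRF) entries above: the intent of combining ranked lists "using only the rank positions rather than the raw scores" can be sketched in a few lines. This is a minimal sketch, not the series' own implementation. The function name `reciprocal_rank_fusion`, the document IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs using rank positions only.

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant (assumed default; tune per workload).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; each list contributes 1 / (k + rank) per document.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a lexical and a dense retriever.
lexical = ["d1", "d2", "d3"]
dense = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([lexical, dense])
print(fused)
```

Because only ranks enter the sum, the two retrievers' raw scores never need to be calibrated against each other. A document that appears in both lists (here `d1` and `d3`) accumulates two contributions and rises above documents found by only one method.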