AI Foundations for Bankers

Embeddings and Vector Search — The "Memory" Layer

Beginner · 12 min read · Tags: embeddings, vector-search, vector-databases, semantic-search, similarity

Beyond Keyword Search: Finding Information by Meaning

Every bank sits on a vast repository of documents -- regulatory filings, credit memos, policy manuals, customer correspondence, audit findings, and board presentations. Today, finding specific information in this repository typically means keyword search: type "Basel III capital requirements" and hope the exact phrase appears in the document you need.

But what if you could search by meaning instead of exact words? What if your system could find all documents discussing capital adequacy -- even those that never use the phrase "Basel III"? That is exactly what embeddings and vector search make possible.

This technology is not theoretical. It is the foundation of the most powerful enterprise AI applications being deployed in banking today, and understanding it will help you evaluate AI vendors, assess architecture proposals, and make informed investment decisions.

What Are Embeddings?

At its core, an embedding is a way to convert text into numbers that capture meaning. When you run a sentence, paragraph, or entire document through an embedding model, you get back a list of numbers -- typically 768 to 3,072 numbers -- called a vector. These numbers represent the semantic content of that text in a mathematical space.

The critical insight is this: texts with similar meanings produce vectors that are close together in this mathematical space. A document about "mortgage delinquency trends" and one about "home loan default patterns" would produce vectors that are nearly identical -- even though they share few keywords.

KEY TERM

Embedding: A numerical representation (vector) of text that captures its semantic meaning. Similar concepts produce vectors that are mathematically close together, enabling machines to measure the "distance" between ideas -- not just between words.

BANKING ANALOGY

Think of embeddings as the Dewey Decimal System for your bank's document vault -- but vastly more intelligent. The Dewey system groups books by topic using a numbering scheme: all books about finance cluster around the 330s. Embeddings do the same thing, but in hundreds or thousands of dimensions instead of one, and they capture nuanced meaning, not just broad categories. A document about "anti-money laundering controls for correspondent banking" would be placed near documents about "BSA compliance in cross-border transactions" because the meanings are related, even though the terminology differs.
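Closeness in embedding space is most often measured with cosine similarity, which scores two vectors between roughly 0 (unrelated) and 1 (pointing the same way). The sketch below uses toy four-dimensional vectors invented for illustration -- a real embedding model would produce hundreds or thousands of dimensions -- but the arithmetic is exactly what a production system performs:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity: near 1.0 means very similar meaning, near 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors standing in for real embeddings
# (a real model would produce 768 to 3,072 dimensions).
mortgage_delinquency = [0.81, 0.52, 0.10, 0.05]
home_loan_defaults   = [0.79, 0.55, 0.12, 0.07]
branch_opening_hours = [0.05, 0.10, 0.90, 0.40]

print(cosine_similarity(mortgage_delinquency, home_loan_defaults))    # close to 1.0
print(cosine_similarity(mortgage_delinquency, branch_opening_hours))  # much lower
```

The two loan-related vectors score near 1.0 even though the underlying phrases share no keywords; the unrelated topic scores far lower.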

How Vector Search Works

Once you have converted your documents into embeddings, you can search them by meaning using semantic search (also called vector search or similarity search). The process works in three steps:

  1. Embed the query: When a user asks a question -- say, "What is our policy on concentrated commercial real estate exposure?" -- the system converts this question into an embedding vector using the same model
  2. Find nearest neighbors: The system compares the query vector against all stored document vectors and finds the closest matches using a mathematical similarity measure (typically cosine similarity)
  3. Return ranked results: The most semantically similar documents are returned, ranked by relevance -- regardless of whether they contain the exact keywords from the query
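The three steps above can be sketched as a brute-force search over a small in-memory collection. The document titles, vectors, and query vector here are invented for illustration; a real system would obtain the vectors from an embedding model and delegate the search to a vector database:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical 3-dimensional vectors standing in for stored document embeddings.
documents = {
    "CRE Concentration Limits":  [0.90, 0.30, 0.10],
    "Branch Security Manual":    [0.10, 0.20, 0.95],
    "Commercial Lending Policy": [0.80, 0.45, 0.15],
}

def search(query_vector, documents, top_k=2):
    # Step 2: score every stored vector against the query vector.
    scored = [(cosine_similarity(query_vector, vec), title)
              for title, vec in documents.items()]
    # Step 3: return the top matches, ranked by similarity.
    scored.sort(reverse=True)
    return [title for _, title in scored[:top_k]]

# Step 1 would embed the user's question; here we use a made-up query vector
# for "concentrated commercial real estate exposure".
query = [0.85, 0.35, 0.10]
print(search(query, documents))  # → ['CRE Concentration Limits', 'Commercial Lending Policy']
```

Note that the security manual is never returned: it is mathematically far from the query, with no keyword matching involved at any point.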

This approach solves a fundamental problem in banking: institutional knowledge is spread across thousands of documents using inconsistent terminology. Different departments, different eras, and different authors describe the same concepts in different ways. Keyword search fails when terminology varies. Semantic search succeeds because it understands meaning.

Why This Matters for Banking

Consider a compliance officer searching for all internal policies related to "customer due diligence." With keyword search, they might miss documents titled "KYC Procedures," "Client Onboarding Standards," or "Enhanced Verification Protocols" -- all of which address the same regulatory requirement. With embedding-based search, all of these documents surface because their meanings are similar, regardless of the specific words used.

Tip

When evaluating AI vendors for document search capabilities, ask specifically about their embedding model. Not all embedding models are equal -- some are optimized for general text, others for specific domains. For banking, look for models that have been exposed to financial and regulatory language during training. Also ask about embedding dimensionality: higher dimensions capture more nuance but cost more to store and search.

Vector Databases: Storing and Searching at Scale

A vector database is a specialized database designed to store embedding vectors and perform fast similarity searches across millions or billions of entries. Traditional relational databases (the kind powering your core banking systems) are optimized for exact matches -- "find the account with ID 12345." Vector databases are optimized for approximate nearest-neighbor searches -- "find the 10 documents most similar to this query."

Leading vector database options include:

  • Pinecone: Fully managed cloud service, simple to deploy, popular for proof-of-concept projects
  • Weaviate: Open-source with enterprise features, supports hybrid search (combining keyword and vector approaches)
  • pgvector: An extension for PostgreSQL that adds vector search capabilities to your existing database infrastructure
  • Qdrant: High-performance open-source option with strong filtering capabilities

For banking institutions, the choice of vector database involves familiar enterprise considerations: data residency, security, scalability, and integration with existing infrastructure. Many banks start with pgvector because it adds vector capabilities to PostgreSQL databases they already operate and secure.

The Embedding Pipeline

Deploying embeddings in a banking context follows a structured pipeline:

  1. Document ingestion: Collect and preprocess documents from internal repositories, regulatory feeds, and knowledge bases
  2. Chunking: Split large documents into smaller, overlapping segments. A 200-page regulatory filing might be split into 500-word chunks with 50-word overlaps to preserve context at boundaries
  3. Embedding generation: Run each chunk through an embedding model to produce its vector representation
  4. Storage: Store the vectors alongside metadata (source document, date, department, classification level) in a vector database
  5. Query processing: When a user searches, embed their query and find the nearest document chunks
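Step 2, chunking, is simple enough to sketch directly. The word-based splitter below mirrors the 500-word chunks with 50-word overlaps described above; real pipelines often chunk by tokens or by document structure (sections, paragraphs) instead:

```python
def chunk_words(text, chunk_size=500, overlap=50):
    """Split text into overlapping word-based chunks (step 2 of the pipeline)."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the final chunk already reaches the end of the document
    return chunks

# Tiny demonstration: 12 "words", 5-word chunks, 2-word overlap.
sample = " ".join(str(i) for i in range(12))
for chunk in chunk_words(sample, chunk_size=5, overlap=2):
    print(chunk)
```

Each chunk repeats the last two words of the previous one, so a sentence falling on a chunk boundary is still seen whole in at least one chunk.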

Warning

Embedding models, like LLMs, have been trained on public data. When you embed proprietary bank documents, the vectors are mathematical abstractions -- they do not contain readable text, although research has shown that embeddings can sometimes be partially inverted, so they should still be treated as sensitive. Moreover, the original text must be stored alongside the vectors for retrieval. Ensure your vector database deployment satisfies the same data residency and access control requirements as any system handling sensitive banking data.

Real Banking Applications

Embeddings power several high-value banking use cases:

  • Intelligent policy search: Relationship managers find relevant policies instantly, even when they do not know the exact document title or regulatory reference
  • Regulatory change detection: Automatically match new regulatory guidance to affected internal policies by comparing embeddings of new regulations against your policy library
  • Customer communication analysis: Cluster customer complaints by semantic theme to identify emerging issues before they become systemic
  • Due diligence research: Search across thousands of news articles, filings, and reports to surface relevant information about counterparties or market segments
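Clustering complaints by semantic theme can be sketched with a simplistic greedy pass over precomputed complaint embeddings: each complaint joins the first existing cluster whose anchor it resembles, or starts a new one. The vectors below are invented for illustration, and a production system would use a proper clustering algorithm such as k-means or HDBSCAN:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def greedy_cluster(vectors, threshold=0.9):
    """Assign each vector to the first cluster whose anchor it resembles."""
    clusters = []  # each cluster is a list of indices; element 0 is the anchor
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no existing theme is close enough: new cluster
    return clusters

# Hypothetical complaint embeddings: two about fee disputes, one about app outages.
complaints = [[1.0, 0.0], [0.98, 0.2], [0.0, 1.0]]
print(greedy_cluster(complaints))  # → [[0, 1], [2]]
```

The two fee-dispute complaints land in one cluster even if the customers used entirely different wording; a growing cluster signals an emerging issue.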

Quick Recap

  • Embeddings convert text into numerical vectors that capture meaning, enabling search by concept rather than keywords
  • Vector search finds semantically similar documents by comparing embedding vectors, solving the inconsistent-terminology problem in banking
  • Vector databases (Pinecone, pgvector, Weaviate) are specialized storage systems optimized for fast similarity search at scale
  • The embedding pipeline involves ingesting documents, chunking them, generating embeddings, and storing them for retrieval
  • Banking applications include intelligent policy search, regulatory change detection, and customer communication analysis

KNOWLEDGE CHECK

A compliance officer searches for "customer due diligence" but the relevant internal policy is titled "KYC Enhanced Verification Protocols." Why would embedding-based search find this document when keyword search would not?

A bank is evaluating vector database options for storing embeddings of sensitive internal documents. Which factor most distinguishes this decision from a typical database selection?

Why is document chunking an important step in the embedding pipeline for banking applications?