The Foundation Model Landscape
From One Model to an Ecosystem
When the AI conversation began in earnest in late 2022, many people equated AI with a single product: ChatGPT. Today, the landscape has exploded. There are dozens of foundation models from major technology companies, open-source communities, and specialized AI labs -- each with distinct strengths, licensing terms, and deployment options.
For banking executives, this is actually good news. Competition drives innovation, reduces vendor lock-in risk, and creates options tailored to different regulatory and operational requirements. But navigating this landscape requires a structured approach.
KEY TERM
Foundation Model: A large AI model trained on broad data at scale that can be adapted (fine-tuned) to a wide range of downstream tasks. Unlike traditional models built for one specific purpose, a foundation model serves as the "foundation" upon which many applications can be built -- similar to how a core banking platform supports multiple product lines.
The Major Players
The foundation model market is dominated by a handful of providers, each bringing different philosophies and capabilities. Here is a snapshot of the models most relevant to enterprise banking use cases.
| Model | Provider | Key Strength | Banking Use Case |
|---|---|---|---|
| GPT-4o | OpenAI | Broad capability, large ecosystem | Document analysis, customer service automation |
| Claude | Anthropic | Long context window, safety focus | Regulatory document review, compliance analysis |
| Gemini | Google DeepMind | Multimodal (text + image + video) | Check processing, document OCR with understanding |
| Llama 3 | Meta | Open-source, on-premise deployment | Internal tools where data cannot leave the bank |
| Command R+ | Cohere | Enterprise RAG, retrieval focus | Knowledge base search, internal Q&A systems |
| Mistral Large | Mistral AI | European-based, GDPR alignment | EU banking operations, cross-border compliance |
Each of these models represents billions of dollars in training investment and distinct architectural choices. Selecting the right one -- or the right combination -- for your institution is a strategic decision, not merely a technical one.
BANKING ANALOGY
Choosing a foundation model is remarkably similar to your vendor evaluation process for a core banking system. You would never select a core platform based solely on a features checklist. You evaluate total cost of ownership, vendor stability, regulatory compliance capabilities, integration with existing infrastructure, and long-term strategic alignment. The same framework applies to foundation models. The "best" model in a benchmark may not be the best model for your institution.
Understanding the Differences
Capability and Quality
Not all foundation models are created equal. Performance varies significantly across different task types:
- Reasoning and analysis: GPT-4o and Claude currently lead on complex reasoning tasks -- the kind of multi-step analysis required for credit risk assessment or regulatory interpretation
- Long document processing: Claude supports context windows up to 200,000 tokens (roughly 500 pages), making it particularly strong for processing lengthy regulatory filings or loan documentation packages
- Code generation: GPT-4o and Claude both excel at generating and explaining code, useful for data teams building analytics pipelines
- Multilingual capability: Gemini and Mistral show strong performance across European and Asian languages, relevant for global banking operations
- Retrieval-augmented tasks: Command R+ was specifically designed for retrieval-augmented generation (RAG) workflows -- retrieving relevant documents from a knowledge base and feeding them to the model as context -- making it a strong choice for internal knowledge management
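The RAG pattern mentioned above can be sketched in a few lines. This is a toy illustration: the keyword-overlap retriever and the sample policy snippets are placeholders standing in for a real vector store and an actual LLM call, not any provider's API.

```python
# Minimal sketch of a RAG workflow. The retriever uses naive keyword
# overlap; a production system would use embeddings and a vector store.

def _terms(text: str) -> set[str]:
    """Lowercase and strip basic punctuation before tokenizing."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query, best first."""
    query_terms = _terms(query)
    scored = [(len(query_terms & _terms(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]

def build_prompt(query: str, context_docs: list[str]) -> str:
    """Assemble the grounded prompt that would be sent to the model."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Toy knowledge base: invented internal policy snippets.
knowledge_base = [
    "Wire transfers above $10,000 require dual approval.",
    "Customer complaints must be logged within 24 hours.",
    "Quarterly risk reports are due to the board by the 15th.",
]

docs = retrieve("What is the approval rule for wire transfers?", knowledge_base)
prompt = build_prompt("What is the approval rule for wire transfers?", docs)
```

The value of the pattern is that the model answers from retrieved institutional content rather than from its training data alone, which is why it suits internal knowledge management.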
Speed and Cost
There is a direct trade-off between model capability and operational cost. The most powerful models (GPT-4o, Claude Opus) cost significantly more per query than smaller alternatives. For many banking use cases, a less expensive model may be entirely sufficient.
Consider the use case hierarchy:
- High-stakes analysis (regulatory interpretation, risk assessment): Use the most capable model available. The cost per query is trivial compared to the value of accuracy
- Document processing at scale (summarizing thousands of customer communications): Use a mid-tier model. You need good quality at reasonable cost
- Simple classification tasks (routing customer inquiries, tagging documents): Use a smaller, faster model. Speed and cost matter more than peak capability
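The tiered hierarchy above can be expressed as a simple routing table. The model names and per-1K-token prices below are illustrative assumptions, not actual vendor pricing.

```python
# Sketch of tiered model routing based on the use-case hierarchy.
# Tier names, model labels, and prices are hypothetical placeholders.

MODEL_TIERS = {
    "high_stakes": {"model": "frontier-model", "cost_per_1k_tokens": 0.015},
    "bulk_processing": {"model": "mid-tier-model", "cost_per_1k_tokens": 0.003},
    "simple_classification": {"model": "small-fast-model", "cost_per_1k_tokens": 0.0005},
}

def route_task(task_type: str) -> dict:
    """Return the routing decision for a given task tier."""
    if task_type not in MODEL_TIERS:
        raise ValueError(f"Unknown task type: {task_type}")
    return MODEL_TIERS[task_type]

def estimated_cost(task_type: str, tokens: int) -> float:
    """Estimate the cost of one query at a given token volume."""
    tier = route_task(task_type)
    return tier["cost_per_1k_tokens"] * tokens / 1000
```

At these assumed prices, a 2,000-token classification query costs a tenth of a cent, while the same query against the frontier tier costs thirty times as much -- the arithmetic behind matching model capability to task stakes.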
Data Privacy and Deployment Options
This is where banking requirements diverge sharply from other industries. When a retail company uses an LLM to generate marketing copy, data privacy is a comparatively minor concern. When a bank processes loan applications, customer financial records, or trading strategies through an LLM, data residency and privacy become paramount.
The deployment spectrum looks like this:
- Cloud API (lowest control): Your data is sent to the provider's servers for processing. Enterprise agreements typically include data handling provisions, but the data does leave your perimeter
- Virtual Private Cloud (moderate control): The model runs in a dedicated cloud instance within your chosen region. Better isolation, higher cost
- On-premise deployment (highest control): The model runs entirely within your infrastructure. Only possible with open-source models like Llama 3 or Mistral. Maximum control, but you bear the operational burden
Tip
For most banking institutions, the practical approach is a tiered strategy: use cloud APIs with enterprise agreements for non-sensitive workloads (market research summarization, public document analysis), and deploy open-source models on-premise for anything involving customer data, proprietary strategies, or regulatory submissions. This "best of both worlds" approach lets you access the most capable models where appropriate while maintaining control where required.
Open Source vs. Closed Source
The open-source vs. closed-source debate is one of the most consequential decisions in your AI strategy. Here is what each approach offers:
Closed-Source Models (GPT-4o, Claude, Gemini)
Advantages:
- Highest capability on complex tasks
- Managed infrastructure -- no operational burden on your teams
- Regular updates and improvements from the provider
- Enterprise support agreements available
Disadvantages:
- Data must be sent to provider infrastructure (even with enterprise agreements)
- Limited transparency into how the model works
- Vendor lock-in risk -- switching costs increase over time
- Pricing can change with limited notice
Open-Source Models (Llama 3, Mistral's open-weight releases)
Advantages:
- Full control over data -- nothing leaves your infrastructure
- Ability to fine-tune on your own data (banking-specific terminology, institutional knowledge)
- No per-query costs (after infrastructure investment)
- No vendor dependency for the model itself
Disadvantages:
- Requires significant ML engineering talent to deploy and maintain
- Generally lower capability than the leading closed-source models
- You bear the full operational burden (monitoring, scaling, updates)
- Fine-tuning requires expertise and computing resources
The Hybrid Approach
Most sophisticated banking institutions are adopting a hybrid approach: closed-source models for non-sensitive, high-complexity tasks and open-source models for sensitive data processing. This pragmatic middle ground maximizes capability while respecting the regulatory constraints that define banking.
Selection Criteria for Banking
When evaluating foundation models for your institution, structure the evaluation around these dimensions:
Regulatory Compliance
- Does the provider offer data processing agreements that satisfy OCC/Fed/FDIC requirements?
- Where is data processed and stored? Can you guarantee data residency within required jurisdictions?
- Does the provider support audit trails and logging sufficient for regulatory examination?
- What happens to your data after processing -- is it used for model training?
Operational Risk
- What is the provider's uptime SLA? Banking operations often require 99.9%+ availability
- What happens during provider outages? Do you have failover options?
- How does the provider handle model updates? Can you control when you migrate to new model versions?
- Is there model versioning so you can reproduce prior outputs for regulatory purposes?
Total Cost of Ownership
- What is the per-query cost at your projected volume?
- What internal infrastructure and talent is required?
- What are the integration costs with your existing technology stack?
- What is the switching cost if you need to change providers?
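The total-cost-of-ownership questions above reduce to a back-of-the-envelope comparison between per-query API pricing and a fixed annual on-premise budget. All figures below are illustrative assumptions, not vendor quotes.

```python
# Sketch of a cloud-API vs. on-premise TCO comparison with assumed figures.

def api_annual_cost(queries_per_year: int, cost_per_query: float) -> float:
    """Annual spend on a pay-per-query cloud API."""
    return queries_per_year * cost_per_query

def onprem_annual_cost(infra_per_year: float, staff_per_year: float) -> float:
    """Annual spend on self-hosted infrastructure plus ML engineering staff."""
    return infra_per_year + staff_per_year

def breakeven_queries(cost_per_query: float, onprem_total: float) -> float:
    """Annual query volume at which on-premise becomes cheaper than the API."""
    return onprem_total / cost_per_query

# Assumed figures: $0.02 per API query; $150K/yr infrastructure (GPUs,
# hosting) plus $400K/yr staff for the on-premise option.
onprem = onprem_annual_cost(150_000, 400_000)
volume = breakeven_queries(0.02, onprem)
```

Under these assumptions the on-premise option only pays for itself above roughly 27.5 million queries per year -- a useful sanity check before committing to self-hosting for cost reasons alone.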
Strategic Alignment
- Does this model strategy support your 3-5 year AI roadmap?
- Are you building internal AI capability or outsourcing it?
- How does this decision affect your talent strategy -- do you need ML engineers?
The model landscape is evolving rapidly. Any specific model comparison will be dated within months. What endures is the evaluation framework. Build institutional capability in evaluating models, not just in using any single one.
Looking Ahead
The foundation model landscape will continue to evolve rapidly. Models that are state-of-the-art today may be surpassed within months. New entrants will emerge, and existing providers will release increasingly capable versions.
For banking executives, the strategic imperative is clear: develop institutional fluency in evaluating and deploying foundation models, build the governance frameworks to manage them responsibly, and create the flexibility to adopt new models as the landscape evolves. The banks that treat AI model selection as a one-time decision will find themselves locked into yesterday's technology. The banks that build adaptive capability will thrive.
KNOWLEDGE CHECK
1. A bank needs to process customer loan applications through an LLM. Which deployment approach best addresses data residency requirements?
2. What is the primary reason banking institutions are adopting a hybrid model strategy rather than selecting a single foundation model?
3. When evaluating foundation models, which factor most distinguishes banking's evaluation criteria from those of other industries?