AI Foundations for Bankers

Snowflake Cortex & Databricks Mosaic AI

Level: Intermediate. Reading time: 10 min. Tags: snowflake, databricks, data-platform, cortex-ai, mosaic-ai, mlops

The Data Gravity Argument

Many banks have spent years, and often tens of millions of dollars, building modern data platforms. Whether your institution runs Snowflake, Databricks, or both, your analytics infrastructure already holds the cleaned, governed, and compliance-ready data that AI applications need.

Moving that data to a separate AI platform introduces risk, cost, and governance overhead. Snowflake Cortex AI and Databricks Mosaic AI take the opposite approach: they bring AI capabilities to where the data already lives.

BANKING ANALOGY

Think about the difference between sending loan files to an outside review firm versus having the reviewers work on-site in your secure document room. When the reviewers come to the data, you maintain full custody, avoid the risk of documents in transit, and keep your existing access controls in place. Snowflake Cortex and Databricks Mosaic AI follow this same logic -- instead of copying data to an AI platform, they bring AI capabilities into your existing data platform where governance is already established.

Snowflake Cortex AI

Snowflake Cortex AI adds foundation model capabilities directly into the Snowflake Data Cloud. You invoke AI functions using SQL -- the same language your analysts and data engineers already use.

Core Capabilities

Cortex LLM Functions. Call foundation models directly from SQL queries. Functions like SNOWFLAKE.CORTEX.COMPLETE() accept a prompt and return model output, enabling AI processing inline with data transformations. You can summarize text columns, classify records, extract entities, and generate insights without leaving Snowflake.
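As a hedged illustration of what "inline with data transformations" can look like, the Python sketch below only assembles the SQL text; it does not connect to Snowflake. SNOWFLAKE.CORTEX.COMPLETE(model, prompt) is the function named above. The table loan_memos, the columns memo_id and memo_text, the helper build_summary_query, and the model name mistral-large are assumptions chosen for illustration, and model availability varies by account and region.

```python
import re

# Only plain identifiers are accepted so unexpected input cannot alter the SQL.
# This is a simplification of Snowflake's identifier rules.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")


def _check_identifier(name: str) -> str:
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a plain SQL identifier: {name!r}")
    return name


def build_summary_query(table: str, id_col: str, text_col: str, model: str) -> str:
    """Return SQL that summarizes one text column per row with CORTEX.COMPLETE."""
    table = _check_identifier(table)
    id_col = _check_identifier(id_col)
    text_col = _check_identifier(text_col)
    model_literal = model.replace("'", "''")  # escape quotes inside the literal
    instruction = "Summarize this credit memo in two sentences: "
    return (
        f"SELECT {id_col},\n"
        f"       SNOWFLAKE.CORTEX.COMPLETE('{model_literal}',\n"
        f"           CONCAT('{instruction}', {text_col})) AS summary\n"
        f"FROM {table}"
    )


if __name__ == "__main__":
    # Hypothetical table and column names.
    print(build_summary_query("loan_memos", "memo_id", "memo_text", "mistral-large"))
```

The generated statement is ordinary SQL, so it could sit inside an existing view, task, or ETL step like any other SELECT.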

Cortex Search. A managed search service, designed for retrieval-augmented generation, that creates searchable indexes over your Snowflake data. Point it at a table of document text, and it handles embedding, index storage, and refresh for you; long documents are typically split into chunks before indexing. Your applications query it with natural language and get back the most relevant passages, which an LLM function can then use to produce answers grounded in your data.
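To make the steps behind a search index concrete, here is a minimal sketch of chunking and retrieval in plain Python. It is not the Cortex Search API: it uses a toy word-overlap score where a managed service would use embeddings, often combined with keyword matching. The function names chunk_words and search and the sample policy text are invented for illustration.

```python
import re
from collections import Counter


def chunk_words(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def tokens(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric terms."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def search(chunks: list[str], query: str, k: int = 2) -> list[tuple[int, str]]:
    """Rank chunks by the number of query terms they share (toy lexical score)."""
    q = tokens(query)
    scored = []
    for chunk in chunks:
        c = tokens(chunk)
        score = sum(min(q[t], c[t]) for t in q)
        scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(s, c) for s, c in scored[:k] if s > 0]


if __name__ == "__main__":
    policy = [
        "Wire transfers above the limit require dual approval",
        "Branch hours are nine to five",
    ]
    print(search(policy, "Who must approve a wire transfer?", k=1))
```

The passages returned by a step like this are what gets placed into the prompt so the model answers from bank documents instead of from memory.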

Cortex Fine-Tuning. Fine-tune supported models on your proprietary data without the data ever leaving Snowflake's security boundary. The fine-tuned model is stored and served within your Snowflake account.

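Cortex Analyst. A natural language interface to structured data. Business users describe what they want to know in plain English, and Cortex Analyst generates the SQL needed to answer the question, guided by a semantic model your data team defines over the underlying tables. The query is then run against your data, giving non-technical stakeholders access to governed data without writing SQL themselves.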

Banking-Specific Value

For banks already running Snowflake for analytics and reporting, Cortex AI offers three significant advantages:

Zero data movement. Your customer data, transaction records, and regulatory reports stay in Snowflake. AI processes them in place, eliminating the data copy, transfer, and secondary governance that separate AI platforms require.

Existing access controls. Snowflake's role-based access control extends to Cortex AI functions. If an analyst's role cannot read PII columns in a table, or sees them only in masked form, the analyst cannot pass the clear-text values to an LLM function. Your existing data governance policies continue to apply to AI queries.

SQL interface. Your data engineering team does not need to learn Python, new frameworks, or AI-specific tooling. SQL-callable AI functions integrate into existing ETL pipelines, reporting workflows, and data products.
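The access-control point above can be pictured with a small model: the prompt is built from the caller's view of a row, so a role that sees a masked column can only send the masked value to the model. The sketch below is a toy in plain Python, not Snowflake's implementation; in the platform this is enforced by roles and masking policies, not by application code. The role names, column names, and helper functions are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical policy: roles allowed to read each sensitive column in clear text.
UNMASKED_ROLES = {
    "ssn": {"COMPLIANCE_OFFICER"},
    "account_number": {"COMPLIANCE_OFFICER", "OPS_ANALYST"},
}


@dataclass(frozen=True)
class Row:
    customer_name: str
    ssn: str
    account_number: str
    complaint_text: str


def read_column(row: Row, column: str, role: str) -> str:
    """Return the column value as the given role would see it."""
    value = getattr(row, column)
    allowed = UNMASKED_ROLES.get(column)
    if allowed is not None and role not in allowed:
        return "***MASKED***"
    return value


def build_prompt(row: Row, columns: list[str], role: str) -> str:
    """Assemble prompt text from the role's view of the row, never raw values."""
    parts = [f"{col}: {read_column(row, col, role)}" for col in columns]
    return "Classify this complaint.\n" + "\n".join(parts)


if __name__ == "__main__":
    row = Row("A. Example", "123-45-6789", "000111", "Card was charged twice")
    print(build_prompt(row, ["ssn", "complaint_text"], "MARKETING_ANALYST"))
```

Because masking happens when the column is read, no separate AI-specific policy is needed in this model; the same rule covers reports and prompts alike.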

Databricks Mosaic AI

Databricks Mosaic AI takes a broader approach, providing a full MLOps platform with LLM serving alongside traditional machine learning capabilities. Where Snowflake emphasizes SQL simplicity, Databricks emphasizes flexibility and control.

Core Capabilities

Model Serving. Deploy foundation models and custom-trained models as scalable API endpoints within your Databricks environment. It supports both pay-per-token and provisioned throughput pricing, with automatic scaling based on demand.

AI Gateway. A unified interface for managing multiple inference endpoints -- including external providers like OpenAI and Anthropic alongside self-hosted models. The gateway adds rate limiting, usage tracking, and fallback routing across all providers.
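The three gateway behaviors named above (rate limiting, usage tracking, and fallback routing) can be sketched in a few lines. This is a toy in plain Python under simplifying assumptions, not the Databricks AI Gateway API: the Gateway class, the provider functions, and the per-caller request cap without a time window are all invented for illustration.

```python
from typing import Callable


class RateLimitExceeded(Exception):
    pass


class Gateway:
    """Toy gateway: per-caller request cap, usage counts, ordered fallback."""

    def __init__(self, providers: list[tuple[str, Callable[[str], str]]],
                 max_requests: int):
        self.providers = providers        # tried in the order given
        self.max_requests = max_requests  # per caller; no time window here
        self.requests_by_caller: dict[str, int] = {}
        self.calls_by_provider: dict[str, int] = {}

    def complete(self, caller: str, prompt: str) -> tuple[str, str]:
        used = self.requests_by_caller.get(caller, 0)
        if used >= self.max_requests:
            raise RateLimitExceeded(caller)
        self.requests_by_caller[caller] = used + 1

        errors = []
        for name, call in self.providers:
            try:
                answer = call(prompt)
            except Exception as exc:  # any provider failure triggers fallback
                errors.append(f"{name}: {exc}")
                continue
            self.calls_by_provider[name] = self.calls_by_provider.get(name, 0) + 1
            return name, answer
        raise RuntimeError("all providers failed: " + "; ".join(errors))


def flaky_external(prompt: str) -> str:
    """Stand-in for an external provider that is currently unavailable."""
    raise TimeoutError("upstream timeout")


def self_hosted(prompt: str) -> str:
    """Stand-in for a self-hosted model endpoint."""
    return f"[self-hosted] {prompt[:20]}"


if __name__ == "__main__":
    gw = Gateway([("external", flaky_external), ("self_hosted", self_hosted)], 2)
    print(gw.complete("teller-app", "Summarize the wire policy"))
    print(gw.calls_by_provider)
```

Putting these controls in one place means individual applications do not each need their own retry, quota, and logging logic per provider.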

Vector Search. Native vector database capabilities built into the Databricks Lakehouse. Create vector indexes over Delta tables and perform similarity search without a separate vector database service -- your embeddings live alongside your structured data.
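A short sketch may help show why keeping embeddings next to structured columns is useful: a query can filter on an ordinary column and rank by similarity in one pass. The code below is plain Python, not the Databricks Vector Search API; the rows, the three-dimensional embeddings, and the similar_docs helper are made up for illustration, and real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical rows: structured columns and an embedding stored side by side.
ROWS = [
    {"doc_id": 1, "line_of_business": "mortgage", "embedding": [0.9, 0.1, 0.0]},
    {"doc_id": 2, "line_of_business": "cards", "embedding": [0.8, 0.2, 0.1]},
    {"doc_id": 3, "line_of_business": "mortgage", "embedding": [0.0, 0.2, 0.9]},
]


def similar_docs(query_vec: list[float], line_of_business: str, k: int = 1) -> list[int]:
    """Filter on a structured column, then rank remaining rows by similarity."""
    candidates = [r for r in ROWS if r["line_of_business"] == line_of_business]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(query_vec, r["embedding"]),
        reverse=True,
    )
    return [r["doc_id"] for r in ranked[:k]]


if __name__ == "__main__":
    print(similar_docs([1.0, 0.0, 0.0], "mortgage", k=2))
```

A production index would use an approximate nearest-neighbor structure instead of scoring every row, but the filter-then-rank shape of the query is the same.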

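RAG Studio. A managed environment for building and evaluating RAG applications. It provides document parsing, chunking strategies, retrieval evaluation, and end-to-end quality metrics -- the full pipeline needed to move from prototype to production RAG. Databricks has since presented this functionality under the Mosaic AI Agent Framework and Agent Evaluation names, so check current documentation for the product naming.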

MLflow Integration. All AI experiments, model versions, and deployments are tracked through MLflow -- the open-source ML lifecycle platform that Databricks created. This provides the model registry, experiment tracking, and deployment management that model risk management teams require.

Banking-Specific Value

Unified ML and LLM platform. Banks running traditional ML models (credit scoring, fraud detection) on Databricks can add LLM workloads to the same platform. One governance framework, one model registry, one monitoring system for all AI.

Open-source foundation. Databricks is built on open-source projects and open formats -- Apache Spark, Delta Lake, MLflow. Banks concerned about vendor lock-in value the portability of their data and model artifacts.

Lakehouse architecture. The Databricks Lakehouse unifies structured data (transaction records, financial metrics) with unstructured data (documents, emails, call transcripts) in one platform. AI applications that need both types of data -- which describes most banking use cases -- benefit from this unified architecture.

Comparing the Two Approaches

| Dimension | Snowflake Cortex AI | Databricks Mosaic AI |
|---|---|---|
| Primary interface | SQL functions | Python SDK + SQL |
| Target user | Data analysts, SQL-fluent teams | Data scientists, ML engineers |
| Model customization | Managed fine-tuning (limited models) | Full MLOps with custom training |
| RAG approach | Cortex Search (managed) | RAG Studio + Vector Search (configurable) |
| Strengths | Simplicity, zero data movement, SQL-native | Flexibility, open source, unified ML+LLM |
| Best for banks that... | Want fast AI adoption with minimal new tooling | Have ML teams and want full control over model lifecycle |

Tip

Many large banks use both Snowflake and Databricks for different workloads. The same strategy can apply to AI: use Snowflake Cortex for SQL-driven analytics AI (portfolio reporting, compliance screening) and Databricks Mosaic AI for complex ML pipelines (custom model training, multi-step agent workflows). The two platforms can coexist in your architecture.

When Data Platform AI Makes Sense

Data-platform-adjacent AI is strongest when:

  1. Your data is already in the platform. If your loan portfolio, customer records, and regulatory data live in Snowflake or Databricks, running AI there avoids the cost and risk of data movement.

  2. Your AI use cases are data-centric. Summarizing database records, classifying transactions, extracting entities from document tables -- these map naturally to SQL-callable AI functions.

  3. Your team is data-engineering-heavy. If your strongest technical talent is in SQL and data pipelines, Snowflake Cortex meets them where they are. If you have a strong data science team, Databricks Mosaic AI gives them the flexibility they expect.

  4. Governance is your top concern. Keeping AI processing within your existing data governance boundary -- with the same access controls, audit logging, and compliance infrastructure -- reduces the incremental risk of AI adoption.

Quick Recap

  • Snowflake Cortex AI and Databricks Mosaic AI bring AI to where banking data already lives, avoiding data movement and reducing governance overhead
  • Snowflake emphasizes SQL simplicity -- AI functions callable from existing queries and pipelines
  • Databricks emphasizes flexibility -- full MLOps with custom training, open formats, and unified ML and LLM capabilities
  • Both approaches keep data within existing security boundaries, extending current access controls to AI workloads
  • Many banks will use both platforms for different AI workloads, matching platform strengths to use case requirements

KNOWLEDGE CHECK

What is the PRIMARY advantage of running AI on an existing data platform like Snowflake or Databricks?

A bank with a strong SQL-focused analytics team and most data in Snowflake should evaluate which approach first?

Which capability differentiates Databricks Mosaic AI for banks that already run traditional ML models?