AI Foundations for Bankers

Prompts, Completions, and System Instructions

Beginner | 10 min read | Tags: prompts, completions, system-instructions, prompt-engineering, temperature

The Interface Layer: How You Talk to an LLM

In traditional banking software, you interact through structured forms, dropdown menus, and predefined workflows. A loan origination system does not ask for your opinion -- it asks for a borrower's income, credit score, and collateral value in precisely defined fields.

Large Language Models work differently. Instead of structured inputs, you communicate in natural language -- plain English sentences and paragraphs. This interface layer, built on prompts, completions, and system instructions, is what makes LLMs both remarkably accessible and surprisingly nuanced to use well.

Understanding this interface is essential for any banking executive evaluating AI tools, because the quality of your instructions directly determines the quality of the output. A poorly prompted LLM is like a brilliant but poorly briefed analyst: capable of excellent work, but likely to deliver something you did not actually need.

KEY TERM

Prompt: The text input you provide to an LLM -- your question, instruction, or request. A prompt can be a single sentence ("Summarize this regulatory filing") or a detailed multi-paragraph instruction set with examples, constraints, and formatting requirements.

Prompts: Your Instructions to the Model

A prompt is simply the text you send to an LLM. When a relationship manager types "Draft a follow-up email to a commercial lending prospect who toured our treasury management platform," that entire sentence is the prompt. The model reads it, interprets the intent, and generates a response.

But not all prompts are created equal. The emerging discipline of prompt engineering focuses on crafting instructions that consistently produce high-quality, reliable outputs. For banking applications where accuracy and tone matter, this discipline is critical.

Zero-Shot vs. Few-Shot Prompting

Zero-shot prompting means giving the model a task with no examples. You simply describe what you want:

"Classify this customer complaint as one of: billing dispute, fraud report, service quality, or account access."

Few-shot prompting means including examples in your prompt to demonstrate the pattern you expect:

"Classify customer complaints. Examples:

  • 'I was charged twice for my wire transfer' -> billing dispute
  • 'Someone opened a card in my name' -> fraud report

Now classify: 'Your mobile app has been down for three days'"

Few-shot prompting typically produces more consistent results for banking tasks because it removes ambiguity about your expectations. When classifying regulatory correspondence or tagging transaction categories, a few well-chosen examples dramatically improve accuracy.
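The difference between the two patterns is just how the prompt string is assembled. A minimal sketch of a few-shot prompt builder (the helper name, labels, and layout here are illustrative, not from any vendor API):

```python
def few_shot_prompt(task, examples, query):
    """Build a few-shot prompt: task description, labeled examples,
    then the new item to classify."""
    lines = [task, "", "Examples:"]
    for text, label in examples:
        lines.append(f"- '{text}' -> {label}")
    lines += ["", f"Now classify: '{query}'"]
    return "\n".join(lines)

examples = [
    ("I was charged twice for my wire transfer", "billing dispute"),
    ("Someone opened a card in my name", "fraud report"),
]
prompt = few_shot_prompt(
    "Classify customer complaints as one of: billing dispute, "
    "fraud report, service quality, or account access.",
    examples,
    "Your mobile app has been down for three days",
)
```

Dropping the `Examples:` section and the example lines turns the same call into a zero-shot prompt; everything else stays identical.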

BANKING ANALOGY

Think of prompting like briefing a new analyst on your team. If you say "Review this loan file," you might get anything from a one-paragraph summary to a 20-page analysis. But if you say "Review this loan file, focusing on the three largest risk factors, and present your findings in a one-page memo formatted like the example I am attaching," you will get exactly what you need. The same principle applies to LLMs -- specificity and examples produce better results.

Completions: What the Model Returns

A completion is the model's response to your prompt. The term comes from the original framing of LLMs as text-completion engines: given a sequence of tokens, the model "completes" the sequence by predicting what comes next.

In practice, completions can be anything from a single word to a multi-page document, depending on your prompt and configuration. For banking applications, completions might include draft regulatory responses, summarized credit memos, classified customer inquiries, or generated code for data analysis.

System Instructions: Setting the Ground Rules

System instructions are persistent directives that shape every response the model generates. Unlike a prompt -- which changes with each query -- system instructions remain constant across an entire session or application. They define the model's role, behavioral constraints, and output format.

KEY TERM

System Instructions: A special category of prompt that sets persistent behavioral guidelines for the LLM. System instructions are processed before every user prompt and define the model's persona, constraints, tone, and output requirements. End users typically do not see system instructions.

For banking applications, system instructions are where you encode compliance requirements, tone guidelines, and safety constraints. For example:

  • "You are a compliance assistant for a US-based commercial bank. Never provide legal advice. Always cite the specific regulation when referencing regulatory requirements. If uncertain about any regulatory interpretation, state that explicitly."

This single system instruction transforms a general-purpose LLM into a focused, appropriately constrained banking tool.
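In chat-style APIs, the system instruction typically travels as a separate message that is prepended to every request. A sketch assuming the common system/user message convention (field names follow that convention but vary by vendor):

```python
SYSTEM_INSTRUCTION = (
    "You are a compliance assistant for a US-based commercial bank. "
    "Never provide legal advice. Always cite the specific regulation "
    "when referencing regulatory requirements. If uncertain about any "
    "regulatory interpretation, state that explicitly."
)

def build_messages(user_prompt):
    """Prepend the system message to every request so the ground rules
    apply regardless of what the user asks."""
    return [
        {"role": "system", "content": SYSTEM_INSTRUCTION},
        {"role": "user", "content": user_prompt},
    ]

messages = build_messages("Summarize the key points of this examiner letter.")
```

Because the system message is injected by the application, end users never see it, yet every completion is shaped by it.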

BANKING ANALOGY

System instructions are like the compliance guidelines you give a new analyst on their first day. Before they answer a single client question, they learn: "We never guarantee returns. We always disclose fees. We refer legal questions to General Counsel." These ground rules shape every interaction without needing to be repeated each time. System instructions work the same way for an LLM.

Controlling Output: Temperature and Token Limits

Two key parameters give you fine-grained control over how the model generates responses:

Temperature

Temperature controls the randomness of the model's output. It typically ranges from 0 to 1, though some APIs accept values up to 2:

  • Low temperature (0.0 - 0.3): The model produces highly deterministic, focused responses. Best for banking tasks requiring consistency -- regulatory classifications, data extraction, compliance checks
  • High temperature (0.7 - 1.0): The model produces more varied, creative responses. Useful for brainstorming sessions or generating diverse draft options

For most banking use cases, lower temperature settings are appropriate. When a model is classifying customer complaints or extracting data from loan documents, you want the same input to produce the same output every time.
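One practical pattern is to pin a temperature per use case rather than choosing it per request. The use-case names and values below are illustrative defaults, not vendor recommendations:

```python
# Deterministic tasks pinned near 0; generative drafting allowed
# more variation. Values are illustrative, not prescriptive.
TEMPERATURE_BY_USE_CASE = {
    "complaint_classification": 0.0,
    "data_extraction": 0.0,
    "compliance_check": 0.2,
    "customer_email_draft": 0.5,
    "brainstorming": 0.9,
}

def temperature_for(use_case, default=0.2):
    """Look up the configured temperature, falling back to a
    conservative default for unlisted use cases."""
    return TEMPERATURE_BY_USE_CASE.get(use_case, default)
```

Centralizing these settings makes them reviewable, like any other policy parameter, rather than leaving each developer to pick a value ad hoc.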

Token Limits

Two related limits apply here. A max-tokens parameter caps the length of any single completion. More fundamentally, every LLM has a context window -- the maximum number of tokens it can process in a single request, covering both your prompt and the model's response. If your prompt consumes 80% of the context window, the model has at most 20% left for its answer.

For banking applications processing lengthy documents, context window management is a practical concern. A 200-page regulatory filing might require chunking into smaller segments or using a model with a larger context window.
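Chunking can be sketched with a crude word-count proxy for tokens. Real tokenizers count differently, and the 0.75 words-per-token ratio used here is only a rough rule of thumb for English text:

```python
def chunk_text(text, max_tokens=2000, words_per_token=0.75):
    """Split text into chunks that stay under an approximate token
    budget. Token counts are estimated from word counts, not computed
    by a real tokenizer, so leave headroom in the budget."""
    max_words = int(max_tokens * words_per_token)
    words = text.split()
    return [
        " ".join(words[i : i + max_words])
        for i in range(0, len(words), max_words)
    ]

# A 5,000-word document at a 2,000-token budget splits into 4 chunks.
chunks = chunk_text("word " * 5000, max_tokens=2000)
```

Production pipelines would split on paragraph or section boundaries instead of raw word counts, so that no chunk cuts a sentence or table in half.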

Tip

When deploying LLMs for banking workflows, establish standard system instructions for each use case -- compliance review, customer communication drafting, document summarization. Version-control these instructions the same way you version your compliance policies. This ensures consistency across your organization and creates an audit trail for regulators.

Quick Recap

  • Prompts are your natural-language instructions to the LLM -- their quality directly determines output quality
  • Completions are the model's generated responses, produced one token at a time
  • System instructions set persistent behavioral guidelines that shape every response, functioning like compliance rules for the AI
  • Few-shot prompting (providing examples) produces more consistent results than zero-shot for banking classification tasks
  • Temperature controls randomness (low for consistency, high for creativity) and token limits constrain the total input/output length

KNOWLEDGE CHECK

A bank wants its LLM-powered compliance tool to classify regulatory correspondence the same way every time. Which configuration is most appropriate?

What is the primary purpose of system instructions in a banking AI application?

Why is few-shot prompting particularly valuable for banking use cases compared to zero-shot prompting?