Integrating LLMs Into Your Application: Patterns That Actually Work
Randall Clapper

29 Jun, 2025

Beyond the ChatGPT Wrapper

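Everyone's building AI features now, but most integrations are shallow: take user input, send it to OpenAI, display the response. That works for demos but breaks down in production, where you need answers grounded in your own data, real actions, and predictable failure handling.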

Here are the patterns we use when integrating LLMs into real applications.


Pattern 1: RAG (Retrieval-Augmented Generation)

Use when: You need the LLM to answer questions about your data.

The problem with raw LLMs: they don't know about your company's documentation, your product catalog, or your customer data. RAG solves this by retrieving relevant context before generating.

How It Works

User Question → Search Your Data → Add Results to Prompt → LLM → Answer

Implementation Sketch

public async Task<string> AnswerQuestion(string question)
{
    // 1. Convert question to embedding
    var questionEmbedding = await _embeddingService.GetEmbedding(question);
    
    // 2. Search your vector database for relevant chunks
    var relevantDocs = await _vectorDb.Search(questionEmbedding, topK: 5);
    
    // 3. Build prompt with context
    var prompt = $"""
        Answer the user's question based on the following context.
        If the answer isn't in the context, say "I don't know."
        
        Context:
        {string.Join("\n\n", relevantDocs.Select(d => d.Content))}
        
        Question: {question}
        """;
    
    // 4. Generate answer
    return await _llm.Complete(prompt);
}

Key Decisions

  • Chunking strategy - How you split documents often matters more than which embedding model you use
  • Hybrid search - Combine vector search with keyword search for better recall
  • Reranking - Use a smaller model to rerank retrieved chunks before sending to the LLM
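
For the hybrid search decision, one common way to merge a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below is a minimal illustration under stated assumptions: the FuseRankings helper, the document IDs, and the default k = 60 are made up for the example and are not tied to any particular search product.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reciprocal rank fusion: every list contributes 1 / (k + rank) to each
// document it contains, where rank is the 1-based position in that list.
// k = 60 is a commonly used default, not a tuned value.
static List<string> FuseRankings(IEnumerable<IReadOnlyList<string>> rankings, int k = 60)
{
    var scores = new Dictionary<string, double>();
    foreach (var ranking in rankings)
    {
        for (var i = 0; i < ranking.Count; i++)
        {
            scores.TryGetValue(ranking[i], out var current); // 0 if not seen yet
            scores[ranking[i]] = current + 1.0 / (k + i + 1);
        }
    }

    return scores
        .OrderByDescending(pair => pair.Value)
        .ThenBy(pair => pair.Key, StringComparer.Ordinal) // stable tie-break
        .Select(pair => pair.Key)
        .ToList();
}

// Hypothetical result lists: document IDs ordered best-first.
var vectorHits = new[] { "doc-a", "doc-b", "doc-c" };
var keywordHits = new[] { "doc-b", "doc-d", "doc-a" };

var fused = FuseRankings(new IReadOnlyList<string>[] { vectorHits, keywordHits });
Console.WriteLine(string.Join(", ", fused)); // doc-b, doc-a, doc-d, doc-c
```

Documents that appear near the top of both lists rise to the front, and the fusion only needs ranks, so the two searches don't have to produce comparable scores.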

Pattern 2: Function Calling

Use when: You need the LLM to take actions, not just generate text.

Function calling (or "tool use") lets the LLM invoke your code. Instead of generating a text answer, it generates a structured function call that your application executes.

Example: Order Lookup

var tools = new[]
{
    new Tool
    {
        Name = "lookup_order",
        Description = "Look up an order by order number",
        Parameters = new
        {
            type = "object",
            properties = new
            {
                order_number = new { type = "string", description = "The order number" }
            },
            required = new[] { "order_number" }
        }
    }
};

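// Keep the user message in one place so both requests send the same text
var question = new Message("user", "What's the status of order ABC-123?");

var response = await _llm.Chat(
    messages: [question],
    tools: tools
);

if (response.ToolCalls.Any())
{
    // This sketch handles a single tool call; loop over ToolCalls if the model may return several
    var call = response.ToolCalls[0];
    // Validate this value before using it (see Key Decisions below)
    var orderNumber = call.Arguments["order_number"];
    var order = await _orderService.GetOrder(orderNumber);
    
    // Send result back to LLM for final response
    var finalResponse = await _llm.Chat(
        messages: [
            question,
            new("assistant", null, toolCalls: response.ToolCalls),
            new("tool", $"Order {orderNumber}: {order.Status}", toolCallId: call.Id)
        ]
    );
    // finalResponse.Content now holds the natural-language answer to show the user
}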

Key Decisions

  • Tool descriptions matter - The LLM decides which tool to call based on descriptions
  • Validate inputs - Don't trust the LLM to generate valid parameters
  • Handle errors gracefully - If a tool fails, let the LLM recover
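
To make "validate inputs" and "let the LLM recover" concrete, here is a rough sketch of checking tool arguments before they reach your services. It assumes the arguments arrive as a raw JSON string and that order numbers look like ABC-123; both are assumptions for illustration, and ParseLookupOrderArgs is a hypothetical helper, not part of any SDK.

```csharp
#nullable enable
using System;
using System.Text.Json;
using System.Text.RegularExpressions;

// Returns either a validated order number or an error message. The error can
// be sent back to the model as the tool result so it can correct its call.
static (string? OrderNumber, string? Error) ParseLookupOrderArgs(string argumentsJson)
{
    JsonDocument doc;
    try
    {
        doc = JsonDocument.Parse(argumentsJson);
    }
    catch (JsonException)
    {
        return (null, "Arguments were not valid JSON.");
    }

    using (doc)
    {
        if (doc.RootElement.ValueKind != JsonValueKind.Object ||
            !doc.RootElement.TryGetProperty("order_number", out var value) ||
            value.ValueKind != JsonValueKind.String)
        {
            return (null, "Missing string argument 'order_number'.");
        }

        var orderNumber = value.GetString()!.Trim();

        // Assumed format (three capital letters, a dash, digits); use your real rule here.
        if (!Regex.IsMatch(orderNumber, "^[A-Z]{3}-[0-9]+$"))
            return (null, $"'{orderNumber}' is not a valid order number.");

        return (orderNumber, null);
    }
}

var good = ParseLookupOrderArgs("{\"order_number\": \"ABC-123\"}");
var bad = ParseLookupOrderArgs("{\"order_number\": \"1; DROP TABLE orders\"}");

Console.WriteLine(good.OrderNumber ?? good.Error);
Console.WriteLine(bad.OrderNumber ?? bad.Error);
```

Returning the error as data rather than throwing keeps the conversation going: the message becomes the tool result, and the model gets a chance to retry with a corrected argument.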

Pattern 3: Agentic Workflows

Use when: You need the LLM to complete multi-step tasks autonomously.

An agent is an LLM in a loop: it decides what to do, takes an action, observes the result, and repeats until the task is complete.

Simple Agent Loop

public async Task<string> RunAgent(string task)
{
    var messages = new List<Message>
    {
        new("system", "You are a helpful assistant. Use tools to complete tasks."),
        new("user", task)
    };

    while (true)
    {
        var response = await _llm.Chat(messages, tools: _tools);
        messages.Add(new("assistant", response.Content, response.ToolCalls));

        if (!response.ToolCalls.Any())
            return response.Content; // Done

        foreach (var call in response.ToolCalls)
        {
            var result = await ExecuteTool(call);
            messages.Add(new("tool", result, toolCallId: call.Id));
        }
    }
}

Key Decisions

  • Set a max iterations limit - Agents can loop forever
  • Add guardrails - Restrict which tools are available based on context
  • Log everything - You'll need to debug why the agent did what it did
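
The three decisions above can be sketched in one bounded loop. This is a simplified stand-in, not a full agent: the callModel and executeTool delegates are hypothetical replacements for an LLM client and tool dispatcher, the transcript is plain strings, and the limit of 10 iterations is an arbitrary default.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// One model turn returns either a final answer or the name of a tool to call.
static async Task<string> RunAgent(
    string task,
    Func<IReadOnlyList<string>, Task<(string? Answer, string? ToolName)>> callModel,
    Func<string, Task<string>> executeTool,
    ISet<string> allowedTools,
    int maxIterations = 10)
{
    var transcript = new List<string> { $"user: {task}" };

    // Max iterations: the loop cannot run forever.
    for (var step = 1; step <= maxIterations; step++)
    {
        var (answer, toolName) = await callModel(transcript);
        if (toolName is null)
            return answer ?? "";

        // Guardrail: refuse tools outside the allow-list, but tell the model why.
        var result = allowedTools.Contains(toolName)
            ? await executeTool(toolName)
            : $"Error: tool '{toolName}' is not available.";

        transcript.Add($"tool {toolName}: {result}");

        // Logging: record every action so the run can be debugged later.
        Console.WriteLine($"[step {step}] {toolName} -> {result}");
    }

    return "Stopped: iteration limit reached before the task was completed.";
}

// A fake model that keeps requesting the same tool and never finishes.
Func<IReadOnlyList<string>, Task<(string? Answer, string? ToolName)>> stuckModel =
    _ => Task.FromResult<(string? Answer, string? ToolName)>((null, "search"));

var outcome = await RunAgent(
    "Find the latest invoice",
    stuckModel,
    tool => Task.FromResult("no results"),
    new HashSet<string> { "search" },
    maxIterations: 3);

Console.WriteLine(outcome);
```

With the stuck model, the loop logs three steps and then returns the "Stopped" message instead of spinning indefinitely.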

Anti-Patterns to Avoid

1. Putting Business Logic in Prompts

Bad:

"If the user is a premium customer, show them the discount. 
Premium customers have spent over $1000 lifetime..."

Good: Check customer status in your code, then tell the LLM what to do.
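
As a minimal sketch of the "good" version, the rule lives in code and the prompt only receives the outcome. The BuildDiscountInstruction helper and the instruction wording are hypothetical; the $1000 threshold is taken from the bad prompt above.

```csharp
using System;

// The business rule is evaluated in code, where it can be tested and audited.
static string BuildDiscountInstruction(decimal lifetimeSpend)
{
    const decimal premiumThreshold = 1000m; // "spent over $1000 lifetime"

    return lifetimeSpend > premiumThreshold
        ? "This customer is eligible for the premium discount. Mention it in your reply."
        : "Do not mention or offer any discounts.";
}

// The LLM is told what to do, not asked to work out who qualifies.
var premiumInstruction = BuildDiscountInstruction(1250m);
var standardInstruction = BuildDiscountInstruction(300m);

Console.WriteLine(premiumInstruction);
Console.WriteLine(standardInstruction);
```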

2. No Fallbacks

What happens when the API times out? When the model returns garbage? Always have a fallback path.
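
One way to build that fallback path is a small wrapper that enforces a timeout and substitutes a safe default. This is a sketch under assumptions: callModel stands in for whatever LLM client you use, and the 100 ms timeout and fallback text are only there to make the example run quickly.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static async Task<string> CompleteWithFallback(
    Func<CancellationToken, Task<string>> callModel,
    TimeSpan timeout,
    string fallback)
{
    // The token is cancelled automatically once the timeout elapses.
    using var cts = new CancellationTokenSource(timeout);
    try
    {
        var text = await callModel(cts.Token);

        // Treat an empty reply as a failure too.
        return string.IsNullOrWhiteSpace(text) ? fallback : text;
    }
    catch (OperationCanceledException)
    {
        return fallback; // timed out
    }
    catch (Exception)
    {
        // In real code, catch your client's specific exception types and log them.
        return fallback;
    }
}

// A hypothetical model call that takes far longer than the timeout allows.
var reply = await CompleteWithFallback(
    async ct => { await Task.Delay(TimeSpan.FromSeconds(5), ct); return "late answer"; },
    timeout: TimeSpan.FromMilliseconds(100),
    fallback: "Sorry, I can't answer right now.");

Console.WriteLine(reply);
```

The same wrapper is a natural place to add a retry or a cheaper backup model before giving up.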

3. Ignoring Latency

LLM calls are slow: often 500ms-2s, and longer for large prompts or long outputs. Don't put them in hot paths. Use streaming for user-facing responses so users see text as it is generated.
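
On the consuming side, streaming in C# usually means iterating an IAsyncEnumerable. The sketch below fakes the model with a hypothetical StreamCompletion function that yields fixed fragments with a delay; a real client would yield tokens as the API sends them.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

// Hypothetical stand-in for a streaming LLM client.
static async IAsyncEnumerable<string> StreamCompletion(string prompt)
{
    foreach (var fragment in new[] { "Your ", "order ", "has ", "shipped." })
    {
        await Task.Delay(50); // simulated generation time per fragment
        yield return fragment;
    }
}

var full = new StringBuilder();

await foreach (var fragment in StreamCompletion("Where is my order?"))
{
    Console.Write(fragment); // show partial output immediately
    full.Append(fragment);   // keep the complete text for logging or storage
}

Console.WriteLine();
```

Total generation time is unchanged, but the user sees the first words after one fragment's delay rather than after the whole response.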

4. No Observability

You need to see:

  • What prompts you're sending
  • What responses you're getting
  • Token usage and costs
  • Error rates and latency
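
A thin wrapper around every model call is usually enough to capture all four. The following is an illustrative sketch only: LoggedCall, the callModel delegate, and the TotalTokens field are hypothetical, and the log line format is arbitrary. In a real system you would likely redact sensitive prompt content and write to your logging or tracing stack instead of a list.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

static async Task<string> LoggedCall(
    string prompt,
    Func<string, Task<(string Text, int TotalTokens)>> callModel,
    Action<string> log)
{
    var timer = Stopwatch.StartNew();
    try
    {
        var (text, totalTokens) = await callModel(prompt);
        log($"ok latency_ms={timer.ElapsedMilliseconds} tokens={totalTokens} prompt={prompt} response={text}");
        return text;
    }
    catch (Exception ex)
    {
        // Failures are logged with the same fields so error rates can be computed.
        log($"error latency_ms={timer.ElapsedMilliseconds} type={ex.GetType().Name} prompt={prompt}");
        throw;
    }
}

var lines = new List<string>();

// A fake model call that returns a fixed reply and token count.
var reply = await LoggedCall(
    "Say hi",
    p => Task.FromResult((Text: "Hi!", TotalTokens: 7)),
    lines.Add);

Console.WriteLine(reply);
Console.WriteLine(lines[0]);
```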

Start Simple

If you're adding LLM features for the first time:

  1. Start with a single, well-defined use case
  2. Use function calling over free-form generation
  3. Add structured output (JSON mode) to get predictable responses
  4. Measure everything

Need Help?

We've integrated LLMs into applications across financial services, legal, and logistics. If you're exploring AI features and want to avoid the common pitfalls, let's talk.