If you’re using Zapier or Make to automate your work, you’ve probably thought: “I wish I could self-host this on my own server.” Good news – there’s n8n, an open-source workflow automation platform that runs on your VPS. Combine it with Ollama for running local AI models, and you have a complete AI automation system without depending on any external services.
In this article, I’ll guide you through installing n8n + Ollama with Docker Compose, then build some real-world workflows to show you how powerful this combo can be.
What are n8n and Ollama?
n8n (pronounced “n-eight-n”) is a workflow automation platform similar to Zapier, but self-hosted and open source. You drag and drop nodes to create workflows: receive webhooks, read emails, call APIs, process data, send notifications… Almost everything Zapier can do, n8n can do too, but it runs on your own server, so there are no task limits and no per-execution pricing.
Ollama – you may already know it from previous articles in this series. It runs LLMs (Llama, Mistral, Gemma, Qwen…) right on your local machine or VPS, exposed via an OpenAI-compatible API.
Combine these two: n8n handles workflow logic, Ollama handles the AI part. Everything runs on your own VPS, data doesn’t go anywhere.
Installing n8n + Ollama with Docker Compose
The simplest way is to run both in Docker. Create a docker-compose.yml file with the following content:
services:
  n8n:
    image: n8nio/n8n:latest
    container_name: n8n
    restart: unless-stopped
    ports:
      - "5678:5678"
    environment:
      - N8N_SECURE_COOKIE=false
      - GENERIC_TIMEZONE=Asia/Ho_Chi_Minh
    volumes:
      - n8n_data:/home/node/.n8n

  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

volumes:
  n8n_data:
  ollama_data:
Start them:
docker compose up -d
# Pull models for Ollama
docker exec ollama ollama pull llama3.2
docker exec ollama ollama pull qwen2.5:7b
Once the containers are up, open http://VPS-IP:5678 to reach the n8n interface. On first access it will ask you to create an admin account.
If your VPS has a GPU, add a deploy section with runtime: nvidia to the ollama service to enable GPU acceleration. See the Ollama article in this series for details.
Connecting n8n with Ollama
There are two ways for n8n to communicate with Ollama:
Method 1: HTTP Request node – Call Ollama API directly. Flexible, full control over everything. Suitable when you want to customize request body details.
Method 2: Built-in AI nodes – From version 1.x onwards, n8n has built-in AI nodes like LLM Chain, AI Agent, Chat Model… Just choose Ollama as model provider, enter URL and select model. This method is faster and more deeply integrated with n8n’s AI ecosystem.
Since both containers run on the same Docker network, n8n calls Ollama via URL http://ollama:11434 (container name instead of localhost). This is the beauty of using Docker Compose: services automatically find each other by name.
To set up credentials for Ollama in n8n: go to Settings → Credentials → Add Credential → Ollama, enter Base URL as http://ollama:11434. Then AI nodes can use this credential.
Workflow 1: Webhook Text Summarization
The first and simplest workflow: receive text via webhook, send to Ollama for summarization, return the result.
Required nodes:
- Webhook – Receive a POST request with a body containing text
- HTTP Request – Call the Ollama API
- Respond to Webhook – Return the result
Configure HTTP Request node to call Ollama:
{
  "method": "POST",
  "url": "http://ollama:11434/api/generate",
  "body": {
    "model": "llama3.2",
    "prompt": "Summarize the following content in 3 short sentences, respond in English:\n\n{{ $json.body.text }}",
    "stream": false
  }
}
Test with curl:
curl -X POST http://VPS-IP:5678/webhook/summarize \
-H "Content-Type: application/json" \
-d '{"text": "Long content that needs summarization here..."}'
Now you have your own text summarization API running on your server, and any application that can make an HTTP request can use it.
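With stream set to false, Ollama’s /api/generate endpoint returns a single JSON object whose generated text lives in the response field. A minimal sketch of the extraction step you might run in a Code node before Respond to Webhook (the mock payload below is invented for illustration):

```javascript
// Sketch of a Code node helper that extracts the summary from
// Ollama's /api/generate response (with "stream": false the reply
// is one JSON object; the generated text is in "response").
function extractSummary(ollamaJson) {
  // Guard against a missing or empty response
  const text = (ollamaJson && ollamaJson.response) || "";
  return { summary: text.trim() };
}

// Standalone demonstration with a mocked Ollama payload
const mock = { model: "llama3.2", response: "  A three-sentence summary.  ", done: true };
console.log(extractSummary(mock).summary); // → "A three-sentence summary."
```

In an actual n8n Code node, the same logic would run over the incoming items, e.g. `return items.map(item => ({ json: extractSummary(item.json) }))`.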
Workflow 2: Auto RSS Summarization and Telegram Notification
This workflow runs automatically daily: reads RSS feed, gets new articles, asks AI to summarize then sends via Telegram.
Nodes:
- Schedule Trigger – Run every morning at 8am
- RSS Feed Read – Read feed from URL (e.g. tech blog, news)
- Limit – Limit to 5 newest articles (avoid overload)
- HTTP Request – Send title + description of each article to Ollama for summarization
- Telegram – Send summary to group or personal chat
The prompt sent to Ollama looks something like:
Summarize the following article in 2-3 sentences. Keep technical terms intact.
Title: {{ $json.title }}
Content: {{ $json.contentSnippet }}
The Telegram message can be formatted like this:
📰 *{{ $json.title }}*
{{ $json.summary }}
🔗 {{ $json.link }}
Every morning you wake up to a neat news summary in Telegram, no need to open each website to read. Completely automatic, completely running on your own server.
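The Telegram formatting above can also be built in a Code node. A small sketch, assuming each item carries title, summary, and link fields as in the workflow (the sample article is invented):

```javascript
// Sketch: build the Telegram message body for one summarized article.
// Field names (title, summary, link) follow the workflow above;
// Markdown escaping is kept minimal for brevity.
function buildTelegramMessage(article) {
  return [
    `📰 *${article.title}*`,
    "",
    article.summary,
    "",
    `🔗 ${article.link}`,
  ].join("\n");
}

const msg = buildTelegramMessage({
  title: "n8n 1.x released",
  summary: "The new release adds built-in AI nodes.",
  link: "https://example.com/post",
});
console.log(msg);
```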
Workflow 3: AI-powered Form Submission Classification
Suppose you have a contact form on your website. Instead of every submission going to one inbox, use AI to classify and route to the right team.
Nodes:
- Webhook – Receive form data (name, email, content)
- HTTP Request – Send content to Ollama for classification
- Switch – Route based on classification result
- Multiple branches – Send email/Slack/Telegram to corresponding team
Classification prompt:
Classify the following customer request into ONE of these categories:
- sales (pricing inquiry, purchase, quote)
- support (errors, technical support, not working)
- partnership (collaboration, partners, affiliation)
- other (doesn't belong to above categories)
Respond with exactly 1 word: sales, support, partnership, or other.
Content: {{ $json.body.message }}
The Switch node checks Ollama’s output: if it’s “sales”, send a Slack message to the sales team; if “support”, create a ticket in your system; if “partnership”, forward the email to the boss. Simple but effective.
When using AI for classification, prompts should require fixed format responses (1 word only). Avoid letting the model respond with long text as it will be difficult to parse in subsequent nodes.
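To make that parsing robust, you can normalize the model’s answer in a Code node before the Switch. A sketch, assuming the four categories from the prompt above:

```javascript
// Sketch: normalize the model's classification output before the
// Switch node. Even with a strict prompt, small models sometimes add
// punctuation or extra words, so fall back to "other" when in doubt.
const CATEGORIES = ["sales", "support", "partnership", "other"];

function normalizeCategory(raw) {
  const cleaned = String(raw || "").trim().toLowerCase();
  // Accept answers like "Support." or "I think this is sales"
  const match = CATEGORIES.find(c => cleaned === c || cleaned.includes(c));
  return match || "other";
}

console.log(normalizeCategory("Support."));              // → "support"
console.log(normalizeCategory("I think this is sales")); // → "sales"
console.log(normalizeCategory("unsure"));                // → "other"
```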
AI Agent Nodes in n8n
Besides manually calling APIs via HTTP Request, n8n has a whole set of nodes dedicated to AI, designed with LangChain architecture:
- Chat Model (Ollama) – Connect to Ollama server, select model. This is the foundation node, other nodes will use it to call LLM.
- Basic LLM Chain – Send prompt, receive response. Similar to HTTP Request but cleaner, with built-in template variables and output parsing.
- AI Agent – The most powerful node. Agent can use tools (call API, read database, search web…) to autonomously decide how to handle tasks. You define tools, agent chooses appropriate tool.
- Window Buffer Memory – Store conversation history for agent to remember context. Useful when building chatbot or workflows needing multiple interaction steps.
Real example: you can create an AI Agent that receives requests from Telegram, agent autonomously decides whether to search Google, read database or call some API, then returns results. All configured by dragging and dropping, no coding required.
Usage: drag AI Agent node to canvas, attach Chat Model (Ollama) to model slot, add Tool nodes (HTTP Request Tool, Code Tool…) to tools slot. Agent will know when to call which tool based on your prompt.
Tips for Running n8n + Ollama
Error Handling
Ollama runs locally, so the model sometimes loads slowly or a request times out, especially on the first call after a restart. How to handle this:
- Enable Retry On Fail on the HTTP Request node: retry 2-3 times with a 5-second wait between attempts.
- Add a separate Error Trigger workflow to get notified when something fails (via Telegram or email).
- Set the HTTP Request node’s timeout higher than the default (at least 60 seconds; large models may need 120).
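n8n’s Retry On Fail does this for you at the node level; for calls you make yourself from a Code node, the same retry-and-wait pattern can be sketched like this (the function name and defaults are illustrative):

```javascript
// Sketch: retry a flaky async call a few times, waiting between
// attempts, mirroring n8n's "Retry On Fail" setting.
async function withRetry(fn, retries = 3, waitMs = 5000) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await fn(); // success: return immediately
    } catch (err) {
      lastError = err;
      if (attempt < retries) {
        // Wait before the next attempt
        await new Promise(resolve => setTimeout(resolve, waitMs));
      }
    }
  }
  throw lastError; // all attempts failed
}
```

You would wrap the Ollama call in it, e.g. `await withRetry(() => callOllama(prompt), 3, 5000)`.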
Output Parsing
LLMs don’t always return output in the format you want. Some tricks:
- Ask the model to return JSON and use a JSON Parse node to process it. Use a clear prompt like: “Respond in JSON format: {"category": "…", "confidence": 0.9}”
- Use Code node (JavaScript) to clean output: trim whitespace, lowercase, remove extra characters.
- For smaller models (7B), prompts need to be more specific. Examples in prompts (few-shot) will help model respond in correct format.
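Putting the first two tricks together: here is a sketch of a Code-node helper that digs a JSON object out of a reply that may wrap it in prose or markdown fences (the sample reply is invented):

```javascript
// Sketch: pull a JSON object out of an LLM reply that may surround it
// with prose or ```json fences. Returns null when nothing parses.
function extractJson(text) {
  // Grab everything from the first "{" to the last "}".
  // Naive, but good enough for flat single-object replies.
  const match = String(text).match(/\{[\s\S]*\}/);
  if (!match) return null;
  try {
    return JSON.parse(match[0]);
  } catch (e) {
    return null; // malformed JSON: let a later node handle the fallback
  }
}

const reply = 'Sure! Here is the result:\n```json\n{"category": "support", "confidence": 0.9}\n```';
console.log(extractJson(reply));
```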
Performance
- If workflow processes many items (like reading 20 RSS articles), don’t send all at once to Ollama. Use Loop Over Items or SplitInBatches node to process each batch.
- Small models (Qwen 2.5 7B, Llama 3.2 3B) are good enough for classification and short summarization tasks. No need to pull 70B models for everything.
- Keep Ollama running continuously instead of starting and stopping it per workflow. The first call after a model load is slow; subsequent calls are much faster because the model is already in RAM.
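For illustration, the batching that Loop Over Items / SplitInBatches performs can be sketched in a few lines of JavaScript (names and sizes are illustrative):

```javascript
// Sketch: split a list of items into fixed-size batches before
// sending them to Ollama, instead of firing all requests at once.
function splitInBatches(items, batchSize) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// 20 RSS articles, processed 5 at a time → 4 batches
const articles = Array.from({ length: 20 }, (_, i) => `article-${i + 1}`);
const batches = splitInBatches(articles, 5);
console.log(batches.length); // → 4
```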
Conclusion
n8n + Ollama is an ideal combo for anyone wanting to automate work with AI while keeping everything on their own server. n8n handles orchestration (triggers, logic, connecting services), Ollama handles AI (text generation, classification, summarization). Both are open source, run with Docker, no subscription fees.
Besides the 3 workflows above, you can also try:
- Email triage – Read new emails, AI classifies urgent/normal/spam, send notifications for important emails.
- Content repurposing – Receive long blog post, AI creates summary versions for Twitter, LinkedIn, newsletter.
- Database enrichment – Read customer list, AI analyzes and automatically adds tags/categories.
- Internal chatbot – Use AI Agent node to create chatbot answering questions based on company documents.
Do you have a VPS, Docker, and some weekend time? Give it a try, I think you’ll be surprised by what this combo can do.
About the author
Trần Thắng
Expert at AZDIGI with years of experience in web hosting and system administration.