Local AI | 28 May 2026 | 2 min read

Running AI Locally with Ollama

Set up private, low-cost AI on your own machine and stop paying for tasks that don't need the cloud.

Not every AI task needs a frontier model or a cloud round-trip. Drafting an email, summarizing a document, renaming a batch of files, asking a quick coding question — a small model running on your own laptop handles all of these well, privately, and for free.

Ollama is the simplest way to get there. It runs open models locally with a single command, and exposes them through an API that most AI tools already know how to talk to.

Why bother running AI locally

  • Cost. Recurring subscriptions and per-request API fees add up. Once the model is downloaded, local inference costs nothing beyond the hardware and electricity you already pay for.
  • Privacy. Your prompts and documents never leave your machine.
  • Availability. Once a model is downloaded, it works offline: on a plane or behind a firewall.
  • Learning. Running the model yourself demystifies what's actually happening.

This isn't about replacing the big hosted models — it's about not reaching for them when something smaller does the job.

The five-minute setup

  1. Install Ollama from the official site for your OS.

  2. Pull a model. Start small so it runs comfortably:

    ollama pull llama3.2
    
  3. Chat with it right from the terminal:

    ollama run llama3.2
    

That's a working local assistant. No account, no API key.
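
To confirm from a script that the server is up and the model was pulled, one option is to ask Ollama for its list of local models. The sketch below assumes Ollama is listening on its default address and uses the `GET /api/tags` endpoint. The helper names `parse_model_names` and `list_local_models` are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default address


def parse_model_names(tags_body: str) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    data = json.loads(tags_body)
    return [m["name"] for m in data.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask the local Ollama server which models are downloaded.

    Raises OSError (for example URLError) if the server is not reachable.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return parse_model_names(resp.read().decode("utf-8"))
```

After step 2, calling `list_local_models()` should return a list that includes an entry for the model you pulled, typically with a tag suffix such as `:latest`.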

Wiring it into your workflow

By default, Ollama serves an HTTP API at http://localhost:11434. That means any editor, script, or app that supports a custom endpoint can point at your local model, including note-takers, code editors, and command-line tools.
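
As an illustration of what "pointing a script at the local model" can look like, here is a minimal sketch using only the Python standard library. It assumes the default address above, the `llama3.2` model from the setup steps, and Ollama's `/api/generate` endpoint with streaming turned off. The function names `build_generate_request` and `ask_local` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # address given in the text above


def build_generate_request(prompt: str, model: str = "llama3.2",
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    # stream=False asks for one complete JSON reply instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str, model: str = "llama3.2") -> str:
    """Send the prompt to the local model and return its reply text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

With Ollama running, `print(ask_local("Summarize this in one sentence: ..."))` would print the model's reply. Building the request separately from sending it keeps the payload easy to inspect without a running server.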

The goal isn't to run everything locally. It's to know which tasks can run locally — and quietly move them off your bill.

Choosing a model size

A rough rule: pick the smallest model that's still good enough for the task. Bigger models are slower and need more memory; for everyday drafting and summarizing, the output of a small or mid-size model is usually hard to tell apart from a larger one's.
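
A back-of-envelope calculation shows why size matters for memory: the weights alone take roughly the parameter count times the bytes stored per weight. The sketch below assumes 4-bit quantized weights, which is an assumption for illustration and not a figure from this guide, and it counts only the weights. Real usage is higher once the context cache and runtime overhead are added. The function name and the parameter counts are illustrative.

```python
def estimate_weight_memory_gb(params_billions: float, bits_per_weight: int = 4) -> float:
    """Rough memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    # Billions of parameters times bytes per parameter gives gigabytes.
    return params_billions * bytes_per_weight


for size in (1, 3, 8, 70):  # illustrative parameter counts, in billions
    print(f"{size}B params at 4-bit: ~{estimate_weight_memory_gb(size):.1f} GB")
```

Under these assumptions a 3-billion-parameter model needs about 1.5 GB for its weights, while a 70-billion-parameter model needs about 35 GB, which is more memory than most laptops have.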

Start with one small model, use it for a week, and only reach for something larger when you actually hit its limits.
