Guide · Choosing your setup
Self-hosted vs. cloud AI agents for small business.
Once a business decides an AI agent is worth having, the next question is always the same: should it run in the cloud, or on our own hardware? The honest answer is that this is two questions wearing one coat — and once you separate them, the decision gets much easier. This guide walks through it the way we do with clients: plainly, with the trade-offs on the table.
First, separate the two questions
People say "self-hosted vs. cloud" as if it's one choice. It's two:
- Where does the agent live? The always-on system that watches your inbox, holds your files, and keeps your business memory — this can run on a machine in your office, a small dedicated box, or a server in your own cloud account.
- Where does the thinking happen? The AI model that reads and writes — this can be a frontier cloud model you pay per use (Claude and its peers), or a local model (via runtimes like Ollama) on your own hardware.
Because these are separate, you get three workable patterns, not two.
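The split into two questions can be sketched in a few lines of Python. This is an illustration only: the names `Where`, `pattern_name`, and `workable_patterns` are made up for this guide. Treating a hosted agent paired with a model on your own hardware as unworkable is an assumption used here to arrive at three patterns rather than four.

```python
from enum import Enum
from itertools import product
from typing import Optional


class Where(Enum):
    YOURS = "hardware or account you control"
    PROVIDER = "someone else's infrastructure"


def pattern_name(agent: Where, model: Where) -> Optional[str]:
    """Map the answers to the two questions onto one of the three patterns."""
    if agent is Where.PROVIDER and model is Where.PROVIDER:
        return "fully cloud"
    if agent is Where.YOURS and model is Where.PROVIDER:
        return "hybrid"
    if agent is Where.YOURS and model is Where.YOURS:
        return "fully local"
    # Assumption: a hosted agent calling a model on your own hardware
    # is treated as not workable, which leaves three patterns.
    return None


def workable_patterns() -> list[str]:
    """Enumerate all four combinations and keep the ones that have a name."""
    names = (pattern_name(agent, model) for agent, model in product(Where, Where))
    return [name for name in names if name is not None]


if __name__ == "__main__":
    print(workable_patterns())
```

Two yes-or-no answers give four combinations; the sketch keeps the three that the next section describes.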
The three patterns
1. Fully cloud. Agent and model both run on someone else's infrastructure. Fastest to start, least to maintain, and fine for many businesses, provided you have actually read and understood the terms of service, data handling, and account ownership. The main risks are data leaving your control by default and subscription creep.
2. Hybrid — the SMB sweet spot. The agent, your files, and your business memory live on a machine you control. For the heavy thinking, it calls a cloud model API under your own account and terms, sending only what each task needs. You get frontier-model capability with your documents staying home, a pay-per-use bill instead of per-seat subscriptions, and the freedom to swap models without rebuilding anything. This is the pattern we recommend most often, and the one behind the working tools on this site.
3. Fully local. Agent and model both run on your own hardware; nothing ever leaves the building. This is the right call when regulation, client contracts, or plain policy say data can't go out — and it's genuinely achievable now. The trade-offs are honest ones: a one-time hardware purchase, local models that are capable but a step behind the frontier, and a bit more care and feeding.
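The hybrid pattern's "sending only what each task needs" can be sketched as a per-task allowlist. Everything named here is hypothetical: `ClientRecord`, `TASK_FIELDS`, `build_payload`, and the `call_cloud_model` placeholder are illustrations, not a real provider API. A real setup would swap the placeholder for whichever client library your chosen model provider documents.

```python
from dataclasses import dataclass


@dataclass
class ClientRecord:
    name: str
    email: str
    quote_total: float
    internal_notes: str  # never listed below, so it never leaves the machine


# Hypothetical task definitions: each task names the only fields it may send.
TASK_FIELDS = {
    "quote_follow_up": ("name", "quote_total"),
    "greeting": ("name",),
}


def build_payload(task: str, record: ClientRecord) -> dict:
    """Return only the fields the task needs; everything else stays home."""
    allowed = TASK_FIELDS[task]  # an unknown task raises KeyError and sends nothing
    return {"task": task, "data": {name: getattr(record, name) for name in allowed}}


def call_cloud_model(payload: dict) -> str:
    """Placeholder for a real API call made under your own account and terms."""
    raise NotImplementedError("wire this to your chosen provider's client")
```

The design choice is that the default is to send nothing. A field only leaves the machine when a task explicitly lists it, which makes the outgoing data easy to review in one place.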
How to choose: four factors
- Data sensitivity. What's the most sensitive thing the agent will touch? Client health records point local. Quote follow-up emails don't. Design for the strictest constraint you actually have — not the scariest headline you read.
- Cost shape. Cloud means a modest ongoing bill that scales with use. Local means hardware up front and less after. Neither is "cheaper" in all cases; they're different shapes. Match the shape to how your business likes to spend.
- Maintenance appetite. Someone must own updates, monitoring, and backups. If nobody in-house wants that job, either stay cloud-side or arrange care from outside — an unmaintained local setup is worse than no setup.
- Capability needs. Drafting, triage, summarizing, and filing work well on local models. Complex multi-step reasoning still favours frontier cloud models. Most businesses need both — which is exactly why the hybrid pattern wins so often.
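The "cost shape" factor above comes down to simple break-even arithmetic. The sketch below assumes flat monthly costs and ignores hardware resale value, power, and staff time. The function name `break_even_months` and every figure in the example are invented for illustration, not real prices.

```python
import math
from typing import Optional


def break_even_months(hardware_cost: float,
                      local_monthly: float,
                      cloud_monthly: float) -> Optional[int]:
    """Months until cumulative local cost is at or below cumulative cloud cost.

    Returns None when local running costs are not lower than the cloud bill,
    because the up-front hardware purchase is then never recovered.
    """
    monthly_saving = cloud_monthly - local_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


if __name__ == "__main__":
    # Made-up figures: 3000 up front and 20 per month locally,
    # versus 120 per month in the cloud.
    print(break_even_months(3000, 20, 120))  # 30
```

With those made-up numbers the two shapes cross at month 30. A business planning a shorter horizon than its own break-even point is paying for a shape it will not use.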
A quick self-check
- You're probably a hybrid candidate if you handle normal business data, want your files under your roof, and prefer utility-style billing.
- You're a fully local candidate if contracts or regulation forbid data leaving, and you can budget the hardware plus care.
- You're a fully cloud candidate if you're testing the waters, have no unusual data constraints, and want the shortest possible path to seeing value.
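The self-check can also be written down as a small decision function. The field names, the order in which the rules are checked, and the "unresolved" outcome are all assumptions added for this sketch. The guide does not say what to do when data cannot leave but hardware and care are not budgeted, so the sketch only flags that case.

```python
from dataclasses import dataclass


@dataclass
class Answers:
    data_may_not_leave: bool          # contracts or regulation forbid it
    can_fund_hardware_and_care: bool
    wants_files_in_house: bool
    testing_the_waters: bool


def suggest_pattern(a: Answers) -> str:
    """Apply the self-check, strictest constraint first."""
    if a.data_may_not_leave:
        if a.can_fund_hardware_and_care:
            return "fully local"
        # Assumption: flag the conflict instead of guessing a pattern.
        return "unresolved: data cannot leave, but hardware and care are not budgeted"
    if a.testing_the_waters and not a.wants_files_in_house:
        return "fully cloud"
    # Assumption: everything else falls to the pattern recommended most often.
    return "hybrid"
```

Checking the data constraint first mirrors the advice to design for the strictest constraint you actually have.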
The part nobody puts in the comparison table
Whichever pattern you choose, the things that make an agent setup succeed are the same: guardrails on what it may do alone, human approval on anything that leaves the building, logs you can audit, and one measurable workflow to start. A well-run cloud setup beats a neglected local one every time, and a well-run local setup beats a neglected cloud one. The infrastructure choice matters; the operating discipline matters more.
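Three of those disciplines (guardrails, human approval, and an auditable log) fit in a short sketch. `Action`, `Guardrail`, and the outcome strings are hypothetical names chosen for illustration. A real deployment would also need durable log storage and a proper approval workflow, which this sketch leaves out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Action:
    kind: str        # e.g. "file_document" or "send_email"
    external: bool   # True if the result leaves the building


@dataclass
class Guardrail:
    allowed_alone: set[str]                           # actions the agent may take unattended
    audit_log: list[dict] = field(default_factory=list)

    def decide(self, action: Action, approved_by: Optional[str] = None) -> str:
        """Return "run" or "needs_approval", and record the decision either way."""
        needs_human = action.external or action.kind not in self.allowed_alone
        outcome = "needs_approval" if needs_human and approved_by is None else "run"
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "kind": action.kind,
            "external": action.external,
            "approved_by": approved_by,
            "outcome": outcome,
        })
        return outcome
```

For example, with `Guardrail(allowed_alone={"file_document"})`, filing a document runs unattended, while an outgoing email waits until `approved_by` names a person. Both decisions land in `audit_log`.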
Want the recommendation without the research project?
Our private AI agent service covers exactly this: a vendor-neutral read on which pattern fits your constraints, then the setup, connections, and care. It starts with a 20-minute Agent Fit Review. New to the topic? Start with "What is an AI agent OS?"