About

About Infersec

Infrastructure-first AI delivery for teams that need control.

Infersec connects your Linux and macOS hosts to cloud-based AI surfaces, allowing you to use your own hardware to run business-grade LLMs securely and at scale. You maintain full ownership of your hardware, data and how you route requests into your infrastructure.

Explore documentation
icon Main Features

Control plane for secure, composable AI delivery

Connect private hardware, expose compatible APIs, route intelligently — own your AI stack end to end

icon

OpenAI & Anthropic-compatible endpoints

Drop-in support for existing SDKs and clients without protocol rewrites.

icon

Connect Linux and macOS hosts

Run conduit workers on your own machines and keep model execution private.

icon

Intelligent routing with failover

Session stickiness, load balancing, priority routing, and automatic offline-source detection across your inference fleet.

icon

Self-hosted MCP gateways

Expose MySQL, Postgres, and MariaDB through scoped MCP gateways — attach tool servers per endpoint with policy-aware access controls.

icon

Privacy-first by design

No prompt logging, no tool-call content storage. Your data stays on your infrastructure in self-hosted deployments.

icon

Pluggable telemetry

Ship logs, traces, and Prometheus-format metrics to your existing observability stack.

icon

Server-side tool calling

Tool calls are intercepted and executed on the server via MCP tool servers — results are fed back into the conversation automatically, transparent to the client.

icon

European provider

Infrastructure and data handling operate under EU regulations. No data leaves European jurisdiction.

icon

Affordable usage-based pricing

Pay only for what you use — per token and per connected source hour. No fixed fees, no minimums.

icon Common questions

Frequently Asked Questions

Everything you need to know about running AI endpoints from your own hardware.

Yes. Install the Infersec conduit on any Linux or macOS machine — mini PCs, desktops, Macbooks, rackmount servers. If it can run an inference engine like llama.cpp or vLLM, it can be a source. Always get the permission of your organisation first.

You can configure models that run on your own hardware, or use hosted models that Infersec provides. In both cases, prompt data never leaves your account and is not used for training purposes.

Infersec detects offline sources automatically and reroutes traffic to healthy sources using fallback chains and priority rules. Endpoints stay available even when individual machines drop off.

That depends on your hardware and the model you choose. With a modern GPU or Apple Silicon, local models like Llama, Mistral, and Qwen deliver responses competitive with hosted services — often faster since there's no queue behind other tenants.

For many workloads, yes. Apple Silicon handles 7B–70B parameter models well, and Infersec's routing lets you chain multiple machines together. You won't match the largest proprietary models on every benchmark, but for most agent workflows, internal tools, and day-to-day inferencing, your own hardware is more than enough — and you keep full control of the data.

Yes — Infersec provides server-side tool execution for just this purpose. The LLM will see the database tools within our system, but the actual tool itself does not need to be publicly exposed.