LIMITED CLOSED BETA
The operating system for bare-metal AI

From rack power-on to first token in minutes, not months.

We provision GPU clusters from bare rack to production in under an hour, activate models in under a second, and keep every GPU dollar working — not waiting on storage, restarts, or the cloud.

Request Beta Access → See How It Works
<200ms
Model activation time
via FlashActivate
~1hr
Bare-metal cluster
provisioning time
#1 #2 #3
IO500 10-Node Production
Top 3 — all powered by DAOS
0%
Cloud lock-in.
You own everything.

Your GPUs are expensive. They shouldn't sit idle waiting on storage.

AI adoption is accelerating, but on-prem GPU clusters are crippled by operational friction that burns compute budgets and stalls production inference.

3–10 minute model swap delays

Every model swap means copying weights, restarting the inference engine, and warming it back up. On a cluster of NVIDIA and AMD GPUs, that idle time costs thousands of dollars per occurrence.

🔒

I/O and metadata bottlenecks

RAG pipelines, agentic systems, and multi-modal workflows stall on storage reads. Legacy POSIX filesystems choke on the small-object metadata patterns that AI inference generates.

💥

State loss on failure

Agentic and multi-modal systems lose state across restarts. In mission-critical deployments — finance, healthcare, defense — that means downtime risk and data loss.

☁️

Cloud lock-in vs. sovereignty

Enterprises demand on-premises control for data residency, latency, and cost. But the tooling to run bare-metal AI with cloud-grade manageability doesn't exist. Until now.

Enakta is the full-stack bare-metal AI platform.

Provision, store, activate. One platform from PXE boot to production inference — no Kubernetes, no virtualization, no cloud dependency.

3–10
minutes
Conventional model reload
(copy weights → restart engine → warm up)
<200
milliseconds
FlashActivate model swap
(atomic pointer swap in Enakta Storage → serve)
enakta-cli — model activation
# Snapshot model into Enakta Storage registry
$ enakta flash snapshot --model llama-3-70b --pool prod-inference
✓ Checkpoint snapshotted to Enakta Storage

# Activate on production workers — atomic swap
$ enakta flash activate --model llama-3-70b --strategy canary
✓ Canary: 2/16 workers activated [138ms]
✓ Health check passed
✓ Rolling to all workers [194ms]
✓ Model live. Zero downtime.

# Hot-swap LoRA adapter — no engine restart
$ enakta flash lora --adapter customer-finance-v3 --rdma
✓ LoRA loaded via RDMA [42ms]
Core

Bare-Metal Provisioning

Deploy and configure hundreds of GPU nodes in ~1 hour via PXE boot and immutable OS images. Stateless design — any node can be replaced without cluster reconfiguration.

Storage

Enakta Storage Platform

End-to-end deployment, monitoring, and recovery of high-performance storage clusters built on DAOS. Offline-resilient architecture ensures storage continuity even during network partitions.

Flagship

FlashActivate — Sub-Second Model Swap

Atomic registry pointer swap in Enakta Storage. No file copy. <200ms activation. 20x+ faster time to first token than a conventional reload.
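The pointer-swap idea can be sketched in a few lines. This is a hypothetical illustration, not Enakta's implementation: the registry maps a serving alias to an immutable snapshot, so activation updates one entry under a lock instead of copying weights.

```python
import threading

# Hypothetical sketch of an atomic-pointer-swap model registry.
# Snapshots are immutable; "activation" only moves the alias pointer.
class ModelRegistry:
    def __init__(self):
        self._lock = threading.Lock()
        self._alias_to_snapshot = {}

    def activate(self, alias, snapshot_id):
        """Point a serving alias at a new snapshot; no weight copy."""
        with self._lock:
            previous = self._alias_to_snapshot.get(alias)
            self._alias_to_snapshot[alias] = snapshot_id
        return previous  # kept so a rollback is just another swap

    def resolve(self, alias):
        """Workers call this per request to find the live snapshot."""
        with self._lock:
            return self._alias_to_snapshot.get(alias)
```

Because the old snapshot stays in storage, rolling back is the same O(1) operation as rolling forward.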

Ops

Canary Rollouts & Cross-Pool Failover

Blue/green model deploys with automatic rollback on quality regression. Session affinity preserved across pool migrations. vLLM and TGI plugin integration.
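A rough sketch of the canary-then-roll logic described above. The callbacks `activate`, `rollback`, and `health_ok` are placeholders, not Enakta APIs:

```python
# Hypothetical sketch: activate a new model on a small canary slice,
# health-check it, then either roll forward or roll the canaries back.
def canary_rollout(workers, activate, rollback, health_ok, canary_fraction=0.125):
    n = max(1, int(len(workers) * canary_fraction))
    canary, rest = workers[:n], workers[n:]
    for w in canary:
        activate(w)
    if not all(health_ok(w) for w in canary):
        for w in canary:
            rollback(w)  # revert canaries; the rest of the fleet never changed
        return False
    for w in rest:
        activate(w)      # canary healthy: roll to the remaining workers
    return True
```

With 16 workers and the default fraction, the canary slice is 2 workers, matching the `2/16` step in the CLI example above.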

Performance

KV Cache Tiering & RDMA LoRA

HBM → CPU DRAM → Enakta Storage cache tiering maximizes effective context window. RDMA-loaded LoRA adapters enable per-request personalization without engine restarts.
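The tiering behaves like a multi-level LRU: a hit in a slower tier promotes the entry upward, and eviction from a fast tier demotes the entry downward instead of discarding it. A toy sketch of that policy, with illustrative tier sizes standing in for HBM, DRAM, and Enakta Storage:

```python
from collections import OrderedDict

# Toy three-tier LRU cache. Tier 0 stands in for HBM, tier 1 for CPU
# DRAM, tier 2 for storage; capacities are illustrative only.
class TieredKVCache:
    def __init__(self, capacities=(4, 16, 64)):
        self.tiers = [OrderedDict() for _ in capacities]  # fastest first
        self.capacities = capacities

    def get(self, key):
        for tier in self.tiers:
            if key in tier:
                value = tier.pop(key)
                self._put(0, key, value)  # promote hit to the fastest tier
                return value
        return None  # full miss: caller recomputes attention for this prefix

    def put(self, key, value):
        self._put(0, key, value)

    def _put(self, level, key, value):
        if level >= len(self.tiers):
            return  # fell off the last tier: dropped from the cache
        tier = self.tiers[level]
        tier[key] = value
        tier.move_to_end(key)
        if len(tier) > self.capacities[level]:
            old_key, old_value = tier.popitem(last=False)
            self._put(level + 1, old_key, old_value)  # demote coldest entry
```

The effect is that a context which no longer fits in HBM is served from a slower tier rather than recomputed, which is what stretches the effective context window.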

Enakta Storage Platform — built on DAOS, the engine behind the top IO500 results worldwide.

The Enakta Storage Platform is built on DAOS (Distributed Asynchronous Object Storage), the open-source engine that dominates the IO500 global benchmark. We make it production-ready for AI.

#1 #2 #3
IO500 10-Node Production SC25
Top 3 all run on DAOS, the
engine behind Enakta Storage
10x+
Per-server I/O advantage
vs conventional parallel
filesystems
0
Kernel I/O overhead
User-space RDMA bypasses
the OS entirely
DAOS Foundation founding members: Argonne National Lab, Enakta Labs, Google, HPE, Intel

Phased expansion from activation to full AI stack.

Q1–Q2 2026 · Shipping

Core + FlashActivate

Production-hardened bare-metal provisioning and sub-second model activation. vLLM/TGI compatibility. 8,000+ lines, 220+ automated tests.

Q3 2026 – Q1 2027

Inference & Agentic

Enakta Recall: native RAG with tiered embedding retrieval. Enakta Swarm: shared KV cache for multi-agent state coordination.

2027–2028+

Multi-Modal & HPC

Sharded media checkpoints, continual fine-tuning, and upstream data pipeline acceleration.

Stop paying for idle GPUs.
Start shipping production AI.

The Enakta Labs AI Platform is currently in limited closed beta. If you're running bare-metal GPU infrastructure and want early access, we'd love to hear from you.