doradus-research

Doradus Research is the operator's-eye view of a small on-prem AI cluster: consumer-grade GPUs, multi-vendor inference, multi-model serving. We publish what we learn the hard way: dead ends, configurations that didn't work, and the recipes that ended up in production.

We also take on commissioned engagements: formally verified arXiv papers structured to pass peer review, patent-research services, custom model training, and 3D scanning / reconstruction / printing. Same operators, same cluster, written deliverables.

Compute: 3 GPU compute nodes carrying 10× NVIDIA RTX PRO 6000 Blackwell (95 GiB each, SM12.0a) + 4× RTX 5090 — partitioned across rotation pools and dedicated services. 2× NVIDIA DGX Spark (GB10, 128 GiB UMA each) for medium-tier always-on serving + content workflows. 2× Apple Mac Studio M3 Ultra (256 GiB UMA each) for Apple-Silicon-native inference. ~1.3 TB system RAM total. All on-prem. Built to accommodate ~40 daily users across internal tooling, research pipelines, and ad-hoc inference.

Storage: ~75 TB across four tiers — hot NVMe per node, an erasure-coded warm cluster, a shared NFS model cache, and SMB cold archive. Promotion / demotion between tiers is automatic.
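The promotion/demotion policy isn't spelled out here, but the shape of an age-based rebalancer is easy to sketch. The tier names, thresholds, and `rebalance` function below are hypothetical illustration, not our production code:

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    HOT = 0    # per-node NVMe
    WARM = 1   # erasure-coded cluster
    COLD = 2   # SMB archive


@dataclass
class Artifact:
    name: str
    tier: Tier
    days_since_access: int


# Hypothetical thresholds: recently-touched artifacts move one tier
# toward hot; stale ones move one tier toward cold.
PROMOTE_WITHIN_DAYS = 1
DEMOTE_AFTER_DAYS = {Tier.HOT: 7, Tier.WARM: 30}


def rebalance(a: Artifact) -> Tier:
    if a.days_since_access <= PROMOTE_WITHIN_DAYS and a.tier > Tier.HOT:
        return Tier(a.tier - 1)  # promote one step toward hot
    limit = DEMOTE_AFTER_DAYS.get(a.tier)
    if limit is not None and a.days_since_access > limit:
        return Tier(a.tier + 1)  # demote one step toward cold
    return a.tier
```

One step per pass keeps moves cheap and lets an artifact's access pattern, not a single spike, decide where it settles.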

Network: 100 GbE fiber-optic backbone with managed switching across all nodes, fronted by a dedicated perimeter firewall appliance.

Orchestration: modern scheduler + service mesh with mTLS on every internal hop, including localhost. Inter-host trust over a private overlay.
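"mTLS on every hop" reduces to one rule: no peer, not even loopback, is accepted without a valid client certificate. As a hypothetical sketch of that posture (not our actual mesh config), Python's ssl module expresses it directly; `mtls_server_context` and the CA path are illustrative names:

```python
import ssl
from typing import Optional


def mtls_server_context(ca_path: Optional[str] = None) -> ssl.SSLContext:
    """Sketch: a server-side context that rejects any connection
    lacking a valid client certificate -- mTLS even on localhost."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    ctx.verify_mode = ssl.CERT_REQUIRED  # the "m" in mTLS
    if ca_path:
        # Trust only the mesh-internal CA, not the system store.
        ctx.load_verify_locations(cafile=ca_path)
    return ctx
```

In a real mesh the sidecar or proxy holds this policy, but the invariant is the same: `CERT_REQUIRED` everywhere, anchored to a private CA.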

The interesting bugs live at the seams of a multi-vendor inference stack. That's where we publish from.

Code at github.com/DoradusResearch. Cross-posted on Hugging Face at huggingface.co/DoradusResearch (forthcoming).

Reach out to commission research, or find us on GitHub and X (@DoradusAI). We also read r/LocalLLaMA and the vLLM Discord.