NVIDIA DGX Spark — Primary Compute

The DGX Spark serves as the primary inference and orchestration node for the CorpusIQ agent platform. It handles model inference, cron scheduling, memory management, and the majority of operational workloads.

Hardware

The DGX Spark provides an NVIDIA GPU with CUDA support. Running inference locally avoids API latency and per-call cost for lightweight tasks. The machine runs 24/7 and recovers automatically from agent process failures.
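
The recovery mechanism is not described here. As a rough illustration only, a restart-on-failure supervisor could look like the sketch below. The function name `supervise`, the restart limit, and the linear backoff are all assumptions for the example; a production setup would more likely rely on a service manager.

```python
import subprocess
import sys
import time


def supervise(cmd: list[str], max_restarts: int = 3, backoff_s: float = 1.0) -> int:
    """Run cmd, restarting it after a non-zero exit.

    Returns the number of restarts that were needed before a clean exit.
    Raises RuntimeError once max_restarts is exhausted.
    """
    restarts = 0
    while True:
        code = subprocess.run(cmd).returncode
        if code == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError(
                f"{cmd[0]} still failing after {max_restarts} restarts (exit {code})"
            )
        restarts += 1
        time.sleep(backoff_s * restarts)  # linear backoff between attempts


if __name__ == "__main__":
    # Hypothetical agent command; here just a Python one-liner that exits cleanly.
    print(supervise([sys.executable, "-c", "pass"]))
```

The restart cap keeps a permanently broken process from looping forever, at the cost of needing a separate alert when the cap is hit.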

Software Stack

Hermes Agent

Primary orchestration framework. Upgraded from v0.15.1 to v0.16.0, incorporating 426 commits of new capabilities: model switching, fallback chains, browser tooling, skill bundles, and gateway integrations.
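
To make the idea of a fallback chain concrete, here is a minimal sketch: try each model in order and return the first successful reply. This is not Hermes Agent's API. The names `run_with_fallback`, `flaky_local`, and `stable_remote` are hypothetical stand-ins for illustration.

```python
from typing import Callable, Sequence


class AllModelsFailed(RuntimeError):
    """Raised when every model in the chain has failed."""


def run_with_fallback(
    prompt: str,
    chain: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order.

    Returns (name, reply) from the first call that succeeds.
    """
    errors: list[str] = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllModelsFailed("; ".join(errors))


# Hypothetical stand-ins for real model clients.
def flaky_local(prompt: str) -> str:
    raise TimeoutError("local model timed out")


def stable_remote(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    chain = [("local", flaky_local), ("remote", stable_remote)]
    print(run_with_fallback("hello", chain))
```

Collecting the per-model errors before raising makes it easier to see why the whole chain failed, rather than only the last attempt.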

Ollama Local Inference

ollama pull qwen3.6:27b
ollama pull deepseek-r1:32b

Local models handle routine tasks with no per-request API cost and sub-100ms latency.
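
Once a model is pulled, it can be queried over Ollama's local HTTP API. The sketch below assumes Ollama is listening on its default port (11434) and uses the non-streaming form of the `/api/generate` endpoint. The helper names `build_request` and `generate` and the example prompt are illustrative, not part of the platform.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a locally served model."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the request and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires a running Ollama server with the model already pulled.
    print(generate("qwen3.6:27b", "Summarize: the nightly job finished without errors."))
```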

Multi-Model Router

Task complexity determines model selection. The router defaults to local models (Qwen, served through Ollama), escalates to DeepSeek for complex reasoning, and escalates to Claude Opus for strategic decisions. This yields approximately 65% cost savings compared to routing every task to a premium model.
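
The routing rule above can be sketched as a small lookup from a complexity tier to a model. How the platform actually scores complexity is not stated, so the keyword heuristic in `classify`, the hint lists, and the `claude-opus` identifier are placeholders for illustration.

```python
from enum import Enum


class Complexity(Enum):
    ROUTINE = 1
    COMPLEX = 2
    STRATEGIC = 3


# Tier table following the text: local by default, then two escalation levels.
# "claude-opus" is a placeholder identifier, not a real API model name.
ROUTES = {
    Complexity.ROUTINE: "qwen3.6:27b",
    Complexity.COMPLEX: "deepseek-r1:32b",
    Complexity.STRATEGIC: "claude-opus",
}

# Hypothetical keyword hints; a real router would use a richer signal.
STRATEGIC_HINTS = ("strategy", "roadmap", "prioritize")
COMPLEX_HINTS = ("debug", "analyze", "prove")


def classify(task: str) -> Complexity:
    """Assign a complexity tier from simple keyword matches."""
    text = task.lower()
    if any(hint in text for hint in STRATEGIC_HINTS):
        return Complexity.STRATEGIC
    if any(hint in text for hint in COMPLEX_HINTS):
        return Complexity.COMPLEX
    return Complexity.ROUTINE


def select_model(task: str) -> str:
    """Return the model that should handle the task."""
    return ROUTES[classify(task)]


if __name__ == "__main__":
    for task in (
        "Tag this email as spam or not",
        "Debug the failing cron job",
        "Draft the Q3 product roadmap",
    ):
        print(f"{task!r} -> {select_model(task)}")
```

Defaulting to the routine tier means an unrecognized task stays on the local model, which matches the cost-first default described above.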

GBrain Memory

Persistent knowledge layer: 729 indexed files, a PGlite database, and nomic-embed-text embeddings at 768 dimensions. A nightly "dream cycle" consolidates knowledge at 03:00.
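
Retrieval over an embedding index typically ranks stored vectors by similarity to a query vector. The sketch below shows cosine-similarity ranking over 768-dimensional vectors in plain Python. It does not reflect GBrain's actual storage or query path. The in-memory `index` dictionary, the file names, and the one-hot toy vectors are assumptions for the example.

```python
import math

EMBEDDING_DIM = 768  # dimensionality stated for the nomic-embed-text embeddings


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Return the k entries most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def one_hot(position: int) -> list[float]:
    """Toy embedding: all zeros except a single 1.0 (real embeddings are dense)."""
    vec = [0.0] * EMBEDDING_DIM
    vec[position] = 1.0
    return vec


if __name__ == "__main__":
    # Hypothetical file names standing in for indexed documents.
    index = {"notes/a.md": one_hot(0), "notes/b.md": one_hot(1)}
    print(top_k(one_hot(1), index, k=1))
```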

Cron Infrastructure

24 scheduled processes: email monitoring, social publishing, video generation, knowledge consolidation, GitHub monitoring, reporting, and self-improvement cycles.
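
A daily job such as the 03:00 knowledge consolidation corresponds to the standard cron expression `0 3 * * *`. As a small illustration of the scheduling arithmetic, the sketch below computes the next run of a daily job. It assumes naive local-time datetimes, and the function name `next_daily_run` is hypothetical.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, at: time) -> datetime:
    """Return the next occurrence of a daily wall-clock time strictly after now.

    Assumes naive (timezone-unaware) datetimes in local time.
    """
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has passed, so use tomorrow's
    return candidate


if __name__ == "__main__":
    # Equivalent to the cron expression "0 3 * * *".
    print(next_daily_run(datetime.now(), time(3, 0)))
```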

Why DGX Spark

  • Zero-cost local inference for routine tasks
  • Data privacy — sensitive operations never leave the machine
  • Sub-100ms latency for local models
  • Silent 24/7 operation in a workspace environment