Problem statement
Current AIOps prototypes excel in demos and break in production. They hallucinate commands, correlate text without grounding, and collapse into consensus when run as multi-agent debates. SentinelCloud is a closed-loop system that treats every step as a measurable contract.
Approach
A six-agent state machine separates analysis, dissent, planning, validation, safety, and outcome prediction. Actions pass a deterministic policy gate, a calibrated confidence gate, and a blast-radius gate before execution. Every run is logged as an episode for memory recall.
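The three gates can be pictured with a minimal sketch. All identifiers, thresholds, and the deny-list below are illustrative assumptions, not the repo's actual policy constitution:

```python
from dataclasses import dataclass

@dataclass
class Action:
    command: str
    confidence: float      # calibrated confidence from the outcome-prediction agent
    blast_radius: int      # number of resources the action could touch

# Hypothetical deny-list of destructive verbs (assumption, not the real constitution)
POLICY_DENYLIST = {"delete", "drop", "terminate"}

def policy_gate(action: Action) -> bool:
    """Deterministic check: reject commands containing denied verbs."""
    return not any(verb in action.command for verb in POLICY_DENYLIST)

def confidence_gate(action: Action, threshold: float = 0.8) -> bool:
    """Calibrated confidence must clear a fixed threshold."""
    return action.confidence >= threshold

def blast_radius_gate(action: Action, max_resources: int = 5) -> bool:
    """Cap how many resources a single action may affect."""
    return action.blast_radius <= max_resources

def approve(action: Action) -> bool:
    """An action executes only if every gate passes."""
    return all(gate(action) for gate in (policy_gate, confidence_gate, blast_radius_gate))

safe = Action("kubectl rollout restart deploy/api", confidence=0.92, blast_radius=1)
risky = Action("kubectl delete namespace prod", confidence=0.95, blast_radius=40)
print(approve(safe), approve(risky))  # prints: True False
```

The point of chaining the gates with `all` is that a high-confidence action can still be vetoed deterministically: `risky` clears the confidence gate but fails both the policy and blast-radius gates.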
Reproducibility
Seven scenarios are seeded fixtures. Same input, same orchestrator, same KPIs. The LLM gateway has a deterministic stub fallback so the demo runs offline. Source code, prompts, and the policy constitution all live in the repo.
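One way to picture the offline stub fallback, assuming hypothetical class and method names rather than the gateway's real API:

```python
import hashlib

class StubLLM:
    """Deterministic fallback: the same prompt always yields the same canned reply."""
    CANNED = [
        "Symptom correlates with recent deploy; recommend rollback.",
        "Metrics within baseline; no action required.",
        "Pod restart loop detected; suggest raising the memory limit.",
    ]

    def complete(self, prompt: str) -> str:
        # Hash the prompt so the choice is stable across runs: no RNG, no network.
        idx = int(hashlib.sha256(prompt.encode()).hexdigest(), 16) % len(self.CANNED)
        return self.CANNED[idx]

class Gateway:
    """Routes to a real provider when available, else to the deterministic stub."""

    def __init__(self, provider=None):
        self.provider = provider   # real LLM client, or None when offline
        self.stub = StubLLM()

    def complete(self, prompt: str) -> str:
        if self.provider is None:
            return self.stub.complete(prompt)
        try:
            return self.provider.complete(prompt)
        except Exception:
            # Provider unreachable: degrade to the stub so the demo still runs.
            return self.stub.complete(prompt)

gw = Gateway()  # offline: no provider configured
assert gw.complete("disk full on node-3") == gw.complete("disk full on node-3")
```

Hashing the prompt rather than sampling keeps the "same input, same KPIs" property even in stub mode.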
Baselines we compare against
- Single-LLM zero-shot: no tools, no debate, free-form chain-of-thought.
- Single-LLM with tools: tool calling but no critic, no policy gate.
- Naive debate (no devil): multiple agents but no contractually pinned dissenter.
- SentinelCloud (this work): adversarial debate, blast radius, semantic policy, calibration, memory.
- Oracle upper bound: cheats with the ground-truth scenario answer; reports the ceiling.
Honest limitations
- Demo runs against simulated topologies. Connector mode against a real GCP / K8s project is implemented but disabled by default.
- LLM cost depends on provider; the gateway falls back to a stub when no provider is reachable.
- The Process Reward Model is heuristic; a learned PRM is future work.
- Evaluation set is seven scenarios, not hundreds. Scale-out is straightforward but out of scope for the capstone window.
Cite this work
@misc{kumar2026sentinelcloud,
  title  = {SentinelCloud: A Closed-Loop Multi-Agent System for Autonomous Cloud DevOps},
  author = {Kumar, Rohit},
  year   = {2026},
  note   = {BTech CSE Cloud Computing capstone, Shoolini University},
  url    = {https://sentinelcloud.dmj.one}
}