Capstone v1.0 · May 2026 · BTech CSE Cloud Computing

The autonomous DevOps engineer
that thinks before it acts.

SentinelCloud is a closed-loop multi-agent system. It observes signals, debates the root cause, predicts the outcome of a fix, gates the action against a written constitution, and only then acts. If it is not sure enough, it pauses for a human. Every step is logged. Every claim is a measurable KPI.

Run a live demo · See the architecture. No login. No keys required.
Layers: 3 (perception → reasoning → actuation)
Agents: 6 (analyst, devil, safety, strategist, verifier, critic)
Scenarios: 7 (reliability, FinOps, security, drift)
Gaps closed: 12 (from 2025–2026 AIOps literature)
The seven scenarios

Pick an incident. Watch the agents debate, decide and act.

Every scenario is a deterministic fixture. The same run will look the same to your reviewer and to the next person who clones the repo.

The work

Twelve research gaps. Twelve modules. One closed loop.

Each module maps to a real, named gap that current AIOps work struggles with. Click into Architecture for the full mapping.

G1

Tool Selector Critic

A second small model verifies every tool call against the registry before dispatch.
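A minimal sketch of that pre-dispatch check, assuming a simple schema registry. The names `TOOL_REGISTRY` and `validate_call` are illustrative, not SentinelCloud's API:

```python
# Hypothetical registry a critic could gate tool calls against.
TOOL_REGISTRY = {
    "restart_pod": {"namespace": str, "pod": str},
    "scale_deployment": {"namespace": str, "deployment": str, "replicas": int},
}

def validate_call(tool: str, args: dict) -> bool:
    """Reject any call whose tool name, argument set, or argument types
    do not match the registry entry."""
    schema = TOOL_REGISTRY.get(tool)
    if schema is None:
        return False                      # unknown tool: never dispatch
    if set(args) != set(schema):
        return False                      # missing or extra arguments
    return all(isinstance(args[k], t) for k, t in schema.items())

print(validate_call("restart_pod", {"namespace": "prod", "pod": "api-7f"}))  # True
print(validate_call("drop_database", {"name": "users"}))                     # False
```

In the real system the critic is a second model, but a deterministic schema check like this is the floor it cannot go below.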

G2

Topology-Aware Reasoner

Logs, metrics and traces are joined over a service-graph, not in isolation.

G3

Adversarial Debate

A Devil’s Advocate is contractually pinned to disagree, breaking groupthink.

G4

Blast Radius Calculator

A BFS over the dependency graph scores impact 0–100; actions scoring above 70 are gated to a human.
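The gating logic can be sketched like this. The reachability-based score and the toy graph are illustrative assumptions; only the 70/100 gate comes from the text:

```python
from collections import deque

def blast_radius(graph: dict, target: str) -> int:
    """Score 0-100: fraction of other services reachable downstream of target."""
    seen, queue = {target}, deque([target])
    while queue:                          # plain BFS over the dependency edges
        for dep in graph.get(queue.popleft(), []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return round(100 * (len(seen) - 1) / max(len(graph) - 1, 1))

graph = {"db": ["api"], "api": ["web", "worker"], "web": [], "worker": [], "cache": []}
score = blast_radius(graph, "db")        # db -> api -> {web, worker}: 3 of 4 others
print(score, "gate to human" if score > 70 else "auto-act")  # 75 gate to human
```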

G5

Counterfactual Memory

Rejected alternatives are stored alongside accepted actions for next time.

G6

Semantic Policy Engine

Plain-English constitution with deterministic and LLM-validated checks.
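A sketch of what the deterministic half of such a check might look like; the rule texts and action fields here are invented for illustration:

```python
# Each constitution clause pairs its plain-English text with a deterministic check.
RULES = [
    {"text": "Never delete a production database.",
     "check": lambda a: not (a.get("verb") == "delete"
                             and a.get("target", "").startswith("prod/db"))},
    {"text": "Scaling changes may not exceed 3x current replicas.",
     "check": lambda a: a.get("new_replicas", 0) <= 3 * a.get("old_replicas", 1)},
]

def gate(action: dict):
    """Return ('blocked', violated rule texts) or ('allowed', [])."""
    violations = [r["text"] for r in RULES if not r["check"](action)]
    return ("blocked", violations) if violations else ("allowed", [])

print(gate({"verb": "delete", "target": "prod/db/users"}))
# ('blocked', ['Never delete a production database.'])
```

Clauses that cannot be reduced to a predicate would fall through to the LLM-validated path.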

G7

Deterministic Scenarios

Seeded fixtures so KPI claims are reproducible byte-for-byte.

G8

Cost-Risk Optimizer

Pareto over price, eviction probability and workload tolerance.
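A sketch of a Pareto filter over these axes, treating workload tolerance as a hard cap on eviction probability; the instance data and names are made up:

```python
def pareto_front(options, max_eviction):
    """Keep feasible options not dominated on (price, eviction probability)."""
    feasible = [o for o in options if o["eviction"] <= max_eviction]
    front = []
    for o in feasible:
        dominated = any(
            p["price"] <= o["price"] and p["eviction"] <= o["eviction"]
            and (p["price"] < o["price"] or p["eviction"] < o["eviction"])
            for p in feasible if p is not o
        )
        if not dominated:
            front.append(o)
    return front

options = [
    {"name": "spot-a",   "price": 0.10, "eviction": 0.20},
    {"name": "spot-b",   "price": 0.12, "eviction": 0.05},
    {"name": "spot-c",   "price": 0.15, "eviction": 0.20},  # dominated by spot-a
    {"name": "ondemand", "price": 0.40, "eviction": 0.00},
]
print([o["name"] for o in pareto_front(options, max_eviction=0.25)])
# ['spot-a', 'spot-b', 'ondemand']
```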

G9

Episodic Memory + PRM

Past resolutions are retrieved by similarity; a process reward model (PRM) scores per-step quality.

G10

Confidence Calibration

Auto-acts only above class-specific confidence thresholds; below them, a human stays on the loop.
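A sketch of class-specific gating. Only the 0.99 critical-class threshold appears elsewhere on this page; the other values are illustrative:

```python
# Per-class auto-act thresholds (0.99 for "critical" is stated in the KPI table;
# the rest are assumed for illustration).
THRESHOLDS = {"critical": 0.99, "high": 0.95, "medium": 0.90, "low": 0.80}

def decide(action_class: str, confidence: float) -> str:
    """Auto-act only above the class threshold; otherwise hand off to a human."""
    if confidence >= THRESHOLDS[action_class]:
        return "auto-act"
    return "human-on-the-loop"

print(decide("critical", 0.97))  # human-on-the-loop
print(decide("low", 0.85))       # auto-act
```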

G11

WAF Rule Synthesizer

Given a CVE, drafts a ModSecurity or Cloud Armor rule with a TTL and a citation.

G12

Multimodal Ingestor

OTLP traces, JSON logs, metrics, PR diffs, and chat, all normalized into one envelope.
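One way such an envelope might look, sketched as a dataclass; the field names are assumptions, not the project's schema:

```python
from dataclasses import dataclass, field
import time

@dataclass
class Envelope:
    """Hypothetical normalized envelope wrapping any ingested signal."""
    source: str      # e.g. "otlp", "json-log", "metric", "pr-diff", "chat"
    service: str     # service-graph node the signal attaches to
    severity: str    # severity normalized across sources
    body: dict       # original payload, kept untouched
    ts: float = field(default_factory=time.time)

e = Envelope(source="json-log", service="checkout", severity="error",
             body={"msg": "payment timeout", "latency_ms": 5100})
print(e.source, e.service, e.severity)  # json-log checkout error
```

Downstream agents then reason over one shape instead of five parsers.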

What we measure

KPIs you can audit

No vibes. Every claim is a number with a definition and a source-of-truth file.

MTTR (auto): < 5 min (incident start → fully resolved)
Noise Reduction: > 90% (alerts auto-suppressed or auto-resolved)
Drift Latency: < 60 s (manual change → AI reverts)
Deployment Success: > 99.9% (no-rollback rate)
Tool-Call Validity: > 99% (critic-verified)
Hallucination Rate: < 1% (verifier disagreement)
Cost Saved: USD (cumulative since deploy)
Confidence Calibration: 0.99 (critical-class threshold)
Open and reproducible

The whole thing is on GitHub.

Source, scenarios, prompts, and KPI math: none of it is a black box. Fork it, run it, prove the numbers, write the next paper.