Capstone v1.0 · May 2026 · BTech CSE Cloud Computing

The autonomous DevOps engineer
that thinks before it acts.

SentinelCloud is a closed-loop multi-agent system. It observes signals, debates the root cause, predicts the outcome of a fix, gates the action against a written constitution, and only then acts. If it is not sure enough, it pauses for a human. Every step is logged. Every claim is a measurable KPI.

Run a live demo · See the architecture. No login. No keys required.
Layers: 3 (perception → reasoning → actuation)
Agents: 6 (analyst, devil, safety, strategist, verifier, critic)
Scenarios: 7 (reliability, FinOps, security, drift)
Gaps closed: 12 (from 2025–2026 AIOps literature)
The seven scenarios

Pick an incident. Watch the agents debate, decide and act.

Every scenario is a deterministic fixture. The same run will look the same to your reviewer and to the next person who clones the repo.

The work

Twelve research gaps. Twelve modules. One closed loop.

Each module maps to a real, named gap that current AIOps work struggles with. Click into Architecture for the full mapping.

G1

Tool Selector Critic

A second small model verifies every tool call against the registry before dispatch.
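A minimal sketch of that pre-dispatch check, assuming a simple schema registry. The names `TOOL_REGISTRY` and `validate_call` are illustrative, not SentinelCloud's API:

```python
# Hypothetical registry a critic could gate tool calls against.
TOOL_REGISTRY = {
    "restart_pod": {"namespace": str, "pod": str},
    "scale_deployment": {"namespace": str, "deployment": str, "replicas": int},
}

def validate_call(tool: str, args: dict) -> bool:
    """Reject any call whose tool name, argument set, or argument types
    do not match the registry entry."""
    schema = TOOL_REGISTRY.get(tool)
    if schema is None:
        return False                      # unknown tool: never dispatch
    if set(args) != set(schema):
        return False                      # missing or extra arguments
    return all(isinstance(args[k], t) for k, t in schema.items())

print(validate_call("restart_pod", {"namespace": "prod", "pod": "api-7f"}))  # True
print(validate_call("drop_database", {"name": "users"}))                     # False
```

In the real system the critic is a second model, but a deterministic schema check like this is the floor it cannot go below.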

G2

Topology-Aware Reasoner

Logs, metrics and traces are joined over a service-graph, not in isolation.

G3

Adversarial Debate

A Devil’s Advocate is contractually pinned to disagree, breaking groupthink.

G4

Blast Radius Calculator

A BFS over the dependency graph scores impact 0–100; actions scoring above 70 are gated to a human.
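The gating logic can be sketched like this. The reachability-based score and the toy graph are illustrative assumptions; only the 70/100 gate comes from the text:

```python
from collections import deque

def blast_radius(graph: dict, target: str) -> int:
    """Score 0-100: fraction of other services reachable downstream of target."""
    seen, queue = {target}, deque([target])
    while queue:                          # plain BFS over the dependency edges
        for dep in graph.get(queue.popleft(), []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return round(100 * (len(seen) - 1) / max(len(graph) - 1, 1))

graph = {"db": ["api"], "api": ["web", "worker"], "web": [], "worker": [], "cache": []}
score = blast_radius(graph, "db")        # db -> api -> {web, worker}: 3 of 4 others
print(score, "gate to human" if score > 70 else "auto-act")  # 75 gate to human
```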

G5

Counterfactual Memory

Rejected alternatives are stored alongside accepted actions for next time.

G6

Semantic Policy Engine

Plain-English constitution with deterministic and LLM-validated checks.
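A sketch of what the deterministic half of such a check might look like; the rule texts and action fields here are invented for illustration:

```python
# Each constitution clause pairs its plain-English text with a deterministic check.
RULES = [
    {"text": "Never delete a production database.",
     "check": lambda a: not (a.get("verb") == "delete"
                             and a.get("target", "").startswith("prod/db"))},
    {"text": "Scaling changes may not exceed 3x current replicas.",
     "check": lambda a: a.get("new_replicas", 0) <= 3 * a.get("old_replicas", 1)},
]

def gate(action: dict):
    """Return ('blocked', violated rule texts) or ('allowed', [])."""
    violations = [r["text"] for r in RULES if not r["check"](action)]
    return ("blocked", violations) if violations else ("allowed", [])

print(gate({"verb": "delete", "target": "prod/db/users"}))
# ('blocked', ['Never delete a production database.'])
```

Clauses that cannot be reduced to a predicate would fall through to the LLM-validated path.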

G7

Deterministic Scenarios

Seeded fixtures so KPI claims are reproducible byte-for-byte.

G8

Cost-Risk Optimizer

Pareto over price, eviction probability and workload tolerance.
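A sketch of a Pareto filter over these axes, treating workload tolerance as a hard cap on eviction probability; the instance data and names are made up:

```python
def pareto_front(options, max_eviction):
    """Keep feasible options not dominated on (price, eviction probability)."""
    feasible = [o for o in options if o["eviction"] <= max_eviction]
    front = []
    for o in feasible:
        dominated = any(
            p["price"] <= o["price"] and p["eviction"] <= o["eviction"]
            and (p["price"] < o["price"] or p["eviction"] < o["eviction"])
            for p in feasible if p is not o
        )
        if not dominated:
            front.append(o)
    return front

options = [
    {"name": "spot-a",   "price": 0.10, "eviction": 0.20},
    {"name": "spot-b",   "price": 0.12, "eviction": 0.05},
    {"name": "spot-c",   "price": 0.15, "eviction": 0.20},  # dominated by spot-a
    {"name": "ondemand", "price": 0.40, "eviction": 0.00},
]
print([o["name"] for o in pareto_front(options, max_eviction=0.25)])
# ['spot-a', 'spot-b', 'ondemand']
```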

G9

Episodic Memory + PRM

Past resolutions are retrieved by similarity; a process reward model (PRM) scores per-step quality.

G10

Confidence Calibration

Auto-acts only above class-specific confidence thresholds; below them, a human stays on the loop.
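A sketch of class-specific gating. Only the 0.99 critical-class threshold appears elsewhere on this page; the other values are illustrative:

```python
# Per-class auto-act thresholds (0.99 for "critical" is stated in the KPI table;
# the rest are assumed for illustration).
THRESHOLDS = {"critical": 0.99, "high": 0.95, "medium": 0.90, "low": 0.80}

def decide(action_class: str, confidence: float) -> str:
    """Auto-act only above the class threshold; otherwise hand off to a human."""
    if confidence >= THRESHOLDS[action_class]:
        return "auto-act"
    return "human-on-the-loop"

print(decide("critical", 0.97))  # human-on-the-loop
print(decide("low", 0.85))       # auto-act
```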

G11

WAF Rule Synthesizer

Given a CVE, drafts a ModSecurity or Cloud Armor rule with a TTL and a citation.

G12

Multimodal Ingestor

OTLP traces, JSON logs, metrics, PR diffs, and chat, all normalized into one envelope.
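One way such an envelope might look, sketched as a dataclass; the field names are assumptions, not the project's schema:

```python
from dataclasses import dataclass, field
import time

@dataclass
class Envelope:
    """Hypothetical normalized envelope wrapping any ingested signal."""
    source: str      # e.g. "otlp", "json-log", "metric", "pr-diff", "chat"
    service: str     # service-graph node the signal attaches to
    severity: str    # severity normalized across sources
    body: dict       # original payload, kept untouched
    ts: float = field(default_factory=time.time)

e = Envelope(source="json-log", service="checkout", severity="error",
             body={"msg": "payment timeout", "latency_ms": 5100})
print(e.source, e.service, e.severity)  # json-log checkout error
```

Downstream agents then reason over one shape instead of five parsers.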

What we measure

KPIs you can audit

No vibes. Every claim is a number with a definition and a source-of-truth file.

MTTR (auto): < 5 min (incident start → fully resolved)
Noise Reduction: > 90% (alerts auto-suppressed or auto-resolved)
Drift Latency: < 60 s (manual change → AI reverts)
Deployment Success: > 99.9% (no-rollback rate)
Tool-Call Validity: > 99% (critic-verified)
Hallucination Rate: < 1% (verifier disagreement)
Cost Saved: USD (cumulative since deploy)
Confidence Calibration: 0.99 (critical-class threshold)
Open and reproducible

The whole thing is on GitHub.

Source, scenarios, prompts, and KPI math: none of it is a black box. Fork it, run it, prove the numbers, write the next paper.