Mr. Handy Investigates — Autonomous NOC Research Agent

The Problem

Your Team Is Wasting
The First 20 Minutes

Every alert starts the same way: an engineer wakes up with zero context and has to go find it themselves. Mr. Handy fixes that.

Before

Traditional NOC

✗ Engineer wakes at 3 AM, opens ticket cold
✗ Manually checks if other hosts are affected
✗ Googles the hostname to figure out what it does
✗ Searches runbooks for access instructions
✗ Calls site contact to get physical access info
✗ Asks a senior engineer: "Have you seen this before?"
✗ After 20+ minutes: finally starts troubleshooting

After

With Mr. Handy

✓ Alert fires → Mr. Handy immediately starts investigation
✓ Phase 1 (2 min): Correlation analysis runs automatically
✓ Phase 2 (5 min): Remote diagnostics executed
✓ Phase 3 (1 min): Context enrichment from CMDB
✓ Engineer opens ticket to a complete investigation package
✓ No detective work. No runbook hunting. Just action.
✓ Time to action under 10 minutes, not 20+

60%

Faster Diagnosis

50%

Faster Resolution

<10m

Time to Action

<5%

False Positives

How It Works

Three Phases.
One Investigation Package.

Autonomous investigation delivers actionable intelligence in under 10 minutes — no human required until it's time to act.

Phase 01

Correlation

~2 minutes

Query related alerts within ±30 min window
Check Zabbix proxy health and status
Match against historical alert patterns
Identify scope: single host vs site-wide
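
As a rough illustration of the correlation phase, the sketch below filters alerts to the ±30 minute window and labels the scope. The Alert record, its field names, and the three-host threshold for "site-wide" are assumptions made for this example, not Mr. Handy's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


# Hypothetical alert record; field names are illustrative only.
@dataclass(frozen=True)
class Alert:
    host: str
    site: str
    fired_at: datetime


WINDOW = timedelta(minutes=30)  # the ±30 min correlation window


def related_alerts(trigger: Alert, alerts: list[Alert]) -> list[Alert]:
    """Return other alerts at the same site within ±30 minutes of the trigger."""
    return [
        a for a in alerts
        if a is not trigger
        and a.site == trigger.site
        and abs(a.fired_at - trigger.fired_at) <= WINDOW
    ]


def classify_scope(trigger: Alert, alerts: list[Alert], site_wide_hosts: int = 3) -> str:
    """Label the incident 'single-host' or 'site-wide'.

    The default of 3 distinct hosts is an assumed threshold for illustration.
    """
    hosts = {a.host for a in related_alerts(trigger, alerts)} | {trigger.host}
    return "site-wide" if len(hosts) >= site_wide_hosts else "single-host"
```

Grouping by site is one plausible choice here; a real deployment might instead group by Zabbix proxy or upstream switch.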

Phase 02

Diagnostics

~5 minutes

Multi-vantage ping tests
TCP port probes (22, 80, 161)
SNMP switch port status query
OOB access attempt via IPMI / iLO / DRAC
Gateway ARP verification
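
The TCP port probe step could look something like the following, using only the Python standard library. The function names and the 2-second timeout are assumptions for this sketch; the ping, SNMP, OOB, and ARP checks are not shown.

```python
import socket

# Ports named in the diagnostics list: SSH, HTTP, SNMP.
# Note: SNMP usually runs over UDP 161, so a TCP probe of 161 only
# succeeds on devices that also accept SNMP over TCP.
PROBE_PORTS = (22, 80, 161)


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe_host(host: str, ports=PROBE_PORTS, timeout: float = 2.0) -> dict[int, bool]:
    """Probe each port in turn and map port number to reachability."""
    return {port: probe_tcp(host, port, timeout) for port in ports}
```

Calling probe_host("10.0.0.5") would return a dictionary keyed by port; the address is a placeholder.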

Phase 03

Context

~1 minute

Site contact information retrieval
Physical access requirements
Dependency mapping from CMDB
Business impact assessment
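
Dependency mapping can be pictured as a walk over "what depends on what". The sketch below assumes the CMDB data has already been exported into a plain dictionary; the host names and the DEPENDENTS structure are invented for illustration.

```python
from collections import deque

# Hypothetical CMDB export: each key maps to the items that depend on it directly.
DEPENDENTS = {
    "core-sw-01": ["esx-01", "esx-02"],
    "esx-01": ["billing-db"],
    "esx-02": ["intranet-web"],
    "billing-db": ["billing-app"],
}


def impacted_by(failed: str, dependents: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk returning everything downstream of the failed item."""
    seen = {failed}          # guards against cycles in the dependency data
    order: list[str] = []
    queue = deque([failed])
    while queue:
        node = queue.popleft()
        for child in dependents.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order
```

The resulting list is ordered nearest-first, which is one simple way to feed a business impact summary.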

Engineer Output

Not Just an Alert.
A Complete Package.

Every investigation produces six key deliverables your engineer can act on immediately — no hunting required.

Complete Diagnostic History

Every command run, every result, timestamped and organized for immediate review.

Site Contact & Access Info

Phone numbers, building access procedures, circuit locations — ready to use.

Tiered Remediation Playbook

Quick remote fix → advanced remote → field dispatch, prioritized by likelihood.

Clear Escalation Criteria

Defined thresholds for when to dispatch and what triggers a field tech visit.

Correlated Alerts Identified

Shared infrastructure failures spotted automatically across the environment.

OOB Credentials (if available)

IPMI / iLO / DRAC access details for remote recovery without field dispatch.

Technical Stack

Built on Proven
Infrastructure Tools.

No exotic dependencies. Runs on what your NOC already trusts.

Core Engine

Python 3.10+

Investigation orchestration and diagnostic automation

Monitoring Integration

Zabbix 7.0+

Alert ingestion, correlation, and problem tracking via API
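
For illustration, a correlation query against the Zabbix JSON-RPC API might be built like this. It uses the API's problem.get method with its time_from and time_till parameters; the helper name and defaults are assumptions, and authentication and the HTTP call itself are deliberately left out.

```python
import json


def build_problem_query(event_time: int, window_s: int = 1800, request_id: int = 1) -> dict:
    """Build a JSON-RPC body asking Zabbix for problems near event_time.

    event_time is a Unix timestamp; window_s=1800 gives a ±30 minute window.
    """
    return {
        "jsonrpc": "2.0",
        "method": "problem.get",
        "params": {
            "output": "extend",
            "time_from": event_time - window_s,
            "time_till": event_time + window_s,
            "sortfield": ["eventid"],
            "sortorder": "DESC",
        },
        "id": request_id,
    }


# Serialize the body for sending to the server's api_jsonrpc.php endpoint.
body = json.dumps(build_problem_query(1_700_000_000))
```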

Database

SQLite

Investigation records and remediation pattern storage
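
One possible shape for the SQLite store is sketched below, using Python's built-in sqlite3 module. The table and column names are assumptions, not the product's actual schema. The ranking query shows how stored remediation patterns could be ordered by past success rate.

```python
import sqlite3

# Assumed schema for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS investigations (
    id          INTEGER PRIMARY KEY,
    host        TEXT NOT NULL,
    started_at  TEXT NOT NULL,
    scope       TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS remediation_patterns (
    id          INTEGER PRIMARY KEY,
    symptom     TEXT NOT NULL,
    action      TEXT NOT NULL,
    successes   INTEGER NOT NULL DEFAULT 0,
    attempts    INTEGER NOT NULL DEFAULT 0
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def ranked_actions(conn: sqlite3.Connection, symptom: str) -> list[str]:
    """Return actions tried for a symptom, highest success rate first."""
    rows = conn.execute(
        "SELECT action FROM remediation_patterns "
        "WHERE symptom = ? AND attempts > 0 "
        "ORDER BY CAST(successes AS REAL) / attempts DESC",
        (symptom,),
    )
    return [row[0] for row in rows]
```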

Remote Diagnostics

urllib3 + pysnmp

Network testing, SNMP queries, and TCP port probes

OOB Management

ipmitool + SSH

IPMI / iLO / DRAC access and console server integration

Scheduling

System Cron

15-minute polling intervals with reliable execution
