Human-in-the-Loop: The Ethics of Security Automation
Why mandatory human review protects researcher reputation better than any algorithm. Building AI that knows when to stop. Part 5 of 5.

I could make this system fully autonomous. Remove the human review gates. Let it find vulnerabilities, validate them, and submit reports automatically.
I won’t.
Not because the technology can’t do it. Because I’ve seen what happens when researchers prioritize volume over judgment. Their acceptance rates crater. Programs add them to internal “problematic researcher” lists. Other programs notice.
That specific reputation damage—slow, invisible, cumulative—is worse than any technical failure.
Human-in-the-loop security automation requires mandatory human review for all submission decisions. Automation handles reconnaissance, testing, and validation—the tedious work where machines excel. Humans handle judgment calls: Is this finding worth reporting? Is the impact assessment accurate? Does the proof-of-concept clearly demonstrate the vulnerability? Quality over quantity, always.
Why Is Mandatory Human Review Non-Negotiable?
In part 2, I described how validation reduced false positives from 90% to 40%. That’s a huge improvement.
But 40% false positives is still unacceptable for direct submission.
If I submit 10 reports and 4 are invalid:
- Programs notice patterns of low-quality submissions
- Triage teams develop negative associations with my username
- Future reports get scrutinized more heavily
- Bounty amounts decrease for “problematic” researchers
The math doesn’t favor automation without human gates.
My system has hard rules (a code sketch of the gate logic follows the two lists):
Always requires human review:
- Any finding with ≥0.70 confidence
- Critical or high severity findings (any confidence)
- First submission to any new program
- Scope ambiguity detected
- Potential for dispute or pushback
Never automated:
- Report submission
- Response to program triage questions
- Scope clarification decisions
- Disclosure timing
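In code, these gates reduce to a small decision function. Here is a minimal sketch, assuming a confidence score, a severity label, and a few per-program flags; the names are illustrative, not the system's real types:

```typescript
// Minimal sketch of the review gates above; the parameter shapes are
// illustrative, not the system's actual types.
type Severity = 'low' | 'medium' | 'high' | 'critical';

interface GateContext {
  firstSubmissionToProgram: boolean;
  scopeAmbiguityDetected: boolean;
  disputeRiskFlagged: boolean;
}

type GateDecision = 'queue_for_human_review' | 'hold_for_more_evidence';

function gateFinding(confidence: number, severity: Severity, ctx: GateContext): GateDecision {
  const needsReview =
    confidence >= 0.70 ||
    severity === 'critical' ||
    severity === 'high' ||
    ctx.firstSubmissionToProgram ||
    ctx.scopeAmbiguityDetected ||
    ctx.disputeRiskFlagged;

  // There is deliberately no 'auto_submit' outcome: submission, triage replies,
  // scope clarifications, and disclosure timing always go through a human.
  return needsReview ? 'queue_for_human_review' : 'hold_for_more_evidence';
}
```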
[!WARNING] Bug bounty platforms share information. A ban from one program can affect your standing elsewhere. Programs run by the same company (Google and Meta, for example) definitely share researcher reputations internally. One careless automated submission can cascade.
What Is the Quality Over Quantity Principle?
Two hypothetical researchers:
- Researcher A: 200 reports submitted, 50 accepted (25% acceptance rate)
- Researcher B: 50 reports submitted, 40 accepted (80% acceptance rate)
Who would you rather have in your program?
Researcher B, obviously. They’re careful. They understand impact. They don’t waste triage time.
My system optimizes for Researcher B’s pattern (a configuration sketch follows this list):
- Confidence threshold (0.70+) before a finding enters the human review queue
- Detailed validation before any human sees it
- Quality evidence collection (screenshots, PoC, hashes)
- Report templates that match program expectations
- No “spray and pray” submissions
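Concretely, those choices might live in a small pipeline configuration. This is a hypothetical sketch; the field names are mine, not the system's actual schema:

```typescript
// Hypothetical configuration reflecting the list above; the field names are
// illustrative, not the system's actual schema.
const pipelineConfig = {
  reviewQueueConfidenceThreshold: 0.70, // below this, keep validating or discard
  requirePocReproductionBeforeReview: true,
  evidence: {
    screenshots: true,
    httpExchanges: true,
    pocCode: true,
    sha256Hashes: true,
  },
  reportFormatting: 'per-platform-templates', // see part 4
  allowBulkSubmission: false,                 // no spray-and-pray
};
```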
I hated the idea of leaving valid findings unreported. But I needed to accept that a finding I’m 60% confident about isn’t ready. Let it mature. Get more evidence. Or discard it.
The acceptance rate compounds. Programs start trusting my reports. Triage becomes faster. Bounties increase. Fewer back-and-forth questions.
How Does Scope Validation Prevent Disaster?
Every bug bounty program has scope—what you’re allowed to test, what’s off-limits.
Out-of-scope testing can result in:
- Legal action (yes, really)
- Permanent program ban
- Platform suspension
- Criminal investigation (in extreme cases)
Automation makes mistakes faster. Without scope validation, the system could hammer a production database that’s explicitly out of scope. By the time I notice, the damage is done.
My scope validation runs before every test:
```typescript
async function validateScope(target: Target, program: Program): Promise<boolean> {
  // Explicit exclusions take precedence, so check them first.
  // (Programs often pair a broad in-scope wildcard with specific exclusions.)
  if (program.outOfScope.includes(target.domain)) {
    logScopeViolation(target, program, 'explicit_exclusion');
    return false;
  }

  // Check explicit in-scope domains
  if (program.inScope.domains.includes(target.domain)) {
    return true;
  }

  // Check wildcard patterns (e.g., *.example.com)
  if (program.inScope.wildcards.some(w => matchWildcard(w, target.domain))) {
    return true;
  }

  // Ambiguous: flag for human review and refuse to proceed
  logScopeViolation(target, program, 'ambiguous');
  await notifyHuman('scope_clarification_needed', { target, program });
  return false;
}
```

Ambiguous cases don’t proceed. They wait for human judgment. Better to miss a finding than to get banned.
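The matchWildcard helper is referenced but not shown above. A minimal version, assuming wildcards only appear as a leading `*.` on a domain, could look like this (a sketch, not the actual implementation):

```typescript
// Sketch of a wildcard matcher for patterns like "*.example.com".
// Assumption: wildcards only appear as a leading "*." prefix.
function matchWildcard(pattern: string, domain: string): boolean {
  if (!pattern.startsWith('*.')) {
    return pattern === domain; // no wildcard: exact match only
  }
  const suffix = pattern.slice(1); // "*.example.com" -> ".example.com"
  return domain.endsWith(suffix) && domain.length > suffix.length;
}
```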
In part 3, I described how scope violations are a failure category that triggers immediate halt and blacklisting.
What Evidence Should Every Report Include?
Evidence serves two purposes:
- Help programs verify your finding
- Protect you if there’s a dispute
My evidence collection:
- Screenshots
- HTTP request/response pairs
- PoC code
- SHA-256 hashes
```typescript
interface EvidencePackage {
  screenshots: Array<{
    path: string;
    hash: string;
    capturedAt: Date;
  }>;
  httpExchanges: Array<{
    request: string;
    response: string;
    hash: string;
  }>;
  poc: {
    type: 'curl' | 'python' | 'manual';
    code: string;
    hash: string;
  };
  packageHash: string; // Hash of all component hashes
}
```

The package hash enables verification: “Here’s the SHA-256 of my evidence bundle at time of submission. It hasn’t changed.”
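One way to compute that package hash is to hash the sorted component hashes; here is a minimal Node.js sketch (the function name is mine, not the system's):

```typescript
import { createHash } from 'node:crypto';

// Sketch: derive the package hash from the component hashes already stored in
// the EvidencePackage. Sorting makes the result independent of capture order.
function computePackageHash(pkg: Omit<EvidencePackage, 'packageHash'>): string {
  const componentHashes = [
    ...pkg.screenshots.map(s => s.hash),
    ...pkg.httpExchanges.map(e => e.hash),
    pkg.poc.hash,
  ].sort();

  return createHash('sha256').update(componentHashes.join('\n')).digest('hex');
}
```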
[!TIP] Some researchers skip evidence collection to submit faster. Don’t. That 10 minutes of screenshot capture has saved me in disputes where programs claimed “couldn’t reproduce.” I had timestamped proof that it worked on date X.
How Does Human Review Actually Work?
When a finding reaches 0.70+ confidence, it queues for human review with full context:
```typescript
interface ReviewQueueItem {
  finding: Finding;
  validationSummary: {
    pocResult: 'passed' | 'partial' | 'failed';
    responseDiff: string; // Key differences found
    falsePositiveRisk: number;
  };
  suggestedActions: string[];
  priorityScore: number;
  program: ProgramSummary;
  relatedFindings?: Finding[]; // Other findings in same session
}
```

The review interface shows:
- Full finding details
- Validation evidence
- Why the system thinks it’s valid
- Similar past findings (accepted or rejected)
- Program-specific notes
The human reviewer can take one of four actions (sketched in code after this list):
- Approve: Proceed to formatting and submission
- Request more validation: Send back for additional testing
- Dismiss: Mark as false positive (logs pattern for learning)
- Hold: Wait for more context before deciding
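A sketch of how those decisions might route back into the pipeline; the ReviewDecision type and the downstream function names are assumptions about the internals, not the system's actual API:

```typescript
// Illustrative sketch of routing the four decisions; the ReviewDecision type
// and the downstream hooks are assumptions, not the system's actual API.
type ReviewDecision =
  | { action: 'approve' }
  | { action: 'request_validation'; additionalTests: string[] }
  | { action: 'dismiss'; reason: string }
  | { action: 'hold'; until?: Date };

declare function formatAndStageForSubmission(item: ReviewQueueItem): Promise<void>; // part 4 formatters
declare function queueAdditionalValidation(item: ReviewQueueItem, tests: string[]): Promise<void>;
declare function recordFalsePositivePattern(item: ReviewQueueItem, reason: string): Promise<void>; // feeds learning
declare function parkFinding(item: ReviewQueueItem, until?: Date): Promise<void>;

async function handleReviewDecision(item: ReviewQueueItem, decision: ReviewDecision): Promise<void> {
  switch (decision.action) {
    case 'approve':
      // The only path toward submission, and only with human-approved content.
      await formatAndStageForSubmission(item);
      break;
    case 'request_validation':
      await queueAdditionalValidation(item, decision.additionalTests);
      break;
    case 'dismiss':
      await recordFalsePositivePattern(item, decision.reason);
      break;
    case 'hold':
      await parkFinding(item, decision.until);
      break;
  }
}
```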
This connects to platform integration in part 4. Approved findings go to platform-specific formatters, then are submitted with the human-approved content.
What’s the Human Augmentation Philosophy?
I’m not building a replacement for human researchers. I’m building a tool that makes human researchers more effective.
What automation handles:
- Subdomain enumeration (tedious, mechanical)
- Technology fingerprinting (pattern matching)
- Endpoint discovery (exhaustive search)
- Initial vulnerability detection (known patterns)
- PoC validation (reproducibility testing)
- Evidence collection (systematic capture)
- Report formatting (platform-specific templates)
What humans handle:
- Is this finding impactful enough to report?
- Is the severity assessment accurate?
- Are there edge cases the automation missed?
- How should this be communicated to the program?
- Should we coordinate with other researchers?
- Is disclosure timing appropriate?
The division is clear: automation for breadth and consistency, humans for judgment and nuance.
I originally wanted full automation. Well, it’s more like… I wanted the efficiency fantasy of passive income from vulnerability reports. But judgment can’t be automated. Context matters too much. Programs are run by humans who respond to human communication.
What Are the Ethical Boundaries of Security Automation?
Some things the system will never do:
Never exploit for gain beyond bounty
- No data exfiltration
- No ransomware deployment
- No selling access
Never test without authorization
- Only registered bug bounty programs
- Only explicitly in-scope targets
- Halt immediately on scope ambiguity
Never prioritize speed over safety
- Rate limiting is mandatory
- Ban detection triggers immediate halt
- Human review required before submission
Never misrepresent findings
- No exaggerating severity for higher bounties
- No fabricating evidence
- No duplicate submissions across programs for the same vendor
These aren’t just ethical guidelines—they’re code constraints. The system literally cannot do some of these things.
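As one example of a constraint expressed in code, submission can be gated behind a human-approval record at the function-signature level. This is a sketch of the pattern with illustrative names (FormattedReport, PlatformClient, hashReport), not the system's actual code:

```typescript
// Sketch of "constraint as code": submission requires a human-approval record,
// so there is no call path that submits without one. FormattedReport,
// PlatformClient, and hashReport are illustrative stand-ins.
interface FormattedReport { title: string; body: string; }
interface PlatformClient { submit(report: FormattedReport): Promise<void>; }
declare function hashReport(report: FormattedReport): string; // e.g., SHA-256 of the rendered report

interface HumanApproval {
  reviewerId: string;
  findingId: string;
  approvedAt: Date;
  approvedContentHash: string; // hash of the exact report text the reviewer saw
}

async function submitReport(
  report: FormattedReport,
  approval: HumanApproval,
  platform: PlatformClient
): Promise<void> {
  if (hashReport(report) !== approval.approvedContentHash) {
    // Refuse to submit anything a human did not approve verbatim.
    throw new Error('Report content changed after human approval');
  }
  await platform.submit(report);
}
```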
How Does This Connect to the Full System?
Throughout this series:
- Architecture: Multi-agent design with evidence-gated progression
- Validation: Response diff analysis to reduce false positives
- Failure Learning: Recovery strategies and pattern learning
- Multi-Platform: Unified model with platform-specific formatters
- Human-in-the-Loop (you are here): Mandatory review gates and ethical boundaries
Each layer builds on the previous. But they all converge on this final point: humans make the decisions that matter.
The SQLite RAG learns from human feedback. Validation signatures come from human rejections. Platform formatters produce what human reviewers approve. The entire system exists to serve human judgment, not replace it.
What’s the Actual Outcome?
Before human-in-the-loop design:
- Fast automated submissions
- Low acceptance rate
- Negative program relationships
- Stressful dispute resolution
After mandatory human review:
- Slower, more deliberate submissions
- 80%+ acceptance rate
- Programs respond faster (trust established)
- Evidence prevents disputes
The speed tradeoff is worth it. I’d rather submit 5 high-quality reports per week than 50 that damage my reputation.
Series Conclusion: What Did We Build?
Over five posts, I’ve described a system that:
- Uses multi-agent architecture for parallel reconnaissance, testing, validation, and reporting
- Applies evidence-gated progression where findings must prove themselves before advancing
- Learns from failures with categorized recovery strategies and pattern databases
- Integrates multiple platforms through unified models and platform-specific formatters
- Requires human judgment for all decisions that affect researcher reputation
It’s not fully autonomous. It’s not meant to be.
The goal was never to replace human security researchers. The goal was to eliminate the tedious parts—the subdomain enumeration, the endpoint mapping, the false positive filtering—so human attention goes to the parts that require judgment.
Maybe the best automation isn’t the kind that removes humans from the loop. Maybe it’s the kind that keeps humans at the center—informed, efficient, and making better decisions because the noise has been cleared away.
That’s the series. Five posts on building something I actually use. If you’re building security automation, I hope this helped you think through the architecture, the failure modes, and especially the ethical constraints.
Questions? Critiques? I’d love to hear them.