Deduplication
How BlockSecOps consolidates findings from multiple scanners.
What Is Deduplication?
Deduplication identifies when multiple scanners report the same vulnerability and consolidates them into a single finding.
The Problem
Without deduplication, running 17 scanners might produce:
- 50 unique issues
- 100 duplicate findings
- 150 total findings to review
With deduplication:
- 50 unique issues
- Each showing which scanners found it
How It Works
Fingerprinting
Each finding gets a fingerprint based on:
- Location - File + line number
- Type - Vulnerability category
- Code - Code pattern hash
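The three components above can be combined into a single key. The following is a minimal sketch, not the BlockSecOps implementation: the `Finding` fields, the whitespace normalization, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical shape of a raw scanner finding.
    scanner: str
    file: str
    line: int
    vuln_type: str
    snippet: str


def code_hash(snippet: str) -> str:
    # Collapse whitespace so formatting differences do not change the hash.
    normalized = re.sub(r"\s+", " ", snippet).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def fingerprint(f: Finding) -> tuple:
    # Location (file + line), type, and code pattern hash.
    return (f.file, f.line, f.vuln_type.lower(), code_hash(f.snippet))


a = Finding("Slither", "Token.sol", 45, "Reentrancy", 'msg.sender.call{value: amount}("");')
b = Finding("Aderyn", "Token.sol", 45, "reentrancy", 'msg.sender.call{value:  amount}("");\n')
c = Finding("Slither", "Token.sol", 78, "Reentrancy", 'msg.sender.call{value: amount}("");')

print(fingerprint(a) == fingerprint(b))  # same location, type, and code
print(fingerprint(a) == fingerprint(c))  # different line
```

The scanner name is deliberately left out of the key, so the same issue reported by two tools produces the same fingerprint.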
Matching
Findings with matching fingerprints are grouped:
Slither: Reentrancy at Token.sol:45 ─┐
Aderyn: Reentrancy at Token.sol:45 ─┼─→ Single Finding
Mythril: Reentrancy at Token.sol:45 ─┘
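As a rough illustration of the grouping step, the sketch below buckets raw findings by a simplified fingerprint. The dictionary fields and the `(file, line, type)` key are assumptions for the example; the real fingerprint also includes a code pattern hash.

```python
from collections import defaultdict


def group_findings(findings):
    """Group raw findings whose (file, line, type) keys are identical."""
    groups = defaultdict(list)
    for f in findings:
        key = (f["file"], f["line"], f["type"].lower())
        groups[key].append(f["scanner"])
    return dict(groups)


raw = [
    {"scanner": "Slither", "file": "Token.sol", "line": 45, "type": "Reentrancy"},
    {"scanner": "Aderyn", "file": "Token.sol", "line": 45, "type": "Reentrancy"},
    {"scanner": "Mythril", "file": "Token.sol", "line": 45, "type": "Reentrancy"},
    {"scanner": "Mythril", "file": "Token.sol", "line": 45, "type": "Unchecked Call"},
]

grouped = group_findings(raw)
for (file, line, vuln_type), scanners in grouped.items():
    print(f"{file}:{line} {vuln_type} -> found by {', '.join(scanners)}")
```

The three reentrancy reports collapse into one group, while the unchecked call at the same line stays separate because its type differs.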
Canonical Selection
For grouped findings, the "best" one is selected as the canonical finding to display, based on:
- Best description
- Most context
- Highest confidence
Other findings become "also found by" references.
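One way to picture canonical selection is as a ranking over the group. This sketch is hypothetical: it uses description length, number of context lines, and a numeric confidence as stand-ins for the three criteria, and the order in which they are compared is an assumption.

```python
def select_canonical(group):
    """Return (canonical finding, scanners for the "also found by" list)."""

    def rank(f):
        # Assumed proxies: longer description, more context lines, higher confidence.
        return (len(f["description"]), len(f["context"]), f["confidence"])

    canonical = max(group, key=rank)
    also_found_by = [f["scanner"] for f in group if f is not canonical]
    return canonical, also_found_by


group = [
    {
        "scanner": "Slither",
        "description": "Reentrancy in withdraw(): state is updated after the external call.",
        "context": ["line 44", "line 45", "line 46"],
        "confidence": 0.9,
    },
    {
        "scanner": "Aderyn",
        "description": "Possible reentrancy.",
        "context": ["line 45"],
        "confidence": 0.7,
    },
    {
        "scanner": "Mythril",
        "description": "External call followed by state change.",
        "context": ["line 45", "line 46"],
        "confidence": 0.8,
    },
]

canonical, also_found_by = select_canonical(group)
print(canonical["scanner"], "| also found by:", ", ".join(also_found_by))
```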
Deduplication Strategies
Exact Match
Same file, same line, same type:
- Highest-confidence matching
- Rarely groups distinct issues, but can miss duplicates reported a line or two apart
Fuzzy Match
Similar location, same type:
- Catches near-duplicates
- Handles line number drift
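Line number drift can be tolerated by comparing locations within a small window. The sketch below assumes a `max_line_drift` parameter; both the name and the default of 2 are illustrative, not documented values.

```python
def is_fuzzy_match(a, b, max_line_drift=2):
    """True when two findings share file and type and sit within a few lines."""
    return (
        a["file"] == b["file"]
        and a["type"] == b["type"]
        and abs(a["line"] - b["line"]) <= max_line_drift
    )


slither = {"file": "Token.sol", "line": 45, "type": "reentrancy"}
aderyn = {"file": "Token.sol", "line": 46, "type": "reentrancy"}
other = {"file": "Token.sol", "line": 78, "type": "reentrancy"}

print(is_fuzzy_match(slither, aderyn))  # one line apart
print(is_fuzzy_match(slither, other))   # 33 lines apart
```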
Semantic Match
Same code pattern, different location:
- Finds repeated issues
- Groups similar problems
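Semantic matching can be approximated by ignoring location and keying on the vulnerability type plus a hash of the normalized code. This is a sketch under assumptions: whitespace collapsing is the only normalization applied here, and the `snippet` field is hypothetical.

```python
import hashlib
import re
from collections import defaultdict


def pattern_hash(snippet):
    # Collapse whitespace so the same code formatted differently hashes equally.
    normalized = re.sub(r"\s+", " ", snippet).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_by_pattern(findings):
    """Group findings with the same type and code pattern, wherever they occur."""
    groups = defaultdict(list)
    for f in findings:
        key = (f["type"], pattern_hash(f["snippet"]))
        groups[key].append(f"{f['file']}:{f['line']}")
    return list(groups.values())


findings = [
    {"file": "Token.sol", "line": 45, "type": "reentrancy", "snippet": "to.call{value: amount}('');"},
    {"file": "Token.sol", "line": 78, "type": "reentrancy", "snippet": "to.call{value:   amount}('');"},
    {"file": "Vault.sol", "line": 12, "type": "reentrancy", "snippet": "owner.transfer(balance);"},
]

pattern_groups = group_by_pattern(findings)
print(pattern_groups)
```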
Viewing Deduplicated Results
In Findings List
Deduplicated findings show:
Reentrancy vulnerability
Token.sol:45 | Critical
Found by: Slither, Aderyn, SolidityDefend (3 scanners)
In Finding Detail
The detail view shows:
- Primary finding (canonical)
- All scanner sources
- Combined recommendations
Expanding Duplicates
Click "Show all sources" to see:
- Each scanner's raw output
- Individual descriptions
- Scanner-specific details
Deduplication Statistics
Summary Section
After a scan, the summary shows:
Total raw findings: 150
Unique after dedup: 50
Reduction: 67%
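The reduction figure follows from the two counts above it. A small sketch of the arithmetic, with a hypothetical function name:

```python
def reduction_percent(total_raw, unique):
    """Share of raw findings removed by deduplication, as a percentage."""
    if total_raw == 0:
        return 0.0
    return (total_raw - unique) / total_raw * 100


reduction = reduction_percent(150, 50)
print(f"Reduction: {reduction:.0f}%")  # (150 - 50) / 150 = 66.7%, shown as 67%
```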
Per-Finding
Each finding shows:
- Number of scanners that found it
- List of source scanners
- Combined confidence
Benefits
Less Noise
Review 50 issues, not 150.
Higher Confidence
Agreement between multiple scanners increases confidence that a finding is real.
Better Context
Combined information from all sources.
Faster Triage
Spend time on unique issues, not duplicates.
When Findings Aren't Deduplicated
Different Types
Same location, different vulnerability:
Token.sol:45 - Reentrancy (Slither)
Token.sol:45 - Unchecked Call (Mythril)
These are separate issues.
Different Locations
Same type, different locations:
Token.sol:45 - Reentrancy
Token.sol:78 - Reentrancy
These are separate findings.
Edge Cases
Some true duplicates may not be grouped if:
- Scanners classify the same issue under different types
- Scanners report slightly different locations
- The extracted code pattern differs between scanners
Adjusting Deduplication
Aggressive Deduplication
Groups more findings together:
- Broader matching
- Fewer unique findings
- May combine distinct issues
Conservative Deduplication
Groups fewer findings:
- Stricter matching
- More unique findings
- Less risk of over-grouping
Setting Preference
Go to Settings → Scan Settings → Deduplication Level
| Level | Behavior |
|---|---|
| Aggressive | Maximize grouping |
| Balanced | Default |
| Conservative | Minimize grouping |
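To make the difference between levels concrete, the sketch below maps each level to a line-drift tolerance. The tolerance values and the idea that the level controls only line drift are assumptions for illustration; the actual thresholds behind each level are not documented here.

```python
from enum import Enum


class DedupLevel(Enum):
    AGGRESSIVE = "aggressive"
    BALANCED = "balanced"
    CONSERVATIVE = "conservative"


# Hypothetical tolerances: how many lines apart two findings may be and still group.
MAX_LINE_DRIFT = {
    DedupLevel.AGGRESSIVE: 5,
    DedupLevel.BALANCED: 2,
    DedupLevel.CONSERVATIVE: 0,
}


def same_issue(a, b, level=DedupLevel.BALANCED):
    """Decide whether two findings group together at the given level."""
    return (
        a["file"] == b["file"]
        and a["type"] == b["type"]
        and abs(a["line"] - b["line"]) <= MAX_LINE_DRIFT[level]
    )


a = {"file": "Token.sol", "line": 45, "type": "reentrancy"}
b = {"file": "Token.sol", "line": 48, "type": "reentrancy"}

for level in DedupLevel:
    print(level.value, same_issue(a, b, level))
```

In this sketch the two findings, three lines apart, are grouped only at the Aggressive level.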
Quality Indicators
High Deduplication
Many scanners finding the same issues:
- Good scanner overlap
- Likely real vulnerabilities
Low Deduplication
Few duplicates found:
- Scanners finding different things
- Or issues that only one scanner can detect
Suspicious Patterns
Extremely high dedup (>90%):
- May need to review scanner selection
- Some scanners may be redundant
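One way to check whether a scanner is redundant is to measure how often its findings are also reported by another scanner. This is an illustrative sketch, not a BlockSecOps feature; the input is assumed to be one list of source scanners per deduplicated finding.

```python
def redundancy_by_scanner(groups):
    """For each scanner, the share of its findings also reported by another scanner."""
    totals, shared = {}, {}
    for scanners in groups:
        distinct = set(scanners)
        for s in distinct:
            totals[s] = totals.get(s, 0) + 1
            if len(distinct) > 1:
                shared[s] = shared.get(s, 0) + 1
    return {s: shared.get(s, 0) / totals[s] for s in totals}


# Source scanners for four deduplicated findings.
groups = [
    ["Slither", "Aderyn"],
    ["Slither", "Aderyn"],
    ["Slither"],
    ["Mythril"],
]

overlap = redundancy_by_scanner(groups)
for scanner, share in sorted(overlap.items()):
    print(f"{scanner}: {share:.0%} of findings also found elsewhere")
```

A scanner whose share is close to 100% contributes little that the others do not already report, while a share near 0% means it finds issues the others miss.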
FAQ
Q: Can I see the original findings before dedup?
A: Yes. Click "Show all sources" on any finding.
Q: What if deduplication groups different issues?
A: Report the false grouping, and switch the Deduplication Level to Conservative to reduce over-grouping.
Q: Does dedup affect severity?
A: No. Grouping never lowers severity; the highest severity reported by any source is used.
Q: How do I know if something was deduplicated?
A: Look for the "Found by X scanners" indicator.
Next Steps
- Risk Scoring - How scores are calculated
- Prioritization - Fix ordering
- Scanner Catalog - What each scanner finds