Noise Filtering¶
Reduce false positives and focus on important findings with configurable noise controls.
Overview¶
Noise filtering helps reduce false positives and focus on important findings by applying multiple filtering strategies:
- Confidence thresholds: Filter findings below minimum confidence
- Deduplication: Remove duplicate or highly similar findings
- Severity filtering: Only show findings above minimum severity
- Known risk suppression: Suppress acknowledged risks
- Rule-based adjustments: Apply severity rules and adjustments
- Memory-based filtering: Use user feedback to suppress false positives
Configuration¶
Configure noise filtering in superqode.yaml:
qe:
  noise:
    # Minimum confidence threshold (0.0 to 1.0)
    min_confidence: 0.7

    # Enable deduplication
    deduplicate: true

    # Similarity threshold for deduplication (0.0 to 1.0)
    similarity_threshold: 0.8

    # Suppress known/acknowledged risks
    suppress_known_risks: false
    known_risk_patterns:
      - "legacy code pattern"
      - "known false positive"

    # Minimum severity to report
    min_severity: "low"  # low, medium, high, critical

    # Maximum findings per file (0 = unlimited)
    max_findings_per_file: 0

    # Maximum total findings (0 = unlimited)
    max_total_findings: 0

    # Enable rule-based severity adjustments
    apply_severity_rules: true
Confidence Threshold¶
Filter findings below a minimum confidence score.
Default: 0.7¶
Only findings with confidence ≥ 0.7 are included.
Adjusting Threshold¶
| Threshold | Effect |
|---|---|
| 0.9 | Very strict - only high-confidence findings |
| 0.7 | Default - balanced |
| 0.5 | Lenient - includes more speculative findings |
| 0.3 | Very lenient - includes most findings |
Example¶
# Strict filtering
min_confidence: 0.85
# Balanced (default)
min_confidence: 0.7
# Include more findings
min_confidence: 0.5
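In code, the threshold is a simple comparison. A minimal sketch of the filter, assuming a simplified `Finding` shape (illustrative, not the tool's internal data model):

```python
from dataclasses import dataclass

@dataclass
class Finding:
    title: str
    confidence: float

def filter_by_confidence(findings, min_confidence=0.7):
    """Keep only findings at or above the confidence threshold."""
    return [f for f in findings if f.confidence >= min_confidence]

findings = [
    Finding("SQL injection in user input", 0.92),
    Finding("Possible race condition", 0.55),
]
kept = filter_by_confidence(findings)  # only the 0.92 finding survives the default 0.7 cutoff
```

Lowering the threshold to 0.5 would keep both findings.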
Deduplication¶
Remove duplicate or highly similar findings to reduce noise.
How It Works¶
- Fingerprint-based: Exact duplicates are identified by fingerprint
- Similarity-based: Fuzzy matching identifies similar findings
- Highest severity kept: When duplicates found, keeps highest severity instance
Fingerprinting¶
Each finding gets a fingerprint based on:
- Title
- File path
- Line number range
- Finding type
- Evidence snippet
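One common way to build such a fingerprint is to hash the concatenated fields, so identical findings collide and anything else does not. A sketch under that assumption (field names and the hash choice are illustrative):

```python
import hashlib

def fingerprint(title, path, line_start, line_end, finding_type, evidence):
    """Stable fingerprint: identical fields always hash to the same value."""
    raw = "|".join([title, path, str(line_start), str(line_end), finding_type, evidence])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

a = fingerprint("SQL injection", "app/db.py", 10, 14, "security", "cursor.execute(q)")
b = fingerprint("SQL injection", "app/db.py", 10, 14, "security", "cursor.execute(q)")
c = fingerprint("SQL injection", "app/db.py", 20, 24, "security", "cursor.execute(q)")
# a == b (exact duplicate), a != c (different line range)
```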
Similarity Threshold¶
Controls how similar findings must be to be considered duplicates:
| Threshold | Effect |
|---|---|
| 1.0 | Only exact duplicates removed |
| 0.8 | Default - similar findings merged |
| 0.6 | Aggressive - many similar findings merged |
Example¶
Finding A: "SQL injection in user input"
Finding B: "SQL injection vulnerability in user input"
Finding C: "Potential SQL injection in query"
With similarity_threshold: 0.8
→ A and B merged (similarity ~0.9)
→ C kept separate (similarity ~0.6)
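The tool's exact similarity metric is not documented here; as a stand-in, the standard library's `difflib.SequenceMatcher` shows the same qualitative behavior on the titles above:

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Fuzzy string similarity in [0.0, 1.0]; 1.0 means identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

a = "SQL injection in user input"
b = "SQL injection vulnerability in user input"
c = "Potential SQL injection in query"

sim_ab = similarity(a, b)  # high: A and B would merge
sim_ac = similarity(a, c)  # lower: C stays separate
```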
Severity Filtering¶
Only report findings above a minimum severity level.
Severity Levels¶
| Level | Priority | Description |
|---|---|---|
| critical | 4 | Security breaches, data loss risks |
| high | 3 | Serious bugs, performance issues |
| medium | 2 | Moderate issues, best practice violations |
| low | 1 | Minor issues, style inconsistencies |
| info | 0 | Informational findings |
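The severity check reduces to comparing priorities. A minimal sketch, with the mapping taken directly from the table above (function and variable names are illustrative):

```python
# Priority values mirror the severity table above.
SEVERITY_PRIORITY = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}

def meets_min_severity(severity: str, min_severity: str = "low") -> bool:
    """True when a finding's severity is at or above the configured minimum."""
    return SEVERITY_PRIORITY[severity] >= SEVERITY_PRIORITY[min_severity]
```

With `min_severity: "high"`, only high and critical findings pass.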
Minimum Severity¶
Example¶
# Only critical and high findings
min_severity: "high"
# Include medium and above
min_severity: "medium"
# Include all findings (default)
min_severity: "low"
Known Risk Suppression¶
Suppress findings that match known risk patterns.
Configuration¶
qe:
  noise:
    suppress_known_risks: true
    known_risk_patterns:
      - "legacy authentication code"
      - "intentional backdoor for testing"
      - "approved security exception"
Pattern Matching¶
Patterns are matched as follows:
- Exact strings: Matches if the pattern appears in the finding title or description
- Case-insensitive: Matching is case-insensitive
- Partial matches: Matches anywhere in the text
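Those three properties amount to a case-insensitive substring check. A minimal sketch (function name is illustrative):

```python
def is_known_risk(finding_text: str, patterns: list[str]) -> bool:
    """Case-insensitive substring match against the configured patterns."""
    text = finding_text.lower()
    return any(p.lower() in text for p in patterns)

patterns = ["legacy authentication code", "approved security exception"]
suppressed = is_known_risk(
    "Hardcoded secret in LEGACY AUTHENTICATION CODE path", patterns
)  # matches despite different casing and surrounding text
```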
Example¶
known_risk_patterns:
  - "legacy code - do not refactor"
  - "approved exception: SEC-2024-001"
  - "intentional for test harness"
Rule-Based Severity Adjustments¶
Apply severity rules to adjust finding severity based on context.
Default Rules¶
Built-in rules adjust severity based on:
- Finding type (security vs. style)
- Context (test code vs. production)
- Evidence strength
Configuration¶
qe:
  noise:
    apply_severity_rules: true
    # Uses default rules if not specified

    # Custom rules (advanced)
    severity_rules:
      # Custom rule configuration
Example Adjustments¶
- Security issues in production: low → high
- Style issues in test code: medium → info
- Unclear evidence: high → medium
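The full set of built-in rules is not listed on this page; a sketch of how context-based adjustments like the three examples above could look (the rule logic and signature are illustrative, not the tool's actual implementation):

```python
def adjust_severity(severity: str, finding_type: str,
                    in_test_code: bool, evidence_strong: bool) -> str:
    """Illustrative context-based adjustments mirroring the examples above."""
    if finding_type == "security" and not in_test_code and severity == "low":
        return "high"    # security issues in production code are escalated
    if finding_type == "style" and in_test_code:
        return "info"    # style nits in test code are downgraded
    if not evidence_strong and severity == "high":
        return "medium"  # unclear evidence tempers a high rating
    return severity
```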
Memory-Based Filtering¶
Use feedback from previous sessions to suppress false positives.
How It Works¶
When you mark findings as false positives, the system remembers this and suppresses similar findings in future sessions.
Automatic Application¶
Memory-based suppressions are automatically applied if:
- The finding matches a previously suppressed pattern
- User feedback indicates a false positive
- The pattern is learned from multiple feedback instances
Disable Memory Filtering¶
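The exact option name is not shown on this page; assuming a boolean toggle under the same `noise` block (the key name `use_memory` is hypothetical - check the configuration reference):

```yaml
qe:
  noise:
    use_memory: false  # hypothetical key: disables memory-based suppressions
```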
Finding Limits¶
Limit the number of findings reported to focus on top issues.
Per-File Limit¶
Useful when a single file has many issues; only the top findings for that file are reported.
Total Limit¶
Example¶
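Both limits live in the same `noise` block and use the keys shown in the configuration above (the values here are illustrative):

```yaml
qe:
  noise:
    # Keep at most 5 findings per file and 50 overall
    max_findings_per_file: 5
    max_total_findings: 50
```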
Filtering Order¶
Noise filters are applied in this order:
1. Confidence threshold: Remove low-confidence findings
2. Memory suppressions: Remove learned false positives
3. Known risk patterns: Suppress configured patterns
4. Deduplication: Merge similar findings
5. Severity filtering: Remove findings below the minimum severity
6. Severity rules: Adjust severity based on rules
7. Finding limits: Apply per-file and total limits
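The pipeline above can be sketched as a sequence of list-to-list steps applied in order. The step implementations below are placeholders standing in for the real filters (the dict-based finding shape is illustrative):

```python
def apply_noise_filters(findings, steps):
    """Run each filtering step in order; each step maps a list to a (smaller) list."""
    for step in steps:
        findings = step(findings)
    return findings

# Placeholder steps in the documented order; a real implementation would
# read the thresholds from superqode.yaml.
steps = [
    lambda fs: [f for f in fs if f["confidence"] >= 0.7],         # 1. confidence threshold
    lambda fs: [f for f in fs if not f.get("memory_suppressed")], # 2. memory suppressions
    lambda fs: [f for f in fs if not f.get("known_risk")],        # 3. known risk patterns
    # 4-7: deduplication, severity filtering, severity rules, limits
]

findings = [
    {"confidence": 0.9},
    {"confidence": 0.95, "known_risk": True},
    {"confidence": 0.4},
]
remaining = apply_noise_filters(findings, steps)  # only the first finding survives
```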
Examples¶
Strict Filtering¶
For production code review:
qe:
  noise:
    min_confidence: 0.85
    min_severity: "medium"
    deduplicate: true
    similarity_threshold: 0.9
    max_total_findings: 20
Lenient Filtering¶
For exploratory analysis:
qe:
  noise:
    min_confidence: 0.5
    min_severity: "low"
    deduplicate: true
    similarity_threshold: 0.7
    suppress_known_risks: false
Focus on Security¶
qe:
  noise:
    min_confidence: 0.7
    min_severity: "high"  # Only high/critical
    deduplicate: true
    apply_severity_rules: true
Monitoring Filtering Effectiveness¶
Check Filtered Counts¶
# View QR to see filtering stats
superqe report
# Check artifacts for filtering details
cat .superqode/qe-artifacts/qr/qr-*.json | jq '.noise_filtering'
Adjust Based on Results¶
- If too many false positives → increase min_confidence or min_severity
- If important findings are missing → decrease the thresholds
- If too many duplicates remain → adjust similarity_threshold
- If specific false positives recur → add them to known_risk_patterns
Best Practices¶
1. Start Conservative¶
Begin with the defaults (min_confidence: 0.7, min_severity: "low") and review what the first few runs report before tightening anything.
2. Tune Based on Feedback¶
Adjust thresholds incrementally as you mark findings as true or false positives; the memory-based filter learns from this feedback.
3. Use Known Risk Patterns¶
Record accepted risks in known_risk_patterns rather than lowering thresholds globally, so other findings stay visible.
4. Set Reasonable Limits¶
On large codebases, cap max_findings_per_file and max_total_findings so reports stay reviewable.
Troubleshooting¶
Too Many Findings¶
Symptom: Report has hundreds of findings
Solution: Increase min_confidence and min_severity, enable deduplication, and set max_total_findings.
Important Findings Filtered¶
Symptom: Critical issues not showing up
Solution: Lower min_confidence or min_severity, and check that known_risk_patterns or memory-based suppressions are not matching those findings.
Duplicate Findings¶
Symptom: Same issue reported multiple times
Solution: Enable deduplicate and lower similarity_threshold so that similar findings merge.
Related Features¶
- Noise Configuration - Configuration options
- Feedback System - Learn from feedback
Next Steps¶
- QE Features Index - All QE features
- Configuration Reference - Full config options