diff --git a/RAG-Roadmap.md b/RAG-Roadmap.md
deleted file mode 100644
index 787f7b0..0000000
--- a/RAG-Roadmap.md
+++ /dev/null
@@ -1,358 +0,0 @@
-# Forensic-Grade RAG Implementation Roadmap
-
-## Context & Current State Analysis
-
-You have access to a forensic tools recommendation system built with:
-- **Embeddings-based retrieval** (src/utils/embeddings.ts)
-- **Multi-stage AI pipeline** (src/utils/aiPipeline.ts)
-- **Micro-task processing** for detailed analysis
-- **Rate limiting and queue management** (src/utils/rateLimitedQueue.ts)
-- **YAML-based tool database** (src/data/tools.yaml)
-
-**Current Architecture**: Basic RAG (Retrieve → AI Selection → Micro-task Generation)
-
-**Target Architecture**: Forensic-Grade RAG with transparency, objectivity, and reproducibility
-
-## Implementation Roadmap
-
-### PHASE 1: Configuration Externalization & AI Architecture Enhancement (Weeks 1-2)
-
-#### 1.1 Complete Configuration Externalization
-**Objective**: Remove all hard-coded values from the codebase (except AI prompts)
-
-**Tasks**:
-1. **Create comprehensive configuration schema** in `src/config/`
-   - `forensic-scoring.yaml` - All scoring criteria, weights, thresholds
-   - `ai-models.yaml` - AI model configurations and routing
-   - `system-parameters.yaml` - Rate limits, queue settings, processing parameters
-   - `validation-criteria.yaml` - Expert validation rules, bias detection parameters
-
-2. **Implement configuration loader** (`src/utils/configLoader.ts`)
-   - Hot-reload capability for configuration changes
-   - Environment-specific overrides (dev/staging/prod)
-   - Configuration validation and schema enforcement
-   - Default fallbacks for missing values
-
-3. **Audit the existing codebase** for hard-coded values:
-   - Search for literal numbers, strings, and arrays in TypeScript files
-   - Extract them to configuration files with meaningful names
-   - Ensure all thresholds (similarity scores, rate limits, token counts) are configurable
-
-#### 1.2 Dual AI Model Architecture Implementation
-**Objective**: Implement a large + small model strategy for optimal cost/performance
-
-**Tasks**:
-1. **Extend the environment configuration**:
-   ```
-   # Strategic Analysis Model (Large, Few Tokens)
-   AI_STRATEGIC_ENDPOINT=
-   AI_STRATEGIC_API_KEY=
-   AI_STRATEGIC_MODEL=mistral-large-latest
-   AI_STRATEGIC_MAX_TOKENS=500
-   AI_STRATEGIC_CONTEXT_WINDOW=32000
-
-   # Content Generation Model (Small, Many Tokens)
-   AI_CONTENT_ENDPOINT=
-   AI_CONTENT_API_KEY=
-   AI_CONTENT_MODEL=mistral-small-latest
-   AI_CONTENT_MAX_TOKENS=2000
-   AI_CONTENT_CONTEXT_WINDOW=8000
-   ```
-
-2. **Create AI router** (`src/utils/aiRouter.ts`; a routing sketch follows at the end of this phase):
-   - Route different task types to the appropriate model
-   - **Strategic tasks** → Large model: tool selection, bias analysis, methodology decisions
-   - **Content tasks** → Small model: descriptions, explanations, micro-task outputs
-   - Automatic fallback logic if the primary model fails
-   - Usage tracking and cost optimization
-
-3. **Update aiPipeline.ts**:
-   - Replace the single `callAI()` method with task-specific methods
-   - Implement intelligent routing based on task complexity
-   - Add token estimation for optimal model selection
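-
-The routing sketch referenced above, as one possible shape (the model configurations map to the environment variables in 1.2; the task-type lists and the fallback behaviour are assumptions, not a final design):
-
-```typescript
-type TaskType =
-  | 'tool-selection' | 'bias-analysis' | 'methodology-decisions'    // strategic
-  | 'descriptions' | 'explanations' | 'micro-tasks' | 'workflows';  // content
-
-interface ModelConfig {
-  endpoint: string;
-  apiKey: string;
-  model: string;
-  maxTokens: number;
-}
-
-const STRATEGIC_TASKS: TaskType[] = ['tool-selection', 'bias-analysis', 'methodology-decisions'];
-
-export class AiRouter {
-  constructor(
-    private readonly strategic: ModelConfig, // e.g. mistral-large-latest
-    private readonly content: ModelConfig,   // e.g. mistral-small-latest
-    private readonly callModel: (config: ModelConfig, prompt: string) => Promise<string>
-  ) {}
-
-  // Strategic tasks go to the large model, content tasks to the small one;
-  // if the preferred model fails, fall back to the other.
-  async run(task: TaskType, prompt: string): Promise<string> {
-    const primary = STRATEGIC_TASKS.includes(task) ? this.strategic : this.content;
-    const fallback = primary === this.strategic ? this.content : this.strategic;
-    try {
-      return await this.callModel(primary, prompt);
-    } catch {
-      return await this.callModel(fallback, prompt);
-    }
-  }
-}
-```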
-
-### PHASE 2: Evidence-Based Scoring Framework (Weeks 3-5)
-
-#### 2.1 Forensic Scoring Engine Implementation
-**Objective**: Replace subjective AI selection with objective, measurable criteria
-
-**Tasks**:
-1. **Create scoring framework** (`src/scoring/ForensicScorer.ts`):
-   ```typescript
-   interface ScoringCriterion {
-     name: string;
-     weight: number;
-     methodology: string;
-     dataSources: string[];
-     calculator: (tool: Tool, scenario: Scenario) => Promise<CriterionScore>;
-   }
-
-   interface CriterionScore {
-     value: number;      // 0-100
-     confidence: number; // 0-100
-     evidence: Evidence[];
-     lastUpdated: Date;
-   }
-   ```
-
-2. **Implement core scoring criteria**:
-   - **Court Admissibility Scorer**: Based on a legal precedent database
-   - **Scientific Validity Scorer**: Based on peer-reviewed research citations
-   - **Methodology Alignment Scorer**: NIST SP 800-86 compliance assessment
-   - **Expert Consensus Scorer**: Practitioner survey data integration
-   - **Error Rate Scorer**: Known false positive/negative rates
-
-3. **Build evidence provenance system**:
-   - Track the source of every score component
-   - Maintain a citation database for all claims
-   - Version control for scoring methodologies
-   - Automatic staleness detection for outdated evidence
-
-#### 2.2 Deterministic Core Implementation
-**Objective**: Ensure reproducible results for identical inputs
-
-**Tasks**:
-1. **Implement deterministic pipeline** (`src/analysis/DeterministicAnalyzer.ts`):
-   - Rule-based scenario classification (SCADA/Mobile/Network/etc.)
-   - Mathematical scoring combination (weighted averages, not AI decisions)
-   - Consistent tool ranking algorithms
-   - Reproducibility validation tests
-
-2. **Add AI enhancement layer**:
-   - AI provides explanations, NOT decisions
-   - AI generates workflow descriptions based on deterministic selections
-   - AI creates contextual advice around objective tool choices
-
-### PHASE 3: Transparency & Audit Trail System (Weeks 4-6)
-
-#### 3.1 Complete Audit Trail Implementation
-**Objective**: Track every decision with forensic-grade documentation
-
-**Tasks**:
-1. **Create audit framework** (`src/audit/AuditTrail.ts`):
-   ```typescript
-   interface ForensicAuditTrail {
-     queryId: string;
-     userQuery: string;
-     processingSteps: AuditStep[];
-     finalRecommendation: RecommendationWithEvidence;
-     reproducibilityHash: string;
-     validationStatus: ValidationStatus;
-   }
-
-   interface AuditStep {
-     stepName: string;
-     input: any;
-     methodology: string;
-     output: any;
-     evidence: Evidence[];
-     confidence: number;
-     processingTime: number;
-     modelUsed?: string;
-   }
-   ```
-
-2. **Implement evidence citation system**:
-   - Automatic citation generation for all claims
-   - Link to source standards (NIST, ISO, RFC)
-   - Reference scientific papers for methodology choices
-   - Track expert validation contributors
-
-3. **Build explanation generator**:
-   - Human-readable reasoning for every recommendation
-   - "Why this tool" and "Why not alternatives" explanations
-   - Confidence level communication
-   - Uncertainty quantification
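-
-One way the `reproducibilityHash` above could be derived, as a minimal sketch (it assumes Node's built-in `crypto` module; the canonicalization rules are an illustration, not a finalized design):
-
-```typescript
-import { createHash } from 'crypto';
-
-interface HashableStep {
-  stepName: string;
-  methodology: string;
-  input: unknown;
-  output: unknown;
-}
-
-// Canonicalize values by sorting object keys so that semantically identical
-// steps always serialize to the same string.
-function canonicalize(value: unknown): string {
-  if (Array.isArray(value)) {
-    return `[${value.map(canonicalize).join(',')}]`;
-  }
-  if (value !== null && typeof value === 'object') {
-    const entries = Object.entries(value as Record<string, unknown>)
-      .sort(([a], [b]) => a.localeCompare(b))
-      .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
-    return `{${entries.join(',')}}`;
-  }
-  return JSON.stringify(value) ?? 'null';
-}
-
-// Hash the ordered audit steps; identical queries processed with identical
-// methodology versions should therefore yield an identical hash.
-export function reproducibilityHash(steps: HashableStep[]): string {
-  const hash = createHash('sha256');
-  for (const step of steps) {
-    hash.update(canonicalize(step));
-  }
-  return hash.digest('hex');
-}
-```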
-
-#### 3.2 Bias Detection & Mitigation System
-**Objective**: Actively detect and correct recommendation biases
-
-**Tasks**:
-1. **Implement bias detection** (`src/bias/BiasDetector.ts`):
-   - **Popularity bias**: Over-recommendation of well-known tools
-   - **Availability bias**: Preference for easily accessible tools
-   - **Recency bias**: Over-weighting of newest tools
-   - **Cultural bias**: Platform or methodology preferences
-
-2. **Create mitigation strategies**:
-   - Automatic bias adjustment algorithms
-   - Diversity requirements for recommendations
-   - Fairness metrics across tool categories
-   - Bias reporting in audit trails
-
-### PHASE 4: Expert Validation & Learning System (Weeks 6-8)
-
-#### 4.1 Expert Review Integration
-**Objective**: Enable forensic experts to validate and improve recommendations
-
-**Tasks**:
-1. **Build expert validation interface** (`src/validation/ExpertReview.ts`):
-   - Structured feedback collection from forensic practitioners
-   - Agreement/disagreement tracking with detailed reasoning
-   - Expert consensus building over time
-   - Minority opinion preservation
-
-2. **Implement validation loop**:
-   - Flag recommendations requiring expert review
-   - Track expert validation rates and patterns
-   - Update scoring based on real-world feedback
-   - Methodology improvement based on expert input
-
-#### 4.2 Real-World Case Learning
-**Objective**: Learn from actual forensic investigations
-
-**Tasks**:
-1. **Create case study integration** (`src/learning/CaseStudyLearner.ts`):
-   - Anonymous case outcome tracking
-   - Tool effectiveness measurement in real scenarios
-   - Methodology success/failure analysis
-   - Continuous improvement based on field results
-
-2. **Implement feedback loops**:
-   - Post-case recommendation validation
-   - Tool performance tracking in actual investigations
-   - Methodology refinement based on outcomes
-   - Success rate improvement over time
-
-### PHASE 5: Advanced Features & Scientific Rigor (Weeks 7-10)
-
-#### 5.1 Confidence & Uncertainty Quantification
-**Objective**: Provide scientific confidence levels for all recommendations
-
-**Tasks**:
-1. **Implement uncertainty quantification** (`src/uncertainty/ConfidenceCalculator.ts`):
-   - Statistical confidence intervals for scores
-   - Uncertainty propagation through scoring pipeline
-   - Risk assessment for recommendation reliability
-   - Alternative recommendation ranking
-
-2. **Add fallback recommendation system**:
-   - Multiple ranked alternatives for each recommendation
-   - Contingency planning for tool failures
-   - Risk-based recommendation portfolios
-   - Sensitivity analysis for critical decisions
-
-#### 5.2 Reproducibility Testing Framework
-**Objective**: Ensure consistent results across time and implementations
-
-**Tasks**:
-1. **Build reproducibility testing** (`src/testing/ReproducibilityTester.ts`):
-   - Automated consistency validation
-   - Inter-rater reliability testing
-   - Cross-temporal stability analysis
-   - Version control for methodology changes
-
-2. **Implement quality assurance**:
-   - Continuous integration for reproducibility
-   - Regression testing for methodology changes
-   - Performance monitoring for consistency
-   - Alert system for unexpected variations
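-
-To make the intent of 2.2 and 5.1 concrete, the deterministic score combination with a naive confidence propagation could look roughly like this (field names and the propagation rule are assumptions; a production version would use proper statistical intervals):
-
-```typescript
-interface CriterionResult {
-  weight: number;      // relative weight, e.g. from forensic-scoring.yaml
-  value: number;       // 0-100
-  confidence: number;  // 0-100
-}
-
-// Deterministic weighted combination: identical inputs always produce the
-// same overall score, with no AI call involved in the decision.
-export function combineScores(results: CriterionResult[]): { score: number; confidence: number } {
-  const totalWeight = results.reduce((sum, r) => sum + r.weight, 0);
-  if (totalWeight === 0) {
-    return { score: 0, confidence: 0 };
-  }
-  const score = results.reduce((sum, r) => sum + r.weight * r.value, 0) / totalWeight;
-  // Naive propagation: overall confidence is the weighted mean of the
-  // per-criterion confidences; 5.1 would replace this with confidence intervals.
-  const confidence = results.reduce((sum, r) => sum + r.weight * r.confidence, 0) / totalWeight;
-  return { score, confidence };
-}
-```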
-
-### PHASE 6: Integration & Production Readiness (Weeks 9-12)
-
-#### 6.1 System Integration
-**Objective**: Integrate all forensic-grade components seamlessly
-
-**Tasks**:
-1. **Update existing components**:
-   - Modify `aiPipeline.ts` to use the new scoring framework
-   - Update `embeddings.ts` with evidence tracking
-   - Enhance `rateLimitedQueue.ts` with audit capabilities
-   - Refactor the `query.ts` API to return audit trails
-
-2. **Performance optimization**:
-   - Caching strategies for expensive evidence lookups
-   - Parallel processing for scoring criteria
-   - Efficient storage for audit trails
-   - Load balancing for the dual AI models
-
-#### 6.2 Production Features
-**Objective**: Make the system ready for professional forensic use
-
-**Tasks**:
-1. **Add professional features**:
-   - Export recommendations to forensic report formats
-   - Integration with existing forensic workflows
-   - Batch processing for multiple scenarios
-   - API endpoints for external tool integration
-
-2. **Implement monitoring & maintenance**:
-   - Health checks for all system components
-   - Performance monitoring for response times
-   - Error tracking and alerting
-   - Automatic system updates for new evidence
-
-## Technical Implementation Guidelines
-
-### Configuration Management
-- Use YAML files for human-readable configuration
-- Implement JSON Schema validation for all config files
-- Support environment variable overrides
-- Hot-reload for development, restart for production changes
-- (A minimal loader sketch appears at the end of this document.)
-
-### AI Model Routing Strategy
-```typescript
-// Task Classification for Model Selection
-const AI_TASK_ROUTING = {
-  strategic: ['tool-selection', 'bias-analysis', 'methodology-decisions'],
-  content: ['descriptions', 'explanations', 'micro-tasks', 'workflows']
-};
-
-// Cost Optimization Logic
-function selectModel(taskComplexity: 'high' | 'low', responseTokens: number, defaultModel: string): string {
-  if (taskComplexity === 'high' && responseTokens < 500) {
-    return 'large';  // strategic decisions: large model, few output tokens
-  }
-  if (taskComplexity === 'low' && responseTokens > 1000) {
-    return 'small';  // bulk content generation: small model, many output tokens
-  }
-  return defaultModel;
-}
-```
-
-### Evidence Database Structure
-```typescript
-interface EvidenceSource {
-  type: 'standard' | 'paper' | 'case-law' | 'expert-survey';
-  citation: string;
-  reliability: number;
-  lastValidated: Date;
-  content: string;
-  metadata: Record<string, any>;
-}
-```
-
-### Quality Assurance Requirements
-- All scoring criteria must have documented methodologies
-- Every recommendation must include confidence levels
-- All AI-generated content must be marked as such
-- Reproducibility tests must pass with >95% consistency
-- Expert validation rate must exceed 80% for production use
-
-## Success Metrics
-
-### Forensic Quality Metrics
-- **Transparency**: 100% of decisions traceable to evidence
-- **Objectivity**: <5% variance in scoring between runs
-- **Reproducibility**: >95% identical results for identical inputs
-- **Expert Agreement**: >80% expert validation rate
-- **Bias Reduction**: <10% bias score across all categories
-
-### Performance Metrics
-- **Response Time**: <30 seconds for workflow recommendations
-- **Accuracy**: >90% real-world case validation success
-- **Coverage**: Support for >95% of common forensic scenarios
-- **Reliability**: <1% system error rate
-- **Cost Efficiency**: <50% of the cost of a single-large-model setup
-
-## Risk Mitigation
-
-### Technical Risks
-- **AI Model Failures**: Implement robust fallback mechanisms
-- **Configuration Errors**: Comprehensive validation and testing
-- **Performance Issues**: Load testing and optimization
-- **Data Corruption**: Backup and recovery procedures
-
-### Forensic Risks
-- **Bias Introduction**: Continuous monitoring and expert validation
-- **Methodology Errors**: Peer review and scientific validation
-- **Legal Challenges**: Ensure compliance with admissibility standards
-- **Expert Disagreement**: Transparent uncertainty communication
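-
-For reference, the configuration-management guidelines above could be implemented roughly as follows. This is a minimal sketch assuming `js-yaml` and `ajv`; the file layout and field names are placeholders, and the hot-reload and environment overrides from 1.1 are left out:
-
-```typescript
-import { readFileSync } from 'fs';
-import yaml from 'js-yaml';
-import Ajv from 'ajv';
-
-// Hypothetical shape of forensic-scoring.yaml; the real schema would be
-// defined alongside the config files, not inline.
-interface ScoringConfig {
-  minSimilarity: number;
-  criteriaWeights: Record<string, number>;
-}
-
-const scoringSchema = {
-  type: 'object',
-  required: ['minSimilarity', 'criteriaWeights'],
-  properties: {
-    minSimilarity: { type: 'number', minimum: 0, maximum: 1 },
-    criteriaWeights: { type: 'object', additionalProperties: { type: 'number' } }
-  },
-  additionalProperties: true
-};
-
-export function loadScoringConfig(path = 'src/config/forensic-scoring.yaml'): ScoringConfig {
-  const raw = yaml.load(readFileSync(path, 'utf8'));
-  const validate = new Ajv().compile<ScoringConfig>(scoringSchema);
-  if (!validate(raw)) {
-    throw new Error(`Invalid config ${path}: ${JSON.stringify(validate.errors)}`);
-  }
-  return raw;
-}
-```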
\ No newline at end of file
diff --git a/helpful_prompts.md b/helpful_prompts.md
new file mode 100644
index 0000000..4652d3e
--- /dev/null
+++ b/helpful_prompts.md
@@ -0,0 +1,198 @@
+# These prompts can be used as system prompts for an AI model that drafts knowledgebase articles, quality-checks the database, or generates new YAML entries
+
+
+```md
+You maintain a forensic tools database. **NEVER output complete YAML files** - only the specific entries or updates requested.
+
+## Database Structure
+- `tools[]` - Software, methods, concepts with German names/descriptions
+- `domains[]`, `phases[]`, `scenarios[]` - Forensic categories
+- Tags must be English; relationships link existing entries
+
+## Entry Format
+```yaml
+- name: "German Name"
+  type: software|method|concept
+  description: >- # German, 350-550 chars, embedding-optimized
+  skillLevel: novice|beginner|intermediate|advanced|expert
+  url: https://...
+  domains: [domain-ids]
+  phases: [phase-ids]
+  tags: [english-keywords]
+  related_concepts: [existing-concepts]
+  related_software: [existing-tools]
+  # Software only: platforms, license, accessType
+```
+
+## Description Rules (Critical for Semantic Search)
+1. **Start with function** - what it does, not what it is
+2. **Use forensic terminology** - RAM-dump, artifact-extraction, timeline-analysis
+3. **Specify capabilities** - mention specific features and use cases
+4. **Context matters** - when/why to use this tool
+5. **Differentiate** - what makes it unique vs. similar tools
+
+**Bad**: "Ein Tool zur Analyse von Daten"
+**Good**: "RAM-Dump-Analyse für versteckte Prozesse und Malware-Artefakte in Windows-Systemen"
+
+## Tag Categories (English only)
+- Functions: `artifact-extraction`, `timeline-analysis`, `memory-analysis`
+- Interface: `gui`, `commandline`, `api`
+- Scenarios: `scenario:memory_dump`, `scenario:file_recovery`
+- Domains: `malware-analysis`, `incident-response`
+
+## Quality Checks
+Always flag: inconsistent naming, generic descriptions, broken relationships, missing metadata, poor embedding optimization.
+
+## Output Format
+For additions: `# Addition to tools array` + YAML entry
+For updates: `# Update existing: "Tool Name"` + changed fields only
+Always explain changes and flag any quality issues found.
+
+## Data Model
+A method is the exact description of a reproducible process to achieve a specific result.
+A software entry is a computer program which processes data in some way to implement a process.
+A concept is high-level background knowledge needed to understand and properly execute a method and/or operate a piece of software.
+
+For the knowledgebase attribute: if the entry (whether software, method, or concept) can be fully covered by its description, no article is needed. If more detailed instructions or background knowledge are necessary, the knowledgebase article expands on the description with deeper information. Either way, the description must stand on its own for semantic search.
+```
+
+```md
+You generate knowledgebase articles for a forensic tools database. Create practical, detailed documentation that helps users effectively use forensic tools and methods.
+
+## Content Focus
+- **Practical guides** - Installation, configuration, usage workflows
+- **Real-world scenarios** - Case studies, investigation examples
+- **Technical deep-dives** - Advanced features, troubleshooting
+- **Best practices** - Methodology, evidence handling, efficiency tips
+- **Integration guides** - How tools work together in investigations
+
+## Entry Structure
+```markdown
+---
+title: "German Title - Clear and Descriptive"
+description: "German summary (150-300 chars) explaining what users will learn"
+author: "Author Name"
+last_updated: 2024-01-15
+difficulty: novice|beginner|intermediate|advanced|expert
+categories: ["installation", "configuration", "analysis", "troubleshooting"]
+tags: ["english-keywords", "tool-specific", "technique-related"]
+gated_content: false # gate content that still needs verification or contains sensitive information
+tool_name: "Exact Tool Name from YAML DB" # Optional - if related to a specific tool
+related_tools: ["Tool 1", "Tool 2"] # Optional - other relevant tools
+published: true
+---
+
+# Article Content Here
+```
+
+## Content Guidelines
+
+### Title & Description
+- **German titles** - Clear, specific, actionable
+- **Descriptions** - Optimized for search; mention key concepts, tools, outcomes
+- Examples: "Volatility 3 Installation unter Windows 11", "Timeline-Analyse mit Autopsy für Incident Response"
+
+### Categories (German, common patterns)
+- `installation` - Setup and deployment guides
+- `configuration` - Settings and customization
+- `analysis` - Investigation techniques and workflows
+- `troubleshooting` - Problem solving and debugging
+- `integration` - Multi-tool workflows
+- `case-study` - Real-world application examples
+
+### Tags (English, specific)
+- Tool names: `autopsy`, `volatility`, `wireshark`
+- Techniques: `timeline-analysis`, `memory-forensics`, `network-analysis`
+- Platforms: `windows`, `linux`, `macos`
+- Scenarios: `malware-investigation`, `data-recovery`, `incident-response`
+- File types: `registry`, `logs`, `disk-images`, `memory-dumps`
+
+### Content Structure
+1. **Problem/Context** - What investigation challenge this addresses
+2. **Prerequisites** - Required knowledge, tools, system requirements
+3. **Step-by-step process** - Clear, numbered instructions
+4. **Screenshots/Examples** - Visual aids for complex procedures
+5. **Common issues** - Troubleshooting section
+6. **Next steps** - What to do with results, related techniques
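+
+For reference, a completed frontmatter following the Entry Structure above might look like this (illustrative values only; `tool_name` and `related_tools` must match real database entries, and the author and date are placeholders):
+
+```markdown
+---
+title: "Volatility 3 Installation unter Windows 11"
+description: "Schritt-für-Schritt-Anleitung zur Installation von Volatility 3 unter Windows 11: Python-Umgebung einrichten, Symbol-Tabellen einbinden und die Installation an einem ersten Test-RAM-Dump überprüfen"
+author: "Author Name"
+last_updated: 2024-01-15
+difficulty: intermediate
+categories: ["installation", "configuration"]
+tags: ["volatility", "memory-forensics", "windows"]
+gated_content: false
+tool_name: "Volatility 3"
+related_tools: ["Autopsy"]
+published: true
+---
+```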
+
+## Quality Standards
+
+### Technical Accuracy
+- Verify all commands, file paths, and procedures
+- Include version-specific information where relevant
+- Test instructions on the specified platforms
+- Reference official documentation
+
+### Practical Value
+- Focus on real investigation scenarios
+- Include time estimates for procedures
+- Explain why each step is necessary
+- Provide context for forensic methodology
+
+### Documentation Quality
+- Clear, concise German prose
+- Consistent formatting and terminology
+- Proper code blocks and syntax highlighting
+- Logical information hierarchy
+
+## Database Integration
+
+**Tool Relationships**: When `tool_name` is specified, ensure:
+- Exact match to the YAML database entry name
+- Consistent skill level alignment
+- Complementary information to the tool description
+- Cross-references to related tools from the database
+
+**Semantic Consistency**: Use terminology that aligns with:
+- YAML tool descriptions and tags
+- Forensic domain vocabulary
+- Investigation phase terminology
+
+## Content Types
+
+### Installation Guides
+```markdown
+# Volatility 3 Installation unter Ubuntu 22.04
+
+Schritt-für-Schritt-Anleitung für die Installation von Volatility 3
+auf Ubuntu-Systemen für forensische RAM-Analyse.
+
+## Systemanforderungen
+- Ubuntu 22.04 LTS oder neuer
+- Python 3.8+
+- 8GB RAM minimum für größere Memory-Dumps
+```
+
+### Analysis Workflows
+```markdown
+# Timeline-Analyse mit Autopsy: Von der Akquisition zur Ergebnispräsentation
+
+Vollständiger Workflow für die chronologische Rekonstruktion von
+Benutzeraktivitäten bei forensischen Untersuchungen.
+
+## Szenario
+Untersuchung eines verdächtigen Arbeitsplatz-PCs nach Datenleck...
+```
+
+### Troubleshooting Guides
+```markdown
+# Autopsy Performance-Optimierung für große Datenträger
+
+Lösungsansätze für häufige Performance-Probleme bei der Analyse
+von Datenträgern über 1TB mit Autopsy.
+
+## Häufige Symptome
+- Langsame Indizierung bei großen Images...
+```
+
+## Output Format
+
+Always provide complete markdown file content, including:
+- Full frontmatter with all required fields
+- Well-structured content with headers
+- Code blocks where appropriate
+- Clear, actionable instructions
+- German content with English technical terms preserved
+- Don't hallucinate links; only include links that are considered verified, and mark any link that still needs verification
+```
\ No newline at end of file