IT Forensics Tests
This document provides detailed explanations of the IT Forensics tests in the evaluation suite.
Overview
The forensics tests are designed to evaluate an AI model's ability to:
- Interpret raw hex data from various forensic artifacts
- Apply domain knowledge of file systems, registry, and network protocols
- Perform accurate byte-order conversions (little-endian for on-disk structures, big-endian for network headers)
- Correlate events and reconstruct timelines
- Explain technical concepts clearly
🔍 Test Breakdown
IT Forensics - File Systems
Test: forensics_mft_01 - MFT Entry Analysis (Basic)
Purpose: Evaluate basic NTFS Master File Table interpretation
Key Concepts:
- MFT Signature: "FILE" (46 49 4C 45 in hex, ASCII)
- Entry flags: 16-bit bitfield at offset 0x16 (little-endian):
  - 0x01 = In use
  - 0x02 = Directory
- Sequence number: 16-bit value at offset 0x10 (little-endian)
Example Hex Dump:
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 46 49 4C 45 30 00 03 00 95 1F 23 00 00 00 00 00
00000010 01 00 01 00 38 00 01 00 A0 01 00 00 00 04 00 00
Expected Analysis:
- Signature: "FILE" (offsets 0x00-0x03)
- Update Sequence Offset: 0x0030 (offsets 0x04-0x05, little-endian)
- Update Sequence Size: 0x0003 (offsets 0x06-0x07, little-endian)
- Sequence Number: 0x0001 (offsets 0x10-0x11, little-endian)
- Flags: 0x0001 (offsets 0x16-0x17, little-endian) = In use; the directory bit 0x02 is clear, so the entry describes a file
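The expected analysis above can be reproduced with a short script. This is a minimal sketch rather than a full MFT parser: it decodes only the first 32 bytes of the sample dump, and the names `parse_mft_header` and `fields` are illustrative. The hard link count (0x12) and the used and allocated size fields (0x18, 0x1C) are not covered by the test itself and are included only so the whole 32-byte row is accounted for.

```python
import struct

# The two rows of the example hex dump (offsets 0x00-0x1F).
header = bytes.fromhex(
    "46 49 4C 45 30 00 03 00 95 1F 23 00 00 00 00 00"
    "01 00 01 00 38 00 01 00 A0 01 00 00 00 04 00 00"
)

def parse_mft_header(buf: bytes) -> dict:
    """Decode the fixed fields at the start of an NTFS FILE record."""
    # "<" = little-endian, no padding; 4s = 4 raw bytes,
    # H = uint16, I = uint32, Q = uint64.
    (signature, usa_offset, usa_size, lsn, sequence, link_count,
     attr_offset, flags, used_size, allocated_size) = struct.unpack_from(
        "<4sHHQHHHHII", buf, 0
    )
    return {
        "signature": signature.decode("ascii"),
        "usa_offset": usa_offset,
        "usa_size": usa_size,
        "lsn": lsn,
        "sequence": sequence,
        "link_count": link_count,          # assumption: not part of the test
        "first_attr_offset": attr_offset,
        "in_use": bool(flags & 0x01),
        "is_directory": bool(flags & 0x02),
        "used_size": used_size,            # assumption: not part of the test
        "allocated_size": allocated_size,  # assumption: not part of the test
    }

fields = parse_mft_header(header)
for name, value in fields.items():
    print(f"{name:18} {value}")
```

Using a single `struct` format string keeps the offsets implicit in the field order, which makes it easy to compare against a model's stated offsets field by field.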
Scoring Criteria:
- 5 points: Identifies all fields correctly with offset references
- 3-4 points: Identifies most fields, minor errors in interpretation
- 1-2 points: Recognizes MFT but misses key fields
- 0 points: Cannot identify as MFT entry
Test: forensics_mft_02 - MFT Entry Analysis (Advanced)
Purpose: Deep understanding of MFT structure
Additional Concepts:
- Update Sequence Array (USA): Anti-corruption mechanism
- $LogFile Sequence Number (LSN): Transaction logging
- First Attribute Offset: Where attribute records begin
- MFT Entry Flags: Bitfield indicating file properties
Key Offsets:
- 0x00-0x03: Signature "FILE"
- 0x04-0x05: Update Sequence Offset
- 0x06-0x07: Update Sequence Size
- 0x08-0x0F: $LogFile Sequence Number (LSN, 64-bit)
- 0x10-0x11: Sequence Number
- 0x14-0x15: First Attribute Offset
- 0x16-0x17: Flags (0x01=in use, 0x02=directory)
Example Analysis for LSN:
Offset 08: EA 3F 00 00 00 00 00 00
Little-endian 64-bit: 0x0000000000003FEA = 16362 decimal
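The LSN conversion can be checked mechanically. A minimal sketch, assuming the eight bytes have already been sliced out of the entry at offset 0x08; it decodes them two ways so the results can be compared:

```python
import struct

lsn_bytes = bytes.fromhex("EA 3F 00 00 00 00 00 00")

# "<Q" = little-endian unsigned 64-bit integer.
(lsn,) = struct.unpack("<Q", lsn_bytes)

# int.from_bytes gives the same result without a format string.
lsn_check = int.from_bytes(lsn_bytes, "little")

print(f"0x{lsn:016X} = {lsn} decimal")  # 0x0000000000003FEA = 16362 decimal
```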
Scoring Criteria:
- 5 points: All fields correct with little-endian conversion shown
- 3-4 points: Most fields correct, minor calculation errors
- 1-2 points: Understands structure but significant errors
- 0 points: Cannot parse MFT header
Test: forensics_signature_01 - File Signature Identification
Purpose: Recognition of common file magic numbers
Magic Numbers to Know:
| Signature | File Type | Notes |
|---|---|---|
| FF D8 FF E0 | JPEG | Often followed by "JFIF" |
| 89 50 4E 47 0D 0A 1A 0A | PNG | \x89PNG + line endings |
| 25 50 44 46 | PDF | "%PDF" in ASCII |
| 50 4B 03 04 | ZIP | "PK" headers (PKZip) |
| 52 61 72 21 1A 07 | RAR | "Rar!" + markers |
| 4D 5A | EXE/DLL | DOS "MZ" header |
| 7F 45 4C 46 | ELF | Linux executables |
Test Example:
A) FF D8 FF E0 00 10 4A 46 49 46
→ JPEG (FF D8 FF + JFIF marker)
B) 50 4B 03 04 14 00 06 00
→ ZIP/DOCX/XLSX (PKZip format)
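Signature matching amounts to a prefix lookup. The sketch below uses only the magic numbers from the table above; the `SIGNATURES` list and the `identify` function are illustrative names, and a real tool would carry a much larger database.

```python
# Magic numbers taken from the table above.
SIGNATURES = [
    (bytes.fromhex("FF D8 FF E0"), "JPEG"),
    (bytes.fromhex("89 50 4E 47 0D 0A 1A 0A"), "PNG"),
    (bytes.fromhex("25 50 44 46"), "PDF"),
    (bytes.fromhex("50 4B 03 04"), "ZIP"),
    (bytes.fromhex("52 61 72 21 1A 07"), "RAR"),
    (bytes.fromhex("4D 5A"), "EXE/DLL"),
    (bytes.fromhex("7F 45 4C 46"), "ELF"),
]

# Try longer signatures first so a short one never shadows a longer match.
SIGNATURES.sort(key=lambda entry: len(entry[0]), reverse=True)

def identify(data: bytes) -> str:
    """Return the file type whose magic number prefixes data, or 'unknown'."""
    for magic, file_type in SIGNATURES:
        if data.startswith(magic):
            return file_type
    return "unknown"

sample_a = bytes.fromhex("FF D8 FF E0 00 10 4A 46 49 46")
sample_b = bytes.fromhex("50 4B 03 04 14 00 06 00")
print(identify(sample_a))  # JPEG
print(identify(sample_b))  # ZIP
```

A ZIP match alone does not distinguish a plain archive from DOCX or XLSX; that requires inspecting the archive contents, which this sketch does not attempt.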
Scoring Criteria:
- 5 points: All signatures identified with explanations
- 3-4 points: Most correct, understands concept
- 1-2 points: Recognizes some but misses key ones
- 0 points: Cannot identify file signatures
IT Forensics - Registry & Artifacts
Test: forensics_registry_01 - Windows Registry Hive Header
Purpose: Parse Windows Registry binary format
Key Structure:
Offset Field Size
0x00 Signature "regf" 4 bytes
0x04 Primary Seq Number 4 bytes (little-endian)
0x08 Secondary Seq Number 4 bytes (little-endian)
0x0C Timestamp 8 bytes (FILETIME)
0x14 Major Version 4 bytes
0x18 Minor Version 4 bytes
Example:
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 72 65 67 66 E6 07 00 00 E6 07 00 00 00 00 00 00
Analysis:
- Signature: "regf" (72 65 67 66)
- Primary Seq: 0x000007E6 = 2022 decimal
- Secondary Seq: 0x000007E6 = 2022 decimal
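The same analysis as a minimal sketch. The example dump shows only the first 16 bytes of the header, so the script decodes the signature and both sequence numbers and leaves the 8-byte timestamp at 0x0C alone, since only half of it is visible. The comparison of the two sequence numbers is an addition for illustration: matching values are commonly read as a sign that the last write to the hive completed.

```python
import struct

# First row of the example hive header (offsets 0x00-0x0F).
row = bytes.fromhex("72 65 67 66 E6 07 00 00 E6 07 00 00 00 00 00 00")

# "<4sII" = 4 raw bytes followed by two little-endian uint32 values.
signature, primary_seq, secondary_seq = struct.unpack_from("<4sII", row, 0)

print("Signature:    ", signature.decode("ascii"))
print("Primary seq:  ", primary_seq)
print("Secondary seq:", secondary_seq)
print("Sequence numbers match:", primary_seq == secondary_seq)
```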
Scoring Criteria:
- 5 points: Correct parsing with endianness consideration
- 3-4 points: Identifies structure, minor errors
- 1-2 points: Recognizes registry but inaccurate parsing
- 0 points: Cannot identify registry hive
Test: forensics_timestamp_01 - FILETIME Conversion
Purpose: Convert Windows timestamps to human-readable format
FILETIME Format:
- 64-bit value (little-endian)
- Counts 100-nanosecond intervals
- Epoch: January 1, 1601 00:00:00 UTC
Conversion Process:
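- Reverse byte order (little-endian to big-endian)
- Convert to decimal
- Divide by 10,000,000 to get seconds since 1601-01-01
- Subtract 11,644,473,600 (the number of seconds between 1601-01-01 and the Unix epoch, 1970-01-01) to get a Unix timestamp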
Example:
Hex: 01 D8 93 4B 7C F3 D9 01
Reversed: 01 D9 F3 7C 4B 93 D8 01
Decimal: 133,405,379,153,614,849
Seconds since 1601: 13,340,537,915
Unix seconds: 13,340,537,915 - 11,644,473,600 = 1,696,064,315
Date: 2023-09-30 08:58:35 UTC
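The conversion is easy to get wrong by hand, so a reference implementation is useful when grading. The following is a minimal sketch using only the standard library; `filetime_to_datetime` and the constant names are illustrative. It takes the eight bytes exactly as they appear on disk.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
# Seconds between 1601-01-01 and 1970-01-01 (the Unix epoch).
EPOCH_DIFF_SECONDS = 11_644_473_600

def filetime_to_datetime(raw: bytes) -> datetime:
    """Convert 8 little-endian FILETIME bytes to a UTC datetime."""
    ticks = int.from_bytes(raw, "little")  # 100-nanosecond intervals
    # 10 ticks = 1 microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)

raw = bytes.fromhex("01 D8 93 4B 7C F3 D9 01")
ticks = int.from_bytes(raw, "little")
unix_seconds = ticks // 10_000_000 - EPOCH_DIFF_SECONDS

print("Ticks:       ", ticks)
print("Unix seconds:", unix_seconds)
print("UTC:         ", filetime_to_datetime(raw).isoformat())
```

Adding a `timedelta` to the 1601 epoch avoids the subtraction step entirely, while `unix_seconds` shows the manual route described in the conversion process.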
Scoring Criteria:
- 5 points: Correct conversion with methodology explained
- 3-4 points: Understands process, calculation errors acceptable
- 1-2 points: Recognizes FILETIME but significant errors
- 0 points: Cannot explain conversion
IT Forensics - Memory & Network
Test: forensics_memory_01 - Memory Artifact Identification
Purpose: Extract meaningful data from memory dumps
Key Artifacts to Identify:
- HTTP headers (GET/POST requests)
- Session cookies (PHPSESSID, etc.)
- IP addresses and hostnames
- User agents
- Authentication tokens
Example Analysis:
GET /admin/login.php HTTP/1.1
Host: 192.168.1.100
Cookie: PHPSESSID=a3f7d8bc9e2a1d5c
Forensic Value:
- Web access to admin panel
- Target: 192.168.1.100
- Session: a3f7d8bc9e2a1d5c
- Timeline: Can correlate with web server logs
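Artifacts like these are usually carved out of a raw dump with byte-level pattern matching. The sketch below is illustrative only: `memory_chunk` is a made-up buffer that wraps the example request in a few non-text bytes, and the three regular expressions cover just the fields shown above, not HTTP in general.

```python
import re

# Hypothetical memory fragment: the example request surrounded by binary noise.
memory_chunk = (
    b"\x00\x00\x10\x7fGET /admin/login.php HTTP/1.1\r\n"
    b"Host: 192.168.1.100\r\n"
    b"Cookie: PHPSESSID=a3f7d8bc9e2a1d5c\r\n\r\n\x00\xff\xfe"
)

# Byte patterns, because a memory dump is not valid text as a whole.
REQUEST_LINE = re.compile(rb"(GET|POST) ([\x21-\x7e]+) HTTP/1\.[01]")
HOST = re.compile(rb"Host: ([^\r\n]+)")
SESSION = re.compile(rb"PHPSESSID=([0-9A-Za-z]+)")

def extract_artifacts(buf: bytes) -> dict:
    """Pull the request line, target host, and session ID out of a buffer."""
    artifacts = {}
    request = REQUEST_LINE.search(buf)
    if request:
        artifacts["method"] = request.group(1).decode("ascii")
        artifacts["path"] = request.group(2).decode("ascii")
    host = HOST.search(buf)
    if host:
        artifacts["host"] = host.group(1).decode("ascii", "replace")
    session = SESSION.search(buf)
    if session:
        artifacts["session"] = session.group(1).decode("ascii")
    return artifacts

artifacts = extract_artifacts(memory_chunk)
print(artifacts)
```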
Scoring Criteria:
- 5 points: All artifacts extracted with forensic significance explained
- 3-4 points: Most artifacts identified, basic analysis
- 1-2 points: Recognizes HTTP but misses key details
- 0 points: Cannot identify artifacts
Test: forensics_network_01 - TCP Header Analysis
Purpose: Parse TCP packet headers
TCP Header Structure (first 20 bytes):
Offset Field Size Notes
0-1 Source Port 16 bits Big-endian
2-3 Destination Port 16 bits Big-endian
4-7 Sequence Number 32 bits Big-endian
8-11 Acknowledgment 32 bits Big-endian
12 Data Offset+Reserved 8 bits Upper 4=offset (in 32-bit words), lower 4=reserved
13 Flags 8 bits SYN, ACK, FIN, RST, PSH, URG
14-15 Window Size 16 bits Big-endian
16-17 Checksum 16 bits
18-19 Urgent Pointer 16 bits
TCP Flags (byte 13):
- 0x01: FIN
- 0x02: SYN
- 0x04: RST
- 0x08: PSH
- 0x10: ACK
- 0x20: URG
Example:
Offset(h) 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
00000000 C3 5E 01 BB 6B 8B 9C 41 00 00 00 00 50 02 20 00
Analysis:
- Source Port: 0xC35E = 50014
- Dest Port: 0x01BB = 443 (HTTPS)
- Sequence: 0x6B8B9C41
- Flags: 0x02 = SYN (connection initiation)
- Window: 0x2000 = 8192 bytes
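The header fields can be decoded in one call with network byte order. This is a minimal sketch covering the 16 bytes shown in the example (checksum and urgent pointer are not in the dump); `parse_tcp_header` and `FLAG_NAMES` are illustrative names.

```python
import struct

# The 16 bytes from the example dump.
segment = bytes.fromhex("C3 5E 01 BB 6B 8B 9C 41 00 00 00 00 50 02 20 00")

FLAG_NAMES = {
    0x01: "FIN", 0x02: "SYN", 0x04: "RST",
    0x08: "PSH", 0x10: "ACK", 0x20: "URG",
}

def parse_tcp_header(buf: bytes) -> dict:
    """Decode ports, sequence numbers, data offset, flags, and window."""
    # "!" = network byte order (big-endian), no padding;
    # H = uint16, I = uint32, B = uint8.
    src, dst, seq, ack, offset_byte, flag_byte, window = struct.unpack_from(
        "!HHIIBBH", buf, 0
    )
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        # The upper 4 bits count 32-bit words, so multiply by 4 for bytes.
        "header_len": (offset_byte >> 4) * 4,
        "flags": [name for bit, name in FLAG_NAMES.items() if flag_byte & bit],
        "window": window,
    }

tcp = parse_tcp_header(segment)
print(tcp)
```

For the example bytes the data offset nibble is 5, giving a 20-byte header with no options, and the only flag set is SYN.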
Scoring Criteria:
- 5 points: All fields correct with protocol understanding
- 3-4 points: Most fields correct, minor errors
- 1-2 points: Basic structure recognized, significant errors
- 0 points: Cannot parse TCP header
IT Forensics - Timeline & Log Analysis
Test: forensics_timeline_01 - Event Reconstruction
Purpose: Correlate logs to identify attack patterns
Timeline Analysis Skills:
- Chronological ordering
- Event correlation across sources
- Anomaly identification
- Attack pattern recognition
- Impact assessment
Example Scenario:
14:23:15 - Admin login from 10.0.0.5 ✓ Normal
14:23:47 - Access /etc/passwd ⚠️ Suspicious (enumeration)
14:24:12 - Write shell.php to web dir 🚨 Malicious (web shell)
14:24:45 - Netcat listener on 4444 🚨 Malicious (backdoor)
14:25:01 - External connection 🚨 Compromise (C2 callback)
14:26:33 - Admin logout
14:30:00 - Failed login from external 🚨 Lateral movement attempt
Attack Pattern: Web application compromise → enumeration → web shell upload → Netcat backdoor → C2 callback → lateral movement attempt
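Chronological ordering across sources can be sketched as a merge followed by a sort. The events below are the ones from the example scenario; the split into `auth_log`, `web_log`, and `net_log` and the source labels are assumptions made for illustration, since the scenario does not say which log each event came from.

```python
from datetime import datetime

# Hypothetical per-source logs holding the events from the example scenario.
auth_log = [
    ("14:23:15", "Admin login from 10.0.0.5"),
    ("14:26:33", "Admin logout"),
    ("14:30:00", "Failed login from external"),
]
web_log = [
    ("14:23:47", "Access /etc/passwd"),
    ("14:24:12", "Write shell.php to web dir"),
]
net_log = [
    ("14:24:45", "Netcat listener on 4444"),
    ("14:25:01", "External connection"),
]

def merge_timeline(**sources):
    """Combine (time, description) entries from all sources, oldest first."""
    events = []
    for source, entries in sources.items():
        for stamp, description in entries:
            when = datetime.strptime(stamp, "%H:%M:%S")
            events.append((when, source, description))
    return sorted(events, key=lambda event: event[0])

timeline = merge_timeline(auth=auth_log, web=web_log, net=net_log)
for when, source, description in timeline:
    print(f"{when:%H:%M:%S}  [{source:4}]  {description}")

# Time from the initial login to the first external connection.
login_time = timeline[0][0]
callback_time = next(w for w, _, d in timeline if d == "External connection")
elapsed = (callback_time - login_time).total_seconds()
print(f"Login to external connection: {elapsed:.0f} seconds")
```

Real logs carry dates and time zones as well; normalizing everything to UTC before sorting is the usual first step and is omitted here.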
Scoring Criteria:
- 5 points: Complete attack narrative with IOCs and recommendations
- 3-4 points: Identifies compromise, basic timeline
- 1-2 points: Recognizes suspicious activity, incomplete analysis
- 0 points: Cannot identify attack pattern
🎯 Multi-Turn Conversation Tests
Test: multiturn_01 - Progressive Hex Analysis
Purpose: Maintain context across multiple exchanges while building understanding
- Turn 1: File type identification from initial bytes
- Turn 2: Structure parsing with offset references
- Turn 3: Next steps and deeper analysis
Key Evaluation Points:
- Remembers initial findings
- Builds on previous responses
- Shows progressive understanding
- Maintains technical accuracy
Test: multiturn_02 - Forensic Investigation Scenario
Purpose: Simulate real investigation workflow
Stages:
- Initial triage (data source identification)
- Evidence correlation (connecting artifacts)
- Impact assessment (IOC identification, response planning)
Scoring Focus:
- Logical investigation flow
- Context retention across turns
- Practical recommendations
- Complete picture integration
Test: multiturn_03 - Technical Depth Building
Purpose: Progress from concept to implementation
Progression:
- Concept explanation (NTFS ADS)
- Practical application (attack scenarios)
- Hands-on implementation (PowerShell commands)
Expected Depth:
- Turn 1: Clear conceptual understanding
- Turn 2: Builds on concept with examples
- Turn 3: Demonstrates practical application
📊 Evaluation Guidelines
Little-Endian Conversions
Always verify:
- Byte order reversal shown
- Decimal conversion provided
- Offset references included
Example:
Bytes at offset 0x10: 42 00
Little-endian: 0x0042 = 66 decimal
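When checking a response against this list, it can help to generate the reference line automatically. A minimal sketch, with `explain_le` as an illustrative helper name, that shows the original bytes, the reversed hex value, and the decimal result for any field width:

```python
def explain_le(raw: bytes, offset: int) -> str:
    """Describe a little-endian field: bytes as stored, reversed hex, decimal."""
    stored = " ".join(f"{b:02X}" for b in raw)
    reversed_hex = raw[::-1].hex().upper()  # most significant byte first
    value = int.from_bytes(raw, "little")
    return (
        f"Bytes at offset 0x{offset:02X}: {stored} -> "
        f"0x{reversed_hex} = {value} decimal"
    )

print(explain_le(bytes.fromhex("42 00"), 0x10))
# Bytes at offset 0x10: 42 00 -> 0x0042 = 66 decimal
```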
Hex to ASCII
Common conversions to know:
- 0x20-0x7E: Printable ASCII
- 46 49 4C 45 = "FILE"
- 50 4B = "PK"
- 4D 5A = "MZ"
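These conversions can be verified with a few lines of code. The sketch below follows the common hex-editor convention of showing a dot for bytes outside the printable range 0x20-0x7E; `hex_to_ascii` is an illustrative name.

```python
def hex_to_ascii(hex_string: str) -> str:
    """Render hex bytes as ASCII, replacing non-printable bytes with '.'."""
    raw = bytes.fromhex(hex_string)
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in raw)

print(hex_to_ascii("46 49 4C 45"))  # FILE
print(hex_to_ascii("50 4B 03 04"))  # PK..
print(hex_to_ascii("4D 5A"))        # MZ
```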
Forensic Significance
Always ask:
- What does this artifact tell us?
- How can it be used in investigation?
- What are the limitations?
- What other data sources confirm/refute this?
🎓 Recommended Resources
For deeper understanding:
- NTFS Documentation (Microsoft)
- RFC 793 (TCP; superseded by RFC 9293)
- File Signatures Database (Gary Kessler)
- Windows Registry Forensics (Harlan Carvey)
- The Art of Memory Forensics (Ligh, Case, Levy, Walters)
⚖️ Scoring Summary
Exceptional (4-5):
- Accurate hex interpretation
- Correct endianness handling
- Forensic context provided
- Clear explanations
Pass (2-3):
- Basic accuracy
- Some interpretation errors
- Limited context
- Incomplete explanations
Fail (0-1):
- Major misinterpretations
- No endianness consideration
- Missing forensic value
- Incoherent explanations