EVTX Parsing Strategy for SecureWatch SIEM
Overview
This document outlines the strategy for parsing Windows Event Log (EVTX) files in SecureWatch SIEM so that historical Windows security events can be ingested for threat hunting and forensic analysis. It has been updated to cover Enhanced EVTX Parser v2.0, which adds MITRE ATT&CK technique detection and Sysmon support.
Enhanced EVTX Parser v2.0
Key Enhancements
MITRE ATT&CK Integration: Automatic technique detection with 50+ supported techniques
Sysmon Support: Full coverage of Events 1-29 with enhanced field extraction
Attack Pattern Recognition: 50+ regex patterns for malicious behavior detection
Risk Scoring Algorithm: Intelligent threat prioritization (0-100 scale)
Web Upload Interface: Real-time file parsing via frontend component
EVTX-ATTACK-SAMPLES Testing: Validated against 329 attack samples
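The pattern-matching and risk-scoring ideas listed above can be illustrated with a minimal sketch. The three rules, their weights, and the `detect` function are hypothetical and chosen only for illustration; the parser's actual pattern set and scoring algorithm are not shown in this document.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    technique: str       # MITRE ATT&CK technique ID
    pattern: re.Pattern  # regex applied to a process command line
    weight: int          # hypothetical contribution to the risk score


# Illustrative rules only (assumption): not the parser's real pattern set.
RULES = [
    Rule("T1059.001", re.compile(r"powershell(\.exe)?\s.*-enc(odedcommand)?\b", re.I), 40),
    Rule("T1003.001", re.compile(r"\blsass\b", re.I), 50),
    Rule("T1070.001", re.compile(r"wevtutil(\.exe)?\s+cl\b", re.I), 45),
]


def detect(command_line: str) -> tuple[list[str], int]:
    """Return the matched technique IDs and a risk score clamped to 0-100."""
    hits = [rule for rule in RULES if rule.pattern.search(command_line)]
    score = min(100, sum(rule.weight for rule in hits))
    return [rule.technique for rule in hits], score
```

Summing weights and clamping at 100 is one simple way to keep the score on the stated 0-100 scale; a production scorer would probably also weigh confidence and event context.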
Implementation Status
Completed: Enhanced parser with MITRE ATT&CK detection
Completed: Comprehensive testing against EVTX-ATTACK-SAMPLES
Completed: Web-based upload component
Completed: Integration with SecureWatch platform
Completed: Risk scoring and confidence assessment
Architecture Design
1. Parser Service Architecture
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   File Upload   │────▶│   EVTX Parser    │────▶│ Log Normalizer  │
│  API Endpoint   │     │     Service      │     │                 │
└─────────────────┘     └──────────────────┘     └─────────────────┘
         │                       │                        │
         │                       │                        ▼
         │                       │               ┌─────────────────┐
         │                       │               │   TimescaleDB   │
         │                       │               └─────────────────┘
         │                       │
         ▼                       ▼
┌─────────────────┐     ┌──────────────────┐
│  File Storage   │     │    Processing    │
│   (Temporary)   │     │      Queue       │
└─────────────────┘     └──────────────────┘
2. Technology Stack
Python EVTX Library: python-evtx for parsing binary EVTX files
Node.js Integration: Child process or microservice for Python execution
File Processing: Multer for file uploads, Bull for job queuing
Storage: Temporary file storage with automatic cleanup
Implementation Strategy
Phase 1: EVTX Parser Module
Create a Python-based EVTX parser that:
Reads binary EVTX files using python-evtx
Extracts all event records with full metadata
Converts Windows XML schema to JSON format
Preserves all fields for comprehensive analysis
Phase 2: API Integration
Extend log-ingestion service with:
File upload endpoint: POST /api/ingest/evtx
Support for multiple file formats:
.evtx (binary Windows Event Log)
.xml (exported Windows Event XML)
.json (pre-processed event data)
Async processing with job queue
Progress tracking and status API
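Since the endpoint accepts three formats, the service needs a way to route each upload to the right reader. The sketch below classifies a file from its first bytes instead of trusting the extension. The `detect_format` name is hypothetical, and the XML/JSON checks assume UTF-8 text; other encodings would need extra handling.

```python
EVTX_MAGIC = b"ElfFile\x00"  # 8-byte signature at the start of a binary EVTX file
UTF8_BOM = b"\xef\xbb\xbf"


def detect_format(head: bytes) -> str:
    """Classify an upload as 'evtx', 'xml', or 'json' from its leading bytes."""
    if head.startswith(EVTX_MAGIC):
        return "evtx"
    # Drop an optional UTF-8 byte-order mark, then leading whitespace
    if head.startswith(UTF8_BOM):
        head = head[len(UTF8_BOM):]
    text = head.lstrip()
    if text.startswith(b"<"):
        return "xml"
    if text.startswith((b"{", b"[")):
        return "json"
    raise ValueError("unsupported upload format")
```

Reading only the first few dozen bytes is enough for this check, so it can run before the file is handed to the processing queue.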
Phase 3: Enhanced Field Mapping
Map Windows Event fields to SecureWatch schema:
Security Events (4624/4625) - Authentication
Process Events (4688) - Process creation
Service Events (7045) - Service installation
Task Events (4698/106) - Scheduled tasks
PowerShell Events (4103/4104) - Script execution
Audit Events (1102) - Log clearing
Phase 4: Batch Processing
Implement efficient batch processing:
Chunk large EVTX files (>100MB)
Parallel processing of event records
Memory-efficient streaming
Progress reporting via WebSocket
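Memory-efficient batching can be sketched as a generator that pulls a fixed number of events at a time from any iterator, so the full file never has to sit in memory. The `batched` helper is a hypothetical name, not part of the existing parser.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def batched(events: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield lists of up to `size` events without materializing the whole input."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Because the input is consumed lazily, this composes directly with a parser that yields records one at a time; only one batch is held in memory at any moment.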
Detailed Component Design
1. Python EVTX Parser (evtx_parser.py)
import json
import sys
from datetime import datetime
import xml.etree.ElementTree as ET

import Evtx.Evtx as evtx


class EVTXParser:
    def __init__(self, file_path):
        self.file_path = file_path

    def parse(self):
        """Parse EVTX file and yield JSON-serializable events"""
        with evtx.Evtx(self.file_path) as log:
            for record in log.records():
                yield self.record_to_json(record)

    def record_to_json(self, record):
        """Convert EVTX record to a JSON-serializable dict"""
        # Extract XML and parse
        xml_str = record.xml()
        root = ET.fromstring(xml_str)
        # Build comprehensive event object.
        # The extract_* helpers are not shown here; each one reads a single
        # field (or group of fields) from the parsed <Event> tree.
        event = {
            "EventID": self.extract_event_id(root),
            "TimeCreated": self.extract_timestamp(root),
            "Computer": self.extract_computer(root),
            "Channel": self.extract_channel(root),
            "Provider": self.extract_provider(root),
            "Level": self.extract_level(root),
            "Task": self.extract_task(root),
            "Keywords": self.extract_keywords(root),
            "EventData": self.extract_event_data(root),
            "UserData": self.extract_user_data(root),
            "System": self.extract_system_data(root),
            "RawXML": xml_str
        }
        return event


if __name__ == "__main__":
    # Emit one JSON object per line so a calling process can read stdout line by line
    for event in EVTXParser(sys.argv[1]).parse():
        print(json.dumps(event))
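The extract_* helpers referenced in record_to_json are not shown in this document. As a rough sketch of what they might look like, the functions below read the System and EventData sections using the standard Windows event XML namespace. The function names, the reduced field set, and the sample event are assumptions for illustration, not the parser's actual implementation.

```python
import json
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def extract_system(root: ET.Element) -> dict:
    """Read a few common fields from the <System> element."""
    system = root.find("e:System", NS)
    if system is None:
        raise ValueError("event has no <System> element")

    def text(tag):
        node = system.find(f"e:{tag}", NS)
        return node.text if node is not None else None

    def attr(tag, name):
        node = system.find(f"e:{tag}", NS)
        return node.get(name) if node is not None else None

    event_id = text("EventID")
    return {
        "EventID": int(event_id) if event_id is not None else None,
        "Computer": text("Computer"),
        "Channel": text("Channel"),
        "Provider": attr("Provider", "Name"),
        "TimeCreated": attr("TimeCreated", "SystemTime"),
    }


def extract_event_data(root: ET.Element) -> dict:
    """Collect <Data Name="..."> children of <EventData> into a dict."""
    data = {}
    for index, node in enumerate(root.iterfind("e:EventData/e:Data", NS)):
        # Unnamed <Data> elements get a positional key (hypothetical convention)
        data[node.get("Name") or f"Data{index}"] = node.text
    return data


def to_json_line(xml_str: str) -> str:
    """Convert one event's XML into a single line of JSON."""
    root = ET.fromstring(xml_str)
    event = extract_system(root)
    event["EventData"] = extract_event_data(root)
    return json.dumps(event)


# Hand-written sample event, used only to exercise the helpers
SAMPLE = (
    '<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">'
    '<System><Provider Name="Microsoft-Windows-Security-Auditing"/>'
    '<EventID>4624</EventID>'
    '<TimeCreated SystemTime="2024-01-01T00:00:00.000000Z"/>'
    '<Channel>Security</Channel><Computer>HOST1</Computer></System>'
    '<EventData><Data Name="TargetUserName">alice</Data>'
    '<Data Name="LogonType">3</Data></EventData></Event>'
)

if __name__ == "__main__":
    print(to_json_line(SAMPLE))
```

Writing one JSON object per line keeps the output easy to consume from another process, which is how the Node.js integration reads the parser's stdout.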
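2. Node.js Integration Service
// evtx-parser.service.ts
import { spawn } from 'child_process';
import { EventEmitter } from 'events';

export class EVTXParserService extends EventEmitter {
  // Resolves once the Python parser process has exited
  parseEVTXFile(filePath: string): Promise<void> {
    return new Promise((resolve) => {
      const python = spawn('python3', ['evtx_parser.py', filePath]);
      python.stdout.setEncoding('utf8');
      let buffered = ''; // trailing partial line carried over between chunks

      python.stdout.on('data', (chunk: string) => {
        // A chunk can end in the middle of a line, so parse only complete lines
        buffered += chunk;
        const lines = buffered.split('\n');
        buffered = lines.pop() ?? '';
        for (const line of lines) {
          if (line.trim()) {
            this.emit('event', JSON.parse(line));
          }
        }
      });

      python.stderr.on('data', (data) => {
        this.emit('error', new Error(data.toString()));
      });

      python.on('close', (code) => {
        if (code === 0) {
          // Flush a final line that was not newline-terminated
          if (buffered.trim()) {
            this.emit('event', JSON.parse(buffered));
          }
          this.emit('complete');
        } else {
          this.emit('error', new Error(`Parser exited with code ${code}`));
        }
        resolve();
      });
    });
  }
}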
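3. API Endpoint Implementation
// evtx-upload.route.ts
import fs from 'fs';
import multer from 'multer';
import { Router } from 'express';
import { EVTXParserService } from './evtx-parser.service';
import { LogNormalizer } from '../processors/log-normalizer';
// ingestBatch(batch) is assumed to be provided elsewhere in the log-ingestion service

const router = Router();
const upload = multer({
  dest: '/tmp/evtx-uploads/',
  limits: { fileSize: 1024 * 1024 * 500 } // 500MB limit
});

router.post('/api/ingest/evtx', upload.single('file'), async (req, res) => {
  const { file } = req;
  if (!file) {
    res.status(400).json({ error: 'No file uploaded' });
    return;
  }
  const parser = new EVTXParserService();
  const normalizer = new LogNormalizer();
  let processedCount = 0;
  const batchSize = 100;
  let batch: unknown[] = [];
  const failures: Error[] = [];
  let inFlight: Promise<void> = Promise.resolve(); // serializes batch inserts

  // Swap the batch out before inserting so events that arrive during the
  // insert land in a fresh array instead of being inserted twice or dropped
  const flush = () => {
    const full = batch;
    batch = [];
    inFlight = inFlight
      .then(async () => {
        await ingestBatch(full);
        processedCount += full.length;
        // Send progress update
        if (!res.writableEnded) {
          res.write(JSON.stringify({
            status: 'processing',
            processed: processedCount
          }) + '\n');
        }
      })
      .catch((error) => { failures.push(error); });
  };

  const fail = (error: Error) => {
    // Cleanup uploaded file
    fs.unlink(file.path, () => {});
    if (res.writableEnded) return;
    // Once progress lines have been written, the status code can no longer change
    if (res.headersSent) {
      res.end(JSON.stringify({ status: 'error', error: error.message }));
    } else {
      res.status(500).json({ error: error.message });
    }
  };

  parser.on('event', (event) => {
    batch.push(normalizer.normalizeWindowsEvent(event));
    if (batch.length >= batchSize) flush();
  });

  parser.on('complete', async () => {
    if (batch.length > 0) flush();
    await inFlight; // wait for every queued insert to finish
    if (failures.length > 0) {
      fail(failures[0]);
      return;
    }
    if (!res.writableEnded) {
      res.end(JSON.stringify({
        status: 'complete',
        total: processedCount
      }));
    }
    // Cleanup uploaded file
    fs.unlink(file.path, () => {});
  });

  parser.on('error', (error) => {
    fail(error);
  });

  await parser.parseEVTXFile(file.path);
});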
Field Mapping Strategy
Windows Event ID Mappings
1. Authentication Events (4624/4625)
{
  // Windows Fields → SecureWatch Fields
  "LogonType": "auth.logon_type",
  "TargetUserName": "user.name",
  "TargetDomainName": "user.domain",
  "TargetUserSid": "user.id",
  "IpAddress": "source.ip",
  "IpPort": "source.port",
  "WorkstationName": "source.hostname",
  "LogonProcessName": "process.name",
  "AuthenticationPackageName": "auth.package",
  "Status": "event.outcome",
  "SubStatus": "auth.sub_status"
}
2. Process Creation (4688)
{
  "NewProcessName": "process.executable",
  "CommandLine": "process.command_line",
  // In event 4688, NewProcessId is the created process; ProcessId is the creator (parent) process
  "NewProcessId": "process.pid",
  "ParentProcessName": "process.parent.executable",
  "SubjectUserName": "user.name",
  "SubjectDomainName": "user.domain",
  "TokenElevationType": "process.elevation_type"
}
3. PowerShell Events (4103/4104)
{
  "ScriptBlockText": "powershell.script_block",
  "Path": "file.path",
  "ScriptBlockId": "powershell.script_id",
  "MessageNumber": "powershell.message_number",
  "MessageTotal": "powershell.message_total"
}
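The mapping tables in this section pair a Windows field name with a dotted SecureWatch path. One way to apply such a table, sketched below, is to walk each dotted path and build a nested object. The `apply_mapping` function and the shortened `AUTH_MAPPING` are hypothetical; the actual LogNormalizer implementation is not shown in this document.

```python
# Subset of the 4624/4625 mapping, repeated here so the example is self-contained
AUTH_MAPPING = {
    "LogonType": "auth.logon_type",
    "TargetUserName": "user.name",
    "TargetDomainName": "user.domain",
    "IpAddress": "source.ip",
}


def apply_mapping(event_data: dict, mapping: dict) -> dict:
    """Build a nested dict from flat Windows fields using dotted target paths."""
    normalized: dict = {}
    for source_field, dotted_path in mapping.items():
        if source_field not in event_data:
            continue  # field absent from this event; skip it
        *parents, leaf = dotted_path.split(".")
        node = normalized
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = event_data[source_field]
    return normalized
```

Fields that have no entry in the mapping are dropped by this sketch; a real normalizer would likely keep them under a catch-all key so that no data is lost.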
Performance Considerations
1. Memory Management
Stream processing for large files
Chunk size: 1000 events per batch
Memory limit: 512MB per parser instance
Automatic garbage collection
2. Processing Speed
Target: 10,000 events/second
Parallel processing: 4 worker threads
Database batch inserts: 1000 records
Index optimization for common queries
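A four-worker pool over batches could look roughly like the sketch below. `process_batches` and `handle_batch` are hypothetical names; `handle_batch` stands in for whatever normalizes and inserts one batch and is assumed to return the number of records it handled. Note that Python threads mainly help when the per-batch work is I/O-bound (for example database inserts); CPU-bound parsing would need processes instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Sequence


def process_batches(
    batches: Iterable[Sequence],
    handle_batch: Callable[[Sequence], int],
    workers: int = 4,
) -> int:
    """Run handle_batch over all batches on a thread pool and total the counts."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order and re-raises the first worker exception
        return sum(pool.map(handle_batch, batches))
```

`pool.map` submits every batch up front, so for very large inputs a bounded queue would be needed to keep memory within the stated per-instance limit.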
3. Storage Optimization
Compress raw XML data
Index frequently queried fields
Partition by event timestamp
Retention policies by event type
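Compressing the raw XML before storage can be sketched with the standard-library zlib module. The function names and the compression level are illustrative assumptions; the storage layer may equally rely on database-level compression instead.

```python
import zlib


def compress_xml(xml_str: str) -> bytes:
    """Compress raw event XML for storage (level 6 trades speed against size)."""
    return zlib.compress(xml_str.encode("utf-8"), 6)


def decompress_xml(blob: bytes) -> str:
    """Restore the original XML string from a stored blob."""
    return zlib.decompress(blob).decode("utf-8")
```

Event XML is highly repetitive (namespaces, element names), so it tends to compress well, although very short records can come out larger than the input.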
Security Considerations
1. File Validation
Verify EVTX file signatures
Scan for malicious content
Enforce file size limits
Validate XML structure
2. Access Control
Role-based upload permissions
Audit trail for uploads
Data classification tags
Encryption at rest
3. Data Privacy
PII detection and masking
Compliance field mapping
Retention policy enforcement
Export restrictions
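One possible approach to PII masking is to replace selected field values with a salted hash prefix, which hides the value while still letting analysts correlate events that share it. The field list, the `masked:` prefix, and the `mask_pii` function are hypothetical; which fields count as PII depends on the deployment's compliance requirements.

```python
import hashlib

# Hypothetical set of Windows event fields treated as PII
PII_FIELDS = {"TargetUserName", "SubjectUserName", "IpAddress"}


def mask_pii(event_data: dict, salt: bytes) -> dict:
    """Return a copy of event_data with PII field values replaced by salted hashes."""
    masked = dict(event_data)
    for field in PII_FIELDS:
        value = masked.get(field)
        if value is None:
            continue
        digest = hashlib.sha256(salt + str(value).encode("utf-8")).hexdigest()
        # Keep a short prefix: enough to correlate events, not to read the value
        masked[field] = f"masked:{digest[:12]}"
    return masked
```

The same salt must be used across a dataset for correlation to work, and it should be stored separately from the masked events.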
Testing Strategy
1. Unit Tests
Parser accuracy validation
Field mapping verification
Error handling scenarios
Performance benchmarks
2. Integration Tests
End-to-end upload flow
Database persistence
API response validation
Progress tracking
3. Load Tests
Large file processing (>1GB)
Concurrent uploads
Memory usage monitoring
Database performance
Deployment Plan
Phase 1: Development (Week 1-2)
Implement Python EVTX parser
Create Node.js integration service
Add file upload API endpoint
Basic field mapping
Phase 2: Enhancement (Week 3-4)
Advanced field mappings
Batch processing optimization
Progress tracking API
Error handling
Phase 3: Testing (Week 5)
Unit test coverage
Integration testing
Performance optimization
Security review
Phase 4: Deployment (Week 6)
Docker containerization
Production deployment
Monitoring setup
Documentation
Success Metrics
Performance:
  Parse 1GB EVTX in <2 minutes
  Support 10+ concurrent uploads
  99.9% parsing accuracy
Coverage:
  Support all Windows versions (7/8/10/11/Server)
  Parse 100+ event ID types
  Extract 50+ unique fields
Reliability:
  Zero data loss
  Automatic retry on failure
  Comprehensive error logging
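Automatic retry on failure is commonly implemented with exponential backoff. The sketch below is a generic helper under assumed defaults (three attempts, 0.5 s base delay); `with_retry` is a hypothetical name and is not taken from the SecureWatch codebase.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. For batch inserts, retries are only safe when the insert is idempotent; otherwise a retried batch could produce duplicate records.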
Future Enhancements
Format Support:
  EVT (legacy format)
  PCAP with Windows events
  Sysmon enhanced events
Processing Features:
  Real-time streaming
  Deduplication
  Correlation rules
Integration:
  Direct Windows collector
  WMI/WinRM support
  Active Directory enrichment