EVTX Parsing Strategy for SecureWatch SIEM

Overview

This document outlines the strategy for parsing Windows Event Log (EVTX) files in SecureWatch SIEM so that historical Windows security events can be ingested for threat hunting and forensic analysis. It has been updated for Enhanced EVTX Parser v2.0, which adds MITRE ATT&CK technique detection and Sysmon support.

Enhanced EVTX Parser v2.0

Key Enhancements

  • MITRE ATT&CK Integration: Automatic technique detection with 50+ supported techniques

  • Sysmon Support: Full coverage of Events 1-29 with enhanced field extraction

  • Attack Pattern Recognition: 50+ regex patterns for malicious behavior detection

  • Risk Scoring Algorithm: Intelligent threat prioritization (0-100 scale)

  • Web Upload Interface: Real-time file parsing via frontend component

  • EVTX-ATTACK-SAMPLES Testing: Validated against 329 attack samples
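
The document does not describe how technique detection and risk scoring are implemented, so the following is only a minimal sketch of the general idea: regex patterns tagged with ATT&CK technique IDs, with per-pattern weights summed and capped at 100. The technique IDs are real ATT&CK identifiers, but the patterns, the weights, and the names PATTERNS and detect_techniques are assumptions made for illustration.

```python
import re

# Hypothetical pattern table: (compiled regex, ATT&CK technique ID, weight).
# The regexes and weights are illustrative assumptions, not the parser's rules.
PATTERNS = [
    (re.compile(r"-enc(odedcommand)?\s", re.I), "T1059.001", 40),   # PowerShell
    (re.compile(r"wevtutil(\.exe)?\s+cl\s", re.I), "T1070.001", 60),  # log clearing
    (re.compile(r"schtasks(\.exe)?\s+/create", re.I), "T1053.005", 30),  # scheduled task
]


def detect_techniques(command_line):
    """Return (sorted technique IDs, risk score capped at 100) for one command line."""
    hits = [(tid, weight) for rx, tid, weight in PATTERNS if rx.search(command_line)]
    techniques = sorted({tid for tid, _ in hits})
    score = min(100, sum(weight for _, weight in hits))
    return techniques, score
```

Summing weights and clamping keeps the score on the 0-100 scale named above; a real scorer would likely also weigh event type and detection confidence.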

Implementation Status

✅ Completed: Enhanced parser with MITRE ATT&CK detection
✅ Completed: Comprehensive testing against EVTX-ATTACK-SAMPLES
✅ Completed: Web-based upload component
✅ Completed: Integration with SecureWatch platform
✅ Completed: Risk scoring and confidence assessment

Architecture Design

1. Parser Service Architecture

┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  File Upload    │────▶│  EVTX Parser     │────▶│ Log Normalizer  │
│  API Endpoint   │     │  Service         │     │                 │
└─────────────────┘     └──────────────────┘     └─────────────────┘
         │                       │                         │
         │                       │                         ▼
         │                       │                ┌─────────────────┐
         │                       │                │   TimescaleDB   │
         │                       │                └─────────────────┘
         │                       │
         ▼                       ▼
┌─────────────────┐     ┌──────────────────┐
│  File Storage   │     │  Processing      │
│  (Temporary)    │     │  Queue           │
└─────────────────┘     └──────────────────┘

2. Technology Stack

  • Python EVTX Library: python-evtx for parsing binary EVTX files

  • Node.js Integration: Child process or microservice for Python execution

  • File Processing: Multer for file uploads, Bull for job queuing

  • Storage: Temporary file storage with automatic cleanup

Implementation Strategy

Phase 1: EVTX Parser Module

Create a Python-based EVTX parser that:

  1. Reads binary EVTX files using python-evtx

  2. Extracts all event records with full metadata

  3. Converts Windows XML schema to JSON format

  4. Preserves all fields for comprehensive analysis

Phase 2: API Integration

Extend the log-ingestion service with:

  1. File upload endpoint: POST /api/ingest/evtx

  2. Support for multiple file formats:

    • .evtx (binary Windows Event Log)

    • .xml (exported Windows Event XML)

    • .json (pre-processed event data)

  3. Async processing with job queue

  4. Progress tracking and status API

Phase 3: Enhanced Field Mapping

Map Windows Event fields to SecureWatch schema:

  1. Security Events (4624/4625) - Authentication

  2. Process Events (4688) - Process creation

  3. Service Events (7045) - Service installation

  4. Task Events (4698/106) - Scheduled tasks

  5. PowerShell Events (4103/4104) - Script execution

  6. Audit Events (1102) - Log clearing

Phase 4: Batch Processing

Implement efficient batch processing:

  1. Chunk large EVTX files (>100MB)

  2. Parallel processing of event records

  3. Memory-efficient streaming

  4. Progress reporting via WebSocket
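
As a rough illustration of memory-efficient streaming, the sketch below groups an event iterator into fixed-size batches without materializing the whole file. The helper name batched is an assumption; the default of 1000 follows the batch size given under Performance Considerations.

```python
from itertools import islice


def batched(events, size=1000):
    """Yield lists of up to `size` events, pulling from the source lazily."""
    iterator = iter(events)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk
```

Because only one chunk is held at a time, peak memory depends on the batch size rather than on the size of the EVTX file.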

Detailed Component Design

1. Python EVTX Parser (evtx_parser.py)

import json
import sys
import xml.etree.ElementTree as ET

import Evtx.Evtx as evtx

class EVTXParser:
    def __init__(self, file_path):
        self.file_path = file_path

    def parse(self):
        """Parse an EVTX file and yield one JSON-serializable dict per record"""
        with evtx.Evtx(self.file_path) as log:
            for record in log.records():
                yield self.record_to_json(record)

    def record_to_json(self, record):
        """Convert an EVTX record to a JSON-serializable dict"""
        # Render the record as XML and parse it
        xml_str = record.xml()
        root = ET.fromstring(xml_str)

        # Build comprehensive event object.
        # The extract_* helper methods are not shown in this document.
        event = {
            "EventID": self.extract_event_id(root),
            "TimeCreated": self.extract_timestamp(root),
            "Computer": self.extract_computer(root),
            "Channel": self.extract_channel(root),
            "Provider": self.extract_provider(root),
            "Level": self.extract_level(root),
            "Task": self.extract_task(root),
            "Keywords": self.extract_keywords(root),
            "EventData": self.extract_event_data(root),
            "UserData": self.extract_user_data(root),
            "System": self.extract_system_data(root),
            "RawXML": xml_str
        }
        return event

if __name__ == "__main__":
    # Emit one JSON object per line on stdout for the Node.js service to read
    for event in EVTXParser(sys.argv[1]).parse():
        print(json.dumps(event))
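
The extract_* helpers referenced by the parser are not defined in this document. The sketch below shows one way such helpers might look, assuming the record XML declares the standard Windows event namespace on its root element. The function names extract_system_fields and extract_event_data and the SAMPLE record are illustrative, not the parser's actual code.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the root <Event> element of Windows event XML
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}


def extract_system_fields(root):
    """Pull common <System> fields out of a parsed <Event> element."""
    system = root.find("e:System", NS)
    return {
        "EventID": int(system.findtext("e:EventID", namespaces=NS)),
        "Computer": system.findtext("e:Computer", namespaces=NS),
        "Channel": system.findtext("e:Channel", namespaces=NS),
        "Provider": system.find("e:Provider", NS).get("Name"),
        "TimeCreated": system.find("e:TimeCreated", NS).get("SystemTime"),
    }


def extract_event_data(root):
    """Map each named <Data> child of <EventData> to its text value."""
    return {
        data.get("Name"): data.text
        for data in root.findall("e:EventData/e:Data", NS)
        if data.get("Name")
    }


# Hand-written sample record used only to exercise the helpers
SAMPLE = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-Security-Auditing"/>
    <EventID>4624</EventID>
    <TimeCreated SystemTime="2024-01-01T00:00:00.000000Z"/>
    <Channel>Security</Channel>
    <Computer>host01</Computer>
  </System>
  <EventData>
    <Data Name="TargetUserName">alice</Data>
    <Data Name="LogonType">3</Data>
  </EventData>
</Event>"""

root = ET.fromstring(SAMPLE)
system_fields = extract_system_fields(root)
event_data = extract_event_data(root)
```

Without the namespace mapping, lookups such as root.find("System") return None, which is a common stumbling block when parsing event XML with ElementTree.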

2. Node.js Integration Service

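// evtx-parser.service.ts
import { spawn } from 'child_process';
import { EventEmitter } from 'events';

export class EVTXParserService extends EventEmitter {
  // Resolves once the parser process has exited; failures are reported
  // through the 'error' event rather than by rejecting.
  parseEVTXFile(filePath: string): Promise<void> {
    return new Promise((resolve) => {
      const python = spawn('python3', ['evtx_parser.py', filePath]);
      let buffer = '';
      let stderr = '';

      const emitLine = (line: string) => {
        if (!line.trim()) return;
        try {
          this.emit('event', JSON.parse(line));
        } catch {
          this.emit('error', new Error(`Invalid JSON from parser: ${line.slice(0, 100)}`));
        }
      };

      // A chunk can end mid-line, so keep the trailing partial line in the
      // buffer until the rest of it arrives.
      python.stdout.setEncoding('utf8');
      python.stdout.on('data', (chunk: string) => {
        buffer += chunk;
        const lines = buffer.split('\n');
        buffer = lines.pop() ?? '';
        lines.forEach(emitLine);
      });

      // Collect stderr and report it only if the process fails
      python.stderr.on('data', (data) => {
        stderr += data.toString();
      });

      // Raised when the process cannot be started (e.g. python3 is missing)
      python.on('error', (error) => {
        this.emit('error', error);
        resolve();
      });

      python.on('close', (code) => {
        emitLine(buffer); // flush a final line that had no trailing newline
        if (code === 0) {
          this.emit('complete');
        } else {
          this.emit('error', new Error(`Parser exited with code ${code}: ${stderr.trim()}`));
        }
        resolve();
      });
    });
  }
}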

3. API Endpoint Implementation

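// evtx-upload.route.ts
import fs from 'fs';
import multer from 'multer';
import { Router } from 'express';
import { EVTXParserService } from './evtx-parser.service';
import { LogNormalizer } from '../processors/log-normalizer';

// ingestBatch(events) is assumed to be provided elsewhere in the
// log-ingestion service; it is not defined in this document.

const router = Router();
const upload = multer({ 
  dest: '/tmp/evtx-uploads/',
  limits: { fileSize: 1024 * 1024 * 500 } // 500MB limit
});

router.post('/api/ingest/evtx', upload.single('file'), async (req, res) => {
  const { file } = req;
  if (!file) {
    res.status(400).json({ error: 'No file uploaded' });
    return;
  }

  const parser = new EVTXParserService();
  const normalizer = new LogNormalizer();

  let processedCount = 0;
  let failed = false;
  const batchSize = 100;
  let batch = [];
  // Batch writes are chained so they run one at a time and in order
  let pending = Promise.resolve();

  // Cleanup uploaded file
  const cleanup = () => fs.unlink(file.path, () => {});

  const fail = (error: Error) => {
    if (failed) return;
    failed = true;
    // Once progress lines have been written the status code cannot change
    if (res.headersSent) {
      res.end(JSON.stringify({ status: 'error', error: error.message }));
    } else {
      res.status(500).json({ error: error.message });
    }
    cleanup();
  };

  const flush = () => {
    // Swap the batch out first so events arriving during the insert are kept
    const current = batch;
    batch = [];
    pending = pending.then(async () => {
      if (failed) return;
      await ingestBatch(current);
      processedCount += current.length;
      if (!failed) {
        // Send progress update
        res.write(JSON.stringify({ 
          status: 'processing', 
          processed: processedCount 
        }) + '\n');
      }
    }).catch(fail);
  };

  parser.on('event', (event) => {
    batch.push(normalizer.normalizeWindowsEvent(event));
    if (batch.length >= batchSize) flush();
  });

  parser.on('complete', async () => {
    if (batch.length > 0) flush();
    await pending; // wait for every queued batch before reporting the total
    if (failed) return;
    res.end(JSON.stringify({ 
      status: 'complete', 
      total: processedCount 
    }));
    cleanup();
  });

  parser.on('error', fail);

  await parser.parseEVTXFile(file.path);
});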

Field Mapping Strategy

Windows Event ID Mappings

1. Authentication Events (4624/4625)

{
  // Windows Fields → SecureWatch Fields
  "LogonType": "auth.logon_type",
  "TargetUserName": "user.name",
  "TargetDomainName": "user.domain",
  "TargetUserSid": "user.id",
  "IpAddress": "source.ip",
  "IpPort": "source.port",
  "WorkstationName": "source.hostname",
  "LogonProcessName": "process.name",
  "AuthenticationPackageName": "auth.package",
  "Status": "event.outcome",
  "SubStatus": "auth.sub_status"
}

2. Process Creation (4688)

{
  "NewProcessName": "process.executable",
  "CommandLine": "process.command_line",
  "ProcessId": "process.pid",
  "ParentProcessName": "process.parent.executable",
  "SubjectUserName": "user.name",
  "SubjectDomainName": "user.domain",
  "TokenElevationType": "process.elevation_type"
}

3. PowerShell Events (4103/4104)

{
  "ScriptBlockText": "powershell.script_block",
  "Path": "file.path",
  "ScriptBlockId": "powershell.script_id",
  "MessageNumber": "powershell.message_number",
  "MessageTotal": "powershell.message_total"
}
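
The mapping tables above can be applied mechanically. The following is a minimal sketch that turns flat EventData into a nested document by splitting the dotted target names. Whether SecureWatch stores nested objects or flat dotted keys is not stated here, so the nesting is an assumption, as are the names AUTH_FIELD_MAP and apply_field_map.

```python
# Subset of the 4624/4625 mapping shown above
AUTH_FIELD_MAP = {
    "TargetUserName": "user.name",
    "TargetDomainName": "user.domain",
    "IpAddress": "source.ip",
    "LogonType": "auth.logon_type",
}


def apply_field_map(event_data, field_map):
    """Build a nested dict from flat EventData using dotted target paths."""
    result = {}
    for source_key, target_path in field_map.items():
        if source_key not in event_data:
            continue  # leave absent fields out rather than writing None
        *parents, leaf = target_path.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = event_data[source_key]
    return result
```

Fields that have no entry in the map are dropped by this sketch; a real normalizer would probably keep them under a catch-all key so that no data is lost.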

Performance Considerations

1. Memory Management

  • Stream processing for large files

  • Chunk size: 1000 events per batch

  • Memory limit: 512MB per parser instance

  • Automatic garbage collection

2. Processing Speed

  • Target: 10,000 events/second

  • Parallel processing: 4 worker threads

  • Database batch inserts: 1000 records

  • Index optimization for common queries

3. Storage Optimization

  • Compress raw XML data

  • Index frequently queried fields

  • Partition by event timestamp

  • Retention policies by event type
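
For the raw XML compression item, one simple option is to compress the RawXML string with zlib before it is stored. This is only a sketch: the document does not say which compression method is used, and the function names are hypothetical.

```python
import zlib


def compress_raw_xml(xml_str, level=6):
    """Compress an event's raw XML string to bytes for storage."""
    return zlib.compress(xml_str.encode("utf-8"), level)


def decompress_raw_xml(blob):
    """Restore the original XML string from compressed bytes."""
    return zlib.decompress(blob).decode("utf-8")
```

Event XML is highly repetitive, so it tends to compress well, but compressed blobs cannot be searched directly; fields needed for queries should still be extracted and indexed separately.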

Security Considerations

1. File Validation

  • Verify EVTX file signatures

  • Scan for malicious content

  • Enforce file size limits

  • Validate XML structure
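
A cheap first validation step is to check the file size and the EVTX magic bytes before handing the upload to the parser. EVTX files begin with the 8-byte signature "ElfFile" followed by a NUL byte. The sketch below assumes the same 500MB limit as the upload endpoint; validate_evtx and its return shape are illustrative.

```python
import os

EVTX_MAGIC = b"ElfFile\x00"           # first 8 bytes of an EVTX file header
MAX_UPLOAD_BYTES = 500 * 1024 * 1024  # mirrors the 500MB upload limit


def validate_evtx(path, max_bytes=MAX_UPLOAD_BYTES):
    """Return (ok, reason) after checking size and the EVTX signature."""
    if os.path.getsize(path) > max_bytes:
        return False, "file exceeds size limit"
    with open(path, "rb") as handle:
        if handle.read(len(EVTX_MAGIC)) != EVTX_MAGIC:
            return False, "missing EVTX signature"
    return True, "ok"
```

A signature check only rejects obviously wrong files; it does not replace content scanning or structural validation of the parsed XML.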

2. Access Control

  • Role-based upload permissions

  • Audit trail for uploads

  • Data classification tags

  • Encryption at rest

3. Data Privacy

  • PII detection and masking

  • Compliance field mapping

  • Retention policy enforcement

  • Export restrictions

Testing Strategy

1. Unit Tests

  • Parser accuracy validation

  • Field mapping verification

  • Error handling scenarios

  • Performance benchmarks

2. Integration Tests

  • End-to-end upload flow

  • Database persistence

  • API response validation

  • Progress tracking

3. Load Tests

  • Large file processing (>1GB)

  • Concurrent uploads

  • Memory usage monitoring

  • Database performance

Deployment Plan

Phase 1: Development (Week 1-2)

  1. Implement Python EVTX parser

  2. Create Node.js integration service

  3. Add file upload API endpoint

  4. Basic field mapping

Phase 2: Enhancement (Week 3-4)

  1. Advanced field mappings

  2. Batch processing optimization

  3. Progress tracking API

  4. Error handling

Phase 3: Testing (Week 5)

  1. Unit test coverage

  2. Integration testing

  3. Performance optimization

  4. Security review

Phase 4: Deployment (Week 6)

  1. Docker containerization

  2. Production deployment

  3. Monitoring setup

  4. Documentation

Success Metrics

  1. Performance

    • Parse 1GB EVTX in <2 minutes

    • Support 10+ concurrent uploads

    • 99.9% parsing accuracy

  2. Coverage

    • Support all Windows versions (7/8/10/11/Server)

    • Parse 100+ event ID types

    • Extract 50+ unique fields

  3. Reliability

    • Zero data loss

    • Automatic retry on failure

    • Comprehensive error logging

Future Enhancements

  1. Format Support

    • EVT (legacy format)

    • PCAP with Windows events

    • Sysmon enhanced events

  2. Processing Features

    • Real-time streaming

    • Deduplication

    • Correlation rules

  3. Integration

    • Direct Windows collector

    • WMI/WinRM support

    • Active Directory enrichment