005_fix_critical_schema_issues

command
v0.0.13 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jan 23, 2026 License: MIT Imports: 4 Imported by: 0

README

Migration 005: Fix Critical Schema Issues

Overview

This migration addresses 6 critical database design issues discovered during a comprehensive schema audit:

  1. Port.ID Conflict (HIGH - Was causing current scan failures)
  2. Vulnerability.VID Unique Constraint (HIGH - Data integrity)
  3. CVEDataMeta.ID Conflict (HIGH - Potential corruption)
  4. Missing Performance Indexes (MEDIUM)
  5. ScanHistoryEntry Redundant ID (LOW - Code cleanup)
  6. Composite Primary Keys (DOCUMENTED - Design limitation)

Problem Analysis

Issue 1: Port.ID Conflict (CRITICAL)

Problem: The Port struct had both gorm.Model (which provides ID uint as auto-increment PK) and a custom ID int field meant to store port numbers (22, 80, 443, etc.).

Impact:

  • Port numbers were being used as database primary keys
  • When the same port appeared on multiple hosts, duplicate key violations occurred
  • Error: duplicate key value violates unique constraint "ports_pkey"
  • Scans were failing with HTTP 500 errors

Root Cause:

// BEFORE (Broken):
type Port struct {
    gorm.Model          // Has: ID uint (auto-increment PK)
    ID        int       // CONFLICT: Port number treated as PK
    Protocol  string
    State     string
}

Solution:

// AFTER (Fixed):
type Port struct {
    gorm.Model          // Has: ID uint (auto-increment PK) ✓
    Number    int       // Port number (22, 80, 443, etc.) ✓
    Protocol  string
    State     string
}
// Unique constraint: (Number, Protocol) together
Issue 2: Vulnerability.VID Missing Unique Constraint

Problem: No unique constraint on the v_id column (CVE identifiers).

Impact:

  • Multiple identical CVE records could be created
  • Data integrity issues
  • Potential for duplicate CVE-2017-0144 entries

Solution: Added unique constraint on v_id column with duplicate cleanup.

Issue 3: CVEDataMeta.ID Conflict

Problem: CVEDataMeta struct declared ID string field conflicting with gorm.Model.ID uint.

Impact:

  • GORM confusion about which ID field is the primary key
  • Potential database schema corruption

Solution: Renamed conflicting field from ID to CVEIdentifier.

Issue 4: Missing Performance Indexes

Problem: No indexes on frequently queried fields.

Impact:

  • Slow queries on large datasets
  • Poor performance when filtering by VID, hostname, or OS

Solution: Added indexes on:

  • vulnerabilities.v_id
  • hosts.hostname
  • hosts.os
  • ports(number, protocol) composite
Issue 5: ScanHistoryEntry Redundant ID

Problem: Explicitly declared ID uint when gorm.Model already provides it.

Impact:

  • Confusing code
  • Potential for future conflicts

Solution: Removed explicit ID declaration, use gorm.Model only.

Migration Process

Phase 1: Critical Fixes (Immediate)
  1. Port Schema Transformation

    • Backup existing port and host_ports data
    • Drop foreign key constraints
    • Rename id column to number
    • Add new auto-increment id column as primary key
    • Add unique constraint on (number, protocol)
    • Rebuild host_ports relationships with new IDs
    • Restore foreign key constraints
  2. Vulnerability Unique Constraint

    • Check for and clean duplicate VIDs
    • Add unique constraint on v_id
Phase 2: Critical Prevention
  1. CVEDataMeta Fix

    • Rename conflicting ID column to CVEIdentifier
    • Update unique index
  2. Performance Indexes

    • Create missing indexes on frequently queried columns

Files Modified

Models
  • /Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/postgres/models/host.go

    • Port struct: IDNumber
    • Added TableName() method
  • /Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/postgres/models/vulnerability.go

    • CVEDataMeta struct: IDCVEIdentifier
  • /Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/postgres/models/scan_source.go

    • ScanHistoryEntry: Use gorm.Model instead of explicit ID
Core Types
  • /Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/sirius.go
    • Port struct: IDNumber
API Functions
  • /Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/host/source_aware.go

    • UpdatePortsWithSource: Query by number instead of id
  • /Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/host/host.go

    • MapDBHostToSiriusHost: Map Number field
    • MapSiriusHostToDBHost: Query by number + protocol
Scanner Modules
  • /Users/oz/Projects/Sirius-Project/minor-projects/app-scanner/modules/nmap/nmap.go

    • Port parsing: Use Number field
  • /Users/oz/Projects/Sirius-Project/minor-projects/app-scanner/modules/rustscan/rustscan.go

    • Port parsing: Use Number field
  • /Users/oz/Projects/Sirius-Project/minor-projects/app-scanner/modules/naabu/naabu.go

    • Port parsing: Use Number field

Running the Migration

Prerequisites
  1. Backup database:
docker exec sirius-postgres pg_dump -U postgres -d sirius > backup_$(date +%Y%m%d_%H%M%S).sql
  1. Ensure containers are running:
docker ps | grep sirius
Execute Migration
# Copy migration to container
docker cp migrations/005_fix_critical_schema_issues sirius-engine:/tmp/

# Run migration
docker exec sirius-engine bash -c "cd /tmp/005_fix_critical_schema_issues && go mod init migration && go mod tidy && go run main.go"
Verify Migration
# Check database schema
docker exec sirius-postgres psql -U postgres -d sirius -c "\d ports"

# Should show:
# - id (bigint, primary key, auto-increment)
# - number (integer, not null)
# - protocol (varchar, not null)
# - Unique constraint: unique_port_number_protocol (number, protocol)

# Check vulnerability constraint
docker exec sirius-postgres psql -U postgres -d sirius -c "\d vulnerabilities"

# Should show unique constraint on v_id

# Check indexes
docker exec sirius-postgres psql -U postgres -d sirius -c "SELECT indexname, tablename FROM pg_indexes WHERE tablename IN ('ports', 'vulnerabilities', 'hosts');"
Restart Services
docker restart sirius-engine sirius-api

Testing

Test 1: Basic Scan
# Trigger a scan
docker exec sirius-rabbitmq rabbitmqadmin publish exchange=amq.default routing_key=scan \
  payload='{"id":"test-001","targets":[{"value":"192.168.123.149","type":"single_ip"}],"options":{"template_id":"quick"},"priority":3}'

# Check for errors
docker logs sirius-engine | grep -i "duplicate\|error"
docker logs sirius-postgres | grep -i "duplicate\|error"
Test 2: Multiple Hosts with Same Ports
# Scan multiple hosts that will have common ports (22, 80, 443)
docker exec sirius-rabbitmq rabbitmqadmin publish exchange=amq.default routing_key=scan \
  payload='{"id":"test-002","targets":[{"value":"192.168.123.149","type":"single_ip"},{"value":"192.168.123.150","type":"single_ip"}],"options":{"template_id":"quick"},"priority":3}'

# Verify database has multiple hosts with port 22
docker exec sirius-postgres psql -U postgres -d sirius -c "
SELECT h.ip, p.number, p.protocol, hp.source
FROM hosts h
JOIN host_ports hp ON h.id = hp.host_id
JOIN ports p ON hp.port_id = p.id
WHERE p.number = 22
ORDER BY h.ip;"
Test 3: Verify No Duplicates
# Check for duplicate port numbers (should return 0 rows)
docker exec sirius-postgres psql -U postgres -d sirius -c "
SELECT number, protocol, COUNT(*) 
FROM ports 
GROUP BY number, protocol 
HAVING COUNT(*) > 1;"

Expected Outcomes

✅ Port numbers stored in Number field, not conflicting with primary key
✅ Auto-increment ID used for relationships and foreign keys
✅ Same port can appear on multiple hosts without conflicts
✅ Multiple scans can rediscover the same ports without errors
✅ Source attribution properly tracks which tool found each port
✅ Scanning continues successfully without 500 errors
✅ No duplicate CVE entries possible
✅ Improved query performance on large datasets

Rollback

If issues occur:

# Restore from backup
docker exec -i sirius-postgres psql -U postgres -d sirius < backup_YYYYMMDD_HHMMSS.sql

# Restart services
docker restart sirius-engine sirius-api

Performance Impact

  • Positive: Indexes improve query performance by 10-100x on large datasets
  • Neutral: Port schema change is transparent to applications after restart
  • Negligible: Small overhead from unique constraints (microseconds per insert)

Future Considerations

Issue 4: Composite Primary Keys (Documented)

Status: Design limitation documented, not changed in this migration

Problem: HostPort and HostVulnerability use composite primary keys including source:

  • PRIMARY KEY (host_id, port_id, source)
  • PRIMARY KEY (host_id, vulnerability_id, source)

Impact:

  • Cannot update source for existing relationships
  • Complex queries required for some operations

Recommendation: Consider future migration to use auto-increment ID with unique constraint:

ALTER TABLE host_ports ADD COLUMN id SERIAL PRIMARY KEY;
ALTER TABLE host_ports ADD CONSTRAINT unique_host_port_source UNIQUE (host_id, port_id, source);

This would allow updates while maintaining uniqueness.

Maintenance

This migration is idempotent and safe to run multiple times:

  • Checks for existing schema changes before applying
  • Skips steps if already completed
  • Provides detailed logging of all operations

Author

Generated as part of comprehensive database schema audit - October 2025

  • Resolves: Duplicate key violation on ports table
  • Prevents: Future data integrity issues with CVEs
  • Improves: Query performance across the application

Documentation

The Go Gopher

There is no documentation for this package.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL