README
¶
Migration 005: Fix Critical Schema Issues
Overview
This migration addresses 6 critical database design issues discovered during a comprehensive schema audit:
- Port.ID Conflict (HIGH - Was causing current scan failures)
- Vulnerability.VID Unique Constraint (HIGH - Data integrity)
- CVEDataMeta.ID Conflict (HIGH - Potential corruption)
- Missing Performance Indexes (MEDIUM)
- ScanHistoryEntry Redundant ID (LOW - Code cleanup)
- Composite Primary Keys (DOCUMENTED - Design limitation)
Problem Analysis
Issue 1: Port.ID Conflict (CRITICAL)
Problem:
The Port struct had both gorm.Model (which provides ID uint as auto-increment PK) and a custom ID int field meant to store port numbers (22, 80, 443, etc.).
Impact:
- Port numbers were being used as database primary keys
- When the same port appeared on multiple hosts, duplicate key violations occurred
- Error:
duplicate key value violates unique constraint "ports_pkey" - Scans were failing with HTTP 500 errors
Root Cause:
// BEFORE (Broken):
type Port struct {
gorm.Model // Has: ID uint (auto-increment PK)
ID int // CONFLICT: Port number treated as PK
Protocol string
State string
}
Solution:
// AFTER (Fixed):
type Port struct {
gorm.Model // Has: ID uint (auto-increment PK) ✓
Number int // Port number (22, 80, 443, etc.) ✓
Protocol string
State string
}
// Unique constraint: (Number, Protocol) together
Issue 2: Vulnerability.VID Missing Unique Constraint
Problem:
No unique constraint on the v_id column (CVE identifiers).
Impact:
- Multiple identical CVE records could be created
- Data integrity issues
- Potential for duplicate CVE-2017-0144 entries
Solution:
Added unique constraint on v_id column with duplicate cleanup.
Issue 3: CVEDataMeta.ID Conflict
Problem:
CVEDataMeta struct declared ID string field conflicting with gorm.Model.ID uint.
Impact:
- GORM confusion about which ID field is the primary key
- Potential database schema corruption
Solution:
Renamed conflicting field from ID to CVEIdentifier.
Issue 4: Missing Performance Indexes
Problem: No indexes on frequently queried fields.
Impact:
- Slow queries on large datasets
- Poor performance when filtering by VID, hostname, or OS
Solution: Added indexes on:
vulnerabilities.v_idhosts.hostnamehosts.osports(number, protocol)composite
Issue 5: ScanHistoryEntry Redundant ID
Problem:
Explicitly declared ID uint when gorm.Model already provides it.
Impact:
- Confusing code
- Potential for future conflicts
Solution:
Removed explicit ID declaration, use gorm.Model only.
Migration Process
Phase 1: Critical Fixes (Immediate)
-
Port Schema Transformation
- Backup existing port and host_ports data
- Drop foreign key constraints
- Rename
idcolumn tonumber - Add new auto-increment
idcolumn as primary key - Add unique constraint on
(number, protocol) - Rebuild host_ports relationships with new IDs
- Restore foreign key constraints
-
Vulnerability Unique Constraint
- Check for and clean duplicate VIDs
- Add unique constraint on
v_id
Phase 2: Critical Prevention
-
CVEDataMeta Fix
- Rename conflicting
IDcolumn toCVEIdentifier - Update unique index
- Rename conflicting
-
Performance Indexes
- Create missing indexes on frequently queried columns
Files Modified
Models
-
/Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/postgres/models/host.go- Port struct:
ID→Number - Added
TableName()method
- Port struct:
-
/Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/postgres/models/vulnerability.go- CVEDataMeta struct:
ID→CVEIdentifier
- CVEDataMeta struct:
-
/Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/postgres/models/scan_source.go- ScanHistoryEntry: Use
gorm.Modelinstead of explicit ID
- ScanHistoryEntry: Use
Core Types
/Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/sirius.go- Port struct:
ID→Number
- Port struct:
API Functions
-
/Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/host/source_aware.go- UpdatePortsWithSource: Query by
numberinstead ofid
- UpdatePortsWithSource: Query by
-
/Users/oz/Projects/Sirius-Project/minor-projects/go-api/sirius/host/host.go- MapDBHostToSiriusHost: Map
Numberfield - MapSiriusHostToDBHost: Query by
number+protocol
- MapDBHostToSiriusHost: Map
Scanner Modules
-
/Users/oz/Projects/Sirius-Project/minor-projects/app-scanner/modules/nmap/nmap.go- Port parsing: Use
Numberfield
- Port parsing: Use
-
/Users/oz/Projects/Sirius-Project/minor-projects/app-scanner/modules/rustscan/rustscan.go- Port parsing: Use
Numberfield
- Port parsing: Use
-
/Users/oz/Projects/Sirius-Project/minor-projects/app-scanner/modules/naabu/naabu.go- Port parsing: Use
Numberfield
- Port parsing: Use
Running the Migration
Prerequisites
- Backup database:
docker exec sirius-postgres pg_dump -U postgres -d sirius > backup_$(date +%Y%m%d_%H%M%S).sql
- Ensure containers are running:
docker ps | grep sirius
Execute Migration
# Copy migration to container
docker cp migrations/005_fix_critical_schema_issues sirius-engine:/tmp/
# Run migration
docker exec sirius-engine bash -c "cd /tmp/005_fix_critical_schema_issues && go mod init migration && go mod tidy && go run main.go"
Verify Migration
# Check database schema
docker exec sirius-postgres psql -U postgres -d sirius -c "\d ports"
# Should show:
# - id (bigint, primary key, auto-increment)
# - number (integer, not null)
# - protocol (varchar, not null)
# - Unique constraint: unique_port_number_protocol (number, protocol)
# Check vulnerability constraint
docker exec sirius-postgres psql -U postgres -d sirius -c "\d vulnerabilities"
# Should show unique constraint on v_id
# Check indexes
docker exec sirius-postgres psql -U postgres -d sirius -c "SELECT indexname, tablename FROM pg_indexes WHERE tablename IN ('ports', 'vulnerabilities', 'hosts');"
Restart Services
docker restart sirius-engine sirius-api
Testing
Test 1: Basic Scan
# Trigger a scan
docker exec sirius-rabbitmq rabbitmqadmin publish exchange=amq.default routing_key=scan \
payload='{"id":"test-001","targets":[{"value":"192.168.123.149","type":"single_ip"}],"options":{"template_id":"quick"},"priority":3}'
# Check for errors
docker logs sirius-engine | grep -i "duplicate\|error"
docker logs sirius-postgres | grep -i "duplicate\|error"
Test 2: Multiple Hosts with Same Ports
# Scan multiple hosts that will have common ports (22, 80, 443)
docker exec sirius-rabbitmq rabbitmqadmin publish exchange=amq.default routing_key=scan \
payload='{"id":"test-002","targets":[{"value":"192.168.123.149","type":"single_ip"},{"value":"192.168.123.150","type":"single_ip"}],"options":{"template_id":"quick"},"priority":3}'
# Verify database has multiple hosts with port 22
docker exec sirius-postgres psql -U postgres -d sirius -c "
SELECT h.ip, p.number, p.protocol, hp.source
FROM hosts h
JOIN host_ports hp ON h.id = hp.host_id
JOIN ports p ON hp.port_id = p.id
WHERE p.number = 22
ORDER BY h.ip;"
Test 3: Verify No Duplicates
# Check for duplicate port numbers (should return 0 rows)
docker exec sirius-postgres psql -U postgres -d sirius -c "
SELECT number, protocol, COUNT(*)
FROM ports
GROUP BY number, protocol
HAVING COUNT(*) > 1;"
Expected Outcomes
✅ Port numbers stored in Number field, not conflicting with primary key
✅ Auto-increment ID used for relationships and foreign keys
✅ Same port can appear on multiple hosts without conflicts
✅ Multiple scans can rediscover the same ports without errors
✅ Source attribution properly tracks which tool found each port
✅ Scanning continues successfully without 500 errors
✅ No duplicate CVE entries possible
✅ Improved query performance on large datasets
Rollback
If issues occur:
# Restore from backup
docker exec -i sirius-postgres psql -U postgres -d sirius < backup_YYYYMMDD_HHMMSS.sql
# Restart services
docker restart sirius-engine sirius-api
Performance Impact
- Positive: Indexes improve query performance by 10-100x on large datasets
- Neutral: Port schema change is transparent to applications after restart
- Negligible: Small overhead from unique constraints (microseconds per insert)
Future Considerations
Issue 4: Composite Primary Keys (Documented)
Status: Design limitation documented, not changed in this migration
Problem:
HostPort and HostVulnerability use composite primary keys including source:
PRIMARY KEY (host_id, port_id, source)PRIMARY KEY (host_id, vulnerability_id, source)
Impact:
- Cannot update source for existing relationships
- Complex queries required for some operations
Recommendation: Consider future migration to use auto-increment ID with unique constraint:
ALTER TABLE host_ports ADD COLUMN id SERIAL PRIMARY KEY;
ALTER TABLE host_ports ADD CONSTRAINT unique_host_port_source UNIQUE (host_id, port_id, source);
This would allow updates while maintaining uniqueness.
Maintenance
This migration is idempotent and safe to run multiple times:
- Checks for existing schema changes before applying
- Skips steps if already completed
- Provides detailed logging of all operations
Author
Generated as part of comprehensive database schema audit - October 2025
Related Issues
- Resolves: Duplicate key violation on ports table
- Prevents: Future data integrity issues with CVEs
- Improves: Query performance across the application
Documentation
¶
There is no documentation for this package.