ssh

package
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 1, 2025 License: Apache-2.0 Imports: 17 Imported by: 1

README

ssh

Package ssh provides a thin wrapper around golang.org/x/crypto/ssh, github.com/pkg/sftp, and github.com/kevinburke/ssh_config specifically meant for use by Hadron.

Design Principles

  1. Minimal API Surface: Expose only what's necessary - Pool.GetClient() returns a Connection interface with Execute(), UploadFile(), and UploadData()
  2. Secure by Default: Ed25519-only, SSH agent authentication, strict host key verification - no configuration required
  3. Performant by Default: Automatic connection pooling and reuse per endpoint - no manual management
  4. No Footguns: Internal client type prevents misuse - you cannot accidentally create unmanaged connections
  5. SSH Agent Delegation: Heavily delegate to SSH agent for authentication, leaving control and flexibility to the operator to configure SSH without modifying plan files
  6. SSH Config Resolution: Automatically resolve connection parameters (User, Port, Hostname) from ~/.ssh/config based on endpoint aliases
  7. Host Key Verification: Enforce strict host key checking using ~/.ssh/known_hosts with Ed25519-only algorithm restriction
  8. Modern Protocols: Use SFTP for file transfers (not deprecated SCP)
  9. Reuse, do not reinvent: Leverage as much as possible from underlying libraries

API Design

Public API (What You Use)
  • Pool: Manages SSH connections with automatic pooling and cleanup

    • NewPool(logger): Creates a new connection pool
    • GetClient(endpoint) Connection: Returns a connection for the endpoint (creates/reuses as needed)
    • GetClientWithFingerprint(endpoint, fingerprint) Connection: Returns a connection with fingerprint verification
    • CloseAll() error: Closes all pooled connections
    • Size() int: Returns the number of active connections
  • Connection interface: Minimal interface for SSH operations

    • Execute(command string) (stdout, stderr string, err error): Run remote commands
    • UploadFile(localPath, remotePath string) error: Upload files from disk
    • UploadData(data []byte, remotePath string) error: Upload raw bytes without local temp files
Internal Implementation (Hidden)
  • client: Unexported implementation type - cannot be instantiated directly
  • newClient(endpoint): Unexported constructor - only Pool uses this
  • connect(), close(): Unexported methods - managed by Pool

This design ensures:

  • You cannot forget to close connections - Pool.CloseAll() handles cleanup
  • You cannot create duplicate connections - Pool automatically reuses
  • You cannot bypass security defaults - No way to disable Ed25519 or host key verification
  • You cannot misuse the client - Only the safe interface is exposed

Features

  • Automatic Connection Pooling: Single SSH connection per endpoint with automatic SFTP session management
  • Config Resolution: Support for SSH config aliases (e.g., GetClient("production-server") resolves via ~/.ssh/config)
  • Endpoint Formats: Accepts IP addresses, hostnames, SSH config aliases, or user@host notation
  • File Uploads: Two upload methods with automatic 0600 permissions:
    • UploadFile(localPath, remotePath): Upload files from disk
    • UploadData(data, remotePath): Upload raw bytes without creating local temp files
  • Command Execution: Execute(command) runs commands and returns stdout/stderr
  • Security Hardening:
    • Ed25519-only host key algorithms (rejects RSA, ECDSA, DSA)
    • SSH agent-based authentication (no key files in plan code)
    • Strict known_hosts verification with helpful error messages
    • Optional fingerprint-based verification for automated deployments
    • Automatic file descriptor cleanup (prevents SSH agent connection leaks)

Usage Example

// Create connection pool
pool := ssh.NewPool(logger)
defer pool.CloseAll() // Automatic cleanup of all connections

// Get connection for endpoint (automatically resolves from ~/.ssh/config)
// Connection is created on first use and reused on subsequent calls
conn, err := pool.GetClient("production-server")
if err != nil {
    log.Fatal(err)
}

// Upload file (sets 0600 permissions automatically)
if err := conn.UploadFile("/local/file.txt", "/remote/file.txt"); err != nil {
    log.Fatal(err)
}

// Upload sensitive data without creating local temp files
credentials := []byte(`{"key": "sensitive-data"}`)
if err := conn.UploadData(credentials, "/remote/credentials.json"); err != nil {
    log.Fatal(err)
}

// Execute command
stdout, stderr, err := conn.Execute("ls -la /remote")
if err != nil {
    log.Printf("stderr: %s", stderr)
    log.Fatal(err)
}
fmt.Println(stdout)
Fingerprint-Based Verification

For automated deployments (CI/CD, Terraform, Ansible-like tools) where pre-populating ~/.ssh/known_hosts is impractical, you can use fingerprint-based verification:

// Get the host key fingerprint first:
// ssh-keyscan -t ed25519 hostname | ssh-keygen -lf -
// Output: 256 SHA256:nThbg6kXUpJWGl7E1IGOCspRomTxdCARLviKw6E5SY8 hostname (ED25519)

pool := ssh.NewPool(logger)
defer pool.CloseAll()

// Connect with fingerprint verification (bypasses ~/.ssh/known_hosts)
fingerprint := "SHA256:nThbg6kXUpJWGl7E1IGOCspRomTxdCARLviKw6E5SY8"
conn, err := pool.GetClientWithFingerprint("production-server", fingerprint)
if err != nil {
    log.Fatal(err) // Will fail if fingerprint doesn't match
}

// Use connection as normal
stdout, stderr, err := conn.Execute("whoami")

When to use fingerprint verification:

  • ✅ CI/CD pipelines with ephemeral build agents
  • ✅ Automated provisioning tools (like Hadron plans)
  • ✅ Infrastructure-as-code where the fingerprint is part of the config
  • ❌ Interactive deployments (use ~/.ssh/known_hosts instead)

Error Messages

The package provides actionable error messages for common issues:

  • Host Key Mismatch: "host key verification failed: key mismatch (possible MITM attack) for hostname. If you trust this host, remove the old key from ~/.ssh/known_hosts and retry"
  • Unknown Host: "host key verification failed: host not found in known_hosts: hostname. To add this host, run: ssh-keyscan -H hostname >> ~/.ssh/known_hosts"
  • No SSH Agent: "SSH agent not available: ensure SSH_AUTH_SOCK is set and ssh-agent is running"

Code Audit Results

Audit Date: 2025-10-21 Auditors: Security Specialist, Code Quality Auditor Methodology: Comprehensive security and code quality review with critical analysis

Executive Summary

Two independent audits (security and code quality) were conducted and their findings were challenged against common sense, operational context, and security-vs-usability tradeoffs. The module demonstrates strong security fundamentals with Ed25519-only enforcement, mandatory host key verification, and SSH agent authentication. However, one critical resource leak and several concurrency safety issues require immediate remediation before production deployment in high-concurrency environments.

Overall Assessment: Production-ready for single-threaded use (current Pool pattern). Requires fixes for multi-threaded scenarios.

Critical Findings Requiring Immediate Action
1. SSH Agent Connection Leak [FIXED ✅]

Location: client.go:234-255 (getSSHAgentAuth method)

Issue: Unix socket connection to SSH agent was never closed, causing file descriptor leak.

Impact:

  • Each connection attempt leaked one file descriptor
  • High-throughput deployments would exhaust system resources
  • Could lead to application crashes when file descriptor limit is reached

Resolution: Added agentConn net.Conn field to Client struct. The connection is now stored in getSSHAgentAuth() (line 248) and properly closed in Close() method (line 191-194).

Status: ✅ FIXED - Connection is now properly closed, preventing file descriptor leak.


High Priority Findings
2. Thread Safety - Execute/Upload Methods Not Protected [VERIFIED - HIGH]

Location: client.go:191-230 (Execute), client.go:339-408 (UploadFile, UploadData)

Issue: Methods check c.client without mutex protection, creating race conditions if Close() is called concurrently.

Audit Claim Challenged: The current Pool pattern (in pool.go) already provides single-threaded access per client. This is only an issue if callers use Client directly without Pool.

Mitigation Options:

  1. Document single-threaded requirement (short-term)
  2. Add read locks around operations (if concurrent use is needed)
  3. Current state acceptable since Pool provides serialization

Decision: Add documentation warning. The Pool pattern makes concurrent access to the same Client instance unlikely. If direct Client usage becomes common, add mutex protection.

Status: ACCEPTED with documentation mitigation.


3. User Resolution Logic Incorrect [VERIFIED - HIGH]

Location: client.go:127-134

Issue: Code comment says "endpoint user takes precedence" but implementation checks SSH config first.

// Current (wrong):
user := ssh_config.Get(c.endpoint, "User")  // Checks SSH config FIRST
if user == "" && endpointUser != "" {
    user = endpointUser  // Endpoint user SECOND
}

// Should be:
user := endpointUser  // Check endpoint FIRST
if user == "" {
    user = ssh_config.Get(c.endpoint, "User")  // SSH config SECOND
}

Impact: user@host syntax is ignored if SSH config has User directive, violating principle of explicit configuration taking precedence.

Status: ACCEPTED - Fix logic to match documentation.


4. Silent Failure on Output Read Errors [VERIFIED - MEDIUM]

Location: client.go:202-203

stdoutBytes, _ := io.ReadAll(stdoutPipe)  // Error ignored
stderrBytes, _ := io.ReadAll(stderrPipe)  // Error ignored

Impact: Truncated command output without user notification. Should return error if reads fail.

Status: ACCEPTED - Check and propagate read errors.


Security Findings - Challenged and Contextualized
5. Command Injection Risk [CHALLENGED - Context Matters]

Audit Claim: Execute() enables command injection if commands are constructed from user input.

Reality Check:

  • This is a deployment framework, not a web application
  • Commands come from plan files written by operators, not end users
  • The Execute() method is intentionally low-level to support arbitrary deployment commands
  • Adding shell escaping would break legitimate use cases (e.g., docker run ... && docker logs ...)

Conclusion: This is acceptable by design for an infrastructure tool. Document that Execute() must not be called with untrusted input. If higher-level safety is needed, build it in the executor layer, not the SSH client.

Status: REJECTED - Working as intended. Add documentation warning.


6. Path Traversal in File Uploads [CHALLENGED - Operational Necessity]

Audit Claim: UploadFile() and UploadData() accept unchecked paths, enabling traversal attacks.

Reality Check:

  • Deployment tools need to write to arbitrary paths (e.g., /etc/systemd/system/myservice.service)
  • Blocking absolute paths or .. would break legitimate deployment scenarios
  • The operator writing the plan file is the trust boundary, not the SSH client
  • Deployment frameworks routinely write to system directories - this is the entire point

Conclusion: Path validation would make the tool unusable for its intended purpose. Trust is placed in the plan author, not the SSH client.

Status: REJECTED - This is a feature, not a bug. Deployment tools must have system-wide write access.


7. Ed25519-Only Host Keys Too Restrictive [REJECTED - Security Feature]

Audit Claim: Rejecting RSA/ECDSA keys prevents connection to legacy hosts.

Reality Check:

  • Ed25519-only is a deliberate security decision, not a limitation
  • Modern infrastructure should use Ed25519 (it's been available since OpenSSH 6.5, 2014)
  • Allowing RSA opens the door to weak keys (RSA-1024, RSA-2048 with SHA-1)
  • This is a deployment framework for new infrastructure, not a general-purpose SSH client
  • If RSA is needed, the administrator can generate Ed25519 keys on the target host

Conclusion: The audit is wrong. This is a security strength, not a weakness. Ed25519-only enforcement should be preserved.

Status: REJECTED - Security posture is correct. Keep Ed25519-only enforcement.


Low Priority Observations
8. Missing Timeouts [ACCEPTED - LOW]

Connection and operation timeouts should be configurable. Current behavior (indefinite wait) may be acceptable for deployment scenarios but should be tunable.

9. No Logging in Client [ACCEPTED - LOW]

Client methods have no logging while Pool does. Consider adding optional logger injection for observability.

10. Missing Test Coverage [ACCEPTED - INFO]

Zero tests found for this critical component. Should add unit tests with race detection.


Disputed/Rejected Findings

The following audit findings were rejected after analysis:

  1. "Atomic file uploads needed" - SFTP Create() followed by Chmod() is standard practice. Temp-file-and-rename adds complexity for marginal benefit. Network interruptions are handled by deployment retry logic.

  2. "Information disclosure in error messages" - Error messages like "remove key from ~/.ssh/known_hosts" are operational necessities for a deployment tool. Operators need actionable error messages.

  3. "Port validation missing" - Valid point but low severity. Invalid ports fail fast at connection time anyway.

  4. "known_hosts creation race" - File creation races are harmless (one succeeds, others fail-but-continue). Not worth adding complexity.


Remediation Plan

Immediate (Before Production):

  1. Fix SSH agent connection leak (2 hours)
  2. Fix user resolution precedence logic (30 minutes)
  3. Handle output read errors in Execute() (30 minutes)
  4. Document thread safety expectations (30 minutes)

Short-term (Next Sprint): 5. Add mutex protection to Execute/Upload methods (4 hours) 6. Add timeout configuration (2 hours) 7. Add comprehensive test suite (8 hours)

Not Required:

  • Command injection protection (by design)
  • Path traversal validation (breaks functionality)
  • Weakening Ed25519-only policy (security regression)
  • Atomic file uploads (complexity not justified)

Security Posture Summary

Strengths:

  • ✅ Ed25519-only host key enforcement (modern cryptography)
  • ✅ Mandatory known_hosts verification (no insecure fallbacks)
  • ✅ SSH agent authentication (no key material in memory)
  • ✅ Secure file permissions (0600 on uploaded files)
  • ✅ Proper error wrapping throughout

Weaknesses Requiring Action:

  • ⚠️ SSH agent connection leak
  • ⚠️ Thread safety gaps in Execute/Upload methods
  • ⚠️ User resolution precedence bug

Architectural Decisions (Not Bugs):

  • 🎯 Intentionally low-level Execute() for flexibility
  • 🎯 Unrestricted path access for deployment operations
  • 🎯 Ed25519-only enforcement for security

Overall: Strong security foundation with one critical resource management bug. After fixing the agent leak, this code is suitable for production deployment in infrastructure automation contexts.


Conclusion

After challenging audit findings against operational context and security requirements, the module demonstrates production-quality security practices with Ed25519 enforcement, proper host verification, and secure defaults. The critical SSH agent connection leak must be fixed immediately. Thread safety documentation should be added. Claims about "command injection" and "path traversal" are misunderstandings of deployment tool requirements and should be rejected.

Production Recommendation: Fix agent leak → Add documentation → Deploy with confidence.

Documentation

Overview

Package ssh provides SSH client and connection pool utilities.

Package ssh provides SSH client and connection pool utilities.

Index

Constants

This section is empty.

Variables

View Source
var (
	// ErrConnectionClose indicates failure closing SSH connection.
	ErrConnectionClose = errors.New("failed to close SSH connection")

	// ErrSessionLimitExceeded indicates the server's MaxSessions limit was reached.
	// This typically means too many concurrent sessions on a single SSH connection.
	ErrSessionLimitExceeded = errors.New("SSH session limit exceeded (server MaxSessions reached)")

	// ErrConnectionProhibited indicates the server rejected the connection request.
	ErrConnectionProhibited = errors.New("SSH connection prohibited by server")

	// ErrResourceShortage indicates the server lacks resources to handle the request.
	ErrResourceShortage = errors.New("SSH server resource shortage")

	// ErrChannelOpen indicates a generic channel open failure.
	ErrChannelOpen = errors.New("failed to open SSH channel")

	// ErrNotConnected indicates an operation was attempted without an active connection.
	ErrNotConnected = errors.New("not connected")

	// ErrHostKeyMismatch indicates the host key doesn't match expected value.
	ErrHostKeyMismatch = errors.New("host key verification failed: key mismatch (possible MITM attack)")

	// ErrHostNotInKnownHosts indicates the host is not in known_hosts file.
	ErrHostNotInKnownHosts = errors.New("host key verification failed: host not found in known_hosts")

	// ErrNoSSHAgent indicates SSH agent is not available.
	ErrNoSSHAgent = errors.New("SSH agent not available: ensure SSH_AUTH_SOCK is set and ssh-agent is running")

	// ErrPassphraseKey indicates the SSH key requires a passphrase.
	ErrPassphraseKey = errors.New("SSH key is passphrase-protected (use unencrypted key or SSH agent)")

	// ErrIdentityKeyNotFound indicates the identity file key was not found in SSH agent.
	ErrIdentityKeyNotFound = errors.New("identity file key not found in SSH agent")
)

Exported errors for consumer type checking.

Functions

This section is empty.

Types

type Connection

type Connection interface {
	Execute(command string) (stdout, stderr string, err error)
	ExecuteStreaming(command string, stdout, stderr io.Writer) error
	ExecuteWithStdin(command string, stdin []byte) error
	UploadFile(localPath, remotePath string) error
	UploadData(data []byte, remotePath string) error
}

Connection represents an active SSH connection. All methods are safe for use within the context managed by Pool.

type Pool

type Pool struct {
	// contains filtered or unexported fields
}

Pool manages SSH connections to multiple hosts. It ensures one connection per unique host and reuses connections.

func NewPool

func NewPool(logger zerolog.Logger) *Pool

NewPool creates a new SSH connection pool.

func (*Pool) CloseAll

func (p *Pool) CloseAll() error

CloseAll closes all SSH connections in the pool.

func (*Pool) GetClient

func (p *Pool) GetClient(endpoint string) (Connection, error)

GetClient returns a Connection for the given endpoint, creating and connecting if needed. The endpoint can be an IP address, hostname, or SSH config alias. Connection parameters are resolved from ~/.ssh/config.

func (*Pool) GetClientWithFingerprint

func (p *Pool) GetClientWithFingerprint(endpoint, fingerprint string) (Connection, error)

GetClientWithFingerprint returns a Connection for the given endpoint with optional fingerprint verification. If fingerprint is provided, it will be used for host key verification instead of ~/.ssh/known_hosts. The fingerprint should be in SHA256 format (e.g., "SHA256:abc123...") or MD5 format (e.g., "MD5:ab:cd:ef...").

func (*Pool) GetClientWithKey

func (p *Pool) GetClientWithKey(endpoint, fingerprint, keyContent string) (Connection, error)

GetClientWithKey returns a Connection for the given endpoint with optional fingerprint and SSH key. If keyContent is provided, it will be used for authentication instead of SSH agent. If fingerprint is provided, it will be used for host key verification instead of ~/.ssh/known_hosts.

func (*Pool) Size

func (p *Pool) Size() int

Size returns the number of active connections in the pool.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL