Documentation ¶
Overview ¶
Package robusthttp implements automatic retry logic for HTTP transfers, recovering from TCP streams that become "stuck" at extremely low throughput.
The Problem: TCP Streams Getting Stuck ¶
On multi-gigabit network connections, especially when multiple TCP flows compete for bandwidth, individual streams can become trapped in a state where they transmit only a few bytes per second despite having abundant link capacity available. This phenomenon occurs both on local networks and over the internet, with long-haul cross-continent routes being particularly susceptible.
Why TCP Streams Get Stuck ¶
TCP's congestion control algorithms (Cubic, BBR, Reno, etc.) attempt to find the available bandwidth by probing: they increase the sending rate until packet loss occurs, then back off. When multiple flows compete, several failure modes emerge:
Congestion Window Collapse: When packet loss occurs, TCP drastically reduces its congestion window (cwnd). On high-bandwidth, high-latency paths (large BDP), recovery can take minutes because the additive increase phase grows cwnd by only ~1 MSS per RTT. A flow that loses packets while competing with aggressive flows may never recover its fair share.
Bufferbloat-Induced Starvation: When network buffers are oversized, competing flows can fill queues with thousands of packets. A flow that experiences loss during this state backs off, while the competing flows continue to hold queue space. The backed-off flow's packets consistently arrive at the tail of deep queues, experiencing enormous RTTs (seconds) and further timeouts.
Retransmission Timeout (RTO) Spirals: After loss, if the retransmitted packets are also lost (common under congestion), RTO doubles exponentially (up to 60-120 seconds). The flow becomes effectively dormant while competing flows absorb the freed capacity. The standard RTO minimum of 200ms-1s further delays recovery.
Receive Window Limitations: On some paths, especially through middleboxes or with asymmetric routing, receive window updates can be delayed or lost, causing the sender to stall waiting for window space that the receiver has already advertised but the sender never received.
Path Asymmetry on Long Routes: Cross-continent connections often traverse different physical paths for forward and return traffic. ACK packets returning on congested reverse paths can be delayed or lost, starving the sender of acknowledgments and causing spurious retransmissions or window stalls.
Why Restarting Helps ¶
Establishing a new TCP connection provides a fresh start that bypasses many of these stuck states:
Fresh Congestion Window: New connections start with initial cwnd (typically 10 MSS) and can use slow-start to rapidly probe for available bandwidth, rather than being trapped in a collapsed cwnd with linear recovery.
Queue Position Reset: The new flow's packets enter queues without the accumulated RTT debt of the stuck flow. On paths with AQM (Active Queue Management like fq_codel), new flows get their own queue slot.
RTO Timer Reset: The new connection has a fresh RTO estimate based on current RTT measurements, rather than an exponentially backed-off timer.
Potential Path Diversity: ECMP (Equal-Cost Multi-Path) routing may assign the new flow to a different physical path, avoiding congested links entirely.
Updated Receive Window: The receiver re-advertises its current window, resolving any stale window state from the old connection.
How This Package Works ¶
RobustGet creates an HTTP reader that monitors transfer rate and automatically recovers from stuck connections:
Rate Monitoring: Each read operation is tracked. Periodically (every few seconds), the transfer rate is computed and compared against a minimum threshold that accounts for the number of concurrent transfers.
Adaptive Threshold: The minimum acceptable rate scales with the number of active transfers (using logarithmic scaling) and respects the expected link capacity. A single transfer must sustain higher throughput than when bandwidth is split among many.
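The exact threshold formula is internal to the package; the sketch below is a hypothetical illustration of log-scaled thresholds, reusing the exported TotalTransferDivFactor name but assuming a made-up formula and a caller-supplied link capacity.

```go
package main

import (
	"fmt"
	"math"
)

// TotalTransferDivFactor mirrors the package variable's default value.
// The formula below is an illustrative assumption, not the package's code.
var TotalTransferDivFactor int64 = 4

// minRate returns a per-transfer minimum rate (bytes/s) that shrinks
// logarithmically as more transfers share the link, so a single transfer
// must sustain more throughput than one of many concurrent transfers.
func minRate(linkCapacity float64, transfers int) float64 {
	if transfers < 1 {
		transfers = 1
	}
	share := 1.0 + math.Log2(float64(transfers))
	return linkCapacity / (share * float64(TotalTransferDivFactor))
}

func main() {
	link := 1e9 / 8 // 1 Gbit/s link, in bytes per second
	fmt.Printf("1 transfer:  %.0f B/s\n", minRate(link, 1))
	fmt.Printf("8 transfers: %.0f B/s\n", minRate(link, 8))
}
```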
Connection Restart: When rate drops below threshold, the current connection is closed and a new HTTP request is issued with a Range header starting at the last successfully received byte offset.
Retry Limits: To prevent infinite retry loops on truly failed servers, a maximum retry count (15 attempts) is enforced before returning an error.
This approach is particularly effective for large file transfers (sectors, pieces) where the cost of restarting a 32GiB transfer from 10% progress is acceptable compared to waiting potentially hours for a stuck connection to recover (if it ever does).
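The restart-with-Range mechanism can be sketched end to end against a local test server. This is a simplified stand-in for RobustGet, not the package's implementation: the "stall" is simulated by abandoning each response body after 1 KiB, and each retry resumes with a Range header at the last received offset.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
)

// demo runs a restart-with-Range loop against a local test server and
// returns whether the full payload was reassembled, plus the attempt count.
func demo() (ok bool, attempts int) {
	data := []byte(strings.Repeat("robusthttp ", 1000))

	// Test server that honors a simple "Range: bytes=N-" header.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		off := 0
		if rng := r.Header.Get("Range"); strings.HasPrefix(rng, "bytes=") {
			off, _ = strconv.Atoi(strings.TrimSuffix(strings.TrimPrefix(rng, "bytes="), "-"))
			w.WriteHeader(http.StatusPartialContent)
		}
		w.Write(data[off:])
	}))
	defer srv.Close()

	got := make([]byte, 0, len(data))
	for len(got) < len(data) && attempts < 15 { // retry cap, mirroring the package's limit
		attempts++
		req, _ := http.NewRequest("GET", srv.URL, nil)
		// Resume from the last successfully received byte offset.
		req.Header.Set("Range", fmt.Sprintf("bytes=%d-", len(got)))
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			continue
		}
		// Simulate a stalled connection by abandoning the body after 1 KiB.
		buf := make([]byte, 1024)
		n, _ := io.ReadFull(resp.Body, buf)
		got = append(got, buf[:n]...)
		resp.Body.Close()
	}
	return bytes.Equal(got, data), attempts
}

func main() {
	ok, attempts := demo()
	fmt.Println("complete:", ok, "attempts:", attempts)
}
```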
Index ¶
Constants ¶
This section is empty.
Variables ¶
var TotalTransferDivFactor int64 = 4
Functions ¶
func RobustGet ¶
func RobustGet(url string, headers http.Header, dataSize int64, rcf func() *RateCounter) io.ReadCloser
Types ¶
type RateCounter ¶
type RateCounter struct {
// contains filtered or unexported fields
}
func (*RateCounter) Check ¶
func (rc *RateCounter) Check(cb func() error) error
Check allows only a single concurrent check per peer - this is to prevent multiple concurrent checks from causing all transfers to fail at once. When we drop a peer, we'll reduce rc.transfers, so the next check will require less total bandwidth (assuming that MinAvgGlobalLogPeerRate is used).
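The "single concurrent check" rule means a caller that arrives while another check is in flight is skipped rather than queued. A minimal sketch of those semantics using sync.Mutex.TryLock (Go 1.18+) follows; this is an assumption about the mechanism, not the package's actual implementation, and the `ran` return value is added here purely for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// checker sketches the single-concurrent-check rule: a second caller
// that arrives while a check is running is skipped, not queued.
type checker struct{ mu sync.Mutex }

// Check runs cb only if no other check is in flight; ran reports
// whether cb was actually invoked.
func (c *checker) Check(cb func() error) (ran bool, err error) {
	if !c.mu.TryLock() { // another check is already running for this peer
		return false, nil
	}
	defer c.mu.Unlock()
	return true, cb()
}

func main() {
	var c checker
	ran, _ := c.Check(func() error {
		// A nested (i.e. concurrent) check is skipped.
		inner, _ := c.Check(func() error { return nil })
		fmt.Println("nested check ran:", inner)
		return nil
	})
	fmt.Println("outer check ran:", ran)
}
```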
func (*RateCounter) Release ¶
func (rc *RateCounter) Release()
type RateCounters ¶
type RateCounters[K comparable] struct {
// contains filtered or unexported fields
}
func NewRateCounters ¶
func NewRateCounters[K comparable](rateFunc RateFunc) *RateCounters[K]
func (*RateCounters[K]) Get ¶
func (rc *RateCounters[K]) Get(key K) *RateCounter
type RateEnforcingReader ¶
type RateEnforcingReader struct {
// contains filtered or unexported fields
}
func NewRateEnforcingReader ¶
func NewRateEnforcingReader(r io.Reader, rc *RateCounter, windowDuration time.Duration) *RateEnforcingReader
func (*RateEnforcingReader) Done ¶
func (rer *RateEnforcingReader) Done()
func (*RateEnforcingReader) ReadError ¶
func (rer *RateEnforcingReader) ReadError() error