http-over-polling

module

v0.2.1 Latest Latest Go to latest Published: Mar 18, 2026 License: MIT

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/DevNewbie1826/http-over-polling

Links

Open Source Insights

README ¶

http-over-polling

http-over-polling is a net/http-compatible experiment built on top of cloudwego/netpoll.

The current tree focuses on two things:

serving regular net/http handlers through an event-loop-driven transport
showing where poller-style handling helps, and where standard net/http remains competitive

The repository is strongest as a runnable playground: it ships a single example server, in-process benchmarks, and manual wrk reproduction steps.

Platform note: the current transport path and examples are intended for Linux and macOS. Windows is not a supported target right now.

At a glance

Latest local benchmark snapshot:

Category	`std`	`hop`	Takeaway
`wrk` throughput	`218,662.67 req/s`	`223,689.11 req/s`	`hop` led by about `2.30%` on the bundled `/` workload
`wrk` avg latency	`1.15 ms`	`1.05 ms`	`hop` averaged lower latency in this run
`wrk` p99 latency	`6.02 ms`	`5.81 ms`	tail latency was slightly lower on `hop` in this run
HTTP idle connections	`10000` connections / `10000` goroutines	`10000` connections / `0 observed` goroutines	`std` grows with connections; `hop` stayed near zero in this run
WebSocket idle connections	`10000` connections / `10000` goroutines	`10000` connections / `0 observed` goroutines	the event-driven `hop` WS path stayed near zero in this run

The wrk rows summarize the bundled / workload. The goroutine rows are process-wide observed deltas from the in-repo goroutine benchmarks, not universal guarantees. The exact commands and caveats are documented in Benchmarking with wrk and In-process benchmarks.

What is in this repo

Path	Role
`server/`	top-level server wrapper around `cloudwego/netpoll`
`engine/`	request execution pipeline and connection state management
`adaptor/`	`http.ResponseWriter` implementation plus hijack/read-handler bridge
`appcontext/`	request-scoped connection context helpers
`internal/parser/`	internal HTTP parsing implementation
`examples/`	runnable server plus benchmark and example tests

Quick start

Run the bundled example server from the repository root:

go run ./examples -type hop

Available modes:

Mode	Command	Notes
`hop`	`go run ./examples -type hop`	`netpoll`-backed server
`std`	`go run ./examples -type std`	standard `net/http` baseline

The example server listens on :1826 and also starts pprof on localhost:6060.

Example endpoints

examples/main.go exposes these routes:

Endpoint	Purpose
`/`	overview page listing the demo endpoints
`/ws-gobwas-low`	low-level event-driven WebSocket echo
`/ws-gobwas-high`	`wsutil`-based event-driven WebSocket echo
`/ws-gorilla-std`	classic Gorilla loop with a goroutine per connection
`/ws-gorilla-event`	Gorilla API driven through `SetReadHandler`
`/sse`	simple server-sent events stream
`/file`	serves `examples/main.go`

The WebSocket examples intentionally show different execution styles, not just different libraries. gobwas and the event-driven Gorilla path stay close to the poller model; the classic Gorilla loop demonstrates the more traditional goroutine-per-connection shape.

Current API shape

There is no hop.ListenAndServe(...) convenience entrypoint in the current tree. The public setup pattern is explicit:

package main

import (
    "log"
    "net/http"

    hengine "github.com/DevNewbie1826/http-over-polling/engine"
    hserver "github.com/DevNewbie1826/http-over-polling/server"
)

func main() {
    mux := http.NewServeMux()
    mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        w.Write([]byte("hello from hop"))
    })

    eng := hengine.NewEngine(mux)
    srv := hserver.NewServer(eng)

    log.Fatal(srv.Serve(":1826"))
}

Useful server options live in server/server.go:

server.WithPollerNum(...)
server.WithBufferSize(...)
server.WithMaxConns(...)
server.WithKeepAliveTimeout(...)
server.WithReadTimeout(...)
server.WithWriteTimeout(...)

For upgraded or hijacked flows, adaptor.Hijacker exposes SetReadHandler(...), which is how the event-driven WebSocket examples plug custom frame handling into the connection lifecycle.

Benchmarking with `wrk`

The repository does not currently ship a wrk_compare.sh script. The supported flow is manual:

start one server mode
warm it up with a short wrk run
measure it with the same wrk command
restart in the other mode and repeat

Command shape used for the measurements below:

wrk -t6 -c300 -d30s --latency http://127.0.0.1:1826/

Metric guide:

Requests/sec: completed requests per second
Latency: request-to-response time; --latency adds percentile output
Transfer/sec: response bytes transferred per second

wrk's own documentation recommends using a meaningful test duration and reading percentile output rather than only the average. These numbers were taken locally on the same machine as the server, so treat them as directional measurements for this workload, not universal claims.

This comparison is for the bundled example configurations, not a transport-only micro-comparison. In the current tree, hop runs through server.NewServer(...) with explicit WithPollerNum, WithBufferSize, and zero timeout settings, and the server path also uses the listener configuration in server/server.go. The std baseline is the plain http.ListenAndServe(...) path from examples/main.go.

Latest local `wrk` result

Measured on 2026-03-17 against the / handler in examples/main.go, after a 5 second warm-up for each mode.

Metric	`std`	`hop`	Delta (`hop` vs `std`)
Requests/sec	218,662.67	223,689.11	+2.30%
Transfer/sec	95.51 MB/s	97.70 MB/s	+2.29%
Avg latency	1.15 ms	1.05 ms	-8.70%
P50 latency	711 us	722 us	+1.55%
P99 latency	6.02 ms	5.81 ms	-3.49%

What this current measurement says:

on the bundled root-handler workload, hop now edges out std in throughput
the gap is modest, so the interesting part is not a huge headline multiplier
the stronger story in this repository is still execution model control, especially for upgraded or long-lived connections
percentile results are mixed: hop improves average and p99 here, while p50 is effectively flat to slightly worse

If you want to rerun it yourself:

# std
go run ./examples -type std
wrk -t6 -c300 -d5s http://127.0.0.1:1826/
wrk -t6 -c300 -d30s --latency http://127.0.0.1:1826/

# hop
go run ./examples -type hop
wrk -t6 -c300 -d5s http://127.0.0.1:1826/
wrk -t6 -c300 -d30s --latency http://127.0.0.1:1826/

In-process benchmarks

For faster iteration, the example package also includes Go benchmarks:

go test ./examples -run '^$' -bench '^BenchmarkServer_(Standard|Hop)_HTTP$|^BenchmarkServer_Standard_WS$' -benchmem -count=1
go test ./examples -run '^$' -bench '^BenchmarkServer_Standard_HTTP_Goroutines$' -benchmem -benchtime=1x -count=1
go test ./examples -run '^$' -bench '^BenchmarkServer_Hop_HTTP_Goroutines$' -benchmem -benchtime=1x -count=1
go test ./examples -run '^$' -bench '^BenchmarkServer_Standard_WS_Goroutines$' -benchmem -benchtime=1x -count=1
go test ./examples -run '^$' -bench '^BenchmarkServer_Hop_WS_Goroutines$' -benchmem -benchtime=1x -count=1

The goroutine benchmarks open as many idle connections as the current process can sustain for that benchmark run. By default they derive a target from the open-file limit and the local ephemeral port range, then report the number of connections actually established before sampling. You can still override the target explicitly with GOROUTINE_BENCH_CONN_COUNT.

The custom metrics mean:

connections: idle connections successfully established for that run
goroutines: peak additional goroutines observed above the post-startup baseline during sampling
goroutines/conn: observed goroutine delta per idle connection

Use GOROUTINE_BENCH_CONN_COUNT to change the connection count:

GOROUTINE_BENCH_CONN_COUNT=512 go test ./examples -run '^$' -bench '^BenchmarkServer_Standard_HTTP_Goroutines$' -benchmem -benchtime=1x -count=1
GOROUTINE_BENCH_CONN_COUNT=512 go test ./examples -run '^$' -bench '^BenchmarkServer_Hop_WS_Goroutines$' -benchmem -benchtime=1x -count=1

A recent local run of the throughput-focused in-process benchmarks produced:

Benchmark	Result
`BenchmarkServer_Standard_HTTP`	`27,329 ns/op`, `5115 B/op`, `63 allocs/op`
`BenchmarkServer_Hop_HTTP`	`32,912 ns/op`, `4852 B/op`, `62 allocs/op`
`BenchmarkServer_Standard_WS`	`13,352 ns/op`, `1088 B/op`, `5 allocs/op`

That is useful context: the in-process benchmark and the external wrk benchmark answer different questions. If they disagree, prefer describing the workload rather than forcing one universal conclusion.

A recent local run of the goroutine benchmarks, using the automatically derived maximum connection count for each individual benchmark run on this machine, produced:

Benchmark	Result
`BenchmarkServer_Standard_HTTP_Goroutines`	`2865 connections`, `2865 goroutines`, `1.000 goroutines/conn`
`BenchmarkServer_Hop_HTTP_Goroutines`	`8123 connections`, `0 observed goroutines`, `0 goroutines/conn`
`BenchmarkServer_Standard_WS_Goroutines`	`987 connections`, `987 goroutines`, `1.000 goroutines/conn`
`BenchmarkServer_Hop_WS_Goroutines`	`4382 connections`, `0 observed goroutines`, `0 goroutines/conn`

Those goroutine benchmarks are intentionally about idle connection shape, not request throughput. They are most useful as sanity checks for per-connection goroutine growth, and the hop values should be read as “near-zero additional goroutines observed in this run”, not as a universal structural guarantee.

Verification commands

Core verification:

go test ./... -count=1

Targeted example checks:

go test ./examples -count=1
go test ./examples -run '^$' -bench '^BenchmarkServer_(Standard|Hop)_HTTP$|^BenchmarkServer_Standard_WS$' -benchmem -count=1
go test ./examples -run '^$' -bench '^BenchmarkServer_Standard_HTTP_Goroutines$' -benchmem -benchtime=1x -count=1
go test ./examples -run '^$' -bench '^BenchmarkServer_Hop_HTTP_Goroutines$' -benchmem -benchtime=1x -count=1
go test ./examples -run '^$' -bench '^BenchmarkServer_Standard_WS_Goroutines$' -benchmem -benchtime=1x -count=1
go test ./examples -run '^$' -bench '^BenchmarkServer_Hop_WS_Goroutines$' -benchmem -benchtime=1x -count=1

If you are investigating the parser directly:

go test ./internal/parser -run '^$' -bench . -benchmem

How to describe this project honestly

The current repository is strongest when described this way:

it adapts net/http handlers onto a cloudwego/netpoll-backed server path
it includes side-by-side stdlib and poller-driven server modes in one runnable example
it demonstrates multiple upgraded-connection styles, including event-driven WebSocket handling
it gives you enough local commands to rerun both in-process and end-to-end measurements yourself

What to avoid saying:

hop is always faster than net/http
netpoll removes goroutines entirely
one benchmark here predicts every production workload

That narrower framing matches both the current code and the upstream netpoll positioning more closely.

Acknowledgements

This project references and learns from the ideas and implementations in:

https://github.com/cloudwego/netpoll
https://github.com/valyala/bytebufferpool
https://github.com/valyala/fasthttp/tree/master/tcplisten

Thanks to the authors and maintainers of those projects.

Directories ¶

Path	Synopsis
adaptor
appcontext
engine
examples
internal
parser
tcplisten The files in this package are copied from github.com/valyala/fasthttp.	The files in this package are copied from github.com/valyala/fasthttp.
server

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL

README ¶

http-over-polling

At a glance

What is in this repo

Quick start

Example endpoints

Current API shape

Benchmarking with wrk

Latest local wrk result

In-process benchmarks

Verification commands

How to describe this project honestly

Acknowledgements

Directories ¶

Benchmarking with `wrk`

Latest local `wrk` result