smartremote

package module v0.2.0
Published: Dec 14, 2025 License: MIT Imports: 15 Imported by: 1

README

SmartRemote

SmartRemote is a Go library that provides seamless access to remote HTTP files with intelligent partial downloading and local caching. Rather than downloading entire files upfront, it allows you to open a URL and read from it like a regular file, automatically fetching only the needed portions on-demand.

This is particularly useful for large files like ISOs, ZIPs, and other archives where you might only need to access specific sections (e.g., reading the central directory of a ZIP file without downloading the entire archive).

Features

  • Lazy Loading: Downloads only the blocks you actually read
  • Resume Support: Partial downloads can be saved and resumed via .part files
  • Intelligent Seeking: Handles seekable HTTP connections using Range requests
  • Concurrent Downloads: Manages multiple concurrent download clients with configurable limits
  • Idle Background Downloading: Automatically fills gaps in partial downloads when not actively reading
  • Block-Based Tracking: Uses efficient RoaringBitmap for tracking downloaded 64KB blocks
  • Standard Interfaces: Implements io.Reader, io.ReaderAt, and io.Seeker

Installation

go get github.com/KarpelesLab/smartremote

Quick Start

package main

import (
    "fmt"
    "io"

    "github.com/KarpelesLab/smartremote"
)

func main() {
    // Open a remote file
    f, err := smartremote.Open("https://example.com/largefile.zip")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    // Use f as a regular read-only file
    // It will download parts as needed from the remote URL
    buf := make([]byte, 1024)
    n, err := f.Read(buf)
    if err != nil && err != io.EOF {
        panic(err)
    }
    fmt.Printf("Read %d bytes\n", n)
}

How It Works

Block-Based Downloads

SmartRemote divides remote files into 64KB blocks. When you read from the file, only the blocks containing the requested data are downloaded. Downloaded blocks are:

  1. Stored in a local temporary file
  2. Tracked using a RoaringBitmap for efficient status checking
  3. Persisted to a .part file so downloads can be resumed

HTTP Range Requests

The library uses HTTP Range requests (status 206 Partial Content) to download specific byte ranges. If the server doesn't support Range requests, SmartRemote falls back to downloading the entire file.

Connection Pooling

The DownloadManager maintains a pool of HTTP connections (default: 10) that are reused across requests. Idle connections are automatically cleaned up after 5 minutes.

Background Downloading

When there are no active read requests, SmartRemote opportunistically downloads missing blocks in the background, progressively completing the file.

Advanced Usage

Custom DownloadManager

Create a custom DownloadManager for more control:

dm := smartremote.NewDownloadManager()
dm.MaxConcurrent = 5           // Limit to 5 concurrent connections
dm.MaxDataJump = 1024 * 1024   // Allow skipping up to 1MB when seeking
dm.TmpDir = "/custom/tmp"      // Custom temp directory
dm.Client = customHTTPClient   // Use a custom http.Client

f, err := dm.Open("https://example.com/file.iso")
if err != nil {
    panic(err)
}
defer f.Close()

Specify Local Storage Path

Store the downloaded file at a specific path:

dm := smartremote.NewDownloadManager()
f, err := dm.OpenTo("https://example.com/file.iso", "/path/to/local/file.iso")
if err != nil {
    panic(err)
}
defer f.Close()

Simple ReaderAt Interface

For simple use cases where you just need io.ReaderAt:

dm := smartremote.NewDownloadManager()
reader := dm.For("https://example.com/file.bin")

buf := make([]byte, 100)
n, err := reader.ReadAt(buf, 1000) // Read 100 bytes starting at offset 1000

Force Complete Download

Download the entire file:

f, err := smartremote.Open("https://example.com/file.zip")
if err != nil {
    panic(err)
}
defer f.Close()

// Download everything
err = f.Complete()
if err != nil {
    panic(err)
}

Manual Progress Saving

Manually trigger a save of download progress:

f, err := smartremote.Open("https://example.com/file.zip")
if err != nil {
    panic(err)
}

// ... perform some reads ...

// Save progress explicitly
err = f.SavePart()
if err != nil {
    panic(err)
}

Configuration Options

  • MaxConcurrent (int, default: 10): Maximum number of concurrent HTTP connections
  • MaxReadersPerFile (int, default: 3): Maximum HTTP connections per file, for random access patterns (e.g., ZIP files)
  • MaxDataJump (int64, default: 512KB): Maximum bytes to read and discard when seeking forward (vs opening a new connection)
  • TmpDir (string, default: os.TempDir()): Directory for temporary download files
  • Client (*http.Client, default: http.DefaultClient): HTTP client for making requests
  • Logger (*log.Logger, default: stderr): Logger for debug output

API Reference

Package Functions
  • Open(url string) (*File, error) - Open a remote URL using the default manager

DownloadManager
  • NewDownloadManager() *DownloadManager - Create a new download manager
  • Open(url string) (*File, error) - Open a URL with auto-generated local path
  • OpenTo(url, localPath string) (*File, error) - Open a URL with specific local path
  • For(url string) io.ReaderAt - Get a simple ReaderAt for a URL

File
  • Read(p []byte) (n int, err error) - Read from current position
  • ReadAt(p []byte, off int64) (int, error) - Read from specific offset
  • Seek(offset int64, whence int) (int64, error) - Seek to position
  • Close() error - Close file and save progress
  • GetSize() (int64, error) - Get remote file size
  • SetSize(size int64) - Manually set file size
  • Stat() (os.FileInfo, error) - Get file info
  • Complete() error - Download entire file
  • SavePart() error - Manually save download progress

Resume Behavior

When opening a URL:

  1. If the local file doesn't exist, a new download begins
  2. If the local file exists with a .part file, the download resumes from where it left off
  3. If the local file exists without a .part file, it's assumed to be complete

On close:

  • If download is incomplete and progress was saved successfully, both files are kept for resume
  • If download is incomplete and progress save failed, the partial file is deleted
  • If download is complete, the .part file is removed

Requirements

  • Go 1.18 or later
  • Server must support HTTP Range requests for partial downloads (falls back to full download otherwise)

TODO

  • Add support for range invalidation (bad checksum causes re-download of affected area)

Documentation

Index

Constants

const DefaultBlockSize = 65536

DefaultBlockSize is the size in bytes of each download block (64KB). Downloaded data is tracked and stored in blocks of this size.

Variables

var DefaultDownloadManager = NewDownloadManager()

DefaultDownloadManager is the default DownloadManager used by package-level functions like Open. It is initialized with sensible defaults.

Functions

This section is empty.

Types

type DownloadManager

type DownloadManager struct {
	// MaxConcurrent is the maximum number of concurrent downloads.
	// Changing it might not take effect immediately. Default is 10.
	MaxConcurrent int

	// MaxReadersPerFile is the maximum number of concurrent HTTP connections
	// per file. This allows efficient random access patterns (e.g., ZIP files).
	// Default is 3.
	MaxReadersPerFile int

	// Client is the HTTP client used to fetch the URLs being downloaded.
	Client *http.Client

	// TmpDir is where temporary files are created; defaults to os.TempDir().
	TmpDir string

	// MaxDataJump is the maximum amount of data that will be read and
	// discarded when seeking forward, rather than opening a new connection.
	// Default is 512kB.
	MaxDataJump int64

	*log.Logger
	// contains filtered or unexported fields
}

DownloadManager orchestrates concurrent downloads and manages HTTP connections for accessing remote files. It maintains a pool of download clients, handles connection reuse, and coordinates background downloading of file blocks during idle periods. A default manager is provided as DefaultDownloadManager.

func NewDownloadManager

func NewDownloadManager() *DownloadManager

NewDownloadManager creates and returns a new DownloadManager with default settings: 10 concurrent connections, 512KB maximum data jump for seeking, and the system temp directory for temporary files. The manager starts a background goroutine for connection management and idle downloading.

func (*DownloadManager) For

func (dl *DownloadManager) For(u string) io.ReaderAt

For returns an io.ReaderAt interface for the given URL. This provides a simple way to read from a remote URL at arbitrary offsets without managing a File object. Each ReadAt call may open a new HTTP connection.

func (*DownloadManager) Open

func (dlm *DownloadManager) Open(u string) (*File, error)

Open opens the given URL and returns a file handle that performs partial downloads as reads require. Downloaded data is stored in the system temp directory and is removed at the end if the download is incomplete.

func (*DownloadManager) OpenTo

func (dlm *DownloadManager) OpenTo(u, localPath string) (*File, error)

OpenTo opens a given URL and stores downloaded data at the specified local path. If the file already exists with a .part file, the download will resume. If the file exists without a .part file, it is assumed to be complete.

type File

type File struct {
	// contains filtered or unexported fields
}

File represents a remote file that can be accessed locally through partial downloads. It implements io.Reader, io.ReaderAt, and io.Seeker interfaces, allowing transparent access to remote HTTP content as if it were a local file. Downloaded data is cached locally in blocks, and only the required portions are fetched on demand. Partial download progress can be persisted to disk and resumed later.

func Open

func Open(u string) (*File, error)

Open opens the given URL using the default manager and returns a file handle that performs partial downloads as reads require. Downloaded data is stored in the system temp directory and is removed at the end if the download is incomplete.

func (*File) Close

func (f *File) Close() error

Close closes the file and ensures data is synced to disk if the download is still partial.

func (*File) Complete added in v0.0.15

func (f *File) Complete() error

Complete downloads the whole file locally, returning an error on failure.

func (*File) GetSize

func (f *File) GetSize() (int64, error)

GetSize returns a file's size according to the remote server.

func (*File) Read

func (f *File) Read(p []byte) (n int, err error)

Read reads data from the file at the current position, after ensuring the relevant data has been downloaded.

func (*File) ReadAt

func (f *File) ReadAt(p []byte, off int64) (int, error)

ReadAt reads data from the local file at the specified offset, after ensuring the relevant data has been downloaded.

func (*File) SavePart

func (f *File) SavePart() error

SavePart triggers an immediate save of the download status to a .part file on disk, allowing the download to resume if the program terminates and is run again.

func (*File) Seek

func (f *File) Seek(offset int64, whence int) (int64, error)

Seek sets the position for the next Read operation. If io.SeekEnd is used before the download has started, a HEAD request is made to obtain the file's size.

func (*File) SetSize added in v0.0.13

func (f *File) SetSize(size int64)

SetSize manually sets the file size for incomplete files. This is useful when the file size is known in advance but the server doesn't provide a Content-Length header. Has no effect on complete files.

func (*File) Stat added in v0.0.13

func (f *File) Stat() (os.FileInfo, error)

Stat returns information about the underlying local file, after checking that its size matches that of the remote file.
