Documentation ¶
Overview ¶
Package tempfile provides an abstraction for creating virtual temporary files that are mapped to sections of a single physical file on disk. This design minimizes file descriptor usage while supporting efficient sequential writes and concurrent reads.
The package supports two main workflows:
- Write data sequentially to multiple virtual files using FileWriter
- Read data back from any virtual file section using TempReader
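The idea of mapping several virtual files onto one physical file can be sketched with the standard library alone. The following is a minimal illustration, not the package's implementation: the section type and the helpers writeSections and roundTrip are hypothetical names. Each payload is appended through one buffered writer, its offset and length are recorded, and io.NewSectionReader exposes each recorded range as its own reader.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
)

// section records where one virtual file lives inside the physical file.
// Hypothetical type: the package keeps its own bookkeeping unexported.
type section struct {
	off, size int64
}

// writeSections appends each payload to f through a single buffered writer
// and returns the recorded section boundaries.
func writeSections(f *os.File, payloads []string) ([]section, error) {
	w := bufio.NewWriter(f)
	var secs []section
	var off int64
	for _, p := range payloads {
		n, err := w.WriteString(p)
		if err != nil {
			return nil, err
		}
		secs = append(secs, section{off: off, size: int64(n)})
		off += int64(n)
	}
	return secs, w.Flush()
}

// roundTrip stores payloads as sections of one temporary file and reads
// every section back through its own io.SectionReader.
func roundTrip(payloads []string) ([]string, error) {
	f, err := os.CreateTemp("", "sections-*")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	secs, err := writeSections(f, payloads)
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(secs))
	for _, s := range secs {
		b, err := io.ReadAll(io.NewSectionReader(f, s.off, s.size))
		if err != nil {
			return nil, err
		}
		out = append(out, string(b))
	}
	return out, nil
}

func main() {
	got, err := roundTrip([]string{"alpha\n", "beta\n", "gamma\n"})
	if err != nil {
		panic(err)
	}
	for i, text := range got {
		fmt.Printf("section %d: %q\n", i, text)
	}
}
```

Only one file descriptor is open regardless of how many sections are written, which is the property the overview describes.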
Temporary Directory Selection: When no directory is specified, the package prefers disk-backed locations over potentially memory-backed filesystems (such as tmpfs on Linux). This helps prevent out-of-memory failures when sorting datasets larger than available RAM. On Unix-like systems, /var/tmp is preferred over /tmp when available, because /tmp may be mounted as tmpfs.
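A preference order like the one described above could be approximated as follows. This is a rough sketch under stated assumptions: usableDir and pickTempDir are hypothetical helpers, and the code only checks that a candidate exists and accepts a new file. It does not detect whether a filesystem is actually memory-backed, which the package may do differently.

```go
package main

import (
	"fmt"
	"os"
)

// usableDir reports whether path is an existing directory in which a
// temporary file can actually be created.
func usableDir(path string) bool {
	info, err := os.Stat(path)
	if err != nil || !info.IsDir() {
		return false
	}
	f, err := os.CreateTemp(path, "probe-*")
	if err != nil {
		return false
	}
	f.Close()
	os.Remove(f.Name())
	return true
}

// pickTempDir returns the first usable candidate and falls back to
// os.TempDir() when none qualifies.
func pickTempDir(candidates ...string) string {
	for _, c := range candidates {
		if c != "" && usableDir(c) {
			return c
		}
	}
	return os.TempDir()
}

func main() {
	// Try the location that is usually disk-backed before the default.
	fmt.Println(pickTempDir("/var/tmp"))
}
```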
The implementation handles cross-platform differences in file cleanup behavior, with automatic cleanup on Unix systems and explicit cleanup on Windows.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func GetTempDir ¶ added in v1.4.1
GetTempDir returns the temporary directory to use for the given preference. If dir is non-empty, it is validated and returned if usable. Otherwise, GetTempDir returns a pre-computed directory chosen according to preferDiskBacked. The function is safe for concurrent use and performs O(1) lookups after initialization.
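One common way to get a concurrency-safe, constant-time lookup after a one-time computation is sync.Once. The sketch below assumes that pattern; it is not taken from the package source. The function cachedTempDir, its signature, and the package-level variables are hypothetical and stand in for whatever GetTempDir actually does.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

var (
	once       sync.Once
	diskDir    string // cached choice when disk-backed storage is preferred
	defaultDir string // cached choice otherwise
)

// initDirs computes both answers once. The /var/tmp check is a stand-in
// for whatever selection logic a real implementation uses.
func initDirs() {
	defaultDir = os.TempDir()
	diskDir = defaultDir
	if info, err := os.Stat("/var/tmp"); err == nil && info.IsDir() {
		diskDir = "/var/tmp"
	}
}

// cachedTempDir returns dir when it names an existing directory; otherwise
// it returns one of the two cached directories.
func cachedTempDir(dir string, preferDiskBacked bool) string {
	if dir != "" {
		if info, err := os.Stat(dir); err == nil && info.IsDir() {
			return dir
		}
	}
	once.Do(initDirs) // runs initDirs at most once, even across goroutines
	if preferDiskBacked {
		return diskDir
	}
	return defaultDir
}

func main() {
	fmt.Println(cachedTempDir("", true))
	fmt.Println(cachedTempDir("", false))
}
```

After the first call, each lookup is a branch and a variable read, so repeated calls from many goroutines stay cheap.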
Types ¶
type FileWriter ¶
type FileWriter struct {
// contains filtered or unexported fields
}
FileWriter provides sequential writing to virtual temporary file sections. Each "virtual file" corresponds to a section of the underlying physical file, allowing multiple logical files to share a single file descriptor and reduce system resource usage during external sorting operations.
func New ¶
func New(dir string, preferDiskBacked bool) (*FileWriter, error)
New creates a new FileWriter for virtual temporary files in the specified directory. If dir is empty, New selects a directory automatically, preferring disk-backed locations over potentially memory-backed filesystems (controlled by preferDiskBacked). On Unix systems, New attempts automatic cleanup by unlinking the file immediately after creating it; Windows requires explicit cleanup, which happens when the FileWriter is closed.
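The unlink-immediately technique mentioned above can be illustrated with the standard library. This is a sketch of the general idea rather than the package's code, and unlinkedScratch is a hypothetical helper. On Unix-like systems a file whose directory entry has been removed stays readable and writable through its open descriptor, and the kernel frees the space when the descriptor is closed. Where the early removal fails, as it typically does on Windows, the sketch falls back to removing the file after closing it.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// unlinkedScratch writes payload to a temporary file whose directory entry
// is removed up front (when the platform allows it) and reads it back
// through the same descriptor.
func unlinkedScratch(payload string) (string, error) {
	f, err := os.CreateTemp("", "unlinked-*")
	if err != nil {
		return "", err
	}
	// If the early Remove fails, remember to delete the file after Close.
	removeOnClose := os.Remove(f.Name()) != nil
	defer func() {
		f.Close()
		if removeOnClose {
			os.Remove(f.Name())
		}
	}()

	if _, err := f.WriteString(payload); err != nil {
		return "", err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return "", err
	}
	b, err := io.ReadAll(f)
	return string(b), err
}

func main() {
	got, err := unlinkedScratch("still reachable\n")
	if err != nil {
		panic(err)
	}
	fmt.Print(got)
}
```

A benefit of unlinking early is that the file disappears even if the process crashes before it can clean up.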
func (*FileWriter) Close ¶
func (w *FileWriter) Close() error
Close terminates the FileWriter, flushes any buffered data, closes the underlying file, and removes it from disk if manual cleanup is required. This operation is irreversible and should only be called when abandoning the temporary file (e.g., on error). Use Save() instead to transition from writing to reading.
func (*FileWriter) Name ¶
func (w *FileWriter) Name() string
Name returns the full filesystem path of the underlying physical temporary file. This is primarily useful for debugging and logging purposes.
func (*FileWriter) Next ¶
func (w *FileWriter) Next() (int64, error)
Next finalizes the current virtual file section and prepares for writing the next section. It flushes buffered data and records the section boundary for later reading. Returns the file offset where the next section will begin.
func (*FileWriter) Save ¶
func (w *FileWriter) Save() (TempReader, error)
Save finalizes all virtual file sections and returns a TempReader for accessing the data. After calling Save(), the FileWriter can no longer be used for writing. The returned TempReader allows concurrent access to any virtual file section.
func (*FileWriter) Size ¶
func (w *FileWriter) Size() int
Size returns the total number of virtual file sections created. This includes the current section being written plus all completed sections.
func (*FileWriter) Write ¶
func (w *FileWriter) Write(p []byte) (int, error)
Write appends data to the current virtual file section. Data is buffered for efficiency and will be flushed when Next() or Save() is called.
func (*FileWriter) WriteString ¶
func (w *FileWriter) WriteString(s string) (int, error)
WriteString appends a string to the current virtual file section. This is more efficient than Write() for string data as it avoids byte slice conversion.
type MockFileWriter ¶
type MockFileWriter struct {
// contains filtered or unexported fields
}
MockFileWriter provides an in-memory implementation of the TempWriter interface. It stores all data in memory using bytes.Buffer instead of writing to disk files. This is useful for testing and benchmarking without filesystem I/O overhead.
func Mock ¶
func Mock(n int) *MockFileWriter
Mock creates a new in-memory TempWriter with the specified initial capacity. The parameter n sets the initial capacity of the underlying buffer to reduce memory reallocations during writing. Use this for testing and benchmarking scenarios where disk I/O should be avoided.
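To make the in-memory approach concrete, here is a minimal sketch of how section boundaries might be tracked on top of a bytes.Buffer. It is an assumption about the general shape, not the MockFileWriter source: memSections and sectionText are hypothetical names, and only a subset of the TempWriter and TempReader methods is mirrored.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// memSections is a hypothetical in-memory section store.
type memSections struct {
	buf  bytes.Buffer
	ends []int // end offset of every completed section
}

// WriteString appends to the section currently being written.
func (m *memSections) WriteString(s string) (int, error) {
	return m.buf.WriteString(s)
}

// Next closes the current section and returns the offset at which the
// next section begins.
func (m *memSections) Next() (int64, error) {
	m.ends = append(m.ends, m.buf.Len())
	return int64(m.buf.Len()), nil
}

// Read returns a buffered reader over completed section i.
func (m *memSections) Read(i int) *bufio.Reader {
	start := 0
	if i > 0 {
		start = m.ends[i-1]
	}
	return bufio.NewReader(bytes.NewReader(m.buf.Bytes()[start:m.ends[i]]))
}

// sectionText drains section i into a string.
func sectionText(m *memSections, i int) string {
	b, _ := io.ReadAll(m.Read(i))
	return string(b)
}

func main() {
	m := &memSections{}
	m.WriteString("a\nb\n")
	m.Next()
	m.WriteString("c\n")
	m.Next()
	fmt.Printf("%q %q\n", sectionText(m, 0), sectionText(m, 1))
}
```

Because nothing touches the filesystem, a store like this keeps tests and benchmarks focused on the calling code rather than on disk latency.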
func (*MockFileWriter) Close ¶
func (w *MockFileWriter) Close() error
Close terminates the MockFileWriter and releases all memory. This operation is irreversible and prevents transitioning to read mode. Use Save() instead to transition from writing to reading.
func (*MockFileWriter) Next ¶
func (w *MockFileWriter) Next() (int64, error)
Next finalizes the current virtual file section and prepares for writing the next section. It records the section boundary for later reading and returns the offset where the next section begins.
func (*MockFileWriter) Save ¶
func (w *MockFileWriter) Save() (TempReader, error)
Save finalizes all virtual file sections and returns a TempReader for accessing the data. After calling Save(), the MockFileWriter can no longer be used for writing. The returned TempReader allows concurrent access to any virtual file section.
func (*MockFileWriter) Size ¶
func (w *MockFileWriter) Size() int
Size returns the total number of virtual file sections that have been created. This includes the current section being written plus all completed sections.
func (*MockFileWriter) Write ¶
func (w *MockFileWriter) Write(p []byte) (int, error)
Write appends data to the current virtual file section in memory.
func (*MockFileWriter) WriteString ¶
func (w *MockFileWriter) WriteString(s string) (int, error)
WriteString appends string data to the current virtual file section in memory.
type TempReader ¶
type TempReader interface {
// Close terminates the reader and cleans up resources.
// This should be called after all reading operations are complete.
io.Closer
// Size returns the total number of virtual file sections available for reading.
Size() int
// Read returns a buffered reader for the specified virtual file section.
// The section index i must be in the range [0, Size()-1].
// Each call may return a new reader instance positioned at the section start.
Read(i int) *bufio.Reader
}
TempReader defines the interface for reading from virtual temporary file sections. It provides concurrent access to any section created by the corresponding TempWriter. Multiple readers can access different sections simultaneously for efficient merging.
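Concurrent access to different sections of one backing store works naturally with positional reads, since io.ReaderAt permits parallel ReadAt calls on the same source. The sketch below illustrates that idea with one goroutine per section. It uses a strings.Reader as the backing store to stay self-contained; an *os.File would work the same way because io.NewSectionReader only needs an io.ReaderAt. The function sectionLineCounts and the bounds layout are hypothetical and not part of the package.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"sync"
)

// sectionLineCounts counts the lines of every section concurrently.
// bounds holds the start offset of each section followed by the total
// length, so section i spans [bounds[i], bounds[i+1]).
func sectionLineCounts(data string, bounds []int64) []int {
	if len(bounds) < 2 {
		return nil
	}
	src := strings.NewReader(data) // shared backing store
	counts := make([]int, len(bounds)-1)
	var wg sync.WaitGroup
	for i := range counts {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Each goroutine gets an independent reader over its own range.
			sr := io.NewSectionReader(src, bounds[i], bounds[i+1]-bounds[i])
			sc := bufio.NewScanner(sr)
			for sc.Scan() {
				counts[i]++ // each goroutine writes only its own slot
			}
		}(i)
	}
	wg.Wait()
	return counts
}

func main() {
	data := "a\nb\nc\nd\ne\nf\n"
	fmt.Println(sectionLineCounts(data, []int64{0, 4, 6, 12})) // prints [2 1 3]
}
```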
type TempWriter ¶
type TempWriter interface {
// Close terminates the writer and cleans up resources.
// This is irreversible and prevents transitioning to read mode.
io.Closer
// Size returns the number of virtual file sections created.
Size() int
// Write appends data to the current virtual file section.
Write(p []byte) (int, error)
// WriteString appends string data to the current virtual file section.
WriteString(s string) (int, error)
// Next finalizes the current section and prepares for the next one.
// Returns the offset where the next section will begin.
Next() (int64, error)
// Save finalizes all sections and returns a TempReader for data access.
// After calling Save(), the TempWriter cannot be used for further writing.
Save() (TempReader, error)
}
TempWriter defines the interface for sequential writing to virtual temporary file sections. It provides methods for writing data, managing section boundaries, and transitioning to read mode. Implementations handle the underlying storage mechanism (disk or memory).