gopilot

package module

Version: v0.0.1-rc9 (latest) | Published: Mar 8, 2026 | License: MIT | Imports: 23 | Imported by: 0

Repository

github.com/falmar/gopilot

README ¶

gopilot

A lightweight approach to Chromium automation using basic CDP commands.

NOTE: Breaking changes may occur until the API is finalized.

Overview

gopilot is my attempt to provide a simple, minimalistic API for automating Chromium browsers. It's not meant to be another Puppeteer. Instead, it's focused on the essential features most users need for straightforward browser tasks—no fluff, just what you need.

Under the hood, gopilot uses github.com/mafredri/cdp for Chrome communication. That library is inspired by gRPC and provides a really nice and easy API.

Why Minimalistic?

I wanted to simplify browser automation by sticking to the core functionalities that most of us use:

Navigation to web pages
Clicking on elements
Typing text
Taking screenshots
Extracting HTML content

I’ve also added some features for intercepting requests, which is handy if you want to cancel or grab AJAX info. Overall, gopilot aims to be a lightweight tool that doesn’t bog you down with unnecessary complexity.

Key Features

Headful mode support: Designed to run headful and compatible with Docker using Xvfb for display.
Headless mode: Easily switch to headless operation when needed.
Navigation: Navigate to a specified URL.
Element search: Find and/or wait for elements.
Click on elements
Get and set HTML content
Intercept network requests and responses, for those who want to dig deeper
Set, get, and clear cookies and local storage
Screenshots: Capture the current page's viewport, the full page, or an element within its bounding box.
Text typing: Just provide the text to be written; a fixed delay or a function can be supplied to control per-keystroke delays.

Installation

Prerequisites

Go 1.24.0 or later
Chrome or Chromium browser installed on your system

Installing gopilot

To install gopilot, use the standard Go package installation command:

go get github.com/falmar/gopilot

Import it in your Go code:

import "github.com/falmar/gopilot"

Quick Start

Here's a very basic example of how to use gopilot to open a URL:

package main

import (
	"context"
	"log/slog"
	"os"
	"os/signal"
	"time"

	"github.com/falmar/gopilot"
)

func main() {
	// os.Kill (SIGKILL) cannot be trapped, so only os.Interrupt is registered.
	ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt)
	defer cancel()

	logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
		Level: slog.LevelDebug,
	}))

	cfg := gopilot.NewBrowserConfig()
	b := gopilot.NewBrowser(cfg, logger)

	err := b.Open(ctx, &gopilot.BrowserOpenInput{})
	if err != nil {
		logger.Error("unable to open browser", "error", err)
		return
	}
	defer b.Close(ctx)

	pOut, err := b.NewPage(ctx, &gopilot.BrowserNewPageInput{})
	if err != nil {
		logger.Error("unable to open page", "error", err)
		return
	}
	page := pOut.Page
	defer page.Close(ctx)

	_, err = page.Navigate(ctx, &gopilot.PageNavigateInput{
		URL:                "https://www.google.com",
		WaitDomContentLoad: true,
	})
	if err != nil {
		logger.Error("unable to navigate", "error", err)
		return
	}

	time.Sleep(2 * time.Second)

	// do some magic ...
}

Examples

For more practical examples of how to use gopilot, check out the examples provided:

Click Element - Demonstrates how to find and click on elements in a web page
Cookies - Shows how to set, get, and clear cookies
Evaluate JS - Examples of executing JavaScript in the browser context
External Browser - Shows how to connect to an existing Chrome instance instead of launching a new one
Local Storage - Shows how to interact with browser local storage
Open Chrome - Basic example of launching a Chrome browser
Open URL - Simple example of navigating to a URL
Screenshots - Shows how to capture screenshots of pages or elements
Search - Demonstrates how to search for elements on a page
Typing - Examples of typing text into input fields
Request Modifier - Demonstrates how to modify outgoing requests and provide custom responses
Listen XHR - Demonstrates how to intercept and monitor XHR requests

Advanced Usage

Headless Mode

By default, gopilot runs in headful mode, which may require a display server when running in a Docker container. To switch to headless mode, call the EnableHeadless method on the BrowserConfig object:

// EnableHeadless will make the browser start as headless
cfg := gopilot.NewBrowserConfig()
cfg.EnableHeadless()

Connecting to External Browsers

gopilot can connect to existing Chrome/Chromium instances instead of launching a new process. This is useful for debugging, reusing browsers across multiple runs, or working with browsers that have specific profiles or extensions loaded.

Start Chrome with remote debugging:

chromium --remote-debugging-port=9222

Connect gopilot to the external browser:

cfg := gopilot.NewBrowserConfig()
cfg.ConnectionURL = "http://127.0.0.1:9222"

b := gopilot.NewBrowser(cfg, logger)
err := b.Open(ctx, &gopilot.BrowserOpenInput{})
if err != nil {
    // Connection failed - browser may not be running
    return
}
defer b.Close(ctx) // Closes pages but does NOT kill the browser

Session-Based Page Tracking: gopilot only manages pages it creates (via NewTab: true). This means:

Close() only closes pages created by this gopilot instance (session pages)
User tabs and pages from other gopilot instances are preserved
Multiple gopilot instances can safely share the same browser without conflicts

GetPages() vs GetAllPages():

GetPages() - Returns only pages created by this instance (closeable)
GetAllPages() - Returns ALL pages in the browser for inspection (calling Close() on these is a no-op)

See the External Browser example for a complete demonstration.
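
Before pointing ConnectionURL at a running browser, it can help to confirm that the debugging endpoint answers. Chrome's remote debugging port serves a `/json/version` HTTP endpoint whose JSON body includes a `webSocketDebuggerUrl` field. The sketch below does not use gopilot; the helper names `parseDebuggerURL` and `fetchDebuggerURL` are illustrative assumptions, not part of the library.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"time"
)

// versionInfo mirrors the part of Chrome's /json/version response used here.
type versionInfo struct {
	WebSocketDebuggerURL string `json:"webSocketDebuggerUrl"`
}

// parseDebuggerURL extracts the browser-level WebSocket URL from a
// /json/version response body.
func parseDebuggerURL(body []byte) (string, error) {
	var v versionInfo
	if err := json.Unmarshal(body, &v); err != nil {
		return "", fmt.Errorf("decode /json/version: %w", err)
	}
	if v.WebSocketDebuggerURL == "" {
		return "", errors.New("response has no webSocketDebuggerUrl")
	}
	return v.WebSocketDebuggerURL, nil
}

// fetchDebuggerURL queries a running browser, e.g. base = "http://127.0.0.1:9222".
func fetchDebuggerURL(base string) (string, error) {
	client := &http.Client{Timeout: 2 * time.Second}
	resp, err := client.Get(base + "/json/version")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return parseDebuggerURL(body)
}

func main() {
	wsURL, err := fetchDebuggerURL("http://127.0.0.1:9222")
	if err != nil {
		fmt.Println("browser not reachable:", err)
		return
	}
	fmt.Println("debugger URL:", wsURL)
}
```

Either the HTTP base URL or the returned WebSocket URL should be usable as ConnectionURL, since the field is documented to accept both forms.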

Configuration Options

gopilot provides several configuration options to customize browser behavior:

Browser Configuration

The BrowserConfig struct allows you to configure how the browser is launched:

type BrowserConfig struct {
    // Path specifies the path to the browser executable
    Path string

    // DebugPort specifies the port for debugging connections
    DebugPort string

    // Args contains additional command-line arguments
    Args []string

    // Envs holds environment variables for the browser process
    Envs []string

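    // OpenTimeout defines how long to wait for Chrome to print the
    // "DevTools listening on" message during startup. If nil, defaults to 5s.
    OpenTimeout *time.Duration

    // CloseTimeout defines how long to wait for the Chrome process to
    // terminate during shutdown. If nil, defaults to 5s.
    CloseTimeout *time.Duration

    // ConnectionURL specifies the URL of an existing Chrome/Chromium browser
    // to connect to instead of launching a new process.
    ConnectionURL string
}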

Default Configuration

When you call gopilot.NewBrowserConfig(), it creates a configuration with these defaults:

Browser Path: Uses the Chrome executable specified by the GOPILOT_CHROME_EXECUTABLE environment variable, or defaults to "chromium"
Debug Port: "9222"
Default Arguments: Several arguments for optimal browser operation:
- --remote-allow-origins=*
- --no-first-run
- --no-service-autorun
- --no-default-browser-check
- --homepage=about:blank
- And several others for stability and performance

Environment Variables

GOPILOT_CHROME_EXECUTABLE: Set this to specify the path to your Chrome or Chromium executable. For example:
```
export GOPILOT_CHROME_EXECUTABLE="/usr/bin/google-chrome"
```
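
To check which executable would be picked up, the documented rule (environment variable first, "chromium" otherwise) can be reproduced outside the library. This is a sketch of that rule, not gopilot's actual code; `resolveChromePath` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// resolveChromePath returns the executable named by GOPILOT_CHROME_EXECUTABLE,
// falling back to "chromium" when the variable is unset or empty.
// getenv is injected so the lookup can be exercised without touching the
// real environment.
func resolveChromePath(getenv func(string) string) string {
	if p := getenv("GOPILOT_CHROME_EXECUTABLE"); p != "" {
		return p
	}
	return "chromium"
}

func main() {
	path := resolveChromePath(os.Getenv)

	// exec.LookPath reports whether the name resolves to an executable,
	// either as a direct path or through PATH.
	full, err := exec.LookPath(path)
	if err != nil {
		fmt.Printf("%q not found: %v\n", path, err)
		return
	}
	fmt.Println("browser executable:", full)
}
```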

Adding Custom Arguments

You can add custom command-line arguments to the browser:

cfg := gopilot.NewBrowserConfig()
cfg.AddArgument("--disable-gpu")
cfg.AddArgument("--window-size=1280,720")

Project Status & Roadmap

gopilot is currently in active development ("WIP" - Work In Progress). While the core functionality is stable enough for many use cases, the API may change as we refine and improve the library.

Current Status

Core browser automation features are implemented and working
API is functional but may undergo refinements
Documentation and examples are being expanded

Planned Features

Listen for page/target events to change local data
Integration tests
Performance optimizations
Additional helper methods for common tasks

Development Priorities

API stabilization
Improved error handling and recovery
Enhanced documentation
Performance improvements

Contributions

Contributions are welcome! If you've got a feature request or an idea to share, reach out.

Documentation ¶

Overview ¶

Package gopilot provides a simple and minimalistic API for automating Chromium browsers.

gopilot is a lightweight alternative to complex browser automation tools, focusing on essential functionality using the Chrome DevTools Protocol (CDP). It's structured around three main components: Browser (manages instances), Page (represents tabs), and Element (represents DOM elements).

Key features include navigation, DOM manipulation, element interaction, screenshots, and network request monitoring. The package supports both headful (default) and headless modes, and can be configured via environment variables like GOPILOT_CHROME_EXECUTABLE.

Common use cases include web scraping, UI testing, form automation, and taking screenshots.

For examples and detailed usage, see: https://github.com/falmar/gopilot/tree/main/examples

Index ¶

Variables
type BoundingRect
type Browser
- func NewBrowser(cfg *BrowserConfig, logger *slog.Logger) Browser
type BrowserConfig
- func NewBrowserConfig() *BrowserConfig
- func (c *BrowserConfig) AddArgument(arg string)
- func (c *BrowserConfig) EnableHeadless()
type BrowserGetPagesInput
type BrowserGetPagesOutput
type BrowserNewPageInput
type BrowserNewPageOutput
type BrowserOpenInput
type ClearLocalStorageInput
type ClearLocalStorageOutput
type DispatchEventType
type Element
- func NewElement(client *cdp.Client, node dom.Node) Element
type ElementClickInput
type ElementClickOutput
type ElementDOM
type ElementInput
type ElementScrollIntoViewInput
type ElementScrollIntoViewOutput
type ElementTakeScreenshotInput
type ElementTakeScreenshotOutput
type GetCookiesInput
type GetCookiesOutput
type GetLocalStorageInput
type GetLocalStorageOutput
type InterceptRequestCallback
type InterceptRequestHandle
type InterceptResponseCallback
type InterceptResponseHandle
type LocalStorageItem
type Page
type PageCookie
type PageDOM
type PageEvaluateInput
type PageEvaluateOutput
type PageFetch
type PageInput
type PageNavigateInput
type PageNavigateOutput
type PageNavigation
type PageQuerySelectorInput
type PageQuerySelectorOutput
type PageReloadInput
type PageReloadOutput
type PageSearchInput
type PageSearchOutput
type PageStorage
type PageTakeScreenshotInput
type PageTakeScreenshotOutput
type PageTypeTextInput
type PageTypeTextOutput
type SetCookiesInput
type SetCookiesOutput
type SetLocalStorageInput
type SetLocalStorageOutput
type TypeDelayFunc
type XHREvent
type XHRMonitor
- func NewXHRMonitor(p Page) XHRMonitor

Constants ¶

This section is empty.

Variables ¶

var (
	ErrElementNotFound      = errors.New("page search: element not found")
	ErrElementSearchTimeout = errors.New("page search: timeout")
)
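
Callers will usually want to tell these two errors apart. The sketch below uses local stand-ins with the same messages so it compiles on its own; with gopilot imported, the comparison targets would be `gopilot.ErrElementNotFound` and `gopilot.ErrElementSearchTimeout`. Whether the library wraps these errors before returning them is not stated here, so `errors.Is` is used because it matches in both cases. The helper names are illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the two exported sentinel errors.
var (
	errElementNotFound      = errors.New("page search: element not found")
	errElementSearchTimeout = errors.New("page search: timeout")
)

// withContext wraps err with a message while keeping it matchable by errors.Is.
func withContext(msg string, err error) error {
	return fmt.Errorf("%s: %w", msg, err)
}

// describeSearchError maps a search error to a short label.
func describeSearchError(err error) string {
	switch {
	case err == nil:
		return "found"
	case errors.Is(err, errElementNotFound):
		return "missing"
	case errors.Is(err, errElementSearchTimeout):
		return "timed out"
	default:
		return "failed"
	}
}

func main() {
	err := withContext("login form", errElementSearchTimeout)
	fmt.Println(describeSearchError(err)) // prints "timed out"
}
```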

Functions ¶

This section is empty.

Types ¶

type BoundingRect ¶

type BoundingRect struct {
	// Top is the distance from the top of the viewport to the top of the element.
	Top float64 `json:"top"`
	// Left is the distance from the left of the viewport to the left of the element.
	Left float64 `json:"left"`
	// Bottom is the distance from the top of the viewport to the bottom of the element.
	Bottom float64 `json:"bottom"`
	// Right is the distance from the left of the viewport to the right of the element.
	Right float64 `json:"right"`
	// X is the horizontal coordinate of the element.
	X float64 `json:"x"`
	// Y is the vertical coordinate of the element.
	Y float64 `json:"y"`
	// Width is the width of the element.
	Width float64 `json:"width"`
	// Height is the height of the element.
	Height float64 `json:"height"`
	// CenterX is the centered position on x-axis
	CenterX float64
	// CenterY is the centered position on y-axis
	CenterY float64
}

BoundingRect represents the bounding box of an element on the page. It contains the coordinates of the edges and dimensions of the element.
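
CenterX and CenterY are the natural click target for an element. The sketch below shows the usual midpoint arithmetic on a local stand-in struct; that gopilot derives the two fields exactly this way is an assumption.

```go
package main

import "fmt"

// rect is a local stand-in carrying the BoundingRect fields used below.
type rect struct {
	X, Y, Width, Height float64
}

// center returns the midpoint of the rectangle.
// Assumption: this matches how CenterX and CenterY are derived.
func center(r rect) (cx, cy float64) {
	return r.X + r.Width/2, r.Y + r.Height/2
}

func main() {
	cx, cy := center(rect{X: 100, Y: 40, Width: 80, Height: 20})
	fmt.Println(cx, cy) // prints "140 50"
}
```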

type Browser ¶

type Browser interface {
	// Open initiates a new browser session.
	// It takes a context and BrowserOpenInput as parameters.
	// Returns an error if the browser fails to start.
	Open(ctx context.Context, in *BrowserOpenInput) error

	// NewPage creates a new page or tab in the browser.
	// Accepts context and BrowserNewPageInput to specify creation parameters.
	// Returns a BrowserNewPageOutput containing the newly created page
	// or an error if the page cannot be created.
	NewPage(ctx context.Context, in *BrowserNewPageInput) (*BrowserNewPageOutput, error)

	// GetPages retrieves only pages created by this session (tracked pages).
	// These are pages created with NewTab: true. Calling Close() on these pages will close them.
	// Returns a BrowserGetPagesOutput with a list of session pages or an error if retrieving fails.
	GetPages(ctx context.Context, in *BrowserGetPagesInput) (*BrowserGetPagesOutput, error)

	// GetAllPages retrieves ALL pages in the browser, including non-session pages.
	// Pages returned are NOT session-tracked, and calling Close() on them is a no-op.
	// Use this for inspection/debugging. For pages created by this session, use GetPages().
	GetAllPages(ctx context.Context, in *BrowserGetPagesInput) (*BrowserGetPagesOutput, error)

	// Close shuts down the browser instance and cleans up any resources.
	// Only closes session pages (pages created by this instance with NewTab: true).
	// For external browsers, closes session pages but leaves the browser running.
	Close(ctx context.Context) error

	// GetDevToolClient retrieves the DevTools client associated with the browser.
	// This client allows for advanced interactions with the browser's DevTools protocol,
	// enabling custom actions and low-level debugging or profiling features.
	GetDevToolClient() *devtool.DevTools
}

Browser defines a contract for browser operations. It allows managing browser instances and interacting with web pages.

func NewBrowser ¶

func NewBrowser(cfg *BrowserConfig, logger *slog.Logger) Browser

NewBrowser creates a new browser instance with the given configuration and logger.

type BrowserConfig ¶

type BrowserConfig struct {
	// Path specifies the path to the browser executable.
	Path string

	// DebugPort specifies the port for debugging connections.
	DebugPort string

	// Args contains additional command-line arguments to pass when launching the browser.
	Args []string

	// Envs holds any environment variables to set for the browser process.
	Envs []string

	// OpenTimeout defines how long to wait for Chrome to print the "DevTools listening on" message during startup.
	// If nil, a default of 5 seconds is used. Increase this if your environment starts Chrome slowly.
	OpenTimeout *time.Duration

	// CloseTimeout defines how long to wait for the Chrome process to terminate during shutdown.
	// If nil, a default of 5 seconds is used. Increase this if your environment needs more time to exit cleanly.
	CloseTimeout *time.Duration

	// ConnectionURL specifies the URL of an existing Chrome/Chromium browser to connect to.
	// When set, gopilot will connect to the existing browser instead of launching a new process.
	// Supports both WebSocket URLs (ws://127.0.0.1:9222/devtools/browser/UUID) and HTTP (http://127.0.0.1:9222).
	// The external browser will NOT be closed when Browser.Close() is called.
	//
	// Example:
	//   cfg := gopilot.NewBrowserConfig()
	//   cfg.ConnectionURL = "http://127.0.0.1:9222"
	ConnectionURL string
}

BrowserConfig holds configuration settings for launching a browser instance.
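
OpenTimeout and CloseTimeout are pointers so that nil can mean "use the default". A small helper makes them easier to set. The sketch below is self-contained; `durationPtr` and `effectiveTimeout` are hypothetical names, and the second function only restates the documented 5 second default.

```go
package main

import (
	"fmt"
	"time"
)

// durationPtr returns a pointer to d, convenient for optional
// *time.Duration fields such as OpenTimeout and CloseTimeout.
func durationPtr(d time.Duration) *time.Duration {
	return &d
}

// effectiveTimeout applies the documented rule: nil means 5 seconds.
func effectiveTimeout(p *time.Duration) time.Duration {
	if p == nil {
		return 5 * time.Second
	}
	return *p
}

func main() {
	fmt.Println(effectiveTimeout(nil))                           // prints "5s"
	fmt.Println(effectiveTimeout(durationPtr(20 * time.Second))) // prints "20s"
}
```

With gopilot imported, the same helper would be used as `cfg.OpenTimeout = durationPtr(20 * time.Second)`.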

func NewBrowserConfig ¶

func NewBrowserConfig() *BrowserConfig

NewBrowserConfig creates a new BrowserConfig with default settings. The default Path is taken from the GOPILOT_CHROME_EXECUTABLE environment variable when it is set and is "chromium" otherwise; the default DebugPort is "9222". It also includes several default command-line arguments for browser startup.

func (*BrowserConfig) AddArgument ¶

func (c *BrowserConfig) AddArgument(arg string)

AddArgument appends an additional command-line argument to the browser configuration. This allows users to customize the launch options for the browser instance.

func (*BrowserConfig) EnableHeadless ¶

func (c *BrowserConfig) EnableHeadless()

EnableHeadless makes the browser start in headless mode.

type BrowserGetPagesInput ¶

type BrowserGetPagesInput struct{}

BrowserGetPagesInput represents parameters to obtain open pages.

type BrowserGetPagesOutput ¶

type BrowserGetPagesOutput struct {
	Pages []Page
}

BrowserGetPagesOutput contains the list of open browser pages.

type BrowserNewPageInput ¶

type BrowserNewPageInput struct {
	NewTab bool
	URL    string
}

BrowserNewPageInput contains parameters for creating a new page.

type BrowserNewPageOutput ¶

type BrowserNewPageOutput struct {
	Page Page
}

BrowserNewPageOutput contains the result of creating a new page.

type BrowserOpenInput ¶

type BrowserOpenInput struct{}

BrowserOpenInput contains parameters required to open a browser.

type ClearLocalStorageInput ¶

type ClearLocalStorageInput struct{}

type ClearLocalStorageOutput ¶

type ClearLocalStorageOutput struct{}

type DispatchEventType ¶

type DispatchEventType string

const (
	DispatchEventTypeKeyDown    DispatchEventType = "keyDown"
	DispatchEventTypeKeyUp      DispatchEventType = "keyUp"
	DispatchEventTypeRawKeyDown DispatchEventType = "rawKeyDown"
	DispatchEventTypeChar       DispatchEventType = "char"
)

type Element ¶

type Element interface {
	ElementInput
	ElementDOM

	// TakeScreenshot captures a screenshot of the element.
	// It uses the element's position and size to define the capture area.
	// Input parameters can specify the format of the image.
	// Returns the screenshot data as bytes or an error if the capture fails.
	TakeScreenshot(ctx context.Context, in *ElementTakeScreenshotInput) (*ElementTakeScreenshotOutput, error)

	// GetNodeID gives the current node of the element
	GetNodeID(ctx context.Context) dom.NodeID
}

Element represents an interactive element in a web page.

func NewElement ¶

func NewElement(client *cdp.Client, node dom.Node) Element

NewElement creates a new Element instance. It takes a CDP client and a DOM node as parameters. Returns a new Element implementation.

type ElementClickInput ¶

type ElementClickInput struct {
	StepDuration time.Duration // Duration for each step of the click action.
	HoldDuration time.Duration // Duration to hold the mouse press before releasing.

	ReturnHoldRelease bool // Return a release function to let user decide when to release mouse press
}

ElementClickInput specifies the input parameters for simulating a click on an element.
- StepDuration: Duration to wait between each step of the click process: moving to the element, mouse press, and mouse release.
- HoldDuration: Duration to wait between mouse press and mouse release. Defaults to StepDuration if not set.
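
The HoldDuration fallback can be written out as a small rule. This sketch uses a local stand-in struct and assumes that "not set" means the zero value; the library's own check may differ.

```go
package main

import (
	"fmt"
	"time"
)

// clickTiming is a local stand-in for the two duration fields of ElementClickInput.
type clickTiming struct {
	StepDuration time.Duration
	HoldDuration time.Duration
}

// holdFor applies the documented default: an unset HoldDuration
// (assumed here to be the zero value) falls back to StepDuration.
func holdFor(in clickTiming) time.Duration {
	if in.HoldDuration == 0 {
		return in.StepDuration
	}
	return in.HoldDuration
}

func main() {
	quick := clickTiming{StepDuration: 100 * time.Millisecond}
	long := clickTiming{StepDuration: 100 * time.Millisecond, HoldDuration: time.Second}
	fmt.Println(holdFor(quick)) // prints "100ms"
	fmt.Println(holdFor(long))  // prints "1s"
}
```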

type ElementClickOutput ¶

type ElementClickOutput struct {
	X float64 `json:"x"` // X coordinate of the click position.
	Y float64 `json:"y"` // Y coordinate of the click position.

	Release func() error
}

ElementClickOutput represents the output of a click action. It provides the X and Y coordinates where the click occurred. Release is the function for releasing the mouse press when ReturnHoldRelease was set on the input.

type ElementDOM ¶

type ElementDOM interface {
	// HTML retrieves the element's outer HTML content
	HTML(ctx context.Context) (string, error)

	// Text retrieves the element's text content.
	Text(ctx context.Context) (string, error)

	// Focus sets focus on the element, allowing it to receive input.
	// Returns an error if the action fails.
	Focus(ctx context.Context) error

	// ScrollIntoView performs an action to scroll the element into the viewport.
	// Accepts an ElementScrollIntoViewInput with scroll parameters.
	// Returns an ElementScrollIntoViewOutput or an error if the action fails.
	ScrollIntoView(ctx context.Context, in *ElementScrollIntoViewInput) (*ElementScrollIntoViewOutput, error)

	// GetRect retrieves the bounding rectangle of the element.
	// Returns a BoundingRect containing the dimensions and position of the element, or an error if retrieval fails.
	GetRect(ctx context.Context) (*BoundingRect, error)

	// Remove the element from the DOM tree
	Remove(ctx context.Context) error
}

type ElementInput ¶

type ElementInput interface {
	// Click simulates a mouse click on the element.
	// Accepts an ElementClickInput containing details for the click action.
	// Returns an ElementClickOutput with the result or an error if the click fails.
	Click(ctx context.Context, in *ElementClickInput) (*ElementClickOutput, error)
}

type ElementScrollIntoViewInput ¶

type ElementScrollIntoViewInput struct{}

ElementScrollIntoViewInput contains parameters for the ScrollIntoView action.

type ElementScrollIntoViewOutput ¶

type ElementScrollIntoViewOutput struct {
	// X is the X coordinate of the scroll-to view position.
	X float64 `json:"x"`

	// Y is the Y coordinate of the scroll-to view position.
	Y float64 `json:"y"`
}

ElementScrollIntoViewOutput contains the result of the ScrollIntoView action.

type ElementTakeScreenshotInput ¶

type ElementTakeScreenshotInput struct {
	// Format specifies the desired image format for the screenshot.
	// Common formats include "png" and "jpeg".
	Format string
}

ElementTakeScreenshotInput specifies input parameters for taking a screenshot of an element.

type ElementTakeScreenshotOutput ¶

type ElementTakeScreenshotOutput struct {
	// Data contains the base64 encoded screenshot image data.
	Data []byte
}

ElementTakeScreenshotOutput represents the output of the TakeScreenshot method for an element.
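
The field comment describes Data as base64 encoded, while the page-level screenshot output only says "image data". If it is unclear which form a given version returns, a defensive decoder can accept both. This is a sketch under that uncertainty, not a statement about what gopilot returns; `isImage` and `imageBytes` are hypothetical helper names.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

var (
	pngMagic  = []byte{0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'}
	jpegMagic = []byte{0xff, 0xd8, 0xff}
)

// isImage reports whether b starts with a PNG or JPEG signature.
func isImage(b []byte) bool {
	return bytes.HasPrefix(b, pngMagic) || bytes.HasPrefix(b, jpegMagic)
}

// imageBytes returns raw image bytes whether data is already binary
// or still base64 encoded.
func imageBytes(data []byte) ([]byte, error) {
	if isImage(data) {
		return data, nil
	}
	decoded, err := base64.StdEncoding.DecodeString(string(data))
	if err != nil {
		return nil, fmt.Errorf("not an image and not base64: %w", err)
	}
	if !isImage(decoded) {
		return nil, fmt.Errorf("decoded data is neither PNG nor JPEG")
	}
	return decoded, nil
}

func main() {
	// "iVBORw0KGgo=" is the base64 form of the 8-byte PNG signature.
	raw, err := imageBytes([]byte("iVBORw0KGgo="))
	fmt.Println(len(raw), err) // prints "8 <nil>"
}
```

The returned bytes can then be written to disk with `os.WriteFile`.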

type GetCookiesInput ¶

type GetCookiesInput struct{}

GetCookiesInput specifies the input for the GetCookies method.

type GetCookiesOutput ¶

type GetCookiesOutput struct {
	Cookies []PageCookie // List of cookies.
}

GetCookiesOutput contains the cookies retrieved from the browser. It returns a list of cookies.

type GetLocalStorageInput ¶

type GetLocalStorageInput struct{}

type GetLocalStorageOutput ¶

type GetLocalStorageOutput struct {
	Items []LocalStorageItem
}

type InterceptRequestCallback ¶

type InterceptRequestCallback func(ctx context.Context, req *fetch.RequestPausedReply, continueArgs *fetch.ContinueRequestArgs) (*fetch.FulfillRequestArgs, error)

InterceptRequestCallback is a function type for request interception. The callback receives details about the paused request and can modify it or provide a custom response. Return values:
- (nil, nil): Continue the request with any modifications made to continueArgs
- (nil, error): Abort the request with the given error
- (*fetch.FulfillRequestArgs, nil): Fulfill the request with a custom response
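
The three return shapes are easiest to see side by side. The sketch below models them with hypothetical stand-in types (`pausedRequest`, `continueArgs`, `fulfillArgs`) instead of the real `fetch` types from mafredri/cdp, so it runs without a browser. Only the return-value convention is taken from the documentation above; the URLs and header name are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical stand-ins for the cdp fetch types in the callback signature.
type pausedRequest struct{ URL string }
type continueArgs struct{ Headers map[string]string }
type fulfillArgs struct {
	Status int
	Body   string
}

var errBlocked = errors.New("blocked by interceptor")

// decide follows the three documented outcomes of InterceptRequestCallback.
func decide(req pausedRequest, cont *continueArgs) (*fulfillArgs, error) {
	switch {
	case strings.HasSuffix(req.URL, ".png"):
		// (nil, error): abort the request.
		return nil, errBlocked
	case strings.Contains(req.URL, "/api/flags"):
		// (*fulfill, nil): answer with a custom response.
		return &fulfillArgs{Status: 200, Body: `{"beta":true}`}, nil
	default:
		// (nil, nil): continue, including edits made to cont.
		if cont.Headers == nil {
			cont.Headers = map[string]string{}
		}
		cont.Headers["X-Debug"] = "1"
		return nil, nil
	}
}

// outcome names the branch taken for a URL.
func outcome(url string) string {
	f, err := decide(pausedRequest{URL: url}, &continueArgs{})
	switch {
	case err != nil:
		return "abort"
	case f != nil:
		return "fulfill"
	default:
		return "continue"
	}
}

func main() {
	for _, u := range []string{
		"https://x.test/logo.png",
		"https://x.test/api/flags",
		"https://x.test/",
	} {
		fmt.Println(u, "->", outcome(u))
	}
}
```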

type InterceptRequestHandle ¶

type InterceptRequestHandle struct {
	// contains filtered or unexported fields
}

InterceptRequestHandle is a handle for managing request interception callbacks.

type InterceptResponseCallback ¶

type InterceptResponseCallback func(ctx context.Context, req *fetch.RequestPausedReply, continueArgs *fetch.ContinueResponseArgs) error

InterceptResponseCallback is a function type for response interception. The callback receives details about the paused response and can modify it. If an error is returned, the response processing will be interrupted.

type InterceptResponseHandle ¶

type InterceptResponseHandle struct {
	// contains filtered or unexported fields
}

InterceptResponseHandle is a handle for managing response interception callbacks.

type LocalStorageItem ¶

type LocalStorageItem struct {
	Name  string `json:"name"`
	Value string `json:"value,omitempty"`
}
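
The `omitempty` tag on Value affects how items serialize. The sketch below copies the documented fields and tags into a local struct to show the effect with `encoding/json`; `encode` is an illustrative helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// localStorageItem copies the documented fields and JSON tags of LocalStorageItem.
type localStorageItem struct {
	Name  string `json:"name"`
	Value string `json:"value,omitempty"`
}

// encode marshals items to a JSON string.
func encode(items []localStorageItem) (string, error) {
	b, err := json.Marshal(items)
	return string(b), err
}

func main() {
	out, err := encode([]localStorageItem{
		{Name: "theme", Value: "dark"},
		{Name: "visited"}, // empty Value is omitted from the output
	})
	fmt.Println(out, err)
	// prints: [{"name":"theme","value":"dark"},{"name":"visited"}] <nil>
}
```

One consequence is that an item with an empty value and an item with no value field look the same after a JSON round trip.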

type Page ¶

type Page interface {
	PageNavigation
	PageDOM
	PageFetch
	PageStorage
	PageInput

	// Close closes the page.
	// Returns an error if closing the page fails.
	Close(ctx context.Context) error

	// Evaluate runs JavaScript on the page.
	// Takes a PageEvaluateInput and returns a PageEvaluateOutput or an error.
	Evaluate(ctx context.Context, in *PageEvaluateInput) (*PageEvaluateOutput, error)

	// TakeScreenshot captures a screenshot of the page.
	// You can choose to capture the entire page or just the visible viewport.
	// Input parameters allow you to specify the image format and capture area.
	// Returns the screenshot data or an error if the capture fails.
	TakeScreenshot(ctx context.Context, in *PageTakeScreenshotInput) (*PageTakeScreenshotOutput, error)

	// GetTargetID returns the unique identifier for the page's target.
	// This ID can be used to distinguish different pages or targets in the browser.
	GetTargetID() string

	// GetCDPClient retrieves the Chrome DevTools Protocol (CDP) client associated with the page.
	// The CDP client allows for direct communication with the browser's protocol.
	// This is useful for performing low-level operations and custom actions not exposed by higher-level methods.
	GetCDPClient() *cdp.Client
}

Page represents a web page in the browser.

type PageCookie ¶

type PageCookie struct {
	Name     string     // The name of the cookie.
	Value    string     // The value of the cookie.
	Domain   string     // The domain the cookie is associated with.
	Path     string     // The path the cookie is accessible from.
	Size     int        // The size of the cookie in bytes.
	Expires  *time.Time // The expiration time of the cookie.
	Secure   bool       // Indicates if the cookie is secure (only sent over HTTPS).
	HttpOnly bool       // Indicates if the cookie is accessible via HTTP only (not accessible via JavaScript).
	Session  bool       // Indicates if the cookie is a session cookie.
}

PageCookie represents a cookie in the browser. It includes details such as name, value, domain, path, expiration, and security features.

type PageDOM ¶

type PageDOM interface {
	// GetContent retrieves the HTML content of the page as a string.
	// Returns the content or an error if retrieving fails.
	GetContent(ctx context.Context) (string, error)

	// SetContent replaces the current DOM with supplied content
	SetContent(ctx context.Context, content string) error

	// QuerySelector finds an element matching the selector.
	// Takes a PageQuerySelectorInput and returns a PageQuerySelectorOutput or an error.
	QuerySelector(ctx context.Context, in *PageQuerySelectorInput) (*PageQuerySelectorOutput, error)

	// Search finds an element matching the text, query selector or xpath
	// Takes a PageSearchInput and returns a PageSearchOutput or an error.
	Search(ctx context.Context, in *PageSearchInput) (*PageSearchOutput, error)
}

type PageEvaluateInput ¶

type PageEvaluateInput struct {
	AwaitPromise bool
	ReturnValue  bool
	Expression   string
}

PageEvaluateInput specifies input for the Evaluate method.

type PageEvaluateOutput ¶

type PageEvaluateOutput struct {
	Value json.RawMessage
}

PageEvaluateOutput represents the output of the Evaluate method.
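
Value is a `json.RawMessage`, so the caller chooses the Go type to decode into. A minimal sketch follows, assuming Value holds the JSON encoding of the expression result when ReturnValue is set; the generic helper `decodeValue` and the sample payloads are illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeValue unmarshals a raw JSON value, such as PageEvaluateOutput.Value,
// into the requested Go type.
func decodeValue[T any](raw json.RawMessage) (T, error) {
	var v T
	if len(raw) == 0 {
		return v, fmt.Errorf("no value returned")
	}
	err := json.Unmarshal(raw, &v)
	return v, err
}

func main() {
	// Stand-ins for what expressions like `document.title` or
	// `({w: innerWidth, h: innerHeight})` could yield.
	title, _ := decodeValue[string](json.RawMessage(`"Example Domain"`))
	size, _ := decodeValue[map[string]int](json.RawMessage(`{"w":1280,"h":720}`))
	fmt.Println(title, size["w"], size["h"]) // prints "Example Domain 1280 720"
}
```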

type PageFetch ¶

type PageFetch interface {
	// EnableFetch enables network fetch interception.
	// Returns an error if enabling fails.
	EnableFetch(ctx context.Context) error

	// DisableFetch disables network fetch interception.
	// Returns an error if disabling fails.
	DisableFetch(ctx context.Context) error

	// AddInterceptRequest adds a request interception callback.
	// Returns a handle that can be used to remove the callback later.
	AddInterceptRequest(ctx context.Context, cb InterceptRequestCallback) *InterceptRequestHandle

	// RemoveInterceptRequest removes a request interception callback.
	// The handle parameter should be the value returned by AddInterceptRequest.
	RemoveInterceptRequest(ctx context.Context, handle *InterceptRequestHandle)

	// AddInterceptResponse adds a response interception callback.
	// Returns a handle that can be used to remove the callback later.
	AddInterceptResponse(ctx context.Context, cb InterceptResponseCallback) *InterceptResponseHandle

	// RemoveInterceptResponse removes a response interception callback.
	// The handle parameter should be the value returned by AddInterceptResponse.
	RemoveInterceptResponse(ctx context.Context, handle *InterceptResponseHandle)
}

type PageInput ¶

type PageInput interface {
	// TypeText sends a sequence of keystrokes to the page as if typed by a user.
	// Accepts a PageTypeTextInput containing the text to type.
	// Returns a PageTypeTextOutput with the result or an error if typing fails.
	TypeText(ctx context.Context, in *PageTypeTextInput) (*PageTypeTextOutput, error)
}

type PageNavigateInput ¶

type PageNavigateInput struct {
	URL                string // The URL to navigate to.
	WaitDomContentLoad bool   // If true, waits for the DOM content to load before returning.
}

PageNavigateInput specifies the input for the Navigate method. URL is the target URL to navigate to. WaitDomContentLoad determines whether to wait for the DOM content to load.

type PageNavigateOutput ¶

type PageNavigateOutput struct {
	LoaderID network.LoaderID // The LoaderID associated with the navigation.
}

PageNavigateOutput represents the output of the Navigate method. LoaderID is the ID associated with the loading process of the page.

type PageNavigation ¶

type PageNavigation interface {
	// Activate brings page to front
	Activate(ctx context.Context) error

	// Navigate navigates the page to the specified URL.
	// The input is a PageNavigateInput containing the URL to navigate to.
	// It returns a PageNavigateOutput or an error if the navigation fails.
	Navigate(ctx context.Context, in *PageNavigateInput) (*PageNavigateOutput, error)

	// Reload reloads the current page.
	// It can take a PageReloadInput and returns a PageReloadOutput or an error.
	Reload(ctx context.Context, in *PageReloadInput) (*PageReloadOutput, error)
}

type PageQuerySelectorInput ¶

type PageQuerySelectorInput struct {
	Selector string
}

PageQuerySelectorInput contains the selector string for querying elements.

type PageQuerySelectorOutput ¶

type PageQuerySelectorOutput struct {
	Element Element
}

PageQuerySelectorOutput contains the Element found by the query.

type PageReloadInput ¶

type PageReloadInput struct {
	LoaderID           network.LoaderID // The LoaderID of the previous load.
	WaitDomContentLoad bool             // If true, waits for the DOM content to load after reload.
}

PageReloadInput specifies the input for the Reload method. LoaderID is the ID associated with the previous loading process. WaitDomContentLoad determines whether to wait for the DOM content to load after reloading.

type PageReloadOutput ¶

type PageReloadOutput struct{}

PageReloadOutput represents the output of the Reload method.

type PageSearchInput ¶

type PageSearchInput struct {
	Selector     string        // selector for querying the element.
	Pierce       bool          // Include User Agent Shadow DOM if true.
	WaitDuration time.Duration // Max duration to wait for an element to be present.
	TickDuration time.Duration // Duration between search attempts; defaults to 1 second if unset.
}

PageSearchInput specifies the selector to search for, whether to pierce the user agent shadow DOM, and how long and how often to retry while waiting for a match.
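
WaitDuration and TickDuration describe a polling loop: try, wait one tick, try again, and give up once the wait budget is spent. The sketch below shows one way such a loop can be written with a ticker and a context deadline. It is not gopilot's implementation; `waitFor` and the stand-in error are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Stand-in for gopilot.ErrElementSearchTimeout.
var errSearchTimeout = errors.New("page search: timeout")

// waitFor calls probe immediately and then once per tick until it
// reports true or wait has elapsed. A zero tick defaults to 1 second,
// matching the documented TickDuration default.
func waitFor(wait, tick time.Duration, probe func() bool) error {
	if tick <= 0 {
		tick = time.Second
	}
	ctx, cancel := context.WithTimeout(context.Background(), wait)
	defer cancel()

	ticker := time.NewTicker(tick)
	defer ticker.Stop()

	for {
		if probe() {
			return nil
		}
		select {
		case <-ctx.Done():
			return errSearchTimeout
		case <-ticker.C:
			// next attempt
		}
	}
}

func main() {
	attempts := 0
	err := waitFor(time.Second, 10*time.Millisecond, func() bool {
		attempts++
		return attempts == 3 // the "element" appears on the third attempt
	})
	fmt.Println(attempts, err) // prints "3 <nil>"
}
```

A short tick finds the element sooner at the cost of more protocol round trips, so the 1 second default is worth lowering only when latency matters.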

type PageSearchOutput ¶

type PageSearchOutput struct {
	Element Element // The first element matching the query.
}

PageSearchOutput contains the Element found by the query.

type PageStorage ¶

type PageStorage interface {
	// GetCookies retrieves cookies for the current page.
	// Takes a GetCookiesInput and returns GetCookiesOutput or an error.
	GetCookies(ctx context.Context, in *GetCookiesInput) (*GetCookiesOutput, error)

	// SetCookies sets cookies for the current page.
	// Takes a SetCookiesInput and returns SetCookiesOutput or an error.
	SetCookies(ctx context.Context, in *SetCookiesInput) (*SetCookiesOutput, error)

	// ClearCookies clears cookies for the current page.
	ClearCookies(ctx context.Context) error

	GetLocalStorage(ctx context.Context, in *GetLocalStorageInput) (*GetLocalStorageOutput, error)
	SetLocalStorage(ctx context.Context, in *SetLocalStorageInput) (*SetLocalStorageOutput, error)
	ClearLocalStorage(ctx context.Context) error
}

type PageTakeScreenshotInput ¶

type PageTakeScreenshotInput struct {
	// Format specifies the desired image format for the screenshot.
	// Options could be "png" or "jpeg".
	Format string

	// Full determines whether to capture the entire page or only the current viewport.
	Full bool

	// Viewport allows specifying a custom area of the page to capture.
	Viewport *cdppage.Viewport
}

PageTakeScreenshotInput specifies input parameters for taking a screenshot of a page.

type PageTakeScreenshotOutput ¶

type PageTakeScreenshotOutput struct {
	// Data contains the screenshot image data.
	Data []byte
}

PageTakeScreenshotOutput represents the output of the TakeScreenshot method for a page.

type PageTypeTextInput ¶

type PageTypeTextInput struct {
	Text      string        // The text to be typed into the page.
	Delay     time.Duration // (optional) Duration between keystrokes.
	DelayFunc TypeDelayFunc // (optional) Custom function for typing delays.

	UseRawKeyDown bool
}

PageTypeTextInput specifies the input for the Type method. Text is the string to type into the page. Delay is the optional duration between keystrokes. DelayFunc is an optional function that controls the delay for each keystroke.

type PageTypeTextOutput ¶

type PageTypeTextOutput struct{}

PageTypeTextOutput represents the output of the Type method. It is currently empty, but can be extended to provide additional details of the typing operation.

type SetCookiesInput ¶

type SetCookiesInput struct {
	Cookies []PageCookie // List of cookies to set.
}

SetCookiesInput specifies the input for the SetCookies method. It contains a list of cookies to set in the browser.

type SetCookiesOutput ¶

type SetCookiesOutput struct{}

SetCookiesOutput is returned after setting cookies successfully.

type SetLocalStorageInput ¶

type SetLocalStorageInput struct {
	Items []LocalStorageItem
}

type SetLocalStorageOutput ¶

type SetLocalStorageOutput struct{}

type TypeDelayFunc ¶

type TypeDelayFunc func() time.Duration

type XHREvent ¶

type XHREvent struct {
	URL    string `json:"url"`    // The URL that was requested
	Body   string `json:"body"`   // The body of the response
	Base64 bool   `json:"base64"` // Indicates if the response body is Base64 encoded
	Error  error  `json:"-"`      // Error encountered during the request (if any)
}

XHREvent represents an XHR event with related information.
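
Because Body may be Base64 encoded, consumers typically need to check the Base64 flag before using it. The following sketch shows one way to normalize the body to raw bytes. It assumes standard Base64 encoding; `bodyBytes` is a hypothetical helper, and the struct is a local copy of the documented fields so the example compiles on its own.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// XHREvent is a local copy of the struct documented above, kept here only
// to make the sketch self-contained.
type XHREvent struct {
	URL    string
	Body   string
	Base64 bool
	Error  error
}

// bodyBytes returns the response body as raw bytes, decoding it when the
// Base64 flag is set. If the event carries an Error, that error is returned.
func bodyBytes(ev *XHREvent) ([]byte, error) {
	if ev.Error != nil {
		return nil, ev.Error
	}
	if !ev.Base64 {
		return []byte(ev.Body), nil
	}
	// Assumption: the body uses standard (padded) Base64.
	return base64.StdEncoding.DecodeString(ev.Body)
}

func main() {
	ev := &XHREvent{URL: "https://example.com/api", Body: "eyJvayI6dHJ1ZX0=", Base64: true}
	b, err := bodyBytes(ev)
	fmt.Println(string(b), err)
}
```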

type XHRMonitor ¶

type XHRMonitor interface {
	// Listen starts listening for XHR events that match the provided patterns.
	// It returns a channel of XHREvent and an error if the operation fails.
	Listen(ctx context.Context, patterns []string) (chan *XHREvent, error)

	// Stop stops monitoring the XHR requests.
	// Returns an error if stopping fails.
	Stop(ctx context.Context) error
}

XHRMonitor is an interface for monitoring XHR requests.

func NewXHRMonitor ¶

func NewXHRMonitor(p Page) XHRMonitor

NewXHRMonitor creates a new XHRMonitor instance. It takes a Page and returns an instance of XHRMonitor.

Directories ¶

Path	Synopsis
examples
click_element command
cookies command
eval command
external_browser command
listen_xhr command
local_storage command
open_chrome command
open_url command
request_modifier command
screenshots command
search command
typing command

