gopilot

v0.0.1-rc9
Published: Mar 8, 2026 | License: MIT | Imports: 23 | Imported by: 0

README

gopilot

A lightweight approach to Chromium automation using basic CDP commands.

NOTE: Breaking changes may occur until the API is finalized.

Overview

gopilot is my attempt to provide a simple, minimalistic API for automating Chromium browsers. It's not meant to be another Puppeteer. Instead, it's focused on the essential features most users need for straightforward browser tasks—no fluff, just what you need.

Under the hood, gopilot uses github.com/mafredri/cdp for Chrome communication. That library is inspired by gRPC and provides a really nice and easy API.

Why Minimalistic?

I wanted to simplify browser automation by sticking to the core functionalities that most of us use:

  • Navigation to web pages
  • Clicking on elements
  • Typing text
  • Taking screenshots
  • Extracting HTML content

I’ve also added some features for intercepting requests, which is handy if you want to cancel or grab AJAX info. Overall, gopilot aims to be a lightweight tool that doesn’t bog you down with unnecessary complexity.

Key Features

  • Headful mode support: Designed to run headful and compatible with Docker using Xvfb for display.
  • Headless mode: Easily switch to headless operation when needed.
  • Navigation: Navigate to a specified URL.
  • Element search: Find and/or wait for elements.
  • Clicking: Click on elements.
  • HTML content: Get and set HTML content.
  • Interception: Intercept network requests and responses, for those who want to dig deeper.
  • Storage: Set, get, and clear cookies and local storage.
  • Screenshots: Capture the current page's viewport, the full page, or an element within its bounding box.
  • Text typing: Just provide the text to be written; a delay or a function can be supplied for per-keystroke delays.

Installation

Prerequisites

  • Go 1.24.0 or later
  • Chrome or Chromium browser installed on your system

Installing gopilot

To install gopilot, use the standard Go package installation command:

go get github.com/falmar/gopilot

Import it in your Go code:

import "github.com/falmar/gopilot"

Quick Start

Here's a very basic example of how to use gopilot to open a URL:

package main

import (
	"context"
	"log/slog"
	"os"
	"os/signal"
	"time"

	"github.com/falmar/gopilot"
)

func main() {
	// os.Kill cannot be trapped by a program, so only os.Interrupt is registered.
	ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt)
	defer cancel()

	logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
		Level: slog.LevelDebug,
	}))

	cfg := gopilot.NewBrowserConfig()
	b := gopilot.NewBrowser(cfg, logger)

	err := b.Open(ctx, &gopilot.BrowserOpenInput{})
	if err != nil {
		logger.Error("unable to open browser", "error", err)
		return
	}
	defer b.Close(ctx)

	pOut, err := b.NewPage(ctx, &gopilot.BrowserNewPageInput{})
	if err != nil {
		logger.Error("unable to open page", "error", err)
		return
	}
	page := pOut.Page
	defer page.Close(ctx)

	_, err = page.Navigate(ctx, &gopilot.PageNavigateInput{
		URL:                "https://www.google.com",
		WaitDomContentLoad: true,
	})
	if err != nil {
		logger.Error("unable to navigate", "error", err)
		return
	}

	time.Sleep(2 * time.Second)

	// do some magic ...
}

Examples

For more practical examples of how to use gopilot, check out the examples provided:

  • Click Element - Demonstrates how to find and click on elements in a web page
  • Cookies - Shows how to set, get, and clear cookies
  • Evaluate JS - Examples of executing JavaScript in the browser context
  • External Browser - Shows how to connect to an existing Chrome instance instead of launching a new one
  • Local Storage - Shows how to interact with browser local storage
  • Open Chrome - Basic example of launching a Chrome browser
  • Open URL - Simple example of navigating to a URL
  • Screenshots - Shows how to capture screenshots of pages or elements
  • Search - Demonstrates how to search for elements on a page
  • Typing - Examples of typing text into input fields
  • Request Modifier - Demonstrates how to modify outgoing requests and provide custom responses
  • Listen XHR - Demonstrates how to intercept and monitor XHR requests

Advanced Usage

Headless Mode

By default, gopilot runs in headful mode, which may require a display server when running in a Docker container. To switch to headless mode, simply call the EnableHeadless method on the BrowserConfig object. You can start the browser in headless mode as follows:

// EnableHeadless will make the browser start as headless
cfg := gopilot.NewBrowserConfig()
cfg.EnableHeadless()

Connecting to External Browsers

gopilot can connect to existing Chrome/Chromium instances instead of launching a new process. This is useful for debugging, reusing browsers across multiple runs, or working with browsers that have specific profiles or extensions loaded.

Start Chrome with remote debugging:

chromium --remote-debugging-port=9222

Connect gopilot to the external browser:

cfg := gopilot.NewBrowserConfig()
cfg.ConnectionURL = "http://127.0.0.1:9222"

b := gopilot.NewBrowser(cfg, logger)
err := b.Open(ctx, &gopilot.BrowserOpenInput{})
if err != nil {
    // Connection failed - browser may not be running
    return
}
defer b.Close(ctx) // Closes pages but does NOT kill the browser

Session-Based Page Tracking: gopilot only manages pages it creates (via NewTab: true). This means:

  • Close() only closes pages created by this gopilot instance (session pages)
  • User tabs and pages from other gopilot instances are preserved
  • Multiple gopilot instances can safely share the same browser without conflicts

GetPages() vs GetAllPages():

  • GetPages() - Returns only pages created by this instance (closeable)
  • GetAllPages() - Returns ALL pages in the browser for inspection (calling Close() on these is a no-op)

See the External Browser example for a complete demonstration.

Configuration Options

gopilot provides several configuration options to customize browser behavior:

Browser Configuration

The BrowserConfig struct allows you to configure how the browser is launched:

type BrowserConfig struct {
    // Path specifies the path to the browser executable
    Path string

    // DebugPort specifies the port for debugging connections
    DebugPort string

    // Args contains additional command-line arguments
    Args []string

    // Envs holds environment variables for the browser process
    Envs []string

    // OpenTimeout defines how long to wait for Chrome to print the
    // "DevTools listening on" message during startup. If nil, defaults to 5s.
    OpenTimeout *time.Duration
}

Default Configuration

When you call gopilot.NewBrowserConfig(), it creates a configuration with these defaults:

  • Browser Path: Uses the Chrome executable specified by the GOPILOT_CHROME_EXECUTABLE environment variable, or defaults to "chromium"
  • Debug Port: "9222"
  • Default Arguments: Several arguments for optimal browser operation:
    • --remote-allow-origins=*
    • --no-first-run
    • --no-service-autorun
    • --no-default-browser-check
    • --homepage=about:blank
    • And several others for stability and performance

Environment Variables

  • GOPILOT_CHROME_EXECUTABLE: Set this to specify the path to your Chrome or Chromium executable. For example:
    export GOPILOT_CHROME_EXECUTABLE="/usr/bin/google-chrome"
    
Adding Custom Arguments

You can add custom command-line arguments to the browser:

cfg := gopilot.NewBrowserConfig()
cfg.AddArgument("--disable-gpu")
cfg.AddArgument("--window-size=1280,720")

Project Status & Roadmap

gopilot is currently in active development ("WIP" - Work In Progress). While the core functionality is stable enough for many use cases, the API may change as we refine and improve the library.

Current Status

  • Core browser automation features are implemented and working
  • API is functional but may undergo refinements
  • Documentation and examples are being expanded

Planned Features

  • Listen for page/target events to change local data
  • Integration tests
  • Performance optimizations
  • Additional helper methods for common tasks

Development Priorities

  1. API stabilization
  2. Improved error handling and recovery
  3. Enhanced documentation
  4. Performance improvements

Contributions

Contributions are welcome! If you've got a feature request or an idea to share, reach out.

Documentation

Overview

Package gopilot provides a simple and minimalistic API for automating Chromium browsers.

gopilot is a lightweight alternative to complex browser automation tools, focusing on essential functionality using the Chrome DevTools Protocol (CDP). It's structured around three main components: Browser (manages instances), Page (represents tabs), and Element (represents DOM elements).

Key features include navigation, DOM manipulation, element interaction, screenshots, and network request monitoring. The package supports both headful (default) and headless modes, and can be configured via environment variables like GOPILOT_CHROME_EXECUTABLE.

Common use cases include web scraping, UI testing, form automation, and taking screenshots.

For examples and detailed usage, see: https://github.com/falmar/gopilot/tree/main/examples

Index

Constants

This section is empty.

Variables

var (
	ErrElementNotFound      = errors.New("page search: element not found")
	ErrElementSearchTimeout = errors.New("page search: timeout")
)

Functions

This section is empty.

Types

type BoundingRect

type BoundingRect struct {
	// Top is the distance from the top of the viewport to the top of the element.
	Top float64 `json:"top"`
	// Left is the distance from the left of the viewport to the left of the element.
	Left float64 `json:"left"`
	// Bottom is the distance from the top of the viewport to the bottom of the element.
	Bottom float64 `json:"bottom"`
	// Right is the distance from the left of the viewport to the right of the element.
	Right float64 `json:"right"`
	// X is the horizontal coordinate of the element.
	X float64 `json:"x"`
	// Y is the vertical coordinate of the element.
	Y float64 `json:"y"`
	// Width is the width of the element.
	Width float64 `json:"width"`
	// Height is the height of the element.
	Height float64 `json:"height"`
	// CenterX is the centered position on x-axis
	CenterX float64
	// CenterY is the centered position on y-axis
	CenterY float64
}

BoundingRect represents the bounding box of an element on the page. It contains the coordinates of the edges and dimensions of the element.
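
The documentation does not say how CenterX and CenterY are derived. A plausible reading, shown below purely as an assumption, is the midpoint of the box. The rect type is a local stand-in and center is a hypothetical helper.

```go
package main

import "fmt"

// rect is a local stand-in holding only the fields needed here.
type rect struct {
	Left, Top, Width, Height float64
}

// center returns the midpoint of the box. This is an assumed
// definition; the library may compute CenterX/CenterY differently.
func center(r rect) (x, y float64) {
	return r.Left + r.Width/2, r.Top + r.Height/2
}

func main() {
	x, y := center(rect{Left: 10, Top: 20, Width: 100, Height: 50})
	fmt.Println(x, y) // prints "60 45"
}
```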

type Browser

type Browser interface {
	// Open initiates a new browser session.
	// It takes a context and BrowserOpenInput as parameters.
	// Returns an error if the browser fails to start.
	Open(ctx context.Context, in *BrowserOpenInput) error

	// NewPage creates a new page or tab in the browser.
	// Accepts context and BrowserNewPageInput to specify creation parameters.
	// Returns a BrowserNewPageOutput containing the newly created page
	// or an error if the page cannot be created.
	NewPage(ctx context.Context, in *BrowserNewPageInput) (*BrowserNewPageOutput, error)

	// GetPages retrieves only pages created by this session (tracked pages).
	// These are pages created with NewTab: true. Calling Close() on these pages will close them.
	// Returns a BrowserGetPagesOutput with a list of session pages or an error if retrieving fails.
	GetPages(ctx context.Context, in *BrowserGetPagesInput) (*BrowserGetPagesOutput, error)

	// GetAllPages retrieves ALL pages in the browser, including non-session pages.
	// Pages returned are NOT session-tracked, and calling Close() on them is a no-op.
	// Use this for inspection/debugging. For pages created by this session, use GetPages().
	GetAllPages(ctx context.Context, in *BrowserGetPagesInput) (*BrowserGetPagesOutput, error)

	// Close shuts down the browser instance and cleans up any resources.
	// Only closes session pages (pages created by this instance with NewTab: true).
	// For external browsers, closes session pages but leaves the browser running.
	Close(ctx context.Context) error

	// GetDevToolClient retrieves the DevTools client associated with the browser.
	// This client allows for advanced interactions with the browser's DevTools protocol,
	// enabling custom actions and low-level debugging or profiling features.
	GetDevToolClient() *devtool.DevTools
}

Browser defines a contract for browser operations. It allows managing browser instances and interacting with web pages.

func NewBrowser

func NewBrowser(cfg *BrowserConfig, logger *slog.Logger) Browser

NewBrowser creates a new browser instance with the given configuration and logger.

type BrowserConfig

type BrowserConfig struct {
	// Path specifies the path to the browser executable.
	Path string

	// DebugPort specifies the port for debugging connections.
	DebugPort string

	// Args contains additional command-line arguments to pass when launching the browser.
	Args []string

	// Envs holds any environment variables to set for the browser process.
	Envs []string

	// OpenTimeout defines how long to wait for Chrome to print the "DevTools listening on" message during startup.
	// If nil, a default of 5 seconds is used. Increase this if your environment starts Chrome slowly.
	OpenTimeout *time.Duration

	// CloseTimeout defines how long to wait for the Chrome process to terminate during shutdown.
	// If nil, a default of 5 seconds is used. Increase this if your environment needs more time to exit cleanly.
	CloseTimeout *time.Duration

	// ConnectionURL specifies the URL of an existing Chrome/Chromium browser to connect to.
	// When set, gopilot will connect to the existing browser instead of launching a new process.
	// Supports both WebSocket URLs (ws://127.0.0.1:9222/devtools/browser/UUID) and HTTP (http://127.0.0.1:9222).
	// The external browser will NOT be closed when Browser.Close() is called.
	//
	// Example:
	//   cfg := gopilot.NewBrowserConfig()
	//   cfg.ConnectionURL = "http://127.0.0.1:9222"
	ConnectionURL string
}

BrowserConfig holds configuration settings for launching a browser instance.
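
Since ConnectionURL accepts both WebSocket and HTTP endpoints, it can help to validate the value before handing it to the library. The check below is a sketch that looks only at the URL scheme and host. connectionKind is a hypothetical helper, and accepting wss and https in addition to the documented ws and http is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// connectionKind reports whether raw looks like a WebSocket or an HTTP
// DevTools endpoint, judging only by scheme and host.
func connectionKind(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse connection URL: %w", err)
	}
	if u.Host == "" {
		return "", fmt.Errorf("connection URL %q has no host", raw)
	}
	switch u.Scheme {
	case "ws", "wss":
		return "websocket", nil
	case "http", "https":
		return "http", nil
	default:
		return "", fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
}

func main() {
	fmt.Println(connectionKind("http://127.0.0.1:9222"))
	fmt.Println(connectionKind("ws://127.0.0.1:9222/devtools/browser/UUID"))
}
```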

func NewBrowserConfig

func NewBrowserConfig() *BrowserConfig

NewBrowserConfig creates a new BrowserConfig with default settings. The default Path is "chromium" and the default DebugPort is "9222". It includes several default command-line arguments for browser startup.

func (*BrowserConfig) AddArgument

func (c *BrowserConfig) AddArgument(arg string)

AddArgument appends an additional command-line argument to the browser configuration. This allows users to customize the launch options for the browser instance.

func (*BrowserConfig) EnableHeadless

func (c *BrowserConfig) EnableHeadless()

EnableHeadless makes the browser start in headless mode.

type BrowserGetPagesInput

type BrowserGetPagesInput struct{}

BrowserGetPagesInput represents parameters to obtain open pages.

type BrowserGetPagesOutput

type BrowserGetPagesOutput struct {
	Pages []Page
}

BrowserGetPagesOutput contains the list of open browser pages.

type BrowserNewPageInput

type BrowserNewPageInput struct {
	NewTab bool
	URL    string
}

BrowserNewPageInput contains parameters for creating a new page.

type BrowserNewPageOutput

type BrowserNewPageOutput struct {
	Page Page
}

BrowserNewPageOutput contains the result of creating a new page.

type BrowserOpenInput

type BrowserOpenInput struct{}

BrowserOpenInput contains parameters required to open a browser.

type ClearLocalStorageInput

type ClearLocalStorageInput struct{}

type ClearLocalStorageOutput

type ClearLocalStorageOutput struct{}

type DispatchEventType

type DispatchEventType string

const (
	DispatchEventTypeKeyDown    DispatchEventType = "keyDown"
	DispatchEventTypeKeyUp      DispatchEventType = "keyUp"
	DispatchEventTypeRawKeyDown DispatchEventType = "rawKeyDown"
	DispatchEventTypeChar       DispatchEventType = "char"
)

type Element

type Element interface {
	ElementInput
	ElementDOM

	// TakeScreenshot captures a screenshot of the element.
	// It uses the element's position and size to define the capture area.
	// Input parameters can specify the format of the image.
	// Returns the screenshot data as bytes or an error if the capture fails.
	TakeScreenshot(ctx context.Context, in *ElementTakeScreenshotInput) (*ElementTakeScreenshotOutput, error)

	// GetNodeID returns the ID of the element's current DOM node.
	GetNodeID(ctx context.Context) dom.NodeID
}

Element represents an interactive element in a web page.

func NewElement

func NewElement(client *cdp.Client, node dom.Node) Element

NewElement creates a new Element instance. It takes a CDP client and a DOM node as parameters and returns a new Element implementation.

type ElementClickInput

type ElementClickInput struct {
	StepDuration time.Duration // Duration for each step of the click action.
	HoldDuration time.Duration // Duration to hold the mouse press before releasing.

	ReturnHoldRelease bool // Return a release function to let user decide when to release mouse press
}

ElementClickInput specifies the input parameters for simulating a click on an element.

  • StepDuration: Duration to wait between each step of the click process: moving to the element, mouse press, and mouse release.
  • HoldDuration: Duration to wait between mouse press and mouse release. Defaults to StepDuration if not set.

type ElementClickOutput

type ElementClickOutput struct {
	X float64 `json:"x"` // X coordinate of the click position.
	Y float64 `json:"y"` // Y coordinate of the click position.

	Release func() error
}

ElementClickOutput represents the output of a click action. It provides the X and Y coordinates where the click occurred.

type ElementDOM

type ElementDOM interface {
	// HTML retrieves the element's outer HTML content
	HTML(ctx context.Context) (string, error)

	// Text retrieves the element's text content.
	Text(ctx context.Context) (string, error)

	// Focus sets focus on the element, allowing it to receive input.
	// Returns an error if the action fails.
	Focus(ctx context.Context) error

	// ScrollIntoView performs an action to scroll the element into the viewport.
	// Accepts an ElementScrollIntoViewInput with scroll parameters.
	// Returns an ElementScrollIntoViewOutput or an error if the action fails.
	ScrollIntoView(ctx context.Context, in *ElementScrollIntoViewInput) (*ElementScrollIntoViewOutput, error)

	// GetRect retrieves the bounding rectangle of the element.
	// Returns a BoundingRect containing the dimensions and position of the element, or an error if retrieval fails.
	GetRect(ctx context.Context) (*BoundingRect, error)

	// Remove the element from the DOM tree
	Remove(ctx context.Context) error
}

type ElementInput

type ElementInput interface {
	// Click simulates a mouse click on the element.
	// Accepts an ElementClickInput containing details for the click action.
	// Returns an ElementClickOutput with the result or an error if the click fails.
	Click(ctx context.Context, in *ElementClickInput) (*ElementClickOutput, error)
}

type ElementScrollIntoViewInput

type ElementScrollIntoViewInput struct{}

ElementScrollIntoViewInput contains parameters for the ScrollIntoView action.

type ElementScrollIntoViewOutput

type ElementScrollIntoViewOutput struct {
	// X is the X coordinate of the scroll-to view position.
	X float64 `json:"x"`

	// Y is the Y coordinate of the scroll-to view position.
	Y float64 `json:"y"`
}

ElementScrollIntoViewOutput contains the result of the ScrollIntoView action.

type ElementTakeScreenshotInput

type ElementTakeScreenshotInput struct {
	// Format specifies the desired image format for the screenshot.
	// Common formats include "png" and "jpeg".
	Format string
}

ElementTakeScreenshotInput specifies input parameters for taking a screenshot of an element.

type ElementTakeScreenshotOutput

type ElementTakeScreenshotOutput struct {
	// Data contains the base64 encoded screenshot image data.
	Data []byte
}

ElementTakeScreenshotOutput represents the output of the TakeScreenshot method for an element.

type GetCookiesInput

type GetCookiesInput struct{}

GetCookiesInput specifies the input for the GetCookies method.

type GetCookiesOutput

type GetCookiesOutput struct {
	Cookies []PageCookie // List of cookies.
}

GetCookiesOutput contains the cookies retrieved from the browser. It returns a list of cookies.

type GetLocalStorageInput

type GetLocalStorageInput struct{}

type GetLocalStorageOutput

type GetLocalStorageOutput struct {
	Items []LocalStorageItem
}

type InterceptRequestCallback

type InterceptRequestCallback func(ctx context.Context, req *fetch.RequestPausedReply, continueArgs *fetch.ContinueRequestArgs) (*fetch.FulfillRequestArgs, error)

InterceptRequestCallback is a function type for request interception. The callback receives details about the paused request and can modify it or provide a custom response. Return values:

  • (nil, nil): Continue the request with any modifications made to continueArgs
  • (nil, error): Abort the request with the given error
  • (*fetch.FulfillRequestArgs, nil): Fulfill the request with a custom response

type InterceptRequestHandle

type InterceptRequestHandle struct {
	// contains filtered or unexported fields
}

InterceptRequestHandle is a handle for managing request interception callbacks.

type InterceptResponseCallback

type InterceptResponseCallback func(ctx context.Context, req *fetch.RequestPausedReply, continueArgs *fetch.ContinueResponseArgs) error

InterceptResponseCallback is a function type for response interception. The callback receives details about the paused response and can modify it. If an error is returned, the response processing will be interrupted.

type InterceptResponseHandle

type InterceptResponseHandle struct {
	// contains filtered or unexported fields
}

InterceptResponseHandle is a handle for managing response interception callbacks.

type LocalStorageItem

type LocalStorageItem struct {
	Name  string `json:"name"`
	Value string `json:"value,omitempty"`
}
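
The struct tags determine how items serialize: Value is omitted when empty. A short sketch with a local copy of the type makes that visible. encodeItems is a hypothetical helper, and nothing here implies how gopilot itself transfers items to the browser.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// LocalStorageItem is a local copy of the type shown above.
type LocalStorageItem struct {
	Name  string `json:"name"`
	Value string `json:"value,omitempty"`
}

// encodeItems marshals the items to a JSON array.
func encodeItems(items []LocalStorageItem) (string, error) {
	b, err := json.Marshal(items)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeItems([]LocalStorageItem{
		{Name: "theme", Value: "dark"},
		{Name: "draft"}, // empty Value is dropped by omitempty
	})
	fmt.Println(out, err)
}
```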

type Page

type Page interface {
	PageNavigation
	PageDOM
	PageFetch
	PageStorage
	PageInput

	// Close closes the page.
	// Returns an error if closing the page fails.
	Close(ctx context.Context) error

	// Evaluate runs JavaScript on the page.
	// Takes a PageEvaluateInput and returns a PageEvaluateOutput or an error.
	Evaluate(ctx context.Context, in *PageEvaluateInput) (*PageEvaluateOutput, error)

	// TakeScreenshot captures a screenshot of the page.
	// You can choose to capture the entire page or just the visible viewport.
	// Input parameters allow you to specify the image format and capture area.
	// Returns the screenshot data or an error if the capture fails.
	TakeScreenshot(ctx context.Context, in *PageTakeScreenshotInput) (*PageTakeScreenshotOutput, error)

	// GetTargetID returns the unique identifier for the page's target.
	// This ID can be used to distinguish different pages or targets in the browser.
	GetTargetID() string

	// GetCDPClient retrieves the Chrome DevTools Protocol (CDP) client associated with the page.
	// The CDP client allows for direct communication with the browser's protocol.
	// This is useful for performing low-level operations and custom actions not exposed by higher-level methods.
	GetCDPClient() *cdp.Client
}

Page represents a web page in the browser.

type PageCookie

type PageCookie struct {
	Name     string     // The name of the cookie.
	Value    string     // The value of the cookie.
	Domain   string     // The domain the cookie is associated with.
	Path     string     // The path the cookie is accessible from.
	Size     int        // The size of the cookie in bytes.
	Expires  *time.Time // The expiration time of the cookie.
	Secure   bool       // Indicates if the cookie is secure (only sent over HTTPS).
	HttpOnly bool       // Indicates if the cookie is accessible via HTTP only (not accessible via JavaScript).
	Session  bool       // Indicates if the cookie is a session cookie.
}

PageCookie represents a cookie in the browser. It includes details such as name, value, domain, path, expiration, and security features.

type PageDOM

type PageDOM interface {
	// GetContent retrieves the HTML content of the page as a string.
	// Returns the content or an error if retrieving fails.
	GetContent(ctx context.Context) (string, error)

	// SetContent replaces the current DOM with supplied content
	SetContent(ctx context.Context, content string) error

	// QuerySelector finds an element matching the selector.
	// Takes a PageQuerySelectorInput and returns a PageQuerySelectorOutput or an error.
	QuerySelector(ctx context.Context, in *PageQuerySelectorInput) (*PageQuerySelectorOutput, error)

	// Search finds an element matching the text, query selector or xpath
	// Takes a PageSearchInput and returns a PageSearchOutput or an error.
	Search(ctx context.Context, in *PageSearchInput) (*PageSearchOutput, error)
}

type PageEvaluateInput

type PageEvaluateInput struct {
	AwaitPromise bool
	ReturnValue  bool
	Expression   string
}

PageEvaluateInput specifies input for the Evaluate method.

type PageEvaluateOutput

type PageEvaluateOutput struct {
	Value json.RawMessage
}

PageEvaluateOutput represents the output of the Evaluate method.

type PageFetch

type PageFetch interface {
	// EnableFetch enables network fetch interception.
	// Returns an error if enabling fails.
	EnableFetch(ctx context.Context) error

	// DisableFetch disables network fetch interception.
	// Returns an error if disabling fails.
	DisableFetch(ctx context.Context) error

	// AddInterceptRequest adds a request interception callback.
	// Returns a handle that can be used to remove the callback later.
	AddInterceptRequest(ctx context.Context, cb InterceptRequestCallback) *InterceptRequestHandle

	// RemoveInterceptRequest removes a request interception callback.
	// The handle parameter should be the value returned by AddInterceptRequest.
	RemoveInterceptRequest(ctx context.Context, handle *InterceptRequestHandle)

	// AddInterceptResponse adds a response interception callback.
	// Returns a handle that can be used to remove the callback later.
	AddInterceptResponse(ctx context.Context, cb InterceptResponseCallback) *InterceptResponseHandle

	// RemoveInterceptResponse removes a response interception callback.
	// The handle parameter should be the value returned by AddInterceptResponse.
	RemoveInterceptResponse(ctx context.Context, handle *InterceptResponseHandle)
}

type PageInput

type PageInput interface {
	// TypeText sends a sequence of keystrokes to the page as if typed by a user.
	// Accepts a PageTypeTextInput containing the text to type.
	// Returns a PageTypeTextOutput with the result or an error if typing fails.
	TypeText(ctx context.Context, in *PageTypeTextInput) (*PageTypeTextOutput, error)
}

type PageNavigateInput

type PageNavigateInput struct {
	URL                string // The URL to navigate to.
	WaitDomContentLoad bool   // If true, waits for the DOM content to load before returning.
}

PageNavigateInput specifies the input for the Navigate method. URL is the target URL to navigate to. WaitDomContentLoad determines whether to wait for the DOM content to load.

type PageNavigateOutput

type PageNavigateOutput struct {
	LoaderID network.LoaderID // The LoaderID associated with the navigation.
}

PageNavigateOutput represents the output of the Navigate method. LoaderID is the ID associated with the loading process of the page.

type PageNavigation

type PageNavigation interface {
	// Activate brings page to front
	Activate(ctx context.Context) error

	// Navigate navigates the page to the specified URL.
	// The input is a PageNavigateInput containing the URL to navigate to.
	// It returns a PageNavigateOutput or an error if the navigation fails.
	Navigate(ctx context.Context, in *PageNavigateInput) (*PageNavigateOutput, error)

	// Reload reloads the current page.
	// It can take a PageReloadInput and returns a PageReloadOutput or an error.
	Reload(ctx context.Context, in *PageReloadInput) (*PageReloadOutput, error)
}

type PageQuerySelectorInput

type PageQuerySelectorInput struct {
	Selector string
}

PageQuerySelectorInput contains the selector string for querying elements.

type PageQuerySelectorOutput

type PageQuerySelectorOutput struct {
	Element Element
}

PageQuerySelectorOutput contains the Element found by the query.

type PageReloadInput

type PageReloadInput struct {
	LoaderID           network.LoaderID // The LoaderID of the previous load.
	WaitDomContentLoad bool             // If true, waits for the DOM content to load after reload.
}

PageReloadInput specifies the input for the Reload method. LoaderID is the ID associated with the previous loading process. WaitDomContentLoad determines whether to wait for the DOM content to load after reloading.

type PageReloadOutput

type PageReloadOutput struct{}

PageReloadOutput represents the output of the Reload method.

type PageSearchInput

type PageSearchInput struct {
	Selector     string        // selector for querying the element.
	Pierce       bool          // Include User Agent Shadow DOM if true.
	WaitDuration time.Duration // Max duration to wait for an element to be present.
	TickDuration time.Duration // Duration between search attempts; defaults to 1 second if unset.
}

PageSearchInput contains the selector string for querying elements.

type PageSearchOutput

type PageSearchOutput struct {
	Element Element // The first element matching the query.
}

PageSearchOutput contains the Element found by the query.

type PageStorage

type PageStorage interface {
	// GetCookies retrieves cookies for the current page.
	// Takes a GetCookiesInput and returns GetCookiesOutput or an error.
	GetCookies(ctx context.Context, in *GetCookiesInput) (*GetCookiesOutput, error)

	// SetCookies sets cookies for the current page.
	// Takes a SetCookiesInput and returns SetCookiesOutput or an error.
	SetCookies(ctx context.Context, in *SetCookiesInput) (*SetCookiesOutput, error)

	// ClearCookies clears cookies for the current page.
	ClearCookies(ctx context.Context) error

	GetLocalStorage(ctx context.Context, in *GetLocalStorageInput) (*GetLocalStorageOutput, error)
	SetLocalStorage(ctx context.Context, in *SetLocalStorageInput) (*SetLocalStorageOutput, error)
	ClearLocalStorage(ctx context.Context) error
}

type PageTakeScreenshotInput

type PageTakeScreenshotInput struct {
	// Format specifies the desired image format for the screenshot.
	// Options could be "png" or "jpeg".
	Format string

	// Full determines whether to capture the entire page or only the current viewport.
	Full bool

	// Viewport allows specifying a custom area of the page to capture.
	Viewport *cdppage.Viewport
}

PageTakeScreenshotInput specifies input parameters for taking a screenshot of a page.

type PageTakeScreenshotOutput

type PageTakeScreenshotOutput struct {
	// Data contains the screenshot image data.
	Data []byte
}

PageTakeScreenshotOutput represents the output of the TakeScreenshot method for a page.

type PageTypeTextInput

type PageTypeTextInput struct {
	Text      string        // The text to be typed into the page.
	Delay     time.Duration // (optional) Duration between keystrokes.
	DelayFunc TypeDelayFunc // (optional) Custom function for typing delays.

	UseRawKeyDown bool
}

PageTypeTextInput specifies input for the TypeText method. Text specifies the string to type into the page. Delay is the duration between keystrokes. DelayFunc is a function that controls typing delays.

type PageTypeTextOutput

type PageTypeTextOutput struct{}

PageTypeTextOutput represents the output of the TypeText method. It is currently empty, but can be extended to provide additional details of the typing operation.

type SetCookiesInput

type SetCookiesInput struct {
	Cookies []PageCookie // List of cookies to set.
}

SetCookiesInput specifies the input for the SetCookies method. It contains a list of cookies to set in the browser.

type SetCookiesOutput

type SetCookiesOutput struct{}

SetCookiesOutput is returned after setting cookies successfully.

type SetLocalStorageInput

type SetLocalStorageInput struct {
	Items []LocalStorageItem
}

type SetLocalStorageOutput

type SetLocalStorageOutput struct{}

type TypeDelayFunc

type TypeDelayFunc func() time.Duration

type XHREvent

type XHREvent struct {
	URL    string `json:"url"`    // The URL that was requested
	Body   string `json:"body"`   // The body of the response
	Base64 bool   `json:"base64"` // Indicates if the response body is Base64 encoded
	Error  error  `json:"-"`      // Error encountered during the request (if any)
}

XHREvent represents an XHR event with related information.

type XHRMonitor

type XHRMonitor interface {
	// Listen starts listening for XHR events that match the provided patterns.
	// It returns a channel of XHREvent and an error if the operation fails.
	Listen(ctx context.Context, patterns []string) (chan *XHREvent, error)

	// Stop stops monitoring the XHR requests.
	// Returns an error if stopping fails.
	Stop(ctx context.Context) error
}

XHRMonitor is an interface for monitoring XHR requests.

func NewXHRMonitor

func NewXHRMonitor(p Page) XHRMonitor

NewXHRMonitor creates a new XHRMonitor instance. It takes a Page and returns an instance of XHRMonitor.

Directories

Commands under examples:

  • examples/click_element
  • examples/cookies
  • examples/eval
  • examples/listen_xhr
  • examples/local_storage
  • examples/open_chrome
  • examples/open_url
  • examples/screenshots
  • examples/search
  • examples/typing
