artifact

package
v0.1.4 Latest
Published: Aug 8, 2025 License: Apache-2.0 Imports: 19 Imported by: 0

Documentation

Overview

Package artifact contains a data pipeline that reads workflow event records from BigQuery and ingests any available logs into Cloud Storage. A mapping from the original GitHub event to the Cloud Storage location is persisted in BigQuery, along with an indicator of the copy's status. The pipeline authenticates as a GitHub App.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ExecuteJob

func ExecuteJob(ctx context.Context, cfg *Config) error

ExecuteJob runs the ingestion pipeline job to read GitHub Actions workflow logs from GitHub and store them in GCS.

func NewLogIngester

func NewLogIngester(ctx context.Context, cfg *Config) (*logIngester, error)

NewLogIngester creates a logIngester and initializes the object store and GitHub client.

Types

type ArtifactRecord

type ArtifactRecord struct {
	DeliveryID       string    `bigquery:"delivery_id" json:"delivery_id"`
	ProcessedAt      time.Time `bigquery:"processed_at" json:"processed_at"`
	Status           string    `bigquery:"status" json:"status"`
	WorkflowURI      string    `bigquery:"workflow_uri" json:"workflow_uri"`
	LogsURI          string    `bigquery:"logs_uri" json:"logs_uri"`
	GitHubActor      string    `bigquery:"github_actor" json:"github_actor"`
	OrganizationName string    `bigquery:"organization_name" json:"organization_name"`
	RepositoryName   string    `bigquery:"repository_name" json:"repository_name"`
	RepositorySlug   string    `bigquery:"repository_slug" json:"repository_slug"`
	JobName          string    `bigquery:"job_name" json:"job_name"`
}

ArtifactRecord is the output data structure that maps to the leech pipeline's output table schema.

type Config

type Config struct {
	GitHub githubclient.Config

	// BatchSize is the number of items to process in this pipeline run.
	BatchSize int

	// ProjectID is the project id where the tables live.
	ProjectID string

	// DatasetID is the dataset id where the tables live.
	DatasetID string

	// EventsTableID is the table_name of the events table.
	EventsTableID string

	// ArtifactsTableID is the table_name of the artifact_status table.
	ArtifactsTableID string

	// BucketName is the name of the GCS bucket to store artifact logs.
	BucketName string
}

Config defines the set of environment variables required for running the artifact job.

func (*Config) ToFlags

func (cfg *Config) ToFlags(set *cli.FlagSet) *cli.FlagSet

ToFlags binds the config to the cli.FlagSet and returns it.

func (*Config) Validate

func (cfg *Config) Validate(ctx context.Context) error

Validate validates the artifacts config after load.

type EventRecord

type EventRecord struct {
	DeliveryID         string   `bigquery:"delivery_id" json:"delivery_id"`
	RepositorySlug     string   `bigquery:"repo_slug" json:"repo_slug"`
	RepositoryName     string   `bigquery:"repo_name" json:"repo_name"`
	OrganizationName   string   `bigquery:"org_name" json:"org_name"`
	LogsURL            string   `bigquery:"logs_url" json:"logs_url"`
	GitHubActor        string   `bigquery:"github_actor" json:"github_actor"`
	WorkflowURL        string   `bigquery:"workflow_url" json:"workflow_url"`
	WorkflowRunID      string   `bigquery:"workflow_run_id" json:"workflow_run_id"`
	WorkflowRunAttempt string   `bigquery:"workflow_run_attempt" json:"workflow_run_attempt"`
	PullRequestNumbers []string `bigquery:"pull_request_numbers" json:"pull_request_numbers"`
}

EventRecord maps the columns from the driving BigQuery query to a usable structure.

type ObjectStore

type ObjectStore struct {
	// contains filtered or unexported fields
}

ObjectStore is an implementation of the ObjectWriter interface that writes to Cloud Storage.

func NewObjectStore

func NewObjectStore(ctx context.Context) (*ObjectStore, error)

NewObjectStore creates an ObjectWriter implementation that uses Cloud Storage to store its objects.

func (*ObjectStore) Write

func (s *ObjectStore) Write(ctx context.Context, content io.Reader, objectDescriptor string) error

Write writes an object to Google Cloud Storage.

type ObjectWriter

type ObjectWriter interface {
	Write(ctx context.Context, content io.Reader, descriptor string) error
}

ObjectWriter is an interface for writing an object/blob to a storage medium.
