evaluate

package

v0.5.0 (Latest) Published: Jun 5, 2024 License: MIT Imports: 15 Imported by: 0

Repository

github.com/symflower/eval-dev-quality

Documentation

Index

Constants
Variables
func Evaluate(ctx *Context) (assessments report.AssessmentPerModelPerLanguagePerRepository, ...)
func Repository(logger *log.Logger, resultPath string, model evalmodel.Model, ...) (repositoryAssessment metrics.Assessments, problems []error, err error)
func ResetTemporaryRepository(logger *log.Logger, path string) (err error)
func TemporaryRepository(logger *log.Logger, dataPath string) (temporaryRepositoryPath string, cleanup func(), err error)
type Context

Constants

const RepositoryPlainName = "plain"

RepositoryPlainName holds the name of the plain repository.

Variables

var Version = "0.4.0"

Version holds the current version of the evaluation benchmark.

Functions

func Evaluate (added in v0.5.0)

func Evaluate(ctx *Context) (assessments report.AssessmentPerModelPerLanguagePerRepository, totalScore uint64)

Evaluate runs an evaluation on the given context and returns its results.

func Repository (added in v0.5.0)

func Repository(logger *log.Logger, resultPath string, model evalmodel.Model, language language.Language, testDataPath string, repositoryName string) (repositoryAssessment metrics.Assessments, problems []error, err error)

Repository evaluates a repository with the given model and language.

func ResetTemporaryRepository (added in v0.5.0)

func ResetTemporaryRepository(logger *log.Logger, path string) (err error)

ResetTemporaryRepository resets a temporary repository back to its "initial" commit.

func TemporaryRepository (added in v0.5.0)

func TemporaryRepository(logger *log.Logger, dataPath string) (temporaryRepositoryPath string, cleanup func(), err error)

TemporaryRepository creates a temporary repository and initializes a git repo in it.

Types

type Context (added in v0.5.0)

type Context struct {
	// Log holds the logger of the context.
	Log *log.Logger

	// Languages determines which languages should be used for the evaluation, or empty if all languages should be used.
	Languages []evallanguage.Language

	// Models determines which models should be used for the evaluation, or empty if all models should be used.
	Models []evalmodel.Model
	// ProviderForModel holds the models and their associated provider.
	ProviderForModel map[evalmodel.Model]provider.Provider
	// QueryAttempts holds the number of query attempts to perform when a model request errors in the process of solving a task.
	QueryAttempts uint

	// RepositoryPaths determines which relative repository paths should be used for the evaluation, or empty if all repositories should be used.
	RepositoryPaths []string
	// ResultPath holds the directory path where results should be written to.
	ResultPath string
	// TestdataPath determines the testdata path where all repositories reside grouped by languages.
	TestdataPath string

	// Runs holds the number of runs to perform.
	Runs uint
	// RunsSequential indicates that interleaved runs are disabled and runs are performed sequentially.
	RunsSequential bool
	// NoDisqualification indicates that models are not to be disqualified if they fail to solve basic language tasks.
	NoDisqualification bool
}

Context holds an evaluation context.

Directories

Path	Synopsis
metrics
testing
report