Documentation ¶
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Message ¶
type Message struct {
	Role      Role   `json:"role"`
	Content   string `json:"content"`
	Reasoning string `json:"reasoning"`
	// Assistant tool calls.
	ToolCalls []ToolCall `json:"tool_calls"`
	// OpenAI: An optional name for the participant. Provides the model information to differentiate between participants of the same role.
	Name string `json:"name"`
	// OpenAI: Assistant refusal message.
	Refusal string `json:"refusal"`
	// Anthropic: User boolean indicating whether a function call resulted in an error.
	IsErr bool `json:"is_err"`
}
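A minimal sketch of building a conversation with this type. The import path "example.com/llm" is a placeholder, and the string role values assume Role is a string-based type whose constants are defined elsewhere:

package main

import (
	"fmt"

	llm "example.com/llm" // placeholder import path for this package
)

func main() {
	// "user" and "assistant" are assumed role values; substitute the
	// package's own Role constants if it defines them.
	msgs := []llm.Message{
		{Role: "user", Content: "Summarize the release notes in two sentences."},
		{Role: "assistant", Content: "The release adds streaming metrics and fixes cache handling."},
	}
	fmt.Println(msgs[1].Content)
}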
type Options ¶
type Options struct {
	// OpenAI: Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
	FrequencyPenalty *float64
	// OpenAI: Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the `content` of `message`.
	Logprobs bool
	// OpenAI: An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and [reasoning tokens](https://platform.openai.com/docs/guides/reasoning).
	MaxCompletionTokens uint
	// The maximum number of tokens to generate before stopping.
	MaxTokens uint
	// OpenAI: How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep `n` as `1` to minimize costs.
	// N *uint
	// OpenAI: Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
	PresencePenalty float64
	// OpenAI: This feature is in Beta. If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result. Determinism is not guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.
	Seed *int64
	// OpenAI: What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. We generally recommend altering this or `top_p` but not both.
	// Anthropic: Amount of randomness injected into the response. Defaults to `1.0`. Ranges from `0.0` to `1.0`. Use `temperature` closer to `0.0` for analytical / multiple choice, and closer to `1.0` for creative and generative tasks. Note that even with `temperature` of `0.0`, the results will not be fully deterministic.
	Temperature *float64
	// OpenAI: An integer between 0 and 20 specifying the number of most likely tokens to return at each token position, each with an associated log probability. `logprobs` must be set to `true` if this parameter is used.
	TopLogprobs *int32
	TopP *float64
	TopK *float32
	// OpenAI: Whether to enable [parallel function calling](https://platform.openai.com/docs/guides/function-calling#configuring-parallel-function-calling) during tool use.
	ParallelToolCalls *bool
	// OpenAI: Cache responses for similar requests to optimize your cache hit rates. Replaces the `user` field.
	// Google: Resource name of a context cache that can be used in subsequent requests.
	PromptCacheKey string
	// OpenAI: A stable identifier used to help detect users of your application that may be violating OpenAI's usage policies. The IDs should be a string that uniquely identifies each user. We recommend hashing their username or email address, in order to avoid sending us any identifying information. [Learn more](https://platform.openai.com/docs/guides/safety-best-practices#safety-identifiers).
	SafetyIdentifier string
	// OpenAI: This field is being replaced by `safety_identifier` and `prompt_cache_key`. Use `prompt_cache_key` instead to maintain caching optimizations. A stable identifier for your end-users. Used to boost cache hit rates by better bucketing similar requests and to help OpenAI detect and prevent abuse. [Learn more](https://platform.openai.com/docs/guides/safety-best-practices#safety-identifiers).
	User string
	// OpenAI: Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase the likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.
	LogitBias map[string]int64
	// OpenAI: **o-series models only** Constrains effort on reasoning for [reasoning models](https://platform.openai.com/docs/guides/reasoning). (Accepts [ReasoningEffortUnion.ofString])
	// Ollama: Think controls whether thinking/reasoning models will think before responding. (Accepts both [ReasoningEffortUnion.ofString] and [ReasoningEffortUnion.ofBool])
	ReasoningEffort *ReasoningEffortUnion
	// OpenAI: Specifies the processing type used for serving the request. Any of "auto", "default", "flex", "scale", "priority".
	// Anthropic: Any of "auto", "standard_only".
	ServiceTier string
	// OpenAI: Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.
	// Anthropic: Custom text sequences that will cause the model to stop generating.
	Stop []string
	// Schema specifying the format that the model must output.
	//
	// *Supported providers*: OpenAI, Ollama
	ResponseFormat *jsonschema.Schema
	// Anthropic: Enable extended thinking. Must be ≥ 1024 and less than `max_tokens`.
	Thinking uint64
	// Ollama: How long the model will stay loaded in memory following the request.
	KeepAlive *time.Duration
	// Google:
	//   - `text/plain` (default)
	//   - `application/json`
	ResponseMIMEType string
	// OpenAI: Include usage statistics in streaming mode.
	IncludeStreamMetrics bool
	// l337: Controls channel buffering for streaming responses. Defaults to `0` (unbuffered).
	StreamingBufferSize int
}
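A sketch of populating a subset of Options. The ptr helper and the import path are local to the example, not part of this package; field choices follow the comments above:

package main

import (
	"time"

	llm "example.com/llm" // placeholder import path
)

// ptr is a local helper for the pointer-typed option fields.
func ptr[T any](v T) *T { return &v }

func main() {
	keepAlive := 5 * time.Minute
	opts := llm.Options{
		MaxTokens:        1024,
		Temperature:      ptr(0.2), // lower values give more deterministic output
		TopP:             ptr(0.9),
		FrequencyPenalty: ptr(0.5),
		Stop:             []string{"\n\n"},
		ReasoningEffort:  llm.NewReasoningEffortLevel(llm.ReasoningEffortLow),
		KeepAlive:        &keepAlive, // Ollama: keep the model loaded after the request
	}
	_ = opts // pass opts to whatever client call this package exposes
}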
type Parameter ¶
type Parameter interface {
	Apply(*Parameters) error
}
func WithMessage ¶
func WithSessionID ¶
type ParameterFunc ¶
type ParameterFunc func(*Parameters) error
func (ParameterFunc) Apply ¶
func (s ParameterFunc) Apply(r *Parameters) error
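Because ParameterFunc has an Apply method, any plain function with this signature satisfies Parameter, in the style of WithMessage and WithSessionID above. A sketch, with the import path as a placeholder and the function body left empty because the fields of Parameters are not documented on this page:

package main

import llm "example.com/llm" // placeholder import path

// noopParam is a hypothetical custom option built from ParameterFunc.
// It demonstrates the adapter pattern only; it does not modify Parameters.
var noopParam llm.Parameter = llm.ParameterFunc(func(p *llm.Parameters) error {
	// Mutate p here as needed.
	return nil
})

func main() {
	var params llm.Parameters
	_ = noopParam.Apply(&params) // Apply forwards to the wrapped function
}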
type Parameters ¶
type ReasoningEffortLevel ¶
type ReasoningEffortLevel string
const (
	ReasoningEffortLow    ReasoningEffortLevel = "low"
	ReasoningEffortMedium ReasoningEffortLevel = "medium"
	ReasoningEffortHigh   ReasoningEffortLevel = "high"
)
type ReasoningEffortUnion ¶
type ReasoningEffortUnion struct {
	// contains filtered or unexported fields
}
Only one of the two variants (a ReasoningEffortLevel or a boolean) can be set at a time.
func NewReasoningEffortBool ¶
func NewReasoningEffortBool(enabled bool) *ReasoningEffortUnion
func NewReasoningEffortLevel ¶
func NewReasoningEffortLevel(level ReasoningEffortLevel) *ReasoningEffortUnion
func (*ReasoningEffortUnion) AsAny ¶
func (r *ReasoningEffortUnion) AsAny() any
func (*ReasoningEffortUnion) AsBool ¶
func (r *ReasoningEffortUnion) AsBool() (bool, bool)
func (*ReasoningEffortUnion) AsLevel ¶
func (r *ReasoningEffortUnion) AsLevel() (ReasoningEffortLevel, bool)
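A sketch of constructing and inspecting a ReasoningEffortUnion (placeholder import path). Since only one variant can be set, the accessor for the other variant reports false:

package main

import (
	"fmt"

	llm "example.com/llm" // placeholder import path
)

func main() {
	effort := llm.NewReasoningEffortLevel(llm.ReasoningEffortHigh)
	if level, ok := effort.AsLevel(); ok {
		fmt.Println("level:", level) // level: high
	}
	if _, ok := effort.AsBool(); !ok {
		fmt.Println("no boolean variant set on this value")
	}

	// Boolean form, e.g. for Ollama's think switch.
	think := llm.NewReasoningEffortBool(true)
	fmt.Println(think.AsAny()) // true
}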
type RunResponse ¶
type RunResponse struct {
	SessionID uuid.UUID                       `json:"session_id"`
	Messages  []Message                       `json:"messages"`
	Metrics   map[uuid.UUID][]metrics.Metrics `json:"metrics"`
}
func (*RunResponse) Content ¶
func (r *RunResponse) Content() string
Content returns the content of the last message in the response.
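A sketch of consuming a RunResponse produced by some client call not shown on this page; printRun is a hypothetical helper and the import path is a placeholder:

package example

import (
	"fmt"

	llm "example.com/llm" // placeholder import path
)

// printRun prints the final answer followed by the full transcript.
func printRun(resp *llm.RunResponse) {
	fmt.Println("session:", resp.SessionID)
	fmt.Println("answer:", resp.Content()) // content of the last message

	for i, msg := range resp.Messages {
		fmt.Printf("%d. %v: %s\n", i+1, msg.Role, msg.Content)
	}
}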