Documentation
Index
Constants
This section is empty.
Variables
var ErrNotAccountClient = errors.New("invalid Databricks Account configuration")
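A short sketch of checking for this sentinel error; it assumes, though this page does not state it explicitly, that NewAccountClient returns ErrNotAccountClient when the resolved configuration is not an account-level configuration:

package main

import (
	"errors"
	"log"

	"github.com/databricks/databricks-sdk-go"
)

func main() {
	a, err := databricks.NewAccountClient()
	if errors.Is(err, databricks.ErrNotAccountClient) {
		// Assumption: this branch is reached when the configuration points at a
		// workspace host or lacks an account ID.
		log.Fatalf("account-level configuration required: %v", err)
	}
	if err != nil {
		log.Fatal(err)
	}
	_ = a
}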
Functions
func Must
func Must[T any](c T, err error) T
Must panics if error is not nil. It's intended to be used with databricks.NewWorkspaceClient and databricks.NewAccountClient.
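A minimal sketch of typical usage, assuming credentials are resolved from the environment (for example, DATABRICKS_HOST and DATABRICKS_TOKEN):

package main

import (
	"github.com/databricks/databricks-sdk-go"
)

func main() {
	// Must panics if NewWorkspaceClient returns an error, so the happy path
	// needs no explicit error handling.
	w := databricks.Must(databricks.NewWorkspaceClient())
	_ = w // use w.Clusters, w.Jobs, w.CurrentUser, etc.
}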
func WithProduct
func WithProduct(name, version string)
WithProduct is expected to be set by developers to differentiate their app from others.
Example setting is:
func init() {
	databricks.WithProduct("your-product", "0.0.1")
}
Types
type AccountClient
type AccountClient struct {
Config *config.Config
// This API allows you to download billable usage logs for the specified
// account and date range. This feature works with all account types.
BillableUsage *billing.BillableUsageAPI
// These APIs manage budget configuration including notifications for
// exceeding a budget for a period. They can also retrieve the status of
// each budget.
Budgets *billing.BudgetsAPI
// These APIs manage credential configurations for this workspace.
// Databricks needs access to a cross-account service IAM role in your AWS
// account so that Databricks can deploy clusters in the appropriate VPC for
// the new workspace. A credential configuration encapsulates this role
// information, and its ID is used when creating a new workspace.
Credentials *provisioning.CredentialsAPI
	// These APIs enable administrators to manage custom OAuth app integrations,
	// which are required for adding and using custom OAuth app integrations such
	// as Tableau Cloud for Databricks in the AWS cloud.
//
// **Note:** You can only add/use the OAuth custom application integrations
// when OAuth enrollment status is enabled. For more details see
// :method:OAuthEnrollment/create
CustomAppIntegration *oauth2.CustomAppIntegrationAPI
// These APIs manage encryption key configurations for this workspace
// (optional). A key configuration encapsulates the AWS KMS key information
// and some information about how the key configuration can be used. There
// are two possible uses for key configurations:
//
// * Managed services: A key configuration can be used to encrypt a
// workspace's notebook and secret data in the control plane, as well as
// Databricks SQL queries and query history. * Storage: A key configuration
// can be used to encrypt a workspace's DBFS and EBS data in the data plane.
//
// In both of these cases, the key configuration's ID is used when creating
// a new workspace. This Preview feature is available if your account is on
// the E2 version of the platform. Updating a running workspace with
// workspace storage encryption requires that the workspace is on the E2
// version of the platform. If you have an older workspace, it might not be
// on the E2 version of the platform. If you are not sure, contact your
// Databricks representative.
EncryptionKeys *provisioning.EncryptionKeysAPI
// Groups simplify identity management, making it easier to assign access to
// Databricks Account, data, and other securable objects.
//
// It is best practice to assign access to workspaces and access-control
// policies in Unity Catalog to groups, instead of to users individually.
// All Databricks Account identities can be assigned as members of groups,
// and members inherit permissions that are assigned to their group.
Groups *iam.AccountGroupsAPI
// The Accounts IP Access List API enables account admins to configure IP
// access lists for access to the account console.
//
// Account IP Access Lists affect web application access and REST API access
// to the account console and account APIs. If the feature is disabled for
// the account, all access is allowed for this account. There is support for
// allow lists (inclusion) and block lists (exclusion).
//
// When a connection is attempted: 1. **First, all block lists are
// checked.** If the connection IP address matches any block list, the
// connection is rejected. 2. **If the connection was not rejected by block
// lists**, the IP address is compared with the allow lists.
//
// If there is at least one allow list for the account, the connection is
// allowed only if the IP address matches an allow list. If there are no
// allow lists for the account, all IP addresses are allowed.
//
// For all allow lists and block lists combined, the account supports a
// maximum of 1000 IP/CIDR values, where one CIDR counts as a single value.
//
// After changes to the account-level IP access lists, it can take a few
// minutes for changes to take effect.
IpAccessLists *settings.AccountIpAccessListsAPI
// These APIs manage log delivery configurations for this account. The two
// supported log types for this API are _billable usage logs_ and _audit
// logs_. This feature is in Public Preview. This feature works with all
// account ID types.
//
// Log delivery works with all account types. However, if your account is on
// the E2 version of the platform or on a select custom plan that allows
// multiple workspaces per account, you can optionally configure different
// storage destinations for each workspace. Log delivery status is also
// provided to know the latest status of log delivery attempts. The
// high-level flow of billable usage delivery:
//
// 1. **Create storage**: In AWS, [create a new AWS S3 bucket] with a
// specific bucket policy. Using Databricks APIs, call the Account API to
// create a [storage configuration object](#operation/create-storage-config)
// that uses the bucket name. 2. **Create credentials**: In AWS, create the
// appropriate AWS IAM role. For full details, including the required IAM
// role policies and trust relationship, see [Billable usage log delivery].
// Using Databricks APIs, call the Account API to create a [credential
// configuration object](#operation/create-credential-config) that uses the
// IAM role's ARN. 3. **Create log delivery configuration**: Using
// Databricks APIs, call the Account API to [create a log delivery
// configuration](#operation/create-log-delivery-config) that uses the
// credential and storage configuration objects from previous steps. You can
// specify if the logs should include all events of that log type in your
// account (_Account level_ delivery) or only events for a specific set of
// workspaces (_workspace level_ delivery). Account level log delivery
// applies to all current and future workspaces plus account level logs,
// while workspace level log delivery solely delivers logs related to the
// specified workspaces. You can create multiple types of delivery
// configurations per account.
//
// For billable usage delivery: * For more information about billable usage
// logs, see [Billable usage log delivery]. For the CSV schema, see the
// [Usage page]. * The delivery location is
// `<bucket-name>/<prefix>/billable-usage/csv/`, where `<prefix>` is the
// name of the optional delivery path prefix you set up during log delivery
// configuration. Files are named
// `workspaceId=<workspace-id>-usageMonth=<month>.csv`. * All billable usage
// logs apply to specific workspaces (_workspace level_ logs). You can
// aggregate usage for your entire account by creating an _account level_
// delivery configuration that delivers logs for all current and future
// workspaces in your account. * The files are delivered daily by
// overwriting the month's CSV file for each workspace.
//
	// For audit log delivery: * For more information about audit log
// delivery, see [Audit log delivery], which includes information about the
// used JSON schema. * The delivery location is
// `<bucket-name>/<delivery-path-prefix>/workspaceId=<workspaceId>/date=<yyyy-mm-dd>/auditlogs_<internal-id>.json`.
// Files may get overwritten with the same content multiple times to achieve
// exactly-once delivery. * If the audit log delivery configuration included
// specific workspace IDs, only _workspace-level_ audit logs for those
// workspaces are delivered. If the log delivery configuration applies to
// the entire account (_account level_ delivery configuration), the audit
// log delivery includes workspace-level audit logs for all workspaces in
// the account as well as account-level audit logs. See [Audit log delivery]
// for details. * Auditable events are typically available in logs within 15
// minutes.
//
// [Audit log delivery]: https://docs.databricks.com/administration-guide/account-settings/audit-logs.html
// [Billable usage log delivery]: https://docs.databricks.com/administration-guide/account-settings/billable-usage-delivery.html
// [Usage page]: https://docs.databricks.com/administration-guide/account-settings/usage.html
// [create a new AWS S3 bucket]: https://docs.databricks.com/administration-guide/account-api/aws-storage.html
LogDelivery *billing.LogDeliveryAPI
// These APIs manage metastore assignments to a workspace.
MetastoreAssignments *catalog.AccountMetastoreAssignmentsAPI
// These APIs manage Unity Catalog metastores for an account. A metastore
	// contains catalogs that can be associated with workspaces.
Metastores *catalog.AccountMetastoresAPI
	// These APIs manage network configurations for customer-managed VPCs
	// (optional). A network configuration's ID is used when creating a new
	// workspace if you use customer-managed VPCs.
Networks *provisioning.NetworksAPI
// These APIs enable administrators to enroll OAuth for their accounts,
// which is required for adding/using any OAuth published/custom application
// integration.
//
	// **Note:** Your account must be on the E2 version to use these APIs because
	// OAuth is only supported on the E2 version.
OAuthEnrollment *oauth2.OAuthEnrollmentAPI
// These APIs manage private access settings for this account.
PrivateAccess *provisioning.PrivateAccessAPI
	// These APIs enable administrators to manage published OAuth app
	// integrations, which are required for adding and using published OAuth app
	// integrations such as Tableau Cloud for Databricks in the AWS cloud.
//
// **Note:** You can only add/use the OAuth published application
// integrations when OAuth enrollment status is enabled. For more details
// see :method:OAuthEnrollment/create
PublishedAppIntegration *oauth2.PublishedAppIntegrationAPI
// Identities for use with jobs, automated tools, and systems such as
// scripts, apps, and CI/CD platforms. Databricks recommends creating
// service principals to run production jobs or modify production data. If
// all processes that act on production data run with service principals,
// interactive users do not need any write, delete, or modify privileges in
// production. This eliminates the risk of a user overwriting production
// data by accident.
ServicePrincipals *iam.AccountServicePrincipalsAPI
// These APIs manage storage configurations for this workspace. A root
// storage S3 bucket in your account is required to store objects like
// cluster logs, notebook revisions, and job results. You can also use the
// root storage S3 bucket for storage of non-production DBFS data. A storage
// configuration encapsulates this bucket information, and its ID is used
// when creating a new workspace.
Storage *provisioning.StorageAPI
// These APIs manage storage credentials for a particular metastore.
StorageCredentials *catalog.AccountStorageCredentialsAPI
// User identities recognized by Databricks and represented by email
// addresses.
//
// Databricks recommends using SCIM provisioning to sync users and groups
// automatically from your identity provider to your Databricks Account.
// SCIM streamlines onboarding a new employee or team by using your identity
// provider to create users and groups in Databricks Account and give them
// the proper level of access. When a user leaves your organization or no
// longer needs access to Databricks Account, admins can terminate the user
// in your identity provider and that user’s account will also be removed
// from Databricks Account. This ensures a consistent offboarding process
// and prevents unauthorized users from accessing sensitive data.
Users *iam.AccountUsersAPI
// These APIs manage VPC endpoint configurations for this account.
VpcEndpoints *provisioning.VpcEndpointsAPI
// The Workspace Permission Assignment API allows you to manage workspace
// permissions for principals in your account.
WorkspaceAssignment *iam.WorkspaceAssignmentAPI
// These APIs manage workspaces for this account. A Databricks workspace is
// an environment for accessing all of your Databricks assets. The workspace
// organizes objects (notebooks, libraries, and experiments) into folders,
// and provides access to data and computational resources such as clusters
// and jobs.
//
// These endpoints are available if your account is on the E2 version of the
// platform or on a select custom plan that allows multiple workspaces per
// account.
Workspaces *provisioning.WorkspacesAPI
}
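A brief sketch of calling one of the service fields above; it assumes account-level credentials are configured in the environment and that the provisioning Workspaces API exposes a List method with this shape (check the provisioning package for the exact signature):

package main

import (
	"context"
	"fmt"

	"github.com/databricks/databricks-sdk-go"
)

func main() {
	ctx := context.Background()
	a := databricks.Must(databricks.NewAccountClient())

	// Assumed shape: List returns every workspace in the account.
	all, err := a.Workspaces.List(ctx)
	if err != nil {
		panic(err)
	}
	for _, ws := range all {
		fmt.Println(ws.WorkspaceName)
	}
}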
func NewAccountClient
func NewAccountClient(c ...*Config) (*AccountClient, error)
NewAccountClient creates a new Databricks SDK client for Accounts or returns an error if the configuration is wrong.
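A hedged sketch of passing an explicit configuration instead of relying on environment variables; the accounts host shown is the AWS accounts console, and the account ID is a placeholder:

package main

import (
	"github.com/databricks/databricks-sdk-go"
)

func main() {
	a, err := databricks.NewAccountClient(&databricks.Config{
		Host:      "https://accounts.cloud.databricks.com",
		AccountID: "your-account-id", // placeholder
	})
	if err != nil {
		panic(err)
	}
	_ = a
}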
type WorkspaceClient
type WorkspaceClient struct {
Config *config.Config
// The alerts API can be used to perform CRUD operations on alerts. An alert
// is a Databricks SQL object that periodically runs a query, evaluates a
// condition of its result, and notifies one or more users and/or
// notification destinations if the condition was met.
Alerts *sql.AlertsAPI
// A catalog is the first layer of Unity Catalog’s three-level namespace.
// It’s used to organize your data assets. Users can see all catalogs on
// which they have been assigned the USE_CATALOG data permission.
//
// In Unity Catalog, admins and data stewards manage users and their access
// to data centrally across all of the workspaces in a Databricks account.
// Users in different workspaces can share access to the same data,
// depending on privileges granted centrally in Unity Catalog.
Catalogs *catalog.CatalogsAPI
// Cluster policy limits the ability to configure clusters based on a set of
// rules. The policy rules limit the attributes or attribute values
// available for cluster creation. Cluster policies have ACLs that limit
// their use to specific users and groups.
//
	// Cluster policies let you limit users to creating clusters with prescribed
	// settings, simplify the user interface, enable more users to create their
	// own clusters (by fixing and hiding some values), and control cost by
	// limiting the maximum cost per cluster (by setting limits on attributes
	// whose values contribute to the hourly price).
//
// Cluster policy permissions limit which policies a user can select in the
// Policy drop-down when the user creates a cluster: - A user who has
// cluster create permission can select the Unrestricted policy and create
// fully-configurable clusters. - A user who has both cluster create
// permission and access to cluster policies can select the Unrestricted
	// policy and policies they have access to. - A user who has access only to
	// cluster policies can select the policies they have access to.
//
// If no policies have been created in the workspace, the Policy drop-down
// does not display.
//
// Only admin users can create, edit, and delete policies. Admin users also
// have access to all policies.
ClusterPolicies *compute.ClusterPoliciesAPI
// The Clusters API allows you to create, start, edit, list, terminate, and
// delete clusters.
//
// Databricks maps cluster node instance types to compute units known as
// DBUs. See the instance type pricing page for a list of the supported
// instance types and their corresponding DBUs.
//
// A Databricks cluster is a set of computation resources and configurations
// on which you run data engineering, data science, and data analytics
// workloads, such as production ETL pipelines, streaming analytics, ad-hoc
// analytics, and machine learning.
//
// You run these workloads as a set of commands in a notebook or as an
// automated job. Databricks makes a distinction between all-purpose
// clusters and job clusters. You use all-purpose clusters to analyze data
// collaboratively using interactive notebooks. You use job clusters to run
// fast and robust automated jobs.
//
// You can create an all-purpose cluster using the UI, CLI, or REST API. You
// can manually terminate and restart an all-purpose cluster. Multiple users
// can share such clusters to do collaborative interactive analysis.
//
// IMPORTANT: Databricks retains cluster configuration information for up to
// 200 all-purpose clusters terminated in the last 30 days and up to 30 job
// clusters recently terminated by the job scheduler. To keep an all-purpose
// cluster configuration even after it has been terminated for more than 30
// days, an administrator can pin a cluster to the cluster list.
Clusters *compute.ClustersAPI
// This API allows executing commands on running clusters.
CommandExecutor compute.CommandExecutor
	// This API allows retrieving information about the currently authenticated
	// user or service principal.
CurrentUser *iam.CurrentUserAPI
// In general, there is little need to modify dashboards using the API.
// However, it can be useful to use dashboard objects to look-up a
// collection of related query IDs. The API can also be used to duplicate
// multiple dashboards at once since you can get a dashboard definition with
// a GET request and then POST it to create a new one.
Dashboards *sql.DashboardsAPI
// This API is provided to assist you in making new query objects. When
// creating a query object, you may optionally specify a `data_source_id`
// for the SQL warehouse against which it will run. If you don't already
// know the `data_source_id` for your desired SQL warehouse, this API will
// help you find it.
//
// This API does not support searches. It returns the full list of SQL
// warehouses in your workspace. We advise you to use any text editor, REST
// client, or `grep` to search the response from this API for the name of
// your SQL warehouse as it appears in Databricks SQL.
DataSources *sql.DataSourcesAPI
	// The DBFS API makes it simple to interact with various data sources without
	// having to include a user's credentials every time you read a file.
Dbfs *files.DbfsAPI
	// The SQL Permissions API is similar to the :method:permissions/set
	// endpoints. However, it exposes only one endpoint, which
// gets the Access Control List for a given object. You cannot modify any
// permissions using this API.
//
// There are three levels of permission:
//
// - `CAN_VIEW`: Allows read-only access
//
// - `CAN_RUN`: Allows read access and run access (superset of `CAN_VIEW`)
//
// - `CAN_MANAGE`: Allows all actions: read, run, edit, delete, modify
// permissions (superset of `CAN_RUN`)
DbsqlPermissions *sql.DbsqlPermissionsAPI
Experiments *ml.ExperimentsAPI
// An external location is an object that combines a cloud storage path with
// a storage credential that authorizes access to the cloud storage path.
// Each external location is subject to Unity Catalog access-control
// policies that control which users and groups can access the credential.
// If a user does not have access to an external location in Unity Catalog,
// the request fails and Unity Catalog does not attempt to authenticate to
// your cloud tenant on the user’s behalf.
//
// Databricks recommends using external locations rather than using storage
// credentials directly.
//
// To create external locations, you must be a metastore admin or a user
// with the **CREATE_EXTERNAL_LOCATION** privilege.
ExternalLocations *catalog.ExternalLocationsAPI
// Functions implement User-Defined Functions (UDFs) in Unity Catalog.
//
// The function implementation can be any SQL expression or Query, and it
// can be invoked wherever a table reference is allowed in a query. In Unity
// Catalog, a function resides at the same level as a table, so it can be
// referenced with the form
// __catalog_name__.__schema_name__.__function_name__.
Functions *catalog.FunctionsAPI
	// Registers a personal access token for Databricks to perform operations on
	// behalf of the user.
//
// See [more info].
//
// [more info]: https://docs.databricks.com/repos/get-access-tokens-from-git-provider.html
GitCredentials *workspace.GitCredentialsAPI
// The Global Init Scripts API enables Workspace administrators to configure
// global initialization scripts for their workspace. These scripts run on
// every node in every cluster in the workspace.
//
// **Important:** Existing clusters must be restarted to pick up any changes
// made to global init scripts. Global init scripts are run in order. If the
// init script returns with a bad exit code, the Apache Spark container
// fails to launch and init scripts with later position are skipped. If
// enough containers fail, the entire cluster fails with a
// `GLOBAL_INIT_SCRIPT_FAILURE` error code.
GlobalInitScripts *compute.GlobalInitScriptsAPI
// In Unity Catalog, data is secure by default. Initially, users have no
// access to data in a metastore. Access can be granted by either a
// metastore admin, the owner of an object, or the owner of the catalog or
// schema that contains the object. Securable objects in Unity Catalog are
// hierarchical and privileges are inherited downward.
//
// Securable objects in Unity Catalog are hierarchical and privileges are
// inherited downward. This means that granting a privilege on the catalog
// automatically grants the privilege to all current and future objects
// within the catalog. Similarly, privileges granted on a schema are
// inherited by all current and future objects within that schema.
Grants *catalog.GrantsAPI
// Groups simplify identity management, making it easier to assign access to
// Databricks Workspace, data, and other securable objects.
//
// It is best practice to assign access to workspaces and access-control
// policies in Unity Catalog to groups, instead of to users individually.
// All Databricks Workspace identities can be assigned as members of groups,
// and members inherit permissions that are assigned to their group.
Groups *iam.GroupsAPI
	// The Instance Pools API is used to create, edit, delete, and list instance
	// pools, which use ready-to-use cloud instances to reduce cluster start and
	// auto-scaling times.
//
// Databricks pools reduce cluster start and auto-scaling times by
// maintaining a set of idle, ready-to-use instances. When a cluster is
// attached to a pool, cluster nodes are created using the pool’s idle
// instances. If the pool has no idle instances, the pool expands by
// allocating a new instance from the instance provider in order to
// accommodate the cluster’s request. When a cluster releases an instance,
// it returns to the pool and is free for another cluster to use. Only
// clusters attached to a pool can use that pool’s idle instances.
//
// You can specify a different pool for the driver node and worker nodes, or
// use the same pool for both.
//
// Databricks does not charge DBUs while instances are idle in the pool.
// Instance provider billing does apply. See pricing.
InstancePools *compute.InstancePoolsAPI
// The Instance Profiles API allows admins to add, list, and remove instance
// profiles that users can launch clusters with. Regular users can list the
	// instance profiles available to them. See [Secure access to S3 buckets] for
	// more information about using instance profiles.
//
// [Secure access to S3 buckets]: https://docs.databricks.com/administration-guide/cloud-configurations/aws/instance-profiles.html
InstanceProfiles *compute.InstanceProfilesAPI
	// The IP Access List API enables admins to configure IP access lists.
//
// IP access lists affect web application access and REST API access to this
// workspace only. If the feature is disabled for a workspace, all access is
// allowed for this workspace. There is support for allow lists (inclusion)
// and block lists (exclusion).
//
// When a connection is attempted: 1. **First, all block lists are
// checked.** If the connection IP address matches any block list, the
// connection is rejected. 2. **If the connection was not rejected by block
// lists**, the IP address is compared with the allow lists.
//
// If there is at least one allow list for the workspace, the connection is
// allowed only if the IP address matches an allow list. If there are no
// allow lists for the workspace, all IP addresses are allowed.
//
// For all allow lists and block lists combined, the workspace supports a
// maximum of 1000 IP/CIDR values, where one CIDR counts as a single value.
//
// After changes to the IP access list feature, it can take a few minutes
// for changes to take effect.
IpAccessLists *settings.IpAccessListsAPI
// The Jobs API allows you to create, edit, and delete jobs.
//
// You can use a Databricks job to run a data processing or data analysis
// task in a Databricks cluster with scalable resources. Your job can
// consist of a single task or can be a large, multi-task workflow with
// complex dependencies. Databricks manages the task orchestration, cluster
// management, monitoring, and error reporting for all of your jobs. You can
// run your jobs immediately or periodically through an easy-to-use
	// scheduling system. You can implement job tasks using notebooks, JARs,
	// Delta Live Tables pipelines, or Python, Scala, Spark submit, and Java
// applications.
//
// You should never hard code secrets or store them in plain text. Use the
// :service:secrets to manage secrets in the [Databricks CLI]. Use the
// [Secrets utility] to reference secrets in notebooks and jobs.
//
// [Databricks CLI]: https://docs.databricks.com/dev-tools/cli/index.html
// [Secrets utility]: https://docs.databricks.com/dev-tools/databricks-utils.html#dbutils-secrets
Jobs *jobs.JobsAPI
// The Libraries API allows you to install and uninstall libraries and get
// the status of libraries on a cluster.
//
// To make third-party or custom code available to notebooks and jobs
// running on your clusters, you can install a library. Libraries can be
// written in Python, Java, Scala, and R. You can upload Java, Scala, and
// Python libraries and point to external packages in PyPI, Maven, and CRAN
// repositories.
//
// Cluster libraries can be used by all notebooks running on a cluster. You
// can install a cluster library directly from a public repository such as
// PyPI or Maven, using a previously installed workspace library, or using
// an init script.
//
// When you install a library on a cluster, a notebook already attached to
// that cluster will not immediately see the new library. You must first
// detach and then reattach the notebook to the cluster.
//
// When you uninstall a library from a cluster, the library is removed only
// when you restart the cluster. Until you restart the cluster, the status
// of the uninstalled library appears as Uninstall pending restart.
Libraries *compute.LibrariesAPI
// A metastore is the top-level container of objects in Unity Catalog. It
// stores data assets (tables and views) and the permissions that govern
// access to them. Databricks account admins can create metastores and
// assign them to Databricks workspaces to control which workloads use each
// metastore. For a workspace to use Unity Catalog, it must have a Unity
// Catalog metastore attached.
//
// Each metastore is configured with a root storage location in a cloud
// storage account. This storage location is used for metadata and managed
// tables data.
//
// NOTE: This metastore is distinct from the metastore included in
// Databricks workspaces created before Unity Catalog was released. If your
// workspace includes a legacy Hive metastore, the data in that metastore is
// available in a catalog named hive_metastore.
Metastores *catalog.MetastoresAPI
ModelRegistry *ml.ModelRegistryAPI
	// The Permissions API is used to create, read, write, edit, update, and
	// manage access for various users on different objects and endpoints.
Permissions *iam.PermissionsAPI
// The Delta Live Tables API allows you to create, edit, delete, start, and
// view details about pipelines.
//
// Delta Live Tables is a framework for building reliable, maintainable, and
// testable data processing pipelines. You define the transformations to
// perform on your data, and Delta Live Tables manages task orchestration,
// cluster management, monitoring, data quality, and error handling.
//
// Instead of defining your data pipelines using a series of separate Apache
// Spark tasks, Delta Live Tables manages how your data is transformed based
// on a target schema you define for each processing step. You can also
// enforce data quality with Delta Live Tables expectations. Expectations
// allow you to define expected data quality and specify how to handle
// records that fail those expectations.
Pipelines *pipelines.PipelinesAPI
// View available policy families. A policy family contains a policy
// definition providing best practices for configuring clusters for a
// particular use case.
//
// Databricks manages and provides policy families for several common
// cluster use cases. You cannot create, edit, or delete policy families.
//
// Policy families cannot be used directly to create clusters. Instead, you
// create cluster policies using a policy family. Cluster policies created
// using a policy family inherit the policy family's policy definition.
PolicyFamilies *compute.PolicyFamiliesAPI
// Databricks Providers REST API
Providers *sharing.ProvidersAPI
// These endpoints are used for CRUD operations on query definitions. Query
// definitions include the target SQL warehouse, query text, name,
// description, tags, parameters, and visualizations.
Queries *sql.QueriesAPI
// Access the history of queries through SQL warehouses.
QueryHistory *sql.QueryHistoryAPI
// Databricks Recipient Activation REST API
RecipientActivation *sharing.RecipientActivationAPI
// Databricks Recipients REST API
Recipients *sharing.RecipientsAPI
// The Repos API allows users to manage their git repos. Users can use the
// API to access all repos that they have manage permissions on.
//
// Databricks Repos is a visual Git client in Databricks. It supports common
	// Git operations such as cloning a repository, committing and pushing,
// pulling, branch management, and visual comparison of diffs when
// committing.
//
// Within Repos you can develop code in notebooks or other files and follow
// data science and engineering code development best practices using Git
// for version control, collaboration, and CI/CD.
Repos *workspace.ReposAPI
// A schema (also called a database) is the second layer of Unity
// Catalog’s three-level namespace. A schema organizes tables, views and
// functions. To access (or list) a table or view in a schema, users must
// have the USE_SCHEMA data permission on the schema and its parent catalog,
// and they must have the SELECT permission on the table or view.
Schemas *catalog.SchemasAPI
// The Secrets API allows you to manage secrets, secret scopes, and access
// permissions.
//
// Sometimes accessing data requires that you authenticate to external data
// sources through JDBC. Instead of directly entering your credentials into
// a notebook, use Databricks secrets to store your credentials and
// reference them in notebooks and jobs.
//
// Administrators, secret creators, and users granted permission can read
// Databricks secrets. While Databricks makes an effort to redact secret
// values that might be displayed in notebooks, it is not possible to
// prevent such users from reading secrets.
Secrets *workspace.SecretsAPI
// Identities for use with jobs, automated tools, and systems such as
// scripts, apps, and CI/CD platforms. Databricks recommends creating
// service principals to run production jobs or modify production data. If
// all processes that act on production data run with service principals,
// interactive users do not need any write, delete, or modify privileges in
// production. This eliminates the risk of a user overwriting production
// data by accident.
ServicePrincipals *iam.ServicePrincipalsAPI
// The Serving Endpoints API allows you to create, update, and delete model
// serving endpoints.
//
// You can use a serving endpoint to serve models from the Databricks Model
// Registry. Endpoints expose the underlying models as scalable REST API
// endpoints using serverless compute. This means the endpoints and
// associated compute resources are fully managed by Databricks and will not
// appear in your cloud account. A serving endpoint can consist of one or
// more MLflow models from the Databricks Model Registry, called served
// models. A serving endpoint can have at most ten served models. You can
// configure traffic settings to define how requests should be routed to
// your served models behind an endpoint. Additionally, you can configure
// the scale of resources that should be applied to each served model.
ServingEndpoints *serving.ServingEndpointsAPI
Shares *sharing.SharesAPI
// The SQL Statement Execution API manages the execution of arbitrary SQL
// statements and the fetching of result data.
//
// **Release status**
//
// This feature is in [Public Preview].
//
// **Getting started**
//
// We suggest beginning with the [SQL Statement Execution API tutorial].
//
// **Overview of statement execution and result fetching**
//
// Statement execution begins by issuing a
// :method:statementexecution/executeStatement request with a valid SQL
// statement and warehouse ID, along with optional parameters such as the
// data catalog and output format.
//
// When submitting the statement, the call can behave synchronously or
// asynchronously, based on the `wait_timeout` setting. When set between
// 5-50 seconds (default: 10) the call behaves synchronously and waits for
// results up to the specified timeout; when set to `0s`, the call is
// asynchronous and responds immediately with a statement ID that can be
// used to poll for status or fetch the results in a separate call.
//
// **Call mode: synchronous**
//
	// In synchronous mode, when statement execution completes within the
	// `wait_timeout`, the result data is returned directly in the response. This
// response will contain `statement_id`, `status`, `manifest`, and `result`
// fields. The `status` field confirms success whereas the `manifest` field
// contains the result data column schema and metadata about the result set.
// The `result` field contains the first chunk of result data according to
// the specified `disposition`, and links to fetch any remaining chunks.
//
// If the execution does not complete before `wait_timeout`, the setting
// `on_wait_timeout` determines how the system responds.
//
// By default, `on_wait_timeout=CONTINUE`, and after reaching
// `wait_timeout`, a response is returned and statement execution continues
// asynchronously. The response will contain only `statement_id` and
// `status` fields, and the caller must now follow the flow described for
// asynchronous call mode to poll and fetch the result.
//
// Alternatively, `on_wait_timeout` can also be set to `CANCEL`; in this
// case if the timeout is reached before execution completes, the underlying
// statement execution is canceled, and a `CANCELED` status is returned in
// the response.
//
// **Call mode: asynchronous**
//
// In asynchronous mode, or after a timed-out synchronous request continues,
// a `statement_id` and `status` will be returned. In this case polling
// :method:statementexecution/getStatement calls are required to fetch the
// result and metadata.
//
// Next, a caller must poll until execution completes (`SUCCEEDED`,
// `FAILED`, etc.) by issuing :method:statementexecution/getStatement
// requests for the given `statement_id`.
//
// When execution has succeeded, the response will contain `status`,
// `manifest`, and `result` fields. These fields and the structure are
// identical to those in the response to a successful synchronous
// submission. The `result` field will contain the first chunk of result
// data, either `INLINE` or as `EXTERNAL_LINKS` depending on `disposition`.
// Additional chunks of result data can be fetched by checking for the
// presence of the `next_chunk_internal_link` field, and iteratively `GET`
// those paths until that field is unset: `GET
// https://$DATABRICKS_HOST/{next_chunk_internal_link}`.
//
// **Fetching result data: format and disposition**
//
// Result data from statement execution is available in two formats: JSON,
// and [Apache Arrow Columnar]. Statements producing a result set smaller
// than 16 MiB can be fetched as `format=JSON_ARRAY`, using the
// `disposition=INLINE`. When a statement executed in `INLINE` disposition
// exceeds this limit, the execution is aborted, and no result can be
// fetched. Using `format=ARROW_STREAM` and `disposition=EXTERNAL_LINKS`
// allows large result sets, and with higher throughput.
//
// The API uses defaults of `format=JSON_ARRAY` and `disposition=INLINE`.
	// We advise explicitly setting format and disposition in all production
// use cases.
//
// **Statement response: statement_id, status, manifest, and result**
//
// The base call :method:statementexecution/getStatement returns a single
// response combining `statement_id`, `status`, a result `manifest`, and a
// `result` data chunk or link, depending on the `disposition`. The
// `manifest` contains the result schema definition and the result summary
// metadata. When using `disposition=EXTERNAL_LINKS`, it also contains a
// full listing of all chunks and their summary metadata.
//
// **Use case: small result sets with INLINE + JSON_ARRAY**
//
// For flows that generate small and predictable result sets (<= 16 MiB),
// `INLINE` downloads of `JSON_ARRAY` result data are typically the simplest
// way to execute and fetch result data.
//
// When the result set with `disposition=INLINE` is larger, the result can
// be transferred in chunks. After receiving the initial chunk with
// :method:statementexecution/executeStatement or
// :method:statementexecution/getStatement subsequent calls are required to
// iteratively fetch each chunk. Each result response contains a link to the
// next chunk, when there are additional chunks to fetch; it can be found in
// the field `.next_chunk_internal_link`. This link is an absolute `path` to
// be joined with your `$DATABRICKS_HOST`, and of the form
// `/api/2.0/sql/statements/{statement_id}/result/chunks/{chunk_index}`. The
// next chunk can be fetched by issuing a
// :method:statementexecution/getStatementResultChunkN request.
//
// When using this mode, each chunk may be fetched once, and in order. A
// chunk without a field `next_chunk_internal_link` indicates the last chunk
// was reached and all chunks have been fetched from the result set.
//
// **Use case: large result sets with EXTERNAL_LINKS + ARROW_STREAM**
//
// Using `EXTERNAL_LINKS` to fetch result data in Arrow format allows you to
// fetch large result sets efficiently. The primary difference from using
// `INLINE` disposition is that fetched result chunks contain resolved
// `external_links` URLs, which can be fetched with standard HTTP.
//
// **Presigned URLs**
//
// External links point to data stored within your workspace's internal
// DBFS, in the form of a presigned URL. The URLs are valid for only a short
// period, <= 15 minutes. Alongside each `external_link` is an expiration
// field indicating the time at which the URL is no longer valid. In
// `EXTERNAL_LINKS` mode, chunks can be resolved and fetched multiple times
// and in parallel.
//
// ----
//
// ### **Warning: We recommend you protect the URLs in the EXTERNAL_LINKS.**
//
// When using the EXTERNAL_LINKS disposition, a short-lived pre-signed URL
// is generated, which the client can use to download the result chunk
// directly from cloud storage. As the short-lived credential is embedded in
// a pre-signed URL, this URL should be protected.
//
// Since pre-signed URLs are generated with embedded temporary credentials,
// you need to remove the authorization header from the fetch requests.
//
// ----
//
// Similar to `INLINE` mode, callers can iterate through the result set, by
// using the `next_chunk_internal_link` field. Each internal link response
// will contain an external link to the raw chunk data, and additionally
// contain the `next_chunk_internal_link` if there are more chunks.
//
// Unlike `INLINE` mode, when using `EXTERNAL_LINKS`, chunks may be fetched
// out of order, and in parallel to achieve higher throughput.
//
// **Limits and limitations**
//
// Note: All byte limits are calculated based on internal storage metrics
// and will not match byte counts of actual payloads.
//
// - Statements with `disposition=INLINE` are limited to 16 MiB and will
// abort when this limit is exceeded. - Statements with
// `disposition=EXTERNAL_LINKS` are limited to 100 GiB. - The maximum query
// text size is 16 MiB. - Cancelation may silently fail. A successful
// response from a cancel request indicates that the cancel request was
// successfully received and sent to the processing engine. However, for
// example, an outstanding statement may complete execution during signal
// delivery, with the cancel signal arriving too late to be meaningful.
// Polling for status until a terminal state is reached is a reliable way to
// determine the final state. - Wait timeouts are approximate, occur
// server-side, and cannot account for caller delays, network latency from
// caller to service, and similarly. - After a statement has been submitted
// and a statement_id is returned, that statement's status and result will
// automatically close after either of 2 conditions: - The last result chunk
// is fetched (or resolved to an external link). - One hour passes with no
// calls to get the status or fetch the result. Best practice: in
// asynchronous clients, poll for status regularly (and with backoff) to
// keep the statement open and alive. - After fetching the last result chunk
// (including chunk_index=0) the statement is automatically closed.
//
// [Apache Arrow Columnar]: https://arrow.apache.org/overview/
// [Public Preview]: https://docs.databricks.com/release-notes/release-types.html
// [SQL Statement Execution API tutorial]: https://docs.databricks.com/sql/api/sql-execution-tutorial.html
StatementExecution *sql.StatementExecutionAPI
// A storage credential represents an authentication and authorization
// mechanism for accessing data stored on your cloud tenant. Each storage
// credential is subject to Unity Catalog access-control policies that
// control which users and groups can access the credential. If a user does
// not have access to a storage credential in Unity Catalog, the request
// fails and Unity Catalog does not attempt to authenticate to your cloud
// tenant on the user’s behalf.
//
// Databricks recommends using external locations rather than using storage
// credentials directly.
//
// To create storage credentials, you must be a Databricks account admin.
// The account admin who creates the storage credential can delegate
// ownership to another user or group to manage permissions on it.
StorageCredentials *catalog.StorageCredentialsAPI
// Primary key and foreign key constraints encode relationships between
// fields in tables.
//
// Primary and foreign keys are informational only and are not enforced.
// Foreign keys must reference a primary key in another table. This primary
// key is the parent constraint of the foreign key and the table this
// primary key is on is the parent table of the foreign key. Similarly, the
// foreign key is the child constraint of its referenced primary key; the
// table of the foreign key is the child table of the primary key.
//
// You can declare primary keys and foreign keys as part of the table
// specification during table creation. You can also add or drop constraints
// on existing tables.
TableConstraints *catalog.TableConstraintsAPI
// A table resides in the third layer of Unity Catalog’s three-level
// namespace. It contains rows of data. To create a table, users must have
// CREATE_TABLE and USE_SCHEMA permissions on the schema, and they must have
// the USE_CATALOG permission on its parent catalog. To query a table, users
// must have the SELECT permission on the table, and they must have the
// USE_CATALOG permission on its parent catalog and the USE_SCHEMA
// permission on its parent schema.
//
// A table can be managed or external. From an API perspective, a __VIEW__
// is a particular kind of table (rather than a managed or external table).
Tables *catalog.TablesAPI
// Enables administrators to get all tokens and delete tokens for other
// users. Admins can either get every token, get a specific token by ID, or
// get all tokens for a particular user.
TokenManagement *settings.TokenManagementAPI
// The Token API allows you to create, list, and revoke tokens that can be
// used to authenticate and access Databricks REST APIs.
Tokens *settings.TokensAPI
// User identities recognized by Databricks and represented by email
// addresses.
//
// Databricks recommends using SCIM provisioning to sync users and groups
// automatically from your identity provider to your Databricks Workspace.
// SCIM streamlines onboarding a new employee or team by using your identity
// provider to create users and groups in Databricks Workspace and give them
// the proper level of access. When a user leaves your organization or no
// longer needs access to Databricks Workspace, admins can terminate the
// user in your identity provider and that user’s account will also be
// removed from Databricks Workspace. This ensures a consistent offboarding
// process and prevents unauthorized users from accessing sensitive data.
Users *iam.UsersAPI
// Volumes are a Unity Catalog (UC) capability for accessing, storing,
// governing, organizing and processing files. Use cases include running
// machine learning on unstructured data such as image, audio, video, or PDF
// files, organizing data sets during the data exploration stages in data
// science, working with libraries that require access to the local file
// system on cluster machines, storing library and config files of arbitrary
// formats such as .whl or .txt centrally and providing secure access across
// workspaces to it, or transforming and querying non-tabular data files in
// ETL.
Volumes *catalog.VolumesAPI
// A SQL warehouse is a compute resource that lets you run SQL commands on
// data objects within Databricks SQL. Compute resources are infrastructure
// resources that provide processing capabilities in the cloud.
Warehouses *sql.WarehousesAPI
// The Workspace API allows you to list, import, export, and delete
// notebooks and folders.
//
// A notebook is a web-based interface to a document that contains runnable
// code, visualizations, and explanatory text.
Workspace *workspace.WorkspaceAPI
// This API allows updating known workspace settings for advanced users.
WorkspaceConf *settings.WorkspaceConfAPI
}
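A minimal sketch of using the service fields above, assuming workspace credentials are available in the environment; it calls the CurrentUser API, whose Me method is assumed to return the authenticated identity:

package main

import (
	"context"
	"fmt"

	"github.com/databricks/databricks-sdk-go"
)

func main() {
	ctx := context.Background()
	w := databricks.Must(databricks.NewWorkspaceClient())

	// Assumed shape: Me returns the currently authenticated user or
	// service principal.
	me, err := w.CurrentUser.Me(ctx)
	if err != nil {
		panic(err)
	}
	fmt.Println(me.UserName)
}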
func NewWorkspaceClient
func NewWorkspaceClient(c ...*Config) (*WorkspaceClient, error)
NewWorkspaceClient creates a new Databricks SDK client for Workspaces or returns an error if the configuration is wrong.
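A hedged sketch of constructing the client from an explicit configuration rather than the environment; the host is a placeholder and the token is read from an environment variable:

package main

import (
	"os"

	"github.com/databricks/databricks-sdk-go"
)

func main() {
	w, err := databricks.NewWorkspaceClient(&databricks.Config{
		Host:  "https://my-workspace.cloud.databricks.com", // placeholder
		Token: os.Getenv("DATABRICKS_TOKEN"),
	})
	if err != nil {
		panic(err)
	}
	_ = w
}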
Directories

| Path | Synopsis |
|---|---|
| examples | |
| aad-explicit-spn (command) | |
| custom-auth (command) | |
| default-auth (command) | |
| long-running (command) | |
| internal | |
| code | Package holds higher-level abstractions on top of OpenAPI that are used to generate code via text/template for Databricks SDK in different languages. |
| gen (command) | Usage: openapi-codegen |
| | Databricks SDK for Go APIs |
| billing | These APIs allow you to manage Billable Usage, Budgets, Log Delivery, etc. |
| catalog | These APIs allow you to manage Account Metastore Assignments, Account Metastores, Account Storage Credentials, Catalogs, External Locations, Functions, Grants, Metastores, Schemas, Storage Credentials, Table Constraints, Tables, Volumes, etc. |
| compute | These APIs allow you to manage Cluster Policies, Clusters, Command Execution, Global Init Scripts, Instance Pools, Instance Profiles, Libraries, Policy Families, etc. |
| files | The DBFS API makes it simple to interact with various data sources without having to include a user's credentials every time you read a file. |
| iam | These APIs allow you to manage Account Groups, Account Service Principals, Account Users, Current User, Groups, Permissions, Service Principals, Users, Workspace Assignment, etc. |
| jobs | The Jobs API allows you to create, edit, and delete jobs. |
| ml | These APIs allow you to manage Experiments, Model Registry, etc. |
| oauth2 | These APIs allow you to manage Custom App Integration, OAuth Enrollment, Published App Integration, etc. |
| pipelines | The Delta Live Tables API allows you to create, edit, delete, start, and view details about pipelines. |
| provisioning | These APIs allow you to manage Credentials, Encryption Keys, Networks, Private Access, Storage, Vpc Endpoints, Workspaces, etc. |
| serving | The Serving Endpoints API allows you to create, update, and delete model serving endpoints. |
| settings | These APIs allow you to manage Account Ip Access Lists, Ip Access Lists, Token Management, Tokens, Workspace Conf, etc. |
| sharing | These APIs allow you to manage Providers, Recipient Activation, Recipients, Shares, etc. |
| sql | These APIs allow you to manage Alerts, Dashboards, Data Sources, Dbsql Permissions, Queries, Query History, Statement Execution, Warehouses, etc. |
| workspace | These APIs allow you to manage Git Credentials, Repos, Secrets, Workspace, etc. |