migration

command module
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Apr 30, 2025 License: Apache-2.0 Imports: 1 Imported by: 0

README

qdrant-migration

Note: This project is in beta. The API may change in future releases.

This tool helps to migrate data to Qdrant from other sources. It will stream all vectors from a collection in the source Qdrant instance to the target Qdrant instance.

The target collection can have a different replication or sharding configuration, expect the vector size and distance need to be the same.

Supported sources:

  • Other Qdrant instances

Installation

The easiest way to run the qdrant-migration tool is as a container. You can run it any machine where you have connectivity to both the source and the target Qdrant databases. Direct connectivity between both databases is not required. For optimal performance, you should run the tool on a machine with a fast network connection and minimum latency to both databases.

To pull the latest image run:

$ docker pull registry.cloud.qdrant.io/library/qdrant-migration

In addtion, every release providies precompiled binaries for all major OS and CPU architectures. You can download the latest release from the releases page.

Usage

To migrate from one Qdrant instance to another, you can provide the following parameters:

$ docker run --net=host --rm -it registry.cloud.qdrant.io/library/qdrant-migration qdrant --help
Usage: migration qdrant --source-url=STRING --source-collection=STRING --target-url=STRING --target-collection=STRING [flags]

Migrate data from a Qdrant database to Qdrant.

Flags:
  -h, --help                        Show context-sensitive help.
      --debug                       Enable debug mode.
      --trace                       Enable trace mode.
      --skip-tls-verification       Skip TLS verification.
      --version                     Print version information and quit

      --source-url=STRING           Source GRPC URL, e.g. https://your-qdrant-hostname:6334
      --source-collection=STRING    Source collection
      --source-api-key=STRING       Source API key ($SOURCE_API_KEY)
      --target-url=STRING           Target GRPC URL, e.g. https://your-qdrant-hostname:6334
      --target-collection=STRING    Target collection
      --target-api-key=STRING       Target API key ($TARGET_API_KEY)
  -b, --batch-size=50               Batch size
  -c, --create-target-collection    Create the target collection if it does not exist
  -m, --migration-marker=STRING     Migration marker to resume the migration

Example:

$ docker run --net=host --rm -it registry.cloud.qdrant.io/library/qdrant-migration qdrant \
    --source-url 'https://source-qdrant-hostname:6334' \
    --source-collection 'source-collection' \
    --target-url 'https://target-qdrant-hostname:6334' \
    --target-collection 'target-collection'

You can provide the API keys either as command line arguments or as environment variables:

$ docker run --net=host --rm -it \
    -e SOURCE_API_KEY='xyz' \ 
    registry.cloud.qdrant.io/library/qdrant-migration qdrant \
    --source-url 'https://source-qdrant-hostname:6334' \
    --source-collection 'source-collection' \
    --target-url 'https://target-qdrant-hostname:6334' \
    --target-collection 'target-collection' \
    --target-api-key 'abc'

If you want to resume a cancelled migration, or if you want to migrate vectors that may have been added after the last migration run, you can pass the migration marker as a flag, which is printed out at the beginning of the migration process:

$ docker run --net=host --rm -it registry.cloud.qdrant.io/library/qdrant-migration qdrant \
    --source-url 'https://source-qdrant-hostname:6334' \
    --source-collection 'source-collection' \
    --target-url 'https://target-qdrant-hostname:6334' \
    --target-collection 'target-collection' \
    --migration-marker 'migration-2025-02-14T19:18:30+01:00'
Migration considerations

The migration tool will stream all vectors from the source collection to the target collection. If the target collection exists before starting the migration, its configuration regarding vector size and dimensions must match. The replication factor, shard configuration or on_disk settings can be different. If the target collection does not exist, you can create it by passing the --create-target-collection flag.

Existing vectors in the target collection with the same ids as in the source collection will be overwritten. If you want to keep the existing vectors, you should create a new collection and migrate the vectors there.

The batch size can be adjusted with the --batch-size flag. The default batch size is 50, which is a good starting point for most use cases. If you experience performance issues, you can try to increase the batch size. Ideally a batch should be around 32MiB in size including vectors and payloads.

The Qdrant version of the source and target databases should be the same minor version. Differences in the patch version are fine.

Development

Running tests

The migration tool has two kind of tests, Golang unit tests and integration tests written with bats.

To run the Golang tests, execute:

$ make test_unit

To run the integration tests, execute:

$ make test_integration

To run all tests, execute:

$ make test
Linting

This project uses golangci-lint to lint the code. To run the linter, execute:

$ make lint

Code formatting is ensured with gofmt. To format the code, execute:

$ make fmt
Pre-commit hooks

This project uses pre-commit to run the linter and the formatter before every commit. To install the pre-commit hooks, execute:

$ pre-commit install-hooks

Releasing a new version

To release a new version create and push a release branch that follows the pattern release/vX.Y.Z. The release branch should be created from the main branch, or from the latest release branch in case of a hot fix.

A GitHub Action will then create a new release, build the binaries, push the Docker image to the registry, and create a Git tag.

Documentation

The Go Gopher

There is no documentation for this package.

Directories

Path Synopsis
pkg

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL