README
json2hat
Import company affiliations from cncf/devstats into GrimoireLab Sorting Hat database.
Environment parameters
Setting Sorting Hat database parameters: you can either provide the full database connect string (DSN) via SH_DSN=..., or provide some or all parameters individually via SH_* environment variables. SH_DSN has higher priority: if it is provided, none of the individual SH_* connection parameters are used. When using the individual SH_* parameters, only SH_PASS is required; all other parameters have default values.
Sorting Hat database connection parameters:
- SH_DSN - provides the full database connect string, for example: SH_DSN='shuser:shpassword@tcp(shhost:shport)/shdb?charset=utf8'
- SH_USER - user name, defaults to shuser.
- SH_PASS - password, required.
- SH_PROTO - protocol, defaults to tcp.
- SH_HOST - host, defaults to localhost.
- SH_PORT - port, defaults to 3306.
- SH_DB - database name, defaults to shdb.
- SH_PARAMS - additional parameters that can be specified via ?param1=value1&param2=value2&...&paramN=valueN, defaults to ?charset=utf8. You can use SH_PARAMS='-' to specify empty params.
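The precedence and defaults described above can be sketched in Go as follows. This is a minimal illustration, not json2hat's actual code: buildDSN and orDefault are hypothetical helpers, and treating an empty variable the same as an unset one is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// getenv has the same shape as os.Getenv, so the logic can be exercised
// without touching the real environment.
type getenv func(string) string

// orDefault returns the value of key, or def when the variable is unset.
// Assumption: an empty value is treated the same as an unset one.
func orDefault(get getenv, key, def string) string {
	if v := get(key); v != "" {
		return v
	}
	return def
}

// buildDSN is a hypothetical helper that assembles a connect string
// following the rules listed in this README.
func buildDSN(get getenv) (string, error) {
	// SH_DSN wins: when it is present, no other SH_* variable is consulted.
	if dsn := get("SH_DSN"); dsn != "" {
		return dsn, nil
	}
	// SH_PASS is the only parameter without a default.
	pass := get("SH_PASS")
	if pass == "" {
		return "", errors.New("SH_PASS is required when SH_DSN is not set")
	}
	user := orDefault(get, "SH_USER", "shuser")
	proto := orDefault(get, "SH_PROTO", "tcp")
	host := orDefault(get, "SH_HOST", "localhost")
	port := orDefault(get, "SH_PORT", "3306")
	db := orDefault(get, "SH_DB", "shdb")
	params := orDefault(get, "SH_PARAMS", "?charset=utf8")
	// SH_PARAMS='-' means "no additional parameters".
	if params == "-" {
		params = ""
	}
	return fmt.Sprintf("%s:%s@%s(%s:%s)/%s%s", user, pass, proto, host, port, db, params), nil
}

func main() {
	// Example values only; a real program would pass os.Getenv instead.
	env := map[string]string{"SH_PASS": "secret", "SH_HOST": "db.example.org"}
	dsn, err := buildDSN(func(k string) string { return env[k] })
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(dsn)
}
```

Passing the lookup function as an argument keeps the sketch easy to test; in a real program os.Getenv can be passed directly.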
To clean up existing company affiliations (delete from the organizations and enrollments tables), set the SH_CLEANUP variable.
Testing connection:
- SH_TEST_CONNECT - set this variable to only test the connection.
Affiliations JSON path
json2hat needs to read the cncf/devstats affiliations JSON file. It first tries to read a local JSON file and falls back to a remote file.
You can set local file path via SH_LOCAL_JSON_PATH=/path/to/github_users.json. Default value is github_users.json. If local file is found then no remote file is read.
You can set the remote file path via SH_REMOTE_JSON_PATH=http://some.url.org/path/to/github_users.json. Default value is https://github.com/cncf/devstats/raw/master/github_users.json. This file is only read when reading the local JSON file fails. If neither the local nor the remote file can be read, the program exits with a fatal error message.
Company acquisitions YAML path
json2hat needs to read the cncf/devstats company acquisitions/name mapping YAML file. It first tries to read a local YAML file and falls back to a remote file.
You can set local file path via SH_LOCAL_YAML_PATH=/path/to/companies.yaml. Default value is companies.yaml. If local file is found then no remote file is read.
You can set the remote file path via SH_REMOTE_YAML_PATH=http://some.url.org/path/to/companies.yaml. Default value is https://github.com/cncf/devstats/raw/master/companies.yaml. This file is only read when reading the local YAML file fails. If neither the local nor the remote file can be read, the program exits with a fatal error message.
DA company names mapping
json2hat reads this file for mappings.
Docker
json2hat is packaged as a docker image docker.io/dajohn/json2hat. You can use scripts from docker/ directory to manage docker image.
Scripts (most require setting docker username via something like this: docker login; DOCKER_USER=your_user_name ./docker/docker_scriptname.sh):
- docker/docker_build.sh - builds the json2hat docker image. The image uses a multi layer setup to build the smallest possible output. It doesn't even have bash. See Dockerfile for details. The image is only about 6 MB in size.
- docker/docker_run.sh - executes json2hat from within the container. You should pass SH_* variables to control the Sorting Hat database connection and affiliations JSON path.
- docker/docker_publish.sh - publishes the json2hat image to your docker hub.
- docker/docker_pull.sh - pulls the json2hat image from your docker hub.
- docker/docker_remove.sh - removes the generated json2hat docker image.
- docker/docker_cleanup.sh - removes the generated json2hat docker image and executes docker system prune.
Running locally
- Replace env with prod or test or local: ./json2hat.sh env.
- Pass ONLY_GGH_USERNAME=1 if you want to match usernames only for git and GitHub sources.
- Pass ONLY_GGH_NAME=1 if you want to match names only for git and GitHub sources.
- Clear NO_PROFILE_UPDATE env if you do not want import to be able to update country and other profile data.
- Pass REPLACE=1 env if you want to replace any existing affiliations found (will only touch affiliations with project_slug like cncf/* or cncf-f).
- Pass DRY_RUN=1 to avoid any DB writes.
- Pass SKIP_BOTS=1 to avoid auto marking bots.
- Use NAME_MATCH=n to specify how to match using name: 0 - do not match using name, 1 - match only when there is a single hit, 2 - match on multiple hits. Default is 1.
- Set ORGS_RO=1 to skip adding any new organizations. It will then dump a CSV file with the missing org names and won't add any enrollments to orgs that were not found (directly, by lowercase name, or via the acquisition or mapping YAMLs).
- Set MISSING_ORGS_CSV=filename.csv to specify the name of the file containing missing orgs (only when ORGS_RO is used). Default is missing.csv if not specified.
Company names mapping
You should call the DA affiliations API map_org_names after a successful CNCF affiliations data import.