stats

package
v0.12.0 Latest

This package is not in the latest version of its module.

Published: Apr 8, 2022 License: MIT Imports: 1 Imported by: 0

Documentation

Overview

Package stats calculates statistics for a group of scientific names.

It takes the results of verifying the names against the Catalogue of Life (CoL), computes the distribution of the names across kingdoms, and finds a taxon that contains a given percentage (always a majority) of the scientific names ranked genus and lower.

Variables

var RankStr = map[Rank]string{
	Empty:        "empty",
	Unknown:      "unknown",
	SubSpecies:   "subspecies",
	Species:      "species",
	SuperSpecies: "superspecies",
	SubGenus:     "subgenus",
	Genus:        "genus",
	SuperGenus:   "supergenus",
	SubTribe:     "subtribe",
	Tribe:        "tribe",
	InfraFamily:  "infrafamily",
	SubFamily:    "subfamily",
	Family:       "family",
	SuperFamily:  "superfamily",
	InfraOrder:   "infraorder",
	SubOrder:     "suborder",
	Order:        "order",
	SuperOrder:   "superorder",
	ParvClass:    "parvclass",
	SubTerClass:  "subterclass",
	InfraClass:   "infraclass",
	SubClass:     "subclass",
	Class:        "class",
	SuperClass:   "superclass",
	SubPhylum:    "subphylum",
	Phylum:       "phylum",
	SuperPhylum:  "superphylum",
	SubKingdom:   "subkingdom",
	Kingdom:      "kingdom",
	SuperKingdom: "superkingdom",
	Empire:       "empire",
}
var StrRank = func() map[string]Rank {
	res := make(map[string]Rank)
	for k, v := range RankStr {
		res[v] = k
	}
	return res
}()

StrRank maps a rank string to its Rank value. It is the inverse of RankStr.
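The inversion above can be tried in isolation. The following is a minimal sketch that copies only four ranks into local, lower-case names; `lookup` is a hypothetical helper that is not part of the package.

```go
package main

import "fmt"

// Rank and the values below mirror a small part of the package's
// declarations; the full list is omitted to keep the sketch short.
type Rank int

const (
	Empty Rank = iota
	Unknown
	SubSpecies
	Species
)

// rankStr is a reduced copy of RankStr.
var rankStr = map[Rank]string{
	Empty:      "empty",
	Unknown:    "unknown",
	SubSpecies: "subspecies",
	Species:    "species",
}

// strRank inverts rankStr once, when the package is initialized.
var strRank = func() map[string]Rank {
	res := make(map[string]Rank, len(rankStr))
	for k, v := range rankStr {
		res[v] = k
	}
	return res
}()

// lookup is a hypothetical helper that also reports whether s is known.
func lookup(s string) (Rank, bool) {
	r, ok := strRank[s]
	return r, ok
}

func main() {
	r, ok := lookup("species")
	fmt.Println(int(r), ok) // prints "3 true"

	// A missing key yields the zero value, which is Empty here, so the
	// boolean is the only reliable sign that the string was not recognized.
	r, ok = lookup("cultivar")
	fmt.Println(int(r), ok) // prints "0 false"
}
```

Because a plain map index returns the zero value for a missing key, code that reads such a map directly probably wants the two-value form shown in `lookup`.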

Functions

func AddRank

func AddRank(cs []Taxon)

AddRank converts the RankStr of each taxon to its Rank value and stores the result in that taxon.
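AddRank returns nothing, so it presumably updates the slice elements in place. The sketch below illustrates that pattern with local stand-in types. `addRank` and the reduced `strRank` map are assumptions made for illustration, and how the real function treats unrecognized rank strings is not stated in this documentation.

```go
package main

import "fmt"

// Rank mirrors the start of the package's Rank constants.
type Rank int

const (
	Empty Rank = iota
	Unknown
	SubSpecies
	Species
	SuperSpecies
	SubGenus
	Genus
)

// strRank is a reduced stand-in for the package's StrRank map.
var strRank = map[string]Rank{
	"species": Species,
	"genus":   Genus,
}

// Taxon mirrors the documented struct, including the embedded Rank.
type Taxon struct {
	ID      string
	Name    string
	RankStr string
	Rank
}

// addRank fills the Rank field of every element from its RankStr.
// Indexing with ts[i] is required: the loop variable of a
// `for _, t := range ts` loop is a copy, so writes to it would be lost.
// Assumption: unrecognized strings fall back to the zero value Empty.
func addRank(ts []Taxon) {
	for i := range ts {
		ts[i].Rank = strRank[ts[i].RankStr]
	}
}

func main() {
	ts := []Taxon{
		{Name: "Homo", RankStr: "genus"},
		{Name: "Homo sapiens", RankStr: "species"},
	}
	addRank(ts)
	fmt.Println(ts[0].Rank == Genus, ts[1].Rank == Species)
}
```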

Types

type Hierarchy

type Hierarchy interface {
	// Taxons method produces a slice of taxons that represent a path in a
	// hierarchy.
	Taxons() []Taxon
}

Hierarchy is an interface that produces a normalized version of a hierarchy as a slice of taxons, ordered from more general to more specific taxons.
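Any type with a `Taxons() []Taxon` method satisfies Hierarchy. As a sketch, the hypothetical `pathHierarchy` type below stores a classification as three pipe-delimited strings and splits them into taxons. The type, its fields, and the placeholder IDs are illustrative assumptions, and `Taxon` is reduced to its string fields.

```go
package main

import (
	"fmt"
	"strings"
)

// Taxon is a reduced stand-in for the package's Taxon type.
type Taxon struct {
	ID      string
	Name    string
	RankStr string
}

// Hierarchy mirrors the documented interface.
type Hierarchy interface {
	Taxons() []Taxon
}

// pathHierarchy is a hypothetical implementation that keeps a
// classification path as parallel pipe-delimited strings, ordered from
// the most general taxon to the most specific one.
type pathHierarchy struct {
	ids   string
	names string
	ranks string
}

// Taxons splits the three strings and zips them into a slice of taxons.
// It returns nil when the strings do not have the same number of parts.
func (p pathHierarchy) Taxons() []Taxon {
	ids := strings.Split(p.ids, "|")
	names := strings.Split(p.names, "|")
	ranks := strings.Split(p.ranks, "|")
	if len(ids) != len(names) || len(names) != len(ranks) {
		return nil
	}
	res := make([]Taxon, len(names))
	for i := range names {
		res[i] = Taxon{ID: ids[i], Name: names[i], RankStr: ranks[i]}
	}
	return res
}

func main() {
	var h Hierarchy = pathHierarchy{
		ids:   "id1|id2|id3", // placeholder IDs, not real CoL identifiers
		names: "Animalia|Chordata|Mammalia",
		ranks: "kingdom|phylum|class",
	}
	for _, t := range h.Taxons() {
		fmt.Println(t.RankStr, t.Name)
	}
}
```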

type Rank

type Rank int

Rank represents a rank of a taxon.

const (
	Empty Rank = iota
	Unknown
	SubSpecies
	Species
	SuperSpecies
	SubGenus
	Genus
	SuperGenus
	SubTribe
	Tribe
	InfraFamily
	SubFamily
	Family
	SuperFamily
	InfraOrder
	SubOrder
	Order
	SuperOrder
	ParvClass
	SubTerClass
	InfraClass
	SubClass
	Class
	SuperClass
	SubPhylum
	Phylum
	SuperPhylum
	SubKingdom
	Kingdom
	SuperKingdom
	Empire
)
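Because the constants are declared with iota in order from SubSpecies up to Empire, a more general rank has a larger numeric value, and ranks can be compared with ordinary integer operators. The sketch below uses that ordering to select ranks of genus and lower. `isGenusOrLower` is a hypothetical helper, and whether the package filters names this way is an assumption.

```go
package main

import "fmt"

// Rank mirrors the first few documented constants; the remaining ranks
// up to Empire are omitted in this sketch.
type Rank int

const (
	Empty Rank = iota
	Unknown
	SubSpecies
	Species
	SuperSpecies
	SubGenus
	Genus
	SuperGenus
)

// isGenusOrLower reports whether r is a real rank no higher than Genus.
// Empty and Unknown are numerically smaller than SubSpecies, so a plain
// `r <= Genus` check would wrongly accept them; the lower bound excludes
// them.
func isGenusOrLower(r Rank) bool {
	return r >= SubSpecies && r <= Genus
}

func main() {
	for _, r := range []Rank{Unknown, Species, Genus, SuperGenus} {
		fmt.Println(int(r), isGenusOrLower(r))
	}
}
```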

func NewRank

func NewRank(s string) Rank

NewRank creates Rank from a string.

func (Rank) Index

func (r Rank) Index() int

Index returns the index of a rank position in the ranksData.

func (Rank) String

func (r Rank) String() string

String returns the string representation of a Rank.
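A String method makes Rank satisfy fmt.Stringer, so fmt verbs such as %s and %v print the rank name instead of the number. The following sketch shows a round trip between the two representations with a reduced rank list. `newRank` and `describe` are stand-ins, and the fallback to Unknown for unrecognized input is an assumption that the documentation does not confirm.

```go
package main

import "fmt"

// Rank mirrors the first few documented constants.
type Rank int

const (
	Empty Rank = iota
	Unknown
	SubSpecies
	Species
)

// rankStr is a reduced copy of RankStr.
var rankStr = map[Rank]string{
	Empty:      "empty",
	Unknown:    "unknown",
	SubSpecies: "subspecies",
	Species:    "species",
}

// String returns the name of the rank, or "" for values that are not in
// rankStr (an assumption of this sketch).
func (r Rank) String() string {
	return rankStr[r]
}

// newRank is a stand-in for NewRank. Assumption: unrecognized strings
// map to Unknown.
func newRank(s string) Rank {
	for r, name := range rankStr {
		if name == s {
			return r
		}
	}
	return Unknown
}

// describe shows both representations; %s calls Rank.String.
func describe(r Rank) string {
	return fmt.Sprintf("%s (%d)", r, int(r))
}

func main() {
	r := newRank("species")
	fmt.Println(describe(r)) // prints "species (3)"
}
```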

type Stats

type Stats struct {
	// NamesNum is the number of names that are used for stats calculation.
	// These names include names of rank `genus` and lower,
	// verified against the Catalogue of Life.
	NamesNum int

	// Kingdom is the most prevalent kingdom in the group of names.
	Kingdom Taxon

	// MainTaxon is the taxon that contains at least the percentage of names
	// according to the MainTaxonThreshold
	MainTaxon Taxon

	// KingdomPercentage is a value between 0 and 1 representing the percentage
	// of names located in a particular kingdom.
	KingdomPercentage float32

	// MainTaxonPercentage is a value between 0 and 1 representing the
	// percentage of names located in the MainTaxon.
	MainTaxonPercentage float32

	// Kingdoms is the distribution of names across detected kingdoms.
	Kingdoms []TaxonDist
}

The Stats struct provides statistical data about a group of scientific names verified against the Catalogue of Life (CoL). It contains the number of names used for the stats calculation, the distribution of these names across the kingdoms registered in CoL, and the lowest taxon that contains at least a majority of these names. The user supplies the threshold used to determine that taxon.

func New

func New(
	h []Hierarchy,
	threshold float32,
) Stats

New takes several hierarchies and a MainTaxon threshold value. It returns the kingdom to which most items belong (if the rank 'kingdom' is provided), the percentage of items that belong to that kingdom, and the highest ranking taxon that includes at least the given percentage of species. The percentage is provided via the threshold parameter.

The algorithm assumes that all items belong to the same classification tree.

type Taxon

type Taxon struct {
	// ID is the Catalogue of Life ID for the taxon.
	ID string

	// Name is the name of the taxon.
	Name string

	// RankStr is a string representation of the taxon's rank.
	RankStr string

	// Rank represents taxon's rank via Rank type. Rank type is derived from
	// int type.
	Rank
}

The Taxon struct represents a particular taxon according to the Catalogue of Life (CoL). It includes an ID from CoL, the name of the taxon, and numerical and string representations of the taxon's rank.

type TaxonDist

type TaxonDist struct {
	// NamesNum is the number of names found for this particular rank.
	NamesNum int

	// Name is the scientific name of the taxon.
	Name string

	// Percentage is the percentage of names belonging to this taxon.
	Percentage float32
}

TaxonDist describes how a group of names is distributed across taxons of the same rank.
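A distribution of this shape can be built by counting names per taxon and dividing by the total. The sketch below does that for kingdoms. `kingdomDist` is a hypothetical helper, the sort order is a choice made for the example, and Percentage is assumed to be a fraction between 0 and 1, as documented for the percentage fields of Stats.

```go
package main

import (
	"fmt"
	"sort"
)

// TaxonDist mirrors the documented struct.
type TaxonDist struct {
	NamesNum   int
	Name       string
	Percentage float32
}

// kingdomDist takes the kingdom of every name and returns one entry per
// kingdom, ordered by descending count and then by name so that the
// result is deterministic.
func kingdomDist(kingdoms []string) []TaxonDist {
	counts := make(map[string]int)
	for _, k := range kingdoms {
		counts[k]++
	}
	res := make([]TaxonDist, 0, len(counts))
	for name, n := range counts {
		res = append(res, TaxonDist{
			NamesNum:   n,
			Name:       name,
			Percentage: float32(n) / float32(len(kingdoms)),
		})
	}
	sort.Slice(res, func(i, j int) bool {
		if res[i].NamesNum != res[j].NamesNum {
			return res[i].NamesNum > res[j].NamesNum
		}
		return res[i].Name < res[j].Name
	})
	return res
}

func main() {
	dist := kingdomDist([]string{"Animalia", "Plantae", "Animalia", "Animalia"})
	for _, d := range dist {
		fmt.Println(d.Name, d.NamesNum, d.Percentage)
	}
}
```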
