Systematics.hpp
Track genotypes, species, clades, or lineages of organisms in a world.
Note that this file powers the Python library phylotrackpy. The main consequence is that we should generally prefer emp_optional_throw to emp_assert in any circumstance where user input could trigger an error. emp_assert will trigger a segfault in Python (killing the whole interpreter), whereas emp_optional_throw will raise a Python exception.
- Todo:
We should provide an option to back up systematics data to a file so that it doesn’t all need to be kept in memory, especially if we’re only doing post-analysis.
This inheritance system makes adding new systematics-related data tracking kind of a pain. Over time, this will probably become a maintainability problem. We could make the inheritance go away and just use signals, but then the World could not maintain systematics managers.
-
template<typename ORG_INFO, typename DATA_STRUCT = datastruct::no_data>
class Taxon - #include <Systematics.hpp>
A Taxon represents a type of organism in a phylogeny.
Genotypes are the most commonly used Taxon; in general taxa can be anything from a shared genome sequence, a phenotypic trait, or a even a position in the world (if you want to track an evolutionary pathway)
- Param ORG_INFO:
The information type associated with an organism, used to categorize it.
Public Types
-
using data_t = DATA_STRUCT
Public Functions
-
inline size_t GetID() const
Get a unique ID for this taxon; IDs are assigned sequentially, so newer taxa have higher IDs.
-
inline void NullifyParent()
-
inline size_t GetNumOrgs() const
Get the number of living organisms currently associated with this Taxon.
-
inline void SetNumOrgs(size_t n)
Set the number of living organisms currently associated with this Taxon.
-
inline size_t GetTotOrgs() const
Get the total number of organisms that have ever lived associated with this Taxon.
-
inline void SetTotOrgs(size_t n)
Set the total number of organisms that have ever lived associated with this Taxon.
-
inline size_t GetNumOff() const
Get the number of taxa that were produced by organisms from this Taxon.
-
inline size_t GetDepth() const
Get the number of taxanomic steps since the ancestral organism was injected into the World.
-
inline double GetOriginationTime() const
- Returns:
this taxon’s origination time
-
inline void SetOriginationTime(double time)
Set this taxon’s origination time.
-
inline double GetDestructionTime() const
- Returns:
this taxon’s destruction time
-
inline void SetDestructionTime(double time)
Sets this taxon’s destruction time.
-
inline void AddOrg()
Add a new organism to this Taxon.
-
inline void AddTotalOffspring()
Recursively increment total offspring count for this and all ancestors.
-
inline int GetTotalOffspring()
Get total number of offspring directly or indirectly descending from this taxon.
-
inline bool RemoveOrg()
Remove an organism from this Taxon (after it dies). Removals must return true if the taxon needs to continue; false if it should deactivate.
-
inline void RemoveFromOffspring(Ptr<this_t> offspring_tax)
Remove specified taxon from this taxon’s offspring list.
-
inline bool RemoveOffspring(Ptr<this_t> offspring_tax)
Remove and offspring taxa after its entire sub-tree has died out (pruning)
-
inline void RemoveTotalOffspring()
Reduce the total count of extant offspring and recursively do so for all ancestors (gets called on a taxon’s parent when that taxon goes extinct)
Protected Attributes
-
size_t id
ID for this Taxon (Unique within this Systematics)
-
int num_orgs
How many organisms currently exist of this group?
-
int tot_orgs
How many organisms have ever existed of this group?
-
int num_offspring
How many direct offspring groups exist from this one.
-
int total_offspring
How many total extant offspring taxa exist from this one (i.e. including indirect)
-
size_t depth
How deep in tree is this node? (Root is 0)
-
double origination_time
When did this taxon first appear in the population?
-
double destruction_time
When did this taxon leave the population?
-
DATA_STRUCT data
A struct for storing additional information about this taxon.
Friends
- friend class Systematics
-
template<typename ORG>
class SystematicsBase - #include <Systematics.hpp>
A base class for Systematics, maintaining information common to all systematics managers and providing virtual functions. You probably don’t want to instantiate this. It just exists so that you can make containers of Systematics managers of different types.
Subclassed by Systematics< ORG, ORG_INFO, DATA_STRUCT >
Public Types
-
using data_node_t = DataNode<double, data::Current, data::Info, data::Range, data::Stats, data::Pull>
-
using data_ptr_t = Ptr<data_node_t>
Public Functions
-
inline SystematicsBase(bool _active = true, bool _anc = true, bool _all = false, bool _pos = true)
-
inline virtual ~SystematicsBase()
-
inline bool GetTrackSynchronous() const
Are we tracking a synchronous population?
-
inline bool GetStoreActive() const
Are we storing all taxa that are still alive in the population?
-
inline bool GetStoreAncestors() const
Are we storing all taxa that are the ancestors of living organisms in the population?
-
inline bool GetStoreOutside() const
Are we storing all taxa that have died out, as have all of their descendants.
-
inline bool GetArchive() const
Are we storing any taxa types that have died out?
-
inline bool GetStorePosition() const
Are we storing the positions of taxa?
-
inline size_t GetTotalOrgs() const
How many living organisms are currently being tracked?
-
inline size_t GetNumRoots() const
How many independent trees are being tracked?
-
inline size_t GetNextID() const
What ID will the next taxon have?
-
inline double GetAveDepth() const
What is the average phylogenetic depth of organisms in the population?
-
inline size_t GetUpdate() const
- Returns:
current update/time step
-
inline void SetTrackSynchronous(bool new_val)
Are we tracking organisms evolving in synchronous generations?
-
inline void SetStoreActive(bool new_val)
Are we storing all taxa that are still alive in the population?
-
inline void SetStoreAncestors(bool new_val)
Are we storing all taxa that are the ancestors of living organisms in the population?
-
inline void SetStoreOutside(bool new_val)
Are we storing all taxa that have died out, as have all of their descendants.
-
inline void SetArchive(bool new_val)
Are we storing any taxa types that have died out?
-
inline void SetStorePosition(bool new_val)
Are we storing the location of taxa?
-
inline void SetUpdate(size_t ud)
Sets the current update/time step.
-
inline data_ptr_t AddDataNode(const std::string &name)
Add a data node to this systematics manager
- Parameters:
name – the name of the data node (so it can be found later)
-
inline data_ptr_t AddDataNode(std::function<vector<double>()> pull_set_fun, const std::string &name)
Add a data node to this systematics manager
- Parameters:
name – the name of the data node (so it can be found later)
pull_set_fun – a function to run when the data node is requested to pull data (returns vector of values)
-
inline data_ptr_t AddDataNode(std::function<double()> pull_fun, const std::string &name)
Add a data node to this systematics manager
- Parameters:
name – the name of the data node (so it can be found later)
pull_fun – a function to run when the data node is requested to pull data (returns single value)
-
inline data_ptr_t GetDataNode(const std::string &name)
- Returns:
a pointer to the data node with the specified name
-
virtual data_ptr_t AddEvolutionaryDistinctivenessDataNode(const std::string &name = "evolutionary_distinctiveness") = 0
-
virtual data_ptr_t AddPairwiseDistanceDataNode(const std::string &name = "pairwise_distance") = 0
-
virtual data_ptr_t AddPhylogeneticDiversityDataNode(const std::string &name = "phylogenetic_diversity") = 0
-
virtual data_ptr_t AddDeleteriousStepDataNode(const std::string &name = "deleterious_steps") = 0
-
virtual data_ptr_t AddVolatilityDataNode(const std::string &name = "volatility") = 0
-
virtual data_ptr_t AddUniqueTaxaDataNode(const std::string &name = "unique_taxa") = 0
-
virtual data_ptr_t AddMutationCountDataNode(const std::string &name = "mutation_count", const std::string &mutation = "substitution") = 0
-
virtual size_t GetNumActive() const = 0
-
virtual size_t GetNumAncestors() const = 0
-
virtual size_t GetNumOutside() const = 0
-
virtual size_t GetTreeSize() const = 0
-
virtual size_t GetNumTaxa() const = 0
-
virtual int GetMaxDepth() const = 0
-
virtual int GetPhylogeneticDiversity() const = 0
-
virtual double GetMeanPairwiseDistance(bool branch_only) const = 0
-
virtual double GetSumDistance() const = 0
-
virtual double GetSumPairwiseDistance(bool branch_only) const = 0
-
virtual double GetVariancePairwiseDistance(bool branch_only) const = 0
-
virtual int SackinIndex() const = 0
-
virtual double CollessLikeIndex() const = 0
-
virtual double GetAverageOriginTime(bool) const = 0
-
virtual int GetMRCADepth() const = 0
-
virtual void AddOrg(ORG &&org, WorldPosition pos) = 0
-
virtual void AddOrg(ORG &org, WorldPosition pos) = 0
-
virtual void AddOrg(ORG &&org, WorldPosition pos, WorldPosition parent) = 0
-
virtual void AddOrg(ORG &org, WorldPosition pos, WorldPosition parent) = 0
-
virtual bool RemoveOrg(WorldPosition pos) = 0
-
virtual void RemoveOrgAfterRepro(WorldPosition pos) = 0
-
virtual double CalcDiversity() const = 0
-
virtual void Update() = 0
-
virtual void SetNextParent(WorldPosition pos) = 0
-
virtual void SwapPositions(WorldPosition p1, WorldPosition p2) = 0
Protected Attributes
-
bool store_active
Store all of the currently active taxa?
-
bool store_ancestors
Store all of the direct ancestors from living taxa?
-
bool store_outside
Store taxa that are extinct with no living descendants?
-
bool archive
Set to true if we are supposed to do any archiving of extinct taxa.
-
bool store_position
Keep a vector mapping positions to pointers.
-
bool track_synchronous
Does this systematics manager need to keep track of current and next positions?
-
size_t org_count
How many organisms are currently active?
-
size_t total_depth
Sum of taxa depths for calculating average.
-
size_t num_roots
How many distinct injected ancestors are currently in population?
-
mutable int max_depth
Depth of deepest taxon. -1 means needs to be recalculated.
-
size_t next_id
What ID value should the next new taxon have?
-
size_t curr_update
-
using data_node_t = DataNode<double, data::Current, data::Info, data::Range, data::Stats, data::Pull>
-
template<typename ORG, typename ORG_INFO, typename DATA_STRUCT = datastruct::no_data>
class Systematics : public SystematicsBase<ORG> - #include <Systematics.hpp>
A tool to track phylogenetic relationships among organisms. The systematics class tracks the relationships among all organisms based on the INFO_TYPE provided. If an offspring has the same value for INFO_TYPE as its parent, it is grouped into the same taxon. Otherwise a new Taxon is created and the old one is used as its parent in the phylogeny. If the provided INFO_TYPE is the organism’s genome, a traditional phylogeny is formed, with genotypes. If the organism’s behavior/task set is used, then organisms are grouped by phenotypes. If the organism’s position is used, the evolutionary path through space is tracked. Any other aspect of organisms can be tracked this way as well.
Unnamed Group
-
virtual void AddOrg(ORG &&org, WorldPosition pos)
Add information about a new organism, including its stored info and parent’s taxon; If you would like the systematics manager to track taxon age, you can also supply the update at which the taxon is being added.
- Parameters:
org – a reference to the organism being added
pos – the position of the organism being added
parent – a pointer to the org’s parent
- Returns:
a pointer for the associated taxon.
-
virtual void AddOrg(ORG &&org, WorldPosition pos, WorldPosition parent)
-
virtual void AddOrg(ORG &org, WorldPosition pos)
-
virtual void AddOrg(ORG &org, WorldPosition pos, WorldPosition parent)
Unnamed Group
-
virtual bool RemoveOrg(WorldPosition pos)
Remove an instance of an organism; track when it’s gone.
- Parameters:
pos – the world position of the individual being removed
taxon – a pointer to the taxon of the individual being removed
Unnamed Group
-
virtual void RemoveOrgAfterRepro(WorldPosition pos)
Mark an instance of a taxon to be removed; track when it’s gone. This is a work-around to deal with steady state/non-synchronous populations in which an organism might die as its offspring is born (e.g. in a spatial world where the offspring replaces the parent). If the bookkeeping is not handled correctly, we could accidentally mark the taxon as extinct when it is actually continuing. By using this method, the taxon won’t be removed until after the next org is added or the next time an org is marked for removal.
- Parameters:
pos – the world position of the individual being removed
taxon – a pointer to the taxon of the individual being removed
Unnamed Group
-
inline virtual void SetNextParent(WorldPosition pos)
Tell systematics manager that the parent of the next taxon added will be the one specified by this function (either at the specified position or the one pointed to by the given pointer) Works with version of AddOrg that only takes org, position, and update. Will be set to null after being assigned as the parent of a taxon
Public Types
-
using taxon_t = Taxon<ORG_INFO, DATA_STRUCT>
Public Functions
-
void RemoveOffspring(Ptr<taxon_t> offspring, Ptr<taxon_t> taxon)
Called when an offspring taxa has been deleted.
-
void MarkExtinct(Ptr<taxon_t> taxon)
Called when there are no more living members of a taxon. There may be descendants.
-
inline Systematics(fun_calc_info_t calc_taxon, bool store_active = true, bool store_ancestors = true, bool store_all = false, bool store_pos = true)
Contructor for Systematics; controls what information should be stored.
- Parameters:
calc_taxon – A function that takes an organism and calculates what taxon it belongs to
store_active – Should living organisms’ taxa be tracked? (typically yes!)
store_ancestors – Should ancestral organisms’ taxa be maintained? (yes for lineages!)
store_all – Should all dead taxa be maintained? (typically no; it gets BIG!)
store_pos – Should the systematics tracker keep track of organism positions? (probably yes - only turn this off if you know what you’re doing)
-
Systematics(const Systematics&) = delete
-
Systematics(Systematics&&) = default
-
inline ~Systematics()
-
virtual void Update()
Switch to next update/time step Useful for keeping track of taxon survival times and population positions in synchronous generation worlds.
-
inline void SetCalcInfoFun(fun_calc_info_t f)
Set function used to calculate taxons from organisms.
-
void RemoveBefore(int ud)
Remove all taxa that 1) went extinct before the specified update/time step, and 2) only have ancestors that went extinct before the specified update/time step. Warning: this function invalidates most measurements you could make about tree topology. It is useful in select situations where you need to store ancestors for some period of time, but cannot computationally afford to store all ancestors for your entire run.
-
inline void ApplyToActiveTaxa(const std::function<void(const Ptr<taxon_t> tax)> &fun) const
Run the given function on every active taxon (const version)
- Parameters:
fun – the function to run on each taxon
-
inline void ApplyToActiveTaxa(const std::function<void(Ptr<taxon_t> tax)> &fun)
Run the given function on every active taxon
- Parameters:
fun – the function to run on each taxon
-
inline void ApplyToAncestorTaxa(const std::function<void(const Ptr<taxon_t> tax)> &fun) const
Run the given function on every ancestor taxon (const version)
- Parameters:
fun – the function to run on each taxon
-
inline void ApplyToAncestorTaxa(const std::function<void(Ptr<taxon_t> tax)> &fun)
Run the given function on every ancestor taxon
- Parameters:
fun – the function to run on each taxon
-
inline void ApplyToOutsideTaxa(const std::function<void(const Ptr<taxon_t> tax)> &fun) const
Run the given function on every outside taxon (const version)
- Parameters:
fun – the function to run on each taxon
-
inline void ApplyToOutsideTaxa(const std::function<void(Ptr<taxon_t> tax)> &fun)
Run the given function on every outside taxon
- Parameters:
fun – the function to run on each taxon
-
inline void ApplyToAllTaxa(const std::function<void(const Ptr<taxon_t> tax)> &fun) const
Run given function on all taxa (const version)
- Parameters:
fun – the function to run on each taxon
-
inline void ApplyToAllTaxa(const std::function<void(Ptr<taxon_t> tax)> &fun)
Run given function on all taxa
- Parameters:
fun – the function to run on each taxon
-
template<typename T>
inline vector<T> ApplyToAllTaxa(const std::function<T(const Ptr<taxon_t> tax)> &fun) const Run given function on all taxa and return result (const version)
- Parameters:
fun – the function to run on each taxon
- Returns:
a vector containing the results of running the function on each taxon
-
template<typename T>
inline vector<T> ApplyToAllTaxa(const std::function<T(Ptr<taxon_t> tax)> &fun) Run given function on all taxa and return result (const version)
- Parameters:
fun – the function to run on each taxon
- Returns:
a vector containing the results of running the function on each taxon
-
inline const std::unordered_set<Ptr<taxon_t>, hash_t> &GetActive() const
- Returns:
set of active (extant/living) taxa0
-
inline const std::unordered_set<Ptr<taxon_t>, hash_t> &GetAncestors() const
- Returns:
set of ancestor taxa (extinct, but have active descendants)
-
inline const std::unordered_set<Ptr<taxon_t>, hash_t> &GetOutside() const
- Returns:
set of outside taxa (extinct, with no active descendants)
-
inline virtual size_t GetNumActive() const
- Returns:
the number of taxa that are still active in the population
-
inline virtual size_t GetNumAncestors() const
- Returns:
the number of taxa that are ancestors of living organisms (but have died out themselves)
-
inline virtual size_t GetNumOutside() const
- Returns:
the number of taxa that are stored that have died out, as have their descendents
-
inline virtual size_t GetTreeSize() const
- Returns:
the number of taxa that are in the current phylogeny
-
inline virtual size_t GetNumTaxa() const
- Returns:
the number of taxa that are stored in total
-
virtual int GetMaxDepth() const
- Returns:
the phylogenetic depth (lineage length) of the taxon with the longest lineage out of all active taxa
-
inline Ptr<taxon_t> GetNextParent() const
- Returns:
the taxon that will be used as the parent of the next taxon created via the version of AddOrg that does not accept a parent
-
inline bool IsTaxonAt(WorldPosition id) const
- Returns:
true if there is a taxon at specified location
-
inline Ptr<taxon_t> GetTaxonAt(WorldPosition id) const
- Returns:
pointer to taxon at specified location
-
inline SignalKey OnNew(std::function<void(Ptr<taxon_t> t, ORG &org)> &fun)
Provide a function for Systematics to call each time a new taxon is created. Trigger: New taxon is made Argument: Pointer to taxon, reference to org taxon was created from
-
inline SignalKey OnExtinct(std::function<void(Ptr<taxon_t> t)> &fun)
Provide a function for Systematics to call each time a taxon goes extinct. Trigger: Taxon is going extinct Argument: Pointer to taxon
-
inline SignalKey OnPrune(std::function<void(Ptr<taxon_t>)> &fun)
Provide a function for Systematics to call each time a taxon is about to be pruned (removed from ancestors). Trigger: Taxon is about to be killed Argument: Pointer to taxon
-
inline virtual data_ptr_t AddEvolutionaryDistinctivenessDataNode(const std::string &name = "evolutionary_distinctiveness")
Add data node that records evolutionary distinctiveness when requested to pull. Used by AddPhylodiversityFile in World_output.hpp
-
inline virtual data_ptr_t AddPairwiseDistanceDataNode(const std::string &name = "pairwise_distance")
Add data node that records pairwise distance when requested to pull. Used by AddPhylodiversityFile in World_output.hpp
-
inline virtual data_ptr_t AddPhylogeneticDiversityDataNode(const std::string &name = "phylogenetic_diversity")
Add data node that records phylogenetic distinctiveness when requested to pull. Used by AddPhylodiversityFile in World_output.hpp
-
inline virtual data_ptr_t AddDeleteriousStepDataNode(const std::string &name = "deleterious_steps")
Add data node that records counts of deleterious steps along lineages in this systematics manager when requested to pull. Used by AddLineageMutationFile in World_output.hpp
-
inline virtual data_ptr_t AddVolatilityDataNode(const std::string &name = "volatility")
Add data node that phenotypic volatility (changes in phenotype) along lineages in this systematics manager when requested to pull. Used by AddLineageMutationFile in World_output.hpp
-
inline virtual data_ptr_t AddUniqueTaxaDataNode(const std::string &name = "unique_taxa")
Add data node that records counts of unique taxa along lineages in this systematics manager when requested to pull. Used by AddLineageMutationFile in World_output.hpp
-
inline virtual data_ptr_t AddMutationCountDataNode(const std::string &name = "mutation_count", const std::string &mutation = "substitution")
Add data node that records counts of mutations of the specified type along lineages in this systematics manager when requested to pull. Used by AddLineageMutationFile in World_output.hpp
-
inline virtual int GetPhylogeneticDiversity() const
From (Faith 1992, reviewed in Winters et al., 2013), phylogenetic diversity is the sum of edges in the minimal spanning tree connected the taxa you’re calculating diversity of.
This calculates phylogenetic diversity for all extant taxa in the tree, assuming all edges from parent to child have a length of one. Possible extensions to this function that might be useful in the future include:
Pass it a set of taxon_t pointers and have it calculate PD for just those taxa
Enable calculation of branch lengths by amount of time that elapsed between origination of parent and origination of offspring
Enable a paleontology compatibility mode where only branching points are calculated
-
int GetPhylogeneticDiversityNormalize(int generation = 0, std::string filename = "") const
- Returns:
phylogenetic diversity if used without any arguments . If you want to receive normalized data, you need to include the number of generations your tree has you also need to specify a file with which to normalize your data. If value is outside of the values in the file, 100th percentile will be returned NOTE: This is experimental and in early development/research phases!
-
inline double GetTaxonDistinctiveness(Ptr<taxon_t> tax) const
This is a metric of how distinct
tax
is from the rest of the population.(From Vane-Wright et al., 1991; reviewed in Winter et al., 2013)
-
double GetEvolutionaryDistinctiveness(Ptr<taxon_t> tax, double time) const
This metric (from Isaac, 2007; reviewed in Winter et al., 2013) measures how distinct
tax
is from the rest of the population, weighted for the amount of unique evolutionary history that it represents.To quantify length of evolutionary history, this method needs
time:
the current time, in whatever units time is being measured in when taxa are added to the systematics manager. Note that passing a time in the past will produce inaccurate results (since we don’t know what the state of the tree was at that time).Assumes the tree is all connected. Will return -1 if this assumption isn’t met.
-
inline vector<double> GetAllEvolutionaryDistinctivenesses(double time) const
- Parameters:
time – The time step at which the calculation is being done
- Returns:
A vector of evolutionary distinctiveness of all active taxa
-
inline double GetMeanEvolutionaryDistinctiveness(double time) const
- Parameters:
time – The time step at which the calculation is being done
- Returns:
Mean evolutionary distinctiveness of all active taxa
-
inline double GetSumEvolutionaryDistinctiveness(double time) const
- Parameters:
time – The time step at which the calculation is being done
- Returns:
Sum of evolutionary distinctiveness of all active taxa
-
inline double GetVarianceEvolutionaryDistinctiveness(double time) const
- Parameters:
time – The time step at which the calculation is being done
- Returns:
Variance of evolutionary distinctiveness of all active taxa
-
inline virtual double GetMeanPairwiseDistance(bool branch_only = false) const
Calculates mean pairwise distance between extant taxa (Webb and Losos, 2000). This measurement is also called Average Taxonomic Diversity (Warwick and Clark, 1998) (for demonstration of equivalence see Tucker et al, 2016). This measurement tells you about the amount of distinctness in the community as a whole.
This measurement assumes that the tree is fully connected. Will return -1 if this is not the case.
- Parameters:
branch_only – only counts distance in terms of nodes that represent a branch between two extant taxa (potentially useful for comparison to biological data, where non-branching nodes generally cannot be inferred).
-
inline virtual double GetSumDistance() const
Calculates summed branch lengths of tree. Tucker et al 2017 points out that this is a measure of phylogenetic richness.
-
inline virtual double GetSumPairwiseDistance(bool branch_only = false) const
Calculates summed pairwise distance between extant taxa. Tucker et al 2017 points out that this is a measure of phylogenetic richness.
This measurement assumes that the tree is fully connected. Will return -1 if this is not the case.
- Parameters:
branch_only – only counts distance in terms of nodes that represent a branch between two extant taxa (potentially useful for comparison to biological data, where non-branching nodes generally cannot be inferred)
-
inline virtual double GetVariancePairwiseDistance(bool branch_only = false) const
Calculates variance of pairwise distance between extant taxa. Tucker et al 2017 points out that this is a measure of phylogenetic regularity.
This measurement assumes that the tree is fully connected. Will return -1 if this is not the case.
- Parameters:
branch_only – only counts distance in terms of nodes that represent a branch between two extant taxa (potentially useful for comparison to biological data, where non-branching nodes generally cannot be inferred).
-
virtual vector<double> GetPairwiseDistances(bool branch_only = false) const
Calculates a vector of all pairwise distances between extant taxa.
This method assumes that the tree is fully connected. Will return -1 if this is not the case.
- Parameters:
branch_only – only counts distance in terms of nodes that represent a branch between two extant taxa (potentially useful for comparison to biological data, where non-branching nodes generally cannot be inferred). *
-
std::set<Ptr<taxon_t>> GetCanopyExtantRoots(int time_point = 0) const
Returns a vector containing all taxa that were extant at
time_point
and were at that time the most recent ancestors of taxa that are now extant Example: Say the only current extant taxon is C, its lineage goes A -> B -> C, and B and C were both alive at the specified time_point. This function would only return B. If, however, there were another currently extant taxon that were descended directly from A, then this function would return both A and B.
-
int GetDistanceToRoot(Ptr<taxon_t> tax) const
- Parameters:
tax – the taxon who’s distance to root you want to calculate
- Returns:
the total number of ancestors between the given taxon and MRCA, if there is one. If there is no common ancestor, distance to the root of this tree is calculated instead.
-
int GetBranchesToRoot(Ptr<taxon_t> tax) const
Calculates the number of branching points leading to multiple extant taxa between the given taxon and the most-recent common ancestor (or the root of its subtree, if no MRCA exists). This is useful because a lot of stats for phylogenies are designed for phylogenies reconstructed from extant taxa. These phylogenies generally only contain branching points, rather than every ancestor along the way to the current taxon.
- Parameters:
tax – taxon to calculate branches from
- Returns:
Number of branching points between tax and root
-
inline virtual int SackinIndex() const
- Returns:
Sackin Index of this tree (Sackin, 1972; reviewed in Shao, 1990). Measures tree balance
-
inline virtual std::unordered_map<int, int> GetOutDegreeDistribution() const
Returns dictionary containing a histogram of node out degrees e.g. {1:4, 2:10, 3:4} means the tree has 4 unifurcations, 10 bifurcations, and 4 trifurcations
-
inline virtual double GetAverageOriginTime(bool normalize = false) const
Get average origin time for whole phylogeny. If
- Parameters:
normalize – is set to true, will apply normalization to make result comparable to what you would expect from a strictly bifurcating tree (as most reconstruction methods will produce). This normalization is achieved by multiplying each taxon’s values by the number of offspring taxa it has minus one.
-
inline virtual double CollessLikeIndex() const
Calculate Colless Index of this tree (Colless, 1982; reviewed in Shao, 1990). Measures tree balance. The standard Colless index only works for bifurcating trees, so this will be a Colless-like Index, as suggested in “Sound Colless-like balance indices for multifurcating trees” (Mir, 2018, PLoS One)
-
Ptr<taxon_t> GetMRCA() const
- Returns:
a pointer to the Most-Recent Common Ancestor for the population.
-
virtual int GetMRCADepth() const
- Returns:
the depth of the Most-Recent Common Ancestor; return -1 for none.
- Returns:
a pointer to the Most-Recent Ancestor shared by two taxa.
-
virtual double CalcDiversity() const
- Returns:
the genetic diversity of the population.
-
inline vector<Ptr<taxon_t>> GetLineage(Ptr<taxon_t> tax) const
- Returns:
vector containing the lineages of the specified taxon
-
inline vector<Ptr<taxon_t>> GetLineageToMRCA(Ptr<taxon_t> tax) const
- Returns:
vector containing the lineages of the specified taxon up to and including the MRCA, but not past the MRCA
-
virtual void PrintStatus(std::ostream &os = std::cout) const
Print details about the Systematics manager. First prints setting, followed by all active, ancestor, and outside taxa being stored. Format for taxa is [ id | number of orgs in this taxon, number of offspring taxa of this taxon | parent taxon]
- Parameters:
os – output stream to print to
-
void PrintLineage(Ptr<taxon_t> taxon, std::ostream &os = std::cout) const
Print a whole lineage. Format: “Lineage:”, followed by each taxon in the lineage, each on new line
- Parameters:
taxon – a pointer to the taxon to print the lineage of
os – output stream to print to
-
inline void AddSnapshotFun(const std::function<std::string(const taxon_t&)> &fun, const std::string &key, const std::string &desc = "")
Add a new snapshot function. When a snapshot of the systematics is taken, in addition to the default set of functions, all user-added snapshot functions are run. Functions take a reference to a taxon as input and return the string to be dumped in the file at the given key.
-
void Snapshot(const std::string &file_path) const
Take a snapshot of current state of taxon phylogeny. WARNING: Current, this function assumes one parent taxon per-taxon.
- Parameters:
file_path – the file to store the snapshot data in
-
inline virtual void SwapPositions(WorldPosition p1, WorldPosition p2)
-
void LoadFromFile(const std::string &file_path, const std::string &info_col = "info", bool assume_leaves_extant = true, bool adjust_total_offspring = true)
Load data from a file into the systematics manager. Intended to be used on an empty systematics object Expects a file in ALife phylogeny standard format (see: https://alife-data-standards.github.io/alife-data-standards/phylogeny)
- Parameters:
file_path – the file to load data from
info_col – the name of the column in the file that contains the taxon info (i.e. the information distinguishing taxa; should match return type of calc_taxon_info_fun)
assume_leaves_extant – if true, assumes that all leaf nodes correspond to extant taxa
adjust_total_offspring – if true, adjusts the total offspring count of each taxon based on the number of offspring it actually has (time consuming but necessary for some stats)
-
size_t GetNumActive() const = 0
-
size_t GetNumAncestors() const = 0
-
size_t GetNumOutside() const = 0
-
size_t GetTreeSize() const = 0
-
size_t GetNumTaxa() const = 0
-
int GetPhylogeneticDiversity() const = 0
-
double GetMeanPairwiseDistance(bool branch_only) const = 0
-
double GetSumDistance() const = 0
-
double GetSumPairwiseDistance(bool branch_only) const = 0
-
double GetVariancePairwiseDistance(bool branch_only) const = 0
-
vector<double> GetPairwiseDistances(bool branch_only) const = 0
-
std::unordered_map<int, int> GetOutDegreeDistribution() const = 0
-
double GetAverageOriginTime(bool) const = 0
-
int GetMRCADepth() const = 0
-
void AddOrg(ORG &&org, WorldPosition pos) = 0
-
void AddOrg(ORG &org, WorldPosition pos) = 0
-
void AddOrg(ORG &&org, WorldPosition pos, WorldPosition parent) = 0
-
void AddOrg(ORG &org, WorldPosition pos, WorldPosition parent) = 0
-
bool RemoveOrg(WorldPosition pos) = 0
-
void RemoveOrgAfterRepro(WorldPosition pos) = 0
-
void PrintStatus(std::ostream &os) const = 0
-
double CalcDiversity() const = 0
-
virtual void Update() = 0
-
inline size_t GetUpdate() const
- Returns:
current update/time step
-
inline void SetUpdate(size_t ud)
Sets the current update/time step.
-
void SetNextParent(WorldPosition pos) = 0
-
inline data_ptr_t GetDataNode(const std::string &name)
- Returns:
a pointer to the data node with the specified name
-
inline data_ptr_t AddDataNode(const std::string &name)
Add a data node to this systematics manager
- Parameters:
name – the name of the data node (so it can be found later)
-
inline data_ptr_t AddDataNode(std::function<vector<double>()> pull_set_fun, const std::string &name)
Add a data node to this systematics manager
- Parameters:
name – the name of the data node (so it can be found later)
pull_set_fun – a function to run when the data node is requested to pull data (returns vector of values)
-
inline data_ptr_t AddDataNode(std::function<double()> pull_fun, const std::string &name)
Add a data node to this systematics manager
- Parameters:
name – the name of the data node (so it can be found later)
pull_fun – a function to run when the data node is requested to pull data (returns single value)
-
data_ptr_t AddEvolutionaryDistinctivenessDataNode(const std::string &name = "evolutionary_distinctiveness") = 0
-
data_ptr_t AddPairwiseDistanceDataNode(const std::string &name = "pairwise_distance") = 0
-
data_ptr_t AddPhylogeneticDiversityDataNode(const std::string &name = "phylogenetic_diversity") = 0
-
data_ptr_t AddDeleteriousStepDataNode(const std::string &name = "deleterious_steps") = 0
-
data_ptr_t AddVolatilityDataNode(const std::string &name = "volatility") = 0
-
data_ptr_t AddUniqueTaxaDataNode(const std::string &name = "unique_taxa") = 0
-
data_ptr_t AddMutationCountDataNode(const std::string &name = "mutation_count", const std::string &mutation = "substitution") = 0
-
int GetMaxDepth() const = 0
Public Members
-
vector<SnapshotInfo> user_snapshot_funs
Collection of all desired snapshot file columns.
-
WorldPosition removal_pos = {0, 0}
Position of taxon to next be removed.
Private Types
-
using parent_t = SystematicsBase<ORG>
Private Members
-
fun_calc_info_t calc_info_fun
Function that takes an organism and returns the unit being tracked by systematics.
-
bool num_orgs_defaulted = false
Keep track of whether we have loaded from a file that didn’t
-
bool total_offspring_defaulted = false
provide num_orgs
Keep track of whether we have loaded from a file without
-
bool store_active
recalculating total offspring
-
bool store_ancestors
Store all of the direct ancestors from living taxa?
-
bool store_outside
Store taxa that are extinct with no living descendants?
-
bool archive
Set to true if we are supposed to do any archiving of extinct taxa.
-
bool store_position
Keep a vector mapping positions to pointers.
-
bool track_synchronous
Does this systematics manager need to keep track of current and next positions?
-
size_t org_count
How many organisms are currently active?
-
size_t total_depth
Sum of taxa depths for calculating average.
-
size_t num_roots
How many distinct injected ancestors are currently in population?
-
size_t next_id
What ID value should the next new taxon have?
-
size_t curr_update
-
mutable int max_depth
Depth of deepest taxon. -1 means needs to be recalculated.
-
struct SnapshotInfo
- #include <Systematics.hpp>
Struct for keeping track of what information to print out in snapshot files.
Public Functions
-
inline SnapshotInfo(const snapshot_fun_t &_fun, const std::string &_key, const std::string &_desc = "")
-
inline SnapshotInfo(const snapshot_fun_t &_fun, const std::string &_key, const std::string &_desc = "")
-
virtual void AddOrg(ORG &&org, WorldPosition pos)
-
namespace datastruct
The systematics manager allows an optional second template type that can store additional data about each taxon in the phylogeny. Here are some structs containing common pieces of additional data to track. Note: You are responsible for filling these in! Adding the template just gives you a place to store your data.
-
struct no_data
- #include <Systematics.hpp>
-
struct fitness
- #include <Systematics.hpp>
The default - an empty struct.
Public Types
-
template<typename PHEN_TYPE>
struct mut_landscape_info - #include <Systematics.hpp>
Track information related to the mutational landscape Maps a string representing a type of mutation to a count representing the number of that type of mutation that occurred to bring about this taxon.
Public Types
Public Functions
-
inline const PHEN_TYPE &GetPhenotype() const
This taxon’s phenotype (for assessing phenotypic change)
- Returns:
this taxon’s phenotype
-
inline double GetFitness() const
- Returns:
this taxon’s fitness
-
inline void RecordMutation(const std::unordered_map<std::string, int> &muts)
Adds mutations to the list of mutations that occurred to make this taxon
- Parameters:
muts – can contain as many strings (types of mutation) as desired, each accompanied by a number indicating how many of that mutation occurred Example: {“point_mutation”:2, “insertion”:1}
-
inline void RecordFitness(double fit)
Record the fitness of this taxon
- Parameters:
fit – the fitness
-
inline const PHEN_TYPE &GetPhenotype() const
-
struct no_data