cluster

A cluster is an ordered set of hits related to a model which satisfy the model distance constraints.

cluster API reference

cluster

class macsypy.cluster.Cluster(hits, model, hit_weights)[source]

Handles a set of hits, related to a model, that are collocated.

__contains__(v_hit)[source]
Parameters:

v_hit (macsypy.hit.ModelHit object) – The hit to test

Returns:

True if the hit is in the cluster hits, False otherwise

__init__(hits, model, hit_weights)[source]
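Parameters:
  • hits (list of macsypy.hit.ModelHit objects) – the hits that make up this cluster

  • model (macsypy.model.Model object) – the model the hits relate to

  • hit_weights (macsypy.hit.HitWeight object) – the weights used to compute the score

__str__()[source]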
Returns:

a string representation of this cluster

__weakref__

list of weak references to the object (if defined)

_check_replicon_consistency()[source]
Raises:

MacsypyError – if the hits of this cluster are not all related to the same replicon
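
The check described above can be pictured with the minimal sketch below. It is not the macsypy implementation: the function name is hypothetical, replicon names are passed as plain strings, and ValueError stands in for MacsypyError so that the snippet runs without macsypy installed.

```python
def check_replicon_consistency(replicon_names):
    """Raise ValueError unless all the given replicon names are identical.

    Hypothetical helper: in this sketch each hit is reduced to the name
    of the replicon it was found on.
    """
    distinct = set(replicon_names)
    if len(distinct) > 1:
        # ValueError is a stand-in for MacsypyError.
        raise ValueError(f"hits come from several replicons: {sorted(distinct)}")


# All hits on the same replicon: the check passes silently.
check_replicon_consistency(["chr1", "chr1", "chr1"])
```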

fulfilled_function(*genes)[source]
Parameters:

genes – the genes to test.

Returns:

the functions shared by genes and this cluster.

Return type:

set of str
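
As an illustration only, the intersection described above might look like the sketch below. ToyCluster is a hypothetical stand-in, not the macsypy class, and genes are passed as plain name strings rather than gene objects, which is an assumption made for brevity.

```python
class ToyCluster:
    """Hypothetical stand-in for a cluster; it only stores function names."""

    def __init__(self, functions):
        self.functions = frozenset(functions)

    def fulfilled_function(self, *genes):
        # Assumption: genes are given as plain name strings.
        # Returns the names present both in the arguments and in the cluster.
        return set(genes) & self.functions


cluster = ToyCluster({"a", "b"})
shared = cluster.fulfilled_function("b", "c")   # only "b" is encoded by the cluster
none_shared = cluster.fulfilled_function("x")   # nothing in common
```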

property functions
Returns:

The set of functions encoded by this cluster. A function is a gene name or, for an exchangeable gene, the name of its reference gene. For instance, given the model

    <model vers="2.0">
        <gene name="a" presence="mandatory"/>
        <gene name="b" presence="accessory">
            <exchangeables>
                <gene name="c"/>
            </exchangeables>
        </gene>
    </model>

the functions for a cluster corresponding to this model will be {'a', 'b'}: a hit on c counts as function b.

Return type:

frozenset
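
The notion of function used here can be sketched as follows. This is a simplified illustration, not the macsypy code: the mapping and both helper names are hypothetical, and hits are reduced to gene names.

```python
# Hypothetical mapping: exchangeable gene name -> reference gene name.
# It mirrors the example model, where c is exchangeable for b.
EXCHANGEABLE_TO_REFERENCE = {"c": "b"}


def function_of(gene_name):
    """Return the reference gene name for an exchangeable gene,
    otherwise the gene name itself."""
    return EXCHANGEABLE_TO_REFERENCE.get(gene_name, gene_name)


def functions_of(hit_gene_names):
    """Collect the functions encoded by a list of hit gene names."""
    return frozenset(function_of(name) for name in hit_gene_names)


# A cluster with one hit on a and one hit on c encodes functions a and b.
functions = functions_of(["a", "c"])
```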

property hit_weights
Returns:

the weights applied to the hits when computing the score

Return type:

macsypy.hit.HitWeight

property loner
Returns:

True if this cluster is made only of hits representing the same gene and this gene is tagged as loner. False otherwise, that is, if the cluster:

  • contains several hits coding for different genes

  • contains one hit but the gene is not tagged as loner (max_gene_required = 1)

merge(cluster, before=False)[source]

Merge the given cluster into this one (in place).

Parameters:
  • cluster (macsypy.cluster.Cluster object) – the cluster to merge into this one

  • before (bool) – If False, the hits of cluster are appended after the hits of this one; otherwise they are inserted before the hits of this one.

Returns:

None

Raises:

MacsypyError – if the two clusters do not have the same model
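
The effect of the before flag can be illustrated with the sketch below. MergeableCluster is a hypothetical toy class, hits are plain values, and ValueError stands in for MacsypyError; it only mirrors the behaviour described above.

```python
class MergeableCluster:
    """Hypothetical toy cluster: an ordered list of hits plus a model name."""

    def __init__(self, hits, model):
        self.hits = list(hits)
        self.model = model

    def merge(self, cluster, before=False):
        """Merge cluster into this one, in place."""
        if cluster.model != self.model:
            # ValueError is a stand-in for MacsypyError.
            raise ValueError("cannot merge clusters built from different models")
        if before:
            # Hits of the other cluster come first.
            self.hits = cluster.hits + self.hits
        else:
            # Hits of the other cluster are appended at the end.
            self.hits.extend(cluster.hits)


target = MergeableCluster(["h2", "h3"], model="T2SS")
target.merge(MergeableCluster(["h4"], model="T2SS"))               # appended
target.merge(MergeableCluster(["h1"], model="T2SS"), before=True)  # prepended
```

After both calls, target.hits holds h1, h2, h3, h4 in that order, and the clusters passed as arguments are left unchanged.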

property multi_system
Returns:

True if this cluster is made of only one hit representing a multi_system gene. False otherwise, that is, if the cluster:

  • contains several hits

  • contains one hit but the gene is not tagged as multi_system (max_gene_required = 1)

replace(old, new)[source]

Replace the hit old in this cluster with new (in place).

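Parameters:
  • old (macsypy.hit.ModelHit object) – the hit to replace

  • new (macsypy.hit.ModelHit object) – the hit that takes its place

Returns: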

None

property replicon_name
Returns:

The name of the replicon where this cluster is located

Return type:

str

property score
Returns:

The score for this cluster

Return type:

float

build_clusters

macsypy.cluster.build_clusters(hits, rep_info, model, hit_weights)[source]

From a list of filtered hits and replicon information (topology, length), build all the lists of hits that satisfy the following constraints:

  • max_gene_inter_space

  • loner

  • multi_system

Each list that satisfies the constraints becomes a cluster. A cluster contains at least two hits separated by at most max_gene_inter_space, except for loner genes, which are allowed to be alone in a cluster.

Parameters:
  • hits (list of macsypy.hit.ModelHit objects) – list of filtered hits

  • rep_info (macsypy.Indexes.RepliconInfo object) – the replicon to analyse

  • model (macsypy.model.Model object) – the model to study

  • hit_weights (macsypy.hit.HitWeight object) – the weights used to compute the cluster scores

Returns:

the list of regular clusters and the special clusters (loners not in a cluster, and multi systems)

Return type:

tuple with 2 elements:

  • true_clusters: a list of Cluster objects

  • true_loners: a dict {str function: macsypy.hit.Loner | macsypy.hit.LonerMultiSystem object}
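
The grouping rule described for build_clusters can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the macsypy algorithm: ToyHit and group_hits are hypothetical names, the replicon is treated as linear, multi_system genes are ignored, and the spacing between two hits is assumed to be the number of genes lying between them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyHit:
    """Hypothetical hit: gene name, rank of the gene on the replicon, loner flag."""
    gene: str
    position: int
    loner: bool = False


def group_hits(hits, max_gene_inter_space):
    """Split hits into (clusters, loners) with a single linear scan."""
    clusters, loners = [], {}
    current = []

    def close(group):
        # Two or more collocated hits form a regular cluster;
        # a lone hit is kept only if its gene is flagged as loner.
        if len(group) >= 2:
            clusters.append(group)
        elif group and group[0].loner:
            loners[group[0].gene] = group[0]

    for hit in sorted(hits, key=lambda h: h.position):
        # Assumption: spacing = number of genes lying between two hits.
        if current and hit.position - current[-1].position - 1 > max_gene_inter_space:
            close(current)
            current = []
        current.append(hit)
    close(current)
    return clusters, loners


hits = [
    ToyHit("a", 1), ToyHit("b", 3),   # one gene apart: same cluster
    ToyHit("c", 20, loner=True),      # isolated loner: kept apart
    ToyHit("d", 40),                  # isolated non-loner: dropped
]
clusters, loners = group_hits(hits, max_gene_inter_space=2)
```

Here clusters holds a single group made of the hits on a and b, and loners maps the function c to its hit, which loosely mirrors the two-element tuple described above.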