hit¶
This module implements the classes related to hits, and some functions that perform computations on hit objects.
macsypy.hit.CoreHit |
Models an HMM hit on the replicon. There is only one CoreHit per CoreGene. |
macsypy.hit.ModelHit |
Models a hit and its relation to the Model. |
macsypy.hit.AbstractCounterpartHit |
Parent class of Loner and MultiSystem. It inherits from ModelHit. |
macsypy.hit.Loner |
Models a “true” Loner. |
macsypy.hit.MultiSystem |
Models a hit which can be used in several Systems (same model). |
macsypy.hit.LonerMultiSystem |
Models a hit representing a gene that is Loner and MultiSystem at the same time. |
macsypy.hit.HitWeight |
The weights applied to the hits to compute the score |
macsypy.hit.get_best_hit_4_func() |
Return the best hit for a given function |
macsypy.hit.sort_model_hits() |
Sort hits |
macsypy.hit.compute_best_MSHit() |
Choose the best one among several multisystem hits |
macsypy.hit.get_best_hits() |
If several profiles hit the same gene, return the best hit |
A Hit is created when hmmsearch finds similarities between a profile and a protein of the input dataset.
Below is the inheritance diagram of Hits.

And a diagram showing the interactions between CoreGene, ModelGene, Model, Hit, Loner, …
The diagram above represents the models, genes and hits generated from the definitions below.
hit API reference¶
CoreHit¶
-
class
macsypy.hit.
CoreHit
(gene, hit_id, hit_seq_length, replicon_name, position_hit, i_eval, score, profile_coverage, sequence_coverage, begin_match, end_match)[source]¶ Handles the hits filtered from the Hmmer search. The hits are instantiated by the
HMMReport.extract()
method. In one run of MacSyFinder, there exists only one CoreHit per gene. These hits are independent of any macsypy.model.Model
instance.
-
__eq__
(other)[source]¶ Return True if two hits are totally equivalent, False otherwise.
Parameters: other (macsypy.hit.CoreHit object) – the hit to compare to the current object
Returns: the result of the comparison
Return type: boolean
-
__gt__
(other)[source]¶ Compare two hits. If the sequence identifier is the same, the comparison is done on the score. Otherwise, it is done on the alphabetical order of the sequence identifiers.
Parameters: other (macsypy.hit.CoreHit object) – the hit to compare to the current object
Returns: True if self is > other, False otherwise
-
__init__
(gene, hit_id, hit_seq_length, replicon_name, position_hit, i_eval, score, profile_coverage, sequence_coverage, begin_match, end_match)[source]¶ Parameters:
- gene (macsypy.gene.CoreGene object) – the gene corresponding to this profile
- hit_id (str) – the identifier of the hit
- hit_seq_length (int) – the length of the hit sequence
- replicon_name (str) – the name of the replicon
- position_hit (int) – the rank of the sequence matched in the input dataset file
- i_eval (float) – the best-domain evalue (i-evalue, “independent evalue”)
- score (float) – the score of the hit
- profile_coverage (float) – percentage of the profile that matches the hit sequence
- sequence_coverage (float) – percentage of the hit sequence that matches the profile
- begin_match (int) – position in the sequence where the match with the profile starts
- end_match (int) – position in the sequence where the match with the profile ends
-
__lt__
(other)[source]¶ Compare two hits. If the sequence identifier is the same, the comparison is done on the score. Otherwise, it is done on the alphabetical order of the sequence identifiers.
Parameters: other (macsypy.hit.CoreHit object) – the hit to compare to the current object
Returns: True if self is < other, False otherwise
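The ordering rule described for `__gt__` and `__lt__` can be illustrated with a minimal sketch. `ToyHit` below is a hypothetical stand-in with only two fields; it is not the `macsypy.hit.CoreHit` class and only mirrors the rule stated above (same identifier: compare scores; otherwise: compare identifiers alphabetically).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyHit:
    """Hypothetical stand-in for a hit; NOT the macsypy.hit.CoreHit class."""
    id: str       # sequence identifier
    score: float  # Hmmer score

    def __lt__(self, other):
        # Same sequence identifier: compare on the score.
        if self.id == other.id:
            return self.score < other.score
        # Different identifiers: alphabetical comparison of the identifiers.
        return self.id < other.id

    def __gt__(self, other):
        if self.id == other.id:
            return self.score > other.score
        return self.id > other.id


a = ToyHit("prot_A", 10.0)
b = ToyHit("prot_A", 25.0)
c = ToyHit("prot_B", 5.0)

same_id_lt = a < b           # same identifier, so the lower score comes first
diff_id_lt = b < c           # "prot_A" sorts before "prot_B", whatever the scores
ordered = sorted([c, b, a])  # sorted() relies on __lt__
```

With such an ordering, sorting a list of hits groups them by sequence identifier and, within each identifier, ranks them by increasing score.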
-
__str__
()[source]¶ Returns: useful information on the CoreHit, regarding Hmmer statistics and sequence information
Return type: str
-
__weakref__
¶ list of weak references to the object (if defined)
-
ModelHit¶
-
class
macsypy.hit.
ModelHit
(hit, gene_ref, gene_status)[source]¶ Encapsulates a macsypy.hit.CoreHit. This class stores a CoreHit that has been attributed to a putative system. Thus, it also stores:
- the system,
- the status of the gene in this system (‘mandatory’, ‘accessory’, …),
- the gene in the model for which it is an occurrence.
For one gene, several ModelHit instances can exist, one for each Model containing this gene.
-
__init__
(hit, gene_ref, gene_status)[source]¶ Parameters:
- hit (macsypy.hit.CoreHit object) – a match between a hmm profile and a replicon
- gene_ref (macsypy.gene.ModelGene object) – the ModelGene linked to this hit. The ModelGene has the same name as the CoreGene, but one hit can be linked to several ModelGenes (several Models). To know for which gene this hit plays a role, use the macsypy.gene.ModelGene.alternate_of() method: hit.gene_ref.alternate_of()
- gene_status (macsypy.gene.GeneStatus object)
-
__weakref__
¶ list of weak references to the object (if defined)
-
hit
¶ Returns: the CoreHit encapsulated by this ModelHit
Return type: macsypy.hit.CoreHit object
-
loner
¶ Returns: True if the hit represents a loner macsypy.gene.ModelGene, False otherwise. A true Loner is a hit representing a gene with the loner attribute and which is not included in a cluster.
- a hit representing a loner gene but included in a cluster is not a true Loner
- a hit which is not included with other genes in a cluster but does not represent a loner gene is not a true Loner (this situation may happen when min_genes_required = 1)
Return type: bool
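The definition of a true Loner given above combines two conditions. As a minimal sketch, assuming the two facts are available as plain booleans (the helper `is_true_loner` and its parameters are hypothetical and not part of macsypy):

```python
def is_true_loner(gene_is_loner: bool, hit_in_cluster: bool) -> bool:
    """Hypothetical helper: a hit is a true Loner only if its gene is tagged
    loner AND the hit is not included in a cluster."""
    return gene_is_loner and not hit_in_cluster


# The four possible situations described in the text above.
cases = {
    "loner gene, outside a cluster": is_true_loner(True, False),
    "loner gene, inside a cluster": is_true_loner(True, True),
    "non-loner gene, alone (e.g. min_genes_required = 1)": is_true_loner(False, False),
    "non-loner gene, inside a cluster": is_true_loner(False, True),
}
```

Only the first situation qualifies as a true Loner; the other three do not.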
-
multi_model
¶ Returns: True if the hit represents a multi_model macsypy.gene.ModelGene, False otherwise.
Return type: bool
-
multi_system
¶ Returns: True if the hit represents a multi_system macsypy.gene.ModelGene, False otherwise.
Return type: bool
AbstractCounterpartHit¶
-
class
macsypy.hit.
AbstractCounterpartHit
(hit, gene_ref=None, gene_status=None, counterpart=None)[source]¶ Abstract class to handle a ModelHit that has equivalents, for instance a Loner or MultiSystem hit.
-
__init__
(hit, gene_ref=None, gene_status=None, counterpart=None)[source]¶ Parameters:
- hit (macsypy.hit.CoreHit object) – a match between a hmm profile and a replicon
- gene_ref (macsypy.gene.ModelGene object) – the ModelGene linked to this hit. The ModelGene has the same name as the CoreGene, but one hit can be linked to several ModelGenes (several Models). To know for which gene this hit plays a role, use the macsypy.gene.ModelGene.alternate_of() method: hit.gene_ref.alternate_of()
- gene_status (macsypy.gene.GeneStatus object)
-
counterpart
¶ Returns: The set of hits that can play the same role
-
loner
¶ Returns: True if the hit represents a loner macsypy.gene.ModelGene, False otherwise. A true Loner is a hit representing a gene with the loner attribute and which is not included in a cluster.
- a hit representing a loner gene but included in a cluster is not a true Loner
- a hit which is not included with other genes in a cluster but does not represent a loner gene is not a true Loner (this situation may happen when min_genes_required = 1)
Return type: bool
-
multi_system
¶ Returns: True if the hit represents a multi_system macsypy.gene.ModelGene, False otherwise.
Return type: bool
-
Loner¶
-
class
macsypy.hit.
Loner
(hit, gene_ref=None, gene_status=None, counterpart=None)[source]¶ Handles a hit which encodes a gene tagged as loner and which does not cluster with other hits.
-
__init__
(hit, gene_ref=None, gene_status=None, counterpart=None)[source]¶ Hit that is outside a cluster; the gene_ref is a loner.
Parameters:
- hit (macsypy.hit.CoreHit object) – a match between a hmm profile and a replicon
- gene_ref (macsypy.gene.ModelGene object) – the ModelGene linked to this hit. The ModelGene has the same name as the CoreGene, but one hit can be linked to several ModelGenes (several Models). To know for which gene this hit plays a role, use the macsypy.gene.ModelGene.alternate_of() method: hit.gene_ref.alternate_of()
- gene_status (macsypy.gene.GeneStatus object)
- counterpart (list of macsypy.hit.CoreHit) – the other occurrences of the gene, or of its exchangeables, in the replicon
-
loner
¶ Returns: True if the hit represents a loner macsypy.gene.ModelGene, False otherwise. A true Loner is a hit representing a gene with the loner attribute and which is not included in a cluster.
- a hit representing a loner gene but included in a cluster is not a true Loner
- a hit which is not included with other genes in a cluster but does not represent a loner gene is not a true Loner (this situation may happen when min_genes_required = 1)
Return type: bool
-
MultiSystem¶
-
class
macsypy.hit.
MultiSystem
(hit, gene_ref=None, gene_status=None, counterpart=None)[source]¶ Handles a hit which encodes a gene tagged as multi-system, i.e. a hit which can be used in several Systems (same model).
-
__init__
(hit, gene_ref=None, gene_status=None, counterpart=None)[source]¶ Hit that is outside a cluster; the gene_ref is a loner.
Parameters:
- hit (macsypy.hit.CoreHit object) – a match between a hmm profile and a replicon
- gene_ref (macsypy.gene.ModelGene object) – the ModelGene linked to this hit. The ModelGene has the same name as the CoreGene, but one hit can be linked to several ModelGenes (several Models). To know for which gene this hit plays a role, use the macsypy.gene.ModelGene.alternate_of() method: hit.gene_ref.alternate_of()
- gene_status (macsypy.gene.GeneStatus object)
- counterpart (list of macsypy.hit.CoreHit) – the other occurrences of the gene, or of its exchangeables, in the replicon
-
multi_system
¶ Returns: True if the hit represents a multi_system macsypy.gene.ModelGene, False otherwise.
Return type: bool
-
LonerMultiSystem¶
-
class
macsypy.hit.
LonerMultiSystem
(hit, gene_ref=None, gene_status=None, counterpart=None)[source]¶ Handles a hit which encodes a gene that is
- tagged as multi-system,
- also tagged as loner,
- and whose hit does not cluster with other hits.
-
__init__
(hit, gene_ref=None, gene_status=None, counterpart=None)[source]¶ Hit that is outside a cluster; the gene_ref is loner and multi_system.
Parameters:
- hit (macsypy.hit.CoreHit | macsypy.hit.ModelHit | macsypy.hit.MultiSystem object) – a match between a hmm profile and a replicon
- gene_ref (macsypy.gene.ModelGene object) – the ModelGene linked to this hit. The ModelGene has the same name as the CoreGene, but one hit can be linked to several ModelGenes (several Models). To know for which gene this hit plays a role, use the macsypy.gene.ModelGene.alternate_of() method: hit.gene_ref.alternate_of()
- gene_status (macsypy.gene.GeneStatus object)
- counterpart (list of macsypy.hit.CoreHit) – the other occurrences of the gene, or of its exchangeables, in the replicon
HitWeight¶
-
class
macsypy.hit.
HitWeight
(itself: float = 1, exchangeable: float = 0.8, mandatory: float = 1, accessory: float = 0.5, neutral: float = 0, out_of_cluster: float = 0.7)[source]¶ The weights used to compute the cluster and system scores. See the user documentation (MacSyFinder functioning) for further details. By default:
- itself = 1
- exchangeable = 0.8
- mandatory = 1
- accessory = 0.5
- neutral = 0
- out_of_cluster = 0.7
-
__weakref__
¶ list of weak references to the object (if defined)
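As an illustration of how such a set of weights might be carried around and overridden, here is a minimal sketch. `ToyHitWeight` is a hypothetical stand-in that only copies the field names and defaults listed above, and `toy_hit_score` is an assumed combination rule written for this example; the actual scoring rules are those described in the user documentation.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyHitWeight:
    """Hypothetical stand-in mirroring the documented fields and defaults."""
    itself: float = 1
    exchangeable: float = 0.8
    mandatory: float = 1
    accessory: float = 0.5
    neutral: float = 0
    out_of_cluster: float = 0.7


def toy_hit_score(weights, status, is_exchangeable=False, out_of_cluster=False):
    """Assumed rule, for illustration only: status weight, scaled by the
    'itself' or 'exchangeable' weight, then by 'out_of_cluster' if relevant."""
    score = getattr(weights, status)  # 'mandatory', 'accessory' or 'neutral'
    score *= weights.exchangeable if is_exchangeable else weights.itself
    if out_of_cluster:
        score *= weights.out_of_cluster
    return score


default_weights = ToyHitWeight()
# Frozen instances are not modified in place; replace() builds a new one.
custom_weights = replace(default_weights, accessory=0.75)

s_mandatory = toy_hit_score(default_weights, "mandatory")
s_exchangeable = toy_hit_score(default_weights, "accessory", is_exchangeable=True)
s_loner = toy_hit_score(custom_weights, "accessory", out_of_cluster=True)
```

Keeping the weights in one immutable object makes it easy to pass a single scoring configuration through the whole computation.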
get_best_hit_4_func¶
-
macsypy.hit.
get_best_hit_4_func
(function, hits, key='score')[source]¶ Select the best Loner among several hits encoding the same function, based on one of the following criteria:
- score
- i_evalue
- profile_coverage
Parameters:
- function (str) – the name of the function fulfilled by the hits (all hits must have the same function)
- hits (sequence of macsypy.hit.ModelHit objects) – the hits to filter
- key (str) – the criterion used to select the best hit: ‘score’, ‘i_evalue’ or ‘profile_coverage’
Returns: the best hit
Return type: macsypy.hit.ModelHit object
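Selecting the best hit by a named criterion can be sketched as follows. `ToyHit`, `CRITERIA` and `pick_best` are hypothetical names introduced for this example. The direction of each criterion (higher score, lower i-evalue, higher profile coverage) is an assumption based on the usual meaning of these statistics; it is not stated in the reference above.

```python
from collections import namedtuple
from operator import attrgetter

# Toy stand-in for a hit; only the fields needed for the selection.
ToyHit = namedtuple("ToyHit", ["name", "score", "i_eval", "profile_coverage"])

# criterion -> (attribute to read, whether a higher value is better)
# The directions are assumptions made for this sketch.
CRITERIA = {
    "score": ("score", True),
    "i_evalue": ("i_eval", False),
    "profile_coverage": ("profile_coverage", True),
}


def pick_best(hits, key="score"):
    """Return the best of `hits` according to `key` (hypothetical helper)."""
    try:
        attr, higher_is_better = CRITERIA[key]
    except KeyError:
        raise ValueError(f"unknown criterion: {key!r}") from None
    chooser = max if higher_is_better else min
    return chooser(hits, key=attrgetter(attr))


hits = [
    ToyHit("hit_1", score=120.0, i_eval=1e-30, profile_coverage=0.80),
    ToyHit("hit_2", score=95.0, i_eval=1e-35, profile_coverage=0.95),
]
best_by_score = pick_best(hits)
best_by_evalue = pick_best(hits, key="i_evalue")
best_by_coverage = pick_best(hits, key="profile_coverage")
```

In this made-up data set the choice of criterion changes the result: `hit_1` wins on score, while `hit_2` wins on i-evalue and on profile coverage.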
sort_model_hits¶
-
macsypy.hit.
sort_model_hits
(model_hits)[source]¶ Sort macsypy.hit.ModelHit objects per function.
Parameters: model_hits – a sequence of macsypy.hit.ModelHit objects
Returns: dict {str function name: [model_hit, …] }
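The returned structure, a dict mapping each function name to the list of hits fulfilling it, can be sketched with toy objects. `ToyModelHit` and its `function` field are hypothetical; how the function name is obtained from a real ModelHit is not shown here.

```python
from collections import defaultdict, namedtuple

# Toy stand-in: `function` holds the name of the function the hit fulfils.
ToyModelHit = namedtuple("ToyModelHit", ["hit_id", "function"])


def group_by_function(model_hits):
    """Group hits in a dict {function name: [hit, ...]} (hypothetical helper)."""
    grouped = defaultdict(list)
    for hit in model_hits:
        grouped[hit.function].append(hit)
    return dict(grouped)  # plain dict, input order preserved within each list


model_hits = [
    ToyModelHit("h1", "func_A"),
    ToyModelHit("h2", "func_B"),
    ToyModelHit("h3", "func_A"),
]
per_function = group_by_function(model_hits)
```

Here `per_function["func_A"]` holds `h1` and `h3`, and `per_function["func_B"]` holds `h2`.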
compute_best_MSHit¶
get_best_hits¶
-
macsypy.hit.
get_best_hits
(hits, key='score')[source]¶ If several hits match the same protein, keep only the best match, based on one of the following criteria:
- score
- i_evalue
- profile_coverage
Parameters:
- hits ([macsypy.hit.CoreHit object, …]) – the hits to filter; all hits must match the same protein
- key (str) – the criterion used to select the best hit: ‘score’, ‘i_evalue’ or ‘profile_coverage’
Returns: the list of the best hits
Return type: [macsypy.hit.CoreHit object, …]
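The idea of keeping one hit per protein when several profiles match it can be sketched as below. This is an illustration under stated assumptions, not the library's implementation: `ToyCoreHit` and `keep_best_per_protein` are hypothetical, proteins are identified by their `position` in the dataset, and only the score criterion (higher is better) is shown.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

# Toy stand-in: `position` identifies the matched protein in the dataset.
ToyCoreHit = namedtuple("ToyCoreHit", ["gene_name", "position", "score"])


def keep_best_per_protein(hits):
    """For each protein (same position), keep the hit with the highest score."""
    by_position = attrgetter("position")
    ordered = sorted(hits, key=by_position)  # groupby needs sorted input
    return [
        max(group, key=attrgetter("score"))
        for _, group in groupby(ordered, key=by_position)
    ]


hits = [
    ToyCoreHit("geneA", position=10, score=50.0),
    ToyCoreHit("geneB", position=10, score=80.0),  # same protein, better score
    ToyCoreHit("geneC", position=42, score=30.0),
]
best = keep_best_per_protein(hits)
```

In this made-up example, the protein at position 10 is matched by two profiles and only the `geneB` hit is kept, alongside the single `geneC` hit at position 42.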