gene

The Gene object represents a gene encoding a protein component of a Model. There are two kinds of gene. A CoreGene (macsypy.gene.CoreGene) is unique for a given name and must have a corresponding HMM protein profile. A ModelGene encapsulates a CoreGene and is linked to a Model.

Warning

To optimize computation and to avoid concurrency problems when several models are searched, each gene must be instantiated only once and stored in gene_bank. gene_bank is a macsypy.gene.GeneBank object. The gene_bank and the model_bank (a macsypy.model.ModelBank object) are instantiated in the macsypy.scripts.macsyfinder.main() function and filled by a definition_parser (macsypy.definition_parser.DefinitionParser).

Example to get a CoreGene object:

# get a model object
model_a = model_bank["TXSS/model_a"]
model_b = model_bank["TXSS/model_b"]

# get a <CoreGene> object
t2ss = gene_bank[("TXSS", "T2SS")]
pilO = gene_bank[("TXSS", "pilO")]

To create a ModelGene:

modelA_t2ss = ModelGene(t2ss, model_a)
modelA_pilO = ModelGene(pilO, model_a, loner=True, inter_gene_max_space=12)
modelB_pilO = ModelGene(pilO, model_b, inter_gene_max_space=5)

There is only one instance of a CoreGene with a given name (model family name, gene name) in one MacSyFinder run, but several instances of ModelGene with the same name may exist. Above, there are two ModelGene objects representing pilO, one in model_a and the other in model_b, with different properties.
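The relationship between the two kinds of gene can be illustrated without importing macsypy. In the minimal sketch below, ToyCoreGene and ToyModelGene are hypothetical stand-ins that only mimic the constructor arguments documented on this page, and models are represented by plain strings; it is not the library's implementation.

```python
class ToyCoreGene:
    """Hypothetical stand-in for a core gene, identified by (family, name)."""

    def __init__(self, family, name):
        self.model_family_name = family
        self.name = name


class ToyModelGene:
    """Hypothetical stand-in for a model gene: it wraps one core gene and
    carries the properties that are specific to a single model."""

    def __init__(self, gene, model, loner=False, multi_system=False,
                 inter_gene_max_space=None):
        self.core_gene = gene
        self.model = model
        self.loner = loner
        self.multi_system = multi_system
        self.inter_gene_max_space = inter_gene_max_space

    @property
    def name(self):
        # The name is delegated to the shared core gene.
        return self.core_gene.name


# One core gene, two model genes with different per-model properties.
pilO = ToyCoreGene("TXSS", "pilO")
model_a_pilO = ToyModelGene(pilO, "model_a", loner=True, inter_gene_max_space=12)
model_b_pilO = ToyModelGene(pilO, "model_b", inter_gene_max_space=5)

same_core = model_a_pilO.core_gene is model_b_pilO.core_gene
print("same core gene:", same_core)
print("loner in model_a / model_b:", model_a_pilO.loner, model_b_pilO.loner)
```

Both toy model genes share the same name because they delegate to the same core object, while loner and inter_gene_max_space differ from one model to the other.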

Exchangeable inherits from ModelGene. A gene can therefore be seen as a ModelGene in some models and as an Exchangeable in others, but there is only one instance of the corresponding CoreGene:

core_sctn = gene_bank[("TXSS", "sctN")]
core_sctn_flg = gene_bank[("TXSS", "sctN_FLG")]
model_sctn = ModelGene(core_sctn, model_a)
ex_sctn_flg = Exchangeable(core_sctn_flg, model_sctn)
model_sctn.add_exchangeable(ex_sctn_flg)

model_sctn_flg = ModelGene(core_sctn_flg, model_b)

This means that in model_a the gene sctN can be functionally replaced by sctN_FLG. In model_a, sctN_FLG appears as an alternative to sctN, but in model_b it appears as a gene in its own right. In one MacSyFinder run, several instances of ModelGene and/or Exchangeable with the same name may coexist, but there is only one instance each of core_sctn and core_sctn_flg.

GeneBank

class macsypy.gene.GeneBank[source]

Store all Gene objects. Ensure that genes are instantiated only once.

__contains__(gene)[source]

Implement the membership test operator

Parameters: gene (macsypy.gene.CoreGene object) – the gene to test
Returns: True if the gene is in the bank, False otherwise
Return type: boolean

__getitem__(key)[source]

Parameters: key (tuple (string, string)) – The key to retrieve a gene. The key is composed of the name of the model family and the gene name, for instance (‘CRISPR-Cas’, ‘cas9_TypeIIB’) for CRISPR-Cas/cas9_TypeIIB or (‘TXSS’, ‘T6SS_tssH’) for TXSS/T6SS_tssH
Returns: the gene corresponding to the key.
Return type: macsypy.gene.CoreGene object
Raises: KeyError – if the key does not exist in the GeneBank.

__init__()[source]

Initialize self. See help(type(self)) for accurate signature.

__iter__()[source]

Return an iterator object on the genes contained in the bank

__weakref__

list of weak references to the object (if defined)

add_new_gene(model_location, name, profile_factory)[source]

Create a gene and store it in the bank. If the same gene (same name) is added twice, it is created only the first time.

Parameters:
  • model_location (macsypy.registry.ModelLocation object) – the location where the model family can be found.
  • name (str) – the name of the gene to add
  • profile_factory (profile.ProfileFactory object.) – The Profile factory

genes_fqn()[source]

Returns: the fully qualified name for all genes in the bank
Return type: str
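The "create once, then look up by key" behaviour of the bank can be sketched with a dictionary keyed by (family name, gene name). ToyGene and ToyGeneBank below are hypothetical stand-ins written for illustration. In particular, this toy add_new_gene takes a family name string, whereas the documented method takes a ModelLocation and a profile factory.

```python
class ToyGene:
    """Hypothetical stand-in for a core gene."""

    def __init__(self, family, name):
        self.model_family_name = family
        self.name = name


class ToyGeneBank:
    """Hypothetical bank: at most one gene object per (family, name) key."""

    def __init__(self):
        self._genes = {}

    def add_new_gene(self, family, name):
        key = (family, name)
        # The gene is created only the first time the key is seen.
        if key not in self._genes:
            self._genes[key] = ToyGene(family, name)

    def __getitem__(self, key):
        # A missing key raises KeyError, as a plain dict lookup does.
        return self._genes[key]

    def __contains__(self, gene):
        return any(gene is stored for stored in self._genes.values())

    def __iter__(self):
        return iter(self._genes.values())


bank = ToyGeneBank()
bank.add_new_gene("TXSS", "T2SS")
first = bank[("TXSS", "T2SS")]
bank.add_new_gene("TXSS", "T2SS")  # second call does not create a new object
second = bank[("TXSS", "T2SS")]
print("same object:", first is second)
print("genes in bank:", len(list(bank)))
```

Keying on the (family, name) tuple lets two model families use the same gene name without clashing.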

Gene

There are two classes to model a gene: macsypy.gene.CoreGene and macsypy.gene.ModelGene. CoreGene objects are created using the macsypy.gene.GeneBank factory, and there is only one instance of a CoreGene with a given name. In contrast, several ModelGene objects with the same name can appear in different models and can have different properties: loner in one model and not in another, different inter_gene_max_space values, and so on. A ModelGene is attached to a model and is composed of a CoreGene.

Note

A macsypy.hit.Hit object is linked to a CoreGene, whereas the ref_gene attribute of a macsypy.hit.ValidHit references a macsypy.gene.ModelGene.

CoreGene

class macsypy.gene.CoreGene(model_location, name, profile_factory)[source]

Model a gene attached to a profile. There can be only one instance with a given name (family name, gene name).

__hash__()[source]

Return hash(self).

__init__(model_location, name, profile_factory)[source]

Initialize self. See help(type(self)) for accurate signature.

__weakref__

list of weak references to the object (if defined)

model_family_name

The name of the model family, for instance ‘CRISPRCas’ or ‘TXSS’

name

The name of the gene. An HMM profile with the same name must exist.

profile

The HMM protein profile corresponding to this gene (a macsypy.profile.Profile object).

ModelGene

class macsypy.gene.ModelGene(gene, model, loner=False, multi_system=False, inter_gene_max_space=None)[source]

Handle a gene described in a Model

__hash__()[source]

Return hash(self).

__init__(gene, model, loner=False, multi_system=False, inter_gene_max_space=None)[source]

Handle gene described in a Model

Parameters:
  • gene (a macsypy.gene.CoreGene object.) – a gene linked to a profile
  • model (macsypy.model.Model object.) – the model that owns this Gene
  • loner (boolean.) – True if the Gene can be isolated on the genome (with no contiguous genes), False otherwise.
  • multi_system (boolean.) – True if this Gene can belong to different occurrences of this System.
  • inter_gene_max_space (integer) – the maximum space between this Gene and another gene of the System.

__str__()[source]

Print the name of the gene and of its exchangeable genes.

__weakref__

list of weak references to the object (if defined)

add_exchangeable(exchangeable)[source]

Add an exchangeable gene to this Gene

Parameters: exchangeable (macsypy.gene.Exchangeable object) – the exchangeable to add

alternate_of()[source]

Returns: the gene of which this one is an exchangeable (the reference gene), or itself if it is a first-level gene.
Return type: macsypy.gene.ModelGene object

exchangeables

Returns: the list of genes which can replace this one without any effect on the model
Return type: list of macsypy.gene.ModelGene objects

inter_gene_max_space

Returns: The maximum distance allowed between this gene and another gene for them to be considered co-localized. If the value is not set at the Gene level, return the value set at the System level.
Return type: integer.

is_accessory(model)[source]

Parameters: model (macsypy.model.Model object.) – the model to test against
Returns: True if the gene is within the accessory genes of the model, False otherwise.
Return type: boolean.

is_exchangeable

Returns: True if this gene is described in the model as an exchangeable, False if it is described as a first-level gene.

is_forbidden(model)[source]

Parameters: model (macsypy.model.Model object.) – the model to test against
Returns: True if the gene is within the forbidden genes of the model, False otherwise.
Return type: boolean.

is_mandatory(model)[source]

Parameters: model (macsypy.model.Model object.) – the model to test against
Returns: True if the gene is within the mandatory genes of the model, False otherwise.
Return type: boolean.

loner

Returns: True if the gene can be isolated on the genome, False otherwise
Return type: boolean

model

Returns: the Model that owns this Gene
Return type: macsypy.model.Model object

multi_system

Returns: True if this Gene can belong to different occurrences of the model (and can be used for multiple System assessments), False otherwise.
Return type: boolean.
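The fallback described for inter_gene_max_space (use the gene-level value when it is set, otherwise the value of the owning model) can be sketched as a property. ToyModel and ToyModelGene are hypothetical stand-ins; the mandatory_genes attribute and the identity-based membership test in is_mandatory are assumptions made for illustration, not the library's code.

```python
class ToyModel:
    """Hypothetical model holding a default spacing and a gene list."""

    def __init__(self, inter_gene_max_space):
        self.inter_gene_max_space = inter_gene_max_space
        self.mandatory_genes = []


class ToyModelGene:
    """Hypothetical model gene with an optional gene-level spacing."""

    def __init__(self, name, model, inter_gene_max_space=None):
        self.name = name
        self.model = model
        self._inter_gene_max_space = inter_gene_max_space

    @property
    def inter_gene_max_space(self):
        # Compare with None so that an explicit 0 is kept as a valid value.
        if self._inter_gene_max_space is not None:
            return self._inter_gene_max_space
        return self.model.inter_gene_max_space

    def is_mandatory(self, model):
        # Assumed here to be a membership test by object identity.
        return any(self is gene for gene in model.mandatory_genes)


model_a = ToyModel(inter_gene_max_space=20)
gene_with_value = ToyModelGene("pilO", model_a, inter_gene_max_space=12)
gene_without_value = ToyModelGene("T2SS", model_a)
model_a.mandatory_genes.append(gene_without_value)

print(gene_with_value.inter_gene_max_space)     # gene-level value
print(gene_without_value.inter_gene_max_space)  # falls back to the model value
```

Resolving the value at access time, rather than copying it in the constructor, means a later change to the model-level default is still reflected by genes that have no value of their own.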

Exchangeable

class macsypy.gene.Exchangeable(c_gene, gene_ref)[source]

Handle exchangeables. An Exchangeable is a ModelGene which can functionally replace another ModelGene. Biologically, it can be a homolog or an analog.

__init__(c_gene, gene_ref)[source]
Parameters:

add_exchangeable(exchangeable)[source]

This method should never be called; it is a safeguard that prevents adding an exchangeable to an exchangeable.

Parameters: exchangeable (macsypy.gene.Exchangeable) –
Raises: MacsypyError

alternate_of()[source]

Returns: the gene of which this one is an exchangeable (the reference gene)
Return type: macsypy.gene.ModelGene object

is_exchangeable

Returns: True
GeneStatus

class macsypy.gene.GeneStatus[source]

Handle the status of a gene. GeneStatus can take 4 values:

  • MANDATORY
  • ACCESSORY
  • FORBIDDEN
  • NEUTRAL
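A fixed set of four statuses like the one listed above is commonly represented in Python with enum.Enum. The sketch below is only an illustration: ToyGeneStatus is a hypothetical name, and both the use of Enum and the member values produced by auto() are assumptions, not taken from the macsypy source.

```python
from enum import Enum, auto


class ToyGeneStatus(Enum):
    """Hypothetical enumeration mirroring the four documented statuses."""

    MANDATORY = auto()
    ACCESSORY = auto()
    FORBIDDEN = auto()
    NEUTRAL = auto()


def status_from_text(text):
    """Look up a status by its name, ignoring case (illustrative helper)."""
    return ToyGeneStatus[text.upper()]


status = status_from_text("mandatory")
print(status.name)
print([member.name for member in ToyGeneStatus])
```

Looking a member up by name raises KeyError for an unknown status, which makes a typo in a status string fail immediately instead of passing silently.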