The Parser of Systems definitions

The system parser object “SystemParser” instantiates System and Gene objects from XML system definitions (see Macromolecular systems definition). Parsing consists of three phases.

Phase 1.

  • each Gene is parsed from the System in which it is defined
  • from the list of Systems to detect, the list of Systems to parse is established

Phase 2.

  • For each system to parse
    • create the System
    • add this System to the system_bank
    • create the Genes defined in this System with their attributes but not their Homologs
    • add these Genes in the gene_bank

Phase 3.

  • For each System to detect

    • For each Gene defined in this System:

      • create the Homologs by encapsulating Genes from the gene_bank
      • add the Gene to the System

For instance:

Syst_1
<system inter_gene_max_space="10">
    <gene name="A" mandatory="1" loner="1">
        <homologs>
            <gene name="B" sys_ref="Syst_2" />
        </homologs>
    </gene>
</system>

Syst_2
<system inter_gene_max_space="15">
    <gene name="B" mandatory="1">
        <homologs>
            <gene name="B" sys_ref="Syst_1" />
            <gene name="C" sys_ref="Syst_3" />
        </homologs>
    </gene>
</system>

Syst_3
<system inter_gene_max_space="20">
    <gene name="C" mandatory="1" />
</system>

With the example above, if Syst_1 is the only System to detect:

  • Syst_1 has a gene_A
  • gene_A has the homolog gene_B
  • gene_B has a reference to Syst_2
  • the gene_B attributes from Syst_2 are used to build the Gene
  • Syst_2 has the attributes defined in the corresponding XML file (inter_gene_max_space, ...)

Conversely:

  • the gene_B has no Homologs
  • the Syst_2 has no Genes

Note

The only “full” Systems (i.e., with all corresponding Genes created) are those to detect.
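
The three phases can be illustrated with a minimal sketch. It does not use the real macsypy classes: Gene, Homolog, System, DEFINITIONS and parse_sketch are hypothetical stand-ins written for this example, the definitions are kept in memory instead of being read from the definition directory, and the banks are plain dictionaries. It only mirrors the order of operations described above.

```python
import xml.etree.ElementTree as Et

# Hypothetical in-memory copies of the three definitions above
# (the real parser reads one XML file per system).
DEFINITIONS = {
    "Syst_1": '<system inter_gene_max_space="10">'
              '<gene name="A" mandatory="1" loner="1"><homologs>'
              '<gene name="B" sys_ref="Syst_2"/>'
              '</homologs></gene></system>',
    "Syst_2": '<system inter_gene_max_space="15">'
              '<gene name="B" mandatory="1"><homologs>'
              '<gene name="B" sys_ref="Syst_1"/>'
              '<gene name="C" sys_ref="Syst_3"/>'
              '</homologs></gene></system>',
    "Syst_3": '<system inter_gene_max_space="20">'
              '<gene name="C" mandatory="1"/></system>',
}


class Gene:  # toy stand-in, not macsypy.gene.Gene
    def __init__(self, name, system_name, attrs):
        self.name, self.system_name, self.attrs = name, system_name, attrs
        self.homologs = []


class Homolog:  # toy stand-in: wraps a Gene taken from the gene bank
    def __init__(self, gene, gene_ref):
        self.gene, self.gene_ref = gene, gene_ref


class System:  # toy stand-in, not macsypy.system.System
    def __init__(self, name, inter_gene_max_space):
        self.name, self.inter_gene_max_space = name, inter_gene_max_space
        self.genes = []


def systems_to_parse(names, parsed=None):
    """Phase 1: follow every sys_ref recursively, visiting each system once."""
    parsed = {} if parsed is None else parsed
    for name in names:
        if name in parsed:
            continue
        node = Et.fromstring(DEFINITIONS[name])
        parsed[name] = node
        refs = [g.get("sys_ref") for g in node.iter("gene") if g.get("sys_ref")]
        systems_to_parse(refs, parsed)
    return parsed


def parse_sketch(systems_2_detect):
    nodes = systems_to_parse(systems_2_detect)
    system_bank, gene_bank = {}, {}
    # Phase 2: create every System and its Genes, without Homologs.
    for name, node in nodes.items():
        system_bank[name] = System(name, int(node.get("inter_gene_max_space")))
        for gene_node in node.findall("gene"):
            gene_name = gene_node.get("name")
            gene_bank[gene_name] = Gene(gene_name, name, dict(gene_node.attrib))
    # Phase 3: only the Systems to detect get their Genes and Homologs.
    for name in systems_2_detect:
        for gene_node in nodes[name].findall("gene"):
            gene = gene_bank[gene_node.get("name")]
            for h_node in gene_node.findall("homologs/gene"):
                gene.homologs.append(Homolog(gene_bank[h_node.get("name")], gene))
            system_bank[name].genes.append(gene)
    return system_bank, gene_bank


system_bank, gene_bank = parse_sketch(["Syst_1"])
print([g.name for g in system_bank["Syst_1"].genes])
print([h.gene.name for h in gene_bank["A"].homologs])
print(system_bank["Syst_2"].genes, gene_bank["B"].homologs)
```

In this sketch, detecting only Syst_1 leaves Syst_2 and Syst_3 in the bank with their attributes but without Genes, and gene_B without Homologs, which is the situation described in the note above.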

SystemParser API reference

class macsypy.system_parser.SystemParser(cfg, system_bank, gene_bank)[source]

Build a System instance from the corresponding System definition described in the XML file (named after the system’s name) found at the dedicated location (“-d” command-line option).

__init__(cfg, system_bank, gene_bank)[source]

Constructor

Parameters:
__weakref__

list of weak references to the object (if defined)

_create_genes(system, system_node)[source]

Create the genes belonging to the system. Be careful: the returned genes do not have their homologs/analogs set yet.

Parameters:
  • system (macsypy.system.System object) – the System currently being parsed
  • system_node (Et.ElementTree object) – the XML node corresponding to the system, which contains the gene elements
Returns:

a list of the genes belonging to the system.

Return type:

[macsypy.gene.Gene, ...]

_create_system(system_name, system_node)[source]
Parameters:
  • system_name (string) – the name of the system to create. This name must match an XML file in the definition directory (“-d” option on the command line)
  • system_node (Et.ElementTree object) – the node corresponding to the system.
Returns:

the system corresponding to the name.

Return type:

macsypy.system.System object.

_fill(system, system_node)[source]

Fill the system with genes found in this system definition. Add homologs to the genes if necessary.

Parameters:
  • system (macsypy.system.System object) – the system to fill
  • system_node (Et.ElementTree object) – the “node” in the XML hierarchy corresponding to the system

_parse_analog(node, gene_ref, curr_system)[source]

Parse an XML gene element and build the corresponding Analog object

Parameters:
  • node (xml.etree.ElementTree.Element object) – a “node” corresponding to the gene element in the XML hierarchy
  • gene_ref (macsypy.gene.Gene object) – the gene to which this gene is an analog
Returns:

the gene object corresponding to the node

Return type:

macsypy.gene.Analog object

_parse_homolog(node, gene_ref, curr_system)[source]

Parse an XML gene element and build the corresponding Homolog object

Parameters:
  • node (xml.etree.ElementTree.Element object) – a “node” corresponding to the gene element in the XML hierarchy
  • gene_ref (macsypy.gene.Gene object) – the gene to which this gene is a homolog
Returns:

the gene object corresponding to the node

Return type:

macsypy.gene.Homolog object

check_consistency(systems)[source]

Check the consistency of the co-localization features between the different values given as an input: between XML definitions, configuration file, and command-line options.

Parameters:
  • systems (list of macsypy.system.System objects) – the list of systems to check
Raises:

macsypy.macsypy_error.SystemInconsistencyError if one test fails

(see feature)

In the different possible situations, different requirements need to be fulfilled (“mandatory_genes” and “accessory_genes” consist of lists of genes defined as such in the system definition):

  • If: min_mandatory_genes_required = None ; min_genes_required = None
  • Then: min_mandatory_genes_required = min_genes_required = len(mandatory_genes)

always True by design

  • If: min_mandatory_genes_required = Value ; min_genes_required = None
  • Then: min_mandatory_genes_required <= len(mandatory_genes)
  • AND min_genes_required = min_mandatory_genes_required

always True by design

  • If: min_mandatory_genes_required = None ; min_genes_required = Value
  • Then: min_mandatory_genes_required = len(mandatory_genes)
  • AND min_genes_required >= min_mandatory_genes_required
  • AND min_genes_required <= len(mandatory_genes+accessory_genes)

to be checked

  • If: min_mandatory_genes_required = Value ; min_genes_required = Value
  • Then: min_genes_required <= len(accessory_genes+mandatory_genes)
  • AND min_genes_required >= min_mandatory_genes_required
  • AND min_mandatory_genes_required <= len(mandatory_genes)

to be checked
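
The four cases can be summarised in a short sketch. This is not the code of check_consistency: check_quorum is a hypothetical helper, the exception class is a local stand-in for macsypy.macsypy_error.SystemInconsistencyError, and the sketch makes the simplifying assumption that the defaults are applied first and the three inequalities are then tested uniformly in every case.

```python
class SystemInconsistencyError(Exception):
    """Local stand-in for macsypy.macsypy_error.SystemInconsistencyError."""


def check_quorum(mandatory_genes, accessory_genes,
                 min_mandatory_genes_required=None, min_genes_required=None):
    """Apply the default values, then test the inequalities listed above.

    Returns the effective (min_mandatory_genes_required, min_genes_required).
    """
    n_mandatory = len(mandatory_genes)
    n_total = n_mandatory + len(accessory_genes)

    # Defaults: a missing value falls back as described in the "Then" lines.
    if min_mandatory_genes_required is None:
        min_mandatory_genes_required = n_mandatory
    if min_genes_required is None:
        min_genes_required = min_mandatory_genes_required

    if min_mandatory_genes_required > n_mandatory:
        raise SystemInconsistencyError(
            "min_mandatory_genes_required exceeds the number of mandatory genes")
    if min_genes_required < min_mandatory_genes_required:
        raise SystemInconsistencyError(
            "min_genes_required is lower than min_mandatory_genes_required")
    if min_genes_required > n_total:
        raise SystemInconsistencyError(
            "min_genes_required exceeds the number of mandatory and accessory genes")
    return min_mandatory_genes_required, min_genes_required


# Two mandatory genes and one accessory gene (illustrative names only).
print(check_quorum(["A", "B"], ["C"]))
print(check_quorum(["A", "B"], ["C"], min_genes_required=3))
```

With both values unset, the checks cannot fail, which corresponds to the "always True by design" case; the checks only matter once at least one value is supplied.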

parse(systems_2_detect)[source]
Parse systems definition in XML format to build the corresponding system objects,
and add them to the system factory after checking their consistency.
To get a parsed system, request it from the system_bank.

Parameters:
  • systems_2_detect (list of string) – a list with the names of the systems to parse (e.g. ‘T2SS’)

system_to_parse(sys_2_parse, parsed_systems)[source]
Parameters:
  • sys_2_parse ([string, ..]) – the list of names of the systems to parse
Returns:

the list of systems’ names to parse. The whole chain of ‘system_ref’ is scanned recursively.

Return type:

[string, ..]