Installation

MacSyFinder works with models for macromolecular systems that are not shipped with it; you have to install them separately. See the macsydata section below. We also provide containers so you can use MacSyFinder directly.

MacSyFinder dependencies

Python version >=3.7 is required to run MacSyFinder: https://docs.python.org/3.7/index.html
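The version requirement can be checked before installing anything. The snippet below is a minimal sketch, assuming the interpreter is called python3 on your system; python_is_recent_enough is a helper name chosen for this illustration only.

```shell
# Succeeds (exit status 0) when the given interpreter is Python >= 3.7.
# The interpreter name defaults to python3 (an assumption about your system).
python_is_recent_enough() {
    "${1:-python3}" -c 'import sys; sys.exit(0 if sys.version_info >= (3, 7) else 1)'
}

if python_is_recent_enough python3; then
    echo "python3 is recent enough for MacSyFinder"
else
    echo "python3 is missing or older than 3.7" >&2
fi
```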

MacSyFinder has one program dependency:

The hmmsearch program should be installed (e.g., in the PATH) in order to use MacSyFinder. Otherwise, the path to this executable must be specified on the command line: see the command-line options.
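To see whether hmmsearch is visible from your shell, a quick check along the following lines may help. This is only a sketch: locate_tool is a hypothetical helper name, and the check relies on the standard command -v shell builtin rather than on anything provided by MacSyFinder.

```shell
# Print the location of a program if it is on the PATH; fail otherwise.
# locate_tool is a helper defined for this illustration only.
locate_tool() {
    command -v "$1" 2>/dev/null
}

if hmmsearch_path=$(locate_tool hmmsearch); then
    echo "hmmsearch found: ${hmmsearch_path}"
else
    echo "hmmsearch not found in PATH; its path must be given explicitly" >&2
fi
```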

MacSyFinder also relies on six Python library dependencies:

  • colorlog
  • colorama
  • pyyaml
  • packaging
  • networkx
  • pandas

These dependencies will be automatically retrieved and installed when using pip for installation (see below).
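If you want to confirm which of these libraries are already importable in your environment, a loop such as the one below can be used. It is a sketch under two assumptions: the interpreter is called python3, and the pyyaml distribution is imported under the module name yaml. module_available is a helper name introduced here for illustration.

```shell
# Succeeds when the given Python module can be imported by python3.
module_available() {
    python3 -c "import $1" 2>/dev/null
}

# The pyyaml distribution is imported as "yaml"; the other names match.
for module in colorlog colorama yaml packaging networkx pandas; do
    if module_available "$module"; then
        echo "ok       $module"
    else
        echo "missing  $module"
    fi
done
```

A library reported as missing is not a problem at this stage, since pip installs it together with MacSyFinder.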

MacSyFinder Installation procedure

It is recommended to use pip to install the MacSyFinder package.

Archive overview

  • doc => the documentation in html and pdf
  • test => everything needed for the unit tests
  • macsypy => the macsyfinder python library
  • setup.py => the installation script
  • setup.cfg => the installation configuration
  • pyproject.toml => the project installation build tool
  • COPYING => the licensing
  • COPYRIGHT => the copyright
  • README.md => very brief macsyfinder overview
  • CONTRIBUTORS => list of people who contributed to the code

Installation steps:

Make sure every required dependency/software is present.

By default, MacSyFinder will try to use the hmmsearch found in your PATH. If hmmsearch is not in the PATH, you have to set the absolute path to hmmsearch in a configuration file or on the command line upon execution. If the tool is not in the PATH, some tests will be skipped and a warning will be raised.

Perform the installation.

pip install macsyfinder

If you do not have the privileges to perform a system-wide installation, you can either install it in your home directory or use a virtual environment.

Installation in your home directory:

pip install --user macsyfinder

Installation in a virtualenv:

python3 -m venv macsyfinder
cd macsyfinder
source bin/activate
pip install macsyfinder

To exit the virtualenv, just execute the deactivate command. To run macsyfinder again later, you first need to activate the virtualenv:

source macsyfinder/bin/activate

Then run macsyfinder or macsydata.

Note

Super-user privileges (i.e., sudo) are necessary if you want to install the program system-wide.

Note

If you do not have the privileges, or if you do not want to install MacSyFinder in the Python libraries of your system, you can install MacSyFinder in a virtual environment (http://www.virtualenv.org/).

Warning

When installing a new version of MacSyFinder, do not forget to uninstall the previously installed version!

Uninstalling MacSyFinder

To uninstall MacSyFinder (the last version installed), run:

(sudo) pip uninstall macsyfinder

If you installed it in a virtualenv, just delete the virtual environment. For instance, if you created a virtualenv named macsyfinder:

python3 -m venv macsyfinder

To delete it, remove the directory:

rm -R macsyfinder

From container

With Docker

The Docker image is available on Docker Hub (https://hub.docker.com/repository/docker/gempasteur/macsyfinder). The computations are performed under the msf user in /home/msf inside the container, so you have to mount a directory from the host in the container to exchange data (input data and results) between the host and the container. The shared directory must be writable by the msf user; alternatively, override the user in the container with your own id (see the example below).

Furthermore, the models are no longer packaged with MacSyFinder, so you have to install them yourself. For that we provide a command-line tool, macsydata, which is inspired by pip.

macsydata search PACKNAME
macsydata install PACKNAME== or >=, or ... VERSION

To work with Docker, you have to install the models in a directory that will be mounted in the container at run time:

mkdir shared_dir
cd shared_dir

Install the desired models in the my_models directory:

docker run -v ${PWD}/:/home/msf -u $(id -u ${USER}):$(id -g ${USER})  gempasteur/macsyfinder:<tag> macsydata install --target /home/msf/my_models <MODELS_PACK>

Run MacSyFinder against all models contained in <MODELS_PACK>:

docker run -v ${PWD}/:/home/msf -u $(id -u ${USER}):$(id -g ${USER})  gempasteur/macsyfinder:<tag> macsyfinder --db-type unordered_replicon --models-dir=/home/msf/my_models/ --models  <MODELS_PACK>  all --sequence-db my_genome.fasta -w 12

With Apptainer (formerly Singularity)

As the Docker image is registered on Docker Hub, you can also use it directly with Apptainer (https://apptainer.org/). Unlike with Docker, you do not have to worry about a shared directory: your HOME and /tmp are automatically shared.

# install desired models in my_models directory
apptainer run -H ${HOME} docker://gempasteur/macsyfinder:<tag> macsydata install --target my_models <MODELS_PACK>

# run msf against all models contained in <MODELS_PACK>
apptainer run -H ${HOME} docker://gempasteur/macsyfinder:<tag> macsyfinder --db-type unordered_replicon --models-dir=my_models --models <MODELS_PACK> all --sequence-db my_genome.fasta -w 12

If you intend to run Apptainer from a host that cannot access the internet (a cluster node, for instance), you have to:

  1. download the image locally,
  2. transfer the image file to the right file system,
  3. and then use it.

apptainer build msf-<tag>.simg docker://gempasteur/macsyfinder:<tag>
cp msf-<tag>.simg <cluster_file_system>
apptainer run -H ${HOME} msf-<tag>.simg macsyfinder --db-type unordered_replicon --models-dir=my_models --models <MODELS_PACK> all --sequence-db my_genome.fasta -w 12

Models installation with macsydata

Once MacSyFinder is installed, you have access to a utility program to manage the models: macsydata.

This script lets you search for, download, install, and get information about MacSyFinder models stored on GitHub (https://github.com/macsy-models) or installed locally. The general syntax for macsydata is:

macsydata <general options> <subcommand> <sub command options> <arguments>

To list all models available on macsy-models:

macsydata available

To search for models on macsy-models:

macsydata search TXSS

You can also search in the model descriptions:

macsydata search -S secretion

To install a model package:

macsydata install <model name>

If you do not have the rights to install a model system-wide, you can install it in your home directory or in any directory of your choice.

To install it in your home (~/.macsyfinder/data):

macsydata install --user <model name>

To install it in any directory:

macsydata install --target <model dir> <model_name>

To know how to cite a model package:

macsydata cite <model name>

To show the model definition:

macsydata definition <package or subpackage> model1 [model2, ...]

For instance, to show the definitions of the models T6SSii and T6SSiii in the TXSS+/bacterial subpackage:

macsydata definition TXSS+/bacterial T6SSii T6SSiii

To show all model definitions in the TXSS+/bacterial subpackage:

macsydata definition TXSS+/bacterial

To list all macsydata subcommands:

macsydata --help

To list all available options for a subcommand:

macsydata <subcommand> --help

For models not stored in macsy-models, the available and search subcommands, as well as installation or upgrade from remote, are NOT available.

For models NOT stored in macsy-models, you have to manage them semi-manually. Download the archive (do not unarchive it), then use macsydata to install the archive.
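The semi-manual procedure can be sketched as follows. The archive name my_models-1.0.tar.gz is purely hypothetical, and the sketch assumes the downloaded archive is a gzipped tar file; is_model_archive is a helper written for this illustration, not part of macsydata.

```shell
# Succeeds when the file exists and is a readable gzipped tar archive.
# is_model_archive is a helper defined for this illustration only.
is_model_archive() {
    [ -f "$1" ] && tar -tzf "$1" >/dev/null 2>&1
}

archive="my_models-1.0.tar.gz"   # hypothetical file name, keep it archived

if is_model_archive "$archive"; then
    # Hand the archive itself (not an extracted directory) to macsydata.
    macsydata install "$archive"
else
    echo "$archive is not a readable .tar.gz archive" >&2
fi
```

The --user and --target options described above can be added to the install command if a system-wide installation is not possible.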