Snakeparse API¶
API¶
Snakeparse: command-line parsing library for Snakemake¶
This module allows multiple Snakemake workflows to be combined into one command line interface. Arguments to Snakemake and workflow-specific arguments are supported.
This module is inspired by tool-chains like fgbio, Picard, and samtools, that:
- support multiple tools which all can be specified on the same command line
- combines the argument parsing from all tools into one command line
- supports dispatching the tool-specific arguments to sub-parsers
- enable argument parsing for the tool-chain, that can propogate to all tools; in this case, arugenets for Snakemake.
The following is a minimumal usage example for how to use the API:
>>> SnakeParse(args=sys.argv[1:]).run()
The magic comes from creating a configuration object
(SnakeParseConfig
) that configures the paths to where
the Snakemake files (snakefiles) live, as well as various options for how
workflows are displayed on the command line. Once the configuration object has
been created, it’s as simple as:
>>> SnakeParse(args=sys.argv[1:], config=config)
The given arguments may contain the argument separator --
. All arguments
prior will be passed to Snakemake, while all arguments after will be passed to
the specified workflow. Which workflow to run is determined as follows:
- If the argument separator is present, then if there is only one workflow configured, use that one, otherwise, assume the name of the workflow is specified immediate after the argument separator.
- If the argument separator is not present, search for the first argument that matches a known workflow name.
If no workflows are configured, but the -s/--snakefile
option is given before
the argument separator, then this workflow is added to the list of workflows,
and that workflow will be executed.
For a more typical example, suppose the path to the the snakefiles is
~/snakemake-workflows
, then the following will work:
>>> config = SnakeParseConfig(snakefile_globs='~/snakemake-workflows/*')
>>> SnakeParse(args=sys.argv[1:], config=config).run()
Below are some example argument lists. Suppose the workflow named Example
has
a single required command line option --message
that takes a message to be
printed.
For running a single workflow:
>>> args = ['--snakefile', '/path/to/snakefile', '--', '--message', 'Hello!']
If a single workflow has been added via
SnakeParseConfig
object:
>>> args = ['Example', '--message', 'Hello!']
Alternatively, the workflow name can be omitted when only one workflow has been configured:
>>> args = ['--', '--message', 'Hello!']
If multiple workflows are configured, then the name must be explictly used.
In some cases, the options to Snakemake take multiple values, so it is ambiguous
where the arguments to Snakemake end and the arguments to the workflow begin.
Use the --
argument to explicitly seperate the two lists. The workflow name
should be immediately after the --
seperator:
>>> args = ['--force-run', 'rule-1', 'rule-2', '--', 'Example', '--message', 'Hello!']]
In the above example, the arguments ['--force-run', 'rule-1', 'rule-2']
are
passed to Snakemake, while the arguments ['--message', 'Hello!']
are passed to
the SnakeParser for the Example workflow.
Ther are two ways for your snakefile source to receive the parsed arguments: (1)
define a concrete subclass of SnakeParse
, or (2) define
a method snakeparser(**kwargs)
that returns a concrete sub-class of
SnakeParse
. For convenience when implementing parsing
using the argparse module, the class
SnakeArgumentParser
can be used for (1), while the
method argparser()
can be used for (2). For the
Example
workflow above, an example implementation with a concrete class
definition is as follows:
>>> from snakeparse.api import SnakeArgumentParser
>>> class Parser(SnakeArgumentParser):
... def __init__(self, **kwargs):
... super().__init__(**kwargs)
... self.parser.add_argument('--message', help='The message.', required=True)
Alternatively, a method can be defined in ‘example_snakeparser.py’
>>> from snakeparse.parser import argparser
>>> def snakeparser(**kwargs):
... p = argparser(**kwargs)
... p.parser.add_argument('--message', help='The message.', required=True)
... return p
The module contains the following public classes:
SnakeParser
– The abstract base class that implements the workflow- specific argument parsing. This parser will be invoked by SnakeParse prior to running Snakemake, to ensure that the command line arguments are specified correctly. This parser is likely also used in the Snakemake file to re-instantiate the parsed arguments.
SnakeArgumentParser
– The abstract base class to help argument- parsers that use python’s argparse module.
SnakeParseException
– The exception raised by this module.
SnakeParseWorkflow
– A container class for basic meta information- about a supported workflow, including to but not limited to the name displayed on the command line, the paths to the snakefile and SnakeParse file, a workflow group to which this workflow belongs, and a short description to display on the commad line.
SnakeParseConfig
– The class used to configure SnakeParse. In- particular, where workflows are located (if they are to be discovered), definitions for the workflow (if they are to be explicitly defined), where SnakeParse files live (either generally or relative to the Snakemake files), the names of the workflow groups, and other miscellaneous options.
SnakeParse
– The main entry point for command-line parsing for- Snakemake. The configuration for SnakeParse will be optionally loaded, then the workflow to run will be parsed, then the workflow arguments will be parsed, and finally the workflow will be run along with all the Snakemake specific arguments.
All other classes in this module are considered implementation details.
-
class
snakeparse.api.
SnakeArgumentParser
(**kwargs: typing.Any) → None¶ The abstract base class to help argument parsers that use python’s argparse module. Keyword arguments will be passed to the argument parser’s constructor.
-
parse_args
(args: typing.List[str]) → typing.Any¶ Parses the command line arguments.
-
parse_args_file
(args_file: pathlib.Path) → typing.Any¶ Parses command line arguments from an arguments file
-
print_help
(file: typing.Union[typing.IO[str], NoneType] = None) → None¶ Prints the help message
-
-
class
snakeparse.api.
SnakeParse
(args: typing.List[str] = [], config: typing.Union[_ForwardRef('SnakeParseConfig'), NoneType] = None, debug: bool = False, file: typing.IO[str] = <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>) → None¶ The main entry point for command-line parsing for Snakemake.
Keyword Arguments: - args (List[str]) – The list of command line arguments.
- config (Optional['SnakeParseConfig']) – Optionally a SnakeParse configuration object.
- debug (bool) – Print extra debuggin information in the parser’s help message.
- file (TextIOWrapper) – The file to write any error or help messages, defaults to sys.stdout.
-
run
() → None¶ Execute the Snakemake workflow
-
static
usage_short
(prog: str = 'snakeparse', workflow_name: str = None) → str¶ A one line usage to display at the top of any usage or help message.
-
class
snakeparse.api.
SnakeParseConfig
(config_path: typing.Union[pathlib.Path, NoneType] = None, prog: str = None, snakemake: typing.Union[pathlib.Path, NoneType] = None, name_transform: str = None, parent_dir_is_group_name: bool = True, workflows: typing.Dict[str, _ForwardRef('SnakeParseWorkflow')] = OrderedDict(), groups: typing.Dict[str, str] = OrderedDict(), snakefile_globs: typing.Union[typing.List[str], NoneType] = []) → None¶ The class used to configure SnakeParse.
Keyword Arguments: - config_path (Optional[Path]) – The path to the SnakeParse configuration file, in JSON, YAML, or HOCON format. Details of its contents are described below.
- prog (str) – The name of the tool-chain.
- snakemake (Optional[Path]) – The optional path to the Snakemake executable.
- name_transform – An optional method to transform the basename of a workflow’s snakefile to the canonical name to use on the command line.
- parent_dir_is_group_name (bool) – True to use the parent directory of the workflow’s snakefile as the workflow’s group name. Only applied when searching directories for snakefiles, or when group name is not explicitly given.
- workflows (Dict[str, 'SnakeParseWorkflow']) – Optionally, the list of workflows as SnakeParseWorkflow objects.
- groups (Dict[str, str]) – Optionally, one or more key-value pairs, with the key being the canonical workflow group name, and the value being a description for that group to display on the command line.
- snakefile_globs (Optional[List[str]]) – Optionally, or more glob strings specifying where snakefile files can be found.
NB: the values in the configuration file take precedence over the keyword arguments. In particular, the configuration file may override Worfklows given in the ‘workflows’ keyword argument.
The following are configuration paths (in HOCON paths) allowed in the configuration file:
- snakemake – the optional path to the Snakemake executable.
- prog – the optional name of the tool-chain.
- name_transform – alias for a built-in method to produce the workflow’s name (see the similarly named keyword argument). Either ‘snake_to_camel’ or ‘camel_to_snake’ for converting from Snake case to Camel case, or vice versa.
- parent_dir_is_group_name – optional; see the similarly named keyword argument.
- workflows – optionally, a list of configuration objects, one per workflow specified. They object key should be the canonical workflow name to be displayed on the command line, with a dictionary of key value pairs specifying the wofklow configuration with the same names as SnakeParseWorkflow (snakefile, group, and description). Only the snakefile key-value pair is required.
- groups - optional; see the similarly named keyword argument.
- snakefile_globs – optional; see the similarly named keyword argument.
-
add_group
(name: str, description: str, strict: bool = True) → snakeparse.api.SnakeParseConfig¶ Adds a new group with the given name and description. If strict is
True
, then no group with the same name should already exist.
-
add_snakefile
(snakefile: pathlib.Path) → snakeparse.api.SnakeParseWorkflow¶ Adds a new workflow with the given snakefile. A workflow with the same name should not exist.
-
add_workflow
(workflow: snakeparse.api.SnakeParseWorkflow) → snakeparse.api.SnakeParseWorkflow¶ Adds the workflow to the list of workflows. A workflow with the same name should not exist.
-
static
config_parser
(usage: str = '==SUPPRESS==') → argparse.ArgumentParser¶ Returns an
ArgumentParser
for the configuration options
-
static
name_transfrom_from
(key: str) → typing.Callable[str, str]¶ Returns the built-in method to format the workflow’s name. Should be either ‘snake_to_camel’ or ‘camel_to_snake’ for converting from Snake case to Camel case, or vice versa.
-
static
parser_from
(workflow: snakeparse.api.SnakeParseWorkflow) → snakeparse.api.SnakeParser¶ Builds the SnakeParser for the given workflow
-
exception
snakeparse.api.
SnakeParseException
¶ The exception raised by classes in this module.
-
class
snakeparse.api.
SnakeParseWorkflow
(name: str, snakefile: pathlib.Path, group: typing.Union[str, NoneType] = None, description: typing.Union[str, NoneType] = None) → None¶ A container class for basic meta information about a workflow to be included on the command line.
Keyword Arguments: - name (str) – The canonical name of the workflow displayed on the command line.
- snakefile (Optional[str]) – The path to the snakefile file.
- group ((Optional[str]):) – The name of the workflow group, used to group workflows on the command line.
- description (Optional[str]) – A short description of the workflow, used when listing the workflows.
-
class
snakeparse.api.
SnakeParser
→ None¶ The abstract base class for implementing the workflow specific argument parsing.
-
description
¶ A short description of the workflow, used when listing the workflows.
-
group
¶ The name of the workflow group to which this group belongs.
-
parse_args
(args: typing.List[str]) → typing.Any¶ Parses the command line arguments.
-
parse_args_file
(args_file: pathlib.Path) → typing.Any¶ Parses command line arguments from an arguments file
-
parse_config
(config: dict) → typing.Any¶ Parses arguments from a Snakemake config object. It is assumed the arguments are contained in an arguments file, whose path is stored in the config with key
SnakeParse.ARGUMENT_FILE_NAME_KEY
.
-
print_help
(file: typing.Union[typing.IO[str], NoneType] = None) → None¶ Prints the help message
-
-
snakeparse.parser.
argparser
(**kwargs: typing.Any) → snakeparse.api.SnakeArgumentParser¶ Returns an SnakeParser that has an initialized member variable parser of type
ArgumentParser
. The keyword arguments are passed to the constructor ofArgumentParser
.