Discovery of valid parameters for OO-syntax #117

@FrithiofJensen

Hi!

When configuring a Ruffus pipeline using the OO-syntax, one can run into problems passing parameters to Ruffus Task objects.

Certain task types, such as 'Split()', refuse a 'pipeline_dir=' parameter that other task types, such as 'Transform()', happily accept (is this a bug or deliberate?).

One way to discover which 'variant' parameters a Task accepts is to use the 'inspect' module.

In a Python session, one can look at the source of the Task's _prepare_<task-type> method, like so:

import inspect
import ruffus
from ruffus import *

def tf(*args, **kwargs):
    print(args, kwargs)

pl = ruffus.Pipeline(name='testing')
task = pl.split(task_func=tf, name='atask', output='stuff')


print(inspect.getsource(task._prepare_split))
    def _prepare_split(self, unnamed_args, named_args):
        """
        Common code for @split and pipeline.split
        """
        self.error_type = ruffus_exceptions.error_task_split
        self._set_action_type(Task._action_task_split)
        self._setup_task_func = Task._split_setup
        self.needs_update_func = self.needs_update_func or needs_update_check_modify_time
        self.job_wrapper = job_wrapper_io_files
        self.job_descriptor = io_files_one_to_many_job_descriptor
        self.single_multi_io = self._one_to_many
        # output is a glob
        self.indeterminate_output = 1

        #
        #   Parse named and unnamed arguments
        #
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "output", "extras"],
                                                self.description_with_args_placeholder)

print(inspect.getsource(task._prepare_transform))
    def _prepare_transform(self, unnamed_args, named_args):
        """
        Common function for pipeline.transform and @transform
        """
        self.error_type = ruffus_exceptions.error_task_transform
        self._set_action_type(Task._action_task_transform)
        self._setup_task_func = Task._transform_setup
        self.needs_update_func = self.needs_update_func or needs_update_check_modify_time
        self.job_wrapper = job_wrapper_io_files
        self.job_descriptor = io_files_job_descriptor
        self.single_multi_io = self._many_to_many

        #   Parse named and unnamed arguments
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "filter", "modify_inputs",
                                                 "output", "extras", "output_dir"],
                                                self.description_with_args_placeholder)


The pattern seems to be that a list of 'permitted parameters' is passed to 'parse_task_arguments'; some of these may be optional, others required.
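Reading the list by eye works, but it can also be pulled out of the source text programmatically. The following is only a sketch, not part of Ruffus: the helper name 'permitted_params' is made up for illustration, and it assumes the permitted names always appear as a plain list literal in the call to 'parse_task_arguments', as they do in the two listings above. The source strings are abridged copies of those listings, so the example runs without Ruffus installed.

```python
import ast
import textwrap


def permitted_params(method_source):
    """Return the list literal passed to parse_task_arguments, or None.

    Hypothetical helper: relies on the names being given as a literal list.
    """
    tree = ast.parse(textwrap.dedent(method_source))
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        # The callee may be a bare name or an attribute access.
        callee = getattr(node.func, "id", None) or getattr(node.func, "attr", None)
        if callee != "parse_task_arguments":
            continue
        for arg in node.args:
            if isinstance(arg, ast.List):
                return ast.literal_eval(arg)
    return None


# Abridged from the _prepare_split listing above.
SPLIT_SOURCE = '''
    def _prepare_split(self, unnamed_args, named_args):
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "output", "extras"],
                                                self.description_with_args_placeholder)
'''

# Abridged from the _prepare_transform listing above.
TRANSFORM_SOURCE = '''
    def _prepare_transform(self, unnamed_args, named_args):
        self.parsed_args = parse_task_arguments(unnamed_args, named_args,
                                                ["input", "filter", "modify_inputs",
                                                 "output", "extras", "output_dir"],
                                                self.description_with_args_placeholder)
'''

split_params = permitted_params(SPLIT_SOURCE)
transform_params = permitted_params(TRANSFORM_SOURCE)

print(split_params)
print(transform_params)
# Names that transform accepts but split does not.
print(sorted(set(transform_params) - set(split_params)))
```

With Ruffus installed, the same helper should work on live objects by passing it 'inspect.getsource(task._prepare_split)' instead of the copied strings.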

Some parameters, like 'name' and 'task_func', are not explicitly mentioned here but are always passed.

Anyway, I hope this helps someone a little!
