@vineetbansal
Summary

This PR resurrects the idea of __getitem__ access on a Job returning the corresponding item from its .output, as in a previous PR, but is limited in scope to avoid breaking changes.

A use-case for this (included here as a test) is:

    from jobflow import Flow, job

    @job
    def make_str(s):
        return {"hello": s}

    @job
    def capitalize(s):
        return s.upper()

    job1 = make_str("world")
    job2 = capitalize(job1["hello"])  # <---- instead of `job1.output["hello"]`

    flow = Flow([job1, job2])
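
To make the intended behaviour concrete, here is a minimal, self-contained sketch of the delegation. `ToyReference` and `ToyJob` are hypothetical stand-ins written for illustration only; they are not jobflow's `OutputReference` and `Job` classes, and the actual implementation in this PR may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyReference:
    """Hypothetical stand-in for a lazy reference to a job's output."""

    uuid: str
    path: tuple = ()

    def __getitem__(self, key):
        # Indexing a reference yields a new reference with a longer path;
        # nothing is resolved until the workflow actually runs.
        return ToyReference(self.uuid, self.path + (key,))


class ToyJob:
    """Hypothetical stand-in for a job that exposes an `output` reference."""

    def __init__(self, uuid):
        self.uuid = uuid
        self.output = ToyReference(uuid)

    def __getitem__(self, key):
        # The shortcut: job[key] is simply forwarded to job.output[key].
        return self.output[key]


job1 = ToyJob("job-1")

# Both spellings produce the same reference.
assert job1["hello"] == job1.output["hello"]
```

Because the lookup is only forwarded, existing code that uses `.output[...]` keeps working unchanged in this sketch.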

Context

While working with our package that uses jobflow, prefect, parsl and other workflow engines under the hood, we've recognized that flows are expressed succinctly as lookups on jobs:

    @job
    def greetings(s):
        return {"hello": f"Hello {s}", "bye": f"Goodbye {s}"}

    @job
    def upper(s):
        return s.upper()

    @flow
    def greet(s):
        job1 = greetings(s)
        job2 = upper(job1["hello"])
        return job2

While the @flow decorator is a parallel effort that we hope can be merged in soon, this PR focuses only on the getitem part.

The existing PR on this from a couple of years ago also tried to integrate attribute access to jobs and flows, which is a more far-reaching change and calls for more internal refactoring than we feel is needed. For example, one suggested change there:

    def __getattr__(self, name: str) -> OutputReference:
        if attr := getattr(self.output, name, None):
            return attr
        raise AttributeError(f"{type(self).__name__} has no attribute {name!r}")

would require that all current and future attributes of Jobs and OutputReferences be non-overlapping, which might be more trouble for development than it is worth. A dict lookup should not have the same disruptive effect on the codebase, but will provide a useful shortcut for flow creation.
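
The name-clash concern can be illustrated with a small sketch. `ToyJob` below is a hypothetical class, and its output is a plain dict purely for illustration (it is not how jobflow represents outputs). Python calls `__getattr__` only after normal attribute lookup fails, so any attribute the job already has silently wins over an output field of the same name, while item lookup stays unambiguous.

```python
class ToyJob:
    """Hypothetical job with its own `name` attribute and a dict-like output."""

    def __init__(self, name, output):
        self.name = name
        self.output = output

    def __getattr__(self, attr):
        # Reached only when normal attribute lookup fails, so real
        # attributes of the job always shadow output fields.
        try:
            return self.output[attr]
        except KeyError:
            raise AttributeError(
                f"{type(self).__name__} has no attribute {attr!r}"
            ) from None

    def __getitem__(self, key):
        # Item lookup never competes with the job's own attributes.
        return self.output[key]


job = ToyJob(name="make_str", output={"name": "world", "hello": "hi"})

via_attribute = job.name  # the job's own attribute, not the output field
via_lookup = job["name"]  # always the output field
via_fallback = job.hello  # no clash, so __getattr__ forwards to the output
```

In this sketch, adding a new attribute to the job class later would change what `job.<attr>` returns for any output field with the same name, which is the maintenance burden described above.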

Checklist

Before a pull request can be merged, the following items must be checked:

  • Code is in the standard Python style.
    The easiest way to handle this is to run the following in the correct sequence on
    your local machine. Start by running black on your new code. This will
    automatically reformat your code to PEP8 conventions and remove most issues. Then run
    pycodestyle, followed by flake8.
  • Docstrings have been added in the NumPy docstring format.
    Run pydocstyle on your code.
  • Type annotations are highly encouraged. Run mypy to
    type check your code.
  • Tests have been added for any new functionality or bug fixes.
  • All linting and tests pass.

Note that the CI system will run all the above checks, but it will be much more
efficient if you fix most errors before submitting the PR. It is highly
recommended that you use the pre-commit hook provided in the repository. Simply
run pip install pre-commit and then pre-commit install, and a check will be run
before each commit is allowed.
