[Draft] [flytekit] Polish - Map Tasks #6139
Comments
Can we also consider ways to include fixed arguments? Currently, defining Consider the following:
import flytekit as fl
from functools import partial
@fl.task
def my_task(greeting: str, name: str):
    print(f"{greeting}, {name}")
@fl.workflow
def my_workflow(greeting: str="hello", names: list[str]=["bob", "bill"]):
    fl.map_task(partial(my_task, greeting=greeting), min_success_ratio=0.5)(name=names)
I see two strategies here:
@fl.workflow
def my_new_workflow(greeting: str="hello", names: list[str]=["bob", "bill"]):
    fl.map(my_task, greeting=greeting, name=names, tolerance=0.5)
    # rename `map_task` to `apply`
    # tolerance is of type `float|int`
    # greeting and name are both **kwargs, and are either used as dynamic or static inputs depending on the type hints used at registration time
@fl.workflow
def my_new_workflow(greeting: str="hello", names: list[str]=["bob", "bill"]):
    fl.map(my_task, tolerance=0.5).fix(greeting=greeting).run(name=names)
@fl.workflow
def my_new_workflow(greeting: str="hello", names: list[str]=["bob", "bill"]):
    with fl.map(my_task, tolerance=0.5) as mapper:
        mapper(greeting=greeting, name=names)
The downside of option 1 is that there are reserved argument names. Options 2 and 3 are a little bit strange.
@cosmicBboy you mentioned maybe also
@granthamtaylor can we move your bullet to a separate issue? (I can copy it.) I wanted to keep these tickets to the ones where there was pretty much unanimous support.
Hmm, I don't think any of the alternative invocation syntaxes make it easier for DS users to learn, except maybe for (1).
I think we'll find many annoying edge cases here, e.g. support for mapping over two lists of elements of matching length.
(2) feels really DSL-y... not a bad thing, but I might as well learn the more generalizable Pythonic syntax for this.
(3) seems like an inappropriate use of context managers.
A slight variation to (1) that might be clearer and less error-prone:
@fl.task
def my_task(greeting: str, name: str):
    print(f"{greeting}, {name}")
@fl.workflow
def my_new_workflow(greeting: str="hello", names: list[str]=["bob", "bill"]):
    fl.map(my_task, partial_kwargs={"greeting": greeting}, tolerance=0.5)(names)
    # or
    fl.map(my_task, partial={"greeting": greeting}, tolerance=0.5)(names)
    # or
    fl.map(my_task, fixed_kwargs={"greeting": greeting}, tolerance=0.5)(names)
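To make the `fixed_kwargs` variation concrete, here is a minimal plain-Python sketch of the intended semantics, run locally without Flyte. The helper `local_map` is hypothetical and is not part of flytekit; it only illustrates binding the fixed keyword arguments once and mapping the single remaining parameter over a list. It resolves the mapped parameter by name, because binding `greeting` with `functools.partial` and then passing each item positionally would collide with the first positional parameter.

```python
import inspect
from functools import partial


def local_map(fn, fixed_kwargs=None):
    """Hypothetical stand-in for the proposed `fl.map(..., fixed_kwargs=...)`."""
    fixed = dict(fixed_kwargs or {})
    # Every parameter that is not fixed is a candidate for mapping.
    # (Parameters with defaults also count as free in this simple sketch.)
    free = [name for name in inspect.signature(fn).parameters if name not in fixed]
    if len(free) != 1:
        raise TypeError(f"expected exactly one unbound parameter, got {free}")
    mapped_name = free[0]
    bound = partial(fn, **fixed)

    def run(items):
        # Pass each item by keyword so it cannot clash with a fixed argument.
        return [bound(**{mapped_name: item}) for item in items]

    return run


def my_task(greeting: str, name: str) -> str:
    return f"{greeting}, {name}"


results = local_map(my_task, fixed_kwargs={"greeting": "hello"})(["bob", "bill"])
print(results)
```

A real implementation would also have to decide how positional inputs such as `(names)` line up with task parameters when more than one remains unbound; the sketch sidesteps that by rejecting the case.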
I don't see how this is an edge case. The typing should dictate the behavior, not the lengths of inputs. What I had meant by #1 is that if a task has an argument of type This will all be known at registration time. Perhaps all of these inputs could be provided either to the call of the parameterized Both are a little wanting, and rely upon static typing. However, that is arguably already a precedent with
There is one for LaunchPlans, but not for Last I checked, the
A pretty straightforward case is:
@fl.task
def my_task(
    x: int,  # 👈 map over this
    y: list[int],  # 👈 a finite set of values that I don't want to map over
    z: str,
):
    ...
Under proposal (1) in this comment, the workflow would be:
@fl.workflow
def wf(x_list: list[int], y: list[int], z: str):
    fl.map(my_task, x=x_list, y=y, z=z)
We would have to automagically do the following work under the hood:
This is possible, but should we do this? It feels a little too magical for me, but curious what others think.
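As a rough illustration of what type-driven inference could look like, the sketch below compares the declared type of each workflow input with the task's type hints: an exact match is treated as a fixed input, and `list[T]` supplied for a parameter of type `T` is treated as a mapped input. The function `classify_inputs` is hypothetical and is not flytekit API; it uses only the standard `typing` helpers and assumes Python 3.9+ for `list[int]`.

```python
from typing import get_args, get_origin, get_type_hints


def classify_inputs(task_fn, provided_types):
    """Split inputs into (mapped, fixed) names using type hints only.

    `provided_types` maps each task parameter name to the declared type
    of the value the workflow passes in. Hypothetical helper.
    """
    hints = get_type_hints(task_fn)
    hints.pop("return", None)
    mapped, fixed = [], []
    for name, provided in provided_types.items():
        expected = hints[name]
        if provided == expected:
            # Same type as the task parameter: pass through unchanged.
            fixed.append(name)
        elif get_origin(provided) is list and get_args(provided) == (expected,):
            # list[T] given for a parameter of type T: map over it.
            mapped.append(name)
        else:
            raise TypeError(f"{name}: cannot pass {provided} to parameter of type {expected}")
    return mapped, fixed


def my_task(x: int, y: list[int], z: str):
    ...


mapped, fixed = classify_inputs(
    my_task, {"x": list[int], "y": list[int], "z": str}
)
print(mapped, fixed)
```

Here `x` and `y` are both supplied as `list[int]`, yet only `x` is mapped, because `y` is declared as `list[int]` on the task itself. That is the "magic" under discussion: the decision depends entirely on static types, never on the lengths of the inputs.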
I kind of like that it is so magical. It would also maintain backwards compatibility (one can just use However, I can see that it is perhaps too magical. Open to feedback!
Map Task Polish
This is a series of tickets to improve the flytekit authoring experience. If any changes are not possible to make in a backwards-compatible way, split it out into a separate ticket.
Rename map task
Rename `map_task` to just `map`.
This will interfere with the native Python `map`, but it's okay, as we now recommend users `import flytekit as fl`.
Failure Toleration
The failure toleration parameters for `map_task` are very powerful but too verbose. Let's mark `min_successes` and `min_success_ratio` as deprecated and make a new argument `tolerance` that is type `float | int`.
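The ticket does not spell out how a single `tolerance` value would translate to the two existing parameters. The sketch below assumes the simplest reading, in which a `float` is forwarded as `min_success_ratio` and an `int` as `min_successes`; the opposite reading (a count or ratio of tolerated failures) is equally plausible. The helper name `legacy_failure_kwargs` is hypothetical.

```python
from typing import Union


def legacy_failure_kwargs(tolerance: Union[float, int]) -> dict:
    """Translate a proposed `tolerance` value into the existing keyword.

    ASSUMPTION: float -> min_success_ratio, int -> min_successes.
    The issue does not confirm this mapping.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(tolerance, bool):
        raise TypeError("tolerance must be a float or an int, not a bool")
    if isinstance(tolerance, int):
        if tolerance < 0:
            raise ValueError("an integer tolerance must be non-negative")
        return {"min_successes": tolerance}
    if isinstance(tolerance, float):
        if not 0.0 <= tolerance <= 1.0:
            raise ValueError("a float tolerance must be between 0.0 and 1.0")
        return {"min_success_ratio": tolerance}
    raise TypeError("tolerance must be a float or an int")


print(legacy_failure_kwargs(0.5))
print(legacy_failure_kwargs(3))
```

One consequence of dispatching on type is that `1` and `1.0` mean different things (one success versus all successes), which is worth documenting if this design is adopted.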
Parallelism
Deprecate the `max_parallelism` argument of workflow and LaunchPlan (is there one?) and make a new one called `concurrency` to match that of `map_task`.
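A backwards-compatible rename is commonly handled by accepting both names for a while and warning on the old one. The sketch below shows that pattern with the standard `warnings` module; `resolve_concurrency` is a hypothetical helper and does not reflect how flytekit handles these arguments today.

```python
import warnings
from typing import Optional


def resolve_concurrency(
    concurrency: Optional[int] = None,
    max_parallelism: Optional[int] = None,
) -> Optional[int]:
    """Return the effective limit, preferring the new `concurrency` name."""
    if max_parallelism is not None:
        warnings.warn(
            "max_parallelism is deprecated; use concurrency instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Refuse ambiguous calls where both names disagree.
        if concurrency is not None and concurrency != max_parallelism:
            raise ValueError("concurrency and max_parallelism disagree")
        return max_parallelism
    return concurrency


print(resolve_concurrency(concurrency=8))
```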