Add static and runtime dag info, API to fetch ancestor and successor tasks #2124

Status: Open, 12 commits to merge into base branch master

@talsperre (Collaborator) commented Oct 31, 2024

Add runtime DAG info so that we can query the ancestor and successor tasks for a given task easily.

Usage

from metaflow import Task, namespace
namespace(None)
task = Task('RuntimeDAGFlow/18/step_c/32076012', attempt=0)

To get a task's immediate ancestors, immediate successors, and closest siblings, use the following API:

ancestors = task.immediate_ancestors()
successors = task.immediate_successors()
siblings = task.closest_siblings()
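
To make the three relations concrete, here is a minimal, self-contained sketch on a toy step-level DAG. It does not use Metaflow and is not the implementation in this PR: the graph, the helper names, and the sibling definition (nodes sharing at least one parent, excluding the node itself) are all assumptions made for illustration. The real API works on tasks and may define `closest_siblings` differently.

```python
from collections import defaultdict

# Hypothetical toy DAG: step_a fans out to step_b and step_c, which join at step_d.
EDGES = [
    ("step_a", "step_b"),
    ("step_a", "step_c"),
    ("step_b", "step_d"),
    ("step_c", "step_d"),
]


def build_index(edges):
    """Build parent and child lookup tables from a list of (src, dst) edges."""
    parents = defaultdict(set)
    children = defaultdict(set)
    for src, dst in edges:
        children[src].add(dst)
        parents[dst].add(src)
    return parents, children


def immediate_ancestors(node, parents):
    """Nodes with an edge pointing directly at `node`."""
    return sorted(parents[node])


def immediate_successors(node, children):
    """Nodes that `node` points at directly."""
    return sorted(children[node])


def siblings(node, parents, children):
    """Nodes sharing at least one parent with `node`, excluding `node` itself.

    This definition is an assumption for the sketch, not the PR's semantics.
    """
    found = set()
    for parent in parents[node]:
        found |= children[parent]
    found.discard(node)
    return sorted(found)


PARENTS, CHILDREN = build_index(EDGES)
```

With this toy graph, `immediate_ancestors("step_c", PARENTS)` looks one edge upstream, `immediate_successors("step_c", CHILDREN)` looks one edge downstream, and `siblings("step_c", PARENTS, CHILDREN)` looks sideways through the shared parent.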

@talsperre talsperre force-pushed the dev/add-runtime-dag-info branch from 48c771d to ec43f14 Compare November 1, 2024 18:34
Comment on lines 675 to 690
@classmethod
def _filter_tasks_by_metadata(
    cls, flow_id, run_id, query_step, field_name, field_value
):
    raise NotImplementedError()

@classmethod
def filter_tasks_by_metadata(
    cls, flow_id, run_id, query_step, field_name, field_value
):
    # TODO: Do we need to do anything wrt to task attempt?
    task_ids = cls._filter_tasks_by_metadata(
        flow_id, run_id, query_step, field_name, field_value
    )
    return task_ids

Collaborator

Is there a need for the private method, or could this simply be contained in the public-facing one? Right now it's not doing anything before calling the private one.
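
For readers following the thread, the public/private split under discussion can be sketched with hypothetical classes. `ToyMetadataProvider`, `InMemoryProvider`, and the `RECORDS` data are invented for illustration and are not Metaflow code. Only the two method names and their parameters come from the diff above. The sketch shows what the split might buy: a single public entry point on the base class, with each backend overriding only the private hook.

```python
class ToyMetadataProvider:
    """Hypothetical stand-in for a metadata provider base class."""

    @classmethod
    def _filter_tasks_by_metadata(
        cls, flow_id, run_id, query_step, field_name, field_value
    ):
        # Backend-specific hook; concrete providers are expected to override it.
        raise NotImplementedError()

    @classmethod
    def filter_tasks_by_metadata(
        cls, flow_id, run_id, query_step, field_name, field_value
    ):
        # Shared entry point. It is a plain pass-through today, which is the
        # point the review comment raises; shared logic could be added here later.
        return cls._filter_tasks_by_metadata(
            flow_id, run_id, query_step, field_name, field_value
        )


class InMemoryProvider(ToyMetadataProvider):
    """Hypothetical backend that filters a hard-coded list of task records."""

    # Invented records: (flow_id, run_id, step_name, task_id, metadata)
    RECORDS = [
        ("ToyFlow", "1", "step_a", "t1", {"parent": "t0"}),
        ("ToyFlow", "1", "step_b", "t2", {"parent": "t1"}),
        ("ToyFlow", "1", "step_b", "t3", {"parent": "t1"}),
        ("ToyFlow", "2", "step_b", "t4", {"parent": "t1"}),
    ]

    @classmethod
    def _filter_tasks_by_metadata(
        cls, flow_id, run_id, query_step, field_name, field_value
    ):
        # Keep task ids in the requested step whose metadata field matches.
        return [
            task_id
            for (flow, run, step, task_id, meta) in cls.RECORDS
            if flow == flow_id
            and run == run_id
            and step == query_step
            and meta.get(field_name) == field_value
        ]
```

If the public method never gains shared logic, the alternative raised in the comment (having backends override the public method directly) would behave the same for callers.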

Collaborator

Also, do you have an implementation of this for service.py yet?

def filter_tasks_by_metadata(
    cls, flow_id, run_id, query_step, field_name, field_value
):
    # TODO: Do we need to do anything wrt to task attempt?

Collaborator

Probably not, as the ancestors for task attempts should be identical, right? What about immediate_siblings, though: will they include or exclude attempts of the same task?

@talsperre talsperre force-pushed the dev/add-runtime-dag-info branch from ffbf68a to c6fb9ac Compare January 2, 2025 23:25
@talsperre talsperre changed the title Add static and runtime dag info, API to fetch ancestor tasks Add static and runtime dag info, API to fetch ancestor and successor tasks Jan 7, 2025