Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List jobs per job type #576

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

javiermtorres
Copy link
Contributor

@javiermtorres javiermtorres commented Jan 7, 2025

What's changing

A GET request can be issued to the /jobs/{JOB_TYPE}/ endpoint, where currently JOB_TYPE is either inference or evaluate, to select all jobs of the desired type. A GET request can still be issued to /jobs/{UUID}/ where UUID is a v4 UUID to retrieve the details of a single job.

Closes #380

How to test it

As tested in the integration tests, two jobs, one inference and one evaluation, can be started, and only one would be selected with the new API.

Additional notes for reviewers

N/A

I already...

  • Tested the changes in a working environment to ensure they work as expected
  • Added some tests for any new functionality
  • Updated the documentation (both comments in code and product documentation under /docs)
  • Checked if a (backend) DB migration step was required and included it if required

@github-actions github-actions bot added backend api Changes which impact API/presentation layer labels Jan 7, 2025
@javiermtorres javiermtorres force-pushed the javiermtorres/issue-380-list-per-job-type branch from 458d480 to eba403f Compare January 7, 2025 17:45
@javiermtorres javiermtorres changed the title Add db interface, tests List jobs per job type Jan 8, 2025
@github-actions github-actions bot added the sdk label Jan 8, 2025
@javiermtorres javiermtorres marked this pull request as ready for review January 8, 2025 16:42
@@ -103,20 +105,57 @@ def list_jobs(
)


@router.get("/{job_id}")
def get_job(service: JobServiceDep, job_id: UUID) -> Job:
@router.get("/{job_spec}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what will a spec be in this case? we could consider a more descriptive variable name here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. are we changing this from a UUID?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are. It can be a uuid (returning a single job) or a job type name (inference or evaluate, returning a list of jobs). It follows the approach in the create_job service for consistency. I'd rather use a param (actually inheritance via anyOf plus a discriminator).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to have one route return one type of object for clarity and consistency. If we have a spec that returns multiple objects, we should return a dictionary of elements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a couple of constraints: make it backwards compatible, and follow a similar interface to the existing calls.
Regarding the dictionary, the GET /jobs call already returns a list, so I'd see it natural to return a list as well.
Regarding the routes, I can use a more specific route like GET /jobs/per_type/{job_type} to avoid overloading the /jobs/{job_id} route. Wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can actually use /jobs/{uuid} and /jobs/inference/ / /jobs/evaluate/ with trailing slashes :-/ It's a bit messy too but I kinda like it better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agreed with @aittalam on using /jobs/{job_type}/ to keep it consistent with the existing POST methods. Let me know if this would be ok for you.

ray_job = _get_ray_job(job_id)

# Combine both types of response.
x = ray_job.model_dump() # JobSubmissionResponse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're looking to understand the type here, we can annotate it:

x: JobSubmissionResponse = ray_job.model_dump()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I'm not sure I wrote that, but type annotations are fine for me. IIRC this would need to be taken from ray (but we do have those dependencies and can import the types, right?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I wrote that

I mean I just copied the existing approach of merging the Ray info together with our own.


# Merge Ray jobs into the repositories jobs
for job in jobs.items:
found_job = next(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand this piece of logic: why are we looking to combine Ray job and lumigator IDs here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done in a separate PR already, I'm just following it. The Ray info is bundled with the Lumigator job info so the UI has all the info in one place. Please check #514

op.add_column("jobs", sa.Column("job_type", sa.String(), nullable=True))


def downgrade() -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we adding and removing the same column here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly (which may not be doing!) adding the column is part of the upgrade procedure into this db revision from the previous, and removing the column is part of the downgrade procedure from this db revision into the previous. But maybe I'm not getting it right :-/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm! This is automatic code generated by alembic to upgrade / downgrade the database to a new / older version.
As Dimitri's PR has been merged, I think you don't need to have this anymore as you are not actively modifying the DB (IIRC)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stand corrected, the previous PR was not dealing with thejob_type field. You'll need to run alembic again then and generate a more recent migration if you have not done it already

@javiermtorres javiermtorres force-pushed the javiermtorres/issue-380-list-per-job-type branch from 238cbb1 to 0197eac Compare January 10, 2025 14:02
@github-actions github-actions bot added the schemas Changes to schemas (which may be public facing) label Jan 13, 2025
revision: str = "ef5ee5662ce3" # pragma: allowlist secret
down_revision: str | None = "e9679cbc3c36" # pragma: allowlist secret
branch_labels: str | Sequence[str] | None = None
depends_on: str | Sequence[str] | None = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make sure when we merge this it doesn't conflict with other alembic changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmmm, true :-/ I'll regenerate the alembic part just before merging just in case.

@@ -111,15 +111,15 @@ def list_jobs(
# Merge Ray jobs into the repositories jobs
for job in jobs.items:
found_job = next(
Copy link
Member

@veekaybee veekaybee Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest splitting this code into two lines and commenting for legibility, and in splitting it up, looks like this can go into a dict?

return_val: type = (job for job in filter(lambda x: x.submission_id == str(job.id), ray_jobs)), None
found_job: [type here] = next(return_val)

suggested refactor:

ray_jobs = {}

for job in jobs.items():
  submission_id = job[1].submission_id
  if submission_id:
    ray_jobs[submission_id] = job[1]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handling a set/dict to keep track of the ids definitely helps finding out ray/job relations.
Ok, refactoring 👍

Copy link
Contributor Author

@javiermtorres javiermtorres Jan 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did something similar:

    # Get all jobs Ray knows about on a dict
    ray_jobs = {ray_job.submission_id: ray_job for ray_job in _get_all_ray_jobs()}

    results = list[Job]()

    # Merge Ray jobs into the repositories jobs
    job: JobResponse
    for job in jobs.items:
        job_id = str(job.id)
        if job_id in ray_jobs:
            # Combine both types of response.

"""Attempts to retrieve merged job data from the Lumigator repository and Ray
for a valid UUID.

The result is a merged representation which forms an augmented view of a 'job'.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like the comment string is incorrectly formatted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what sense? I don't get the issue :(

x = ray_job.model_dump() # JobSubmissionResponse
y = job.model_dump() # JobResponse
merged = {**x, **y}
ray_info = ray_job.dict()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would recommend creating or using an exiting PyDantic class for this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do. It's just an older version that Ray uses, afaict :-/ I tried using model_dump, but if I got it right it's their Pydantic version not implementing it.

def _get_all_ray_jobs() -> list[RayJobDetails]:

from ray.job_submission import JobDetails as RayJobDetails

https://github.com/ray-project/ray/blob/916f534e571278b26733812b24a7b3dee08f24e4/python/ray/dashboard/modules/job/pydantic_models.py#L38

@@ -194,7 +240,7 @@ def get_job_result_download(
return service.get_job_result_download(job_id)


def _get_all_ray_jobs() -> list[JobSubmissionResponse]:
def _get_all_ray_jobs() -> list[RayJobDetails]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we changing this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the info we get is modelled by the Ray pydantic models. Then it is merged with our own job info.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The merged information is not modelled as JointJobInfo(RayJobDetails, JobResponse) because this would require the lumigator schemas to import the ray package, which is too heavy for the end user as there is no separate api or model package for Ray.

@@ -24,6 +24,9 @@
from backend.services.datasets import DatasetService
from backend.settings import settings

DEFAULT_SKIP = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's injecting per method call to follow the rest of the repo

self, skip: int = 0, limit: int = 100

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I'd prefer to have defaults as consts and not magic numbers, to keep consistency over the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api Changes which impact API/presentation layer backend schemas Changes to schemas (which may be public facing) sdk
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Return jobs of specific type
3 participants