Grouping resources #1189
Replies: 1 comment
-
Summarizing some things I've tried with @milroy offline: placing "slot" above "rabbit" and "node" seems to work, except that it allocates everything exclusively. In particular, it allocates rabbit vertices exclusively, which is bad because they will generally need to be shared.

```yaml
version: 9999
resources:
  - type: slot
    count: 2
    label: default
    with:
      - type: rabbit
        count: 1
        with:
          - type: ssd
            count: 1
      - type: node
        count: 1
        with:
          - type: core
            count: 1
```

Trying to request it non-exclusively gives an error, raised by this check:

```cpp
// If a non-exclusive resource request is explicitly given on a
// resource that lies under slot, this spec is invalid.
if (exclusive_in && resource.exclusive == Jobspec::tristate_t::FALSE) {
    errno = EINVAL;
    m_err_msg += "by_excl: exclusivity conflicts at jobspec=";
    m_err_msg += resource.label + " : vertex=" + (*m_graph)[u].name;
    goto done;
}
```

Another idea was to eliminate the top-level entry entirely, like the following:

```yaml
version: 9999
resources:
  - type: rabbit
    count: 2
    exclusive: false
    with:
      - type: slot
        count: 1
        label: default
        with:
          - type: ssd
            count: 1
  - type: node
    count: 2
    with:
      - type: slot
        count: 1
        label: default
        with:
          - type: core
            count: 1
```

But this allows Fluxion to pick rabbits completely independently of its choice of nodes. That's good for ephemeral lustre requests, but isn't the target functionality. Another option was to use

```yaml
version: 9999
resources:
  - type: rack
    count:
      min: 1
      max: 2
      operator: "+"
      operand: 1
    with:
      - type: node
        count:
          min: 2
          max: 4
          operator: "+"
          operand: 1
        with:
          - type: slot
            count: 1
            label: default
            with:
              - type: core
                count: 1
      - type: rabbit
        count: 1
        with:
          - type: slot
            count: 1
            label: default
            with:
              - type: ssd
                count:
                  min: 1000
                  max: 3000
                  operator: "+"
                  operand: 1000
```

but this gives a varying amount of resources depending on what's available.

Ideas for going forward: add something that works like "slot", only doesn't enforce exclusivity.
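One possible shape for that idea, sketched as a jobspec. This is purely hypothetical: the `group` type does not exist in Fluxion, and the `exclusive: false` placement under it is an assumption about how such a construct might compose.

```yaml
version: 9999
resources:
  # "group" is a hypothetical construct: it would multiply its contents
  # like "slot" does, but without implying exclusive allocation.
  - type: group
    count: 2
    label: default
    with:
      - type: rabbit
        count: 1
        exclusive: false   # rabbit vertices would remain shareable
        with:
          - type: ssd
            count: 1
      - type: node
        count: 1
        with:
          - type: core
            count: 1
```

The point of the sketch is that the node and rabbit are matched as a pair and multiplied together, while the rabbit vertex itself stays non-exclusive.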
-
I was testing rabbit scheduling on some clusters and found that it didn't work the way I thought it did. I don't know how this got by me; I was sure I had tested it, so I assumed something must have broken the functionality I was relying on, but I went back a couple of versions and it still didn't work.
At the moment, a rabbit jobspec looks like the following:
Where the `count` of `rack` is N >= 1 and the `count` of `node` and `rabbit` is always 1. I found that these jobspecs would only be satisfiable on clusters that had at least `N` racks. That's not what I want: I want these jobspecs to be satisfiable on any cluster that has at least `N` nodes, each with an associated rabbit that has the requisite storage. The actual rack count is not important--if the resources all fit on one rack, great; if the resources need to be split across multiple racks, fine.

I think what I want is a way to group "node" and "rabbit" together and multiply them, something like what `slot` does here, except instead of "core" and "gpu" it would be "node" and "rabbit".
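For context, the kind of `slot` grouping alluded to is the usual pattern of bundling heterogeneous resources and multiplying the bundle. A sketch (the counts are illustrative, not taken from the original post):

```yaml
version: 9999
resources:
  - type: slot        # each matched slot is one (core + gpu) bundle
    count: 4
    label: default
    with:
      - type: core
        count: 2
      - type: gpu
        count: 1
```

This asks for four identical bundles, each containing two cores and one GPU; the desired analogue would bundle one node with one rabbit instead, without making the rabbit exclusive.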
Any ideas?