tutorials: add flux proxy command tutorial #200

Open
wants to merge 1 commit into
base: master

Member

@chu11 chu11 commented Feb 10, 2023

No description provided.

- ``flux-mini-submit`` (:ref:`flux-mini-submit`): "Submit a job in a Flux instance"
- ``flux proxy`` (:ref:`flux-proxy-command`): "flux proxy basics"
Member

Can we here give the reader a better sense of why they would want to use flux proxy, and how it differs from the "ssh across clusters" tutorial?

Flux Proxy Basics
=================

With Flux, it is very common to create Flux :ref:`subinstances<subinstance>`. However, it can be confusing to work out how to interact with those subinstances.
Member

So this is saying when I launch a flux job that controls some set of subinstances, I would use flux proxy to communicate with them? Or could I have flux controlling multiple clusters, and then flux proxy would give me access to a cluster? What is a subinstance (or what can it be) in concrete terms? Why would I want to communicate with it? E.g., what we might have here is "you might want to do this if X or Y or Z" is what you are doing.

Contributor

Maybe a better intro here would be to talk about traditional batch jobs running a batch script, and how in Flux those "batch jobs" are actually running a full instance of Flux, and the flux proxy command can be used to execute processes, including an interactive shell, as if they were connected directly to the instance of Flux running your batch workload.

Also, flux proxy is not restricted to running against subinstances. You can also use it to connect to a system instance running on another cluster, a test instance running locally, or as you show below, a Flux instance running under a Slurm, LSF, or other resource manager job.
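
As a rough illustration of the targets listed above, the sketch below prints (rather than runs) three ``flux proxy`` invocations: one against a jobid, one against a local URI, and one against a remote ssh URI. The jobid, hostname, and socket paths are placeholders, not values from this PR, and the ``run`` helper with its ``DRY_RUN`` switch is an assumption added only so the script can be tried without a Flux installation.

```shell
#!/bin/sh
# Sketch only: DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# 1. A batch job or allocation in the enclosing instance, addressed by jobid.
#    JOBID is a placeholder; substitute an id printed by `flux jobs`.
run flux proxy JOBID flux jobs

# 2. A test instance on the same node, addressed by a local URI.
#    The socket path is hypothetical.
run flux proxy local:///tmp/flux-test/local-0 flux jobs

# 3. An instance on another cluster, addressed by an ssh URI.
#    The hostname and socket path are hypothetical.
run flux proxy ssh://cluster2.example.com/run/flux/local flux jobs
```

Setting ``DRY_RUN=0`` would execute the commands for real; in every case the trailing ``flux jobs`` runs as if it were connected directly to the target instance.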

Member Author

> Maybe a better intro here would be to talk about traditional batch jobs running a batch script, and how in Flux those "batch jobs" are actually running a full instance of Flux, and the flux proxy command can be used to execute processes, including an interactive shell, as if they were connected directly to the instance of Flux running your batch workload.

The way I imagined it was that we'd eventually have this discussed more in a flux mini batch tutorial. Although I can probably go into it a little more than I have so the reader isn't quite as clueless.


.. code-block:: sh

   #!/bin/sh
Member

Wait where does flux mini batch come in? This looks like it's running 4x flux mini submit?

Member

Is there a batch equivalent?

Contributor

To answer your question @v, flux mini batch is the flux mini utility that submits a "batch job" which in Flux is a job that runs an instance of Flux on the assigned resources with a batch script as its "initial program"
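
To make that concrete, here is a minimal sketch of a batch script and its submission. The script contents are hypothetical (the PR's actual ``subinstance-jobs.sh`` is not shown in this thread), ``flux batch`` and ``flux submit`` are the current spellings of what this thread calls ``flux mini batch`` and ``flux mini submit``, and the ``run`` helper with ``DRY_RUN`` is an assumption added so the example only prints the submission command by default.

```shell
#!/bin/sh
# Sketch only: DRY_RUN defaults to 1, so the flux command is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Write a hypothetical batch script. When the batch job starts, this script runs
# as the initial program of a new Flux instance, so the flux commands inside it
# talk to that new instance, not to the instance the batch job was submitted to.
cat > subinstance-jobs.sh <<'EOF'
#!/bin/sh
flux submit sleep 300
flux submit sleep 300
flux submit sleep 300
flux submit sleep 300
# Block until the jobs above have finished, so the instance stays up meanwhile.
flux queue drain
EOF
chmod +x subinstance-jobs.sh

# Submit the script as a batch job with four task slots.
run flux batch -n4 ./subinstance-jobs.sh
```

The script is ordinary shell, so it can mix Flux commands with anything else, which is the main difference from a plain list of commands handed to a resource manager.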

Member

okay - so you give it a script with other flux submit commands? As opposed to (what I've seen for other managers) a text file with some list of single commands (without the job manager directive) to run?

Member

How does that (conceptually) compare to a job array?

Member Author

@chu11 chu11 Feb 11, 2023

> Wait where does flux mini batch come in? This looks like it's running 4x flux mini submit?

I listed the script and then ran it with flux mini batch right below. I guess I should put a sentence in between; perhaps it's not obvious when skimming.


.. code-block:: console

   > flux mini batch -n4 ./subinstance-jobs.sh
Member

I totally don't understand what is going on here, lol (I've never used batch before).

Member

OK going to stop here and submit this early feedback because I can't follow along (sorry, not familiar with batch!)

Member Author

Let me try to add a bit more detail then. One issue is that we are writing these tutorials out of order. I jumped on flux proxy mostly because you already wrote the ssh-across-clusters one. Really, flux mini batch probably should have come right after flux mini submit, which I will add to the todo list.

Contributor

shouldn't we encourage the use of --cc in this case? Possibly using waitable jobs and flux job wait for more "standard" practice.

Member Author

For some of these small examples I figure it's easier to do this than to have to explain --cc (leaving --cc to #195).

Do we have a preference on flux job wait vs flux job status for something this small?

Contributor

@grondo grondo Feb 11, 2023

Good points. Given that, I'm fine with current example, was more of a question anyway.
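
For reference, the ``--cc`` and waitable-jobs alternative discussed in this thread might look roughly like the sketch below. It is not what the tutorial adopted; ``flux submit`` is the current spelling of ``flux mini submit``, and the ``run`` helper with ``DRY_RUN`` is an assumption added so the commands are only printed by default.

```shell
#!/bin/sh
# Sketch only: DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Submit four copies of the same job in one command and mark them waitable.
run flux submit --cc=1-4 --flags=waitable sleep 300

# Block until every waitable job has completed.
run flux job wait --all
```

This replaces four separate submit lines with one, at the cost of having to explain ``--cc`` and the waitable flag to the reader.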

Contributor

@grondo grondo left a comment

Also added some quick comments, though I didn't get all the way through.

Connect to the Instance
-----------------------

The easiest way to work with a subinstance is to use ``flux-proxy``. You can pass it the subinstance jobid and the command you want to run against that subinstance. Let's submit a job via ``flux mini submit`` and list it via ``flux jobs``.
Contributor

I'd remove subinstance here, might be less confusing to just say jobid.

fh2fcK1 achu sleep R 1 1 4.152m opal186
fa6txwZ achu sleep R 1 1 4.157m opal186

Notice that we're running ``flux jobs`` against the subinstance itself. We don't have to specify the ``--recursive`` option to list the jobids anymore.
Contributor

More interesting might be to submit more work to the instance (this is where flux job wait --all would come in handy)



You can also ssh to the node if you know the subinstance's Flux :ref:`URI<URI>`. This may be useful if you want to proxy to the subinstance from outside of the cluster at a later time (see :ref:`SSH Across Clusters<ssh-across-clusters>` for more information).
Contributor

I think you mean here "you can also specify a native uri" instead of a target jobid?

fa6txwZ achu sleep R 1 1 6.385m opal186


If you do not specify a command to run, you will be dropped into a shell that will forward all commands to the subinstance.
Contributor

Suggested change:

- If you do not specify a command to run, you will be dropped into a shell that will forward all commands to the subinstance.
+ If you do not specify a command to run, you will be dropped into a shell that will forward all messages to the subinstance.

?

@chu11 chu11 force-pushed the flux_proxy_tutorial branch from 226ee89 to 293cc90 Compare February 13, 2023 15:21
Member Author

chu11 commented Feb 13, 2023

Re-pushed, reworking the flow of the tutorial in a different way. Instead of starting with flux mini batch, I start with flux mini alloc and try to add a little more description about subinstances along the way. I could certainly add a bit more, but I think it's best to save that for an eventual flux mini alloc / flux mini batch tutorial that we can point to.

Side note: I pushed the dumb cleanups into #202, so it's only the tutorial now.

Member Author

chu11 commented Feb 17, 2023

Re-pushed with some minor tweaks given discussion from other PRs. If there is still some confusion about what flux mini batch is doing and why one would want to make a subinstance, I think future tutorials will deal with that.

@chu11 chu11 force-pushed the flux_proxy_tutorial branch from 9860222 to dc7ec31 Compare February 17, 2023 15:23
@@ -7,6 +7,7 @@ Welcome to the Command Tutorials! These tutorials should help you to map specifi
with your use case, and then see detailed usage.

- ``flux mini submit/flux mini run`` (:ref:`flux-mini-submit`): "Submit a job in a Flux instance"
- ``flux proxy`` (:ref:`flux-proxy-command`): "Send commands to other Flux instances"
Member

Yeah I guess this is good for now - some of these are definitely more advanced.

@chu11 chu11 force-pushed the flux_proxy_tutorial branch 3 times, most recently from d167514 to 0078415 Compare March 27, 2023 13:57
Member Author

chu11 commented Mar 27, 2023

rebased, removing "flux mini" references

@chu11 chu11 force-pushed the flux_proxy_tutorial branch from 0078415 to 47103a0 Compare March 27, 2023 14:00
@chu11 chu11 force-pushed the flux_proxy_tutorial branch from 47103a0 to 693b2b9 Compare June 7, 2023 18:51
Problem: There's no flux proxy command tutorial.

Add one.
@chu11 chu11 force-pushed the flux_proxy_tutorial branch from 693b2b9 to c6b16a7 Compare June 8, 2023 15:36