tutorials: add flux proxy command tutorial #200
base: master
tutorials/commands/index.rst
Outdated
- ``flux-mini-submit`` (:ref:`flux-mini-submit`): "Submit a job in a Flux instance"
- ``flux proxy`` (:ref:`flux-proxy-command`): "flux proxy basics"
Can we give the reader a better sense here of why they would want to use flux proxy, and how it differs from the "ssh across clusters" tutorial?
Flux Proxy Basics
=================

With Flux, it is very common to create a Flux :ref:`subinstances<subinstance>`. However, it can be confusing how to interact with those subinstances.
So this is saying when I launch a flux job that controls some set of subinstances, I would use flux proxy to communicate with them? Or could I have flux controlling multiple clusters, and then flux proxy would give me access to a cluster? What is a subinstance (or what can it be) in concrete terms? Why would I want to communicate with it? E.g., what we might have here is "you might want to do this if you are doing X or Y or Z."
Maybe a better intro here would be to talk about traditional batch jobs running a batch script, and how in Flux those "batch jobs" are actually running a full instance of Flux, and the flux proxy command can be used to execute processes, including an interactive shell, as if they were connected directly to the instance of Flux running your batch workload.
Also, flux proxy is not restricted to running against subinstances. You can also use it to connect to a system instance running on another cluster, a test instance running locally, or as you show below, a Flux instance running under a Slurm, LSF, or other resource manager job.
> Maybe a better intro here would be to talk about traditional batch jobs running a batch script, and how in Flux those "batch jobs" are actually running a full instance of Flux, and the flux proxy command can be used to execute processes, including an interactive shell, as if they were connected directly to the instance of Flux running your batch workload.
The way I imagined it was that we'd eventually have this discussed more in a flux mini batch tutorial. Although I can probably go into it a little more than I have so the reader isn't quite as clueless.

.. code-block:: sh

    #!/bin/sh
Wait where does flux mini batch come in? This looks like it's running 4x flux mini submit?
Is there a batch equivalent?
To answer your question @v, flux mini batch is the flux mini utility that submits a "batch job", which in Flux is a job that runs an instance of Flux on the assigned resources with a batch script as its "initial program".
okay - so you give it a script with other flux submit commands? As opposed to (what I've seen for other managers) a text file with some list of single commands (without the job manager directive) to run?
How does that (conceptually) compare to a job array?
> Wait where does flux mini batch come in? This looks like it's running 4x flux mini submit?
I listed the script and then ran the script in flux mini batch right below. I guess I should put a sentence in between; perhaps it's not obvious when skimming.

.. code-block:: console

    > flux mini batch -n4 ./subinstance-jobs.sh
I totally don't understand what is going on here, lol (I've never used batch before).
OK going to stop here and submit this early feedback because I can't follow along (sorry, not familiar with batch!)
lemme try to add a bit more detail then. One issue is that we are writing these tutorials out of order. I jumped on flux proxy mostly b/c you already wrote the ssh-across-clusters one. Really, flux mini batch probably should have come after flux mini submit, which I will add to the todo list.
Shouldn't we encourage the use of --cc in this case? Possibly using waitable jobs and flux job wait for more "standard" practice.
For some of these small examples I figure it's easier to do this than to have to explain --cc (leaving --cc to #195).
Do we have a preference on flux job wait vs flux job status for something this small?
Good points. Given that, I'm fine with current example, was more of a question anyway.
Also added some quick comments, though I didn't get all the way through.
Connect to the Instance
-----------------------

The easiest way to operate with a subinstance is to use ``flux-proxy``. You can pass it the subinstance jobid and the command you want to run against that subinstance. Lets submit a job via ``flux mini submit`` and list it via ``flux jobs``.
I'd remove subinstance here, might be less confusing to just say jobid.
    fh2fcK1 achu sleep R 1 1 4.152m opal186
    fa6txwZ achu sleep R 1 1 4.157m opal186

Notice that we're running ``flux jobs`` against the subinstance itself. We don't have to specify the ``--recursive`` option to list the jobids anymore.
More interesting might be to submit more work to the instance (this is where flux job wait --all would come in handy).

Notice that we're running ``flux jobs`` against the subinstance itself. We don't have to specify the ``--recursive`` option to list the jobids anymore.

You can also ssh to the node if you know the subinstance's Flux :ref:`URI<URI>`. This may be useful if you want to proxy to the subinstance from outside of the cluster at a later time (see :ref:`SSH Across Clusters`<ssh-across-clusters>` for more information).
I think you mean here "you can also specify a native uri" instead of a target jobid?
    fa6txwZ achu sleep R 1 1 6.385m opal186

If you do not specify a command to run, you will be dropped into a shell that will forward all commands to the subinstance.
Suggested change:
- If you do not specify a command to run, you will be dropped into a shell that will forward all commands to the subinstance.
+ If you do not specify a command to run, you will be dropped into a shell that will forward all messages to the subinstance.
?
226ee89 to 293cc90
Re-pushed, re-working the flow of the tutorial in a different way. Instead of starting with Side note, pushed the dumb cleanups into #202, so it's only the tutorial now.
293cc90 to 2efaf74
2efaf74 to 9860222
re-pushed with some minor tweaks given discussion from other PRs. If there is still a little confusion of what is
9860222 to dc7ec31
@@ -7,6 +7,7 @@ Welcome to the Command Tutorials! These tutorials should help you to map specifi
with your use case, and then see detailed usage.

- ``flux mini submit/flux mini run`` (:ref:`flux-mini-submit`): "Submit a job in a Flux instance"
- ``flux proxy`` (:ref:`flux-proxy-command`): "Send commands to other Flux instances"
Yeah I guess this is good for now - some of these are definitely more advanced.
d167514 to 0078415
rebased, removing "flux mini" references
0078415 to 47103a0
47103a0 to 693b2b9
Problem: There's no flux proxy command tutorial. Add one.
693b2b9 to c6b16a7
No description provided.