We know how to use scontrol to reserve nodes or cores, but is there a way to specifically reserve a gres ... in particular, a GPU? What would the command be? Thanks, Paul.
Hey Paul,
Currently, Slurm can’t explicitly create reservations for GRESs. Resources that can be reserved include cores, nodes, licenses, burst buffers, and features. However, you may be able to work around this limitation by creating a reservation based on a feature. To do that, set a feature on the nodes that have GPUs, using a string like “k80”, and issue a command like the following:
scontrol create reservation starttime=now nodes=all duration=15 features=k80 user=root
Just be aware that “the reservation creation request can [only] identify... *one* feature that every selected node must contain.” See https://slurm.schedmd.com/reservations.html.
Luckily, more advanced GPU reservation and scheduling is something we are actively developing for 19.05, so stay tuned! And of course, for merely scheduling GPUs, take a look at https://slurm.schedmd.com/gres.html and https://slurm.schedmd.com/gres.conf.html.
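For anyone wanting to script the feature-based workaround above, here is a minimal sketch. It only assembles and prints the scontrol command from this comment so it can be reviewed before being run on a controller. The function name, the reservation name "gpu_resv", and the reservationname= parameter are illustrative additions, not part of the original command.

```shell
#!/bin/sh
# Sketch only: prints the feature-based reservation command; nothing is
# submitted to Slurm. "gpu_resv" is a hypothetical reservation name.
build_feature_reservation() {
    feature=$1             # node feature that marks the GPU nodes, e.g. k80
    duration=$2            # reservation length in minutes
    name=${3:-gpu_resv}    # optional reservation name (assumed default)
    printf 'scontrol create reservation reservationname=%s starttime=now nodes=all duration=%s features=%s user=root\n' \
        "$name" "$duration" "$feature"
}

# Print the command for review before running it on the controller.
build_feature_reservation k80 15
```

A job could then target the reservation with something like sbatch --reservation=gpu_resv --gres=gpu:1 job.sh, where job.sh is a placeholder script. Keep in mind that this reserves whole nodes carrying the feature, not individual GPUs.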
Feel free to reopen this bug if you have any follow-up questions. Thanks, Michael
I do not see this working yet with 22.05.
scontrol create reservation partition=debug starttime=now duration=120 duration=120 user=root flags=maint nodes=ALL tres=gres/gpu:gtx=3
scontrol: error: TRES type 'gres/gpu:gtx' not supported with reservations
Hi Markus. We have an active enhancement request tracking this in bug 10934.
> I do not see this working yet with 22.05.
As mentioned previously, this is something Slurm does not currently support. Quoting Tim's reply regarding future plans for reservations and GRES:
> This remains an unsponsored development request, and as such, there is no
> specific timeframe we expect to implement this on. If a SchedMD customer is
> interested in sponsoring development then it'll be much more likely to move
> forward, otherwise, like a lot of other tickets filed under '5 - Enhancement' -
> this will remain in limbo.
*** Bug 10934 has been marked as a duplicate of this bug. ***
Markus, could you give me a few examples of how you would expect to use this? Our expectation is that a reservation will always include cores/nodes along with the other TRES requested. Does this match your expectations/workflow as well?
Markus, I am starting to work on this and am looking for the further guidance mentioned in comment 9. Please reply at your earliest convenience.
(In reply to Danny Auble from comment #9)
> An expectation on our end is a reservation will always include cores/nodes
> along with the other TRES requested. Does this meet with your
> expectations/workflow as well?
That depends on your definition of "cores". I would not require CPU cores as part of the reservation, as it could default to the partition's DefCpuPerGPU when a GPU TRES is provided. For nodes, I agree with your expectation, as in "one node with 4 gpus" or "4 nodes with one gpu each".
*** Bug 17226 has been marked as a duplicate of this bug. ***
Markus, I think I have what is required in the master branch after commit 0c0ddd55f1. Could you please test and verify that things are working as you would expect? I added two options: TRES=gres/gpu:1 and TRESPerNode=gres/gpu:1. Both are case-insensitive. Let me know whether or not you run into any problems. Thanks!
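For testing the two new options, a small sketch like the following may help. It only prints candidate commands and submits nothing to Slurm. The tres=/trespernode= strings are copied from this comment as written, and the partition, duration and user values are borrowed from the earlier failing command. The function name is hypothetical, and the exact count syntax a given build accepts should be checked against the scontrol man page for that build.

```shell
#!/bin/sh
# Sketch only: prints candidate commands; nothing is submitted to Slurm.
# Option names and spec strings are taken from this thread as written;
# partition=debug, duration=120 and user=root come from the earlier
# failing command and are assumptions about the test setup.
build_gpu_reservation() {
    option=$1    # "tres" or "trespernode" (both reported as case-insensitive)
    spec=$2      # TRES spec as written in this thread, e.g. gres/gpu:1
    printf 'scontrol create reservation partition=debug starttime=now duration=120 user=root %s=%s\n' \
        "$option" "$spec"
}

build_gpu_reservation tres gres/gpu:1
build_gpu_reservation trespernode gres/gpu:1
```

After creating a reservation on a test system, scontrol show reservation can be used to inspect which nodes and TRES were actually selected.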
Please reopen (or open a new bug) if this is not working as expected. The current master branch should have all the functionality required.