Torch Utility Applications

Torch has several utility applications that can give you information related to your account and jobs on the cluster:

`myquota`

Use the myquota command to check your current quota utilization. It reports the quota limits on the mounted file systems, your current usage, and that usage as a percentage of each limit.

In the following example, the user who runs the myquota command has exceeded the inode quota in their home directory. The inode quota limit on the /home file system is 30.0K inodes and the user has 33000 inodes, which is 110% of the limit.

$ myquota
Quota Information for NetID
Hostname: torch-login-2 at 2025-12-09 17:18:24

Filesystem   Environment       Backed up?   Allocation           Current Usage
Space        Variable          /Flushed?    Space / Files        Space(%) / Files(%)

/home        $HOME             YES/NO       0.05TB/0.03M         0.0TB(0.0%)/33000(110%)
/scratch     $SCRATCH          NO/YES       5.0TB/5.0M           0.0TB(0.0%)/1(0%)
/archive     $ARCHIVE          YES/NO       2.0TB/0.02M          0.0TB(0.0%)/1(0%)
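
When a file system is over its inode limit, it can help to count the entries under a directory and compare the count with the limit. The following is a minimal sketch using standard tools; `count_entries` and `percent_of_quota` are hypothetical helper names, not part of myquota, and the count only approximates what the file system charges against the quota (for example, a hard-linked file is counted once per name here).

```shell
# Hypothetical helpers, not part of myquota.

# Count files and directories under a directory (including the directory itself),
# staying on one file system. File names containing newlines would be overcounted.
count_entries() {
    find "$1" -xdev | wc -l | tr -d ' '
}

# Express a usage figure as a whole-number percentage of a limit.
percent_of_quota() {
    awk -v used="$1" -v limit="$2" 'BEGIN { printf "%.0f\n", 100 * used / limit }'
}

# Figures from the report above: 33000 inodes against a 30000-inode limit.
percent_of_quota 33000 30000   # prints 110
```

Calling `count_entries` on each subdirectory of `$HOME` in turn is one way to find the directory that holds most of the entries.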

`my_slurm_accounts`

my_slurm_accounts returns a list of the Slurm accounts associated with your HPC account:

[NetID@torch-login-b-1 ~]$ my_slurm_accounts
Account                          Descr                                                        
-------------------------------- ------------------------------------------------------------ 
torch_pr_XXX_XXXXX               project description                                                  

Use the appropriate entry in the Account column for the job you are submitting.
You will need to specify the account on the command line like:

srun --account=torch_pr_XXX_XXXXX --pty bash
sbatch -c4 -t2:00:00 --mem=4G --account=torch_pr_XXX_XXXXX my_script.sh

or in your sbatch file you'll need to add a line like:

#SBATCH --account=torch_pr_XXX_XXXXX

You'll need to modify the above to use your actual account.
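
If you have a single account, you may prefer to read its name from the command output instead of typing it. The sketch below assumes the output layout shown above (a line of column titles, a line of dashes, then one account per row); `first_slurm_account` is a hypothetical helper, not a cluster utility.

```shell
# Hypothetical helper: print the first account name from my_slurm_accounts output.
# Assumes two header lines (column titles and dashes) precede the account rows.
first_slurm_account() {
    awk 'NR > 2 && NF { print $1; exit }'
}

# Intended use on a login node (shown as a comment, not run here):
#   srun --account="$(my_slurm_accounts | first_slurm_account)" --pty bash
```

If you belong to several accounts, choose the account explicitly rather than relying on the first row.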

Please see Slurm: Command reference for details.

For more information about slurm accounts please see Slurm Accounts.

`nvidia-smi`

nvidia-smi (NVIDIA System Management Interface) is a command-line utility, based on the NVIDIA Management Library (NVML), used to monitor and manage NVIDIA GPU devices.

It provides detailed information such as:

GPU utilization
Memory usage
P-States: performance states from P0 (maximum performance) to P12 (minimum performance)
Device details such as power consumption and temperature
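
The items listed above can also be requested as comma-separated values with the `--query-gpu` and `--format` options of nvidia-smi, which is convenient for logging. The sketch below formats that output into one line per GPU; `summarize_gpu_csv` is a hypothetical helper, and the field order it expects is an assumption that must match the query shown in the comment.

```shell
# Hypothetical helper: format CSV rows from nvidia-smi into one readable line per GPU.
# Expects the fields index, utilization.gpu, memory.used, memory.total, pstate,
# power.draw, temperature.gpu, in that order, without header or units.
summarize_gpu_csv() {
    awk -F', *' '{
        printf "GPU %s: %s%% util, %s/%s MiB, %s, %s W, %s C\n", $1, $2, $3, $4, $5, $6, $7
    }'
}

# Intended use on a GPU node (shown as a comment, not run here):
#   nvidia-smi --query-gpu=index,utilization.gpu,memory.used,memory.total,pstate,power.draw,temperature.gpu \
#       --format=csv,noheader,nounits | summarize_gpu_csv
```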

Tip: You can get output refreshed every 5 seconds with:

nvidia-smi -l 5

Alternatively, you can use:

/share/apps/images/run-nvtop-3.2.0.bash nvtop

Tip: You can get very detailed information about the GPU with:

[NetID@gl001 ~]$ nvidia-smi -q

`seff`

The seff script displays status and resource-efficiency information about your past or running jobs.

Here's example output for a job:

[NetID@torch-login-b-1 ~]$ seff 6239104
Job ID: 6239104
Cluster: torch
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 5
CPU Utilized: 00:00:07
CPU Efficiency: 28.00% of 00:00:25 core-walltime
Job Wall-clock time: 00:00:05
Memory Utilized: 50.54 MB
Memory Efficiency: 4.94% of 1.00 GB

As shown above, seff reports CPU and memory efficiency, which helps you size future resource requests. In this example, the job used 28% of its allocated core time and less than 5% of its requested memory.
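
The percentages in the seff report can be reproduced by hand, which makes clear what each one measures. This is a minimal sketch of that arithmetic; `efficiency_percent` is a hypothetical helper, and treating 1.00 GB as 1024 MB is an assumption that happens to match the reported figure.

```shell
# Hypothetical helper: percentage of an allocation actually used, to two decimals.
efficiency_percent() {
    awk -v used="$1" -v allocated="$2" 'BEGIN { printf "%.2f\n", 100 * used / allocated }'
}

# CPU: 7 s of CPU time out of 5 cores x 5 s wall-clock time = 25 core-seconds.
efficiency_percent 7 $((5 * 5))   # prints 28.00

# Memory: 50.54 MB used out of 1.00 GB requested, taken here as 1024 MB.
efficiency_percent 50.54 1024     # prints 4.94
```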

Tip: Requesting the minimum resources needed for your job can help it spend less time in the queue.

`show_slurm_qos`

show_slurm_qos shows, for each QOS, the maximum wall time (MaxWall) and the maximum number of CPUs, GPUs, and amount of memory allowed per user (MaxTRESPU).

[NetID@torch-login-b-0 ~]$ show_slurm_qos
                    Name     MaxWall                MaxTRESPU    Preempt   PreemptExemptTime PreemptMode 
------------------------ ----------- ------------------------ ---------- ------------------- ----------- 
               cpu_short    06:00:00          cpu=32,mem=120G                                    cluster 
                  cpu168  7-00:00:00       cpu=1000,mem=2000G                                    cluster 
                   cpu48  2-00:00:00       cpu=3000,mem=6000G                                    cluster 
                 cpuprem  2-00:00:00    cpu=30000,mem=120000G                                    cluster 
                  gpu168  7-00:00:00               gres/gpu=4                                    cluster 
                   gpu48  2-00:00:00              gres/gpu=16                                    cluster 
             interactive    06:00:00           cpu=16,mem=60G                                    cluster

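In general, a QOS with a shorter maximum wall time allows more resources: cpu48 allows 3000 CPUs versus 1000 for cpu168, and gpu48 allows 16 GPUs versus 4 for gpu168. The cpu_short and interactive entries are exceptions, with smaller limits despite their 6-hour maximum wall time.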
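
The MaxWall values are written as days-hours:minutes:seconds. Converting them to hours makes it easier to compare a requested time limit against the table; 2-00:00:00 is 48 hours and 7-00:00:00 is 168 hours. The sketch below does this conversion; `slurm_time_to_hours` is a hypothetical helper that only handles the two forms appearing in the table above.

```shell
# Hypothetical helper: convert a time limit of the form [D-]HH:MM:SS to whole hours.
# Minutes and seconds are ignored; other Slurm time formats are not handled.
slurm_time_to_hours() {
    awk -v t="$1" 'BEGIN {
        days = 0
        if (split(t, parts, "-") == 2) { days = parts[1]; t = parts[2] }
        split(t, hms, ":")
        print days * 24 + hms[1]
    }'
}

slurm_time_to_hours 2-00:00:00   # prints 48
slurm_time_to_hours 7-00:00:00   # prints 168
slurm_time_to_hours 06:00:00     # prints 6
```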
