Turing: Control commands for batch jobs

The principal commands used to control your jobs are the following:

  • llsubmit submits a batch job.
  $ llsubmit my_job.ll
  llsubmit: Processed command file through Submit Filter: "/bglocal/loadl/Fidris/llsubmit_exit".
  llsubmit: The job "turing1.idris.fr.2208" has been submitted.

Any other message indicates an error in the job submission; in some cases, an error message set up by IDRIS indicates which submission parameter was omitted.

Note: At IDRIS, the llsubmit command does not accept any options.
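
When submissions are scripted, the job identifier can be extracted from the confirmation message. The following is a minimal sketch, assuming the message has the form shown in the example above; the function name extract_job_id is hypothetical and is not a tool provided by IDRIS.

```shell
#!/bin/sh
# Hypothetical helper: read llsubmit output on stdin and print the job
# identifier found in the "has been submitted" line.
# The identifier may be surrounded by single or double quote characters.
extract_job_id() {
    sed -n 's/^llsubmit: The job ["'\'']*\([^"'\'' ]*\)["'\'']* has been submitted\.$/\1/p'
}

# Usage sketch (my_job.ll is the command file from the example):
#   job_id=$(llsubmit my_job.ll | extract_job_id)
#   echo "Submitted: $job_id"
```

If the message does not match (for example, because the submission failed), the function prints nothing, so an empty result can be treated as an error by the calling script.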

  • llq [-u] displays information about the current state of all the batch jobs on the machine. The -u option restricts the display to the jobs belonging to a specified login (for example, your own jobs).
  $ llq -u rlab432
  Id                       Owner      Submitted   ST PRI Class        Running On
  ------------------------ ---------- ----------- -- --- ------------ -----------
  turing1.2208.0           rlab432    10/10 16:53 I  100 2Rt1
  1 job step(s) in queue, 1 waiting, 0 pending, 0 running, 0 held, 0 preempted
The status (ST) column indicates whether your job is running (R) or idle/waiting (I). The other common states are NQ (Not Queued, i.e. outside of the queues), H (Hold), and CS (Changing State), which indicates that a job or a step has finished but is still in the process of leaving the queues.
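
To get a quick summary of the ST column, the llq output can be filtered with standard tools. The following is a minimal sketch, assuming the column layout shown above (ST is the fifth whitespace-separated field of a job line); the function name count_by_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read llq output on stdin and print "<status> <count>"
# for each value found in the ST column, sorted by status.
count_by_status() {
    awk '
        # Job lines start with an identifier such as turing1.2208.0.
        # The header, separator and summary lines do not match this pattern.
        $1 ~ /^[A-Za-z0-9_-]+\.[0-9]+\.[0-9]+$/ { n[$5]++ }
        END { for (s in n) print s, n[s] }
    ' | sort
}

# Usage sketch:
#   llq -u rlab432 | count_by_status
```

With the sample output above, this would print a single line, "I 1".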
  • llcancel deletes a job. For example, to delete the job turing1.2208.0 which is running on the Blue Gene/Q machine:
  $ llcancel turing1.2208
  llcancel: Cancel command has been sent to the central manager
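
In the example above, the identifier passed to llcancel (turing1.2208) is the one displayed by llq (turing1.2208.0) without its final step number. The following is a minimal sketch that prints, without executing them, the llcancel commands for all jobs in a given state; it assumes the llq column layout shown earlier, and the function name print_cancel_commands is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read llq output on stdin and print one llcancel
# command per job line whose ST column equals the status given as $1.
# Nothing is cancelled: the commands are only printed for review.
print_cancel_commands() {
    awk -v st="$1" '
        $1 ~ /^[A-Za-z0-9_-]+\.[0-9]+\.[0-9]+$/ && $5 == st {
            id = $1
            sub(/\.[0-9]+$/, "", id)   # drop the step number: turing1.2208.0 -> turing1.2208
            print "llcancel " id
        }
    '
}

# Usage sketch: list the cancel commands for your idle jobs.
#   llq -u rlab432 | print_cancel_commands I
```

Printing the commands first, rather than running them directly, leaves a chance to check the list before anything is deleted.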
  • idrjar displays information about the resource consumption of your batch jobs on the Turing machine. It includes the following:
    • Elapsed time.
    • Dates of job submission and of the beginning and end of execution.
    • Reserved resources.
    This information is not available until the day following the job execution. To learn more about this command, run idrjar -h on Turing. For example:
  $ idrjar
  |----------------------------------------------|
  |--- IDRIS/CNRS.Version of 3 October 2013 ---|
  |----------------------------------------------|
  Outputs for login rlab432 for the period
          ==> 01 October 2013 to 09 October 2013
   Owner                  Job Name                       JobId        Queue tEse   #T   S
  ------- ---------------------------------------- ------------------ ----- ----- ----- -
  rlab432 run_test_big                             turing1.82766.0    2Rt3  66684  8192 C
  rlab432 nom_travail1234                          turing1.83111.0    1Rt2  34007  4096 C
  rlab432 run_test_small                           turing1.82992.0    MRt2  27457  1024 C
  rlab432 run_test_small                           turing1.83064.0    MRt2  27384  1024 C
  rlab432 run_test_short_medium                    turing1.83249.0    1Rt2    312  4096 R
  ---------------------------------------------------------------------------------------
          TOTAL CONSUMPTION OF THE ABOVE JOBS ==> 714924544, or 198590.15h
  ------------------------------- LEGEND -------------------------------
  tEse  : Elapsed time consumed in seconds
  #T    : number of reserved cores
  S     : C (completed) ==> job completed normally
          R (removed)   ==> job destroyed during execution (by using the command llcancel, for example)
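
The two figures on the TOTAL CONSUMPTION line are related by a division by 3600 (714924544 / 3600 ≈ 198590.15), which suggests that the first is expressed in seconds and the second in hours. The following is a minimal sketch of this conversion; the function name seconds_to_hours is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a number of seconds ($1) to hours,
# printed with two decimal places.
seconds_to_hours() {
    awk -v s="$1" 'BEGIN { printf "%.2f\n", s / 3600 }'
}

# Usage sketch, with the total from the sample output:
#   seconds_to_hours 714924544
```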