submitted a job to a gpu32 partition, without specifying a GPU32 Account. #8

@lesolano

I began by prompting it to create a .py script that I wanted to run. It also created the .sh script below to drive the .py script. Either we do not have access to these resources, or it is not requesting them the correct way.

The terminal gives the following error:

(base) [lesolano@login-i17 Cobos]$ sbatch /dfs7/swaruplab/lesolano/Cobos/ParseReseq/Analysis/scripts/train_mrvi_FAST.sh
sbatch: error: !!! slurm_job_submit ERROR: 302732 submitted a job to a gpu32 partition, without specifying a GPU32 Account.
sbatch: error: Batch job submission failed: Unspecified error

The generated .sh script:

#!/bin/bash
# FAST OPTION: Uses newer L40S GPUs (3x faster than A30)
# Resources: L40S GPU + maximum available memory (256GB)
# Expected runtime: 30-60 minutes

#SBATCH --job-name=Train_mrVI_L40S         # Job name
#SBATCH -A vswarup_lab                     # Account
#SBATCH -p gpu32                           # Partition with L40S GPUs
#SBATCH --nodes=1                          # Number of nodes
#SBATCH --ntasks=1                         # Number of tasks
#SBATCH --gres=gpu:L40S:1                  # GPU: L40S (3x faster than A30)
#SBATCH --error=err_mrvi_l40s_%j.log       # Error log file
#SBATCH --output=out_mrvi_l40s_%j.log      # Output log file
#SBATCH --time=12:00:00                    # Time limit (12 hours)
#SBATCH --mail-user=lesolano@uci.edu       # Email notifications
#SBATCH --mail-type=END,FAIL,REQUEUE       # Notify on completion/failure/requeue
#SBATCH --mem=230G                         # Request 230GB (L40S nodes have 256GB available)

# Load conda
echo "Sourcing conda..."
source /pub/lesolano/miniconda3/etc/profile.d/conda.sh
if [ $? -ne 0 ]; then
    echo "ERROR: Failed to source conda"
    exit 1
else
    echo "Successfully sourced conda"
fi

# Activate environment
conda activate scgpu

# Navigate to scripts directory
cd /dfs7/swaruplab/lesolano/Cobos/ParseReseq/Analysis/scripts

# Run mrVI training
echo "Starting mrVI training on L40S GPU with 230GB memory..."
echo "Expected runtime: 30-60 minutes"
python Train_mrVI_Model.py

echo "mrVI training complete!"
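
The error message suggests that the gpu32 partition expects a GPU-specific account, while the script above passes `-A vswarup_lab`. A minimal sketch of a pre-submission check that would catch this mismatch before calling sbatch is shown below. The function names `sbatch_opt` and `check_gpu_account` are hypothetical, and the rule that GPU accounts end in `_gpu` is an assumption that should be confirmed against the cluster's documentation.

```shell
#!/bin/bash
# Hypothetical pre-submission check; not part of Slurm or the cluster's tooling.
# ASSUMPTION: GPU-enabled accounts end in "_gpu". Verify the real naming
# convention with the cluster documentation or administrators.

# Print the value of an #SBATCH option given as "-X value", "--name=value",
# or "--name value". Trailing "# comment" text is ignored.
sbatch_opt() {
    local script="$1" short="$2" long="$3"
    sed -n -E "s/^#SBATCH[[:space:]]+(-${short}[[:space:]]*|--${long}[=[:space:]]+)([^[:space:]#]+).*/\2/p" "$script" | tail -n 1
}

# Return 1 (and warn on stderr) if a gpu* partition is requested with an
# account that does not match the assumed "_gpu" suffix.
check_gpu_account() {
    local script="$1" account partition
    account=$(sbatch_opt "$script" A account)
    partition=$(sbatch_opt "$script" p partition)
    case "$partition" in
        gpu*)
            case "$account" in
                *_gpu) ;;  # looks like a GPU account under the assumed convention
                *)
                    echo "WARNING: partition '$partition' requested with account '${account:-<none>}', which does not look like a GPU account" >&2
                    return 1
                    ;;
            esac
            ;;
    esac
    echo "OK: partition '${partition:-<default>}', account '${account:-<default>}'"
}
```

With these functions sourced, `check_gpu_account train_mrvi_FAST.sh && sbatch train_mrvi_FAST.sh` would refuse to submit the script above. The accounts a user can actually charge can usually be listed with `sacctmgr show associations user=$USER`, which is a more reliable source than the suffix heuristic.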
