Caltech Resnick High Performance Computing Center

General

  • About
    • Mission
    • Research community
    • Getting help
    • Infrastructure
    • Sponsorship
    • Citing the Center
  • People
    • IMSS Leadership
    • IMSS Technical Leadership
    • IMSS Technical Staff
    • Contact
  • Services
    • Getting started
    • Performance & optimization
    • Software & environment support
    • Data transfer & storage
    • Interactive & visualization computing
    • Get in touch
  • Support
    • Getting Help
    • Self-service documentation
    • Consultations
    • System Status
    • Location
  • Rates
    • Cost Calculator
    • Compute Pricing
      • Compute Unit Definitions
    • Storage Pricing
    • Questions
  • Resources
    • Cluster Overview
    • Login Nodes
      • Standard Login
      • Visualization Login
    • Compute Nodes
    • GPU Nodes
    • Contact
  • Software
    • System Software
    • User Software
    • Pre-installed Software
    • Installation Methods
      • Module System
      • Spack
      • Anaconda/Conda
      • Singularity/Apptainer Containers
    • Software Guides
      • Abaqus
        • Known Issue
        • Solution
        • Example Job Script
        • Getting Help
      • MATLAB
        • Loading MATLAB
        • Running MATLAB on the Cluster
        • Parallel Computing
        • Tips
      • Jupyter Notebook
        • Launching Jupyter
        • Adding Conda Environments as Kernels
        • Accessing Group Directories
        • Julia Support
        • Configuration
        • Troubleshooting
      • cryoSPARC
        • Installation
        • Requesting Access
        • Connecting
        • Management Commands
        • Job Submission
      • RStudio
        • Launching RStudio
        • Package Installation
        • Example: Installing tidyverse
        • Troubleshooting
      • NVIDIA NGC
        • Account Setup
        • Configuration
        • Pulling Containers
        • Running Containers
        • NGC CLI Tool
        • Example SLURM Script
        • Popular NGC Containers
      • Relion using SBGrid
        • Setup
        • SBGrid Preferences
        • SLURM Submission Template
        • Verification
        • Troubleshooting
      • AlphaFold
        • Prerequisites
        • Setup
        • Job Submission
        • Output
        • Additional Options
        • Tips
      • VSCode
        • Prerequisites
        • Setup
        • First Connection
        • Advanced: Direct Compute Node Connection
        • Troubleshooting
      • Available Software Modules
        • Categories
        • Requesting New Software
        • See Also
    • Institutional Licenses
    • Request New Software
  • Open OnDemand
    • Features
    • Getting Started
    • Interactive Desktop
    • Troubleshooting
      • PATH Conflicts
      • Browser Issues
      • File Upload Failures
      • “Request Header Too Long” Error
      • Home Directory Quota
    • Support
  • System Status
    • Current Status
    • Active Announcements
    • Check the Cluster Yourself
      • Job queues
      • Node availability
      • Storage health
      • Job efficiency after a run
    • Notifications
      • Mailing List
    • Reporting an Issue
  • Training
    • Self-Study Resources
      • Internal
      • External (recommended)
    • Announcements
  • Citing the Resnick High Performance Computing Center
    • Recommended acknowledgement
    • BibTeX
    • CITATION.cff
    • Let us know when you publish
    • Questions
  • Policies
    • Acceptable use
    • Data handling
    • Export control
    • Security
    • Questions

Getting Started

  • Quick Start Guide
    • 1. Get an account
    • 2. Connect via SSH
      • Recommended SSH config
    • 3. Move some data over
    • 4. Find and load software
    • 5. Submit your first job
    • 6. Cheat sheet
    • 7. Where to put your files
    • Next steps
    • Stuck?
  • Getting Started
    • Quick Links
    • First Steps
    • Logging In
    • Next Steps
  • Account Information
    • Getting an Account
      • For New Groups
      • For Existing Groups
    • Multi-Factor Authentication
      • Supported Methods
    • Eligibility Certification
    • HPC End-User Agreement

Running Jobs

  • SLURM Commands
    • Job Submission
      • sbatch
      • salloc
      • srun
    • Resource Request Parameters
    • Environment Variables
    • Queue Management
      • squeue
      • scancel
      • scontrol
    • Usage Reporting
      • sreport
      • sacct
    • Account Management
    • Task Launching
      • MPI Jobs
    • Example Batch Script
  • Fairshare & Job Priority
    • How fairshare works
    • Check your group’s shares and usage
    • See where your jobs sit in the queue
    • What raises your priority
  • Example Job Scripts
    • Basic Examples
      • Serial Job
      • Multi-threaded (OpenMP)
      • MPI (Multi-node)
    • GPU Jobs
      • Single GPU
      • Multi-GPU
      • Specific GPU Type
    • Python & Conda
      • Conda Environment
      • Jupyter Batch
    • Job Arrays
      • Parameter Sweep
      • Limit Concurrent Jobs
    • Applications
      • MATLAB
      • R
      • GROMACS
      • AlphaFold
    • Job Dependencies
      • Sequential Pipeline
      • Fan-out, Fan-in
    • Email Notifications
    • Generate a Custom Script
    • See Also
  • Best Practices
    • Resource Requests
      • Request What You Need
      • Match CPUs to Parallelism
      • Add Time Buffer
    • Job Arrays
    • Storage
      • Use the Right Location
      • Monitor Your Quota
      • Scratch Warning
    • I/O Performance
    • Code Efficiency
      • Profile First
      • Set Thread Counts
    • Checkpointing
    • Environment Management
    • Monitoring
    • Good Citizenship
    • Pre-Submit Checklist
    • Questions?
  • GPU Computing
    • Available GPUs
    • Requesting GPUs
      • Single GPU
      • Multiple GPUs
      • Specific GPU Type
    • Basic GPU Job
    • CUDA Programming
      • Load CUDA
      • Compile CUDA Code
      • Check GPU in Code
    • Deep Learning Frameworks
      • PyTorch
      • TensorFlow
      • JAX
    • Using Containers for Deep Learning
      • NGC Containers (Recommended)
      • Pull NGC Container
    • Multi-Node GPU Training
    • GPU Memory Management
      • Check Memory Usage
      • Reducing Memory Usage
    • Best Practices
    • Troubleshooting
      • “CUDA out of memory”
      • GPU not detected