
[Guide] PyTorch Environment Setup: Docker + SSH + Jupyter

Summary: A step-by-step guide to setting up a Deep Learning environment on NVIDIA GPU Bare Metal Servers using the official PyTorch Docker image. Covers high-performance flags, SSH port mapping (2222), and JupyterLab setup.

1. Host Operations: Docker Run

Start the container with flags optimized for A800 clusters.

bash
docker run -it \
  --gpus all \
  --ipc=host \
  --network host \
  --privileged=true \
  --shm-size=20G \
  -v /home/test/test11/limu:/workspace/limu \
  --name pytorch-dev-env \
  nvcr.io/nvidia/pytorch:25.11-py3 bash

Key Parameters:

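  • --network host: The container shares the host's network stack, so NCCL traffic skips Docker's bridge/NAT layer and sees the host's network interfaces directly. Ports are shared as well, which is why the container's SSH server cannot use port 22 (see step 2).
  • --ipc=host: The container shares the host's IPC namespace, including /dev/shm. This prevents the shared-memory errors that PyTorch DataLoader workers hit with Docker's default 64 MB /dev/shm. With this flag set, the container uses the host's /dev/shm, so --shm-size=20G only matters if --ipc=host is removed.
  • --privileged=true: Gives the container access to all host devices and removes most isolation. Use it only on trusted machines.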
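
To confirm that the shared-memory setting took effect, a quick check inside the container can compare the size of /dev/shm against a threshold. This is a minimal sketch: the helper name shm_at_least and the 1 GiB threshold are illustrative assumptions, not part of the original setup.

```shell
# Hypothetical helper: succeeds if the filesystem holding $1 is at least $2 KiB in total.
shm_at_least() {
  # df -Pk prints one POSIX-format line per filesystem, sizes in 1024-byte blocks;
  # field 2 of the second line is the total size.
  total_kib=$(df -Pk "$1" 2>/dev/null | awk 'NR==2 {print $2}')
  # Treat a failed lookup as size 0 so the check fails cleanly.
  [ "${total_kib:-0}" -ge "$2" ]
}

# Assumed threshold: 1 GiB (1048576 KiB).
if shm_at_least /dev/shm 1048576; then
  echo "/dev/shm is at least 1 GiB"
else
  echo "/dev/shm is under 1 GiB; check --ipc=host or --shm-size"
fi
```

If the check reports a small /dev/shm, the container was most likely started without --ipc=host.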

2. Container Setup: SSH Service

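Because the container runs with --network host, it shares the host's ports, and port 22 is already taken by the host's own SSH daemon. The container's SSH server therefore listens on port 2222.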
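
Before settling on 2222, it may be worth confirming on the host that nothing already listens on that port. The sketch below assumes the ss tool from iproute2 is available; the function name port_listed is hypothetical.

```shell
# Hypothetical helper: reads `ss -ltn` output on stdin and succeeds if some
# listening socket's local address (field 4) ends in :<port>.
port_listed() {
  awk -v p="$1" '$4 ~ (":" p "$") { found = 1 } END { exit !found }'
}

# Only run the check when ss is installed.
if command -v ss >/dev/null 2>&1; then
  if ss -ltn | port_listed 2222; then
    echo "port 2222 is already in use; pick another port"
  else
    echo "port 2222 looks free"
  fi
fi
```

If the port turns out to be taken, any other unused port works as long as the same number is used in sshd_config and by SSH clients.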

bash
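# Install the SSH server and an editor
apt-get update && apt-get install -y openssh-server vim

# Set the port and allow root login in /etc/ssh/sshd_config
sed -i 's/^#\?Port .*/Port 2222/' /etc/ssh/sshd_config
sed -i 's/^#\?PermitRootLogin .*/PermitRootLogin yes/' /etc/ssh/sshd_config

# Root needs a password (or a key in /root/.ssh/authorized_keys) to log in
passwd root

# Restart SSH so the new settings take effect
service ssh restart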

3. JupyterLab Setup

bash
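# --break-system-packages lets pip install into the system Python despite its PEP 668 protection
pip install jupyterlab --break-system-packages
# Run in the background on all interfaces; output, including the login URL with its token, goes to jupyter.log
nohup jupyter lab --ip=0.0.0.0 --port=8888 --allow-root --no-browser > jupyter.log 2>&1 &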
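
JupyterLab protects the server with a token by default and prints the login URL at startup, so with the redirect above the token ends up in jupyter.log. A minimal sketch for pulling it out, assuming the default token-in-URL log format; the function name jupyter_token is hypothetical.

```shell
# Hypothetical helper: print the first token found in a Jupyter log file.
jupyter_token() {
  grep -o 'token=[A-Za-z0-9]*' "$1" | head -n 1 | cut -d= -f2
}

# Print the token if the log written by the nohup command exists.
if [ -f jupyter.log ]; then
  jupyter_token jupyter.log
fi
```

Running jupyter server list inside the container should show the same URL and token.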

4. Persist Environment (Commit)

Run on the host (the container name pytorch-dev-env also works in place of <Container_ID>):

bash
docker commit -m "SSH+Jupyter Ready" <Container_ID> pytorch-jupyter:v1
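
docker commit saves the container's filesystem, not its running processes, so sshd and JupyterLab have to be started again in any container created from the new image. One way to do that is sketched below with the same flags as step 1; the function name start_saved_env and the container name pytorch-jupyter-env are hypothetical.

```shell
# Hypothetical wrapper: start a detached container from the committed image
# and bring up sshd and JupyterLab inside it.
start_saved_env() {
  docker run -d \
    --gpus all \
    --ipc=host \
    --network host \
    --privileged=true \
    --shm-size=20G \
    -v /home/test/test11/limu:/workspace/limu \
    --name pytorch-jupyter-env \
    "${1:-pytorch-jupyter:v1}" \
    bash -c 'service ssh start && exec jupyter lab --ip=0.0.0.0 --port=8888 --allow-root --no-browser'
}

# Usage (on the host): start_saved_env pytorch-jupyter:v1
```

JupyterLab runs as the container's main process here, so the container stays up for as long as the server does.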

AI-HPC Organization