Practical Guide: IBM Spectrum Scale (GPFS) ECE Deployment

Abstract: This guide is distilled from real-world production deployment records and details how to build a highly reliable, high-performance parallel file system on x86 servers using IBM Spectrum Scale Erasure Code Edition (ECE).

1. Architecture Planning & Prerequisites

ECE (Erasure Code Edition) allows building data protection on commodity servers using distributed erasure coding, eliminating the need for expensive proprietary storage arrays.

1.1 Hardware Configuration Example

  • Storage Nodes: 4x High-Performance Servers (NVMe SSDs for data tier, HDD for capacity tier).
  • Network: InfiniBand (IB) HDR/NDR for data plane, Gigabit Ethernet for management plane.
  • OS: CentOS 8.x / RHEL 8.x

1.2 Environment Initialization (All Nodes)

1. Disable Firewall & SELinux

bash
systemctl stop firewalld && systemctl disable firewalld
sed -i "s/^SELINUX=.*/SELINUX=disabled/g" /etc/sysconfig/selinux
setenforce 0

2. Configure Trust & Hosts

Ensure /etc/hosts contains hostname entries for all nodes, then configure passwordless SSH for root between them.

bash
# Generate a key pair if one does not already exist, then distribute it
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
ssh-copy-id gpfs01; ssh-copy-id gpfs02; ssh-copy-id gpfs03; ssh-copy-id gpfs04

3. Optimize System Parameters

bash
echo "ulimit -n 65536" >> /etc/profile
# Essential Dependencies
yum -y install gcc-c++ kernel-devel cpp binutils compat-libstdc++-33
yum -y install python3 python3-distro ansible nvme-cli sg3_utils net-snmp
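
With the dependencies installed, a quick inventory check on each storage node confirms that all NVMe and HDD devices are visible before ECE later claims them (a minimal sanity check; device names and counts will differ per server):

bash
# NVMe devices: model, capacity, firmware
nvme list
# All block devices; ROTA=1 marks rotational (HDD) drives
lsblk -d -o NAME,SIZE,ROTA,MODEL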

2. InfiniBand Network Configuration

The core of high-performance storage lies in low-latency networking.

1. Upgrade Firmware (MFT Tool)

bash
# Start MFT service
mst start
# Burn firmware
flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6-rel.bin burn
# Reset device
mlxfwreset --device /dev/mst/mt4123_pciconf0 reset

2. Install OFED Driver

bash
./mlnxofedinstall --force
/etc/init.d/openibd restart
ibstat  # Port State should be Active and Physical state LinkUp
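
Before handing the fabric to GPFS, it is worth validating raw RDMA bandwidth between a pair of nodes with the perftest tools (assumed to be installed alongside OFED; hostnames are examples from this cluster):

bash
# On gpfs01 (acts as the server side)
ib_write_bw -d mlx5_0
# On gpfs02 (acts as the client side, pointing at gpfs01)
ib_write_bw -d mlx5_0 gpfs01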

3. Spectrum Scale Software Installation

The Spectrum Scale installation toolkit (Ansible-based) is used for automated deployment.

1. Initialize Cluster

bash
cd /usr/lpp/mmfs/5.1.5.1/ansible-toolkit
# Setup primary installer node
./spectrumscale setup -s 10.252.0.21 -st ece

2. Add Nodes

bash
# Add the 4 storage nodes as admin (-a), quorum (-q), manager (-m) and scale-out ECE (-so) nodes
./spectrumscale node add -a -q -m -so gpfs01
./spectrumscale node add -a -q -m -so gpfs02
./spectrumscale node add -a -q -m -so gpfs03
./spectrumscale node add -a -q -m -so gpfs04
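
Before kicking off the installation, the toolkit's view of the cluster can be reviewed to confirm the roles assigned above:

bash
./spectrumscale node list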

3. Execute Installation

bash
./spectrumscale install --skip no-ece-check
./spectrumscale deploy --skip no-ece-check
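
Once the install and deploy phases complete, the cluster definition and daemon state can be verified from any node:

bash
# Show the cluster configuration (nodes, quorum and manager roles)
mmlscluster
# All nodes should report "active"
mmgetstate -a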

4. ECE Storage Core Configuration

This is the critical step defining how disks are sliced and protected.

4.1 Specify RDMA Network

bash
# Enable RDMA and force communication over the IB ports
mmchconfig verbsRdma=enable -N ece_cluster
mmchconfig verbsPorts="mlx5_0" -N ece_cluster
# Restart the daemons cluster-wide so the change takes effect
mmshutdown -a && mmstartup -a

4.2 Drive Mapping

ECE needs to know the exact physical slot of each drive.

bash
# Auto-scan NVMe
ecedrivemapping --mode nvme
# Or manually specify HDD Slots
ecedrivemapping --mode lmr --slot-range 0-23

4.3 Create Recovery Group (RG)

bash
# Create Node Class
mmvdisk nodeclass create --node-class nc_1 -N gpfs01,gpfs02,gpfs03,gpfs04
# Configure Servers
mmvdisk server configure --node-class nc_1 --recycle one
# Create RG
mmvdisk recoverygroup create --recovery-group rg_1 --node-class nc_1 --complete-log-format
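
After creation, the recovery group and the pdisks it discovered can be listed to confirm that every physical drive was picked up as healthy:

bash
# Recovery group overview: servers, declustered arrays, vdisk sets
mmvdisk recoverygroup list --recovery-group rg_1 --all
# Every physical drive should appear as a healthy pdisk
mmvdisk pdisk list --recovery-group rg_1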

4.4 Define VDisk Sets

Using 4+2P (4 Data + 2 Parity) Erasure Coding strategy.

bash
# 1. Metadata Tier (3-way replication)
mmvdisk vdiskset define --vdisk-set vs-meta --recovery-group rg_1 \
    --code 3WayReplication --block-size 1m \
    --nsd-usage metadataOnly --storage-pool system --set-size 2%

# 2. Data Tier (4+2P)
mmvdisk vdiskset define --vdisk-set vs-data --recovery-group rg_1 \
    --code 4+2p --block-size 8m \
    --nsd-usage dataOnly --storage-pool data-pool --set-size 90%

# 3. Create
mmvdisk vdiskset create --vdisk-set vs-meta
mmvdisk vdiskset create --vdisk-set vs-data
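
The defined vdisk sets can be inspected before building the file system; the listing shows which declustered array each set lands in and roughly how much capacity it will consume:

bash
mmvdisk vdiskset list --vdisk-set all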

4.5 Create File System

bash
mmvdisk filesystem create --file-system gpfs01 --vdisk-set vs-meta
mmvdisk filesystem add --file-system gpfs01 --vdisk-set vs-data
mmchfs gpfs01 -T /gpfs01
mmmount gpfs01 -a
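
A quick check confirms the file system attributes and that it is mounted everywhere:

bash
# File system attributes (block size, replication, pools)
mmlsfs gpfs01
# Mount status across the cluster
mmlsmount gpfs01 -L
df -h /gpfs01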

5. Client Mounting & Testing

5.1 Client Installation

Clients do not need ECE licenses, only Client licenses.

bash
# Install packages
dpkg -i gpfs.base*.deb gpfs.gpl*.deb gpfs.msg.en*.deb ...
# Build kernel module
mmbuildgpl

5.2 Join Cluster

Execute on Manager Node:

bash
mmaddnode -N client01
mmchlicense client --accept -N client01
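
The new client still needs its daemon started and the file system mounted; this can also be driven from the manager node:

bash
# Start GPFS on the client and verify it becomes active
mmstartup -N client01
mmgetstate -N client01
# Mount the file system on the client
mmmount gpfs01 -N client01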

5.3 Performance Test (IOR)

bash
# Sequential write: 16 processes, each writing its own 16 GiB file (-F) with 4 MiB transfers
mpirun -np 16 --hostfile hosts ior -w -t 4m -b 16g -F -o /gpfs01/test
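
A corresponding read pass over the same files gives the sequential-read figure; -C is a standard IOR option that shifts tasks between the write and read phases to defeat client-side caching:

bash
# Sequential read of the files written above
mpirun -np 16 --hostfile hosts ior -r -C -t 4m -b 16g -F -o /gpfs01/test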

6. Maintenance Commands

  • Check Health: mmgetstate -a
  • Check Physical Disks: mmvdisk pdisk list --recovery-group rg_1
  • Check NSD Distribution: mmlsnsd -M
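
The most common repair action is replacing a failed physical disk. Below is a sketch of the usual mmvdisk flow, assuming a hypothetical pdisk name (n001p023) reported as failed by the pdisk listing above:

bash
# Prepare the failed pdisk for removal
mmvdisk pdisk replace --prepare --recovery-group rg_1 --pdisk n001p023
# ...physically swap the drive, then complete the replacement...
mmvdisk pdisk replace --recovery-group rg_1 --pdisk n001p023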

AI-HPC Organization