Ember Cluster Hardware Overview

  • Capability Cluster for Large Parallel Jobs
  • 262 Dual-Socket, Six-Core Nodes (3144 total cores)
  • 2.8 GHz Intel Xeon (Westmere X5660) processors
  • 24 Gbytes memory per node (2 Gbytes per processor core)
  • Mellanox QDR Infiniband interconnect
  • Gigabit Ethernet interconnect for management

NFS Home Directory

Your home directory, an NFS-mounted file system, is one choice for I/O. In terms of speed, this space typically has the worst performance of the available spaces. It is visible to all nodes on the clusters through an auto-mounting system.

NFS Scratch (/scratch/serial)

An NFS scratch file system, /scratch/serial, is available for use on Ember. This space is visible to all clusters and is accessed at that path.

Parallel file system (/scratch/ibrix/chpc_gen)

The parallel file system has 60 TB of disk capacity. It is attached to the Infiniband network to provide greater potential network bandwidth and is served by two load-sharing, redundant servers. Like /scratch/serial, this space is visible on all of Ember's interactive and compute nodes. It is still a shared resource, however, and may perform more slowly under significant user load; users should test their application's performance in this space to check for unexpected slowdowns. There are also two restricted scratch spaces, /scratch/ibrix/icse_cap (120 TB) and /scratch/ibrix/icse_perf (46 TB); their use is restricted to icse users.
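If you want a rough check of the throughput you are seeing from the parallel file system, a simple timed write and read with dd is usually enough to spot a problem. The sketch below is only an example; the per-user directory path is an assumption and should be replaced with a directory you actually own in this space.

    # Rough throughput check in the parallel scratch space (example per-user path).
    mkdir -p /scratch/ibrix/chpc_gen/$USER && cd /scratch/ibrix/chpc_gen/$USER

    # Write a 4 GB test file; dd reports the write rate when it completes.
    dd if=/dev/zero of=ddtest.dat bs=1M count=4096 conv=fsync

    # Read it back for a read-rate estimate (the page cache may inflate this number).
    dd if=ddtest.dat of=/dev/null bs=1M

    # Remove the test file so it does not clutter the shared space.
    rm ddtest.dat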

Local Disk (/scratch/local)

The local scratch space is storage unique to each individual node. It is cleaned aggressively and is not supported by the CHPC. It can be accessed on each node at /scratch/local. This space is one of the fastest, but certainly not the largest (430 GB). Users must remove all of their files from /scratch/local at the end of their calculation.

It is important to keep in mind that ALL users must remove excess files on their own, preferably within the batch job once the computation has finished. Leaving files in any scratch space impedes other users who are trying to run their own jobs. Simply delete all extra files from any space other than your home directory as soon as they are no longer needed.
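A minimal sketch of this pattern on Ember is shown below, using node-local scratch for the run and copying results back to the home directory before cleaning up. Scheduler directives are omitted, and the directory, input, and program names are placeholders.

    #!/bin/bash
    # Stage the run in node-local scratch, copy results home, then clean up.
    WORKDIR=/scratch/local/$USER/myjob        # placeholder job directory
    mkdir -p $WORKDIR
    cp $HOME/myjob/input.dat $WORKDIR/        # stage input from home
    cd $WORKDIR
    ./my_program input.dat > output.dat       # placeholder program
    cp output.dat $HOME/myjob/                # save results to home before the job ends
    cd $HOME
    rm -rf $WORKDIR                           # leave /scratch/local clean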

Updraft Cluster Hardware Overview

  • Capability Cluster for Large Parallel Jobs
  • 256 Dual-Socket, Quad-Core Nodes (2048 total cores)
  • 2.8 GHz Intel Xeon (Harpertown) processors
  • 16 Gbytes memory per node (2 Gbytes per processor core)
  • Qlogic Infiniband DDR (InfiniPath QLE 7240) interconnect
  • Gigabit Ethernet interconnect

NFS Home Directory

Your home directory, an NFS-mounted file system, is one choice for I/O. In terms of speed, this space typically has the worst performance of the available spaces. It is visible to all of the nodes of the clusters through an auto-mounting system.

NFS Scratch (/scratch/general, /scratch/uintah)

There are two NFS scratch file systems for use on Updraft, depending upon your group. If you are not sure which space to use, it is probably /scratch/general; the /scratch/uintah space is reserved for users of the uintah allocation and nodes. Both spaces are visible to all clusters and can be accessed at the paths /scratch/general and /scratch/uintah. Each user is responsible for creating directories and cleaning up after their jobs. These file systems are not backed up.
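For example, a personal directory in the general scratch space can be created once with a single command; the layout shown is only a suggested convention, not a requirement.

    # Create a per-user directory in the general NFS scratch space.
    mkdir -p /scratch/general/$USER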

Local Disk (/scratch/local)

Local scratch is storage space unique to each individual node. It is cleaned aggressively and is not supported by CHPC. It can be accessed on each node at /scratch/local. This space is the fastest, but not necessarily the largest (200 GB). Users should use this space at their own risk.

When running jobs, it is important to plan the flow of data from one storage system to another. For example, a job that is not too large and does not need much time on the node can run in /scratch/general, with its output copied back to the user's home directory at the end of the batch job.

It is also important to keep in mind that ALL users must remove excess files on their own, preferably within the batch job once the computation has finished. Leaving files in any /scratch/ space impedes other users who are trying to run their own jobs. Simply delete all extra files from any space other than your home directory as soon as they are no longer needed.
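Putting the two points above together, a batch job on Updraft might run in the general scratch space, copy its output home, and remove its scratch directory before exiting. The sketch below assumes the per-user directory shown earlier; the file and program names are placeholders and scheduler directives are omitted.

    #!/bin/bash
    # Run in NFS scratch, return results to the home directory, remove leftovers.
    JOBDIR=/scratch/general/$USER/run_$$      # $$ (shell PID) gives a unique-ish suffix
    mkdir -p $JOBDIR
    cp $HOME/project/input.dat $JOBDIR/       # stage input
    cd $JOBDIR
    ./my_program input.dat > output.dat       # placeholder program
    cp output.dat $HOME/project/              # save results to home
    cd $HOME
    rm -rf $JOBDIR                            # clean up the scratch space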

Sanddunearch Cluster Hardware Overview (156 nodes, 312 procs, 624 cores)

  • "parallel cluster" for highly parallel parallel jobs requiring high speed interconnect.
  • 2.4 GHz dual-core Opteron processors
  • 8 Gbytes memory per node (2 Gbytes per processor core)
  • Both Infiniband and Gigabit Ethernet interconnects
  • 156 nodes available to the general pool
  • Some restricted nodes, listed in the table below:

Sanddunearch restricted nodes

  Research Group   Proc Speed   Node Memory   Nodes/Procs   Fast Interconnect
  Molinero         2.6 GHz      16 Gbytes     24/192        Infiniband
  Schuster         1.8 GHz      2 Gbytes      12/48         Infiniband
  tj               2.66 GHz     16 Gbytes     12/96         Infiniband
  zpn              2.67 GHz     24 Gbytes     12/96         Infiniband
  liu              2.67 GHz     24 Gbytes     18/144        Infiniband
  Strong           2.67 GHz     24 Gbytes     9/72          Infiniband
  Simons           2.66 GHz     16 Gbytes     9/72          Infiniband

Telluride Cluster Overview (restricted)

  • For dedicated use by one research group
  • 72 nodes, 576 processors
  • 2.333 GHz (48 nodes) and 2.66 GHz (24 nodes) processors
  • 16 Gbytes memory per node
  • Infiniband interconnect
  • Has its own scratch spaces: /scratch/tr and /scratch/tr1