The Center for High-Performance Computing (CHPC) at the University of Utah
Hello, and welcome to this brief introduction to the Center for High-Performance Computing, or “CHPC,” at the University of Utah.
CHPC’s mission is to support the diverse research computing needs of the university.
To this end, we offer many resources, including computational servers (running both Linux and Windows) in both general and protected environments;
virtual machines; data storage and movement; and advanced, high-speed networking.
You can make use of these resources if you have large tasks that require multiple servers or a single, powerful computer,
or if you have large data sets to store or specific applications to install and run. Additionally, you can use the Protected Environment if you have IRB-governed Protected Health Information (or PHI). Further, if you have research computing needs
not met by your local resources (in your department, for example), CHPC can likely help meet them.
If you’re new to CHPC, you may find the documentation, presentations, and training videos particularly helpful.
The CHPC documentation is home to many pages describing the use of all the resources mentioned previously, along with tips and tricks for getting started and usage examples.
Each semester, there is a presentation series covering many topics of interest (including software modules and the Slurm scheduler)
and featuring hands-on introductions to such things as the Linux command line and the Python scripting language.
Training videos supplement the documentation and presentations with walkthroughs for common tasks, such as transferring data and writing Slurm scripts. To keep this video brief, I’ll refer to these external resources where appropriate.
I’d like to give an overview of some of the concepts, tools, and software packages that will be most useful to you when working with HPC resources.
Most of CHPC’s computational servers and other resources are housed in the Downtown Data Center, or DDC.
The DDC is operational 24 hours a day, 7 days a week, 365 days a year, and is also home to other computational resources used by the university.
Periodic tours of the facility are offered, so keep an eye out for emails and updates on the CHPC webpage.
The resources in the DDC are varied. Principally, there are several “clusters,” which are a collection of servers with similar hardware.
The servers themselves are large computers, typically featuring multiple CPUs with several cores each and a substantial amount of memory;
some may also have GPUs or other hardware suited to certain computational work.
There are also nodes with purposes other than computational research work, such as resource managers that allocate tasks to individual servers within the cluster.
Network switches move data between servers and to external networks (including the internet).
Storage systems host research data and backups in home and group directories and in archive storage space. These individual resources work together to support the computational work of researchers.
CHPC staff provide a wide variety of services, from hardware diagnostics and maintenance to networking support to user services and software management.
How can you interact with CHPC resources? Most users will be using the clusters for research work.
These can be reached with software tools like SSH and FastX. Once you have an account, you can access login nodes directly with such tools, then prepare larger tasks—called “jobs”—for submission to the scheduler, Slurm.
When the resources you request become available, your job is started. Several videos cover the process of connecting to CHPC Linux resources and submitting jobs (including interactive jobs).
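As a sketch, that workflow might look like the following; the hostname, username, and script name here are placeholders, not actual CHPC addresses, so check the documentation for the real login node names:

```shell
# Connect to a cluster login node over SSH (hostname and username are
# placeholders -- see the CHPC documentation for actual addresses).
ssh u0123456@cluster.chpc.utah.edu

# From the login node, submit a prepared batch script to Slurm...
sbatch myjob.slurm

# ...or request an interactive job (core count and time are examples).
salloc --ntasks=1 --time=1:00:00

# Check the status of your queued and running jobs.
squeue -u $USER
```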
Graphical work can be done through interactive jobs with FastX or X forwarding.
It’s also possible to use the Frisco nodes for brief tasks requiring graphical software; their usage policies are generally a bit more forgiving than those on typical login nodes.
Connecting to Windows resources is done similarly: the documentation and remote desktop video are good starting points.
The policies are slightly different from those for Linux resources, so please make sure your workload is acceptable.
The time available on computational resources is limited, and as a result there are a few options for submitting your work. First, general nodes can be used by research groups with an allocation of time.
When this time has been used, the same nodes can be used in “freecycle” mode.
However, jobs submitted this way can be preempted by groups with allocation and are not guaranteed to run to completion. Beyond the general nodes,
there are nodes owned by research groups, which can be used by the owners at any time.
Users who aren’t in the research group that owns the nodes can still submit jobs as guests, but as with “freecycle” jobs, they’ll be eligible for preemption.
Information about purchasing nodes can be found in the CHPC policy descriptions. Allocations on HPC clusters are updated quarterly. There are three types of allocation at CHPC. The first is the quick allocation.
This is for new PIs who have not had an allocation before and provides up to 20,000 core hours.
The second type is the small allocation, which also provides up to 20,000 core hours. This is intended for PIs who have received an allocation before.
Quick and small allocations are reviewed internally by CHPC staff and not by the allocation committee.
By contrast, regular allocations for anything more than 20,000 core hours are reviewed by the allocation committee.
More information about allocations and links to relevant forms can be found on the CHPC documentation.
The Protected Environment, or PE, was refreshed in 2017 with an NIH Shared Instrumentation grant.
Much like the general environment, it has a Linux cluster, Windows servers, and virtual machines, among other resources.
Access is restricted to researchers whose work is sensitive in nature, and there are specific requirements for new projects.
If you feel your project is best suited to the PE, please review the documentation for more information and a procedure for getting started.
The data you generate or require when working on CHPC resources can be stored in home directories or group filesystems.
Space beyond the default home directory quotas must be purchased by research groups; the current storage cost is provided on the CHPC documentation.
Some spaces are also backed up; please refer to the documentation for the most recent information.
Archive storage is available in both the general and protected environments (and recent prices are, again, provided on the storage documentation pages). Scratch storage is an alternative if you don’t need to save your data long-term.
There’s a lot of scratch space that can be used by anyone. However, the files will be scrubbed after some period of inactivity, so if you don’t access them, they’ll eventually be deleted. There’s also local scratch space on the nodes themselves,
which can be used for quick file storage but is also not persistent. Users are also encouraged to take advantage of the university’s free storage space on Box and Google Drive. However, for all storage media,
please be sure you are permitted to store your data; on some platforms, PHI and Personally Identifiable Information (or PII) are not permitted. Further information is available from the Information Security Office.
It is generally displayed on each service’s homepage, such as the university’s Google Drive or Box offerings. There are several options for moving data to and from CHPC.
First, you can map home directories and group spaces directly to your computer and interact with them as you would any other files.
This requires a connection on the campus network, however, and the campus VPN will be required if you’re connecting from off-campus.
Second, you can use command-line tools like scp, rsync, and curl to move small data sets. This can be done on login nodes, among other resources.
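For example, small transfers from your local machine might look like the following; the hostname, username, paths, and URL are placeholders:

```shell
# Copy a single file to a CHPC home directory (hostname is a placeholder).
scp results.csv u0123456@cluster.chpc.utah.edu:~/project/

# Mirror a local directory; -a preserves permissions and timestamps,
# -v prints each file as it is transferred.
rsync -av ./data/ u0123456@cluster.chpc.utah.edu:~/project/data/

# Download a public data set directly onto a login node with curl
# (-L follows redirects, -O keeps the remote filename).
curl -L -O https://example.org/dataset.tar.gz
```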
Third, you can use the Data Transfer Nodes (or DTNs) with parallel transfer tools for large data sets.
Each of these methods is described in further detail on the “Data Transfers” page of the CHPC documentation. Several methods also have associated training videos. If you need to exchange data with someone without a CHPC account,
you can use the Guest Transfer Service (provided the data are not restricted). Guest accounts for transferring data can be made by CHPC users; the process is described in the “Data Transfers” documentation.
Software on CHPC Linux systems is usually provided as a “module,” which allows it to be distributed and loaded easily without interfering with other programs. Many software packages are already installed;
they can be viewed on the Software Database or with the “module” command from a Linux resource. The user services team can assist you with installations of software that is not currently available.
Documentation and video tutorials describing the creation and use of modules are available.
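A typical module session might look like this sketch; the package name and version are examples, so list what is actually available first:

```shell
# List the modules available on the system.
module avail

# Load a package (name and version are examples -- check "module avail").
module load gcc/8.5.0

# Show which modules are currently loaded.
module list

# Unload everything when finished, returning to a clean environment.
module purge
```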
The system used to submit jobs to computational resources on Linux clusters is the Slurm scheduler, which can be interacted with from login nodes.
You can request specific resources and specify such things as the number of cores and amount of memory required for a job.
Jobs can be submitted interactively, which allows you to work as you would on any other Linux server, or with batch scripts,
which automate running computational work. Both methods are described on the Slurm documentation pages on the CHPC website and shown in video tutorials.
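As a minimal sketch, a batch script is a shell script with resource requests in #SBATCH comment lines; the partition and account names below are placeholders you would replace with your own:

```shell
#!/bin/bash
#SBATCH --job-name=example        # name shown in the queue
#SBATCH --partition=my-partition  # placeholder; use your cluster's partition
#SBATCH --account=my-group        # placeholder; use your group's account
#SBATCH --nodes=1
#SBATCH --ntasks=4                # number of cores requested
#SBATCH --mem=8G                  # memory requested
#SBATCH --time=02:00:00           # wall-clock limit (HH:MM:SS)

# Load required software modules, then run the computation
# (program name is a placeholder).
module load gcc
./my_program input.dat
```

Saved as, say, myjob.slurm, the script would be submitted from a login node with "sbatch myjob.slurm".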
In some cases, you can also use the Open OnDemand tools to submit jobs to Slurm without using the Linux interface or learning how to use Linux.