SURF's Digital Services for Research and Development
What is SURF?
SURF is the ICT cooperative for Dutch education and research institutions. As a collaborative organization, SURF’s members—its owners—work together to deliver top-tier digital services, address complex innovation challenges, and exchange valuable knowledge.
Computing and storage infrastructure are essential for cutting-edge research. SURF supports researchers with a diverse range of computing and storage services. But before diving into these services, let’s briefly explore what a cluster computer is.
What is a cluster computer?
A cluster computer is essentially a group of interconnected computers, called nodes, working together as a unified system. Each node has its own CPU, memory, and disk space, along with access to a shared file system. Imagine these nodes connected by network cables, like those in your home or office.
Cluster computers are designed for high-performance workloads, allowing users to run hundreds of computational tasks simultaneously.
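To make the idea of "hundreds of tasks at once" concrete, here is a minimal sketch of a toy capacity model. The node names, core counts, and memory sizes are made-up values for illustration and do not describe any actual SURF system. A real cluster uses a batch scheduler to place tasks on nodes; this sketch only counts how many tasks would fit at the same time.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Node:
    """One computer in the cluster (all values used below are hypothetical)."""
    name: str
    cores: int
    memory_gb: int


def task_slots(nodes, cores_per_task, memory_per_task_gb):
    """Count how many tasks of the given size fit on the cluster at once."""
    total = 0
    for node in nodes:
        by_cores = node.cores // cores_per_task
        by_memory = node.memory_gb // memory_per_task_gb
        # A task needs both cores and memory, so the scarcer resource limits the node.
        total += min(by_cores, by_memory)
    return total


def waves_needed(n_tasks, slots):
    """Number of rounds needed if every slot runs one task per round."""
    if slots <= 0:
        raise ValueError("the cluster cannot fit a single task of this size")
    return ceil(n_tasks / slots)


# Hypothetical cluster: 4 nodes, each with 64 cores and 256 GB of memory.
cluster = [Node(f"node{i:02d}", cores=64, memory_gb=256) for i in range(4)]

slots = task_slots(cluster, cores_per_task=1, memory_per_task_gb=4)
print(slots)                      # 256 single-core tasks can run simultaneously
print(waves_needed(1000, slots))  # 4 rounds to finish 1000 such tasks
```

Larger tasks reduce the number of simultaneous slots: with 8 cores and 64 GB per task, memory becomes the limit and the same hypothetical cluster fits 16 tasks at once.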
Services provided by SURF
Some of the computing and storage solutions provided by SURF are:
1) Spider Cluster - High-performance Data Processing (DP) platform:
Spider is a versatile DP platform aimed at processing large structured data sets. It is a compute cluster built on top of SURF’s in-house elastic cloud. This allows scalable processing of many terabytes or even petabytes of data, using many hundreds of cores simultaneously to keep processing times short. High network throughput provides fast connectivity to external data storage systems. Spider is used for large-scale, multi-year, data-intensive projects in which users actively process their data, such as large static data sets or continuously growing data sets. Examples include genomics data, astronomical telescope data, physics detector data and satellite earth observations.
2) Snellius Cluster - the Dutch National supercomputer:
Snellius is the Dutch National supercomputer hosted at SURF. The system facilitates scientific research carried out at many universities, independent research institutes, governmental organizations, and private companies in the Netherlands. Snellius is a cluster of heterogeneous nodes built by Lenovo, containing predominantly AMD technology, with capabilities for high-performance computing (parallel, symmetric multiprocessing). The system also has several system-specific storage resources that are geared towards supporting the various types of computing.
3) SURF Research Cloud (SRC):
SURF Research Cloud is a service that facilitates collaborative work among scientists. The central idea in SRC is the collaborative workspace. A workspace translates directly to a virtual machine. These hosted workspaces can be used for research and development, either individually or together with your team or project members.
4) Research Data Storage Services:
4.1) Data Archive : The SURF Data Archive allows users to safely archive up to petabytes of valuable research data to ensure the long-term accessibility and reproducibility of their work. The Data Archive is also connected to SURF’s compute infrastructure via a fast network connection, allowing data to be deposited and retrieved seamlessly.
4.2) Data Repository : The Data Repository service is a web-based data publication and archiving platform that allows researchers to store, annotate and publish research data to ensure long-term preservation and availability of their datasets. All published datasets get their own DOI and Handle, while every file gets its own independent Handle to allow persistent reference on all levels.
4.3) dCache : dCache is a scalable storage system. It contains more than 50 petabytes of scientific data, accessible through several authentication methods and protocols. It consists of magnetic tape storage and hard disk storage, both of which are addressed through a common file system.
4.4) Object Store : Object storage is ideal for storing unstructured data that can grow without bound. Unlike a normal file system, object storage has no tree-like structure of files and directories. Instead, it organises its data in so-called containers, and each container holds objects. The SURF Object Store service is based on Ceph RGW and provides access using the S3 protocol, which is the de facto standard for addressing object storage.
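The difference between object storage and a file system can be sketched with a small in-memory model. This is only an illustration of the container/object idea: the class and method names below are hypothetical and are not the S3 API or any SURF client library. It shows that an object key such as "run1/sample_a.csv" is a single flat string, and that "folders" are emulated by filtering keys on a prefix.

```python
class ToyObjectStore:
    """Hypothetical in-memory model: each container maps object keys to bytes."""

    def __init__(self):
        self._containers = {}

    def create_container(self, name):
        # Containers are the only level of grouping; they do not nest.
        self._containers.setdefault(name, {})

    def put_object(self, container, key, data):
        # The key is one opaque string; a "/" in it has no special meaning.
        self._containers[container][key] = data

    def get_object(self, container, key):
        return self._containers[container][key]

    def list_objects(self, container, prefix=""):
        # Filtering on a key prefix is how a directory-like view is emulated.
        return sorted(k for k in self._containers[container] if k.startswith(prefix))


store = ToyObjectStore()
store.create_container("experiment-data")
store.put_object("experiment-data", "run1/sample_a.csv", b"1,2,3")
store.put_object("experiment-data", "run1/sample_b.csv", b"4,5,6")
store.put_object("experiment-data", "run2/sample_a.csv", b"7,8,9")

print(store.list_objects("experiment-data", prefix="run1/"))
# ['run1/sample_a.csv', 'run1/sample_b.csv']
```

Because there is no directory tree to maintain, a container can keep growing without the store having to restructure anything, which is why this layout suits large, unstructured data sets.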
How to Get Started with SURF Services?
The DSRI team is here to help you navigate SURF’s services, including:
1) Grant Applications:
We assist researchers in applying for SURF grants. For instance:
* Small applications: Up to 1 million System Billing Units (SBU) on Snellius and/or 100 TB of dCache storage (https://www.surf.nl/en/small-compute-applications-nwo).
* Large applications: Customized resource allocations based on project needs.
2) Resource Estimation:
Unsure about your computing and storage requirements? We help estimate your needs in terms of SURF’s billing units.
3) Use Case Analysis:
We assess whether your research project is a good fit for SURF’s services.
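As an illustration of the resource estimation step, the following sketch computes a rough SBU budget and compares it with the small-application limits quoted above (1 million SBU and 100 TB of dCache storage). The function names and the job figures are hypothetical, and the rate of 1 SBU per core-hour is a placeholder assumption: actual SBU rates depend on the node type, so check the current SURF documentation before using any number in a grant application.

```python
# Limits for small applications, as stated in the text above.
SMALL_APPLICATION_SBU_CAP = 1_000_000
SMALL_APPLICATION_DCACHE_TB_CAP = 100


def estimate_sbu(n_jobs, cores_per_job, hours_per_job, sbu_per_core_hour=1.0):
    """Rough budget: jobs x cores x wall-clock hours x billing rate.

    sbu_per_core_hour=1.0 is an assumed placeholder, not an official SURF rate.
    """
    return n_jobs * cores_per_job * hours_per_job * sbu_per_core_hour


def fits_small_application(sbu, storage_tb):
    """True if both the compute and the storage request stay within the caps."""
    return sbu <= SMALL_APPLICATION_SBU_CAP and storage_tb <= SMALL_APPLICATION_DCACHE_TB_CAP


# Hypothetical project: 500 jobs, each using 32 cores for 24 hours, plus 40 TB of data.
budget = estimate_sbu(n_jobs=500, cores_per_job=32, hours_per_job=24)
print(f"{budget:,.0f} SBU")                               # 384,000 SBU
print(fits_small_application(budget, storage_tb=40))     # True
```

It is usually worth adding a margin for failed runs and test jobs on top of such an estimate; if the total exceeds either cap, a large application is the route to consider.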
External Resources and References
- SURF: https://www.surf.nl/en
- Deep Learning Tutorials by UvA: https://uvadlc-notebooks.readthedocs.io/en/latest/index.html