Building an AlphaFold3 cluster with Slurm

It would be fascinating to analyze protein functions comprehensively once a genome has been determined de novo. While I was working on small-scale analyses, I was fortunate enough to join a collaborative project with Professor Kyoko Miura (Kyushu University) to analyze the structures of all proteins encoded in the genome. We are currently running the calculations with AlphaFold3 (AF3). We installed 12 PCs for AlphaFold calculations and have already completed calculations for all proteins using AlphaFold2 (AF2). To improve accuracy, we are now re-analyzing everything with AF3, which means repeating the calculations. At first we assigned jobs to individual PCs by hand and ran the calculations in a loop, but machines failed frequently, which caused significant delays. Since submitting jobs to each machine individually is inefficient, we built a cluster using the Slurm workload manager (originally the Simple Linux Utility for Resource Management) to enable flexible job distribution.
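As a rough illustration of what flexible job distribution can look like, here is a minimal sketch that writes a Slurm job-array batch script, with one array task per input file. It is not our actual setup: the wrapper `./run_af3.sh`, the job name, the log file pattern, and the input file names are all hypothetical placeholders. It also assumes the GPUs are registered with Slurm as a generic resource, so that `--gres=gpu:1` is accepted. The `--requeue` option asks Slurm to put a job back in the queue if its node fails.

```python
import shlex


def make_array_script(inputs, command="./run_af3.sh", max_parallel=12):
    """Return the text of a Slurm batch script with one array task per input.

    `command` is a hypothetical wrapper that takes one input file as its
    only argument; replace it with whatever launches the prediction.
    """
    if not inputs:
        raise ValueError("need at least one input file")
    lines = [
        "#!/bin/bash",
        "#SBATCH --job-name=af3",  # hypothetical job name
        # Array indices 0..N-1, with at most `max_parallel` tasks at once.
        f"#SBATCH --array=0-{len(inputs) - 1}%{max_parallel}",
        "#SBATCH --gres=gpu:1",  # assumes GPUs are configured as a GRES
        "#SBATCH --requeue",  # let Slurm requeue the task after a node failure
        "#SBATCH --output=af3_%A_%a.log",  # %A = array job ID, %a = task index
        "",
        "INPUTS=(",
        *[f"  {shlex.quote(str(path))}" for path in inputs],
        ")",
        # Each task picks its own input from the array index Slurm provides.
        f'{command} "${{INPUTS[$SLURM_ARRAY_TASK_ID]}}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical input files, one per protein.
    print(make_array_script(["p0001.json", "p0002.json", "p0003.json"]))
```

The generated text can be saved to a file and submitted with `sbatch`. Note that Slurm limits the size of a job array (the `MaxArraySize` setting), so a whole proteome would probably have to be submitted in several chunks.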

The PCs were bought in the first and second years, so the motherboards differ slightly, but the compute nodes have roughly the specifications below. The head node is my retired laptop, which is more than 10 years old.

CPU: AMD Ryzen 7 7700X
RAM: 128 GB DDR5
NVMe: 4 TB
HDD: 4 TB
GPU: Nvidia RTX 3060 / RTX 4070 Ti Super

When this cluster is operating at full capacity, it can predict the structures of approximately 600 proteins of average length (approximately 500 residues) per day. If a single biological species has approximately 40,000 proteins, including isoforms, the whole set can be completed in a little over two months.
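The estimate can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above (600 proteins per day, 40,000 proteins, 12 PCs). The per-node figure assumes all 12 machines contribute equally, which is a simplification because the GPUs differ.

```python
import math

PROTEINS_PER_DAY = 600   # cluster throughput at full capacity
TOTAL_PROTEINS = 40_000  # one species, including isoforms
NODES = 12               # assumes every PC contributes equally

# Whole days needed to process every protein.
days = math.ceil(TOTAL_PROTEINS / PROTEINS_PER_DAY)

# Average wall-clock minutes one node spends on one protein.
minutes_per_protein = 24 * 60 / (PROTEINS_PER_DAY / NODES)

print(f"{days} days, about {days / 30:.1f} months")
print(f"about {minutes_per_protein:.1f} minutes per protein per node")
```

This gives 67 days, or a little over two months, and roughly half an hour per protein on each node.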