Storage and Milan Partition Delivered

Published: 12 Aug 2024 by NEMO Team

The Weka storage and the Milan partition of the new NEMO2 cluster were delivered in recent weeks. Over the coming weeks, the companies involved will run tests and benchmarks and set up the final configuration of the systems. After that, we plan to start a first phase of NEMO2 with limited functionality, which we will expand over time.

Unfortunately, we had to shut down a large portion of the old NEMO cluster since the Milan partition needed to be installed into the same rack space. We will continue to operate the remaining approximately 300 nodes and the old BeeGFS until the official launch of NEMO2.

The first partition consists of:

  • 137 Milan nodes, each with:
    • 2x 64-core AMD EPYC 7763 processors at 2.45 GHz (128 cores per node)
    • 512 GiB DDR4 RAM
    • 100 Gbit/s Omni-Path interconnect
    • 100 Gbit/s Ethernet
    • 1.92 TB local NVMe disk
  • 1 Petabyte Weka Parallel Storage
    • Benchmark extrapolation: 80GB/s write, >200GB/s read (limited to 800Gbit/s Uplink)
    • Connected through 8x 100GbE Uplink
    • Each node connected via 100GbE
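
For a rough sense of scale, the per-node figures above can be added up across the partition. The following is a minimal sketch of that arithmetic only. The constants are copied from the list, while the names (`partition_totals` and the dictionary keys) are made up for illustration. The uplink conversion assumes 8 bits per byte and ignores protocol overhead.

```python
# Figures copied from the hardware list; the aggregation is illustrative arithmetic.
NODES = 137                 # Milan nodes
CORES_PER_NODE = 2 * 64     # two 64-core EPYC 7763 sockets
RAM_GIB_PER_NODE = 512      # DDR4 per node
NVME_TB_PER_NODE = 1.92     # local NVMe per node
UPLINKS = 8                 # storage uplinks
UPLINK_GBIT_S = 100         # speed of each uplink (100GbE)


def partition_totals():
    """Return aggregate figures for the Milan partition (hypothetical helper)."""
    uplink_gbit_s = UPLINKS * UPLINK_GBIT_S
    return {
        "cores": NODES * CORES_PER_NODE,
        "ram_tib": NODES * RAM_GIB_PER_NODE / 1024,   # GiB -> TiB
        "nvme_tb": NODES * NVME_TB_PER_NODE,
        "uplink_gbit_s": uplink_gbit_s,
        "uplink_gbyte_s": uplink_gbit_s / 8,          # bits -> bytes, no overhead
    }


if __name__ == "__main__":
    for name, value in partition_totals().items():
        print(f"{name}: {value:g}")
```

Under these assumptions the partition totals 17,536 cores and 68.5 TiB of RAM, and the 8x 100GbE uplink corresponds to 800 Gbit/s, or about 100 GB/s of raw line rate.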

In the next month, this partition will be extended by the Genoa and GPU/APU/AI partitions:

  • 96x Genoa nodes
  • 4x AMD APU nodes
  • 8x Nvidia L40S nodes (4x L40S per node)
  • Further AI nodes will follow in the first half of 2025