Sonja Philipp (Data Center Group): Daniel, thank you for being here. Who are you, what do you do, and what does Menzel IT do?
Daniel Menzel (Menzel IT GmbH): My name is Daniel Menzel, and I am the founder and CEO of Menzel IT GmbH. We are a Berlin-based company specializing in high-performance computing infrastructure, known as HPC, and private cloud computing clusters.
SP: Menzel IT and the Data Center Group have worked together before. How did the contact come about and what was the project about?
DM: We got to know each other through a mutual customer in the research sector. Menzel IT had been supporting the customer for some time, and Data Center Group then joined as data center planners. We made sure that both the infrastructure and the IT inside it are up and running.
SP: You just mentioned that your company specializes in HPC. What do you mean by that?
DM: There are various definitions of HPC. One is based on the assumption that you have powerful servers, and powerful means high performance. In the stricter sense, which is the definition we follow, HPC means simulation in research, for example for 5G and 6G, crash simulations, or fluid simulations. In short: modeling the real world in mathematics, for which computing power and capacity are required.
SP: What are typical applications and computational capabilities in HPC? What is calculated and what is the end product?
DM: At the end of the simulation, there is always a single line or a picture. And before that, there is always the translation of a real issue into mathematics. Once an engineer or scientist has understood the problem, they can express it as a computational problem, for example mobile phone optimization, crash tests, or: How do I get 2 tons of steel into a mold without it cooling? Blast simulations and fluid simulations are likewise real-world problems that can be translated into mathematics. HPC can also be used in climate research, which is becoming increasingly relevant today. Weather forecasting, for example, is computed by specialized HPC clusters that feed the collected input values into a large mathematical model, resulting in the final weather forecast. All these use cases boil down to similar mathematics and can then be computed in HPC.
SP: What components does HPC consist of and which are particularly important?
DM: An HPC cluster essentially consists of servers, which contain CPUs and GPUs. These servers interact as one large system and usually require centralized storage, plus a network that supplies them with data and carries the inter-process communication. This means that server 1 talks to server 2, server 3 talks to server 17, etc. We call the individual servers compute nodes, because each one is a single device within the cluster.
SP: You just mentioned clusters. So a cluster is a group of many of these nodes?
DM: Exactly. It ranges from small clusters with 5 to 7 nodes up to the big ones, where 500, 600, maybe even 1000 servers work together in a so-called "queue", i.e. as one entity, and compute together on the same mathematical or physical task.
SP: This setup sounds technically very complex. What are the special features or requirements for the IT infrastructures of a high-performance data center compared to a classic data center?
DM: HPC means: I need a lot of everything. In the infrastructure, that starts with the power. HPC means power densities of 40, 50, even 60 kW in a rack, all of which of course has to be removed again as heat by the cooling system.
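To put these rack figures in perspective, here is a rough sketch of how much cooling water it would take to carry the heat away. It assumes, beyond what is said in the interview, that all electrical power ends up as heat, that the coolant is water, and that the water warms by 10 K on its way through the rack. The function name and the temperature rise are illustrative choices, not figures from Menzel IT.

```python
# Rough estimate of the cooling-water flow needed to carry away rack heat.
# Assumptions (not from the interview): all electrical power becomes heat,
# the coolant is water, and it warms by 10 K while passing through the rack.

WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0  # approximate value for liquid water


def required_water_flow_kg_per_s(rack_power_kw: float,
                                 delta_t_kelvin: float = 10.0) -> float:
    """Mass flow of water that absorbs rack_power_kw at the given temperature rise."""
    if delta_t_kelvin <= 0:
        raise ValueError("delta_t_kelvin must be positive")
    heat_watts = rack_power_kw * 1000.0
    # Q = m_dot * c * delta_T, solved for the mass flow m_dot
    return heat_watts / (WATER_SPECIFIC_HEAT_J_PER_KG_K * delta_t_kelvin)


if __name__ == "__main__":
    for rack_kw in (40, 50, 60):  # power densities mentioned in the interview
        flow = required_water_flow_kg_per_s(rack_kw)
        # 1 kg of water is roughly 1 litre
        print(f"{rack_kw} kW rack: about {flow:.2f} kg/s ({flow * 60:.0f} L/min)")
```

Under these assumptions, a single rack needs on the order of one litre of water per second, which gives a feel for why cooling is a central part of HPC infrastructure planning.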
SP: HPC is accordingly very energy-intensive. Are there efforts or strategies to make it more efficient and sustainable? Are there also technologies in this direction?
DM: There are definitely technologies to make this more sustainable. On the infrastructure side, I'm thinking about how to feed the energy in efficiently and how to cool it away again. Water cooling is a very big topic in HPC because I can't get the energy out of the system any other way. But there are also efforts to write code more efficiently. It makes a huge difference, in terms of both speed and cost, whether I have to compute for 3, 5, or 10 days at a time. If I'm using 150, 200, or 500 kW for 2 to 5 days, I'm going to see that on my electricity bill at some point. There are an amazing number of possibilities, which are exploited far too seldom. This is also where Menzel IT GmbH comes in as an advisor.
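The effect of run time on the electricity bill can be sketched with simple arithmetic. The power and duration values below are taken from the ranges mentioned in the interview. The electricity price of 0.30 EUR per kWh is purely an assumed placeholder, and the function names are illustrative.

```python
# Back-of-the-envelope energy and cost estimate for one simulation run.
# The price per kWh is an assumed placeholder, not a figure from the interview.

ASSUMED_PRICE_EUR_PER_KWH = 0.30


def run_energy_kwh(power_kw: float, days: float) -> float:
    """Energy drawn by a cluster running at constant power for the given days."""
    return power_kw * 24.0 * days


def run_cost_eur(power_kw: float, days: float,
                 price_eur_per_kwh: float = ASSUMED_PRICE_EUR_PER_KWH) -> float:
    """Electricity cost of one run at a flat price per kWh."""
    return run_energy_kwh(power_kw, days) * price_eur_per_kwh


if __name__ == "__main__":
    power_kw = 200.0  # within the 150 to 500 kW range mentioned above
    for days in (3, 5, 10):
        energy = run_energy_kwh(power_kw, days)
        cost = run_cost_eur(power_kw, days)
        print(f"{days:>2} days at {power_kw:.0f} kW: "
              f"{energy:,.0f} kWh, about {cost:,.0f} EUR")
```

Because energy scales linearly with run time, code that finishes in 3 days instead of 10 cuts the energy for that run by 70 percent at the same power draw.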
SP: Let's assume I'm a research institute and I'm considering whether to work on-premises or move the data to the cloud. What are the advantages and disadvantages of the two models?
DM: The cloud always has the advantage of fast availability, of course. A major disadvantage is data security. In research, this is of great importance because of intellectual property and patents: configuration errors quickly put you at risk of competitors getting hold of important data. Not because of a mistake by the cloud provider, but because of mistakes by the admin. Institutions feel safer having an on-site data center set up behind a VPN. On the other hand, startups in particular don't have the capacity, either financially or logistically, to build a data center. But as a company grows, it's worth it. Looking at a period of 2 or 3 years, cloud infrastructure comes out at a factor of 2 or 3 in cost. The per-minute prices of the cloud providers mask this. In the case of HPC, however, the infrastructure also has to be operated. It is not enough to open the browser. The software has to be made to run, the queuing system has to work, my storage has to run. These things still require a great deal of expertise and, accordingly, people power. That's why the cost calculation clearly favors on-premises in the long term.
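The long-term comparison described here can be sketched as a simple cumulative cost model. Every number below is a hypothetical placeholder chosen only to illustrate the shape of the calculation. None of them comes from the interview or from a real price list, and the model ignores factors such as staffing, hardware refresh, and utilization.

```python
# Toy cumulative cost comparison between on-premises and cloud HPC.
# All figures are hypothetical placeholders for illustration only.
from typing import Optional


def on_prem_cost(months: int, capex_eur: float, monthly_opex_eur: float) -> float:
    """Up-front investment plus running costs (power, cooling, maintenance)."""
    return capex_eur + monthly_opex_eur * months


def cloud_cost(months: int, monthly_fee_eur: float) -> float:
    """Pay-as-you-go cost for the same capacity, with no up-front investment."""
    return monthly_fee_eur * months


def break_even_months(capex_eur: float, monthly_opex_eur: float,
                      monthly_fee_eur: float) -> Optional[float]:
    """Months after which on-premises becomes cheaper, or None if it never does."""
    monthly_saving = monthly_fee_eur - monthly_opex_eur
    if monthly_saving <= 0:
        return None
    return capex_eur / monthly_saving


if __name__ == "__main__":
    capex, opex, fee = 600_000.0, 10_000.0, 50_000.0  # hypothetical values
    print("break-even after months:", break_even_months(capex, opex, fee))
    for months in (12, 24, 36):
        ratio = cloud_cost(months, fee) / on_prem_cost(months, capex, opex)
        print(f"{months} months: cloud / on-prem cost ratio = {ratio:.2f}")
```

The point of the sketch is the structure rather than the values: a small-looking recurring fee accumulates linearly, while the on-premises investment is paid once, so the ratio shifts in favor of on-premises the longer the system is in use.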
SP: Your forecast: Where are things heading? What developments do you expect in the future?
DM: I believe that the amount of simulation will increase even further in the future. Crash test simulations are a good example. It's insanely expensive to keep building a new car with changed parameters. On the computer, on the other hand, all I have to do is turn a few knobs and then simply recalculate: Is this cell now compressed in a way that protects people better or worse? The future will be to simulate things and processes digitally first and to test them in the real world only in the very last step.
SP: Thank you very much for the interview and the explanations on the subject of HPC.
DM: You're very welcome. Thank you.
Picture: © peach_adobe / #436585107 / stock.adobe.com (Standard licence)