
HPC in the Cloud – bitter pill or soothing remedy?  


Owen Thomas, co-founder of Red Oak Consulting

UK universities and research institutions have built an enviable reputation over many years for helping to inform the development of countless industries, from genomics to nuclear, and AI to automotive engineering and design. It’s where so many bright ideas are turned into economic reality.

At the crux of much of this is high-performance computing (HPC), which can process petabytes of data at speeds far beyond the reach of conventional systems.

With a few exceptions, such as areas of national defence or critical national infrastructure (CNI), HPC can run in the Cloud. But the decision is not simply about where HPC should sit. There is a panoply of factors to consider, most of them contingent on local and operational circumstances: cost, both short term and long term; who manages and maintains the system; and the skills and time available to oversee HPC demands.

Furthermore, commercial considerations need to be factored in, given the investments required, and, as in every other industry, security is right up there as a priority. For instance, research is often conducted on behalf of governments, big pharma, finance and even high-profile and high-ticket sports such as Formula 1, where secrets are guarded with Crown Jewel care and attention. If computational findings that contribute to even the smallest of marginal gains on the racetrack are compromised, it’s game over. Data is precious.

None of this should preclude cloud-based HPC bringing considerable power and potential to these situations.

According to a research report published by Market Research Future (MRFR) in April 2023, the cloud HPC market was valued at $5.5 billion in 2022 and is predicted to grow at a compound annual growth rate (CAGR) of approximately 16.68% over the next seven years, reaching $16.19 billion by the end of the decade. That’s big business, and it’s where commercial gains and intellectual property meet.
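As a quick sanity check on those figures, the compounding can be reproduced in a few lines. This is a minimal sketch rather than the method used in the MRFR report: it assumes the 2022 valuation is the base and that growth compounds annually over seven periods, and the function names are illustrative.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Compound a starting value forward at a fixed annual growth rate."""
    return start_value * (1 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate implied by a start value, an end value and a period."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in billions of US dollars.
# Assumption: seven annual compounding periods from the 2022 base.
projected = project_value(5.5, 0.1668, 7)
rate = implied_cagr(5.5, 16.19, 7)

print(f"Projected market size: ${projected:.2f} billion")
print(f"Implied CAGR: {rate:.2%}")
```

Under those assumptions the projection lands close to the quoted $16.19 billion, so the three numbers are consistent with one another.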

Total cost of ownership

HPC is an expensive business, on or off the cloud.

Depending upon where your influence lies in an organisation, your objectives may be swayed by different factors. A CFO may focus on cost (especially Capex), while a CTO or CEO might see an investment, or think in terms of total cost of ownership (TCO). In our experience, this is the most common battleground, across academia, research and many other industries.

It’s true that with on-premises infrastructure, organisations have complete control of all operations and will own, or at least manage and maintain, both the servers and the datacentre environment.

However, as all CFOs will testify, this is a fixed and depreciating asset, which needs to be written down over a period of time, usually five years. But even that doesn’t tell the whole story, because capital outlay for hardware and infrastructure typically accounts for only a third of the overall cost of running an HPC environment. The remaining two thirds are eaten up by maintenance and running costs, which may fall within an operations budget. In most cases, a £5 million capital outlay is therefore likely to require around £15 million in total funding over the life of the system.
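The one-third rule of thumb above can be turned into a rough budgeting sketch. The function and variable names are hypothetical; the one-third capital share and the five-year write-down come from the text, while straight-line depreciation is an assumption made here for simplicity.

```python
def lifetime_budget(capex: float, capex_share: float = 1 / 3) -> float:
    """Total lifetime cost implied by a capital outlay and its share of TCO."""
    if not 0 < capex_share <= 1:
        raise ValueError("capex_share must be in the range (0, 1]")
    return capex / capex_share


WRITE_DOWN_YEARS = 5  # typical write-down period cited in the article

capex = 5_000_000                              # capital outlay in pounds
total = lifetime_budget(capex)                 # capital plus running costs
running_costs = total - capex                  # maintenance and operations
annual_write_down = capex / WRITE_DOWN_YEARS   # assumes straight-line depreciation

print(f"Total cost of ownership: £{total:,.0f}")
print(f"Running costs:           £{running_costs:,.0f}")
print(f"Annual write-down:       £{annual_write_down:,.0f}")
```

Changing `capex_share` shows how sensitive the total is to that ratio: a share of one half, for example, would halve the running costs implied by the same capital outlay.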

An additional challenge is that many organisations are tempted to run hardware for longer than they should. In practice, at the five-year point, it’s more economical to invest in a new system than to run the old one for another year.

Research and higher education organisations must be able to demonstrate ROI, which of course is only right. However, the timescale over which ROI is measured is as important to understand as the total cost of ownership (TCO).

Understanding how HPC generates value in an organisation is pivotal to determining the most appropriate strategy, a process which involves identifying the right metrics by which ‘value’ can be quantified. Often, a suitable metric follows from what the organisation values HPC for. At times that value might be perceived as a binary outcome – with HPC, operations are feasible, and without it, they’re not – but this perspective neglects the opportunity to assess a range of diverse metrics.

The focus on costs can sometimes eclipse the perceived value, especially when HPC introduces new operational strategies that enhance value in ways previously unattainable.

One of the key advantages of the Cloud is that, within the pay-as-you-go system, organisations only need to apportion budgets a year at a time, enabling them to invest the allocated operational HPC budget elsewhere, where a guaranteed return on investment can be realised.

Security matters

Security, particularly guarding against data breaches and fending off cyber-attacks, remains one of the most urgent concerns among IT administrators in any field or marketplace. But in academia and research it has particular potency when it comes to cloud computing.

In 2022, the average global cost of a data breach reached a record high of $4.35 million (£3.41 million), as reported in IBM’s Cost of a Data Breach Report 2022. This sum covers potential ransom payments, regulatory fines, and the impacts of data and intellectual property loss, alongside disruptions to custom and productivity, as the victim strives to restore normal operations over the following months. These security issues are not intractable with the right policies in place, but until there is a wider understanding of how to overcome them, they will remain a pressing concern for IT administrators in academic and other organisations when it comes to cloud computing.

In a recent survey concerning academic information infrastructure conducted by Japan’s Ministry of Education, Culture, Sports, Science and Technology (MEXT), it was observed that over 90% of universities have integrated cloud services into their information systems. However, the survey’s findings reveal a paradox: while enhanced security motivates 50% of the universities employing cloud services, an equal percentage reject these services on the grounds of security.

Such divergent opinions surely underscore the absence of universally accepted criteria for implementing cloud services. More worryingly, they show how a lack of consensus can feed researchers’ apprehension, particularly in scientific fields that rely heavily on sensitive data, such as genomic medical research.

If researchers in fields like genomics remain unaware of the potential to leverage the computational capacity and data accessibility provided by cloud services, the progress of research may well be hampered. With the appropriate support in place and the right experience of cloud-based HPC behind them, there is every opportunity to unlock the full power of the cloud securely.

Checklists for cloud migration and integration should be put in place. These should cover the audit and selection of the best data centre, including its measures for disaster prevention, failure handling and disaster response. When certain types of sensitive data are to be processed or stored, it may also be necessary to confirm the location (country or region) of the data centre itself.

In sum, yes, the complex interplay of vast computing resources and the intricate nature of HPC tasks can expose sensitive data to risks. But with the right policies and tools in place, data can be given the highest possible security and protection on the cloud.

Minding the skills gap

In many organisations that require HPC, and none more so than higher education, there exists the ‘workforce development/workforce management’ dilemma. In essence, organisations realise there is a skills and resource shortage, but continue to have reservations about outsourcing. There are many excellent business computing degree courses at UK universities but, to our knowledge, HPC degree options are limited outside Edinburgh University. Meanwhile, many good research staff in higher education move to the corporate and private sector, where there is more funding and reward.

The challenge, and it’s one we are working closely with customers on, is how to retain a level of in-house knowledge while ensuring the outsourced support is also there. Our mantra at Red Oak Consulting is that, with our managed service (ROMS) in place, we have the experience and skills to fill that skills shortage and knowledge gap, and to work alongside research bodies and higher education institutions to train, mentor and develop staff. We also recognise that needs will evolve over time, and that the nature of that support may evolve with them, because managing HPC is not the same as managing enterprise IT.

The future is bright

Crucially, universities and research departments need complete peace of mind in the transition to the cloud once they realise it is the best option for them. If the transition is appropriately outsourced within a robust managed service arrangement, they can draw on deep expertise to steer them seamlessly onto the cloud, or indeed onto a hybrid model.

The decision about which environment is best suited will always include cost and security considerations, but neither should be looked at in isolation. An investment of this magnitude and importance needs to be examined in its entirety, quantifying its value to the organisation as a whole within a TCO model. Get it right, and the cloud could be the best decision an organisation can make in the race to harness the power of data.

