Turing: Introduction to optimisation

Introduction

Optimisation is an essential step in the life of a computing code. Unfortunately, it is rarely carried through to completion and is often never attempted at all. The purpose of this document is to give you an idea of the benefits of this type of work and to explain how to proceed with it. There is no easy recipe in this domain, but numerous methods can be suggested.

What does optimisation mean?

Optimising a computing code for a supercomputer such as Turing means reducing the resources it needs. Several resources are involved, but the focus is generally on elapsed time. Memory consumption and disk space also come into consideration.

Benefits of optimisation

Optimising an application can provide a number of benefits (if your efforts are successful!):

  • Results obtained more rapidly by reducing execution time
  • More results obtained with your allocated hours
  • Larger calculations carried out
  • Competitive advantage gained over other teams
  • Better understanding of your code and the machine architecture including their interactions
  • Bugs discovered and corrected thanks to the re-reading of code sources

Other benefits:

  • Reduction in energy consumption due to decreased calculation time (when not calculating, the Turing compute nodes consume almost nothing)
  • Better use of the machine (the costs of purchasing, maintaining and using a supercomputer are not insignificant): each hour allocated to you represents an expenditure for the whole scientific community.
  • Freeing up resources for other research groups

When NOT to optimise?

Optimisation is not always the best strategy. Not all computing codes need to be optimised (although on massively parallel machines it is rare for a code to need no optimisation at all). Furthermore, the benefits can be too small relative to the effort needed to obtain them.

The following are reasons NOT to optimise:

  • Insufficient resources or means to do the optimisation (lack of personnel, time, skills). Note: IDRIS can help you with this.
  • Decrease in code portability (many optimisations are specific to the machine architecture)
  • Decrease in source readability and trickier maintenance
  • Risk of decreased performance on other architectures
  • Risk of introducing new bugs
  • The code is not intended for long-term use
  • The code is already fast enough (there is no reason to optimise a code which already gives results in a reasonable time)
  • The code has already been optimised

Some people would prefer to wait for the machines to evolve rather than optimise, thinking that a faster computer will give better performance without any effort on their part. However, this reasoning is not valid: the current tendency in hardware evolution is towards more and more massive parallelism, with cores which are only slightly faster. In addition, input/output performance is progressing less rapidly than computing power. Finally, keeping in mind that a supercomputer has a lifespan of several years, it would be necessary to be very patient!

Other considerations before optimising

An application should not be optimised until it is functioning correctly. Moreover, the optimisation work should not be started too early when writing the application (except for the choice of algorithms): there is a great risk of adding new bugs or of seriously reducing the readability and comprehensibility of the code. In addition, you could optimise procedures which will later be abandoned, totally rewritten, or used very little (if at all). There is also a danger of spending a lot of time optimising parts of the code which have very low resource consumption: spending 3 days to gain a factor of 2 in a subroutine representing 0.001% of execution time is not very productive.
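The arithmetic behind that last example follows Amdahl's law: the gain for the whole program is limited by the share of run time spent in the part being improved. The following is a minimal sketch in Python; the function name `overall_speedup` and the 60% comparison case are illustrative assumptions, not taken from a real profile.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when a part that takes
    `fraction` of the run time (0..1) is made `local_speedup` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


# Subroutine using 0.001% of the run time, made 2x faster
negligible = overall_speedup(0.00001, 2.0)
# Hypothetical hot spot using 60% of the run time, made 2x faster
hot_spot = overall_speedup(0.60, 2.0)

print(f"0.001% routine: {100.0 * (1.0 - 1.0 / negligible):.4f}% of run time saved")
print(f"60% routine:    {100.0 * (1.0 - 1.0 / hot_spot):.1f}% of run time saved")
```

The first case saves about 0.0005% of the total run time, whereas the same factor of 2 applied to a section using 60% of the run time would save about 30%.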

You should not begin optimising unless you think that your application is too slow or does not allow running your large calculations in a reasonable amount of time. If you are entirely satisfied with your application, the investment may not be necessary.

The amount of time needed should also be considered before beginning this process. Optimising a computing code is not an easy task and takes a lot of time (generally much more than estimated, especially if you are not experienced in this). If you have numerous other constraints, don't begin! It is better to obtain results, even if you need to wait a little, than to begin optimising a code and then not have enough time to do your calculations.

Advice on how to proceed

Profiling

Once you have decided to optimise your application, do not start by modifying your sources. Before beginning, it is absolutely necessary to identify the critical parts of your application which need improvement. Most of the time, the code sections which consume the most resources are not the ones we expect: profiling will allow you to verify (and often invalidate) your hypotheses about this.

Profiling an application consists of determining which parts use the most resources. Take great care to do this on a data set which is representative of what you wish to calculate. Try to reproduce the real conditions of a production run as closely as possible in order to obtain a realistic profile. There are profiling tools available on Turing; these are described in the following pages:

The optimisation

Once the profiling has been carried out and the zones with the highest consumption have been identified, the optimisation can begin. Clearly, one should start with the zones which have the highest resource consumption, as that is where the largest gains can potentially be found. It is not very useful to work on the parts which have low consumption, especially since your time is precious.
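The principle of ranking code sections by their cost can be illustrated with a few manual timers. This is only a sketch of the idea in Python, not a substitute for the profiling tools available on Turing; the helper names (`timed`, `ranked_regions`) and the two placeholder phases are assumptions made for the example.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named region
region_seconds = defaultdict(float)


@contextmanager
def timed(region):
    """Add the elapsed time of the enclosed block to `region`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        region_seconds[region] += time.perf_counter() - start


def ranked_regions():
    """Return (region, seconds, share of measured time), costliest first."""
    total = sum(region_seconds.values())
    ordered = sorted(region_seconds.items(), key=lambda item: item[1], reverse=True)
    return [(name, secs, secs / total if total else 0.0) for name, secs in ordered]


# Stand-in workload: these two phases are placeholders for real code sections
with timed("assemble"):
    values = [i * 0.5 for i in range(200_000)]
with timed("solve"):
    result = sum(v * v for v in values)

for name, secs, share in ranked_regions():
    print(f"{name:10s} {secs:8.4f} s  {share:6.1%}")
```

The region at the top of the printed list is the first candidate for optimisation; regions with a small share can usually be left alone.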

Optimisation can be done in three different areas; the following pages provide details:

It is very important to systematically verify that modifications made do not change the results obtained. It is already very easy to introduce bugs when you program, but it is even easier to do during the optimisation phases.
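One way to make this verification systematic is to compare the results of the modified code against reference results saved from the original version. The sketch below is in Python and assumes numerical results held in plain lists; the function names and the tolerance values are illustrative choices and should be adapted to the precision your application requires.

```python
import math


def results_match(reference, candidate, rel_tol=1e-12, abs_tol=0.0):
    """True if both result sequences have the same length and every pair
    of values agrees within the given tolerances."""
    if len(reference) != len(candidate):
        return False
    return all(
        math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, c in zip(reference, candidate)
    )


def sum_of_squares_reference(data):
    # Original, straightforward version
    total = 0.0
    for x in data:
        total += x * x
    return total


def sum_of_squares_optimised(data):
    # Rewritten version: math.fsum accumulates differently, so rounding may differ
    return math.fsum(x * x for x in data)


data = [0.1 * i for i in range(1000)]
reference = [sum_of_squares_reference(data)]
candidate = [sum_of_squares_optimised(data)]

# Floating-point results may differ in the last bits after a rewrite,
# so compare with a tolerance rather than with ==
assert results_match(reference, candidate, rel_tol=1e-9)
```

Running such a check after every modification makes it much easier to identify which change introduced a discrepancy.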

Be sure to verify that there has indeed been an improvement in performance: the modifications can have the opposite effect. Gains which are close to zero, or even performance losses, are observed very often. This is generally because the modification has made the compiler's own optimisation work more difficult, or because the expected effects of the modification were not correctly understood.
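A before/after measurement can be sketched as follows, here in Python with the standard `timeit` module. The two functions are placeholders for the original and modified versions of a code section, and the repeat counts are arbitrary assumptions. On Turing, the measurement should of course be made on the real application under production-like conditions.

```python
import timeit


def best_time(func, repeat=5, number=10):
    """Smallest average time per call (seconds) over `repeat` runs of `number` calls."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


def original():
    # Placeholder for the code section before modification
    total = 0.0
    for i in range(10_000):
        total += i * 0.5
    return total


def modified():
    # Placeholder for the same computation after modification
    return sum(i * 0.5 for i in range(10_000))


t_old = best_time(original)
t_new = best_time(modified)
print(f"original: {t_old:.6f} s  modified: {t_new:.6f} s  ratio: {t_old / t_new:.2f}")
```

Taking the minimum over several repetitions reduces the influence of system noise. A ratio close to 1 (or below it) indicates that the modification brought no real gain.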

When the gains are very slight, you need to ask yourself if it's worth keeping the new code, especially if the readability has suffered from it. You can always return to the initial solution and keep an application which is easier to maintain.

What does IDRIS provide?

In addition to the documentation on our web server, we have several other means available to help you:

  • Blue Gene/Q training courses (on the use of Turing) are offered at IDRIS. Although these are not focused on optimisation, the courses provide a detailed description of the architecture along with much valuable advice. A good understanding of how Turing works is indispensable for obtaining significant gains.
  • For prompt assistance when needed, the User Support Team can respond to your questions by e-mail or by telephone at +33 (0)1.69.35.85.55.
  • For more complex needs, an Advanced Support service is available.