--
- = Experimental
- = Theoretical
- = Computational
- ~ Observational
- ~ Mathematical
- ~ Numerical
TODO: It would be nice to have something deeper here about the computational paradigm.
2.2) In computational science the tasks to be solved are divided into steps. Arrange the steps in the right order:
- Define mathematical equations that describe the problem.
- Weigh different algorithms for solving the problem numerically.
- Code (e.g. with Python or similar) the implementation of the algorithm.
- Run the program developed for the problem.
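The four steps above can be sketched for a toy problem. This is only an illustration under assumptions that are not in the original material: the exponential-decay equation, the explicit Euler method, and the names `K`, `Y0` and `euler` are all chosen for the example.

```python
import math

# Step 1: mathematical model (assumed toy problem): dy/dt = -k * y, y(0) = y0.
K, Y0 = 0.5, 1.0

# Step 2: chosen algorithm: the explicit Euler method (simple, first-order accurate).
def euler(k, y0, t_end, steps):
    """Integrate dy/dt = -k * y from t = 0 to t_end with a fixed number of steps."""
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y += dt * (-k * y)
    return y

# Steps 3 and 4: the implementation above is run and compared with the exact solution.
approx = euler(K, Y0, t_end=2.0, steps=1000)
exact = Y0 * math.exp(-K * 2.0)
print(f"Euler: {approx:.5f}, exact: {exact:.5f}")
```

Increasing `steps` reduces the difference between the two values, at the cost of more computation.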
2.3) With high-performance computing, experiments that might be impossible to carry out for various reasons can sometimes be replaced with simulations.
TRUE
- = Climate and weather
- = Particle physics and cosmology
- = Winner of FIFA World Cup 2022
- ~ What to eat for breakfast
- ~ Amount of taxes to be paid
- = CSC
- ~ KAMK
- ~ LUMI
- ~ VTT
4.1) Generally, high-performance computers consist of the same components as personal computers, but the components must work together as a more complex whole.
TRUE
4.2) Programming for HPC environments is partly the same as in normal environments, but there are various aspects to take into consideration (such as massively parallel computing and optimized libraries) to reach full performance potential.
TRUE
--
--
Exam (5–10 questions), 80% correct to pass, multiple attempts allowed
What is a supercomputer?
- = Electronic Numerical Integrator and Computer (ENIAC)
- ~ Large Unified Modern Infrastructure (LUMI)
- ~ Numerical Object Key Integrator Apparatus (NOKIA)
- ~ Calculative Super Computer (CSC)
1.2) Parallel computing is a relatively new concept, which was introduced to supercomputing at the beginning of the 2000s.
FALSE
1.3) Until recently, processors have followed Moore's law, according to which the density of transistors doubles roughly every two years, and performance presumably doubles as well. The biggest constraint stalling this development nowadays is
- = Power consumption and heat production, and physical and manufacturing limitations due to the size of a transistor approaching quantum effects.
- ~ Government regulation of development work
- ~ Duopoly of Intel and AMD on processor markets
- ~ Saturation of the processor market
TODO: Line width (feature size)? The question has a good idea, but the logic needs to be rethought.
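As a rough illustration of the doubling described in question 1.3, the growth factor after a given number of years can be computed as below. The two-year doubling period comes from the question text; the helper name `moore_factor` is hypothetical, and the result describes the idealised trend only, not any measured hardware.

```python
def moore_factor(years, doubling_period=2.0):
    """Idealised growth factor after `years`, with one doubling per `doubling_period` years."""
    return 2.0 ** (years / doubling_period)

for years in (2, 10, 30):
    print(f"{years:>2} years -> x{moore_factor(years):,.0f}")
```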
1.4) Computing power increases over time as technology advances. How many times faster is a normal modern laptop compared to a supercomputer from thirty years ago?
- = 1000
- ~ 10
- ~ they are approximately equal
- ~ 1/10th
TODO: Computing power vs. energy consumption? Growth of computing power?
2.1) The CPU, or central processing unit, is called the [1] of a computer. Modern CPUs consist of multiple [2] which can operate independently. Sometimes the terms CPU and [3] are used interchangeably.
- [1] brain
- [2] cores
- [3] processor
TRUE
2.3) The backbone of a supercomputer that connects multiple nodes and e.g. network storage together is called
- = interconnect
- ~ LAN-party
- ~ WiFi
- ~ parallel network
- = Floating Point Operations per second
- ~ Right side sandal of Flip-flops
- ~ A failing artist or an act
- ~ Fluorescent Lighting Of PC
- = Computing power measurement is standardised to FLOP/s and the performance of different supercomputers around the world is ranked and reported on different websites.
- = Theoretical performance of a CPU is higher than the performance in reality.
- = The new LUMI supercomputer is expected to reach a theoretical performance of 0.55 EFLOP/s and rank in the top 10 of all supercomputers in the world.
- ~ A common benchmark tool to measure supercomputer performance is called TOP500.
- ~ The new LUMI supercomputer is estimated to be a maximum of ten times faster than the older Mahti supercomputer.
4.1) IMDB has approximately 7.5 million movie titles in its online database. One movie streamed over the internet is 3.5 GB on average. How many Mahtis (capacity of 8900 TB each) would a service provider need to store all IMDB titles to be streamed over the internet?
- = 3
- ~ 1/8th
- ~ 10
- ~ 27
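The arithmetic behind question 4.1 can be checked with a few lines. The figures are the ones given in the question; the variable names and the use of decimal prefixes (1 TB = 1000 GB) are assumptions made for this sketch.

```python
import math

TITLES = 7.5e6             # movie titles in the database (from the question)
GB_PER_TITLE = 3.5         # average size of one streamed movie, in GB
MAHTI_CAPACITY_TB = 8900   # storage capacity of one Mahti, in TB

# Assumes decimal prefixes: 1 TB = 1000 GB.
total_tb = TITLES * GB_PER_TITLE / 1000

# A fraction of a system cannot be bought, so round up to whole Mahtis.
mahtis_needed = math.ceil(total_tb / MAHTI_CAPACITY_TB)

print(f"Total: {total_tb:.0f} TB, Mahtis needed: {mahtis_needed}")
```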
- = HDD
- = SSD
- = Magnetic tapes
- ~ Blu-ray discs
- ~ USB flash drives
- = 1 B -> Byte
- = 10^15 B -> Petabyte
- = 10^6 B -> Megabyte
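The prefixes in the list above are plain powers of ten, so conversions between them reduce to shifting an exponent. The following is a minimal sketch; the table `SI_EXPONENTS` and the function `convert` are hypothetical helpers, and binary prefixes (KiB, MiB, ...) are deliberately left out.

```python
# Decimal (SI) prefixes as powers of ten; the same table works for bytes and FLOP/s.
SI_EXPONENTS = {"": 0, "k": 3, "M": 6, "G": 9, "T": 12, "P": 15, "E": 18}

def convert(value, from_prefix, to_prefix):
    """Convert a quantity between decimal SI prefixes, e.g. TB to PB."""
    shift = SI_EXPONENTS[from_prefix] - SI_EXPONENTS[to_prefix]
    # Multiply or divide by an exact integer power of ten to limit rounding error.
    return value * 10 ** shift if shift >= 0 else value / 10 ** (-shift)

print(convert(8900, "T", "P"), "PB")   # Mahti's 8900 TB expressed in PB
print(convert(117, "P", "T"), "TB")    # 117 PB expressed in TB
```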
- = Decreased performance caused by a badly designed or implemented storage solution cannot be improved by adding more CPUs to the supercomputer.
- = Instead of the speed of a single hard disk, the performance comes from an intelligent, distributed mechanism for reading and writing data on multiple disks simultaneously.
- ~ Mahti's storage is implemented with 170 000 Blu-ray discs and an equivalent number of RW Blu-ray drives.
- ~ More persistent (and cheaper) storage solutions are gained with the help of VHS tapes.
- = IaaS -> Cloud capabilities type in which the cloud service customer can provision and use processing, storage and networking resources provided by a cloud service provider.
- = PaaS -> Cloud capabilities type in which the cloud service customer can deploy, manage and run customer-created applications using one or more programming languages and execution environments supported by a cloud service provider.
- = SaaS -> Cloud capabilities type in which the cloud service customer can use the cloud service provider's applications.
TODO: I am not sure what "Cloud capabilities type" means
5.2) Traditionally, supercomputing environments are tailor-made, and use cases must meet the sometimes very precise requirements of the execution environment. Parts of a modern HPC environment's services may be offered in a cloud computing manner, where a customer has network access to a scalable and elastic pool of shareable physical or virtual resources with on-demand self-service provisioning and administration.
TRUE
- = Japan
- ~ USA
- ~ China
- ~ Finland
- ~ Sweden
6.2) The most energy efficient supercomputer is the NVIDIA DGX SuperPOD in the USA. Its rank on the TOP500 list is 170th and its power efficiency is 26.195 GFlops/watt. List the supercomputers mentioned in the material from most energy efficient to least.
-
- USA
-
- Japan
-
- China
- = LUMI will be located in Kajaani and all of its waste heat is used to heat up the local district.
- = LUMI will consist of different services: CPU and GPU based raw calculating power and e.g. AI specialised platform(s).
- = With estimated 552 PFLOP/s and 117 PB storage capacity it is estimated that LUMI will rank in the top 10 of the fastest supercomputers in the world.
- ~ LUMI will require the space of a football field when installed.
Exam (5–10 questions), 80% correct to pass, multiple attempts allowed
Running and using a supercomputer
2.1) Select all the correct options that are related to operating supercomputers and their services.
- = Supercomputers offer several different types of tasks in which a person can specialise. TODO: needs to be made clearer
- = Maintaining a supercomputer is teamwork.
- ~ To be able to work with supercomputers, a person needs to master a wide variety of skills, starting from HVAC and ending with customer service about a specific Python library needed for modeling weather.
- ~ Administering supercomputers is quite dull and monotonous work where architectures are straightforward and trivial to implement and maintain.
3.1) LUMI aims to be one of the top ten fastest supercomputers in the world. Yet at the same time energy efficiency and sustainability need to be taken into consideration. Choose all the correct statements related to LUMI.
- = 100 percent of the used energy is renewable (hydropower).
- = Waste heat is utilised further and led to the district heating network.
- ~ The supercomputer is turned off for the night.
- ~ For more efficient cooling, supercomputers will be submerged in a swimming pool.
4.1) When you buy a new phone or computer, you may ponder whether to buy now or wait a little while for a newer model. Supercomputer vendors must know the current situation but sell future promises. For the money invested, the buyer wants state-of-the-art hardware that does not yet exist at the time of purchase. Which of the following are tools for a successful deal?
- = Performance commitments that the vendor will propose.
- = A request for information (RFI) followed by a request for proposal (RFP) to initiate negotiations with potential sellers.
- = Teamwork of multiple people from both the buyer's and the seller's sides.
- ~ A well-functioning online store for selecting the right supercomputer parts and real-time price updates.
- ~ Specifications of an old supercomputer which will be updated by adding more memory and CPUs.
- = similar to visiting a post office or a bank. You take a queue number and wait for your turn. With supercomputers this is done with a batch queuing system. Note that a batch queue is not processed strictly sequentially; a number of variables affect the order in which the queue is unloaded.
- = possible over the internet by using a text based user interface or in some cases a web browser.
- ~ very expensive. Especially for researchers, who will be invoiced independently. TODO: maybe this should be removed, because it is expensive, and some funding entity does pay for it.
- ~ like driving a Formula 1 car. Only few can do it and it takes a long time to learn just the basics. TODO: This should be rethought, because it is somewhat hard, but not impossible for a researcher.
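The remark above that a batch queue is not unloaded in plain arrival order can be illustrated with a toy priority queue. Everything in this sketch is hypothetical: the `priority` formula, its weights and the example jobs are invented for illustration and do not describe how any real batch scheduler computes priorities.

```python
import heapq

def priority(wait_minutes, nodes, fair_share):
    """Hypothetical score: longer waits and a better fair share raise it, large jobs lower it."""
    return 1.0 * wait_minutes + 100.0 * fair_share - 2.0 * nodes

# (name, minutes waited, nodes requested, fair-share factor between 0 and 1)
jobs = [
    ("A", 30, 64, 0.2),   # submitted first, but large and from a heavy user
    ("C", 20, 4, 0.5),
    ("B", 10, 1, 0.9),    # submitted last, but tiny and from a light user
]

# heapq is a min-heap, so the priority is negated to pop the highest score first.
heap = [(-priority(w, n, f), name) for name, w, n, f in jobs]
heapq.heapify(heap)
order = [heapq.heappop(heap)[1] for _ in range(len(jobs))]

print("Arrival order:", [name for name, *_ in jobs])
print("Start order:  ", order)
```

With these made-up weights the job that arrived last starts first, which is the point of the post office comparison breaking down.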
- = Linux
- ~ Windows 7
- ~ macOS
- ~ BeOS
Exam (5–10 questions), 80% correct to pass, multiple attempts allowed
Parallel computing concepts
- = Five people pushing a car together.
- = Ten people filling a swimming pool with their own buckets.
- = Students collecting trash from the school yard.
- ~ 4 x 100 meters relay
- ~ Working in three shifts
- ~ Reading two books at the same time instead of one.
- = tightly coupled
- ~ trivially parallel
- ~ tightly coupled
- = trivially parallel
- = Choosing the right amount of resources for the task at hand may be tricky.
- = Good performance is the result of well-orchestrated co-operation between resources.
- ~ The supercomputer takes care of scaling the computational tasks automatically.
- ~ Programming languages offer automatic scaling. E.g. in Python one imports the scomputer library and calls its scomputer.scale() method.
TODO: scaling --> parallelization? Also, some languages/systems do this more or less automatically.
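In contrast to the fictional `scomputer.scale()` call in the last option, dividing the work is normally the programmer's job. The sketch below splits a trivially parallel sum into chunks by hand using the standard `concurrent.futures` module. The helper names are hypothetical, and threads are used only to keep the example short: in CPython they do not speed up CPU-bound work, so real HPC code would use processes, MPI or compiled kernels instead.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(bounds):
    """Sum of squares over the half-open range [start, stop)."""
    start, stop = bounds
    return sum(i * i for i in range(start, stop))

def split(n, parts):
    """Split range(n) into `parts` contiguous chunks of nearly equal size."""
    step, extra = divmod(n, parts)
    bounds, start = [], 0
    for p in range(parts):
        stop = start + step + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def parallel_sum_of_squares(n, workers=4):
    # The programmer, not the language, decides how the work is divided.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, split(n, workers)))

print(parallel_sum_of_squares(1000))
```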
--
--
Exam (5–10 questions), 80% correct to pass, multiple attempts allowed
More technical aspects of parallel computing
--
--
- = Memory and memory access limitations of a shared memory architecture.
- = Easy scalability. TODO: true as far as the hardware goes, but difficult for the programmer ...
- ~ CPU's need for fast access to memory.
- ~ Handling of large and complex data.
TODO: Another option
4.1) Drag and drop the memory / storage types into the right places in the memory hierarchy pyramid. The fastest is at the top. (if drag-and-drop type tasks are supported?)
- Registers
- L1
- L2
- L3
- Main memory (RAM)
- Remote memory
- File system disks
5.1) Here is an interesting article related to the topic: https://blog.royalsloth.eu/posts/the-compiler-will-optimize-that-away/
It would be good to include the figure from it (Process-Memory performance gap) in the question.
The following picture (pic X) illustrates a problem caused by the unequal development of a computer's core technologies: processors need to wait for the memory. To improve performance, the number of reads and writes to memory must be optimised. Optimisation is done in program code. Select two methods that can be used to improve performance.
- = Structuring the code so that the compiler can exploit instruction-level parallelism.
- = Vectorizing parts of code where possible.
- ~ Reducing the number of cycles by giving fewer instructions.
- ~ Overclocking the supercomputer's random access memory (RAM).
--
- = To gain better performance, different nodes must be located relatively near to each other.
- = All things considered, even in the most optimized synthetic scenario a supercomputer reaches at most about 80 % of its theoretical peak performance.
- ~ Mahti's interconnect network topology is awesomely called Dragon's flight based on the TV series Game of Thrones.
- ~ All of Mahti's nodes are connected to each other through a fully connected network, and it was a summer job of three interns.
- ~ The interconnect is the second fastest part of the chain feeding data to CPUs, right after L1 cache.
Exam (5–10 questions), 80% correct to pass, multiple attempts allowed
HPC programming
- = Supercomputer applications are often only data oriented (output is mainly numbers), and do not require a graphical user interface
- = Programming models for a cluster are very different from models for a single-CPU desktop computer
- = Scientific computing is highly oriented around performance optimization, since the difference between 0.1 s and 0.01 s per operation can add up to a huge difference in total run time.
- ~ Compiling the software on all of the nodes separately is a tedious task often given to interns
- ~ Writing software for a supercomputer leads to countless arguments over who gets to use the supercomputer next
- = GPU programming models are often produced by the vendor for specific hardware
- = Since the focus on GPU utilization is fairly new, there is still no industry standard
- ~ The GPU programming model is dependent on the vendor of the system's RAM, requiring changes to the source code every time the system's RAM is of a different make and model.
- C
- C++
- Fortran
- Python
- JavaScript
- C#
- Haskell
- Cobol
- = A compiled language is converted directly to machine code that the computer can execute, while an interpreted language requires an intermediate process (an interpreter) to convert the language to machine code at run time.
- ~ Compiled languages are often much slower than interpreted languages because they lack a fast interpreter to translate the code for the computer
- ~ Compiled languages are often easier to write than interpreted languages
- MPI, OpenMP, basic differences between these (OpenMP: node level, incremental approach ok, all memory accessible to all threads (unless otherwise specified); MPI: between nodes too, explicit send/receive when accessing memory etc.)
- CUDA/HIP etc.
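The difference noted above between shared memory (OpenMP) and explicit send/receive (MPI) can be mimicked with the Python standard library. This is only an analogy built on `threading` and `queue`: it does not use the OpenMP or MPI APIs, and the function names are hypothetical.

```python
import queue
import threading

def shared_memory_sum(values, workers=4):
    """OpenMP-like style: every worker reads the same list and writes to a shared array."""
    partial = [0] * workers

    def work(rank):
        partial[rank] = sum(values[rank::workers])   # direct access to shared data

    threads = [threading.Thread(target=work, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)

def message_passing_sum(values, workers=4):
    """MPI-like style: a worker only sees data that was explicitly sent to it."""
    inboxes = [queue.Queue() for _ in range(workers)]
    results = queue.Queue()

    def work(rank):
        chunk = inboxes[rank].get()    # explicit receive
        results.put(sum(chunk))        # explicit send of the partial result

    threads = [threading.Thread(target=work, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for r in range(workers):
        inboxes[r].put(values[r::workers])   # explicit send of each worker's share
    for t in threads:
        t.join()
    return sum(results.get() for _ in range(workers))

data = list(range(1, 101))
print(shared_memory_sum(data), message_passing_sum(data))
```

Both functions compute the same sum; what differs is whether the workers reach into common memory or exchange data through explicit messages.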
- = Heavily optimized due to sometimes even decades of development
- = Freedom to focus on science over developing code for general operations
- ~ Allows the scientist to "sweep under the rug" complicated operations that they might not understand