*example:* because `reshape` creates a delayed array, we can access a single element without computing the whole array
%% Cell type:code id: tags:
``` haskell
reshape (Z :. 4 :. 2 :: DIM2) c ! (Z :. 1 :. 0 :: DIM2)
```
%% Output
%% Cell type:markdown id: tags:
**rank**<br/>
number of dimensions
```haskell
rank :: Shape sh => sh -> Int
```
%% Cell type:markdown id: tags:
**size**<br/>
number of elements
```haskell
size :: Shape sh => sh -> Int
```
%% Cell type:markdown id: tags:
**extent**<br/>
get the shape of an array
```haskell
extent :: (Shape sh, Source r e) => Array r sh e -> sh
```
Since `rank` and `size` operate on shapes rather than arrays, we often combine them with `extent`:
%% Cell type:code id: tags:
``` haskell
rank (extent c)
size (extent c)
```
%% Output
%% Cell type:markdown id: tags:
### delayed representations and computations
- `D` -- functions from indices to elements
- `C` -- cursor functions
These are arrays whose elements have not been computed yet. They are evaluated with specialised functions for sequential or parallel computation (`computeS` and `computeP`). This has two advantages:
- you can choose exactly when and how the computation happens
- intermediate arrays are never built (*fusion*)

*Fusion* means that if you chain several operations on some data, the combined operation is applied in a single pass, so no intermediate arrays are created, which is fast and saves memory.
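To make the idea concrete, here is a tiny self-contained sketch in plain Haskell (not Repa; the names `Delayed`, `dMap`, and `dCompute` are made up for illustration) of how a delayed array is just a function from index to element, and why stacking two maps creates no intermediate array:

```haskell
-- A toy "delayed" array: a size plus a function from index to element.
-- This mirrors the idea behind Repa's D representation.
data Delayed a = Delayed Int (Int -> a)

-- Mapping only composes the index function; nothing is computed yet.
dMap :: (a -> b) -> Delayed a -> Delayed b
dMap f (Delayed n ix) = Delayed n (f . ix)

-- Only computing materializes the elements: the two stacked dMaps below
-- have fused into one pass, and no intermediate array of (+1)-results
-- ever exists.
dCompute :: Delayed a -> [a]
dCompute (Delayed n ix) = map ix [0 .. n - 1]

main :: IO ()
main = print (dCompute (dMap (* 2) (dMap (+ 1) (Delayed 5 id))))
-- prints [2,4,6,8,10]
```

Repa's real `D` representation works the same way, just generalised to multi-dimensional shapes.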
Running this in the IHaskell notebook does not execute it in parallel; to do that we need to compile the program.
## actually running the program in parallel
When compiling we need to add the command-line options `-threaded` and `-rtsopts`, which enable multiple threads and runtime options, respectively.
For example, for the mandelbrot program we do
```
ghc -Odph -rtsopts -threaded -O3 mandelbrot.lhs
```
At runtime we specify how many threads the program should use with the RTS option `-Nx`, where `x` is the number of threads. Continuing the example, run the program with
```
./mandelbrot +RTS -N2 -s -RTS
```
and compare the runtime to `-N1`
```
1 thread: Total time 0.875s ( 0.867s elapsed)
2 threads: Total time 0.964s ( 0.607s elapsed)
```
The elapsed time drops from 0.867s to 0.607s, a speedup of roughly 1.4x. That is not a perfect factor of two, but good enough for this quick program.
## tool for monitoring parallel processes: threadscope
*threadscope* is a program that helps visualize what happens during a parallel computation.
For this we need to compile with the `-eventlog` flag, which enables writing an *eventlog* that threadscope uses to create its report.
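Concretely, the compile and run steps could look like this (note that the RTS option `-l` is what makes the runtime actually write the eventlog file):

```
ghc -Odph -rtsopts -threaded -eventlog -O3 mandelbrot.lhs
./mandelbrot +RTS -N2 -l -RTS
```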
After the run there is a file `mandelbrot.eventlog`, and we can open it with threadscope:
```
threadscope mandelbrot.eventlog
```
The result is:
<img src="../images/threadscope.png"/>
Here we can see that most of the time is spent sequentially writing the matrix to disk, while the calculation itself runs well in parallel.
%% Cell type:markdown id: tags:
### general way to parallelize your program
To get the most out of parallelization you need to consider several things. A full treatment is out of scope here, but we can give some simple advice for working with Repa:
1. Think about which parts of your program are trivially parallelizable (calculations that do not depend on each other).
2. Parallelize them with Repa.
3. Think about which parts of your program you can transform into trivially parallelizable parts. This usually involves splitting up your algorithm.

Especially step 3 can be hard. The step after that would be parallelizing across several computers. This adds the complexity that the computers do not share memory, so data has to be transferred between the computing nodes. That is a lecture of its own (using, for example, MPI, the Message Passing Interface).
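As a small illustration of steps 1 and 3, here is a self-contained sketch in plain Haskell (using `forkIO` from `base` rather than Repa; `chunksOf` and `parSum` are our own names): a big sum is split into chunk sums that do not depend on each other, each chunk is computed on its own thread, and the partial results are combined at the end. With Repa, `computeP` handles this splitting and combining for you.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Split a list into chunks of the given size (our own helper).
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (a, b) = splitAt n xs in a : chunksOf n b

-- Sum every chunk on its own thread, then combine the partial sums.
-- The chunk sums are trivially parallelizable: none depends on another.
parSum :: [Double] -> IO Double
parSum xs = do
  vars <- mapM (\chunk -> do
                  v <- newEmptyMVar
                  _ <- forkIO (putMVar v $! sum chunk)  -- independent work
                  return v)
               (chunksOf 250 xs)
  partials <- mapM takeMVar vars                        -- collect results
  return (sum partials)

main :: IO ()
main = parSum [1 .. 1000] >>= print
-- prints 500500.0
```

Note that a plain `sum` is sequential; it only became parallelizable after we split it into independent chunk sums, which is exactly the transformation step 3 describes.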