# Aggregator
The aggregator collects job data and outputs it in a format suitable for generating reports.

*It currently also contains the script that exports `jobinfo` data into the DB from the post-execution script.*

## Requirements

The aggregator requires the following software to be installed and the required data to be available in the DB:

-  Python 3.
-  InfluxDB. The aggregator currently works only with InfluxDB, so it must be installed and the aggregator must be configured as described in the Configuration section below.
-  The DB must contain all required data specified in the [DB spec](https://gibraltar.chi.uni-hannover.de/profit-hpc/ProfiT-HPC/blob/d9a21af233ab373bf90420e3f0f0c05e1c65aef8/Internals/DB/InfluxDBspec.md).

You should also install Python dependencies with:

```bash
python3 -m pip install -r requirements.txt
```

## Configuration

Sample configurations are stored in the repository as templates with the `.py.sample` extension. To use one, rename it to `.py` and replace the placeholder values.

Real configs should be listed in the `.gitignore` file so that they are never committed.

Sample configs must not be used in the code.

**Example**: `influxdb.py.sample` -> `influxdb.py`
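
The renaming step can be scripted. The following is a minimal sketch, not part of the aggregator: `activate_samples` is a hypothetical helper, and the `conf` directory passed to it is an assumption about where the templates live. It copies each template instead of renaming it, so the sample stays in the repository, and it never overwrites an existing `.py` file.

```python
import shutil
from pathlib import Path


def activate_samples(conf_dir):
    """Copy every *.py.sample template in conf_dir to its *.py name.

    Existing *.py files are left untouched so real settings are never
    overwritten. Returns the list of files that were created.
    """
    created = []
    for sample in sorted(Path(conf_dir).glob("*.py.sample")):
        # with_suffix("") strips only the trailing ".sample",
        # so "influxdb.py.sample" becomes "influxdb.py".
        target = sample.with_suffix("")
        if not target.exists():
            shutil.copyfile(sample, target)
            created.append(target)
    return created


if __name__ == "__main__":
    # "conf" is an assumed location; adjust it to where the templates live.
    for path in activate_samples("conf"):
        print(f"created {path}; edit the placeholder values before use")
```

The placeholder values still have to be edited by hand after the copy.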

## Security

The aggregator requires access to the database (currently InfluxDB). The DB credentials are stored in a plain-text file that is readable by anyone with read access to the configuration files. To prevent users from reading the DB credentials and job data that does not belong to them, the toolkit provides a setuid executable for running the aggregator. To use this feature, follow these steps:

1. Configure the aggregator to use the setuid binary by setting `SECUSER=True` in `conf/config.py`.
2. Compile the setuid binary: `cd setuid-runner; make`.
    - This requires Python 3 to be installed on the system.
    - Configure the Makefile first: set at least the `SPATH` variable and, if needed, the `LDFLAGS` variable.
3. Optionally move `setuid-runner/setuid-runner` to any location that is accessible to users.
4. Change the ownership of all aggregator files to `safeuser` and remove access for everyone else: `chown -R safeuser aggregator && chmod -R go-rwx aggregator`
5. Set the ownership of the `setuid-runner` binary first and then its **setuid** bit: `chown safeuser setuid-runner && chmod u+s,a+rx setuid-runner`. The order matters because on Linux `chown` clears the setuid bit of an executable.
6. To fetch the data, call `setuid-runner` the same way you would call `data.py`, for instance: `./setuid-runner -t text JOBID`
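
The permission bits set in steps 4 and 5 can be verified with a short script. This is a sketch under assumptions, not part of the toolkit: the function names are hypothetical, the two paths in the `__main__` part are only examples taken from the steps above, and the owner (`safeuser`) is not checked. Run it as `safeuser` or root, since other users cannot read the protected files.

```python
import os
import stat

# Bits that `chmod a+rx` sets: read and execute for user, group and others.
ALL_READ_EXEC = 0o555


def runner_mode_problems(mode):
    """Return a list of problems with the mode of the setuid-runner binary."""
    problems = []
    if not mode & stat.S_ISUID:
        problems.append("setuid bit is not set (chmod u+s)")
    if (mode & ALL_READ_EXEC) != ALL_READ_EXEC:
        problems.append("not readable and executable by everyone (chmod a+rx)")
    return problems


def private_mode_problems(mode):
    """Return a list of problems for a file that only its owner may access."""
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        return ["group or others still have access (chmod go-rwx)"]
    return []


if __name__ == "__main__":
    # Example paths, relative to the aggregator directory (assumed layout).
    checks = [
        ("setuid-runner/setuid-runner", runner_mode_problems),
        ("conf/config.py", private_mode_problems),
    ]
    for path, check in checks:
        if not os.path.exists(path):
            print(f"{path}: not found, skipped")
            continue
        problems = check(os.stat(path).st_mode)
        print(f"{path}: {'ok' if not problems else '; '.join(problems)}")
```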

## Usage

The main executable of the aggregator module is `data.py`. Run `./data.py -h` to see the available options:

```
usage: data.py [-h] [-t {text,pdf,all}] [-o [OUTPUT_DIR]] JOBID

Gets the job information required for generating text or PDF reports and
outputs it in JSON format.

positional arguments:
  JOBID                 job ID used in the batch system

optional arguments:
  -h, --help            show this help message and exit
  -t {text,pdf,all}, --type {text,pdf,all}
                        type of the output (default: text)
  -o [OUTPUT_DIR], --output-dir [OUTPUT_DIR]
                        output directory (default: None)
```
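
When the aggregator is driven from another Python program, the command line above can be assembled and its output parsed as follows. This is only a sketch: `build_command` and `fetch_job_data` are hypothetical names, and `fetch_job_data` assumes that a single JSON document is written to standard output when no output directory is given, which the help text does not state explicitly.

```python
import json
import subprocess

REPORT_TYPES = ("text", "pdf", "all")


def build_command(job_id, report_type="text", output_dir=None,
                  executable="./data.py"):
    """Build the argument list for data.py from the documented options."""
    if report_type not in REPORT_TYPES:
        raise ValueError(f"report_type must be one of {REPORT_TYPES}")
    cmd = [executable, "-t", report_type]
    if output_dir is not None:
        cmd += ["-o", str(output_dir)]
    # JOBID is the only positional argument and goes last.
    cmd.append(str(job_id))
    return cmd


def fetch_job_data(job_id, report_type="text"):
    """Run the aggregator and parse its output.

    Assumption: the JSON document is printed to standard output.
    """
    result = subprocess.run(
        build_command(job_id, report_type),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

For example, `build_command(1352076, "pdf")` yields `["./data.py", "-t", "pdf", "1352076"]`.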

## Get test output

To get `json` output from the test data located in `test/data`, a Docker container with an InfluxDB instance and the imported data must be running. To build the Docker container with the necessary test data, run the following script:
```
test/docker/influxdb/build_influxdb.sh
```
To run the InfluxDB container, simply execute:
```
test/docker/influxdb/run_influxdb.sh
```
Once the InfluxDB instance is up, configure the aggregator to use it by copying `influxdb.py.sample` to `influxdb.py` and editing it as follows:

```
IDB = {
    "username": "",
    "password": "",
    "database": "pfit",
    "api_url": "http://localhost:58086",
    "ssl": False,
}
```
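
Before running the aggregator, it can be useful to confirm that the container is reachable. The sketch below queries the `/ping` endpoint of the InfluxDB HTTP API, which answers with an empty `204` response when the server is up. `ping_url` and `influxdb_is_up` are hypothetical helpers, and the `IDB` dictionary simply repeats the test configuration shown above.

```python
import urllib.request

# Same values as in the test configuration above.
IDB = {
    "username": "",
    "password": "",
    "database": "pfit",
    "api_url": "http://localhost:58086",
    "ssl": False,
}


def ping_url(idb):
    """Return the URL of the InfluxDB health-check endpoint."""
    return idb["api_url"].rstrip("/") + "/ping"


def influxdb_is_up(idb, timeout=3.0):
    """Return True if the InfluxDB instance answers the ping request."""
    try:
        with urllib.request.urlopen(ping_url(idb), timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers urllib.error.URLError, refused connections and timeouts.
        return False


if __name__ == "__main__":
    state = "up" if influxdb_is_up(IDB) else "not reachable"
    print(f"InfluxDB at {IDB['api_url']} is {state}")
```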

As an example, the following command outputs the test data in `json` format for a `pdf` report:

```
./data.py -t pdf 1352076
```

You can use other `JOBID`s, which you can find in the `test/data` directory.

**Note**: The test data for JOBID `2368599` does not include *Infiniband* metrics, so Infiniband support must be switched off in the configuration file (`conf/config.py`) before using it.

## Export job info

To export the job info for a given `JOBID`, call `export.py`:
```bash
  export.py JOBID
```
The aggregator then gathers the information for the job with ID `JOBID` from the batch system configured in `conf/config.py` and saves it into the configured database in the `pfit-jobinfo` measurement and, additionally, in the `pfit-jobinfo-alloc` measurement.
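
To check that an export actually wrote something, the measurement can be inspected through the `/query` endpoint of the InfluxDB 1.x HTTP API. The sketch below only builds the request URL. `latest_point_url` is a hypothetical helper, the API URL and database name are taken from the test configuration in this README, and no field or tag names are assumed, so the query just selects the most recent point.

```python
from urllib.parse import urlencode


def latest_point_url(api_url, database, measurement):
    """Build a /query URL that selects the newest point of a measurement."""
    # Measurement names containing "-" must be double-quoted in InfluxQL.
    query = f'SELECT * FROM "{measurement}" ORDER BY time DESC LIMIT 1'
    params = urlencode({"db": database, "q": query})
    return api_url.rstrip("/") + "/query?" + params


if __name__ == "__main__":
    for name in ("pfit-jobinfo", "pfit-jobinfo-alloc"):
        print(latest_point_url("http://localhost:58086", "pfit", name))
```

The printed URLs can be opened with any HTTP client. If the database requires authentication, credentials have to be supplied as well.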

## Recommendation system

Currently the recommendation system is a module used by the aggregator and located in this repository under the [rcm](./rcm) directory. See the [documentation](./rcm/docs) and the [README](./rcm/README.md) for more information.