Commit d7af9a98 authored by Igor Merkulow

minor update to the spec

parent 099e63ae
# Extended metrics set
IMPORTANT: we use a different naming scheme here to clearly distinguish between the metrics collected and the metrics used in the reports. The metrics in the report are defined independently of all collectors to avoid ambiguities and errors. We will provide an explanation or an example of how each value is calculated, so that it has an identical meaning in all reports. For example, to get an impression of how much data is read from disk, we have a parameter called `pfit_fs_read_bytes`. This value will probably never exist as a collected metric, since most file systems report their data separately. For a basic understanding of the program's I/O behavior, however, a value aggregated over all file systems should be sufficient.
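The aggregation described for `pfit_fs_read_bytes` could look roughly like the sketch below. The collector metric names (`lustre_read_bytes`, `nfs_read_bytes`, `local_read_bytes`) are hypothetical and only serve as an illustration; the actual names depend on the collectors in use.

```python
# Hypothetical per-file-system counters as a collector might report them.
# These names are assumptions for illustration, not part of the specification.
FS_READ_SOURCES = ("lustre_read_bytes", "nfs_read_bytes", "local_read_bytes")


def aggregate_fs_read_bytes(collected: dict) -> int:
    """Sum the read counters of all known file systems.

    File systems that did not report a value are simply left out of the sum.
    """
    return sum(collected.get(name, 0) for name in FS_READ_SOURCES)


collected = {
    "lustre_read_bytes": 4096,
    "nfs_read_bytes": 1024,
    "lustre_write_bytes": 512,  # not a read counter, so it is not aggregated
}

report = {"pfit_fs_read_bytes": aggregate_fs_read_bytes(collected)}
```

Keeping the list of source metrics in one place means the report-level name stays stable even if collectors are added or renamed.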
The data in the extended set will have an additional marker denoting whether a value is required. Required values should be easy to collect and sufficient to give a high-level picture of the job. Optional values represent specific metrics that may not be available everywhere. If they are present, they will be included in the report; otherwise the section is skipped.
Additional values not covered by this specification are allowed, but are currently not included in the report. This should allow developers to parse all the metrics from a source without having to filter out the "unnecessary" results. It should also simplify future extensions of the specification.
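A minimal sketch of how a report generator might apply these rules is shown below. Which metrics are marked as required here is a hypothetical choice, and raising an error for a missing required value is an assumption; the specification does not state how that case is handled.

```python
# Hypothetical excerpt of the specification: metric name -> required flag.
# The required/optional assignment below is an assumption for illustration.
SPEC = {
    "pfit_requested_cores": True,   # required
    "pfit_num_used_nodes": True,    # required
    "pfit_fs_read_bytes": False,    # optional
}


def select_report_values(metrics: dict, spec: dict = SPEC) -> dict:
    """Keep only the values covered by the specification.

    - Missing required values raise an error (assumed behavior).
    - Missing optional values are skipped.
    - Values not covered by the specification are accepted but ignored.
    """
    missing = [name for name, required in spec.items()
               if required and name not in metrics]
    if missing:
        raise ValueError("missing required metrics: " + ", ".join(missing))
    return {name: metrics[name] for name in spec if name in metrics}


parsed = {
    "pfit_requested_cores": 64,
    "pfit_num_used_nodes": 2,
    "collector_specific_value": 17,  # not in the specification, ignored
}
report_values = select_report_values(parsed)
```

Because unknown keys are dropped at this single point, a parser can pass along everything it reads from a source without filtering first.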
## Data subsets
......@@ -77,7 +77,7 @@ Most of this information is provided by the job management system and can probab
Additional explanations:
- `pfit_requested_time` should ideally be roughly equal to (end_time - start_time). A runtime significantly lower than the requested time can indicate a problem or over-requesting of resources. Both situations should be investigated and avoided. The upper limit is currently set to one year.
- `pfit_requested_cores` is aggregated over all nodes (total sum).
- `pfit_num_used_nodes` has to be equal to the number of node-related data blocks in the set.
- `pfit_sampling_interval` is a value that we set in the configuration, so it should be identical for all nodes, but it can also be aggregated if necessary (e.g. the shortest interval should be stated here). TODO: define exactly how the interval is specified (e.g. whether "1m30s" should be allowed or only "90s") and adapt the RegEx. An integer value in seconds might be better. Currently, only lengths between 2 and 6 are allowed by the validator.
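The checks listed above can be sketched as follows. The data layout (a `nodes` list inside the job record), the use of seconds for `pfit_requested_time`, and the interval pattern are all assumptions; in particular, the interval format is still marked as TODO, so the pattern only models the "90s" variant with a total length of 2 to 6.

```python
import re

ONE_YEAR_SECONDS = 365 * 24 * 3600  # assumed: requested time is given in seconds

# Assumed format: 1 to 5 digits followed by "s", i.e. a total length of 2 to 6.
INTERVAL_PATTERN = re.compile(r"\d{1,5}s")


def check_job(job: dict) -> list:
    """Return a list of human-readable problems found in a job record."""
    problems = []
    if job["pfit_requested_time"] > ONE_YEAR_SECONDS:
        problems.append("pfit_requested_time exceeds one year")
    if job["pfit_num_used_nodes"] != len(job["nodes"]):
        problems.append("pfit_num_used_nodes does not match the number of node blocks")
    if not INTERVAL_PATTERN.fullmatch(job["pfit_sampling_interval"]):
        problems.append("pfit_sampling_interval has an unexpected format")
    return problems


good_job = {
    "pfit_requested_time": 3600,
    "pfit_num_used_nodes": 2,
    "pfit_sampling_interval": "90s",
    "nodes": [{"name": "node01"}, {"name": "node02"}],  # hypothetical node blocks
}
bad_job = {
    "pfit_requested_time": 2 * ONE_YEAR_SECONDS,
    "pfit_num_used_nodes": 3,
    "pfit_sampling_interval": "1m30s",
    "nodes": [{"name": "node01"}],
}
```

With these inputs, `check_job(good_job)` returns an empty list, while `check_job(bad_job)` reports all three problems. If the interval is later defined as a plain integer in seconds, only the pattern check would need to change.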
......