Commit 7177158

Updated documentation of schema for function profile entries in global database
1 parent 8a44e35 commit 7177158

1 file changed

+36
-48
lines changed

sphinx/source/io_schema/provdb_schema.rst

Lines changed: 36 additions & 48 deletions
@@ -186,50 +186,48 @@ Global database
186186

187187
Below we describe the JSON schema for the **func_stats**, **counter_stats** and **ad_model** collections of the **global database** component of the provenance database.
188188

189+
A common data structure **RunStats** is used extensively to represent statistics (mean, min/max, std. dev., etc) of some quantity. It has the following schema:
190+
191+
| {
192+
| **'accumulate'**: *The sum of all values (same as mean \* count). In some cases this entry is not populated*,
193+
| **'count'**: *The number of values*,
194+
| **'kurtosis'**: *kurtosis of the distribution of values*,
195+
| **'maximum'**: *maximum value*,
196+
| **'mean'**: *average value*,
197+
| **'minimum'**: *minimum value*,
198+
| **'skewness'**: *skewness of distribution of values*,
199+
| **'stddev'**: *standard deviation of distribution of values*
200+
| }
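As a rough illustration of what a **RunStats** object holds, the sketch below builds a dictionary with the same field names from a list of values. The helper name `run_stats` is hypothetical, and the choice of population moments (dividing by the count) and excess kurtosis is an assumption; the schema text does not say which conventions the database uses.

```python
import math


def run_stats(values):
    """Summarise a non-empty list of numbers using the RunStats field names.

    Assumption: population moments and excess kurtosis. The real
    implementation may use different conventions.
    """
    n = len(values)
    total = sum(values)
    mean = total / n
    # Central moments of order 2, 3 and 4 (population form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "accumulate": total,  # same as mean * count
        "count": n,
        "kurtosis": (m4 / m2 ** 2 - 3.0) if m2 > 0 else 0.0,
        "maximum": max(values),
        "mean": mean,
        "minimum": min(values),
        "skewness": (m3 / m2 ** 1.5) if m2 > 0 else 0.0,
        "stddev": math.sqrt(m2),
    }


stats = run_stats([1.0, 2.0, 3.0, 4.0])
print(stats["mean"], stats["accumulate"], stats["count"])
```

The `accumulate` entry is redundant with `mean * count`, which is consistent with the note in the schema that it is not always populated.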
201+
202+
189203
Function profile statistics schema
190204
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
191205

192-
**func_stats** contains aggregated profile information for all functions. The JSON schema is as follows:
206+
**func_stats** contains aggregated profile information and anomaly information for all functions. The JSON schema is as follows:
193207

194208
| {
195-
| **'app'**: *program index*,
196-
| **'fid'**: *global function index*,
197-
| **'name'**: *function name*,
198-
| **'exclusive'**: *Statistics of runtime exclusive of children*
199-
| {
200-
| **'accumulate'**: *unused*,
201-
| **'count'**: *total function executions*,
202-
| **'kurtosis'**: *kurtosis of function exclusive time distribution*,
203-
| **'maximum'**: *maximum function exclusive time*,
204-
| **'mean'**: *average function exclusive time*,
205-
| **'minimum'**: *minimum function exclusive time*,
206-
| **'skewness'**: *skewness of function exclusive time distribution*,
207-
| **'stddev'**: *standard deviation of function exclusive time distribution*,
208-
| },
209-
| **'inclusive'**: *Statistics of runtime inclusive of children*
210-
| {
211-
| **'accumulate'**: *unused*,
212-
| **'count'**: *total function executions*,
213-
| **'kurtosis'**: *kurtosis of function inclusive time distribution*,
214-
| **'maximum'**: *maximum function inclusive time*,
215-
| **'mean'**: *average function inclusive time*,
216-
| **'minimum'**: *minimum function inclusive time*,
217-
| **'skewness'**: *skewness of function inclusive time distribution*,
218-
| **'stddev'**: *standard deviation of function inclusive time distribution*,
219-
| },
220-
| **'stats'**: *Statistics on function anomalies per timestep observed in run to-date*
221-
| {
222-
| **'accumulate'**: *total number of anomalies observed for this function*,
223-
| **'count'**: *number of timesteps data colected for*,
224-
| **'kurtosis'**: *kurtosis of distribution of anomalies/step*,
225-
| **'maximum'**: *maximum anomalies/step*,
226-
| **'mean'**: *average anomalies/step*,
227-
| **'minimum'**: *minimum anomalies/step*,
228-
| **'skewness'**: *skewness of distribution of anomalies/step*,
229-
| **'stddev'**: *standard deviation distribution of anomalies/step*,
230-
| }
209+
| **"__id"**: *record index*,
210+
| **"app"**: *application/program index*,
211+
| **"fid"**: *function index*,
212+
| **"fname"**: *function name*,
213+
| **"anomaly_metrics"**: *statistics on anomalies for this function (object). Note this entry is null if no anomalies were detected*
214+
| {
215+
| **"anomaly_count"**: *statistics on the anomaly count for time steps in which anomalies were detected, as well as the total number of anomalies (RunStats)*,
216+
| **"first_io_step"**: *the first IO step in which an anomaly was detected*,
217+
| **"last_io_step"**: *the last IO step in which an anomaly was detected*,
218+
| **"max_timestamp"**: *the last anomaly's timestamp*,
219+
| **"min_timestamp"**: *the first anomaly's timestamp*,
220+
| **"score"**: *statistics on the scores for the anomalies (RunStats)*,
221+
| **"severity"**: *statistics on the severity of the anomalies (RunStats)*,
222+
| },
223+
| **"runtime_profile"**: *statistics on function runtime (i.e. the function profile) (object)*
224+
| {
225+
| **"exclusive_runtime"**: *statistics on the runtime excluding child function calls (RunStats)*,
226+
| **"inclusive_runtime"**: *statistics on the runtime including child function calls (RunStats)*
227+
| }
231228
| }
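A reader of the new **func_stats** layout has to handle the case where **anomaly_metrics** is null. The following is a minimal sketch of doing so. The record contents, the function name `summarize`, and the reading of the anomaly total from the `accumulate` field of **anomaly_count** are all assumptions made for illustration; only the key names come from the schema above.

```python
import json

# Hypothetical func_stats record; every value here is made up.
record_json = """
{
  "__id": 0,
  "app": 0,
  "fid": 12,
  "fname": "compute_step",
  "anomaly_metrics": null,
  "runtime_profile": {
    "exclusive_runtime": {
      "accumulate": 300.0, "count": 3, "kurtosis": -1.5,
      "maximum": 120.0, "mean": 100.0, "minimum": 80.0,
      "skewness": 0.0, "stddev": 16.3
    },
    "inclusive_runtime": {
      "accumulate": 450.0, "count": 3, "kurtosis": -1.5,
      "maximum": 170.0, "mean": 150.0, "minimum": 130.0,
      "skewness": 0.0, "stddev": 16.3
    }
  }
}
"""


def summarize(record):
    """Pull a few headline numbers out of one func_stats record."""
    profile = record["runtime_profile"]
    anomalies = record["anomaly_metrics"]  # None when no anomalies were detected
    # Assumption: the total anomaly count lives in the RunStats 'accumulate' field.
    total = 0 if anomalies is None else anomalies["anomaly_count"]["accumulate"]
    return {
        "fname": record["fname"],
        "calls": profile["exclusive_runtime"]["count"],
        "mean_exclusive": profile["exclusive_runtime"]["mean"],
        "mean_inclusive": profile["inclusive_runtime"]["mean"],
        "total_anomalies": total,
    }


summary = summarize(json.loads(record_json))
print(summary)
```

Checking for `None` before indexing avoids a `TypeError` on functions that never produced an anomaly.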
232229
230+
233231
Counter statistics schema
234232
^^^^^^^^^^^^^^^^^^^^^^^^^
235233

@@ -238,17 +236,7 @@ The **counter_stats** collection has the following schema:
238236
| {
239237
| **'app'**: *Program index*,
240238
| **'counter'**: *Counter description*,
241-
| **'stats'**: *Global aggregated statistics on counter values since start of run*,
242-
| {
243-
| **'accumulate'**: *Unused*,
244-
| **'count'**: *Number of times counter appeared*,
245-
| **'kurtosis'**: *kurtosis of distribution of value*,
246-
| **'maximum'**: *maximum value*,
247-
| **'mean'**: *average value*,
248-
| **'minimum'**: *minimum value*,
249-
| **'skewness'**: *skewness of distribution of values*,
250-
| **'stddev'**: *standard deviation of distribution of values*
251-
| }
239+
| **'stats'**: *Global aggregated statistics on counter values since start of run (RunStats)*
252240
| }
253241
254242
AD model schema
