Make 'race' self-contained #258

Closed

Description

When Rally stores data in an Elasticsearch metrics store, it stores two kinds of entities: metrics and races. A race contains some metadata about a single benchmark. Here is an example:

{
  "pipeline": "from-sources-complete",
  "car": "4gheap",
  "laps": 1,
  "distribution-version": "6.0.0-alpha1",
  "track": "nested",
  "target-hosts": [
    "some-target-host:39200"
  ],
  "environment": "nightly",
  "revision": "afd45c1",
  "selected-challenge": {
    "name": "nested-search-challenge",
    "operations": [
      "index-append",
      "force-merge",
      "randomized-nested-queries",
      "randomized-term-queries",
      "randomized-sorted-term-queries",
      "match-all",
      "nested-date-histo"
    ]
  },
  "user-tag": "",
  "trial-timestamp": "20170405T062922Z"
}

Whenever a user wants to see a tournament report, we use this metadata to fetch the metrics for the race and regenerate the report. The problem is that all of this relies on an Elasticsearch metrics store.

Furthermore, we have created https://github.com/elastic/rally-results to help the community share benchmark results obtained on different hardware. The suggestion is to use a mixture of a result file and the output of esrally list facts.

I think we can solve all these problems by making a race self-contained (example below). Rally would then always store the contents of a race in the Elasticsearch metrics store (but in the new format), as a file, or both. For tournament reporting, we could then just read the file if the user did not set up an Elasticsearch metrics store. We can also use the same file for sharing results.
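The file-based path could look roughly like the following. This is only a minimal sketch to illustrate the idea, not Rally's implementation: `store_race`, `load_race` and the `metrics_store` object with its `load_race()` method are hypothetical names chosen for this example.

```python
import json


def store_race(race_doc, path):
    """Write a self-contained race document to a JSON file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(race_doc, f, indent=2)


def load_race(path, metrics_store=None):
    """Return the race document needed for a tournament report.

    `metrics_store` is a stand-in for an optional Elasticsearch-backed store.
    Its `load_race()` method is an assumption made for this sketch.
    If no store is configured, fall back to the race file.
    """
    if metrics_store is not None:
        return metrics_store.load_race()
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

With this shape, the same file that backs tournament reporting without a metrics store could also be the one that is shared with others.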

Here is an example of the envisioned structure:

{
  "environment": "nightly",
  "rally_version": "0.5.1.dev0 (git revision: 21e64fc)",
  "cluster": [
    {
      "node": "10.17.33.22",
      "type": "bare-metal",
      "provider": "hetzner",
      "instance_type": "ex41ssd",
      "hardware": {
        "cpu_model": "Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz",
        "disk": [
          {
            "file-system": "ext4",
            "type": "ssd",
            "device": "/dev/mapper/vg00-root",
            "raid_level": 0
          }
        ],
        "memory": "32gb"
      },
      "software": {
        "os_name": "Linux",
        "os_version": "4.4.0-38-generic",
        "jvm_vendor": "Oracle Corporation",
        "jvm_version": "1.8.0_101-b13",
        "distribution_version": "5.2.2"
      }
    }
  ],
  "race": {
    "pipeline": "from-sources-complete",
    "trial-timestamp": "20170405T062922Z",
    "track": "geonames",
    "challenge": "append-fast-no-conflicts",
    "car": "4gheap",
    "total-laps": 1
  },
  "#COMMENT": "We only persist individual laps. 'All' can be derived from that. Each nested array represents a lap.",
  "results": [
    [
      {
        "name": "indexing-time",
        "#COMMENT": "We will store data as fine-grained as possible.",
        "value": 780000,
        "unit": "ms"
      },
      {
        "name": "min-throughput",
        "operation": "index-append",
        "value": 67928,
        "unit": "docs/s"
      }
    ]
  ]
}
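
Since only individual laps are persisted, a report spanning all laps has to be derived from the nested "results" array. The sketch below shows one way to collect each metric's per-lap values. The function name is hypothetical, the second lap contains made-up numbers purely for illustration, and how the values should be combined per metric (here: the minimum for min-throughput) is an assumption that the proposal does not specify.

```python
from collections import defaultdict


def values_across_laps(results):
    """Collect per-lap values for each metric, keyed by (name, operation)."""
    collected = defaultdict(list)
    for lap in results:
        for metric in lap:
            # "operation" is absent for race-wide metrics such as indexing-time
            key = (metric["name"], metric.get("operation"))
            collected[key].append(metric["value"])
    return dict(collected)


# Lap 1 uses the values from the example above; lap 2 is invented for illustration.
results = [
    [
        {"name": "indexing-time", "value": 780000, "unit": "ms"},
        {"name": "min-throughput", "operation": "index-append", "value": 67928, "unit": "docs/s"},
    ],
    [
        {"name": "indexing-time", "value": 760000, "unit": "ms"},
        {"name": "min-throughput", "operation": "index-append", "value": 68110, "unit": "docs/s"},
    ],
]

per_metric = values_across_laps(results)
# Assumed rule: the minimum throughput over all laps is the smallest per-lap minimum.
overall_min_throughput = min(per_metric[("min-throughput", "index-append")])
```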
Labels

:Metrics (How metrics are stored, calculated or aggregated)
:Reporting (Command line reporting)
:Usability (Makes Rally easier to use)
enhancement (Improves the status quo)
highlight (A substantial improvement that is worth mentioning separately in release notes)
