Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check fields type and format #13188

Merged
merged 14 commits into from
Aug 15, 2019
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG-developer.next.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -47,3 +47,4 @@ The list below covers the major changes between 7.0.0-rc2 and master only.
- Introduce beat.OutputChooses publisher mode. {pull}12996[12996]
- Ensure that beat.Processor, beat.ProcessorList, and processors.ProcessorList are compatible and can be composed more easily. {pull}12996[12996]
- Add support to close beat.Client via beat.CloseRef (a subset of context.Context). {pull}13031[13031]
- Add checks for types and formats used in fields definitions in `fields.yml` files. {pull}13188[13188]
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- ILM: Use GET instead of HEAD when checking for alias to expose detailed error message. {pull}12886[12886]
- Fix seccomp policy preventing some features to function properly on 32bit Linux systems. {issue}12990[12990] {pull}13008[13008]
- Fix install-service.ps1's ability to set Windows service's delay start configuration. {pull}13173[13173]
- Fix some incorrect types and formats in field.yml files. {pull}13188[13188]

*Auditbeat*

Expand Down
18 changes: 9 additions & 9 deletions libbeat/kibana/testdata/extensive/fields.yml
Original file line number Diff line number Diff line change
Expand Up @@ -1110,7 +1110,7 @@
fields:
- name: kernel.pct
type: scaled_float
format: percentage
format: percent
sayden marked this conversation as resolved.
Show resolved Hide resolved
description: >
The system kernel consumed by the Docker server.
- name: kernel.ticks
Expand All @@ -1119,23 +1119,23 @@
CPU kernel ticks.
- name: system.pct
type: scaled_float
format: percentage
format: percent
description: >
- name: system.ticks
type: long
description: >
CPU system ticks.
- name: user.pct
type: scaled_float
format: percentage
format: percent
description: >
- name: user.ticks
type: long
description: >
CPU user ticks
- name: total.pct
type: scaled_float
format: percentage
format: percent
description: >
Total CPU usage.
# TODO: how to document cpu list?
Expand Down Expand Up @@ -1309,7 +1309,7 @@
Total memory resident set size.
- name: pct
type: scaled_float
format: percentage
format: percent
description: >
Memory resident set size percentage.
- name: usage
Expand All @@ -1324,7 +1324,7 @@
Max memory usage.
- name: pct
type: scaled_float
format: percentage
format: percent
description: >
Memory usage percentage.
- name: total
Expand Down Expand Up @@ -2061,7 +2061,7 @@

- name: throttle.pct
type: scaled_float
format: percentage
format: percent
description: >
Current throttle percentage for the server when slowstart
is active, or no value if slowstart is inactive.
Expand Down Expand Up @@ -4885,7 +4885,7 @@
Number of consumers.
- name: consumers.utilisation.pct
type: long
format: percentage
format: percent
description: >
Fraction of the time (between 0.0 and 1.0) that the queue is able to immediately deliver messages to consumers. This can be less than 1.0 if consumers are limited by network congestion or prefetch count.
- name: messages.total.count
Expand Down Expand Up @@ -5000,8 +5000,8 @@
fields:
- name: used.value
type: long
description: >
format: bytes
description: >
Used memory.

- name: used.rss
Expand Down
2 changes: 1 addition & 1 deletion libbeat/kibana/testdata/extensive/metricbeat-6.json

Large diffs are not rendered by default.

52 changes: 52 additions & 0 deletions libbeat/mapping/field.go
Original file line number Diff line number Diff line change
Expand Up @@ -110,6 +110,9 @@ func (d *DynamicType) Unpack(s string) error {

// Validate ensures objectTypeParams are not mixed with top level objectType configuration
func (f *Field) Validate() error {
if err := f.validateType(); err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about returning here error trying to check 'type' field in 'fields.yml' fike '%s'. Check the 'fields.yml' in the module you are modifying for an incorrect or unknonwn 'type' value

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I have reordered error wrapping a little bit, this is what is printed now:

Generated global fields.yml file for metricbeat is invalid: incorrect type configuration for field 'max.bytes': unexpected formatter for number type, expected one of: string, url, bytes, duration, number, percent, color accessing '52.fields.0.fields.0.fields.1.fields.2.fields.0' (source:'fields.yml')
exit status 3

This is not ideal, but I haven't found a way to get more info from the underlying parsing library. In any case it is much better than what we have now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree. It's much better. Sorry for asking you a bit more 😅 maybe it's worth to just add (source: 'beats/metricbeat/module/{your-module}/_meta/fields.yml') or something similar. A new contributor won't know where the fields.yml is gonna be located.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the message is generated by the library, we are a bit limitated by that.

Also take into account that code in this PR only checks the global fields.yml, fields.yml files in modules and metricsets are not valid YAML by design, so they are not so easy to parse and check (and we don't). They are concatenated to create the global fields.yml and only this should be a valid YAML, but we are actually not even checking that, and this is not detected at all unless something mysteriously breaks in some integration test.

In summary, yes, at the moment it is not possible to easily know the module of the fields causing the problems, but hopefully a PR will only touch a limited number of fields, so it shouldn't be so hard to spot.

}
if len(f.ObjectTypeParams) == 0 {
return nil
sayden marked this conversation as resolved.
Show resolved Hide resolved
}
Expand All @@ -119,6 +122,55 @@ func (f *Field) Validate() error {
return nil
}

func (f *Field) validateType() error {
switch strings.ToLower(f.Type) {
case "text", "keyword":
return errors.Wrapf(stringType.validate(f.Format), "field %s", f.Name)
case "long", "integer", "short", "byte", "double", "float", "half_float", "scaled_float":
kaiyan-sheng marked this conversation as resolved.
Show resolved Hide resolved
return errors.Wrapf(numberType.validate(f.Format), "field %s", f.Name)
case "date", "date_nanos":
return errors.Wrapf(dateType.validate(f.Format), "field %s", f.Name)
case "geo_point":
return errors.Wrapf(geoPointType.validate(f.Format), "field %s", f.Name)
case "boolean", "binary", "ip", "alias", "array":
if f.Format != "" {
return fmt.Errorf("no format expected for field %s, found: %s", f.Name, f.Format)
}
case "object", "group", "nested":
// No check for them yet
case "":
// Module keys, not used as fields
default:
// There are more types, not being used by beats, to be added if needed
return fmt.Errorf("unexpected type '%s' for field '%s'", f.Type, f.Name)
}
return nil
}

type fieldTypeGroup struct {
name string
formats []string
}

var (
stringType = fieldTypeGroup{"string", []string{"string", "url"}}
kaiyan-sheng marked this conversation as resolved.
Show resolved Hide resolved
numberType = fieldTypeGroup{"number", []string{"string", "url", "bytes", "duration", "number", "percent", "color"}}
sayden marked this conversation as resolved.
Show resolved Hide resolved
dateType = fieldTypeGroup{"date", []string{"string", "url", "date"}}
geoPointType = fieldTypeGroup{"geo_point", []string{"geo_point"}}
)

func (g *fieldTypeGroup) validate(format string) error {
if format == "" {
return nil
}
for _, expected := range g.formats {
if expected == format {
return nil
}
}
return fmt.Errorf("unexpected format for %s type, expected one of: %s", g.name, strings.Join(g.formats, ", "))
}

func LoadFieldsYaml(path string) (Fields, error) {
var keys []Field

Expand Down
7 changes: 7 additions & 0 deletions libbeat/scripts/cmd/global_fields/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ import (
"path/filepath"

"github.com/elastic/beats/libbeat/generator/fields"
"github.com/elastic/beats/libbeat/mapping"
)

func main() {
Expand Down Expand Up @@ -95,6 +96,12 @@ func main() {
os.Exit(3)
}

_, err = mapping.LoadFieldsYaml(output)
if err != nil {
fmt.Fprintf(os.Stderr, "Generated global fields.yml file for %s is invalid: %+v\n", name, err)
os.Exit(3)
}

outputPath, _ := filepath.Abs(output)
if err != nil {
outputPath = output
Expand Down
41 changes: 19 additions & 22 deletions metricbeat/docs/fields.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -4791,7 +4791,7 @@ Percentage of time in kernel space.

type: scaled_float

format: percentage
format: percent

--

Expand All @@ -4813,7 +4813,7 @@ Percentage of total CPU time in the system.

type: scaled_float

format: percentage
format: percent

--

Expand All @@ -4835,7 +4835,7 @@ Percentage of time in user space.

type: scaled_float

format: percentage
format: percent

--

Expand All @@ -4857,7 +4857,7 @@ Total CPU usage.

type: scaled_float

format: percentage
format: percent

--

Expand All @@ -4869,7 +4869,7 @@ Percentage of CPU time in this core.

type: object

format: percentage
format: percent

--

Expand Down Expand Up @@ -5479,7 +5479,7 @@ Memory resident set size percentage.

type: scaled_float

format: percentage
format: percent

--

Expand Down Expand Up @@ -5510,7 +5510,7 @@ Memory usage percentage.

type: scaled_float

format: percentage
format: percent

--

Expand Down Expand Up @@ -12003,7 +12003,7 @@ Current throttle percentage for the server when slowstart is active, or no value

type: scaled_float

format: percentage
format: percent

--

Expand Down Expand Up @@ -14129,7 +14129,7 @@ CPU usage as a percentage of the total node allocatable CPU

type: scaled_float

format: percentage
format: percent

--

Expand All @@ -14141,7 +14141,7 @@ CPU usage as a percentage of the defined limit for the container (or total node

type: scaled_float

format: percentage
format: percent

--

Expand Down Expand Up @@ -14257,7 +14257,7 @@ Memory usage as a percentage of the total node allocatable memory

type: scaled_float

format: percentage
format: percent

--

Expand All @@ -14269,7 +14269,7 @@ Memory usage as a percentage of the defined limit for the container (or total no

type: scaled_float

format: percentage
format: percent

--

Expand Down Expand Up @@ -15246,7 +15246,7 @@ CPU usage as a percentage of the total node CPU

type: scaled_float

format: percentage
format: percent

--

Expand All @@ -15258,7 +15258,7 @@ CPU usage as a percentage of the defined limit for the pod containers (or total

type: scaled_float

format: percentage
format: percent

--

Expand All @@ -15284,7 +15284,7 @@ Memory usage as a percentage of the total node allocatable memory

type: scaled_float

format: percentage
format: percent

--

Expand All @@ -15296,7 +15296,7 @@ Memory usage as a percentage of the defined limit for the pod containers (or tot

type: scaled_float

format: percentage
format: percent

--

Expand Down Expand Up @@ -22945,8 +22945,6 @@ The date and time FPM has started.

type: date

format: epoch_second

--

[float]
Expand Down Expand Up @@ -22986,8 +22984,6 @@ The date and time the process has started

type: date

format: epoch_second

--

*`php_fpm.process.start_since`*::
Expand Down Expand Up @@ -24676,7 +24672,7 @@ Fraction of the time (between 0.0 and 1.0) that the queue is able to immediately

type: long

format: percentage
format: percent

--

Expand Down Expand Up @@ -24950,11 +24946,12 @@ Redis memory stats.
*`redis.info.memory.used.value`*::
+
--
Total number of bytes allocated by Redis.


type: long

format: bytes Total number of bytes allocated by Redis.
format: bytes

--

Expand Down
Loading