Description
Description of the problem including expected versus actual behavior:
In ECS version 1.6, the process schema is reused on itself to create the parent section. If a custom schema is used to reuse process onto the custom schema, the parent fields are not included.
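To picture the expected versus actual shape, here is a toy sketch using plain dictionaries. The helper names (`base_schemas`, `self_nest`, `reuse_into`) are made up for illustration and are not part of the ECS tooling. Swapping the order of the two calls is only a convenient way to produce the two shapes, not a claim about how the generator works internally.

```python
import copy


def base_schemas():
    """Return a tiny stand-in for the loaded schemas (hypothetical layout)."""
    return {
        "process": {"fields": {"pid": {}, "test_base": {}}},
        "DoubleReuse": {"fields": {}},
    }


def self_nest(schemas, name, as_key):
    """Copy a schema's fields under itself, e.g. process -> process.parent."""
    snapshot = copy.deepcopy(schemas[name]["fields"])
    schemas[name]["fields"][as_key] = {"fields": snapshot}


def reuse_into(schemas, source, target):
    """Copy a schema's current fields under another schema."""
    schemas[target]["fields"][source] = {
        "fields": copy.deepcopy(schemas[source]["fields"])
    }


# Expected: the copy placed under DoubleReuse carries the parent section.
expected = base_schemas()
self_nest(expected, "process", "parent")
reuse_into(expected, "process", "DoubleReuse")

# Actual (as reported): the copy under DoubleReuse has no parent section.
actual = base_schemas()
reuse_into(actual, "process", "DoubleReuse")
self_nest(actual, "process", "parent")

expected_keys = sorted(expected["DoubleReuse"]["fields"]["process"]["fields"])
actual_keys = sorted(actual["DoubleReuse"]["fields"]["process"]["fields"])
print("expected:", expected_keys)
print("actual:  ", actual_keys)
```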
Steps to reproduce:
Create these files in a directory called test_schema_reuse:
custom_process.yml
---
- name: process
  title: Process
  group: 2
  short: These fields contain information about a process.
  description: >
    These fields contain information about a process.
    These fields can help you correlate metrics information with a process id/name
    from a log message. The `process.pid` often stays in the metric itself and is
    copied to the global field for correlation.
  reusable:
    top_level: true
    expected:
      - DoubleReuse
  type: group
  fields:
    - name: test_base
      level: custom
      type: keyword
      description: Object for all custom defined fields to live in.
custom_double_reuse.yml
---
- name: DoubleReuse
  title: DoubleReuse
  group: 2
  short: double reuse example.
  description: double reuse example
  type: group
  fields:
    - name: process
      level: custom
      type: object
      description: >
        Process.
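If it helps, the two files can also be written out with a short script. This is a sketch only: `write_repro_files` is a hypothetical helper, and the YAML indentation is my reconstruction following the usual ECS schema layout, so adjust it if your copies of the files differ.

```python
from pathlib import Path

# Contents of the two custom schema files from the steps above.
CUSTOM_PROCESS = """\
---
- name: process
  title: Process
  group: 2
  short: These fields contain information about a process.
  description: >
    These fields contain information about a process.
    These fields can help you correlate metrics information with a process id/name
    from a log message. The `process.pid` often stays in the metric itself and is
    copied to the global field for correlation.
  reusable:
    top_level: true
    expected:
      - DoubleReuse
  type: group
  fields:
    - name: test_base
      level: custom
      type: keyword
      description: Object for all custom defined fields to live in.
"""

CUSTOM_DOUBLE_REUSE = """\
---
- name: DoubleReuse
  title: DoubleReuse
  group: 2
  short: double reuse example.
  description: double reuse example
  type: group
  fields:
    - name: process
      level: custom
      type: object
      description: >
        Process.
"""


def write_repro_files(target_dir):
    """Create target_dir if needed and write both schema files into it."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    files = {
        "custom_process.yml": CUSTOM_PROCESS,
        "custom_double_reuse.yml": CUSTOM_DOUBLE_REUSE,
    }
    written = []
    for name, content in files.items():
        path = target / name
        path.write_text(content)
        written.append(path)
    return written
```

Calling `write_repro_files("test_schema_reuse")` from the directory of your choice produces the layout that the `--include` argument below expects.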
To make things a little easier you can short circuit the generator like so:
diff --git a/scripts/generator.py b/scripts/generator.py
index b7ae2a4..3b5140b 100644
--- a/scripts/generator.py
+++ b/scripts/generator.py
@@ -43,6 +43,9 @@ def main():
     fields = loader.load_schemas(ref=args.ref, included_files=args.include)
     cleaner.clean(fields)
     finalizer.finalize(fields)
+    ecs_helpers.yaml_dump('ecs.yml', fields)
+    import sys
+    sys.exit()
     fields = subset_filter.filter(fields, args.subset, out_dir)
     nested, flat = intermediate_files.generate(fields, os.path.join(out_dir, 'ecs'), default_dirs)
Run python scripts/generator.py --include <path to test_schema_reuse> --ref v1.6.0
Examine the output of ecs.yml:
DoubleReuse section of ecs.yml
DoubleReuse:
field_details:
dashed_name: DoubleReuse
description: double reuse example
flat_name: DoubleReuse
name: DoubleReuse
node_name: DoubleReuse
short: double reuse example.
type: group
fields:
process:
field_details:
dashed_name: DoubleReuse-process
description: 'These fields contain information about a process.
These fields can help you correlate metrics information with a process id/name
from a log message. The `process.pid` often stays in the metric itself
and is copied to the global field for correlation.'
flat_name: DoubleReuse.process
intermediate: true
name: process
node_name: process
original_fieldset: process
short: These fields contain information about a process.
type: group
fields:
args:
field_details:
dashed_name: DoubleReuse-process-args
description: 'Array of process arguments, starting with the absolute path
to the executable.
May be filtered to protect sensitive information.'
example:
- /usr/bin/ssh
- -l
- user
- 10.0.0.16
flat_name: DoubleReuse.process.args
ignore_above: 1024
level: extended
name: args
node_name: args
normalize:
- array
original_fieldset: process
short: Array of process arguments.
type: keyword
args_count:
field_details:
dashed_name: DoubleReuse-process-args-count
description: 'Length of the process.args array.
This field can be useful for querying or performing bucket analysis
on how many arguments were provided to start a process. More arguments
may be an indication of suspicious activity.'
example: 4
flat_name: DoubleReuse.process.args_count
level: extended
name: args_count
node_name: args_count
normalize: []
original_fieldset: process
short: Length of the process.args array.
type: long
code_signature:
field_details:
dashed_name: DoubleReuse-process-code-signature
description: These fields contain information about binary code signatures.
flat_name: DoubleReuse.process.code_signature
intermediate: true
name: code_signature
node_name: code_signature
original_fieldset: code_signature
short: These fields contain information about binary code signatures.
type: group
fields:
exists:
field_details:
dashed_name: DoubleReuse-process-code-signature-exists
description: Boolean to capture if a signature is present.
example: 'true'
flat_name: DoubleReuse.process.code_signature.exists
level: core
name: exists
node_name: exists
normalize: []
original_fieldset: code_signature
short: Boolean to capture if a signature is present.
type: boolean
status:
field_details:
dashed_name: DoubleReuse-process-code-signature-status
description: 'Additional information about the certificate status.
This is useful for logging cryptographic errors with the certificate
validity or trust status. Leave unpopulated if the validity or trust
of the certificate was unchecked.'
example: ERROR_UNTRUSTED_ROOT
flat_name: DoubleReuse.process.code_signature.status
ignore_above: 1024
level: extended
name: status
node_name: status
normalize: []
original_fieldset: code_signature
short: Additional information about the certificate status.
type: keyword
subject_name:
field_details:
dashed_name: DoubleReuse-process-code-signature-subject-name
description: Subject name of the code signer
example: Microsoft Corporation
flat_name: DoubleReuse.process.code_signature.subject_name
ignore_above: 1024
level: core
name: subject_name
node_name: subject_name
normalize: []
original_fieldset: code_signature
short: Subject name of the code signer
type: keyword
trusted:
field_details:
dashed_name: DoubleReuse-process-code-signature-trusted
description: 'Stores the trust status of the certificate chain.
Validating the trust of the certificate chain may be complicated,
and this field should only be populated by tools that actively check
the status.'
example: 'true'
flat_name: DoubleReuse.process.code_signature.trusted
level: extended
name: trusted
node_name: trusted
normalize: []
original_fieldset: code_signature
short: Stores the trust status of the certificate chain.
type: boolean
valid:
field_details:
dashed_name: DoubleReuse-process-code-signature-valid
description: 'Boolean to capture if the digital signature is verified
against the binary content.
Leave unpopulated if a certificate was unchecked.'
example: 'true'
flat_name: DoubleReuse.process.code_signature.valid
level: extended
name: valid
node_name: valid
normalize: []
original_fieldset: code_signature
short: Boolean to capture if the digital signature is verified against
the binary content.
type: boolean
command_line:
field_details:
dashed_name: DoubleReuse-process-command-line
description: 'Full command line that started the process, including the
absolute path to the executable, and all arguments.
Some arguments may be filtered to protect sensitive information.'
example: /usr/bin/ssh -l user 10.0.0.16
flat_name: DoubleReuse.process.command_line
ignore_above: 1024
level: extended
multi_fields:
- flat_name: DoubleReuse.process.command_line.text
name: text
norms: false
type: text
name: command_line
node_name: command_line
normalize: []
original_fieldset: process
short: Full command line that started the process.
type: keyword
entity_id:
field_details:
dashed_name: DoubleReuse-process-entity-id
description: 'Unique identifier for the process.
The implementation of this is specified by the data source, but some
examples of what could be used here are a process-generated UUID, Sysmon
Process GUIDs, or a hash of some uniquely identifying components of
a process.
Constructing a globally unique identifier is a common practice to mitigate
PID reuse as well as to identify a specific process over time, across
multiple monitored hosts.'
example: c2c455d9f99375d
flat_name: DoubleReuse.process.entity_id
ignore_above: 1024
level: extended
name: entity_id
node_name: entity_id
normalize: []
original_fieldset: process
short: Unique identifier for the process.
type: keyword
executable:
field_details:
dashed_name: DoubleReuse-process-executable
description: Absolute path to the process executable.
example: /usr/bin/ssh
flat_name: DoubleReuse.process.executable
ignore_above: 1024
level: extended
multi_fields:
- flat_name: DoubleReuse.process.executable.text
name: text
norms: false
type: text
name: executable
node_name: executable
normalize: []
original_fieldset: process
short: Absolute path to the process executable.
type: keyword
exit_code:
field_details:
dashed_name: DoubleReuse-process-exit-code
description: 'The exit code of the process, if this is a termination event.
The field should be absent if there is no exit code for the event (e.g.
process start).'
example: 137
flat_name: DoubleReuse.process.exit_code
level: extended
name: exit_code
node_name: exit_code
normalize: []
original_fieldset: process
short: The exit code of the process.
type: long
hash:
field_details:
dashed_name: DoubleReuse-process-hash
description: 'The hash fields represent different hash algorithms and
their values.
Field names for common hashes (e.g. MD5, SHA1) are predefined. Add fields
for other hashes by lowercasing the hash algorithm name and using underscore
separators as appropriate (snake case, e.g. sha3_512).'
flat_name: DoubleReuse.process.hash
intermediate: true
name: hash
node_name: hash
original_fieldset: hash
short: Hashes, usually file hashes.
type: group
fields:
md5:
field_details:
dashed_name: DoubleReuse-process-hash-md5
description: MD5 hash.
flat_name: DoubleReuse.process.hash.md5
ignore_above: 1024
level: extended
name: md5
node_name: md5
normalize: []
original_fieldset: hash
short: MD5 hash.
type: keyword
sha1:
field_details:
dashed_name: DoubleReuse-process-hash-sha1
description: SHA1 hash.
flat_name: DoubleReuse.process.hash.sha1
ignore_above: 1024
level: extended
name: sha1
node_name: sha1
normalize: []
original_fieldset: hash
short: SHA1 hash.
type: keyword
sha256:
field_details:
dashed_name: DoubleReuse-process-hash-sha256
description: SHA256 hash.
flat_name: DoubleReuse.process.hash.sha256
ignore_above: 1024
level: extended
name: sha256
node_name: sha256
normalize: []
original_fieldset: hash
short: SHA256 hash.
type: keyword
sha512:
field_details:
dashed_name: DoubleReuse-process-hash-sha512
description: SHA512 hash.
flat_name: DoubleReuse.process.hash.sha512
ignore_above: 1024
level: extended
name: sha512
node_name: sha512
normalize: []
original_fieldset: hash
short: SHA512 hash.
type: keyword
name:
field_details:
dashed_name: DoubleReuse-process-name
description: 'Process name.
Sometimes called program name or similar.'
example: ssh
flat_name: DoubleReuse.process.name
ignore_above: 1024
level: extended
multi_fields:
- flat_name: DoubleReuse.process.name.text
name: text
norms: false
type: text
name: name
node_name: name
normalize: []
original_fieldset: process
short: Process name.
type: keyword
pe:
field_details:
dashed_name: DoubleReuse-process-pe
description: These fields contain Windows Portable Executable (PE) metadata.
flat_name: DoubleReuse.process.pe
intermediate: true
name: pe
node_name: pe
original_fieldset: pe
short: These fields contain Windows Portable Executable (PE) metadata.
type: group
fields:
architecture:
field_details:
dashed_name: DoubleReuse-process-pe-architecture
description: CPU architecture target for the file.
example: x64
flat_name: DoubleReuse.process.pe.architecture
ignore_above: 1024
level: extended
name: architecture
node_name: architecture
normalize: []
original_fieldset: pe
short: CPU architecture target for the file.
type: keyword
company:
field_details:
dashed_name: DoubleReuse-process-pe-company
description: Internal company name of the file, provided at compile-time.
example: Microsoft Corporation
flat_name: DoubleReuse.process.pe.company
ignore_above: 1024
level: extended
name: company
node_name: company
normalize: []
original_fieldset: pe
short: Internal company name of the file, provided at compile-time.
type: keyword
description:
field_details:
dashed_name: DoubleReuse-process-pe-description
description: Internal description of the file, provided at compile-time.
example: Paint
flat_name: DoubleReuse.process.pe.description
ignore_above: 1024
level: extended
name: description
node_name: description
normalize: []
original_fieldset: pe
short: Internal description of the file, provided at compile-time.
type: keyword
file_version:
field_details:
dashed_name: DoubleReuse-process-pe-file-version
description: Internal version of the file, provided at compile-time.
example: 6.3.9600.17415
flat_name: DoubleReuse.process.pe.file_version
ignore_above: 1024
level: extended
name: file_version
node_name: file_version
normalize: []
original_fieldset: pe
short: Process name.
type: keyword
imphash:
field_details:
dashed_name: DoubleReuse-process-pe-imphash
description: 'A hash of the imports in a PE file. An imphash -- or
import hash -- can be used to fingerprint binaries even after recompilation
or other code-level transformations have occurred, which would change
more traditional hash values.
Learn more at https://www.fireeye.com/blog/threat-research/2014/01/tracking-malware-import-hashing.html.'
example: 0c6803c4e922103c4dca5963aad36ddf
flat_name: DoubleReuse.process.pe.imphash
ignore_above: 1024
level: extended
name: imphash
node_name: imphash
normalize: []
original_fieldset: pe
short: A hash of the imports in a PE file.
type: keyword
original_file_name:
field_details:
dashed_name: DoubleReuse-process-pe-original-file-name
description: Internal name of the file, provided at compile-time.
example: MSPAINT.EXE
flat_name: DoubleReuse.process.pe.original_file_name
ignore_above: 1024
level: extended
name: original_file_name
node_name: original_file_name
normalize: []
original_fieldset: pe
short: Internal name of the file, provided at compile-time.
type: keyword
product:
field_details:
dashed_name: DoubleReuse-process-pe-product
description: Internal product name of the file, provided at compile-time.
example: "Microsoft\xAE Windows\xAE Operating System"
flat_name: DoubleReuse.process.pe.product
ignore_above: 1024
level: extended
name: product
node_name: product
normalize: []
original_fieldset: pe
short: Internal product name of the file, provided at compile-time.
type: keyword
pgid:
field_details:
dashed_name: DoubleReuse-process-pgid
description: Identifier of the group of processes the process belongs
to.
flat_name: DoubleReuse.process.pgid
format: string
level: extended
name: pgid
node_name: pgid
normalize: []
original_fieldset: process
short: Identifier of the group of processes the process belongs to.
type: long
pid:
field_details:
dashed_name: DoubleReuse-process-pid
description: Process id.
example: 4242
flat_name: DoubleReuse.process.pid
format: string
level: core
name: pid
node_name: pid
normalize: []
original_fieldset: process
short: Process id.
type: long
ppid:
field_details:
dashed_name: DoubleReuse-process-ppid
description: Parent process' pid.
example: 4241
flat_name: DoubleReuse.process.ppid
format: string
level: extended
name: ppid
node_name: ppid
normalize: []
original_fieldset: process
short: Parent process' pid.
type: long
start:
field_details:
dashed_name: DoubleReuse-process-start
description: The time the process started.
example: '2016-05-23T08:05:34.853Z'
flat_name: DoubleReuse.process.start
level: extended
name: start
node_name: start
normalize: []
original_fieldset: process
short: The time the process started.
type: date
test_base:
field_details:
dashed_name: DoubleReuse-process-test-base
description: Object for all custom defined fields to live in.
flat_name: DoubleReuse.process.test_base
ignore_above: 1024
level: custom
name: test_base
node_name: test_base
normalize: []
original_fieldset: process
short: Object for all custom defined fields to live in.
type: keyword
thread:
field_details:
dashed_name: DoubleReuse-process-thread
flat_name: DoubleReuse.process.thread
intermediate: true
name: thread
node_name: thread
original_fieldset: process
type: object
fields:
id:
field_details:
dashed_name: DoubleReuse-process-thread-id
description: Thread ID.
example: 4242
flat_name: DoubleReuse.process.thread.id
format: string
level: extended
name: thread.id
node_name: id
normalize: []
original_fieldset: process
short: Thread ID.
type: long
name:
field_details:
dashed_name: DoubleReuse-process-thread-name
description: Thread name.
example: thread-0
flat_name: DoubleReuse.process.thread.name
ignore_above: 1024
level: extended
name: thread.name
node_name: name
normalize: []
original_fieldset: process
short: Thread name.
type: keyword
title:
field_details:
dashed_name: DoubleReuse-process-title
description: 'Process title.
The proctitle, some times the same as process name. Can also be different:
for example a browser setting its title to the web page currently opened.'
flat_name: DoubleReuse.process.title
ignore_above: 1024
level: extended
multi_fields:
- flat_name: DoubleReuse.process.title.text
name: text
norms: false
type: text
name: title
node_name: title
normalize: []
original_fieldset: process
short: Process title.
type: keyword
uptime:
field_details:
dashed_name: DoubleReuse-process-uptime
description: Seconds the process has been up.
example: 1325
flat_name: DoubleReuse.process.uptime
level: extended
name: uptime
node_name: uptime
normalize: []
original_fieldset: process
short: Seconds the process has been up.
type: long
working_directory:
field_details:
dashed_name: DoubleReuse-process-working-directory
description: The working directory of the process.
example: /home/alice
flat_name: DoubleReuse.process.working_directory
ignore_above: 1024
level: extended
multi_fields:
- flat_name: DoubleReuse.process.working_directory.text
name: text
norms: false
type: text
name: working_directory
node_name: working_directory
normalize: []
original_fieldset: process
short: The working directory of the process.
type: keyword
schema_details:
group: 2
nestings:
- DoubleReuse.process
prefix: DoubleReuse.
reused_here:
- full: DoubleReuse.process
schema_name: process
short: These fields contain information about a process.
root: false
title: DoubleReuse
Notice that the name DoubleReuse-process-parent does not exist in the ecs.yml file. The initial field DoubleReuse-process-test-base does, though.
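Rather than scanning the dump by eye, the check can be scripted. The sketch below walks the nested `field_details` / `fields` structure shown in the dump and collects every `dashed_name`. The helper names are hypothetical. `load_dump` assumes PyYAML is installed (I believe the ECS scripts already depend on it), and the `sample` dictionary is a hand-trimmed stand-in for the real file.

```python
def collect_dashed_names(fields):
    """Recursively gather every dashed_name in a nested fields mapping."""
    names = set()
    for node in fields.values():
        details = node.get("field_details", {})
        if "dashed_name" in details:
            names.add(details["dashed_name"])
        # Descend into child fields, if this node has any.
        names |= collect_dashed_names(node.get("fields", {}))
    return names


def load_dump(path="ecs.yml"):
    """Load the dumped file; requires PyYAML (assumed to be installed)."""
    import yaml

    with open(path) as handle:
        return yaml.safe_load(handle)


# Hand-trimmed stand-in for the dump, kept to two levels for brevity.
sample = {
    "DoubleReuse": {
        "field_details": {"dashed_name": "DoubleReuse"},
        "fields": {
            "process": {
                "field_details": {"dashed_name": "DoubleReuse-process"},
                "fields": {
                    "test_base": {
                        "field_details": {
                            "dashed_name": "DoubleReuse-process-test-base"
                        }
                    },
                },
            },
        },
    },
}

names = collect_dashed_names(sample)
print("test-base present:", "DoubleReuse-process-test-base" in names)
print("parent present:   ", "DoubleReuse-process-parent" in names)
```

Against the real file, `collect_dashed_names(load_dump("ecs.yml"))` should give the same answer for both names if the problem reproduces.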
The endpoint team leverages the ability to reuse process and the parent fields in custom schema for malware: https://github.com/elastic/endpoint-package/blob/master/custom_schemas/custom_process.yml#L15
This works for ECS version 1.5 because the parent fields were defined manually.