[FR] Add keep metadata check to esql schema test #5441
base: main
Enhancement - Guidelines
These guidelines serve as a reminder set of considerations when adding a new schema feature to the code.
Documentation and Context
Code Standards and Practices
Testing
Additional Schema Related Checks
If #5433 merges and we are fine with Updated: Complete
Mikaayenson
left a comment
Maybe clarify in docs/tests that this is specifically about ensuring those metadata fields are in the output row (i.e., in `keep`), not just in `METADATA`.
# Match | followed by optional whitespace/newlines and then 'keep'
keep_pattern = re.compile(r"\|\s*keep\b", re.IGNORECASE | re.DOTALL)
if not keep_pattern.search(query_lower):
keep_pattern = re.compile(r"\|\s*keep\b\s+([^\|]+)", re.IGNORECASE | re.DOTALL)
How does this work with things like keep Esql.* / keep aws.*?
This pattern grabs all of the kept fields as a single string. That string is parsed later into individual fields, which are checked for the appropriate metadata or the presence of a `*`. Cases like `keep Esql.*` and `keep aws.*` are treated as part of the keep string.
For example, consider a rule with a query like this (also see the example in our unit test):
from logs-endpoint.events.process-*, logs-windows.sysmon_operational-*, logs-system.security-*, logs-windows.*, winlogbeat-*, logs-crowdstrike.fdr*, logs-m365_defender.event-* METADATA _id, _version, _index
| where
@timestamp > now() - 8 hours and
event.category == "process" and
event.type == "start" and
process.name == "rundll32.exe" and
process.command_line like "*DavSetCookie*"
| keep Esql.*, aws.*, event.*, host.*, process.*, user.*, *
This query will have a `keep_pattern.search(query_lower).group(1)` of `'esql.*, aws.*, event.*, host.*, process.*, user.*, *\n'`, which is how we are using the `keep_pattern`.
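To make the parsing step concrete, here is a minimal sketch of how the captured keep string could be split into fields and checked. Only the regex and the metadata field names come from this PR. The helper names `kept_fields` and `missing_metadata`, the comma-splitting logic, and the shortened query are assumptions for illustration and may not match the actual test code. The sketch also ignores the exemption for aggregate queries.

```python
import re

# Pattern quoted in the review thread: capture everything after `| keep`
# up to the next pipe (or the end of the query).
keep_pattern = re.compile(r"\|\s*keep\b\s+([^\|]+)", re.IGNORECASE | re.DOTALL)

# Metadata fields named in the PR description.
REQUIRED = ("_id", "_version", "_index")


def kept_fields(query):
    """Hypothetical helper: return the lowercased fields listed after `| keep`."""
    match = keep_pattern.search(query.lower())
    if match is None:
        return []  # no keep clause found; the caller decides how to treat this
    return [f.strip() for f in match.group(1).split(",") if f.strip()]


def missing_metadata(fields):
    """Hypothetical helper: return required metadata fields not covered by `fields`."""
    if "*" in fields:  # assumption: a bare wildcard keeps every field
        return []
    return [name for name in REQUIRED if name not in fields]


# Shortened version of the query from the comment above.
query = (
    "from logs-* METADATA _id, _version, _index\n"
    '| where event.type == "start"\n'
    "| keep Esql.*, aws.*, *\n"
)

fields = kept_fields(query)
print(fields)                    # ['esql.*', 'aws.*', '*']
print(missing_metadata(fields))  # []
```

Prefix wildcards such as `Esql.*` or `aws.*` pass through as ordinary entries in the list. In this sketch only the bare `*` (or the explicit field names) satisfies the check.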
Co-authored-by: Mika Ayenson, PhD <Mikaayenson@users.noreply.github.com>
Co-authored-by: Jonhnathan <26856693+w0rk3r@users.noreply.github.com>
Pull Request
Issue link(s):
Resolves #5440
Related to #5439
Summary - What I changed
I added a schema test to enforce adding metadata fields to the `keep` portion of non-aggregate ES|QL queries.
Note unit tests will continue to fail until the rules are updated to match the new `keep` requirement.
How To Test
Use view rule on an ES|QL rule that does not have the metadata fields `_id, _version, _index` in its `keep` line.
E.g.
Checklist
bug, enhancement, schema, maintenance, Rule: New, Rule: Deprecation, Rule: Tuning, Hunt: New, or Hunt: Tuning so guidelines can be generated
meta:rapid-merge label if planning to merge within 24 hours
Contributor checklist