DRILL-5504: Vector validator to diagnose offset vector issues#832
DRILL-5504: Vector validator to diagnose offset vector issues#832paul-rogers wants to merge 1 commit intoapache:masterfrom
Conversation
| * each batch passed to each iterator. | ||
| */ | ||
| String ENABLE_VECTOR_VALIDATION = "debug.validate_vectors"; | ||
| BooleanValidator ENABLE_VECTOR_VALIDATOR = new BooleanValidator(ENABLE_VECTOR_VALIDATION, true); |
There was a problem hiding this comment.
false, by default, here and below.
There was a problem hiding this comment.
Good catch. Fixed.
But, that error actually accidentally caught a bug...
| if (! enableValidation) { | ||
| enableValidation = context.getOptionSet().getOption(ExecConstants.ENABLE_ITERATOR_VALIDATOR); | ||
| } | ||
| if (enableValidation) { |
There was a problem hiding this comment.
if (AssertionUtil.isAssertionsEnabled() ||
context.getOptionSet().getOption(ExecConstants.ENABLE_ITERATOR_VALIDATOR) { ... }
| private void validateWrapper(VectorWrapper<? extends ValueVector> w) { | ||
| if (w instanceof SimpleVectorWrapper) { | ||
| validateVector(w.getValueVector()); | ||
| } |
There was a problem hiding this comment.
You mentioned above that HyperVectorWrapper is not validated. Can you open a ticket for the functionality to-be-implemented in this validator?
There was a problem hiding this comment.
Done. See DRILL-5526.
| } | ||
| } | ||
|
|
||
| public void validate() { |
There was a problem hiding this comment.
Just a thought. Is there a way to enable these checks (and fail if invalid) for pre-commit tests as well?
There was a problem hiding this comment.
Great idea! Added a config option that forces vector validation. Add the following to the pom.xml file in the Surefire options:
{code}
-Ddrill.exec.debug.validate_vectors=true
{code}
Will try this out and enable the checks as a different JIRA ticket and PR.
|
+1 Please squash the commits. |
|
Commits squashed. |
Validates offset vectors in VarChar and repeated vectors. Validates the special case of repeated VarChar vectors (two layers of offsets.) Provides two new session variables to turn on validation. One enables the existing operator (iterator) validation, the other adds vector validation. This allows validation to occur in a “production” Drill (without restarting Drill with assertions, as previously required.) Unit tests validate the validator. Another test validates the integration, but requires manual steps, so is ignored by default. This version is first-cut: all work is done within a single class. Allows back-porting to an earlier version to solve a specific issues. A revision should move some of the work into generated code (or refactor vectors to allow outside access), since offset vectors appear for each subclass; not on a base class that would allow generic operations. * Added boot-time options to allow enabling vector validation in Maven unit tests. * Code cleanup per suggestions. * Additional (manual) tests for boot-time options and default options.
|
Fixed typo in log message and rebased onto latest master. |
Validates offset vectors in VarChar and repeated vectors. Validates the special case of repeated VarChar vectors (two layers of offsets.) Provides two new session variables to turn on validation. One enables the existing operator (iterator) validation, the other adds vector validation. This allows validation to occur in a “production” Drill (without restarting Drill with assertions, as previously required.) Unit tests validate the validator. Another test validates the integration, but requires manual steps, so is ignored by default. This version is first-cut: all work is done within a single class. Allows back-porting to an earlier version to solve a specific issues. A revision should move some of the work into generated code (or refactor vectors to allow outside access), since offset vectors appear for each subclass; not on a base class that would allow generic operations. * Added boot-time options to allow enabling vector validation in Maven unit tests. * Code cleanup per suggestions. * Additional (manual) tests for boot-time options and default options. closes apache#832
Validates offset vectors in VarChar and repeated vectors. Validates the
special case of repeated VarChar vectors (two layers of offsets.)
Provides two new session variables to turn on validation. One enables
the existing operator (iterator) validation, the other adds vector
validation. This allows validation to occur in a “production” Drill
(without restarting Drill with assertions, as previously required.)
Unit tests validate the validator. Another test validates the
integration, but requires manual steps, so is ignored by default.
This version is first-cut: all work is done within a single class.
Allows back-porting to an earlier version to solve a specific issues. A
revision should move some of the work into generated code (or refactor
vectors to allow outside access), since offset vectors appear for each
subclass; not on a base class that would allow generic operations.