Skip to content

Make options declaration mandatory #237

@pounard

Description

@pounard

In all multiple column anonymizers that accept other options than column names, columns cannot be named with the same names as options, such as sample_size, source, etc...

In #234 I introduced this in validation:

        $columns = $this->options->get('columns', null, true);
        if (!\is_array($columns)) {
            throw new ConfigurationException("'columns' must be an array of string or null values.");
        }
        $invalidNames = ['source', 'columns', 'file_csv_enclosure', 'file_csv_escape', 'file_csv_separator', 'file_skip_header'];
        foreach ($columns as $index => $column) {
            if (\in_array($column, $invalidNames)) {
                throw new ConfigurationException(\sprintf("'columns' values cannot be one of ('%s') for column #%d.", \implode("', '", $invalidNames), $index));
            }
            if (!\is_string($column) && null !== $column) {
                throw new ConfigurationException(\sprintf("'columns' must be an array of string or null values (invalid type for column #%d.", $index));
            }
        }

This works, but I now have to repeat this a second, then a third time for new anonymizer implementations.

It's becoming hard to maintain:

  • copy-pasted code more than two times is wrong,
  • some other options can be set at the abstract anonymizer level (sample size, for example).

What I propose is to force anonymizers to declare their options, for example:

    protected function initialize(): void
    {
        $this->addOption(
            'sample_size',
            Option::TYPE_INT,
            "Some short description",
            null /* default value) */,
            true /* Is optional (default true) */,
            "Some long description..."
        );
        // ...
    }

Then, in AbstractAnonymizer, when validation happens, check the given Options array and validate automatically at least the mandatory and type state.

  • We can also generate documentation from this.
  • We also can get rid of long class PHP doc.
  • We can generate meaningful console commands that documents it as well.
  • We can automatize some validation in check command.
  • We can use this declared attribute list for the custom validation I presented on top of this issue: instead of iterating over a raw array, we could iterate over \array_keys($this->getDeclaredOptions()) for example.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions