This repository was archived by the owner on Jan 19, 2025. It is now read-only.
Required vs. optional parameters #415
Closed
Description
- Some parameters like `x` and `y` in `fit`/`predict` are always required.
- For the rest, compute the entropy of the distribution of values and compare it to a uniform distribution.
Example:
- Parameter has values `1` (4 times), `2` (twice), `3` (once), and `4` (once). Then it has distribution `P = {4/8, 2/8, 1/8, 1/8}` and entropy `H(P) = 1/2*log2(2) + 1/4*log2(4) + 2 * 1/8*log2(8) = 1/2 + 1/2 + 3/4 = 1.75`.
- The uniform distribution over `n` values has entropy `H(U_n) = log2(n)`, so here this entropy is `log2(4) = 2`.
- We can now check how similar the distribution of values is to the uniform distribution (see the sketch after this list):
  - Option 1 - Kullback-Leibler divergence: `H(U_n) - H(P) = 2 - 1.75 = 0.25`. Lower values mean the distribution is more similar to a uniform distribution.
  - Option 2 - Normed entropy: `H(P) / H(U_n) = 1.75/2 = 0.875`. Higher values mean the distribution is more similar to a uniform distribution.
- We define thresholds to determine whether a parameter should be optional or required. For Option 1, parameters whose divergence is below the threshold are made required, while those at or above it are made optional. For Option 2, parameters whose normed entropy is above the threshold are made required, while those at or below it are made optional.
- If the parameter is optional, use the most commonly used value as the default (can differ from the previous default).
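
A minimal Python sketch of the whole procedure, assuming we already have the list of observed values for a parameter (the helper names and the `0.9` threshold are illustrative, not part of the proposal):

```python
from collections import Counter
from math import log2

def entropy(counts):
    """Shannon entropy (base 2) of the value distribution given by counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts)

def normed_entropy(counts):
    """Option 2: H(P) / H(U_n), where n is the number of distinct values."""
    n = len(counts)
    if n <= 1:
        return 0.0  # a single observed value is maximally non-uniform
    return entropy(counts) / log2(n)

# Worked example from above: 1 (4 times), 2 (twice), 3 (once), 4 (once).
usages = Counter([1, 1, 1, 1, 2, 2, 3, 4])
counts = list(usages.values())

h = entropy(counts)                 # 1.75
kl = log2(len(counts)) - h          # Option 1: 2 - 1.75 = 0.25
ne = normed_entropy(counts)         # Option 2: 1.75 / 2  = 0.875

THRESHOLD = 0.9  # illustrative Option 2 threshold; see the note below
if ne > THRESHOLD:
    print("required (values are close to uniformly distributed)")
else:
    default, _ = usages.most_common(1)[0]
    print(f"optional with default {default}")  # -> optional with default 1
```

With the Option 2 threshold set to `0.9`, the example parameter ends up optional with default `1`, the most commonly used value.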
Note: We need to check whether we can find a threshold that always fits regardless of the number of values, or whether the threshold has to be a function of the number of values the parameter has.
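
One quick way to probe that question: compute the normed entropy of the same usage shape (one dominant value, everything else seen once) for increasing `n`. This is a hypothetical exploration, not a measurement on real data:

```python
from math import log2

def normed_entropy(counts):
    total = sum(counts)
    h = -sum(c / total * log2(c / total) for c in counts)
    return h / log2(len(counts))

# Same shape at every n: one value seen 4 times, the other n - 1 values once.
for n in range(2, 9):
    print(f"n={n}: normed entropy = {normed_entropy([4] + [1] * (n - 1)):.3f}")
```

The score climbs from roughly 0.72 at `n=2` to roughly 0.91 at `n=8` for an identical usage pattern, which suggests a fixed threshold would classify the same pattern differently depending on `n`.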