The function generate_response_vector_linear currently uses np.random.randn to generate random error terms from the standard normal distribution. This creates a problem for its pytest function: ideally we want to compare the actual value of the response vector with its theoretical value, but the random error term can in theory take any value. Although each error term falls within [-1.96, 1.96] with 95% probability, when the betas or the column values are small the error can still dominate the magnitude of the response vector. Raised in #19.
Question:
How do we test that the function generate_response_vector_linear works, given the problem stated above?
Current compromise:
I set a high tolerance for the sample response value.
Possible solutions:
Sidestep the problem and do not test by comparing values.
Set a random seed to fix the error terms.
Create an instance variable response_vector_w/o_error to store the response vector without the random error term.
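As a rough illustration of the second option, here is a minimal sketch of a seeded test. The function `response_with_error` is a hypothetical stand-in for generate_response_vector_linear (its real signature is not shown in this issue), and the sketch assumes the error terms are drawn with np.random.randn from NumPy's global random state.

```python
import numpy as np


def response_with_error(X, betas):
    # Hypothetical stand-in for generate_response_vector_linear:
    # linear predictor plus standard normal error terms.
    return X @ betas + np.random.randn(X.shape[0])


X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
betas = np.array([0.5, -0.25])

# Seed the global generator, then call the function under test.
np.random.seed(42)
actual = response_with_error(X, betas)

# Re-seed with the same value so the same error terms are drawn again,
# and rebuild the expected response by hand.
np.random.seed(42)
expected = X @ betas + np.random.randn(X.shape[0])

np.testing.assert_allclose(actual, expected)
```

This only works if the test draws exactly the same number of random values, in the same order, as the function does, so it ties the test to the function's internals.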
Some thoughts on tolerance from when I wrote tests for the generate_polynomial_vector function:
The tests fail mainly because of the error term, which follows the distribution $N(\mu, \sigma^2)$. In the unit test code, if we generate the response WITHOUT the error, then each element of the actual response has a 95% chance of falling into $[response - 1.96\sigma, response + 1.96\sigma]$, where $\sigma$ is the standard deviation of the error term (and assuming $\mu = 0$). This is probably a better way to approach the test than using $[(1 - tolerance) \cdot response, (1 + tolerance) \cdot response]$, because the width of the interval does not shrink when the response itself is small.
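The interval idea can be sketched as a coverage check: compute the response without error, then verify that roughly 95% of the observed values fall within 1.96 standard deviations of it. In this sketch `response_with_error`, the `sigma` parameter, and the 0.93 to 0.97 acceptance band are all assumptions for illustration, not part of the existing code.

```python
import numpy as np


def response_with_error(X, betas, sigma=1.0):
    # Hypothetical stand-in for the function under test:
    # linear predictor plus N(0, sigma^2) error terms.
    return X @ betas + sigma * np.random.randn(X.shape[0])


np.random.seed(0)  # seeded so the check is reproducible
n = 10_000
sigma = 1.0
X = np.random.rand(n, 2)
betas = np.array([0.01, 0.02])  # deliberately small coefficients

noiseless = X @ betas  # response without the error term
observed = response_with_error(X, betas, sigma)

# Fraction of observations inside the 95% interval around the noiseless response.
inside = np.abs(observed - noiseless) <= 1.96 * sigma
coverage = inside.mean()

# With n = 10,000 the coverage should sit very close to 0.95.
assert 0.93 <= coverage <= 0.97
```

Checking the coverage over many samples is more robust than asserting that every single element lies inside the interval, since with many elements a few are expected to fall outside it.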