Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RN-473: Add unit tests for data-lake-api #3882

Merged
merged 3 commits into from
May 2, 2022

Conversation

rohan-bes
Copy link
Collaborator

@rohan-bes rohan-bes commented Apr 21, 2022

Issue RN-473:

Added a small suite of unit tests for @tupaia/data-lake-api. The tests are quite similar to the @tupaia/data-api tests, however they also require spinning up a test database for the data-lake. We just use the same postgres server as the regular tupaia database for this (in local as well as CI/CD).

Also ran into an issue with codeship builds failing due to OOM after adding the new test suite. Discussed here the plan is just to break the test suite into 2 batches.

Merge tasks

  • Update local data-lake-api .env vars

command: './packages/devops/scripts/ci/testInternalDependencies.sh'
- name: Test web-frontend
command: './packages/devops/scripts/ci/testFrontend.sh web-frontend'
- type: parallel
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split the tests into 2 batches. Arguably I should've left the existing tests as batch 1 and just started a new batch, it might make it a bit more obvious where to extend the tests... but this is prettier 😝

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Yeah this is prettier!

DATA_LAKE_DB_USER=

DB_PG_USER=
DB_PG_PASSWORD=
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need the PG_USER to be able to create the data_lake_test database.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these test required only? If so can we add a comment like # only required for unit tests? Or rename these to UNIT_TEST_DB_PG_USER or something.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep test only. I'll keep the name as is as it's consistent with other packages, but I'll add a comment like you suggest 👍

@@ -15,8 +15,11 @@
"lint": "eslint . --ext .ts",
"lint:all": "yarn run lint \"src/**/*.{ts,jsx}\"",
"lint:fix": "yarn lint --fix",
"test": "jest --passWithNoTests",
"test:coverage": "yarn test --coverage"
"test": "yarn workspace @tupaia/data-lake-api check-test-data-lake-exists && DATA_LAKE_DB_NAME=data_lake_test jest --runInBand",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like we do for the tests that require the tupaia database, run a check to see if the data-lake database exists

"test:coverage": "yarn test --coverage"
"test": "yarn workspace @tupaia/data-lake-api check-test-data-lake-exists && DATA_LAKE_DB_NAME=data_lake_test jest --runInBand",
"test:coverage": "yarn test --coverage",
"update-test-data": "bash -c 'source .env && pg_dump -s -h data-lake-db.tupaia.org -U $DATA_LAKE_DB_USER -O $DATA_LAKE_DB_NAME -t analytics > src/__tests__/testData/testDataDump.sql'",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explicitly pull the schema off the data-lake-db.tupaia.org, rather than what's configured for DATA_LAKE_DB_URL. This is because we set DATA_LAKE_DB_URL to localhost for the tests, and we need a real database here to pull the schema. Should be fine to use as we're using a read-only user and it's just one table 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very handy!

Copy link
Contributor

@biaoli0 biaoli0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rohan-bes ! Always nice to have a set of unit tests to cover the code. I don't feel it will break anything for the devops changes, happy to give an approve. Just got a few comments and minor changes requested.

DATA_LAKE_DB_USER=

DB_PG_USER=
DB_PG_PASSWORD=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these test required only? If so can we add a comment like # only required for unit tests? Or rename these to UNIT_TEST_DB_PG_USER or something.

organisationUnitCodes: ['NZ_AK', 'NZ_WG'],
dataElementCodes: ['BCD1TEST', 'BCD325TEST', 1],
},
/dataElementCodes*/,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super cool to use regex here. Just one question, does this test case mean we it will throw an error contains dataElementCodes?

  • Saw comment // data element code is not a string in fetchEvents.test.ts in a similar test case, that helps to read. Suggest to add the same comment here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep exactly, its sort of just to make the tests a bit more resilient to changes in the error message. But I'll take a look at these and improve either the messaging or the comments around the tests 👍

Comment on lines 14 to 15
echo -e "Error: $DATA_LAKE_DB_NAME database does not exist!\n\nTo create it, please run:\nyarn workspace @tupaia/data-lake-api setup-test-data-lake\n"
exit 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo -e "Error: $DATA_LAKE_DB_NAME database does not exist!\n\nTo create it, please run:\nyarn workspace @tupaia/data-lake-api setup-test-data-lake\n"
exit 1
echo -e "Error: $DATA_LAKE_DB_NAME database does not exist!\n\nTo create it, please get .env secret from lastpass then run:\nyarn workspace @tupaia/data-lake-api setup-test-data-lake\n"
exit 1

^ A minor suggestion, that helps my first time setup in local.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good call!

Comment on lines 12 to 14
export function getTestDatabase() {
if (!readDatabase) {
readDatabase = new DataLakeDatabase();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
export function getTestDatabase() {
if (!readDatabase) {
readDatabase = new DataLakeDatabase();
export function getTestDatabase() {
if (!readDatabase) {
readDatabase = new DataLakeDatabase(getConnectionConfig());

Can we move getConnectionConfig() here? Just wanna make these functions be more consistent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing 👍

command: './packages/devops/scripts/ci/testInternalDependencies.sh'
- name: Test web-frontend
command: './packages/devops/scripts/ci/testFrontend.sh web-frontend'
- type: parallel
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Yeah this is prettier!

Xe0EbawQHSultl2ATorXCdSl5h/yWZie0B28nlSLBSl9NJ5VNFo0QTq146p/LjJagfL0+My5ZGTHfJ3vVf6E9+2eo6xn/z59xK+RtcD61w7uFup2Z5ocqTez5aXK6K2qZbyWmdSyY1cDyXR9gsQWtUU9rakh7SyQgxZRwez8ZfU2SdS+Iglpfc/77SNzYVGMnwJb27/h93Bb+qvXeOsgbG+J05/1FEoFD3Xn3KIZmC6nZFWHMc0SHKjVLkrrTXnxrGQvl7VocY7dWC4K80sH7hnGhVDpA2e3v5K9BqDQv7N0EgiGqvueuugId+l1rlTCKQ4+9LBK18V3cU9em6DG1/7dB9sReCJSa5LEPMStbPQQ7ubgx2G2vRIL3Pci/PWmyBkGDfNy417PrTMpdln+Wnmy96jQ0+N8yEo1cwmu/L2WUs2NTyNkLVibVlhL6q2YpwuNjTqNJwzPV/7pmXHPdSfybmW+l5gD5y53jZtnJa8VC4VsxOQS4Q9HUfry69nTVK59BkVTaRRR1/FmPVmIDUkxTp4crwG0Y6xU4+o073yYZ8TXWemeCY64jXeOUWECcbltKSZKqr8102Gbt/trwSo3Kh8bY2rZPYpp/otlk9/9HIQYAP6zjaX19owNXPtlpFGSkH1sXLCZcMrVT4q5MFZ6qv5A4fpeptWqog1ZrRy44JEGSuOsERrBDgfPJESKM6Aul5A+uwW5DuvJ60aVhNSs3OYApPcx1XdbXEMqup1IdTtfgsE0AK3WlLcHLmyyeNMcbBHGlzme4Gvbo0NQpwK3VXbfprB4AjOigFirV2ThprBAnPO+7cs6DYkgHPKBgfNs7pDrFSER5pWD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it change automatically? Like yarn.lock changes after running yarn?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we actually have to manually update this using jet encrypt. Read here for more info

@rohan-bes rohan-bes force-pushed the rn-473-add-unit-tests-data-lake branch from 5c6c0da to 63f6e74 Compare May 2, 2022 06:07
@rohan-bes rohan-bes merged commit 56b639f into dev May 2, 2022
@rohan-bes rohan-bes deleted the rn-473-add-unit-tests-data-lake branch May 2, 2022 07:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants