Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dependencies: version check on sqlite C-language #6567

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

khsrali
Copy link
Contributor

@khsrali khsrali commented Sep 22, 2024

This PR provides a solution for an issue previously raised in discourse.

The issue:

Sofar, AiiDA doesn't enforce any specific version of sqlite (in C-language) at any stage. This has caused some confusing issues in the past.
AiiDA uses sqlite3 library in python, which is a wrapper around the sqlite3 (in C). The two have independent versions:

import sqlite3
sqlite3.version # return version of sqlite3 wrapper in python
sqlite3.sqlite_version # returns the version `sqlite3` in C --This is what we should care about

Moreover, one can have multiple versions of sqlite3(in C) on the same machine, and installing a newer version doesn't necessarily update the relevant path in python. One has to go through some pain to make it right (we may need to put that in the documentation).

Therefore, things are complicated from enforcing point of view.

Solution

Given the complicated nature of the problem, we don't enforce, but only check the version of sqlite3 (in C) wherever relevant.
These checks occur in three places:

  • during verdi status (only if the default profile is with sqlite storage backend)
  • creating a new profile with either sqlite_zip or sqlite_dos backend,
  • initializing a profile.

The check doesn't appear during AiiDA installation, as we want to keep installation as smooth as possible. Also, in my opinion, if one doesn't use sqlite has no need to deal with it's installation.

In discourse we discovered the minimum supported version is 3.35.0, the one that has RETURNING clause.

@khsrali khsrali requested a review from sphuber September 22, 2024 19:46
@khsrali khsrali changed the title Dependencies: enforce version control on sqlite Dependencies: version check on sqlite Sep 22, 2024
@khsrali khsrali changed the title Dependencies: version check on sqlite Dependencies: version check on sqlite C-language Sep 22, 2024
Copy link

codecov bot commented Sep 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.93%. Comparing base (c532b34) to head (0fa78dd).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6567      +/-   ##
==========================================
+ Coverage   77.92%   77.93%   +0.02%     
==========================================
  Files         563      563              
  Lines       41671    41687      +16     
==========================================
+ Hits        32467    32485      +18     
+ Misses       9204     9202       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @khsrali . Instead of putting the check in multiple places, I would simply put it in the constructor of the storage backend (the ones that use sqlite that is). You can see that it already performs a validation of the storage, which checks that the schema version of the storage matches that of the code. It was done on purpose in the constructor to guarantee an instance could never be created if it was not valid. It seems to me that this check would make sense there as well.

@khsrali
Copy link
Contributor Author

khsrali commented Oct 3, 2024

Thanks @sphuber, you are right, that actually makes sense...
yeah, I don't remember why I did it this way, will apply the changes and push again.

@khsrali
Copy link
Contributor Author

khsrali commented Oct 9, 2024

Thanks @sphuber ,
just having a look now again; the two methods initialise and create_profile are class methods.
So moving the validation to constructor results in not being triggered during verdi profile setup for e.g.

@sphuber
Copy link
Contributor

sphuber commented Oct 9, 2024

Thanks @sphuber , just having a look now again; the two methods initialise and create_profile are class methods. So moving the validation to constructor results in not being triggered during verdi profile setup for e.g.

The create_profile method on the SqliteZip is a bit of an odd-one. It is not actually used to create a profile in the verdi profile setup sense as it is not actually added to the Config and stored. It just creates a Profile in memory. In one case, it just uses this to read the version of the archive, and in all others, it used to construct the SqliteZipBackend instance (if you add the check to the constructor as I suggest, the message will end being displayed).

This is why I am advocating to add the check in the constructor. With your current solution, the warning is only printed when a profile is created, but not when it is loaded (you add a check explicitly in verdi status but that is only one of many places where a storage gets loaded.

Ideally, the warning is emitted when a profile is created and when it is loaded. So I propose the following:

  • Keep the check in the Storage.initialise classmethod
  • Add the check to the storage constructor (remove the explicit code in verdi status as now it will be done implicitly)
  • Remove it from SqliteZipBackend.create_profile

Thoughts?

@khsrali
Copy link
Contributor Author

khsrali commented Oct 10, 2024

Thank a lot, @sphuber
It all makes sense now, I updated accordingly.
Cheers

Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @khsrali . Implementation is all good now. Just some last few things on the tests.

):
"""Test the ``verdi profile setup`` command.
Same as `test_setup`, here we check the functionality to check sqlite versions, before setting up profiles.
Note that this test should be run before the `test_delete_storage` test.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was just to know if profile setup is failing, then I won't check why test_delete_storage is failing.
(Only because test_delete_storage creates profiles, also with the sqlite backend.)
I've the habit of putting them in such order, maybe it doesn't matter that much...

I took the comment away, as this is not a most just easier to debug this way..

tests/cmdline/commands/test_profile.py Outdated Show resolved Hide resolved
tests/cmdline/commands/test_status.py Outdated Show resolved Hide resolved
storage_backend = profile._attributes['storage']['backend']
# This should be True, only if `pytest -m 'presto'` is used.
# and that is essential to guarantee the funtionality of the code!
if storage_backend in ['core.sqlite_dos', 'core.sqlite_zip']:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this ever be the case though? If you really want to test this behavior in verdi status, I think you should force creating a profile with the relevant storage backend and then test it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this works as expected only in tests-presto. It's being triggered only via pytest -m 'presto'.

I've tried your suggestion to create a profile for each storage backend, but there is a strange problem there, which I don't understand.
Suppose I do:

@pytest.mark.parametrize('entry_point', ('core.sqlite_zip', 'core.sqlite_dos', 'core.psql_dos'))
def test_sqlite_version(run_cli_command, monkeypatch, entry_point, tmp_path, config_psql_dos):
    if entry_point == 'core.sqlite_zip':
        tmp_path = tmp_path / 'archive.aiida'
        create_archive([], filename=tmp_path)

    if entry_point == 'core.psql_dos':
        options = []
        for key, value in config_psql_dos().items():
            options.append(f'--{key.replace("_", "-")}')
            options.append(str(value))
    else:
        options = ['--filepath', str(tmp_path)]

    profile_name = f'temp-profile{entry_point}'
    options = [entry_point, '-n', '--profile-name', profile_name, '--email', 'email@host', '--set-as-default', *options]
    result = run_cli_command(cmd_profile.profile_setup, options, use_subprocess=True)
    assert f'Created new profile `{profile_name}`.' in result.output

    if entry_point in ['core.sqlite_dos', 'core.sqlite_zip']:
->        result = run_cli_command(cmd_status.verdi_status, use_subprocess=True, raises=True)

This last line returns wrong results for some other profile, even if I set-as-default explicitly:
AssertionError: ✔ version: AiiDA v2.6.2.post0
✔ config: /tmp/pytest-of-khosra_a/pytest-13/c048339d34c915b04f9e61ba1aa470df0/.aiida
✔ profile: f4366537365f007119dd90576b2ab04d
✔ storage: Storage for 'f4366537365f007119dd90576b2ab04d' [open] @ postgresql+psycopg://guest:***@localhost:54571/61eeacfc-85be-458a-93de-c198106f23c5 / DiskObjectStoreRepository: 43c774c5c42a44df98360169b9f08ea2 | /tmp/pytest-of-khosra_a/pytest-13/repository0/container
Warning: RabbitMQ v3.12.1 is not supported and will cause unexpected problems!
Warning: It can cause long-running workflows to crash and jobs to be submitted multiple times.
Warning: See https://github.com/aiidateam/aiida-core/wiki/RabbitMQ-version-to-use for details.
✔ broker: RabbitMQ v3.12.1 @ amqp://guest:guest@127.0.0.1:5672?heartbeat=600
⏺ daemon: The daemon is not running.

While if you do this:

(Pdb) import os
(Pdb) os.system("verdi status")

returns the correct one:

✔ version: AiiDA v2.6.2.post0
✔ config: /tmp/pytest-of-khosra_a/pytest-10/fc80b65e071f67ef50d89cc715645faa0/.aiida
✔ profile: temp-profilecore.sqlite_dos
✔ storage: SqliteDosStorage[/tmp/pytest-of-khosra_a/pytest-10/test_sqlite_version_core_sqlit0]: open,
Warning: RabbitMQ v3.12.1 is not supported and will cause unexpected problems!
Warning: It can cause long-running workflows to crash and jobs to be submitted multiple times.
Warning: See https://github.com/aiidateam/aiida-core/wiki/RabbitMQ-version-to-use for details.
✔ broker: RabbitMQ v3.12.1 @ amqp://guest:guest@127.0.0.1:5672?heartbeat=600
⏺ daemon: The daemon is not running.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite sure why it is not picking up the new profile. But anyway, this is not the way you probably want to go about this. You are now creating a profile in the test configuration. This can affect other tests. You probably want to create a new temporary isolated config, create the profile in that one, and just run the cli command in memory. See this test as an example of how to create the profile:

def test_create_profile(isolated_config, tmp_path, entry_point):

And then just run run_cli_command(use_subprocess=False). This should be a lot faster and should keep things nicely isolated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sphuber I follow your latest suggestion to make an isolated profile.
Unfortunately, the issue that I posted in my previous comment, still persist. And I have no clue why.
The profile that I make (either 'core.sqlite_dos' or 'core.sqlite_zip' ) are not picked up by run_cli_command(cmd_status.verdi_status, use_subprocess=False) when tests are run with psql database backend.

I don't know what to do with this...
Is it ok if we bound this test to run only when pytest is running with sqlite_dos storage backend?

@sphuber
Copy link
Contributor

sphuber commented Dec 16, 2024

@khsrali I made the change as I think it should be. Locally it works fine, but on CI it fails. This seems to be due to the --db-backend psql option that is used on CI. I think this may be a recent addition that I am not yet familiar with. I will try to take another look later to see if I can figure why that is causing the test to fail. Without that option, it works just fine.

@khsrali
Copy link
Contributor Author

khsrali commented Dec 17, 2024

Hi @sphuber , thanks a lot for being available for this PR.

Yes, this is the same error as I encountered when I applied your changes locally.
I think you don't see this issue locally, because now, pytest by default uses sqlite_dos as a default storage backend (similar to test-presto but with rbmq). On these tests, CI passes --see test-presto--, because run_cli_command() is able to pick up the current profile that you make.

However, somehow on pytest --db-backend psql (formally this was the default storage backend) run_cli_command() is not able to see the profile you make, and keeps reporting the "global" profile instead of the one you make.

PR: I had this issue even before PR #6625, just to clarify that, by no means this issue is not induced by that PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants