Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for Windows #29935

Closed
wants to merge 16 commits into from
Closed

Conversation

casra-developers
Copy link
Contributor

This PR adds checks to make sure that "os" methods that are not supported on Windows are not used on workers. Furthermore, we would appreciate it that methods that are only available on non-Windows systems would be scoped such that they do not raise exceptions on Windows systems. In another PR we have introduce the "airflow.utils.platform.IS_WINDOWS" constant so it should be used.

Thank you in advance

@boring-cyborg boring-cyborg bot added the area:Scheduler including HA (high availability) scheduler label Mar 6, 2023
@potiuk
Copy link
Member

potiuk commented Mar 9, 2023

@casra-developers - I think havinb half-baked solutions like that is just a starting point. What I am worried about is that it has no support for testing it and that it does not prevent (and will not prevent) anyone from using non-windows compliant behaviours in the future. But maybe this PR might be turned in something more complete.

We have #12874 and #10388 opened to make Airflow works for Windows. And Honestly, I do not think we are far from at least being able to run airflow on Windows for local development natively.

My idea and suggestion for this PR - why don't we try to make a running set of tests for Windows as part of the a Airflow CI - at least for some basic "airflow" tests. It does not have to be complete set of tests, but just running the "Core" and "Always" tests on a simple "airflow" installation without any extras. This will not use any of the Breeze testing framework - which is more complete and comprehensive, but it woul just make sure that

a) you can create airflow virtualenv on windows
b) you can install plain "airflow" there
c) you can run all the tests that are "Core" and "Airflow" and they succeed

This all should be possible with the standard way how Github Actions python regular projects are run https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python and we even could have a separate workflow for it. GitHubActions has full support for running the jobs on windows runners, so it does not seem that woudl require a lot of effort. The only real effort will be to fix (or deliberatly skip if we have good reason) all the tests which will be failing.

WDYT?

@casra-developers
Copy link
Contributor Author

@potiuk I agree with your suggestion and actually can have a look at it. My problem is, that since I have to do this in a professional environment with many other projects, my time is very limited. My experience with Github actions is also very limited. Given these limitations its probably unlikely that I can do this except if I get lucky and figure all this out first try. Is there someone willing to help with this?

@potiuk
Copy link
Member

potiuk commented Mar 10, 2023

@potiuk I agree with your suggestion and actually can have a look at it. My problem is, that since I have to do this in a professional environment with many other projects, my time is very limited. My experience with Github actions is also very limited. Given these limitations its probably unlikely that I can do this except if I get lucky and figure all this out first try. Is there someone willing to help with this?

Of course I can help if you have problems - and we can discuss it in this PR - you can just add the workflow here and add workflow to install the environment and run the tests - there. Generally installing python venv, choosing windows machine and running tests with pytest is basically following the tutorials of Github Actions - those are very standard steps.

@potiuk
Copy link
Member

potiuk commented Mar 10, 2023

There might be some complexity in skipping some tests on windows most likely, but this is just a standard pytest fixture to apply for tests that :

@pytest.mark.skipif(sys.platform == "win32", reason="Not running on Window")

@potiuk
Copy link
Member

potiuk commented Mar 10, 2023

And we do not have to hurry with that, we can iterate for as long as we need .

@casra-developers
Copy link
Contributor Author

I have added a new .yml file containing the steps I was able to guess in order for the test procedure. Things I don't now are:

  • What should be the trigger for the action (e.g. on [push])?
  • How do I install the dependencies required by airflow during testing? So far I have just installed pytest.
  • What is the command to use, to run the tests you suggested?
  • Is there a way to debug this?

I am sure those things are quite obvious once you know the process, but I tried to research those things with the provided resources, google and the existing actions and am not sure what to do exactly.

@potiuk
Copy link
Member

potiuk commented Mar 12, 2023

I have added a new .yml file containing the steps I was able to guess in order for the test procedure. Things I don't now are:

  • What should be the trigger for the action (e.g. on [push])?
  • push and pull request are both fine - one is for PR and another is when your PR gets merged to main (regression testing)
  • How do I install the dependencies required by airflow during testing? So far I have just installed pytest.

Just pip install ".[devel]" for now should do - it will install airflow and all the development dependencies (but no providers and their dependencies - those would take considerably longer.

  • What is the command to use, to run the tests you suggested?

Taking it from the current tests (just looked at the output). That would be good start I think:

pytest --verbosity=0 --strict-markers --durations=100 --maxfail=50 --color=yes --junitxml=/files/test_result-Always-sqlite.xml --timeouts-order moi --setup-timeout=60 --execution-timeout=60 --teardown-timeout=60 --output=/files/warnings-Always-sqlite.txt --disable-warnings -rfEX --with-db-init --ignore tests/providers --ignore tests/charts tests
  • Is there a way to debug this?

Yep. What this sequence does can be repeated - locally create a venv, install airflow and run tests. There is little need to run the whole workflow in such a simple case - my approach is usually to run it locally and once I think it's good to test - push it to PR (and let it do the stuff). There is also https://github.com/nektos/act to simulate the whole GA environment (sort of) but in your case, if you have a windows machine runnig pip or pytests commands as individual steps do not really require it

I am sure those things are quite obvious once you know the process, but I tried to research those things with the provided resources, google and the existing actions and am not sure what to do exactly.

Yeah. "actions" are overrated. I tend to use them for very standard stuff that you likely already have locally (like checkout the repo or install python) and all the rest shoudl be just commands you can repeat locally.

This is it for now no more is needed (just fixing the indentaion in the .yml file - I recommend intellij/vscode support for the github action yaml (they come built-in with github plugins I think) - they will help wiht making the yaml file validated and provide autocompletion.

@casra-developers
Copy link
Contributor Author

Thank you so much for these extensive instructions! This really helped me to write the action and test the process locally. One issue that remains though is that the extension I found for vscode (Github Actions v3.0.1) doesn't like my yaml file at all.
image

This is very weird since I followed the online resources provided by Github. Are there some project specific guidelines being enforced here and how can I use the "runs-on: windows-latest" option with the "runs" attribute instead of the "jobs" attribute?

@casra-developers
Copy link
Contributor Author

Sadly, I cannot run the tests in a venv since pip install ".[devel]" fails while installing the mysqlclient module.

ERROR: Command errored out with exit status 1:
     command: 'd:\repositories\airflow\venv\scripts\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\dwerner\\AppData\\Local\\Temp\\pip-install-q7q2htii\\mysqlclient\\setup.py'"'"'; __file__='"'"'C:\\Users\\dwerner\\AppData\\Local\\Temp\\pip-install-q7q2htii\\mysqlclient\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\dwerner\AppData\Local\Temp\pip-record-ei7va7xj\install-record.txt' --single-version-externally-managed --compile --install-headers 'd:\repositories\airflow\venv\include\site\python3.7\mysqlclient'
         cwd: C:\Users\dwerner\AppData\Local\Temp\pip-install-q7q2htii\mysqlclient\
    Complete output (31 lines):
    running install
    d:\repositories\airflow\venv\lib\site-packages\setuptools\command\install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
      setuptools.SetuptoolsDeprecationWarning,
    running build
    running build_py
    creating build
    creating build\lib.win-amd64-cpython-37
    creating build\lib.win-amd64-cpython-37\MySQLdb
    copying MySQLdb\__init__.py -> build\lib.win-amd64-cpython-37\MySQLdb
    copying MySQLdb\_exceptions.py -> build\lib.win-amd64-cpython-37\MySQLdb
    copying MySQLdb\connections.py -> build\lib.win-amd64-cpython-37\MySQLdb
    copying MySQLdb\converters.py -> build\lib.win-amd64-cpython-37\MySQLdb
    copying MySQLdb\cursors.py -> build\lib.win-amd64-cpython-37\MySQLdb
    copying MySQLdb\release.py -> build\lib.win-amd64-cpython-37\MySQLdb
    copying MySQLdb\times.py -> build\lib.win-amd64-cpython-37\MySQLdb
    creating build\lib.win-amd64-cpython-37\MySQLdb\constants
    copying MySQLdb\constants\__init__.py -> build\lib.win-amd64-cpython-37\MySQLdb\constants
    copying MySQLdb\constants\CLIENT.py -> build\lib.win-amd64-cpython-37\MySQLdb\constants
    copying MySQLdb\constants\CR.py -> build\lib.win-amd64-cpython-37\MySQLdb\constants
    copying MySQLdb\constants\ER.py -> build\lib.win-amd64-cpython-37\MySQLdb\constants
    copying MySQLdb\constants\FIELD_TYPE.py -> build\lib.win-amd64-cpython-37\MySQLdb\constants
iaDB Connector C\include\mariadb" "-IC:\Program Files\MariaDB\MariaDB Connector C\include" -Id:\repositories\airflow\venv\include "-IC:\Program Files\Python37\include" "-IC:\Program Files\Python37\Include" "-IC:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.35.32215\ATLMFC\include" "-IC:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\um" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\shared" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\winrt" "-IC:\Program Files (x86)\Windows Kits\10\\include\10.0.19041.0\\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" /TcMySQLdb/_mysql.c /Fobuild\temp.win-amd64-cpython-37\Release\MySQLdb/_mysql.obj
    _mysql.c
    MySQLdb/_mysql.c(29): fatal error C1083: Cannot open include file: 'mysql.h': No such file or directory
    error: command 'C:\\Program Files\\Microsoft Visual Studio\\2022\\Enterprise\\VC\\Tools\\MSVC\\14.35.32215\\bin\\HostX86\\x64\\cl.exe' failed with exit code 2
    ----------------------------------------
ERROR: Command errored out with exit status 1: 'd:\repositories\airflow\venv\scripts\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\dwerner\\AppData\\Local\\Temp\\pip-install-q7q2htii\\mysqlclient\\setup.py'"'"'; __file__='"'"'C:\\Users\\dwerner\\AppData\\Local\\Temp\\pip-install-q7q2htii\\mysqlclient\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\dwerner\AppData\Local\Temp\pip-record-ei7va7xj\install-record.txt' 
--single-version-externally-managed --compile --install-headers 'd:\repositories\airflow\venv\include\site\python3.7\mysqlclient' Check the logs for full command output.

There seems to be a header file missing when building the module.

@uranusjr
Copy link
Member

uranusjr commented Mar 13, 2023

You need to install MySQL in order to install mysqlclient. You’ll hit similar issues for other database backends.

@casra-developers
Copy link
Contributor Author

So this means that I would have to add these installations as steps in the action I assume. Are they necessary or can we just use the sqlite backend for the tests?

@potiuk
Copy link
Member

potiuk commented Mar 13, 2023

This is very weird since I followed the online resources provided by Github.

The indentation is wrong for those lines - they should be indented more - this is simply not valid yaml if it is not.

@potiuk
Copy link
Member

potiuk commented Mar 13, 2023 via email

@casra-developers
Copy link
Contributor Author

This is very weird since I followed the online resources provided by Github.

The indentation is wrong for those lines - they should be indented more - this is simply not valid yaml if it is not.

I fixed the indentation in the .yml file but the plugin still complains about "jobs" and "on" not being allowed properties and "runs" missing.

name: 'Running Tests on Windows'
description: 'Verification that tests run on Windows.'
on: [push]
jobs:
  test:
    runs-on: windows-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@v2

      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: "3.x"

      - name: Setup venv
        run: |
          python -m venv venv
          .\venv\Scripts\activate

      - name: Install dependencies
        run: |
          pip install ".[devel]"

      - name: Run tests
        run: |
          pytest --verbosity=0 --strict-markers --durations=100 --maxfail=50 --color=yes --junitxml=/files/test_result-Always-sqlite.xml --timeouts-order moi --setup-timeout=60 --execution-timeout=60 --teardown-timeout=60 --output=/files/warnings-Always-sqlite.txt --disable-warnings -rfEX --with-db-init --ignore tests/providers --ignore tests/charts tests

I used this site among others to create the file: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python
I would guess that the documentation is more accurate than the plugin so can I just ignore it?

@casra-developers
Copy link
Contributor Author

I think it might be a good idea to get a way of installing airflow devel without MySQL, but for now it requires it (because for development you usually need postgres and MySQL to test stimuff with those). For now you might just install it without 'devel' extra and then we can add individual extras as they will be needed - that might also help with the task of adding a simpler devel extra the will cover the basics) Note - all the extras are explained here https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html pon., 13 mar 2023, 10:00 użytkownik casra-developers < @.> napisał:

So this means that I would have to add these installations as steps in the action I assume. Are they necessary or can we just use the sqlite backend for the tests? — Reply to this email directly, view it on GitHub <#29935 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAERMIYJWQ7QDVTMEZD5VNLW33O2HANCNFSM6AAAAAAVQ5QORU . You are receiving this because you were mentioned.Message ID: @.
>

I have read through the article but I was not able to find the extras that do not require to install anything for the setup to run properly. Could you give me some directions on what to do to setup the venv properly?

@potiuk
Copy link
Member

potiuk commented Mar 15, 2023

I have read through the article but I was not able to find the extras that do not require to install anything for the setup to run properly. Could you give me some directions on what to do to setup the venv properly?

Just start with pip install .. And yes could be your plugin is old/outdated.

@casra-developers
Copy link
Contributor Author

casra-developers commented Mar 17, 2023

The pip install . and the separate installation of the time-machine module allowed me to run the pytest command, saldy it did not get far:

(venv) PS D:\repositories\airflow> pytest --verbosity=0 --strict-markers --durations=100 --maxfail=50 --color=yes --junitxml=/files/test_result-Always-sqlite.xml --timeouts-order moi --setup-timeout=60 --execution-timeout=60 --teardown-timeout=60 --disable-warnings -rfEX --with-db-init 
--ignore tests/providers --ignore tests/charts tests
WARNING:root:OSError while attempting to symlink the latest log directory
D:\repositories\airflow\airflow\models\dagrun.py:131 RemovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". 
Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings.  Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
d:\repositories\airflow\venv\lib\site-packages\marshmallow_sqlalchemy\convert.py:17 DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
d:\repositories\airflow\venv\lib\site-packages\marshmallow_sqlalchemy\convert.py:17 DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:121 DeprecationWarning: pkg_resources is deprecated as an API
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\_pytest\main.py", line 266, in wrap_session
INTERNALERROR>     config._do_configure()
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\_pytest\config\__init__.py", line 1039, in _do_configure
INTERNALERROR>     self.hook.pytest_configure.call_historic(kwargs=dict(config=self))
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pluggy\_hooks.py", line 277, in call_historic
INTERNALERROR>     res = self._hookexec(self.name, self.get_hookimpls(), kwargs, False)
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pluggy\_manager.py", line 80, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pluggy\_callers.py", line 60, in _multicall
INTERNALERROR>     return outcome.get_result()
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pluggy\_result.py", line 60, in get_result
INTERNALERROR>     raise ex[1].with_traceback(ex[2])
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pluggy\_callers.py", line 39, in _multicall
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pytest_timeouts.py", line 57, in pytest_configure
INTERNALERROR>     assert hasattr(signal, 'SIGALRM')
INTERNALERROR> AssertionError: assert False
INTERNALERROR>  +  where False = hasattr(signal, 'SIGALRM')

It seems to me that we are running into the same issues regarding process handling, that this PR wants to address in the Airflow code base. How should we proceed here? Changing the source code in "pytest_timeouts" does not seem to be a good idea.

@potiuk
Copy link
Member

potiuk commented Mar 19, 2023

The pip install . and the separate installation of the time-machine module allowed me to run the pytest command, saldy it did not get far:

(venv) PS D:\repositories\airflow> pytest --verbosity=0 --strict-markers --durations=100 --maxfail=50 --color=yes --junitxml=/files/test_result-Always-sqlite.xml --timeouts-order moi --setup-timeout=60 --execution-timeout=60 --teardown-timeout=60 --disable-warnings -rfEX --with-db-init 
--ignore tests/providers --ignore tests/charts tests
WARNING:root:OSError while attempting to symlink the latest log directory
D:\repositories\airflow\airflow\models\dagrun.py:131 RemovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". 
Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings.  Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
d:\repositories\airflow\venv\lib\site-packages\marshmallow_sqlalchemy\convert.py:17 DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
d:\repositories\airflow\venv\lib\site-packages\marshmallow_sqlalchemy\convert.py:17 DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:121 DeprecationWarning: pkg_resources is deprecated as an API
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
d:\repositories\airflow\venv\lib\site-packages\pkg_resources\__init__.py:2870 DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('sphinxcontrib')`.
Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\_pytest\main.py", line 266, in wrap_session
INTERNALERROR>     config._do_configure()
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\_pytest\config\__init__.py", line 1039, in _do_configure
INTERNALERROR>     self.hook.pytest_configure.call_historic(kwargs=dict(config=self))
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pluggy\_hooks.py", line 277, in call_historic
INTERNALERROR>     res = self._hookexec(self.name, self.get_hookimpls(), kwargs, False)
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pluggy\_manager.py", line 80, in _hookexec
INTERNALERROR>     return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pluggy\_callers.py", line 60, in _multicall
INTERNALERROR>     return outcome.get_result()
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pluggy\_result.py", line 60, in get_result
INTERNALERROR>     raise ex[1].with_traceback(ex[2])
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pluggy\_callers.py", line 39, in _multicall
INTERNALERROR>     res = hook_impl.function(*args)
INTERNALERROR>   File "d:\repositories\airflow\venv\lib\site-packages\pytest_timeouts.py", line 57, in pytest_configure
INTERNALERROR>     assert hasattr(signal, 'SIGALRM')
INTERNALERROR> AssertionError: assert False
INTERNALERROR>  +  where False = hasattr(signal, 'SIGALRM')

It seems to me that we are running into the same issues regarding process handling, that this PR wants to address in the Airflow code base. How should we proceed here? Changing the source code in "pytest_timeouts" does not seem to be a good idea.

This comes from some of the parameters passed in pytest (timeouts related) disabling them (i.e. not using the timeout parameters of pytest]) is the right way to go.

@casra-developers
Copy link
Contributor Author

Just running pytest tests produces the same error. Is there a setting that is project specific provoking this issue? I have used pytest before on Windows systems without issue.

@potiuk
Copy link
Member

potiuk commented Mar 20, 2023

Just running pytest tests produces the same error. Is there a setting that is project specific provoking this issue? I have used pytest before on Windows systems without issue.

Maybe just the plugin that we have being installed causes it. This is the reason why we really need to do the exercise you are doing and fix any kind of hidden dependencies we have. If we do not test it on windows, we will never find out. This is part of the job to be done to make it works fine on Windows.

In this case likely adding ;platform_system!="Windows" where the plugin is added (setup.py) is likely the right solution (we already have similar exclusions for example ;python_version<"3.9 elsewhere.

@casra-developers
Copy link
Contributor Author

I am making a bit of progress in getting the tests to run in venv. One thing I forgot to mention before is that gunicorn is kind of a bad choice for cross-platform support as it is not available on Windows. What we ended up doing for getting the logs from our workers is using basically the same code with waitress instead of gunicorn. It is compatible with all platforms and works almost the same as gunicorn. Should we create an issue for this?

https://pypi.org/project/waitress/

@casra-developers
Copy link
Contributor Author

Ok... I got it to at least run the tests, but the issue is that there are just too many instances where things are needed that don't exist on Windows. Be it gunicorn or pytest markers that require timeouts. My current command looks like this:

pytest --verbosity=0 --strict-markers --maxfail=50 --color=yes --junitxml=/files/test_result-Always-sqlite.xml --output=/files/warnings-Always-sqlite.txt --disable-warnings -rfEX --with-db-init --ignore tests/system/providers --ignore tests/providers --ignore tests/integration/providers --ignore tests/charts tests --ignore tests/utils/test_serve_logs.py --ignore tests/utils/test_dataform.py --ignore tests/kubernetes --ignore tests/integration/cli/commands/test_celery_command.py --ignore tests/integration/api/auth/backend/test_kerberos_auth.py --ignore tests/api_connexion/endpoints/test_extra_link_endpoint.py --ignore tests/cli/commands/test_internal_api_command.py --ignore tests/cli/commands/test_kerberos_command.py --ignore tests/cli/commands/test_scheduler_command.py --ignore tests/cli/commands/test_task_command.py --ignore tests/sensors/test_python.py --ignore tests/executors/test_kubernetes_executor.py --ignore tests/executors/test_dask_executor.py --ignore tests/executors/test_celery_executor.py --ignore tests/dag_processing/test_manager.py --ignore tests/core/test_impersonation_tests.py --ignore tests/operators/test_python.py --ignore tests/decorators/test_python.py --ignore tests/jobs/test_local_task_job.py --ignore tests/cli/commands/test_webserver_command.py

I had to exclude the following packages in setup.py:

  • eralchemy2
  • moto[cloudformation, glue]
  • pytest-timeouts

The following additional installs where required to still run the tests:

  • celery
  • psycopg2
  • boto3
  • distributed
  • statsd
  • sentry_sdk

For me there are two options on the table:

  1. We identify some really basic core tests that should succeed and just only run them (I don't know how exactly to include rather than exclude).
  2. We identify the stuff that is not working on Windows and find alternatives if possible (e.g. waitress instead of gunicorn as a webserver)

We have been using Airflow successfully for two years now with Windows workers, so it can work in a limited sense. It is a very good platform and helped us a great deal with organizing our internal workflows. So it would be awesome if we could find a way to make at least this limited support possible without the need to apply manual fixes to the module after an update.

What are your thoughts?

@potiuk
Copy link
Member

potiuk commented Mar 22, 2023

What are your thoughts?

My thoughts are that indeed we are a bit far from windows support and putting that on you here is too much of an ask :). But at least we know have the issue here and some attempts to show what challenges it brings. (and we can keep it as reference for #12874 and #10388 so that we can always find out the history and attempts when we decide to seriously tackle this.

Sorry for putting you through this, but I figured it might be worth trying (and maybe it would turn out it's easier than I thoguht). It is more difficult (after seeing your attempts).

If you really just want to have this small "If WINDOWS and they make your life easier, I am happy if you can open a new PR with just those changes and leave that one it it's current state for the future (set it to draft maybe).

I hope I have not pushed too much and that we will all learn something useful from that :)

@dominik-werner-casra
Copy link
Contributor

Perfectly understandable. I have the feeling that the main issue is #10388 and if this would work somehow, many other things would also work. The waitress vs. gunicorn thing also is something that could be investigated, since I assume that the replacement should harmonize well with flask.
But anyway, I will set this PR to draft and recreate the changes in a new one. How do I make sure that you @potiuk will be assigned and can approve them?

@dominik-werner-casra
Copy link
Contributor

New PR is created #30249

@dominik-werner-casra
Copy link
Contributor

I seem to be lacking the permission to convert this PR to a draft:
image

@potiuk potiuk marked this pull request as draft March 26, 2023 21:10
@potiuk
Copy link
Member

potiuk commented Mar 26, 2023

No - you should be able to do it. But I just did it for you

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label May 11, 2023
@github-actions github-actions bot closed this May 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:Scheduler including HA (high availability) scheduler stale Stale PRs per the .github/workflows/stale.yml policy file
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants