Airflow CLI client based on Stable API #10552

Open
mik-laj opened this issue Aug 25, 2020 · 8 comments
Labels
airflow3.0:candidate Potential candidates for Airflow 3.0 area:API Airflow's REST/HTTP API area:CLI kind:feature Feature Requests

Comments

@mik-laj
Member

mik-laj commented Aug 25, 2020

tl;dr: HyperOne and good citizen @ad-m will be happy to help build a new Airflow CLI.
Here are the docs for the POC:
https://gist.github.com/ad-m/19cc06d91a7ee756f461c95f5d656eb6

If we have a Stable API ready, we can start working on tools that use it. A highly anticipated feature is the ability to manage Airflow from the CLI. Currently, there is a similar feature in Airflow (the remote mode of the CLI), but it has many limitations:

  • A very small number of commands are available (7 pool commands and 2 dag commands only).
  • Uses the deprecated experimental REST API.
  • Low test coverage.
  • No access control. (Big security risks)
  • Requires full Airflow to be installed along with a large number of unnecessary dependencies.
  • Compatibility issues on Windows.
  • Installation is via pip which is great. However, users expect such a tool to be available as a single package with all dependencies included.

I also suggested that this be deleted in Airflow 2.0 as it is rarely used.
https://lists.apache.org/thread.html/rfada8ac2fce87c0516d62923e35e3bebfc44ee5a379103b890f8c61c%40%3Cdev.airflow.apache.org%3E

Some users use the normal CLI instead. It shares many of the disadvantages mentioned above and poses an even greater security risk, because it requires a direct connection to the database. On the other hand, it has many more features, though not everything that is available in the Web UI: for example, there is no access to remote logs, although they are available in the Web UI and the stable API.

I spoke with good citizen @ad-m about how HyperOne develops the CLI for its services. HyperOne is a cloud provider that develops services for good citizens in Poland. They use OpenAPI very intensively to build their platform. They also strive for automation and self-maintenance. I think we can rely on their experience, knowledge and talent.

From a conversation with this expert, I learned about the existence of the h1-cli project. This is the CLI that makes it easy to manage HyperOne services. Its key features, from our perspective:

  • Uses NodeJS (-/+)
  • Uses OpenAPI (+)
  • Available as a single binary (+). To achieve this, they used the pkg utility.
  • Works on multiple operating systems (+)

This is not the end of the story! They are working on a new CLI v2 that will use the OpenAPI specification not only to verify the message format but also to build the commands themselves. Yes! Adding a new service to the platform does not require any changes to the CLI.
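To make the idea concrete, here is a minimal sketch of building commands from a specification, written in Python with `argparse` rather than the NodeJS framework HyperOne uses. The `SPEC` fragment, the `airflow-client` program name and the `build_parser` helper are all assumptions made for illustration; they are not taken from the POC or from the real Airflow specification.

```python
import argparse

# Hypothetical, heavily trimmed OpenAPI-style fragment (illustration only).
SPEC = {
    "paths": {
        "/pools": {
            "get": {
                "operationId": "get_pools",
                "parameters": [
                    {"name": "limit", "in": "query", "schema": {"type": "integer"}},
                ],
            },
        },
        "/pools/{pool_name}": {
            "delete": {
                "operationId": "delete_pool",
                "parameters": [
                    {"name": "pool_name", "in": "path", "required": True,
                     "schema": {"type": "string"}},
                ],
            },
        },
    }
}

# Map OpenAPI primitive types to Python callables used by argparse.
TYPES = {"integer": int, "number": float, "string": str}


def build_parser(spec):
    """Create one subcommand per operation found in the spec."""
    parser = argparse.ArgumentParser(prog="airflow-client")
    sub = parser.add_subparsers(dest="command", required=True)
    for path, methods in spec["paths"].items():
        for method, op in methods.items():
            cmd = sub.add_parser(op["operationId"].replace("_", "-"))
            # Remember which HTTP call this subcommand stands for.
            cmd.set_defaults(method=method.upper(), path=path)
            for param in op.get("parameters", []):
                cmd.add_argument(
                    "--" + param["name"].replace("_", "-"),
                    dest=param["name"],
                    type=TYPES.get(param["schema"]["type"], str),
                    required=param.get("required", False),
                )
    return parser


args = build_parser(SPEC).parse_args(["delete-pool", "--pool-name", "etl"])
# Fill path placeholders such as {pool_name} from the parsed arguments.
request_line = f"{args.method} {args.path.format(**vars(args))}"
print(request_line)  # DELETE /pools/etl
```

With this shape, adding a new path to `SPEC` yields a new subcommand without touching `build_parser`, which is the property described above.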

The fantastic features don't stop there. The CLI is written in NodeJS and, thanks to the additional cli-device-browser module, it can also be used from the browser. Yes! We can have an Airflow console in the browser.

When a resource is created, a CLI command equivalent is also generated. Because commands are generated automatically, the CLI adapts to new API parameters on its own, which keeps the maintenance effort to a minimum. The CLI reference documentation is updated as well.

The CLI is integrated with the HyperOne Management Panel. When an operation is performed via the Panel, an example of how to perform the analogous operation with the CLI is displayed. Moreover, such an example can be run directly in the CLI in the browser. This is an excellent way to educate users and popularize the CLI. This mechanism is part of the CLI framework, and we can integrate our Web UI similarly.
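As a rough illustration of how a panel could show the CLI equivalent of an operation, the sketch below turns an operation name and its parameters into a shell-safe command line. The function name `to_cli_example`, the `airflow-client` program name and the `post_pool` operation are hypothetical; the actual HyperOne mechanism may work quite differently.

```python
import shlex


def to_cli_example(operation_id, params, prog="airflow-client"):
    """Render an operation and its parameters as a copy-pasteable command."""
    parts = [prog, operation_id.replace("_", "-")]
    for name, value in params.items():
        parts += ["--" + name.replace("_", "-"), str(value)]
    # Quote every token so values containing spaces survive the shell.
    return " ".join(shlex.quote(part) for part in parts)


example = to_cli_example("post_pool", {"name": "etl pool", "slots": 4})
print(example)  # airflow-client post-pool --name 'etl pool' --slots 4
```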

I spoke to @ad-m, and HyperOne are eager to collaborate and share their experience and technology so we can build our CLI in a similar way. @ad-m even prepared a POC based on our specification. After writing ~300 lines of code, he had an Airflow CLI ready which, after some improvements, could be used more widely.

The CLI framework is under active development. New functionality is planned that we will be able to use. I found out that autocomplete support is planned (including remote data). They are also working hard to improve the documentation, e.g. generating CLI context help based on examples from OpenAPI and providing documentation in new formats.

Here is the documentation that shows the idea of how it will be possible to use such a CLI.

https://gist.github.com/ad-m/19cc06d91a7ee756f461c95f5d656eb6

In further development, we may write additional commands that address common use cases. The CLI framework assumes that basic operations are generated, while specific operations can be added on top or even loaded dynamically (as plugins). Even at this stage, the POC CLI is more advanced and more powerful than the remote mode in the current Airflow CLI.
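The split between generated and hand-written commands could look roughly like the sketch below. Everything in it is hypothetical (the command names, the `trigger-and-wait` helper, the registry shape); it only illustrates the layering, not how the HyperOne framework actually loads plugins.

```python
from typing import Callable, Dict

Command = Callable[[dict], str]


def generated_commands() -> Dict[str, Command]:
    """Stand-in for the basic operations generated from the API specification."""
    return {
        "get-dags": lambda args: "GET /dags",
        "get-pools": lambda args: "GET /pools",
    }


def trigger_and_wait(args: dict) -> str:
    """Hand-written command covering a common use case (hypothetical)."""
    return f"trigger {args['dag_id']} and poll until the run finishes"


def build_registry(generated: Dict[str, Command],
                   plugins: Dict[str, Command]) -> Dict[str, Command]:
    """Start from generated commands, then let plugins add or override entries."""
    registry = dict(generated)
    registry.update(plugins)
    return registry


registry = build_registry(generated_commands(),
                          {"trigger-and-wait": trigger_and_wait})
print(sorted(registry))
print(registry["trigger-and-wait"]({"dag_id": "example_dag"}))
```

In a real tool the `plugins` mapping might be discovered at runtime, for example through package entry points, but that part is left out here.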

As of now, I don't have the capacity to lead this idea, so I'm not starting a mailing list discussion. However, if the feedback is positive, I can try to book a few hours for its implementation. For now, I would like to hear your use cases and what you expect from a CLI client, so that we can develop the right CLI in the future.

@mik-laj mik-laj added kind:feature Feature Requests area:API Airflow's REST/HTTP API area:CLI labels Aug 25, 2020
@houqp
Member

houqp commented Aug 28, 2020

This certainly looks interesting. Being able to have an Airflow console in the web sounds pretty cool. I see no harm in giving this a try as an experiment. I have always wanted us to keep the existing CLI as the admin CLI and build a new CLI from scratch as the user CLI that only interacts with the REST API and can be distributed as a single binary :)

The only thing we should be aware of is that the CLI should not be tied to any proprietary tech built by HyperOne, otherwise we won't be able to adopt it as the official Airflow user CLI.

@ad-m
Contributor

ad-m commented Aug 28, 2020

HyperOne team here. :bowtie: The CLI framework we created is not proprietary HyperOne technology. Everything is FOSS and we intend to keep it that way. We use open source and give back to open source, especially since this is not the core of our services. We sincerely hope that everyone will benefit from this cooperation.

@houqp
Member

houqp commented Aug 31, 2020

@mik-laj @ad-m should we create a public repo to kick off the experiment? I am interested in giving the CLI a try as well.

@potiuk
Member

potiuk commented Aug 31, 2020

FYI @mik-laj -> as a PMC member you can create repos starting with airflow- in the apache organization via the self-service platform: https://cwiki.apache.org/confluence/display/INFRA/Creating+a+new+repository

@mik-laj
Member Author

mik-laj commented Aug 31, 2020

@potiuk @houqp I have the impression that a new repository will not help here, because we will have to rely on the CLI framework, which has not yet been released by HyperOne. We have to wait until HyperOne finishes its work and decides to separate the framework from the HyperOne CLI. For now, we could contribute to the repository where the HyperOne CLI lives, but that's not a very good idea. We could also fork, but while the framework is under heavy development that can be a very unpleasant developer experience.

I think we have to wait a while until their work is finished so that we can start work on our side.

@mik-laj
Member Author

mik-laj commented Aug 31, 2020

We cannot use development releases for our work, as npm/yarn do not support this for repositories with multiple packages (monorepos).
In Python, we could use https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support, but the HyperOne CLI uses NodeJS. In NodeJS, this is a bit more complicated, because the code released as an installation package is different from the code in the repository. It needs to be compiled/transpiled first.

@ad-m
Contributor

ad-m commented Aug 31, 2020

We publish our packages on a regular basis, but I do not believe that the scope of the changes can be embarrassing for external software. I am going (together with @mik-laj) to fine-tune the CLI, which will show us where the weak points are, to decouple the framework from any HyperOne-specific elements. We will also have to present the basic documentation of the public API library, because our vision in this area does not have to be clear outside of process.

@kaxil
Member

kaxil commented Jul 24, 2024


5 participants