Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#1193] improvement(docs): Add the document about how to debug trino connector locally. #2446

Merged
merged 8 commits into from
Mar 14, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file added docs/assets/trino/add-config.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/assets/trino/add-link.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
42 changes: 42 additions & 0 deletions docs/assets/trino/config.properties
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
#
# Copyright 2024 Datastrato Pvt Ltd.
# This software is licensed under the Apache License version 2.
#

#
# WARNING
# ^^^^^^^
# This configuration file is for development only and should NOT be used
# in production. For example configuration, see the Trino documentation.
# sample nodeId to provide consistency across test runs
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff
node.environment=test
node.internal-address=localhost
experimental.concurrent-startup=true
http-server.http.port=8180

discovery.uri=http://localhost:8180

exchange.http-client.max-connections=1000
exchange.http-client.max-connections-per-server=1000
exchange.http-client.connect-timeout=1m
exchange.http-client.idle-timeout=1m

scheduler.http-client.max-connections=1000
scheduler.http-client.max-connections-per-server=1000
scheduler.http-client.connect-timeout=1m
scheduler.http-client.idle-timeout=1m

query.client.timeout=5m
query.min-expire-age=30m

plugin.bundles=\
../../plugin/trino-iceberg/pom.xml,\
../../plugin/trino-hive/pom.xml,\
../../plugin/trino-local-file/pom.xml, \
../../plugin/trino-mysql/pom.xml,\
../../plugin/trino-postgresql/pom.xml, \
../../plugin/trino-exchange-filesystem/pom.xml, \
../../plugin/trino-gravitino/pom.xml

node-scheduler.include-coordinator=true
Binary file added docs/assets/trino/create-gravitino-connector.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
13 changes: 13 additions & 0 deletions docs/assets/trino/gravitino.properties
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
#
# Copyright 2024 Datastrato Pvt Ltd.
# This software is licensed under the Apache License version 2.
#

# the connector name is always 'gravitino'
connector.name=gravitino

# uri of the gravitino server, you need to change it according to your environment
gravitino.uri=http://localhost:8090

# The name of the metalake to which the connector is connected, you need to change it according to your environment
gravitino.metalake=test
178 changes: 178 additions & 0 deletions docs/assets/trino/pom.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,178 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Copyright 2024 Datastrato Pvt Ltd.
This software is licensed under the Apache License version 2.
-->

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>io.trino</groupId>
<artifactId>trino-root</artifactId>
<version>426</version>
<relativePath>../../pom.xml</relativePath>
</parent>

<artifactId>trino-gravitino</artifactId>
<packaging>trino-plugin</packaging>
<description>Trino - Graviton Connector</description>

<properties>
<air.main.basedir>${project.parent.basedir}</air.main.basedir>
</properties>

<dependencies>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
</dependency>

<dependency>
<groupId>com.google.inject</groupId>
<artifactId>guice</artifactId>
</dependency>

<dependency>
<groupId>io.airlift</groupId>
<artifactId>bootstrap</artifactId>
<exclusions>
<exclusion>
<artifactId>log4j-to-slf4j</artifactId>
<groupId>org.apache.logging.log4j</groupId>
</exclusion>
</exclusions>
</dependency>

<dependency>
<groupId>io.airlift</groupId>
<artifactId>configuration</artifactId>
</dependency>

<dependency>
<groupId>io.airlift</groupId>
<artifactId>json</artifactId>
</dependency>

<dependency>
<groupId>io.trino</groupId>
<artifactId>trino-plugin-toolkit</artifactId>
</dependency>

<dependency>
<groupId>jakarta.validation</groupId>
<artifactId>jakarta.validation-api</artifactId>
</dependency>

<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<scope>provided</scope>
</dependency>

<dependency>
<groupId>io.airlift</groupId>
<artifactId>slice</artifactId>
<scope>provided</scope>
</dependency>

<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-api</artifactId>
<scope>provided</scope>
</dependency>

<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-context</artifactId>
<scope>provided</scope>
</dependency>

<dependency>
<groupId>io.trino</groupId>
<artifactId>trino-spi</artifactId>
<scope>provided</scope>
</dependency>

<dependency>
<groupId>org.openjdk.jol</groupId>
<artifactId>jol-core</artifactId>
<scope>provided</scope>
</dependency>

<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</dependency>

<dependency>
<groupId>io.airlift</groupId>
<artifactId>node</artifactId>
<scope>runtime</scope>
</dependency>

<dependency>
<groupId>org.apache.httpcomponents.client5</groupId>
<artifactId>httpclient5</artifactId>
<version>5.2.1</version>
</dependency>

<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
</dependency>

<dependency>
<groupId>org.antlr</groupId>
<artifactId>antlr4-runtime</artifactId>
<version>4.9.2</version>
</dependency>

<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.trino</groupId>
<artifactId>trino-memory</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.trino</groupId>
<artifactId>trino-testing</artifactId>
<scope>test</scope>
</dependency>

<dependency>
<groupId>com.datastrato.gravitino</groupId>
<artifactId>gravitino-client-java-runtime</artifactId>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no such artifact after a local publish. do you mean client-java-runtime?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest providing a script to install Graviton-related dependencies into the local Maven repository.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is no such artifact after a local publish. do you mean client-java-runtime?

Exactly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest providing a script to install Graviton-related dependencies into the local Maven repository.

I will convert the artifacts to the standard release versions and the user can download them from Maven central repo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the Maven Central Repository is a good choice. We need to consider situations where a user modifies a class in the client-java-runtime. In such cases, the user might not be able to install the new package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, a local repo is necessary.

<version>0.4.0-SNAPSHOT</version>
</dependency>

<dependency>
<groupId>com.datastrato.gravitino</groupId>
<artifactId>bundled-catalog</artifactId>
<version>0.4.0-SNAPSHOT</version>
</dependency>

<dependency>
<groupId>com.datastrato.gravitino</groupId>
<artifactId>catalog-common</artifactId>
yuqi1129 marked this conversation as resolved.
Show resolved Hide resolved
<version>0.4.0-SNAPSHOT</version>
</dependency>
Copy link
Contributor

@xiacongling xiacongling Mar 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these packages are not in maven central repository. can you publish them? or can we build and publish them locally from source code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we use current snapshots (0.5.0-SNAPSHOT)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you can install them with the latest version 0.5.0-SNAPSHOT locally.


<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-collections4</artifactId>
<version>4.4</version>
</dependency>


<!-- <dependency>-->
<!-- <groupId>mysql</groupId>-->
<!-- <artifactId>mysql-connector-java</artifactId>-->
<!-- <version>8.0.23</version>-->
<!-- </dependency>-->

</dependencies>
</project>
Binary file added docs/assets/trino/show-catalogs.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/assets/trino/start-trino.jpg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
74 changes: 74 additions & 0 deletions docs/trino-connector/development.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
---
title: "Gravitino connector development"
slug: /trino-connector/development
keyword: gravitino connector development
license: "Copyright 2024 Datastrato Pvt Ltd.
This software is licensed under the Apache License version 2."
---

This document is to guide users through the development of the Gravitino connector for Trino locally.

## Prerequisites

Before you start developing the Gravitino trino connector, you need to have the following prerequisites:

1. You need to start the Gravitino server locally, for more information, please refer to the [start Gravitino server](../how-to-install.md)
2. Create a catalog in the Gravitino server, for more information, please refer to the [Gravitino metadata management](../manage-metadata-using-gravitino.md). Assuming we have just created a MySQL catalog using the following command:

```curl
curl -X POST -H "Content-Type: application/json" -d '{"name":"test","comment":"comment","properties":{}}' http://localhost:8090/api/metalakes

curl -X POST -H "Content-Type: application/json" -d '{"name":"mysql_catalog3","type":"RELATIONAL","comment":"comment","provider":"jdbc-mysql", "properties":{
"jdbc-url": "jdbc:mysql://127.0.0.1:3306?useSSL=false&allowPublicKeyRetrieval=true",
"jdbc-user": "root",
"jdbc-password": "123456",
"jdbc-driver": "com.mysql.cj.jdbc.Driver"
}}' http://localhost:8090/api/metalakes/test/catalogs
```
Comment on lines +16 to +27
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

catalogs are innecessary for setting up dev environment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the Trino command line to display the Gravition catalog if you do not create catalogs here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. i just start gravitino server built from source, and add a test empty metalake. on trino side, there is only gravitino.properties under the catalog dir, and Trino dev server started up normally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean if we could list the catalogs in the format like "test.mysql_catalog" by the Trino client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like
image


:::note
Please change the above `localhost`, `port` and the names of metalake and catalogs accordingly.
:::


## Development environment

To develop the Gravitino connector locally, you need to do the following steps:

### IDEA

1. Clone the Trino repository from the [GitHub](https://github.com/trinodb/trino) repository. We advise you to use the release version 426 or 435.
2. Open the Trino project in your IDEA.
3. Create a new module for the Gravitino connector in the Trino project as the following picture (we will use the name `trino-gravitino` as the module name in the following steps). ![trino-gravitino](../assets/trino/create-gravitino-connector.jpg)
Copy link
Contributor

@diqiu50 diqiu50 Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do users know how to create a module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be tiny things compared to developing The Trino connector. If users lack this knowledge, I don't think they can develop the connector effectively.

4. Add a soft link to the Gravitino trino connector module in the Trino project. Assuming the src java main directory of the Gravitino trino connector in project Gravitino is `gravitino/path/to/gravitino-trino-connector/src/main/java`,
and the src java main directory of trino-gravitino in the Trino project is `trino/path/to/trino-gravitino/src/main/java`, you can use the following command to create a soft link:

```shell
ln -s gravitino/path/to/trino-connector/src/main/java trino/path/to/trino-gravitino/src/main/java
```
then you can see the `gravitino-trino-connecor` source files and directories in the `trino-gravitino` module as follows:

![trino-gravitino-structure](../assets/trino/add-link.jpg)

5. Change the `pom.xml` file in the `trino-gravitino` module accordingly. This is a [example](../assets/trino/pom.xml) of the `pom.xml` file in the `trino-gravitino` module.
6. Try to compile module `trino-gravitino` to see if there are any errors.
```shell
mvn clean -pl 'plugin/trino-gravitino' package -DskipTests -Dcheckstyle.skip -Dair.check.skip-checkstyle=true -DskipTests -Dair.check.skip-all=true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to build the Trino as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an error:

[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal io.trino:trino-maven-plugin:12:generate-service-descriptor (default-generate-service-descriptor) on project trino-graviton:
[ERROR]
[ERROR] Existing service descriptor for io.trino.spi.Plugin found in output directory.
[ERROR] -> [Help 1]

The solution is remove the file trino-connector/src/main/resources/META-INF/services/io.trino.spi.Plugin
Do you have this problem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can install trino modules (exluding gravitino) at first, using:
mvn -pl '!plugin/trino-gravitino' clean install -DskipTests

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I understand that. My question is whether anyone else has encountered this error.
@yuqi1129 The steps on how to build Trino are missing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diqiu50 I have added it.
image

```
yuqi1129 marked this conversation as resolved.
Show resolved Hide resolved
7. Set up the configuration for the Gravitino connector in the Trino project. You can do as the following picture shows:
![](../assets/trino/add-config.jpg)

The corresponding configuration files are here: [gravitino.properties](../assets/trino/gravitino.properties) and [config.properties](../assets/trino/config.properties).

8. Start the Trino server and connect to the Gravitino server.
![](../assets/trino/start-trino.jpg)
9. If `DevelopmentServer` has started successfully, you can connect to the Trino server using the `trino-cli` and run the following command to see if the Gravitino connector is available:
```shell
java -jar trino-cli-429-executable.jar --server localhost:8180
```
:::note
The `trino-cli-429-executable.jar` is the Trino CLI jar file, you can download it from the [Trino release page](https://trino.io/download.html).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we use version 429?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just an example. It works well for me with version 429, and another version might be just as good, I will add more details and let users choose the package version.

:::

10. If nothing goes wrong, you can start developing the Gravitino connector in the Gravitino project and debug it in the Trino project.
![](../assets/trino/show-catalogs.jpg)
1 change: 1 addition & 0 deletions docs/trino-connector/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,3 +18,4 @@ Gravitino connector index:
- [MySQL](catalog-mysql.md)
- [PostgreSQL](catalog-postgresql.md)
- [Supported SQL](sql-support.md)
- [Development](development.md)
Loading