
Fix minor grammar errors and English mistakes in documentation. #1302

Merged: 4 commits, Jan 4, 2024
9 changes: 4 additions & 5 deletions README.md
@@ -23,7 +23,7 @@ Gravitino aims to provide several key features:

## Contributing to Gravitino

-Gravitino is open source software available under the Apache 2.0 license. For information of how to contribute to Gravitino please see the [Contribution guidelines](CONTRIBUTING.md).
+Gravitino is open source software available under the Apache 2.0 license. For information on how to contribute to Gravitino please see the [Contribution guidelines](CONTRIBUTING.md).

## Online documentation

@@ -53,19 +53,18 @@ Or:

to build a compressed distribution package.

-The generated binary distribution package locates in `distribution` directory.
+The directory `distribution` contains the generated binary distribution package.

For the details of building and testing Gravitino, please see [How to build Gravitino](docs/how-to-build.md).

## Quick start

### Configure and start the Gravitino server

-If you already have a binary distribution package, please decompress the package (if required)
-and go to the directory where the package locates.
+If you already have a binary distribution package, go to the directory of the decompressed package.

Before starting the Gravitino server, please configure the Gravitino server configuration file. The
-configuration file, `gravitino.conf`, located in the `conf` directory and follows the standard property file format. You can modify the configuration within this file.
+configuration file, `gravitino.conf`, is in the `conf` directory and follows the standard property file format. You can modify the configuration within this file.
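
As a sketch of the standard property file format mentioned above, the excerpt below is illustrative: the keys are ones documented elsewhere on this page, and the values shown are the documented defaults, not recommendations.

```properties
# conf/gravitino.conf -- standard key=value property file (illustrative excerpt)
gravitino.server.shutdown.timeout = 3000
gravitino.server.webserver.requestHeaderSize = 131072
gravitino.server.webserver.responseHeaderSize = 131072
```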

To start the Gravitino server, please run:

16 changes: 8 additions & 8 deletions docs/apache-hive-catalog.md
@@ -24,7 +24,7 @@ The Hive catalog is available for Apache Hive **2.x** only. Support for Apache H

### Catalog capabilities

-The Hive catalog supports to create, update, and delete databases and tables in the HMS.
+The Hive catalog supports creating, updating, and deleting databases and tables in the HMS.

### Catalog properties

@@ -61,12 +61,12 @@ see [Manage Metadata Using Gravitino](./manage-metadata-using-gravitino.md#schem

### Table capabilities

-The Hive catalog supports to create, update, and delete tables in the HMS.
+The Hive catalog supports creating, updating, and deleting tables in the HMS.

#### Table partitions

-The Hive catalog supports [partitioned tables](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-PartitionedTables). Users can create partitioned tables in the Hive catalog with specific partitioning attribute.
-Although Gravitino supports several partitioning strategies, the Apache Hive inherently only supports a single partitioning strategy (partitioned by column), therefore the Hive catalog only support `Identity` partitioning.
+The Hive catalog supports [partitioned tables](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-PartitionedTables). Users can create partitioned tables in the Hive catalog with the specific partitioning attribute.
+Although Gravitino supports several partitioning strategies, Apache Hive inherently only supports a single partitioning strategy (partitioned by column); therefore, the Hive catalog only supports `Identity` partitioning.

:::caution
The `fieldName` specified in the partitioning attribute must be a column defined in the table.
:::

@@ -75,7 +75,7 @@
#### Table sort orders and distributions

The Hive catalog supports [bucketed sorted tables](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-BucketedSortedTables). Users can create bucketed sorted tables in the Hive catalog with specific `distribution` and `sortOrders` attributes.
-Although Gravitino supports several distribution strategies, the Apache Hive inherently only supports a single distribution strategy (clustered by column), therefore the Hive catalog only support `Hash` distribution.
+Although Gravitino supports several distribution strategies, Apache Hive inherently only supports a single distribution strategy (clustered by column); therefore, the Hive catalog only supports `Hash` distribution.

:::caution
The `fieldName` specified in the `distribution` and `sortOrders` attribute must be a column defined in the table.
:::

@@ -131,7 +131,7 @@ Hive automatically adds and manages some reserved properties. Users aren't allow
| `comment` | Used to store the table comment. | 0.2.0 |
| `numFiles` | Used to store the number of files in the table. | 0.2.0 |
| `totalSize` | Used to store the total size of the table. | 0.2.0 |
-| `EXTERNAL` | Indicates whether the table is an external table. | 0.2.0 |
+| `EXTERNAL` | Indicates whether the table is external. | 0.2.0 |
| `transient_lastDdlTime` | Used to store the last DDL time of the table. | 0.2.0 |

### Table operations
@@ -141,7 +141,7 @@ Please refer to [Manage Metadata Using Gravitino](./manage-metadata-using-gravit
#### Alter operations

Gravitino has already defined a unified set of [metadata operation interfaces](./manage-metadata-using-gravitino.md#alter-a-table), and almost all [Hive Alter operations](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/Partition/Column) have a corresponding table update request, which enables you to change the structure of an existing table.
-The following table lists the mapping relationship between Hive Alter operations and Gravitino table update request.
+The following table lists the mapping relationship between Hive Alter operations and the Gravitino table update requests.

##### Alter table

@@ -157,7 +157,7 @@ The following table lists the mapping relationship between Hive Alter operations
| `Alter Table Constraints` | Unsupported | - |

:::note
-As Gravitino has a separate interface for updating the comment of a table, the Hive catalog sets `comment` as a reserved property for the table, preventing users from setting the comment property, Although Apache Hive change the comment of a table by modifying the comment property of the table.
+As Gravitino has a separate interface for updating the comment of a table, the Hive catalog sets `comment` as a reserved property for the table, preventing users from setting the comment property, although Apache Hive changes the comment of a table by modifying the comment property of the table.
:::

##### Alter column
10 changes: 5 additions & 5 deletions docs/docker-image-details.md
@@ -8,11 +8,11 @@ This software is licensed under the Apache License version 2."

# User Docker images

-There are 2 kinds of docker images for user Docker images: the Gravitino Docker image and playground Docker images.
+There are 2 kinds of Docker images for users to use: the Gravitino Docker image and playground Docker images.

## Gravitino Docker image

-You can deploy the service with Gravitino Docker image.
+You can deploy the service with the Gravitino Docker image.

Container startup commands

@@ -36,7 +36,7 @@ You can use the [playground](https://github.com/datastrato/gravitino-playground)

The playground consists of multiple Docker images.

-The Docker images of playground have suitable configurations for users to experience.
+The Docker images of the playground have suitable configurations for users to experience.

### Hive image

@@ -59,12 +59,12 @@

# Developer Docker images

-You can use these kinds of the Docker images to facilitate Gravitino integration testing.
+You can use these kinds of Docker images to facilitate Gravitino integration testing.
You can use them to test all catalog and connector modules within Gravitino.

## Gravitino CI Apache Hive image

-You can use this kind of images to test the catalog of Apache Hive.
+You can use this kind of image to test the catalog of Apache Hive.

Changelog

13 changes: 6 additions & 7 deletions docs/getting-started.md
@@ -21,7 +21,6 @@ or locally see [Installing Gravitino playground locally](#installing-gravitino-p

If you are using AWS and want to access the instance remotely, be sure to read [Accessing Gravitino on AWS externally](#accessing-gravitino-on-aws-externally)


## Getting started on Amazon Web Services

To begin using Gravitino on AWS, follow these steps:
@@ -159,7 +158,7 @@ You can install Apache Hive and Hadoop on AWS or Google Cloud Platform manually,
following the steps in the [Apache Hive](https://cwiki.apache.org/confluence/display/Hive/) and
[Hadoop](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html) installation instructions on their websites.

-Installing and configuring Hive can be a little complex. If you don't already have Hive setup and running you can use the Docker container Datastrato provide to get Gravitino up and running.
+Installing and configuring Hive can be a little complex. If you don't already have Hive set up and running, you can use the Docker container Datastrato provides to get Gravitino up and running.

You can follow the instructions for setting up [Docker on Ubuntu](https://docs.docker.com/engine/install/ubuntu/).

@@ -175,7 +174,7 @@

## Installing Apache Hive locally

-The same steps apply for installing Hive locally as on AWS or Google Cloud Platform. You can
+The same steps apply to installing Hive locally as on AWS or Google Cloud Platform. You can
follow the instructions for [Installing Apache Hive on AWS or Google Cloud Platform](#installing-apache-hive-on-aws-or-google-cloud-platform).

## Installing Gravitino playground on AWS or Google Cloud Platform
@@ -184,7 +183,7 @@ Gravitino provides a bundle of Docker images to launch a Gravitino playground, w
includes Apache Hive, Apache Hadoop, Trino, MySQL, PostgreSQL, and Gravitino. You can use
Docker Compose to start them all.

-Installing Docker and Docker Compose is a requirement to using the playground.
+Installing Docker and Docker Compose is a requirement for using the playground.

```shell
sudo apt install docker docker-compose
```

@@ -198,12 +197,12 @@ how to run the playground, please see [how-to-use-the-playground](./how-to-use-t

## Installing Gravitino playground locally

-The same steps apply for installing the playground locally as on AWS or Google Cloud Platform. You
+The same steps apply to installing the playground locally as on AWS or Google Cloud Platform. You
can follow the instructions for [Installing Gravitino playground on AWS or Google Cloud Platform](#installing-gravitino-playground-on-aws-or-google-cloud-platform).

## Using REST to interact with Gravitino

-After starting the Gravitino distribution, issue REST commands to create and modify metadata. While you are using localhost in these examples, run these commands remotely via a host name or IP address once you establish correct access.
+After starting the Gravitino distribution, issue REST commands to create and modify metadata. While you are using localhost in these examples, run these commands remotely via a hostname or IP address once you establish correct access.

1. Create a Metalake

@@ -260,7 +259,7 @@
http://localhost:8090/api/metalakes/metalake/catalogs
```

-Note that the metastore.uris used for the Hive catalog and would need updating if you change your configuration.
+Note that the metastore.uris property is used for the Hive catalog and would need updating if you change your configuration.
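
To make step 1 above concrete, here is a sketch of assembling the request body for creating a metalake. Only the endpoint path shape (`/api/metalakes/...`) appears on this page; the payload field names are illustrative assumptions, not confirmed API fields.

```python
import json

# Hypothetical request body for "Create a Metalake" (field names are
# illustrative assumptions, not confirmed by the Gravitino REST API).
payload = {
    "name": "metalake",
    "comment": "my first metalake",
    "properties": {},
}
body = json.dumps(payload)

# The equivalent curl call would look something like:
#   curl -X POST -H "Content-Type: application/json" \
#        -d "$body" \
#        http://localhost:8090/api/metalakes
print(body)
```

The same pattern applies to the catalog-creation call, swapping in the `/api/metalakes/metalake/catalogs` path shown above.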

## Accessing Gravitino on AWS externally

13 changes: 7 additions & 6 deletions docs/gravitino-server-config.md
@@ -10,6 +10,7 @@ This software is licensed under the Apache License version 2."
## Introduction

Gravitino supports several configurations:

1. **Gravitino server configuration**: Used to start up the Gravitino server.
2. **Gravitino catalog properties configuration**: Used to set default values for different catalogs.
3. **Some other configurations**: Includes configurations such as HDFS configuration.
@@ -35,10 +36,10 @@ The `gravitino.conf` file lists the configuration items in the following table.
| `gravitino.server.webserver.requestHeaderSize` | Maximum size of HTTP requests. | `131072` | No | 0.1.0 |
| `gravitino.server.webserver.responseHeaderSize` | Maximum size of HTTP responses. | `131072` | No | 0.1.0 |
| `gravitino.server.shutdown.timeout` | Time in milliseconds to gracefully shut down the Gravitino webserver. | `3000` | No | 0.2.0 |
-| `gravitino.server.webserver.customFilters` | Comma separated list of filter class names to apply to the APIs. | (none) | No | 0.4.0 |
+| `gravitino.server.webserver.customFilters` | Comma separated list of filter class names to apply to the API. | (none) | No | 0.4.0 |

Each filter in `customFilters` should be a standard javax servlet Filter.
-Filter parameters can also be specified in the configuration, by setting config entries of the form `gravitino.server.webserver.<class name of filter>.param.<param name>=<value>`
+Filter parameters can also be specified in the configuration, by setting configuration entries of the form `gravitino.server.webserver.<class name of filter>.param.<param name>=<value>`
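
For instance, with a hypothetical filter class `com.example.MyAuthFilter` (the class name and parameter are made up for illustration), the pattern above would look like:

```properties
# Register a custom servlet filter and pass it an init parameter
# (com.example.MyAuthFilter and tokenHeader are hypothetical)
gravitino.server.webserver.customFilters = com.example.MyAuthFilter
gravitino.server.webserver.com.example.MyAuthFilter.param.tokenHeader = X-Auth-Token
```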

### Storage configuration

@@ -52,7 +53,7 @@
| `gravitino.entity.store.kv.deleteAfterTimeMs` | The maximum time in milliseconds that the deleted data and old version data is kept. Set to at least 10 minutes and no longer than 30 days. | `604800000`(7 days) | No | 0.3.0 |

:::caution
-It's highly recommend that you change the default value of `gravitino.entity.store.kv.rocksdbPath`, as it's under the deployment directory and future version upgrades may remove it.
+It's highly recommended that you change the default value of `gravitino.entity.store.kv.rocksdbPath`, as it's under the deployment directory and future version upgrades may remove it.
:::
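
A config fragment sketching that recommendation (the path shown is an arbitrary example, not a default):

```properties
# Move the KV store out of the deployment directory (example path)
gravitino.entity.store.kv.rocksdbPath = /var/lib/gravitino/rocksdb
```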

### Catalog configuration
@@ -85,11 +86,11 @@ There are three types of catalog properties:
Catalog properties are either defined in catalog configuration files as default values or specified explicitly when creating a catalog.

:::info
-Explicit specifications take precedence over the formal configurations.
+Explicit specifications take precedence over formal configurations.
:::

:::caution
-These rules only apply on the catalog properties, doesn't affect on the schema or table properties.
+These rules only apply to the catalog properties and don't affect the schema or table properties.
:::

| catalog provider | catalog properties | catalog properties configuration file path |
@@ -100,7 +101,7 @@ These rules only apply on the catalog properties, doesn't affect on the schema o
| `jdbc-postgresql` | [PostgreSQL catalog properties](jdbc-postgresql-catalog.md#catalog-properties) | `catalogs/jdbc-postgresql/conf/jdbc-postgresql.conf` |

:::info
-Gravitino server automatically add catalog properties configuration dir to classpath.
+Gravitino server automatically adds the catalog properties configuration directory to the classpath.
:::

## Some other configurations
6 changes: 1 addition & 5 deletions docs/how-to-build.md
@@ -13,24 +13,20 @@ This software is licensed under the Apache License version 2."
+ Optionally Docker to run integration tests

:::info Please read the following notes first

+ Gravitino requires at least JDK8 and at most JDK17 to run Gradle, so you need to
install a JDK version between 8 and 17 to launch the build environment.

+ Gravitino itself supports using JDK8, 11, and 17 to build, while the Gravitino Trino connector uses
JDK17 to build. You don't have to preinstall the specified JDK environment;
Gradle detects the JDK version needed and downloads it automatically.

+ Gravitino uses Gradle Java Toolchain to detect and manage JDK versions; you can check the
installed JDKs by running the `./gradlew javaToolchains` command. For the details of Gradle Java
Toolchain, please see [Gradle Java Toolchain](https://docs.gradle.org/current/userguide/toolchains.html#sec:java_toolchain).

+ Make sure you have installed Docker in your environment as Gravitino uses it to run integration tests; without it, some Docker-related tests may not run.

+ On macOS, Gravitino uses "docker-connector" to make the Gravitino Trino connector work with Docker
for macOS. For the details of "docker-connector", please see [docker-connector](https://github.com/wenjunxiao/mac-docker-connector),
`$GRAVITINO_HOME/dev/docker/tools/mac-docker-connector.sh`, and
`$GRAVITINO_HOME/dev/docker/tools/README.md`.

+ Alternatively, you can use OrbStack to replace Docker for macOS; please see
[OrbStack](https://orbstack.dev/). With OrbStack you can run Gravitino integration tests
without needing to install "docker-connector".