[#1135] improvement(docs): Add docs about tables advanced feature like partitioning #1203

Merged
merged 22 commits into from
Jan 2, 2024
Changes from 14 commits
Commits
3049470
Add docs about tables advanced feature like partitioning
yuqi1129 Dec 19, 2023
1ac2270
Add docs about tables advanced feature like partitioning
yuqi1129 Dec 19, 2023
31677a9
Resolve discussion
yuqi1129 Dec 19, 2023
164ddf0
Resolve discussion
yuqi1129 Dec 19, 2023
bfd2802
Resolve discussion again
yuqi1129 Dec 19, 2023
af0b348
Update doc again
yuqi1129 Dec 19, 2023
d4c086f
Polish docs
yuqi1129 Dec 21, 2023
41582dd
Resolve discussion again
yuqi1129 Dec 25, 2023
a08a184
Remove the source type and result type column
yuqi1129 Dec 25, 2023
ae6b3c3
Merge branch 'main' of github.com:datastrato/graviton into issue_1135
yuqi1129 Dec 25, 2023
31ddcd4
Add description about default null ordering value
yuqi1129 Dec 25, 2023
b70b394
Use a separate doc to describe partitioning, bucketing and sorted table
yuqi1129 Dec 25, 2023
6e37e14
Add document header for table-partitioning-bucketing-sort-order.md
yuqi1129 Dec 25, 2023
3f6c622
Add descriptions about default value of sort direction.
yuqi1129 Dec 25, 2023
993fdff
Change some improper variants naming
yuqi1129 Dec 25, 2023
b1d3db6
Fix discussion again
yuqi1129 Dec 25, 2023
108117a
Optimize code.
yuqi1129 Dec 27, 2023
c0503f8
Fix Jerry's comments and format some code
yuqi1129 Jan 2, 2024
b993c01
Polish docs again
yuqi1129 Jan 2, 2024
a266e95
1. Add the necessary messages needed by table partitioning
yuqi1129 Jan 2, 2024
cc5c454
Change to use api method
yuqi1129 Jan 2, 2024
983dbab
Update table-partitioning-bucketing-sort-order.md
jerryshao Jan 2, 2024
131 changes: 1 addition & 130 deletions docs/manage-metadata-using-gravitino.md
@@ -736,136 +736,7 @@ In addition to the basic settings, Gravitino supports the following features:
| Bucketed table | Equal to `CLUSTERED BY` in Apache Hive; some engines may use different terms to describe it. | [Distribution](pathname:///docs/0.3.0/api/java/com/datastrato/gravitino/rel/expressions/distributions/Distribution.html) |
| Sorted order table | Equal to `SORTED BY` in Apache Hive; some engines may use different terms to describe it. | [SortOrder](pathname:///docs/0.3.0/api/java/com/datastrato/gravitino/rel/expressions/sorts/SortOrder.html) |

:::tip
**Not all catalogs support these features.** Please refer to the related document for more details.
:::

The following is an example of creating a partitioned, bucketed, and sorted table:

<Tabs>
<TabItem value="bash" label="Bash">

```bash
curl -X POST -H "Accept: application/vnd.gravitino.v1+json" \
-H "Content-Type: application/json" -d '{
  "name": "table",
  "columns": [
    {
      "name": "id",
      "type": "integer",
      "nullable": true,
      "comment": "Id of the user"
    },
    {
      "name": "name",
      "type": "varchar(2000)",
      "nullable": true,
      "comment": "Name of the user"
    },
    {
      "name": "age",
      "type": "short",
      "nullable": true,
      "comment": "Age of the user"
    },
    {
      "name": "score",
      "type": "double",
      "nullable": true,
      "comment": "Score of the user"
    }
  ],
  "comment": "Create a new Table",
  "properties": {
    "format": "ORC"
  },
  "partitioning": [
    {
      "strategy": "identity",
      "fieldName": ["score"]
    }
  ],
  "distribution": {
    "strategy": "hash",
    "number": 4,
    "funcArgs": [
      {
        "type": "field",
        "fieldName": ["score"]
      }
    ]
  },
  "sortOrders": [
    {
      "direction": "asc",
      "nullOrder": "NULLS_LAST",
      "sortTerm": {
        "type": "field",
        "fieldName": ["name"]
      }
    }
  ]
}' http://localhost:8090/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables
```

</TabItem>
<TabItem value="java" label="Java">

```java
tableCatalog.createTable(
    NameIdentifier.of("metalake", "hive_catalog", "schema", "table"),
    new ColumnDTO[] {
      ColumnDTO.builder()
          .withComment("Id of the user")
          .withName("id")
          .withDataType(Types.IntegerType.get())
          .withNullable(true)
          .build(),
      ColumnDTO.builder()
          .withComment("Name of the user")
          .withName("name")
          .withDataType(Types.VarCharType.of(1000))
          .withNullable(true)
          .build(),
      ColumnDTO.builder()
          .withComment("Age of the user")
          .withName("age")
          .withDataType(Types.ShortType.get())
          .withNullable(true)
          .build(),
      ColumnDTO.builder()
          .withComment("Score of the user")
          .withName("score")
          .withDataType(Types.DoubleType.get())
          .withNullable(true)
          .build(),
    },
    "Create a new Table",
    tablePropertiesMap,
    new Transform[] {
      // PARTITIONED BY score
      Transforms.identity("score")
    },
    // CLUSTERED BY id
    new DistributionDTO.Builder()
        .withStrategy(Strategy.HASH)
        .withNumber(4)
        .withArgs(FieldReferenceDTO.of("id"))
        .build(),
    // SORTED BY name asc
    new SortOrderDTO[] {
      new SortOrderDTO.Builder()
          .withDirection(SortDirection.ASCENDING)
          .withNullOrder(NullOrdering.NULLS_LAST)
          .withSortTerm(FieldReferenceDTO.of("name"))
          .build()
    }
);
```

</TabItem>
</Tabs>
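
If the Gravitino Java client is not on the classpath, the same REST call can be issued with the JDK's built-in `java.net.http` API (Java 11 or later). The sketch below only builds the request. It reuses the URL, headers, and the partitioning, distribution, and sort order fragments from the curl example; the class name `CreateTableRequestSketch`, its method names, and the trimmed column list are assumptions made for illustration, and a real server may require the full payload.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class CreateTableRequestSketch {

  // JSON body with the partitioning, distribution, and sort order fragments
  // from the curl example. The column list is trimmed to the two columns
  // those fragments reference (an assumption, not a verified minimal payload).
  static String payload() {
    return "{"
        + "\"name\": \"table\","
        + "\"columns\": ["
        + "{\"name\": \"name\", \"type\": \"varchar(2000)\", \"nullable\": true},"
        + "{\"name\": \"score\", \"type\": \"double\", \"nullable\": true}"
        + "],"
        + "\"partitioning\": [{\"strategy\": \"identity\", \"fieldName\": [\"score\"]}],"
        + "\"distribution\": {\"strategy\": \"hash\", \"number\": 4,"
        + " \"funcArgs\": [{\"type\": \"field\", \"fieldName\": [\"score\"]}]},"
        + "\"sortOrders\": [{\"direction\": \"asc\", \"nullOrder\": \"NULLS_LAST\","
        + " \"sortTerm\": {\"type\": \"field\", \"fieldName\": [\"name\"]}}]"
        + "}";
  }

  // Builds (but does not send) the POST request used in the curl example.
  static HttpRequest buildRequest(String baseUrl) {
    return HttpRequest.newBuilder(
            URI.create(baseUrl + "/api/metalakes/metalake/catalogs/catalog/schemas/schema/tables"))
        .header("Accept", "application/vnd.gravitino.v1+json")
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(payload()))
        .build();
  }

  public static void main(String[] args) {
    HttpRequest request = buildRequest("http://localhost:8090");
    System.out.println(request.method() + " " + request.uri());
    // To send it against a running server:
    // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
  }
}
```

Keeping request construction separate from sending makes the payload easy to inspect before it reaches a server.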
For more about partitioning, distribution, and sort order, please refer to the related [doc](table-partitioning-bucketing-sort-order.md).
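
To make the three settings in the example concrete, here is a minimal, self-contained sketch of what identity partitioning, hash distribution into 4 buckets, and an ascending, nulls-last sort order mean for individual rows. It does not use the Gravitino API. The class `TableLayoutSketch` and its methods are hypothetical, and the hash function is Java's `hashCode`, which is only a stand-in: each engine uses its own bucketing hash.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TableLayoutSketch {

  // Identity partitioning: the partition is named after the raw column value.
  static String identityPartition(String column, Object value) {
    return column + "=" + value;
  }

  // Hash distribution: map a value to one of numBuckets buckets.
  // hashCode() is a stand-in for the engine-specific hash function.
  static int bucketFor(Object value, int numBuckets) {
    int hash = (value == null) ? 0 : value.hashCode();
    return Math.floorMod(hash, numBuckets);
  }

  // Sort order: ascending, with null values placed last.
  static List<String> sortAscNullsLast(List<String> names) {
    List<String> sorted = new ArrayList<>(names);
    sorted.sort(Comparator.nullsLast(Comparator.<String>naturalOrder()));
    return sorted;
  }

  public static void main(String[] args) {
    System.out.println(identityPartition("score", 9.5));
    System.out.println(bucketFor(7, 4));
    System.out.println(sortAscNullsLast(Arrays.asList("bob", null, "alice")));
  }
}
```

`Math.floorMod` is used instead of `%` so that negative hash values still map to a bucket in the range 0 to `numBuckets - 1`.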

:::note
The code above is an example of creating a Hive table. For other catalogs, the code is similar, but the supported column types and table properties may differ. For more details, please refer to the related doc.