[SPARK-28794][SQL][DOC] Documentation for Create table Command #26759
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
--- | ||
layout: global | ||
title: CREATE DATASOURCE TABLE | ||
displayTitle: CREATE DATASOURCE TABLE | ||
license: | | ||
Licensed to the Apache Software Foundation (ASF) under one or more | ||
contributor license agreements. See the NOTICE file distributed with | ||
this work for additional information regarding copyright ownership. | ||
The ASF licenses this file to You under the Apache License, Version 2.0 | ||
(the "License"); you may not use this file except in compliance with | ||
the License. You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--- | ||
|
||
### Description | ||
|
||
The `CREATE TABLE` statement defines a new table using a Data Source. | ||
|
||
### Syntax | ||
{% highlight sql %} | ||
CREATE TABLE [ IF NOT EXISTS ] table_identifier | ||
[ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] | ||
USING data_source | ||
[ OPTIONS ( key1=val1, key2=val2, ... ) ] | ||
[ PARTITIONED BY ( col_name1, col_name2, ... ) ] | ||
[ CLUSTERED BY ( col_name3, col_name4, ... ) | ||
[ SORTED BY ( col_name [ ASC | DESC ], ... ) ] | ||
INTO num_buckets BUCKETS ] | ||
[ LOCATION path ] | ||
[ COMMENT table_comment ] | ||
[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] | ||
[ AS select_statement ] | ||
{% endhighlight %} | ||
|
||
### Parameters | ||
|
||
<dl> | ||
<dt><code><em>table_identifier</em></code></dt> | ||
<dd> | ||
Specifies a table name, which may be optionally qualified with a database name.<br><br> | ||
<b>Syntax:</b> | ||
<code> | ||
[ database_name. ] table_name | ||
</code> | ||
</dd> | ||
</dl> | ||
<dl> | ||
<dt><code><em>USING data_source</em></code></dt> | ||
<dd>Specifies the data source, that is, the input format used to create the table. The data source can be CSV, TEXT, ORC, JDBC, PARQUET, etc.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>PARTITIONED BY</em></code></dt> | ||
<dd>Partitions are created on the table, based on the columns specified.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>CLUSTERED BY</em></code></dt> | ||
<dd> | ||
Partitions created on the table will be bucketed into a fixed number of buckets based on the columns specified for bucketing.<br><br> | ||
<b>NOTE:</b> Bucketing is an optimization technique that uses buckets (and bucketing columns) to determine data partitioning and avoid data shuffle.<br> | ||
</dd> | ||
<dt><code><em>SORTED BY</em></code></dt> | ||
<dd>Determines the order in which the data is stored in buckets. The default is ascending order.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>LOCATION</em></code></dt> | ||
<dd>Path to the directory where table data is stored, which could be a path on distributed storage like HDFS, etc.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>COMMENT</em></code></dt> | ||
<dd>Specifies a comment that describes the table.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>TBLPROPERTIES</em></code></dt> | ||
<dd>Specifies the table properties to set as key-value pairs, such as <code>created.by.user</code>, <code>owner</code>, etc. | ||
</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>AS select_statement</em></code></dt> | ||
<dd>The table is populated using the data from the select statement.</dd> | ||
</dl> | ||
|
||
### Examples | ||
{% highlight sql %} | ||
|
||
--Using data source | ||
CREATE TABLE Student (Id INT, name STRING, age INT) USING CSV; | ||
|
||
--Using data from another table | ||
CREATE TABLE StudentInfo | ||
USING PARQUET | ||
AS SELECT * FROM Student; | ||
|
||
--Partitioned and bucketed | ||
CREATE TABLE Student (Id INT, name STRING, age INT) | ||
USING CSV | ||
PARTITIONED BY (age) | ||
CLUSTERED BY (Id) INTO 4 BUCKETS; | ||
|
||
{% endhighlight %} | ||
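The optional clauses described under Parameters can be combined in a single statement. The following is a minimal sketch rather than a definitive reference: the table names, the path, and the property value are hypothetical, and `header` and `sep` are assumed here as typical options of the CSV data source.

{% highlight sql %}
--Bucketed and sorted table with a comment and table properties
--(table name and property value are hypothetical)
CREATE TABLE IF NOT EXISTS StudentSorted (Id INT, name STRING, age INT)
    USING PARQUET
    PARTITIONED BY (age)
    CLUSTERED BY (Id) SORTED BY (name ASC) INTO 4 BUCKETS
    COMMENT 'Bucketed by Id and sorted by name within each bucket'
    TBLPROPERTIES ('created.by.user'='xxxx');

--Data source options and an explicit location
--(the path is a placeholder, not a real directory)
CREATE TABLE StudentCsv (Id INT, name STRING, age INT)
    USING CSV
    OPTIONS ('header'='true', 'sep'=',')
    LOCATION '/path/to/student_csv';
{% endhighlight %}

Note that the bucketing column (`Id`) is deliberately different from the partition column (`age`).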
Could you add a |
||
|
||
### Related Statements | ||
* [CREATE TABLE USING HIVE FORMAT](sql-ref-syntax-ddl-create-table-hiveformat.html) | ||
* [CREATE TABLE LIKE](sql-ref-syntax-ddl-create-table-like.html) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,122 @@ | ||
--- | ||
layout: global | ||
title: CREATE HIVEFORMAT TABLE | ||
displayTitle: CREATE HIVEFORMAT TABLE | ||
license: | | ||
Licensed to the Apache Software Foundation (ASF) under one or more | ||
contributor license agreements. See the NOTICE file distributed with | ||
this work for additional information regarding copyright ownership. | ||
The ASF licenses this file to You under the Apache License, Version 2.0 | ||
(the "License"); you may not use this file except in compliance with | ||
the License. You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--- | ||
### Description | ||
|
||
The `CREATE TABLE` statement defines a new table using Hive format. | ||
|
||
### Syntax | ||
{% highlight sql %} | ||
CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier | ||
[ ( col_name1[:] col_type1 [ COMMENT col_comment1 ], ... ) ] | ||
[ COMMENT table_comment ] | ||
[ PARTITIONED BY ( col_name2[:] col_type2 [ COMMENT col_comment2 ], ... ) | ||
| ( col_name1, col_name2, ... ) ] | ||
[ ROW FORMAT row_format ] | ||
[ STORED AS file_format ] | ||
[ LOCATION path ] | ||
[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] | ||
[ AS select_statement ] | ||
|
||
{% endhighlight %} | ||
|
||
### Parameters | ||
|
||
<dl> | ||
<dt><code><em>table_identifier</em></code></dt> | ||
<dd> | ||
Specifies a table name, which may be optionally qualified with a database name.<br><br> | ||
<b>Syntax:</b> | ||
<code> | ||
[ database_name. ] table_name | ||
</code> | ||
</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>EXTERNAL</em></code></dt> | ||
<dd>Defines the table using the path provided in LOCATION; the default location is not used for this table.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>PARTITIONED BY</em></code></dt> | ||
<dd>Partitions are created on the table, based on the columns specified.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>ROW FORMAT</em></code></dt> | ||
<dd>Use the SERDE clause to specify a custom SerDe, or the DELIMITED clause to use the native SerDe.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>STORED AS</em></code></dt> | ||
<dd>Specifies the file format for table storage, such as TEXTFILE, ORC, PARQUET, etc.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>LOCATION</em></code></dt> | ||
<dd>Path to the directory where table data is stored, which could be a path on distributed storage like HDFS, etc.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>COMMENT</em></code></dt> | ||
<dd>Specifies a comment that describes the table.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>TBLPROPERTIES</em></code></dt> | ||
<dd> | ||
Specifies the table properties to set as key-value pairs, such as <code>created.by.user</code>, <code>owner</code>, etc. | ||
</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>AS select_statement</em></code></dt> | ||
<dd>The table is populated using the data from the select statement.</dd> | ||
</dl> | ||
|
||
|
||
### Examples | ||
{% highlight sql %} | ||
|
||
--Using Comment and loading data from another table into the created table | ||
CREATE TABLE StudentInfo | ||
COMMENT 'Table is created using existing data' | ||
AS SELECT * FROM Student; | ||
|
||
--Partitioned table | ||
CREATE TABLE Student (Id INT, name STRING) | ||
PARTITIONED BY (age INT) | ||
TBLPROPERTIES ('owner'='xxxx'); | ||
|
||
CREATE TABLE Student (Id INT, name STRING, age INT) | ||
PARTITIONED BY (name, age); | ||
|
||
--Using Row Format and file format | ||
CREATE TABLE Student (Id INT, name STRING) | ||
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' | ||
STORED AS TEXTFILE; | ||
|
||
{% endhighlight %} | ||
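The `EXTERNAL`, `STORED AS`, and `LOCATION` clauses are not covered by the examples above. A minimal sketch of how they might be combined, assuming Hive support is enabled in the session; the table name and the path are hypothetical:

{% highlight sql %}
--External table stored as ORC at an explicit location
--(table name and path are placeholders for illustration)
CREATE EXTERNAL TABLE IF NOT EXISTS StudentExt (Id INT, name STRING)
    COMMENT 'External table over files at the given location'
    PARTITIONED BY (age INT)
    STORED AS ORC
    LOCATION '/path/to/student_ext';
{% endhighlight %}

Here `EXTERNAL` is paired with `LOCATION`, matching the parameter description: the table is defined over the given path instead of the default location.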
|
||
|
||
### Related Statements | ||
* [CREATE TABLE USING DATASOURCE](sql-ref-syntax-ddl-create-table-datasource.html) | ||
* [CREATE TABLE LIKE](sql-ref-syntax-ddl-create-table-like.html) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,97 @@ | ||
--- | ||
layout: global | ||
title: CREATE TABLE LIKE | ||
displayTitle: CREATE TABLE LIKE | ||
license: | | ||
Licensed to the Apache Software Foundation (ASF) under one or more | ||
contributor license agreements. See the NOTICE file distributed with | ||
this work for additional information regarding copyright ownership. | ||
The ASF licenses this file to You under the Apache License, Version 2.0 | ||
(the "License"); you may not use this file except in compliance with | ||
the License. You may obtain a copy of the License at | ||
|
||
http://www.apache.org/licenses/LICENSE-2.0 | ||
|
||
Unless required by applicable law or agreed to in writing, software | ||
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. | ||
--- | ||
### Description | ||
|
||
The `CREATE TABLE LIKE` statement defines a new table using the definition/metadata of an existing table or view. | ||
|
||
### Syntax | ||
{% highlight sql %} | ||
CREATE TABLE [ IF NOT EXISTS ] table_identifier LIKE source_table_identifier | ||
[ USING data_source ] | ||
[ ROW FORMAT row_format ] | ||
[ STORED AS file_format ] | ||
[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] | ||
[ LOCATION path ] | ||
{% endhighlight %} | ||
|
||
### Parameters | ||
<dl> | ||
<dt><code><em>table_identifier</em></code></dt> | ||
<dd> | ||
Specifies a table name, which may be optionally qualified with a database name.<br><br> | ||
<b>Syntax:</b> | ||
<code> | ||
[ database_name. ] table_name | ||
</code> | ||
</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>USING data_source</em></code></dt> | ||
<dd>Specifies the data source, that is, the input format used to create the table. The data source can be CSV, TEXT, ORC, JDBC, PARQUET, etc.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>ROW FORMAT</em></code></dt> | ||
<dd>Use the SERDE clause to specify a custom SerDe, or the DELIMITED clause to use the native SerDe.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>STORED AS</em></code></dt> | ||
<dd>Specifies the file format for table storage, such as TEXTFILE, ORC, PARQUET, etc.</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>TBLPROPERTIES</em></code></dt> | ||
<dd>Specifies the table properties to set as key-value pairs, such as <code>created.by.user</code>, <code>owner</code>, etc. | ||
</dd> | ||
</dl> | ||
|
||
<dl> | ||
<dt><code><em>LOCATION</em></code></dt> | ||
<dd>Path to the directory where table data is stored, which could be a path on distributed storage like HDFS, etc. If specified, the table is created as an external table at that location.</dd> | ||
</dl> | ||
|
||
|
||
### Examples | ||
{% highlight sql %} | ||
|
||
--Create table using an existing table | ||
CREATE TABLE Student_Dupli LIKE Student; | ||
|
||
--Create table like using a data source | ||
CREATE TABLE Student_Dupli LIKE Student USING CSV; | ||
|
||
--Table is created as external table at the location specified | ||
CREATE TABLE Student_Dupli LIKE Student LOCATION '/root1/home'; | ||
|
||
--Create table like using a rowformat | ||
CREATE TABLE Student_Dupli LIKE Student | ||
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' | ||
STORED AS TEXTFILE | ||
TBLPROPERTIES ('owner'='xxxx'); | ||
|
||
{% endhighlight %} | ||
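The `IF NOT EXISTS` clause and database-qualified identifiers can also be used with this statement. A minimal sketch, assuming a hypothetical database named `school` that already contains a `Student` table:

{% highlight sql %}
--Copy only the definition of school.Student; no rows are copied
--(the database and table names are hypothetical)
CREATE TABLE IF NOT EXISTS school.Student_Copy LIKE school.Student;
{% endhighlight %}

Because only the definition/metadata is taken from the source table, the new table starts out empty.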
|
||
### Related Statements | ||
* [CREATE TABLE USING DATASOURCE](sql-ref-syntax-ddl-create-table-datasource.html) | ||
* [CREATE TABLE USING HIVE FORMAT](sql-ref-syntax-ddl-create-table-hiveformat.html) | ||
|
Original file line number | Diff line number | Diff line change | ||
---|---|---|---|---|
|
@@ -19,4 +19,14 @@ license: | | |||
limitations under the License. | ||||
--- | ||||
|
||||
**This page is under construction** | ||||
### Description | ||||
The `CREATE TABLE` statement is used to define a table in an existing database. | ||||
|
||||
The CREATE statements: | ||||
* [CREATE TABLE USING DATASOURCE](sql-ref-syntax-ddl-create-table-datasource.html) | ||||
* [CREATE TABLE USING HIVE FORMAT](sql-ref-syntax-ddl-create-table-hiveformat.html) | ||||
We need to add spark/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 Line 129 in 18e8d1d
|
||||
* [CREATE TABLE LIKE](sql-ref-syntax-ddl-create-table-like.html) | ||||
|
||||
### Related Statements | ||||
- [ALTER TABLE](sql-ref-syntax-ddl-alter-table.html) | ||||
- [DROP TABLE](sql-ref-syntax-ddl-drop-table.html) |
Could you please add `;` at the end of all the SQL statements in the example sections?