
Commit 9841744

MetadataBuilder
1 parent 5951ded commit 9841744


8 files changed: +150 -9 lines changed

docs/SparkSession.md

+5 -4
@@ -93,22 +93,23 @@ cloneSession(): SparkSession
 * `AdaptiveSparkPlanHelper` is requested to `getOrCloneSessionWithAqeOff`
 * `StreamExecution` (Spark Structured Streaming) is created
 
-## <span id="builder"> Creating SparkSession Using Builder Pattern
+## Creating SparkSession Using Builder Pattern { #builder }
 
 ```scala
 builder(): Builder
 ```
 
-`builder` is an object method that creates a new [Builder](SparkSession-Builder.md) to build a `SparkSession` using a _fluent API_.
+`builder` is an object method that creates a new [Builder](SparkSession-Builder.md) (that is then used to build a `SparkSession` using a so-called *Fluent API*).
 
 ```scala
 import org.apache.spark.sql.SparkSession
 val builder = SparkSession.builder
 ```
 
-TIP: Read about https://en.wikipedia.org/wiki/Fluent_interface[Fluent interface] design pattern in Wikipedia, the free encyclopedia.
+??? note "Fluent interface"
+    Read about [Fluent interface](https://en.wikipedia.org/wiki/Fluent_interface) design pattern in Wikipedia, the free encyclopedia.
 
-## <span id="version"> Spark Version
+## Spark Version { #version }
 
 ```scala
 version: String

docs/default-columns/index.md

+8
@@ -86,3 +86,11 @@ Default Columns uses the following configuration properties:
 * [spark.sql.jsonGenerator.writeNullIfWithDefaultValue](../configuration-properties.md#spark.sql.jsonGenerator.writeNullIfWithDefaultValue)
 
 Default Columns are resolved using [ResolveDefaultColumns](../logical-analysis-rules/ResolveDefaultColumns.md) logical resolution rule.
+
+## Column Metadata Attributes
+
+Default Columns feature uses the internal column metadata attributes to mark schema fields with default values.
+
+### CURRENT_DEFAULT { #CURRENT_DEFAULT_COLUMN_METADATA_KEY }
+
+### EXISTS_DEFAULT { #EXISTS_DEFAULT_COLUMN_METADATA_KEY }

docs/logical-analysis-rules/AddMetadataColumns.md

+1 -1
@@ -12,7 +12,7 @@ title: AddMetadataColumns
 
 ## Executing Rule { #apply }
 
-??? note "Signature"
+??? note "Rule"
 
     ```scala
     apply(

docs/logical-analysis-rules/ResolveDefaultColumns.md

+21
@@ -57,3 +57,24 @@ For [UpdateTable](../logical-operators/UpdateTable.md)s, `apply` [resolveDefault
 ### MergeIntoTable { #MergeIntoTable }
 
 For [MergeIntoTable](../logical-operators/MergeIntoTable.md)s, `apply` [resolveDefaultColumnsForMerge](#resolveDefaultColumnsForMerge).
+
+## constantFoldCurrentDefaultsToExistDefaults { #constantFoldCurrentDefaultsToExistDefaults }
+
+```scala
+constantFoldCurrentDefaultsToExistDefaults(
+  tableSchema: StructType,
+  statementType: String): StructType
+```
+
+Only with [spark.sql.defaultColumn.enabled](../configuration-properties.md#spark.sql.defaultColumn.enabled) enabled, `constantFoldCurrentDefaultsToExistDefaults`...FIXME
+
+Otherwise, `constantFoldCurrentDefaultsToExistDefaults` does nothing (a _noop_) and returns the given [table schema](../types/StructType.md) intact.
+
+---
+
+`constantFoldCurrentDefaultsToExistDefaults` is used when:
+
+* `CatalogV2Util` is requested to [addField](../connector/catalog/CatalogV2Util.md#addField)
+* [AlterTableAddColumnsCommand](../logical-operators/AlterTableAddColumnsCommand.md) logical command is executed (to [constantFoldCurrentDefaultsToExistDefaults](../logical-operators/AlterTableAddColumnsCommand.md#constantFoldCurrentDefaultsToExistDefaults))
+* [DataSourceAnalysis](../logical-analysis-rules/DataSourceAnalysis.md) logical resolution rule is executed (to resolve a [CreateTable](../logical-operators/CreateTable.md) logical operator with a [DataSource table](../connectors/DDLUtils.md#isDatasourceTable))
+* [DataSourceV2Strategy](../execution-planning-strategies/DataSourceV2Strategy.md) execution planning strategy is executed (to resolve [CreateTable](../logical-operators/CreateTable.md) and `ReplaceTable` logical operators)

docs/metadata-columns/MetadataColumnHelper.md

+21 -3
@@ -4,15 +4,15 @@ title: MetadataColumnHelper
 
 # MetadataColumnHelper Implicit Class
 
-`MetadataColumnHelper` is a Scala implicit class for [Attribute](#attr).
+`MetadataColumnHelper` is a Scala implicit class of [Attribute](#attr) class.
 
 ## Creating Instance
 
 `MetadataColumnHelper` takes the following to be created:
 
 * <span id="attr"> [Attribute](../expressions/Attribute.md)
 
-## <span id="isMetadataCol"> isMetadataCol
+## isMetadataCol { #isMetadataCol }
 
 ```scala
 isMetadataCol: Boolean
@@ -35,10 +35,28 @@ markAsQualifiedAccessOnly(): Attribute
 Metadata Key | Value
 -------------|------
 [__metadata_col](#METADATA_COL_ATTR_KEY) | The [name](../expressions/Attribute.md#name) of this [Attribute](#attr)
-[__qualified_access_only](#QUALIFIED_ACCESS_ONLY) | `true`
+[__qualified_access_only](index.md#QUALIFIED_ACCESS_ONLY) | `true`
 
 ---
 
 `markAsQualifiedAccessOnly` is used when:
 
 * `Analyzer` is requested to [commonNaturalJoinProcessing](../Analyzer.md#commonNaturalJoinProcessing)
+
+## markAsAllowAnyAccess { #markAsAllowAnyAccess }
+
+```scala
+markAsAllowAnyAccess(): Attribute
+```
+
+Only with [qualifiedAccessOnly](#qualifiedAccessOnly) enabled, `markAsAllowAnyAccess` removes [__qualified_access_only](index.md#QUALIFIED_ACCESS_ONLY) metadata key from this [Attribute](#attr).
+
+Otherwise, `markAsAllowAnyAccess` does nothing (a _noop_) and returns this [Attribute](#attr) intact.
+
+---
+
+`markAsAllowAnyAccess` is used when:
+
+* [AddMetadataColumns](../logical-analysis-rules/AddMetadataColumns.md) logical resolution rule is executed (to [addMetadataCol](../logical-analysis-rules/AddMetadataColumns.md#addMetadataCol))
+* `UnresolvedStar` expression is requested to [expand](../expressions/UnresolvedStar.md#expand)
+* _others_
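
The mark-then-unmark behaviour documented in this file can be approximated with the public `MetadataBuilder` API alone. The snippet below is a minimal sketch, not the `MetadataColumnHelper` implementation: it works on a plain `Metadata` value rather than a Catalyst `Attribute`, and the helper names `markQualifiedAccessOnly` and `allowAnyAccess` are hypothetical. It is meant to be pasted into `spark-shell`.

```scala
import org.apache.spark.sql.types.{Metadata, MetadataBuilder}

// Metadata key named in the documentation above.
val QualifiedAccessOnly = "__qualified_access_only"

// Hypothetical helper: set the marker to true, keeping all other entries.
def markQualifiedAccessOnly(m: Metadata): Metadata =
  new MetadataBuilder()
    .withMetadata(m)
    .putBoolean(QualifiedAccessOnly, true)
    .build()

// Hypothetical helper: remove the marker only when it is set,
// otherwise return the given metadata intact (a noop).
def allowAnyAccess(m: Metadata): Metadata =
  if (m.contains(QualifiedAccessOnly) && m.getBoolean(QualifiedAccessOnly)) {
    new MetadataBuilder()
      .withMetadata(m)
      .remove(QualifiedAccessOnly)
      .build()
  } else {
    m
  }

val marked = markQualifiedAccessOnly(Metadata.empty)
val unmarked = allowAnyAccess(marked)
```

Because `Metadata` is immutable, each helper returns a new value and leaves its argument unchanged.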

docs/metadata-columns/index.md

+4
@@ -15,3 +15,7 @@ Logical operators propagate metadata columns using [metadataOutput](../logical-o
 ## <span id="DataSourceV2Relation"> DataSourceV2Relation
 
 `MetadataColumn`s are disregarded (_filtered out_) from the [metadataOutput](../logical-operators/LogicalPlan.md#metadataOutput) in [DataSourceV2Relation](../logical-operators/DataSourceV2Relation.md) leaf logical operator when in name-conflict with output columns.
+
+## \_\_qualified_access_only { #QUALIFIED_ACCESS_ONLY }
+
+`__qualified_access_only` special metadata attribute is used as a marker for qualified-access-only restriction.

docs/types/MetadataBuilder.md

+44 -1
@@ -1,3 +1,46 @@
 # MetadataBuilder
 
-`MetadataBuilder` is...FIXME
+`MetadataBuilder` is used to describe how to [create a Metadata](#build) (using _Fluent API_).
+
+??? note "Fluent API"
+    Learn more about [Fluent interface](https://en.wikipedia.org/wiki/Fluent_interface) design pattern in Wikipedia, the free encyclopedia.
+
+```scala
+import org.apache.spark.sql.types.MetadataBuilder
+
+val newMetadata = new MetadataBuilder()
+  .withMetadata(attr.metadata)
+  .putNull("__is_duplicate")
+  .build()
+```
+
+## Creating Metadata { #build }
+
+```scala
+build(): Metadata
+```
+
+`build` creates a [Metadata](Metadata.md) (for the internal [map](#map)).
+
+## withMetadata { #withMetadata }
+
+```scala
+withMetadata(
+  metadata: Metadata): this.type
+```
+
+`withMetadata` adds the [map](Metadata.md#map) of the given [Metadata](Metadata.md) to the internal [map](#map).
+
+In the end, `withMetadata` returns this `MetadataBuilder`.
+
+---
+
+`withMetadata` is used when:
+
+* `FileSourceConstantMetadataStructField` is requested for `metadata`
+* `FileSourceGeneratedMetadataStructField` is requested for `metadata`
+* `FileSourceMetadataAttribute` is requested for `metadata` and `removeInternalMetadata`
+* `MetadataColumnHelper` is requested to [markAsAllowAnyAccess](../metadata-columns/MetadataColumnHelper.md#markAsAllowAnyAccess) and [markAsQualifiedAccessOnly](../metadata-columns/MetadataColumnHelper.md#markAsQualifiedAccessOnly)
+* [ResolveDefaultColumns](../logical-analysis-rules/ResolveDefaultColumns.md) logical resolution rule is executed (to [constantFoldCurrentDefaultsToExistDefaults](../logical-analysis-rules/ResolveDefaultColumns.md#constantFoldCurrentDefaultsToExistDefaults))
+* `StructField` is requested to [remove](StructField.md#clearCurrentDefaultValue) or [update the CURRENT_DEFAULT metadata attribute](StructField.md#withCurrentDefaultValue), and [update EXISTS_DEFAULT metadata attribute](StructField.md#withExistenceDefaultValue)
+* _others_
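
The example added in this file refers to `attr.metadata`, which is not defined in the snippet itself. A self-contained variant for `spark-shell` might look like the sketch below; the `existing` value and its `comment` entry are placeholders standing in for the metadata of an attribute and are not taken from the commit.

```scala
import org.apache.spark.sql.types.{Metadata, MetadataBuilder}

// Placeholder for `attr.metadata`: a Metadata that already has one entry.
val existing: Metadata = new MetadataBuilder()
  .putString("comment", "this is a comment")
  .build()

// Fluent chain: copy the existing entries, add one more key, build.
val newMetadata: Metadata = new MetadataBuilder()
  .withMetadata(existing)      // copies every entry of `existing`
  .putNull("__is_duplicate")   // adds the key with a null value
  .build()
```

Every `put*` and `withMetadata` call returns the same builder, which is what makes the chain possible; `build` produces a new immutable `Metadata`, so `existing` is left untouched.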

docs/types/StructField.md

+46
@@ -77,3 +77,49 @@ StructField(id,LongType,false)
 scala> println(f.toDDL)
 `id` BIGINT COMMENT 'this is a comment'
 ```
+
+## Removing CURRENT_DEFAULT Metadata Attribute { #clearCurrentDefaultValue }
+
+```scala
+clearCurrentDefaultValue(): StructField
+```
+
+`clearCurrentDefaultValue` removes (_clears_) [CURRENT_DEFAULT](../default-columns/index.md#CURRENT_DEFAULT_COLUMN_METADATA_KEY) column metadata attribute from this [Metadata](#metadata).
+
+---
+
+`clearCurrentDefaultValue` is used when:
+
+* `CatalogV2Util` is requested to [applySchemaChanges](../connector/catalog/CatalogV2Util.md#applySchemaChanges)
+* `AlterTableChangeColumnCommand` logical command is executed
+
+## Updating CURRENT_DEFAULT Metadata Attribute { #withCurrentDefaultValue }
+
+```scala
+withCurrentDefaultValue(
+  value: String): StructField
+```
+
+`withCurrentDefaultValue` adds or updates [CURRENT_DEFAULT](../default-columns/index.md#CURRENT_DEFAULT_COLUMN_METADATA_KEY) column metadata attribute (of this [Metadata](#metadata)) to the given `value`.
+
+---
+
+`withCurrentDefaultValue` is used when:
+
+* `CatalogV2Util` is requested to [applySchemaChanges](../connector/catalog/CatalogV2Util.md#applySchemaChanges) and [encodeDefaultValue](../connector/catalog/CatalogV2Util.md#encodeDefaultValue)
+* `AlterTableChangeColumnCommand` logical command is executed (to [addCurrentDefaultValue](#addCurrentDefaultValue))
+
+## Updating EXISTS_DEFAULT Metadata Attribute { #withExistenceDefaultValue }
+
+```scala
+withExistenceDefaultValue(
+  value: String): StructField
+```
+
+`withExistenceDefaultValue` adds or updates [EXISTS_DEFAULT](../default-columns/index.md#EXISTS_DEFAULT_COLUMN_METADATA_KEY) column metadata attribute (of this [Metadata](#metadata)) to the given `value`.
+
+---
+
+`withExistenceDefaultValue` is used when:
+
+* `CatalogV2Util` is requested to [encodeDefaultValue](../connector/catalog/CatalogV2Util.md#encodeDefaultValue)
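
The three `StructField` methods documented above can be tried together in `spark-shell`. This is a minimal sketch that assumes a Spark version shipping these methods, and it further assumes that the metadata keys are the literal strings `CURRENT_DEFAULT` and `EXISTS_DEFAULT`, as the section headings suggest. The field name `id` and the default value text `42` are illustrative only.

```scala
import org.apache.spark.sql.types.{LongType, StructField}

val id = StructField("id", LongType, nullable = false)

// Each call returns a new StructField with updated metadata.
val withDefaults = id
  .withCurrentDefaultValue("42")
  .withExistenceDefaultValue("42")

// Removes only the CURRENT_DEFAULT entry; EXISTS_DEFAULT stays.
val cleared = withDefaults.clearCurrentDefaultValue()

// Assumed key names (see the lead-in above).
println(withDefaults.metadata.contains("CURRENT_DEFAULT"))
println(cleared.metadata.contains("CURRENT_DEFAULT"))
println(cleared.metadata.contains("EXISTS_DEFAULT"))
```

`StructField` is an immutable case class, so `id` itself never gains any metadata; only the returned copies do.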
