Documentation improvements #252

Merged
merged 6 commits into from
Feb 8, 2023
@@ -821,6 +821,62 @@ class Modify : TestBase() {
// SampleEnd
}

private class CityInfo(val city: String?, val population: Int, val location: String)
private fun queryCityInfo(city: String?): CityInfo { return CityInfo(city, city?.length ?: 0, "35.5 32.2") }

@Test
fun addCalculatedApi() {
// SampleStart
class CityInfo(val city: String?, val population: Int, val location: String)
fun queryCityInfo(city: String?): CityInfo {
return CityInfo(city, city?.length ?: 0, "35.5 32.2")
}
// SampleEnd
}

@Test
fun addCalculated_properties() {
// SampleStart
val personWithCityInfo = df.add {
val cityInfo = city.map { queryCityInfo(it) }
"cityInfo" {
cityInfo.map { it.location } into CityInfo::location
cityInfo.map { it.population } into "population"
}
}
// SampleEnd
personWithCityInfo["cityInfo"]["population"] shouldBe df.city.map { it?.length ?: 0 }.named("population")
}

@Test
fun addCalculated_accessors() {
// SampleStart
val city by column<String?>()
val personWithCityInfo = df.add {
val cityInfo = city().map { queryCityInfo(it) }
"cityInfo" {
cityInfo.map { it.location } into CityInfo::location
cityInfo.map { it.population } into "population"
}
}
// SampleEnd
personWithCityInfo["cityInfo"]["population"] shouldBe df.city.map { it?.length ?: 0 }.named("population")
}

@Test
fun addCalculated_strings() {
// SampleStart
val personWithCityInfo = df.add {
val cityInfo = "city"<String?>().map { queryCityInfo(it) }
"cityInfo" {
cityInfo.map { it.location } into CityInfo::location
cityInfo.map { it.population } into "population"
}
}
// SampleEnd
personWithCityInfo["cityInfo"]["population"] shouldBe df.city.map { it?.length ?: 0 }.named("population")
}

@Test
fun addMany_properties() {
// SampleStart
4 changes: 3 additions & 1 deletion docs/StardustDocs/d.tree
@@ -25,7 +25,8 @@
<toc-element id="DataColumn.md"/>
<toc-element id="DataRow.md"/>
</toc-element>
<toc-element id="operations.md">
<toc-element id="operations.md"/>
<toc-element toc-title="Operations">
<toc-element id="create.md" show-structure-depth="3">
<toc-element id="createColumn.md" toc-title="DataColumn"/>
<toc-element id="createDataFrame.md" toc-title="DataFrame"/>
@@ -172,4 +173,5 @@
</toc-element>
</toc-element>
<toc-element id="gradleReference.md"/>
<toc-element href="https://github.com/Kotlin/dataframe/tree/master/examples" toc-title="Examples"/>
</product-profile>
3 changes: 3 additions & 0 deletions docs/StardustDocs/project.ihp
@@ -7,4 +7,7 @@
<images dir="images" version="0.9" web-path="images/" />
<vars src="v.list"/>
<product src="d.tree" version="0.9" id="dataframe"/>
<settings>
<default-property element-name="toc-element" property-name="show-structure-depth" value="2"/>
</settings>
</ihp>
78 changes: 73 additions & 5 deletions docs/StardustDocs/topics/add.md
@@ -4,7 +4,7 @@

Returns `DataFrame` which contains all columns from original `DataFrame` followed by newly added columns. Original `DataFrame` is not modified.

**Create new column and add it to `DataFrame`:**
## Create new column and add it to `DataFrame`

```text
add(columnName: String) { rowExpression }
@@ -55,7 +55,7 @@ df.add("fibonacci") {

<!---END-->

**Create and add several columns to `DataFrame`:**
## Create and add several columns to `DataFrame`

```kotlin
add {
@@ -64,7 +64,14 @@ add {
...
}

columnMapping = column into columnName | columnName from column | columnName from { rowExpression }
columnMapping = column into columnName
| columnName from column
| columnName from { rowExpression }
| columnGroupName {
columnMapping
columnMapping
...
}
```

<!---FUN addMany-->
@@ -123,7 +130,68 @@ df.add {
</tab></tabs>
<!---END-->

**Add existing column to `DataFrame`:**
### Create columns using intermediate result

Consider this API:

<!---FUN addCalculatedApi-->

```kotlin
class CityInfo(val city: String?, val population: Int, val location: String)
fun queryCityInfo(city: String?): CityInfo {
return CityInfo(city, city?.length ?: 0, "35.5 32.2")
}
```

<!---END-->

Use the following approach to add multiple columns by calling the given API only once per row:

<!---FUN addCalculated-->
<tabs>
<tab title="Properties">

```kotlin
val personWithCityInfo = df.add {
val cityInfo = city.map { queryCityInfo(it) }
"cityInfo" {
cityInfo.map { it.location } into CityInfo::location
cityInfo.map { it.population } into "population"
}
}
```

</tab>
<tab title="Accessors">

```kotlin
val city by column<String?>()
val personWithCityInfo = df.add {
val cityInfo = city().map { queryCityInfo(it) }
"cityInfo" {
cityInfo.map { it.location } into CityInfo::location
cityInfo.map { it.population } into "population"
}
}
```

</tab>
<tab title="Strings">

```kotlin
val personWithCityInfo = df.add {
val cityInfo = "city"<String?>().map { queryCityInfo(it) }
"cityInfo" {
cityInfo.map { it.location } into CityInfo::location
cityInfo.map { it.population } into "population"
}
}
```

</tab></tabs>
<!---END-->
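
As a rough illustration of why the intermediate `cityInfo` value matters, the sketch below mimics the same pattern with plain Kotlin lists instead of DataFrame columns. It is an assumption-laden stand-in, not the library's implementation: `queryCount`, `CityInfoColumns`, and `deriveCityInfoColumns` are hypothetical names introduced here only to make the number of API calls visible.

```kotlin
// Same shape as the CityInfo API shown above.
class CityInfo(val city: String?, val population: Int, val location: String)

// Hypothetical call counter, added only for this illustration.
var queryCount = 0

fun queryCityInfo(city: String?): CityInfo {
    queryCount++
    return CityInfo(city, city?.length ?: 0, "35.5 32.2")
}

// Plain-list stand-in for the two derived columns of the "cityInfo" group.
class CityInfoColumns(val location: List<String>, val population: List<Int>)

fun deriveCityInfoColumns(cities: List<String?>): CityInfoColumns {
    // Intermediate result: exactly one API call per row.
    val cityInfo = cities.map { queryCityInfo(it) }
    // Both columns are mapped from the stored intermediate, not from new calls.
    return CityInfoColumns(
        location = cityInfo.map { it.location },
        population = cityInfo.map { it.population },
    )
}

fun main() {
    val columns = deriveCityInfoColumns(listOf("London", null, "Tokyo"))
    println(columns.population) // prints [6, 0, 5]
    println(queryCount)         // prints 3: one call per row
}
```

Computing each output column directly from `queryCityInfo` would instead make two calls per row, which is what the intermediate value avoids.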

## Add existing column to `DataFrame`

<!---FUN addExisting-->

@@ -136,7 +204,7 @@ df + score

<!---END-->

**Add all columns from another `DataFrame`:**
## Add all columns from another `DataFrame`

<!---FUN addDfs-->

2 changes: 1 addition & 1 deletion docs/StardustDocs/topics/operations.md
@@ -1,4 +1,4 @@
[//]: # (title: Operations)
[//]: # (title: Operations Overview)

<!---IMPORT org.jetbrains.kotlinx.dataframe.samples.api.Modify-->

2 changes: 1 addition & 1 deletion docs/StardustDocs/topics/types.md
@@ -1,4 +1,4 @@
[//]: # (title: Types)
[//]: # (title: Data Abstractions)

* [`DataColumn`](DataColumn.md) is a named, typed and ordered collection of elements
* [`DataFrame`](DataFrame.md) consists of one or several [`DataColumns`](DataColumn.md) with unique names and equal size
6 changes: 5 additions & 1 deletion examples/README.md
@@ -1,12 +1,16 @@
# Examples of Kotlin Dataframe

### Idea examples
* [movies](idea-examples/movies)
* [movies](idea-examples/movies) Using 3 different [Access APIs](https://kotlin.github.io/dataframe/apilevels.html) to perform a data cleaning task
* [titanic](idea-examples/titanic)
* [youtube](idea-examples/youtube)
* [json](idea-examples/json) Using OpenAPI support in DataFrame's Gradle and KSP plugins to access data from [API guru](https://apis.guru/) in a type-safe manner

### Notebook examples

* people [Datalore](https://datalore.jetbrains.com/view/notebook/aOTioEClQQrsZZBKeUPAQj)
Small artificial dataset used in [DataFrame API examples](https://kotlin.github.io/dataframe/operations.html)

* puzzles ([Jupyter](jupyter-notebooks/puzzles/40%20puzzles.ipynb)/[Datalore](https://datalore.jetbrains.com/view/notebook/CVp3br3CDXjUGaxxqfJjFF)) &ndash;
Inspired [by 100 pandas puzzles](https://github.com/ajcr/100-pandas-puzzles). You will go from the simplest tasks to
complex problems where you need to think. This notebook will show you how to solve these tasks with the Kotlin