Rewrote a gettingStarted pages #407

Merged
merged 4 commits into from
Jun 20, 2023
8 changes: 6 additions & 2 deletions docs/StardustDocs/d.tree
@@ -6,7 +6,12 @@
name="Dataframe"
status="release"
start-page="overview.md">
-<toc-element topic="gettingStarted.md"/>
+<toc-element topic="gettingStarted.md">
+<toc-element topic="gettingStartedGradle.md"/>
+<toc-element topic="gettingStartedJupyterNotebook.md"/>
+<toc-element topic="gettingStartedDatalore.md"/>
+<toc-element topic="gettingStartedGradleAdvanced.md"/>
+</toc-element>
<toc-element topic="overview.md">
<toc-element topic="apiLevels.md">
<toc-element topic="stringApi.md"/>
@@ -25,7 +30,6 @@
<toc-element topic="schemasImportOpenApiJupyter.md"/>
</toc-element>
</toc-element>
-<toc-element topic="installation.md"/>
<toc-element topic="types.md">
<toc-element topic="DataFrame.md"/>
<toc-element topic="DataColumn.md"/>
2 changes: 1 addition & 1 deletion docs/StardustDocs/topics/collectionsInterop.md
@@ -78,7 +78,7 @@ val df2 = df.add("c") { a + b }
<tip>

To enable extension properties generation, you should use the [DataFrame plugin](schemasGradle.md)
-for Gradle or the [Kotlin Jupyter kernel](installation.md)
+for Gradle or the [Kotlin Jupyter kernel](gettingStartedJupyterNotebook.md)

</tip>

209 changes: 69 additions & 140 deletions docs/StardustDocs/topics/gettingStarted.md
@@ -2,157 +2,86 @@

The Kotlin DataFrame library gives you the power to manipulate data in your Kotlin projects.

This page explains how to:
* Set up the Kotlin DataFrame library in an IntelliJ IDEA project with Gradle.
* Import and manipulate data.
* Export data.

To use the Kotlin DataFrame library with Jupyter notebooks or Datalore, follow the instructions on our [Installation page](installation.md).
To use the Kotlin DataFrame library with a [custom Gradle configuration](gettingStartedGradle.md),
[Jupyter Notebooks](gettingStartedJupyterNotebook.md), or [Datalore](gettingStartedJupyterNotebook.md),
follow the instructions on the appropriate pages.

## Install Kotlin

Kotlin is included in each IntelliJ IDEA release.
Download and install [IntelliJ IDEA](https://www.jetbrains.com/idea/download/) to start using Kotlin.

## Create Kotlin project

1. In IntelliJ IDEA, select **File** | **New** | **Project**.
2. In the panel on the left, select **New Project**.
3. Name the new project and change its location, if necessary.

> Select the **Create Git repository** checkbox to place the new project under version control. You can enable this
> later at any time.
>
{type="tip"}

4. From the **Language** list, select **Kotlin**.
5. Select the **Gradle** build system.
6. From the **JDK list**, select the [JDK](https://www.oracle.com/java/technologies/downloads/) that you want to use in
your project. The minimum supported version is JDK 8.
* If the JDK is installed on your computer, but not defined in the IDE, select **Add JDK** and specify the path to the
JDK home directory.
* If you don't have the necessary JDK on your computer, select **Download JDK**.
7. From the **Gradle DSL** list, select **Kotlin** or **Groovy**.
8. Select the **Add sample code** checkbox to create a file with a sample `"Hello World!"` application.
9. Click **Create**.

You have successfully created a project with Gradle.

### Update Gradle dependencies

In your Gradle build file (`build.gradle.kts`), add the Kotlin DataFrame library as a dependency:

## Create your powerful application with the Kotlin DataFrame library
<tabs>
<tab title="Kotlin DSL">

```kotlin
dependencies {
implementation("org.jetbrains.kotlinx:dataframe:%dataFrameVersion%")
}
```

<tab title="Gradle">
Here is how you can take the first steps in developing data-intensive Kotlin applications with Gradle in IntelliJ IDEA.

1. **Create your first application for data exploration:**
* To start from scratch,
create a [basic JVM application with the IntelliJ IDEA project wizard](gettingStartedGradle.md).
* To set up the custom Gradle configuration or Linter configuration, follow the instruction on this [page](gettingStartedGradleAdvanced.md).
* If you prefer more robust examples, try to use Kotlin DataFrame together with KotlinDL,
like in the [Titanic example](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/titanic).
2. **Use Kotlin DataFrame and third-party Kotlin Data Science libraries in your application:**
* You can find a curated list of recommended data science libraries on the JVM [here](https://kotlinlang.org/docs/data-science-overview.html#java-libraries).
3. **Learn more about Kotlin DataFrame usage:**
* [Kotlin DataFrame Preview blogpost](https://blog.jetbrains.com/kotlin/2022/06/kotlin-dataframe-library-preview/)
* [Kotlin DataFrame Overview (video)](https://www.youtube.com/watch?v=qGou8F2asNw)
* [Replacing SQL with Kotlin's 'dataframe' on the Las Vegas Strip by Andrew Goldberg (video)](https://www.youtube.com/watch?v=sDZWiu9nnuU)
* [How to use KotlinDL, Kotlin DataFrame library and Kandy together (video)](https://www.youtube.com/watch?v=4IYBVdyP_8s)
4. **Join the Kotlin Data Science community:**
* ![Slack](https://kotlinlang.org/docs/images/slack.svg)Slack:
get an [invite](https://surveys.jetbrains.com/s3/kotlin-slack-sign-up?_ga=2.148899126.1808346675.1686912551-1605045648.1686912543&_gl=1*64xzk9*_ga*MTYwNTA0NTY0OC4xNjg2OTEyNTQz*_ga_9J976DJZ68*MTY4NjkxMjU0Mi4xLjEuMTY4NjkxNTE3OC41OC4wLjA.)
and join the #datascience, #science, #mathematics channels.
* ![StackOverflow](https://kotlinlang.org/docs/images/stackoverflow.svg)StackOverflow:
subscribe to the ["kotlin"](https://stackoverflow.com/questions/tagged/kotlin)
and ["kotlin-dataframe"](https://stackoverflow.com/questions/tagged/kotlin-dataframe) tags.
5. **Follow Kotlin for Data Science** on ![Twitter](https://kotlinlang.org/docs/images/twitter.svg)[Twitter](https://twitter.com/KotlinForData)

If you encounter any difficulties or problems,
report an issue to our [issue tracker](https://github.com/Kotlin/dataframe/issues).
</tab>

<tab title="Groovy DSL">
<tab title="Jupyter Notebook">
Support for programming in Jupyter Notebook is one of Kotlin's key benefits.
It brings new visualization and presentation possibilities
while retaining the power and benefits of a strongly typed programming language.

1. **[Set up your environment for development in Jupyter Notebook.](https://jupyter.org/install)**
2. **Create your first notebook running with the Kotlin kernel:**
* To start from scratch, create a [basic Kotlin notebook](gettingStartedJupyterNotebook.md).
* If you prefer more robust examples, you can download and run [these examples](https://github.com/Kotlin/dataframe/tree/master/examples/notebooks) locally.
3. **[Learn more about Kotlin kernel for Jupyter Notebook.](https://github.com/Kotlin/kotlin-jupyter)**
4. **Join the Kotlin Data Science community:**
* ![Slack](https://kotlinlang.org/docs/images/slack.svg)Slack:
get an [invite](https://surveys.jetbrains.com/s3/kotlin-slack-sign-up?_ga=2.148899126.1808346675.1686912551-1605045648.1686912543&_gl=1*64xzk9*_ga*MTYwNTA0NTY0OC4xNjg2OTEyNTQz*_ga_9J976DJZ68*MTY4NjkxMjU0Mi4xLjEuMTY4NjkxNTE3OC41OC4wLjA.)
and join the #datascience, #science, #mathematics channels.
* ![StackOverflow](https://kotlinlang.org/docs/images/stackoverflow.svg)StackOverflow:
subscribe to the ["kotlin"](https://stackoverflow.com/questions/tagged/kotlin)
and ["kotlin-dataframe"](https://stackoverflow.com/questions/tagged/kotlin-dataframe) tags.
5. **Follow Kotlin for Data Science** on ![Twitter](https://kotlinlang.org/docs/images/twitter.svg)[Twitter](https://twitter.com/KotlinForData)
</tab>

```kotlin
dependencies {
implementation 'org.jetbrains.kotlinx:dataframe:%dataFrameVersion%'
}
```
<tab title="Datalore">
With Datalore, you can use Kotlin in the browser straight out of the box, no installation required.

You can also collaborate on Kotlin notebooks in real time,
get smart coding assistance when writing code, and share results as interactive or static reports.

1. **[Create your first notebook in Datalore](https://www.jetbrains.com/datalore/features/notebooks/)**
* To start from scratch, create a [basic Kotlin notebook](gettingStartedDatalore.md).
* If you prefer more robust examples, you can download them from GitHub, upload them to Datalore,
and run [these examples](https://github.com/Kotlin/dataframe/tree/master/examples/notebooks) here.
2. **[Learn more about Kotlin kernel for Jupyter Notebook.](https://github.com/Kotlin/kotlin-jupyter)**
3. **Join the Kotlin Data Science community:**
* ![Slack](https://kotlinlang.org/docs/images/slack.svg)Slack:
get an [invite](https://surveys.jetbrains.com/s3/kotlin-slack-sign-up?_ga=2.148899126.1808346675.1686912551-1605045648.1686912543&_gl=1*64xzk9*_ga*MTYwNTA0NTY0OC4xNjg2OTEyNTQz*_ga_9J976DJZ68*MTY4NjkxMjU0Mi4xLjEuMTY4NjkxNTE3OC41OC4wLjA.)
and join the #datascience, #science, #mathematics channels.
* ![StackOverflow](https://kotlinlang.org/docs/images/stackoverflow.svg)StackOverflow:
subscribe to the ["kotlin"](https://stackoverflow.com/questions/tagged/kotlin)
and ["kotlin-dataframe"](https://stackoverflow.com/questions/tagged/kotlin-dataframe) tags.
4. **Follow Kotlin for Data Science** on ![Twitter](https://kotlinlang.org/docs/images/twitter.svg)[Twitter](https://twitter.com/KotlinForData)

</tab>

</tabs>

### Add imports

In `src/main/kotlin/Main.kt`, add the following imports at the top of the file:

```kotlin
import org.jetbrains.kotlinx.dataframe.DataFrame
import org.jetbrains.kotlinx.dataframe.io.*
import org.jetbrains.kotlinx.dataframe.api.*
```

## Import data

Download the file `movies.csv` from [here](https://github.com/Kotlin/dataframe/blob/master/data/movies.csv) to the root directory of your project:


Delete the `println()` functions and comments from your main function in `Main.kt`.

To import the movie sample data into a data frame and print it, inside your main function in `Main.kt`, add the following code:

```kotlin
// Import your data to a data frame
var df = DataFrame.read("movies.csv")

// Print your data frame
df.print()
```

## Manipulate data

To print some information about your data frame and sort your data, add the following additional lines of code:

```kotlin
// Print some information about the data frame
println(df.columnNames()) // Print column names
println(df.count()) // Print number of rows

// Sort your data alphabetically by title
df = df.sortBy("title")

// Filter your data so that only comedy films remain, and print
df = df.filter { "genres"<String>().contains("Comedy") }
df.print()
```

## Export data

To export the current version of your data frame in CSV format, add the following additional lines of code and run `Main.kt`.

```kotlin
// Export your manipulated data to CSV format
df.writeCSV("movies-by-title.csv")
```


<code-block lang="console" collapsed-title="Example terminal output" collapsible="true">
movieId title genres
0 9b30aff7943f44579e92c261f3adc193 Women in Black (1997) Fantasy|Suspenseful|Comedy
1 2a1ba1fc5caf492a80188e032995843e Bumblebee Movie (2007) Comedy|Jazz|Family|Animation
2 f44ceb4771504342bb856d76c112d5a6 Magical School Boy and the Rock of Wi... Fantasy|Growing up|Magic
3 43d02fb064514ff3bd30d1e3a7398357 Master of the Jewlery: The Company of... Fantasy|Magic|Suspenseful
4 6aa0d26a483148998c250b9c80ddf550 Sun Conflicts: Part IV: A Novel Espai... Fantasy
5 eace16e59ce24eff90bf8924eb6a926c The Outstanding Bulk (2008) Fantasy|Superhero|Family
6 ae916bc4844a4bb7b42b70d9573d05cd In Automata (2014) Horror|Existential
7 c1f0a868aeb44c5ea8d154ec3ca295ac Interplanetary (2014) Sci-fi|Futuristic
8 9595b771f87f42a3b8dd07d91e7cb328 Woods Run (1994) Family|Drama
9 aa9fc400e068443488b259ea0802a975 Anthropod-Dude (2002) Superhero|Fantasy|Family|Growing up
10 22d20c2ba11d44cab83aceea39dc00bd The Chamber (2003) Comedy|Drama
11 8cf4d0c1bd7b41fab6af9d92c892141f That Thing About an Iceberg (1997) Drama|History|Family|Romance
12 c2f3e7588da84684a7d78d6bd8d8e1f4 Vehicles (2006) Animation|Family
13 ce06175106af4105945f245161eac3c7 Playthings Tale (1995) Animation|Family
14 ee28d7e69103485c83e10b8055ef15fb Metal Man 2 (2010) Fantasy|Superhero|Family
15 c32bdeed466f4ec09de828bb4b6fc649 Surgeon Odd in the Omniverse of Crazy... Fantasy|Superhero|Family|Horror
16 d4a325ab648a42c4a2d6f35dfabb387f Bad Dream on Pine Street (1984) Horror
17 60ebe74947234ddcab49dea1a958faed The Shimmering (1980) Horror
18 f24327f2b05147b197ca34bf13ae3524 Krubit: Societal Teachings for Do Man... Comedy
19 2bb29b3a245e434fa80542e711fd2cee This is No Movie (1950) (no genres listed)
[movieId, title, genres]
20
movieId title genres
0 2a1ba1fc5caf492a80188e032995843e Bumblebee Movie (2007) Comedy|Jazz|Family|Animation
1 f24327f2b05147b197ca34bf13ae3524 Krubit: Societal Teachings for Do Man... Comedy
2 22d20c2ba11d44cab83aceea39dc00bd The Chamber (2003) Comedy|Drama
3 9b30aff7943f44579e92c261f3adc193 Women in Black (1997) Fantasy|Suspenseful|Comedy
</code-block>

Congratulations! You have successfully used the Kotlin DataFrame library to import, manipulate and export data.

## Next steps
* Learn more about how to [import and export data](io.md)
* Learn about our different [access APIs](apiLevels.md)
* Explore the many different [operations that you can perform](operations.md)
13 changes: 13 additions & 0 deletions docs/StardustDocs/topics/gettingStartedDatalore.md
@@ -0,0 +1,13 @@
[//]: # (title: Get started with Kotlin DataFrame on Datalore)

## Datalore

To start with the Kotlin DataFrame library in Datalore, create a Kotlin notebook first:

![Installation in Datalore](datalore-1.png)

As the notebook you've created is a Jupyter Notebook, you can follow the instructions
in the [previous section](gettingStartedJupyterNotebook.md) to use the Kotlin DataFrame library.
The simplest way of doing this is shown in the screenshot:

![Datalore notebook](datalore-2.png)