[Documentation] Update README.md #5


Merged
merged 4 commits into from
Apr 24, 2019
133 changes: 101 additions & 32 deletions README.md
@@ -1,65 +1,134 @@
# Spark .NET
![Icon](docs/img/dotnetsparklogo-6.png)

![Icon](docs/img/spark-dot-net-logo.PNG)
# .NET for Apache® Spark™

Spark .NET is the .NET API for [Apache Spark](https://spark.apache.org/).
.NET for Apache Spark provides high-performance APIs for using [Apache Spark](https://spark.apache.org/) from C# and F#. With these .NET APIs, you can access the most popular DataFrame and SparkSQL aspects of Apache Spark for working with structured data, and Spark Structured Streaming for working with streaming data.

## Build Status
| ![Ubuntu icon](docs/img/ubuntu-icon-32.png) | ![Ubuntu icon](docs/img/ubuntu-icon-32.png) | ![Windows icon](docs/img/windows-icon-32.png) |
| :---: | :---: | :---: |
| Ubuntu 16.04 | Ubuntu 18.04 | Windows 10 |
| | | [![Build Status](https://dnceng.visualstudio.com/internal/_apis/build/status/spark.net?branchName=master)](https://dnceng.visualstudio.com/internal/_build/latest?definitionId=301?branchName=master)|
.NET for Apache Spark is compliant with .NET Standard, a formal specification of .NET APIs that are common across .NET implementations. This means you can use .NET for Apache Spark anywhere you write .NET code, allowing you to reuse all the knowledge, skills, code, and libraries you already have as a .NET developer.

## Table of Contents

- [Introduction](#introduction)
- [Quick Start (TL;DR)](#quick-start)
- [Features](docs/features.md)
- [FAQ/Troubleshooting](#faq)
- [Inspiration and Special Thanks](#inspiration)
- [How to Engage, Contribute and Provide Feedback](#community)
- [Get Started](#get-started)
- [Build Status](#build-status)
- [Building from Source](#building-from-source)
- [Samples](#samples)
- [Contributing](#contributing)
- [Inspiration and Special Thanks](#inspiration-and-special-thanks)
- [How to Engage, Contribute and Provide Feedback](#how-to-engage-contribute-and-provide-feedback)
- [.NET Foundation](#net-foundation)
- [Code of Conduct](#code-of-conduct)
- [License](#license)

<a name="introduction"></a>
## Introduction
## Get Started



## Build Status

| ![Ubuntu icon](docs/img/ubuntu-icon-32.png) | ![Ubuntu icon](docs/img/ubuntu-icon-32.png) | ![Windows icon](docs/img/windows-icon-32.png) |
| :---: | :---: | :---: |
| Ubuntu 16.04 | Ubuntu 18.04 | Windows 10 |
| | | [![Build Status](https://dnceng.visualstudio.com/internal/_apis/build/status/spark.net?branchName=master)](https://dnceng.visualstudio.com/internal/_build/latest?definitionId=301?branchName=master)|

<a name="quick-start"></a>
## Quick Start (TL;DR)
## Building from Source

Spark .NET will be redistributed as a Nuget package and a formal release here on Github eventually to help you build your applications easily. Until then, please feel free to build it locally on your machine and link it appropriately. Building from source is very easy and the whole process (from cloning to being able to run your app) should take less than 15 minutes!
Building from source is very easy and the whole process (from cloning to being able to run your app) should take less than 15 minutes!

| | | Instructions |
| :---: | :--- | :--- |
| ![Windows icon](docs/img/windows-icon-32.png) | **Windows** | <ul><li>Local - [.NET Framework 4.6.1](docs/building/windows-instructions.md#using-visual-studio-for-net-framework-461)</li><li>Local - [.NET Core 2.1.x](docs/building/windows-instructions.md#using-net-core-cli-for-net-core-21x)</li></ul> |
| ![Ubuntu icon](docs/img/ubuntu-icon-32.png) | **Ubuntu** | <ul><li>Local - [.NET Core 2.1.x](docs/building/ubuntu-instructions.md)</li><li>[Azure HDInsight Spark - .NET Core 2.1.x](deployment/README.md)</li></ul> |

<a name="samples"></a>
## Samples

There are two types of samples/apps in the .NET for Apache Spark repo:

* ![Icon](docs/img/app-type-getting-started.png) Getting Started - .NET for Apache Spark code focused on simple and minimalistic scenarios.

* ![Icon](docs/img/app-type-e2e.png) End-to-End apps/scenarios - Real-world examples of industry-standard benchmarks, use cases, and business applications implemented using .NET for Apache Spark.

We welcome contributions to both categories!

<table>
<tr>
<td width="25%">
<h4><b>Analytics Scenario</b></h4>
</td>
<td width="35%">
<h4><b>Description</b></h4>
</td>
<td>
<h4><b>Scenarios</b></h4>
</td>
</tr>
<tr>
<td width="25%">
<h5>Dataframes and SparkSQL</h5>
</td>
<td width="35%">
Simple code snippets to help you become familiar with the programming experience of .NET for Apache Spark.
</td>
<td>
<h5>Basic &nbsp;&nbsp;&nbsp;
<a href="examples/Microsoft.Spark.CSharp.Examples/Sql/Basic.cs">C#</a> &nbsp; &nbsp; <a href="examples/Microsoft.Spark.FSharp.Examples/Sql/Basic.fs">F#</a>&nbsp;&nbsp;&nbsp;<a href="#"><img src="docs/img/app-type-getting-started.png" alt="Getting started icon"></a></h5>
</td>
</tr>
<tr>
<td width="25%">
<h5>Structured Streaming</h5>
</td>
<td width="35%">
Code snippets to show you how to utilize Apache Spark's Structured Streaming (<a href="https://spark.apache.org/docs/2.3.1/structured-streaming-programming-guide.html">2.3.1</a>, <a href="https://spark.apache.org/docs/2.3.2/structured-streaming-programming-guide.html">2.3.2</a>, <a href="https://spark.apache.org/docs/2.4.1/structured-streaming-programming-guide.html">2.4.1</a>, <a href="https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html">Latest</a>)
</td>
<td>
<h5>Word Count &nbsp;&nbsp;&nbsp;
<a href="examples/Microsoft.Spark.CSharp.Examples/Sql/Streaming/StructuredNetworkWordCount.cs">C#</a> &nbsp;&nbsp;&nbsp;<a href="examples/Microsoft.Spark.FSharp.Examples/Sql/Streaming/StructuredNetworkWordCount.fs">F#</a> &nbsp;&nbsp;&nbsp;<a href="#"><img src="docs/img/app-type-getting-started.png" alt="Getting started icon"></a></h5>
<h5>Windowed Word Count &nbsp;&nbsp;&nbsp;<a href="examples/Microsoft.Spark.CSharp.Examples/Sql/Streaming/StructuredNetworkWordCountWindowed.cs">C#</a> &nbsp; &nbsp;<a href="examples/Microsoft.Spark.FSharp.Examples/Sql/Streaming/StructuredNetworkWordCountWindowed.fs">F#</a> &nbsp;&nbsp;&nbsp;<a href="#"><img src="docs/img/app-type-getting-started.png" alt="Getting started icon"></a></h5>
<h5>Word Count on data from <a href="https://kafka.apache.org/">Kafka</a> &nbsp;&nbsp;&nbsp;<a href="examples/Microsoft.Spark.CSharp.Examples/Sql/Streaming/StructuredKafkaWordCount.cs">C#</a> &nbsp;&nbsp;&nbsp;<a href="examples/Microsoft.Spark.FSharp.Examples/Sql/Streaming/StructuredKafkaWordCount.fs">F#</a> &nbsp; &nbsp;&nbsp;<a href="#"><img src="docs/img/app-type-getting-started.png" alt="Getting started icon"></a></h5>
</td>
</tr>
<tr>
<td width="25%">
<h4>TPC-H Queries</h4>
</td>
<td width="35%">
Code to show you how to author complex queries using .NET for Apache Spark.
</td>
<td>
<h5>TPC-H Functional &nbsp;&nbsp;&nbsp;
<a href="benchmark/csharp/Tpch/TpchFunctionalQueries.cs">C#</a> &nbsp;&nbsp;&nbsp;<a href="#"><img src="docs/img/app-type-e2e.png" alt="End-to-end app icon"></a></h5>
<h5>TPC-H SparkSQL &nbsp;&nbsp;&nbsp;
<a href="benchmark/csharp/Tpch/TpchSqlQueries.cs">C#</a> &nbsp;&nbsp;&nbsp;<a href="#"><img src="docs/img/app-type-e2e.png" alt="End-to-end app icon"></a></h5>
</td>
</tr>
</table>

## Contributing

We welcome contributions! Please review our [contribution guide](CONTRIBUTING.md).

<a name="features"></a>
## Features
## Inspiration and Special Thanks

<a name="faq"></a>
## Frequently Asked Questions

<a name="inspiration"></a>
## Inspiration
This project would not have been possible without the outstanding work from the following communities:

## Community
- [Apache Spark](https://spark.apache.org/): Unified Analytics Engine for Big Data, the underlying backend execution engine for .NET for Apache Spark.
- [Mobius](https://github.com/Microsoft/Mobius): C# and F# language binding and extensions to Apache Spark, a precursor project to .NET for Apache Spark from the same Microsoft group.
- [PySpark](https://spark.apache.org/docs/latest/api/python/index.html): Python bindings for Apache Spark, one of the implementations .NET for Apache Spark derives inspiration from.
- [SparkR](https://spark.apache.org/docs/latest/sparkr.html): R bindings for Apache Spark, one of the implementations .NET for Apache Spark derives inspiration from.
- [Apache Arrow](https://arrow.apache.org/): A cross-language development platform for in-memory data. This library provides .NET for Apache Spark with efficient ways to transfer column-major data between the JVM and .NET CLR.
- [Pyrolite](https://github.com/irmen/Pyrolite): Java and .NET interface to Python's pickle and Pyro protocols. This library provides .NET for Apache Spark with efficient ways to transfer row-major data between the JVM and .NET CLR.
- [Databricks](https://databricks.com/): Unified analytics platform. Many thanks for all their suggestions towards making .NET for Apache Spark run on Azure and AWS Databricks.

<a name="contact"></a>
## How to Engage, Contribute and Provide Feedback

The Spark .NET team encourages [contributions](docs/contributing.md), both issues and PRs. The first step is finding an [existing issue](https://github.com/dotnet/spark/issues) you want to contribute to or if you cannot find any, [open an issue](https://github.com/dotnet/spark/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+).
The .NET for Apache Spark team encourages [contributions](docs/contributing.md), both issues and PRs. The first step is to find an [existing issue](https://github.com/dotnet/spark/issues) you want to contribute to or, if you cannot find one, to [open an issue](https://github.com/dotnet/spark/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+).

<a name="net-foundation"></a>
## .NET Foundation

The Spark .NET project is part of the [.NET Foundation](http://www.dotnetfoundation.org).
The .NET for Apache Spark project is part of the [.NET Foundation](http://www.dotnetfoundation.org).

<a name="code-of-conduct"></a>
## Code of Conduct

This project has adopted the code of conduct defined by the Contributor Covenant
Binary file added docs/img/app-type-e2e.png
Binary file added docs/img/app-type-getting-started.png
Binary file added docs/img/dotnetsparklogo-6.png