Python Flink Examples
=====================

+ ![Flink UI](https://github.com/wdm0006/flink-python-examples/tree/master/images/flink_ui.png)
+
A collection of examples using Apache Flink's new python API. To set up your local environment with
- the latest Flink build, see the guide [HERE](http://willmcginnis.com/2015/11/08/getting-started-with-python-and-apache-flink/).
+ the latest Flink build, see the guide:
+
+ * [HERE](http://willmcginnis.com/2015/11/08/getting-started-with-python-and-apache-flink/).

- The examples here use the v1.0 python API (they won't work with the current stable release pre-1.0), and
- are meant to serve as demonstrations of simple use cases. Currently the python API supports a portion of the DataSet
- API, which has a similar functionality to Spark, from the user's perspective.
+ The examples here use the v0.10.0 python API, and are meant to serve as demonstrations of simple use cases. Currently
+ the python API supports a portion of the DataSet API, which has a similar functionality to Spark, from the user's
+ perspective.

To run the examples, I've included a runner script at the top level with methods for each example, simply
- add in the path to your pyflink script and you should be good to go (as long as you have a flask cluster running locally).
+ add in the path to your pyflink script and you should be good to go (as long as you have a flink cluster running locally).

The currently included examples are:

Examples
========

- A listing of the examples included here.
+ A listing of the examples and their resultant flink plans are included here.

Word Count
----------

An extremely simple analysis program that reads its source from a simple string, counts the occurrences of each word,
and outputs to a file on disk (using the overwrite functionality).
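
The counting logic can be sketched in plain Python, without the Flink API. This is only an illustration of the steps described above, not the code from the repository; the function names `word_count` and `write_counts` and the sample sentence are assumptions made for the sketch.

```python
from collections import Counter


def word_count(text):
    """Count occurrences of each lowercased, whitespace-separated word."""
    return Counter(text.lower().split())


def write_counts(counts, path):
    """Write 'word,count' lines to path, replacing any existing file."""
    # Mode "w" truncates an existing file, similar in spirit to an
    # overwrite-style sink (hypothetical stand-in for the Flink output).
    with open(path, "w") as handle:
        for word, count in sorted(counts.items()):
            handle.write(f"{word},{count}\n")


# Hypothetical input string, not taken from the repository.
counts = word_count("the quick brown fox jumps over the lazy dog the end")
```

In a Flink job the same three stages (split into words, group by word, sum the counts) would be expressed as operators on a DataSet rather than a single `Counter` call.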

+ ![Word Count Plan](https://github.com/wdm0006/flink-python-examples/tree/master/images/word_count_plan.png)

Trending Hashtags
-----------------
@@ -32,6 +37,7 @@ A very similar example to word count, but includes a filter step to only include
The input data in this case is read off of disk, and the output is written as a csv. The file is generated dynamically
at run time, so you can play with different volumes of tweets to get an idea of Flink's scalability and performance.
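
As a rough plain-Python sketch of this flow (not the Flink job itself), the snippet below generates synthetic tweets, filters tokens, and counts them. It assumes the filter step keeps only tokens that start with `#`; the vocabulary and the names `generate_tweets` and `trending_hashtags` are hypothetical.

```python
import random
from collections import Counter


def generate_tweets(n, seed=0):
    """Build n synthetic tweets from a small, made-up vocabulary."""
    rng = random.Random(seed)
    words = ["flink", "python", "#bigdata", "#streaming", "data", "#flink"]
    return [" ".join(rng.choice(words) for _ in range(5)) for _ in range(n)]


def trending_hashtags(tweets, top=3):
    """Return the most common hashtags as (tag, count) pairs."""
    # Filter step: keep only tokens that look like hashtags (assumption).
    tags = (word for tweet in tweets for word in tweet.split()
            if word.startswith("#"))
    return Counter(tags).most_common(top)


top_tags = trending_hashtags(["#flink is fun", "learning #flink and #python"])
```

Raising `n` in `generate_tweets` is the plain-Python analogue of playing with different tweet volumes.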

+ ![Trending Hashtags Plan](https://github.com/wdm0006/flink-python-examples/tree/master/images/trending_hashtags_plan.png)

Data Enrichment
---------------
@@ -40,17 +46,23 @@ In this example, we have row-wise json in one file, with an attribute field that
colors. So we load both datasets in, convert the json data into an ordered and typed tuple, and join the two together
to get a nice dataset of cars and their colors.
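
The convert-then-join step might look like the following in plain Python. This is a sketch under assumptions: the JSON field names `car` and `attribute`, the integer key type, and the sample rows are all hypothetical, and a dictionary lookup stands in for the Flink join.

```python
import json


def enrich(json_lines, color_rows):
    """Join row-wise JSON car records with a (key, color) lookup table."""
    colors = dict(color_rows)  # key -> color name
    cars = []
    for line in json_lines:
        record = json.loads(line)
        # Fix the field order and types: (car name as str, key as int).
        cars.append((str(record["car"]), int(record["attribute"])))
    # Inner join: cars whose key has no matching color are dropped.
    return [(car, colors[key]) for car, key in cars if key in colors]


# Hypothetical sample data, not taken from the repository.
enriched = enrich(
    ['{"car": "civic", "attribute": 1}', '{"car": "mustang", "attribute": 2}'],
    [(1, "red"), (2, "blue")],
)
```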

+ ![Data Enrichment Plan](https://github.com/wdm0006/flink-python-examples/tree/master/images/data_enrichment_plan.png)
+
Mean Values
-----------

Takes in a csv with two columns and finds the mean of each column, using a custom reducer function. Afterwards, it
formats a string nicely with the output and dumps that onto disk.
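
One way to compute both means with a single reducer is to carry a running count alongside the column sums. The sketch below uses `functools.reduce` in place of a Flink reduce operator; the names `add_rows` and `column_means`, the sample rows, and the output wording are assumptions.

```python
from functools import reduce


def add_rows(acc, row):
    """Reducer: element-wise sum of (col_a, col_b, count) triples."""
    return (acc[0] + row[0], acc[1] + row[1], acc[2] + row[2])


def column_means(rows):
    """Return the mean of each column for a non-empty list of (a, b) rows."""
    # Tag every row with a count of 1 so the reducer can total the row count.
    total_a, total_b, n = reduce(add_rows, ((a, b, 1) for a, b in rows))
    return total_a / n, total_b / n


# Hypothetical sample rows, not taken from the repository.
mean_a, mean_b = column_means([(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)])
summary = f"mean of column 1: {mean_a:.2f}, mean of column 2: {mean_b:.2f}"
```

Because `add_rows` is associative and commutative, partial results can be combined in any order, which is the property a distributed reducer relies on.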

+ ![Mean Values Plan](https://github.com/wdm0006/flink-python-examples/tree/master/images/mean_values_plan.png)
+
Mandelbrot Set
--------------

Creates a Mandelbrot set from a set of candidates. Inspired by [this post](http://1oclockbuzz.com/2015/11/24/pyspark-and-the-mandelbrot-set-overkill-indeed/).
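
A minimal plain-Python sketch of filtering candidates with the standard escape-time test is shown below. The grid bounds, the iteration limit, and the names `in_mandelbrot` and `candidate_grid` are assumptions for illustration; the repository's example may choose candidates differently.

```python
def in_mandelbrot(c, max_iter=50):
    """Return True if c has not escaped |z| > 2 after max_iter iterations."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return False
    return True


def candidate_grid(steps=20):
    """Evenly spaced complex candidates over [-2, 1] x [-1.5, 1.5]."""
    span = steps - 1
    return [complex(-2 + 3 * i / span, -1.5 + 3 * j / span)
            for i in range(steps) for j in range(steps)]


# Keep only the candidates that pass the escape-time test.
members = [c for c in candidate_grid() if in_mandelbrot(c)]
```

Each candidate is tested independently, so the filter parallelizes naturally across a cluster.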

+ ![Mandelbrot Plan](https://github.com/wdm0006/flink-python-examples/tree/master/images/mandelbrot_plan.png)
+
Features
========
