Apache Flink is the new star in town. It is stealing the thunder from Apache Spark (at least on the streaming side), which has been creating buzz for some time now. This is because Spark Streaming is built on top of RDDs, which are essentially collections, not streams. So now would be the right time to try your hand at Flink, even more so since Flink 1.0 was released last week.

In this short blog, I will explain how to set up Flink on your system.

1. Download Apache Flink

Go to https://flink.apache.org/downloads.html. This page shows the latest stable release of Flink available for download (1.0 at the time of writing). Under the binaries, click the download link matching your Hadoop version and Scala version. I have Hadoop 2.6.0 and Scala 2.11.

If you don’t have Scala installed, you can install it by following the instructions here.
To find out your current Scala version, run the command below:

$scala -version
Scala code runner version 2.11.7 -- Copyright 2002-2013, LAMP/EPFL
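
Once the download finishes, extract the archive and move into the extracted folder. The exact file name depends on the Hadoop and Scala versions you picked; the commands below assume the Hadoop 2.6 / Scala 2.11 binary of Flink 1.0.0.

$tar -xzf flink-1.0.0-bin-hadoop26-scala_2.11.tgz
$cd flink-1.0.0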

2. Start Flink

Start the Flink JobManager by running the command below from the root folder of your Flink installation:

$bin/start-local.sh
Starting jobmanager daemon on host Vishnus-MacBook-Pro.local.
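
If you want to double-check that the JobManager is up, jps (which ships with the JDK) should list a JobManager process. The pids below are just examples; yours will differ.

$jps
6789 JobManager
6812 Jps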

3. Flink Dashboard

Flink has a pretty good UI where you can see details of your jobs, how many task slots are available, and so on. You can access the Flink UI at localhost:8081.
Note: Spark's standalone worker UI also defaults to port 8081, so make sure you don't have Spark running on the same machine.
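
The dashboard is backed by a monitoring REST API on the same port. Assuming your build exposes the /overview endpoint (check the docs for your version), you can also query the cluster summary, including slot counts, from the command line:

$curl http://localhost:8081/overview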

4. Flink Shell

Flink comes with a Scala shell, which can be started by running the command below from the Flink base folder:

$bin/start-scala-shell.sh local

Note: start-scala-shell.sh creates a mini cluster, so you don’t have to start a separate JobManager for the Scala shell to work in local mode.
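
To confirm the shell works end to end, you can try a small batch word count. This is a minimal sketch assuming the shell's pre-bound ExecutionEnvironment variable env; calling print() triggers execution and writes the (word, count) pairs to the console.

Scala-Flink> val text = env.fromElements("to be or not to be")
Scala-Flink> val counts = text.flatMap { _.toLowerCase.split("\\W+") }.map { (_, 1) }.groupBy(0).sum(1)
Scala-Flink> counts.print()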

Flink setup is complete. In the coming blogs, I will write more about building various streaming applications with Apache Flink. Thanks for reading!