Apache Flink Example With Java In Flink Cluster
Posted By : Rahbar Ali | 29-Sep-2018
Apache Flink is an open source framework and distributed processing engine for stateful computations over bounded and unbounded data.
1- Bounded (limited) data: data that has both a start point and an end point.
2- Unbounded (unlimited) data: data that has a start point but no end point.
For example, suppose we stand beside a road and count the vehicles passing a certain point. There is no limit to the number of vehicles that can pass, so this kind of data is unbounded.
Apache Flink processes both types of data at in-memory speed.
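The example later in this post is a bounded (batch) job that reads a file. For contrast, here is a minimal sketch of the same word count on an unbounded stream using the DataStream API; the class name, host and port are assumptions for illustration, and you can feed the socket with a tool such as netcat.

package com;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class UnboundedWordCount {

    public static void main(String[] args) throws Exception {
        final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // unbounded source: lines arriving on a local socket
        // (start one with: nc -lk 9999 -- host and port are assumptions)
        DataStream<String> lines = env.socketTextStream("localhost", 9999);

        DataStream<Tuple2<String, Integer>> counts = lines
                .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                    @Override
                    public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
                        for (String token : value.toLowerCase().split("\\W+")) {
                            if (token.length() > 0) {
                                out.collect(new Tuple2<>(token, 1));
                            }
                        }
                    }
                })
                .keyBy(0)  // group by the word
                .sum(1);   // keep a running count per word

        counts.print();
        env.execute("Unbounded WordCount");
    }
}

Because the socket never ends, this job keeps running and updating the counts until it is cancelled.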
You can download Apache Flink from
http://mirrors.fibergrid.in/apache/flink/flink-1.6.1/flink-1.6.1-src.tgz
(this link is the source archive; to start a local cluster directly, use the prebuilt binary distribution of the same version from the Apache Flink download page).
After downloading, extract it and go to the bin directory of the extracted Flink folder under
/Downloads/
Now start the local cluster with this command:
./start-cluster.sh
Now open
http://localhost:8081/#/overview
in your browser and you will see the Flink web dashboard.
Now we are going to create a sample project that uses the Apache Flink framework to process data from a file and print the result. Generate the project from the Flink quickstart Maven archetype:
mvn archetype:generate \
-DarchetypeGroupId=org.apache.flink \
-DarchetypeArtifactId=flink-quickstart-java \
-DarchetypeVersion=1.6.1
After creating the project, open the StreamingJob class and replace its contents with the following code:
package com;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.aggregation.Aggregations;
import org.apache.flink.api.java.tuple.Tuple2;

public class StreamingJob {

    public static void main(String[] args) throws Exception {
        // set up the batch execution environment
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // bounded input: read the file line by line
        DataSet<String> text = env.readTextFile("/home/rahbar/Desktop/flink.txt");

        // split the lines into words, group by word and sum the counts
        DataSet<Tuple2<String, Integer>> counts = text.flatMap(new LineSplitter())
                .groupBy(0)
                .aggregate(Aggregations.SUM, 1);

        // print() triggers execution of the batch job itself,
        // so a separate env.execute() call is not needed here
        counts.print();
    }
}
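One detail worth noting: with the DataSet API, counts.print() is eager and triggers execution itself, so calling env.execute() after it would fail with a "no new data sinks" error. If you would rather write the result to a file, replace counts.print() with a file sink and then call execute(); a minimal sketch, assuming an output path:

// alternative sink: write the counts as CSV instead of printing them
// (the output path below is an assumption for illustration)
counts.writeAsCsv("/home/rahbar/Desktop/wordcount-result.csv", "\n", " ");

// with a file sink there is no eager print(), so execute() is what runs the job
env.execute("WordCount Example");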
Now create a class named LineSplitter in the same package and add this code:
package com;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class LineSplitter implements FlatMapFunction<String, Tuple2<String, Integer>> {

    private static final long serialVersionUID = 1L;

    @Override
    public void flatMap(String value, Collector<Tuple2<String, Integer>> out) {
        // normalize and split the line into words
        String[] tokens = value.toLowerCase().split("\\W+");

        // emit a (word, 1) pair for every word
        for (String token : tokens) {
            if (token.length() > 0) {
                out.collect(new Tuple2<String, Integer>(token, 1));
            }
        }
    }
}
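As a side note, the same splitting logic can be written as a lambda directly inside StreamingJob instead of a separate class. Because Java erases the Tuple2 generics from the lambda, the result type must then be declared explicitly with returns(...). A sketch of that alternative, replacing the counts definition shown earlier:

// extra imports needed in StreamingJob for this variant
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.util.Collector;

// inside main(), replacing the counts definition shown above
DataSet<Tuple2<String, Integer>> counts = text
        .flatMap((String line, Collector<Tuple2<String, Integer>> out) -> {
            for (String token : line.toLowerCase().split("\\W+")) {
                if (token.length() > 0) {
                    out.collect(new Tuple2<>(token, 1));
                }
            }
        })
        // the lambda loses the generic type information, so declare it explicitly
        .returns(Types.TUPLE(Types.STRING, Types.INT))
        .groupBy(0)
        .aggregate(Aggregations.SUM, 1);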
Now build the project with this command:
mvn clean package
After a successful build it will create a jar file in the target folder of the project.
Now go to the bin directory of the Flink installation:
/Downloads/flink-1.6.0/bin$
and submit the job with this command:
./flink run /home/rahbar/Documents/forblog/1.0/target/1.0-1.0.jar
It will count how many times each word appears in the file and print a result like this:
(accesses,1)
(allwindowfunction,1)
(amount,22)
(apache,1)
(api,1)
(attempt,1)
(bean,1)
(beancreationexception,1)
(beans,1)
(btc,8)
(cancelling,1)
(class,3)
(common,3)
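Since the input path is hard-coded in StreamingJob, every run reads the same file. A common refinement, sketched below, is to pass the path as a program argument using Flink's ParameterTool; the --input flag name is an assumption for illustration, and the hard-coded file is kept as the default.

package com;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.aggregation.Aggregations;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.utils.ParameterTool;

public class StreamingJob {

    public static void main(String[] args) throws Exception {
        // read program arguments; --input is an illustrative flag name,
        // the original hard-coded file is kept as the default
        final ParameterTool params = ParameterTool.fromArgs(args);
        final String inputPath = params.get("input", "/home/rahbar/Desktop/flink.txt");

        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<String> text = env.readTextFile(inputPath);

        DataSet<Tuple2<String, Integer>> counts = text.flatMap(new LineSplitter())
                .groupBy(0)
                .aggregate(Aggregations.SUM, 1);

        counts.print();
    }
}

The job can then be submitted exactly as above, optionally adding something like --input /path/to/another/file after the jar path.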
Thanks!
If you have any query regarding this blog, please leave a comment.
About Author
Rahbar Ali
Rahbar Ali is a bright Java developer who is keen to learn new skills.