Apache Flink
Overview
This training covers Apache Flink, an open-source framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink is designed to run in all common cluster environments and to perform computations at in-memory speed and at any scale. The objectives of the training are:
- Introduction to Apache Flink
- Transformation Operations of the DataSet API
- Interaction with Real-time Data
- Gelly API and Graph Processing
Duration: 3 Days
Pre-Requisites
Course Outline
- Introducing Flink
- Batch Processing vs. Stream Processing
- Hadoop vs. Streaming Engines (Spark & Flink)
- Spark vs. Flink
- Flink Architecture/Ecosystem
- Flink’s programming model and Flow of a Flink program
- Installation of Flink
- Transformation operations of DataSet API
- Default Code Structure of a Flink Program
- WordCount using Map, FlatMap, Filter, and GroupBy
- Joins – Inner join
- Joins – Left, Right & Full Outer Join
- Join Hints for Optimization (Exclusive feature)
- DataStream API Operations
- Data Sources & Sinks of the DataStream API
- First Program using the DataStream API
- Reduce Operation
- Fold Operation
- Aggregation Operations: Flink
- Split Operation
- Iterate Operator
- Windows: Flink
- Introduction to Windowing
- Window Assigners
- Various Time Notions of Windows in Flink
- Tumbling Windows Implementation
- Sliding Windows Implementation
- Session Windows Implementation
- Global Windows Implementation
- Triggers in Windows
- Evictors for Windows
- Watermarks, Late Elements & Allowed Lateness
- How to generate Watermarks
- Recommendation
- State, Checkpointing, and Fault-tolerance
- Understanding State in Flink
- Checkpointing/Barrier Snapshotting
- Incremental Checkpointing (New Feature)
- Types of States
- Value State Implementation
- List State Implementation
- Reducing State Implementation
- Managed Operator State Implementation
- Implement Checkpointing in a Flink Program
- The Broadcast State Implementation
- Queryable State (Beta Version)
- Interacting with Real-Time Data
- Getting Twitter data using its APIs
- Adding Kafka to Flink as a Data source
- Install Kafka – RealTimeTuts Link
- Solve Real-Time Case studies in Flink
- Twitter data analysis in Flink
- Bank Real-Time Fraud detection
- Stock Real-Time Data-Processing
- Table & SQL API | Relational APIs in Flink
- Introducing the Table & SQL API
- Register a Table in Relational APIs
- Writing Queries in the Table & SQL API
- Gelly API for Graph Processing
- What is a Graph
- Calculate Friends of Friends of a Person using the Gelly API
