Apache Kafka is a distributed event store & real-time streaming platform. It is an event store and message broker.
Apache Flink is a distributed stream processing framework for real-time data processing. It processes data as it flows through, rather than waiting to collect all data first.
Other stream processing frameworks are Apache Kafka Streams (primarly works in Java), Apache Spark Streaming, etc.
Apache Flink allows for:
- Stateful Processing: Remembers information across multiple events.
- Stateless Processing: Each event is processed independently, no memory of previous events
- Windowing: Group events into time-based batches for aggregation. Supports tumbling (non-overlapping) and sliding (overlapping) windows.
Batch Processing:
- Processes large volumes of data at rest.
- High latency (minutes to hours).
- Apache Spark is a batch processing framework.
- Examples: ETL jobs, data warehousing.
Stream Processing:
- Processes data as it arrives.
- Low latency (milliseconds to seconds).
- Apache Flink is a stream processing framework.
- Examples: Real-time analytics, fraud detection.
Kafka and Flink form a powerful combination for real-time analytics:
- Kafka acts as the data ingestion and storage layer, collecting events from various sources
- Flink consumes data from Kafka topics, processes it in real-time, and can write results back to Kafka or other systems
- This creates a complete streaming pipeline: Ingestion → Processing → Output
Imagine an e-commerce store like Amazon. Every second thousands of users can:
- Browse products
- Search for products
- Add products to their cart
- Purchase products
The system needs to:
- Track user behavior in real-time (page views, items added to cart, etc.)
- Detect suspicious activity (fraud, bot traffic, unusual purchase patterns)
- Real-time analytics (top-selling products, user engagement metrics)
- AI-powered recommendations (personalized product suggestions)
Kafka:
- Handle millions of events per second
- Store events reliably
- Multiple systems can consume the same event
- Decouple producers and consumers
Flink:
- Process events with sub-second latency
- Maintains state across events
- Exact once processing. Example: If we purchase a product, we want to process the event exactly once.
1️⃣ Run the Kafka & Flink Services
docker compose up -dCreate Kafka topics
./setup_kafka.sh- Flink Web UI: http://localhost:8081
- Kafka: localhost:9092
- Kafka UI: http://localhost:8080
2️⃣ Produce Events
# Install dependencies
python3.11 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
# Produce events
python data-generator/producer.pyThis will produce events to the Kafka topic user-events.
3️⃣ Run the Flink Job Install dependencies
docker exec flink-jobmanager pip install -r /opt/requirements.txt
docker exec flink-taskmanager pip install -r /opt/requirements.txtSubmit the jobs
docker exec flink-jobmanager flink run -py /opt/flink/jobs/product-analytics.py
docker exec flink-jobmanager flink run -py /opt/flink/jobs/fraud-detection.py4️⃣ View the results
- Flink Web UI: http://localhost:8081
- Kafka UI: http://localhost:8080
- View Flink Job results -
docker compose logs -f --tail 10 taskmanager
Track how each product is performing in real-time
- How many people viewed the product?
- How many added it to cart?
- How many actually bought it?
- What's the conversion rate?
- How much revenue did it generate?
Note
Stateless processing: Each window is independent. Windowing: We are grouping events by product_id.
Example Result for a single window (1 min duration):
flink-taskmanager | 1> {"product_id": "Apple Watch", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 90, "cart_additions": 39, "purchases": 7, "revenue": 3199.8399999999997, "conversion_rate": 0.07777777777777778}
flink-taskmanager | 2> {"product_id": "shoes-001", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 226, "cart_additions": 59, "purchases": 20, "revenue": 97828.18623999583, "conversion_rate": 0.08849557522123894}
flink-taskmanager | 1> {"product_id": "Airpods Max", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 93, "cart_additions": 23, "purchases": 8, "revenue": 6399.84, "conversion_rate": 0.08602150537634409}
flink-taskmanager | 1> {"product_id": "tablet-001", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 95, "cart_additions": 20, "purchases": 6, "revenue": 7799.870000000001, "conversion_rate": 0.06315789473684211}
flink-taskmanager | 1> {"product_id": "Apple Watch Pro", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 112, "cart_additions": 16, "purchases": 6, "revenue": 51953.62847056468, "conversion_rate": 0.05357142857142857}
flink-taskmanager | 1> {"product_id": "laptop-002", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 101, "cart_additions": 22, "purchases": 8, "revenue": 45999.770000000004, "conversion_rate": 0.07920792079207921}
flink-taskmanager | 1> {"product_id": "book-001", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 391, "cart_additions": 69, "purchases": 29, "revenue": 9394.419177383934, "conversion_rate": 0.0741687979539642}
flink-taskmanager | 1> {"product_id": "Airpods Prod", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 93, "cart_additions": 31, "purchases": 11, "revenue": 4999.75, "conversion_rate": 0.11827956989247312}
flink-taskmanager | 1> {"product_id": "phone-002", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 91, "cart_additions": 28, "purchases": 6, "revenue": 361574.586116209, "conversion_rate": 0.06593406593406594}
flink-taskmanager | 1> {"product_id": "phone-002", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 1, "cart_additions": 1, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 1> {"product_id": "book-001", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 4, "cart_additions": 1, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 1> {"product_id": "tablet-001", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 1, "cart_additions": 0, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 1> {"product_id": "laptop-002", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 4, "cart_additions": 1, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 1> {"product_id": "Airpods Max", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 2, "cart_additions": 0, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 1> {"product_id": "Airpods Prod", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 2, "cart_additions": 0, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 1> {"product_id": "Apple Watch Pro", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 0, "cart_additions": 2, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 2> {"product_id": "phone-001", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 106, "cart_additions": 19, "purchases": 5, "revenue": 8099.91, "conversion_rate": 0.04716981132075472}
flink-taskmanager | 2> {"product_id": "tablet-002", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 92, "cart_additions": 27, "purchases": 8, "revenue": 17999.85, "conversion_rate": 0.08695652173913043}
flink-taskmanager | 2> {"product_id": "jeans-001", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 248, "cart_additions": 71, "purchases": 13, "revenue": 2239.72, "conversion_rate": 0.05241935483870968}
flink-taskmanager | 2> {"product_id": "laptop-001", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 102, "cart_additions": 19, "purchases": 8, "revenue": 20799.84, "conversion_rate": 0.0784313725490196}
flink-taskmanager | 2> {"product_id": "book-002", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 342, "cart_additions": 96, "purchases": 36, "revenue": 8438.888277539389, "conversion_rate": 0.10526315789473684}
flink-taskmanager | 2> {"product_id": "shirt-001", "window_start": "2025-09-21T13:36:00", "window_end": "2025-09-21T13:37:00", "view_count": 250, "cart_additions": 56, "purchases": 15, "revenue": 11336.492395660473, "conversion_rate": 0.06}
flink-taskmanager | 2> {"product_id": "shoes-001", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 2, "cart_additions": 0, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 2> {"product_id": "book-002", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 4, "cart_additions": 1, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 2> {"product_id": "laptop-001", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 1, "cart_additions": 0, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 2> {"product_id": "shirt-001", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 3, "cart_additions": 0, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 2> {"product_id": "tablet-002", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 1, "cart_additions": 0, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 2> {"product_id": "jeans-001", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 3, "cart_additions": 0, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}
flink-taskmanager | 2> {"product_id": "phone-001", "window_start": "2025-09-21T13:37:00", "window_end": "2025-09-21T13:38:00", "view_count": 1, "cart_additions": 0, "purchases": 0, "revenue": 0.0, "conversion_rate": 0.0}Similarly for the second window, I got per product the following summary
flink-taskmanager | 1> {"product_id": "Airpods Max", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 52, "cart_additions": 14, "purchases": 2, "revenue": 1599.96, "conversion_rate": 0.038461538461538464}
flink-taskmanager | 2> {"product_id": "shoes-001", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 104, "cart_additions": 38, "purchases": 7, "revenue": 11263.260781608495, "conversion_rate": 0.0673076923076923}
flink-taskmanager | 1> {"product_id": "laptop-002", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 42, "cart_additions": 10, "purchases": 3, "revenue": 15999.92, "conversion_rate": 0.07142857142857142}
flink-taskmanager | 1> {"product_id": "Apple Watch", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 52, "cart_additions": 13, "purchases": 1, "revenue": 399.98, "conversion_rate": 0.019230769230769232}
flink-taskmanager | 1> {"product_id": "tablet-001", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 63, "cart_additions": 7, "purchases": 2, "revenue": 2999.95, "conversion_rate": 0.031746031746031744}
flink-taskmanager | 1> {"product_id": "book-001", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 248, "cart_additions": 60, "purchases": 19, "revenue": 19066.57753044404, "conversion_rate": 0.07661290322580645}
flink-taskmanager | 1> {"product_id": "Apple Watch Pro", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 63, "cart_additions": 13, "purchases": 3, "revenue": 3199.92, "conversion_rate": 0.047619047619047616}
flink-taskmanager | 1> {"product_id": "Airpods Prod", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 67, "cart_additions": 12, "purchases": 2, "revenue": 1199.94, "conversion_rate": 0.029850746268656716}
flink-taskmanager | 1> {"product_id": "phone-002", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 41, "cart_additions": 15, "purchases": 4, "revenue": 564574.1042868246, "conversion_rate": 0.0975609756097561}
flink-taskmanager | 2> {"product_id": "phone-001", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 50, "cart_additions": 7, "purchases": 1, "revenue": 1799.98, "conversion_rate": 0.02}
flink-taskmanager | 2> {"product_id": "book-002", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 275, "cart_additions": 70, "purchases": 15, "revenue": 9091.994571621532, "conversion_rate": 0.05454545454545454}
flink-taskmanager | 2> {"product_id": "jeans-001", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 121, "cart_additions": 35, "purchases": 8, "revenue": 959.88, "conversion_rate": 0.06611570247933884}
flink-taskmanager | 2> {"product_id": "laptop-001", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 47, "cart_additions": 16, "purchases": 7, "revenue": 20799.84, "conversion_rate": 0.14893617021276595}
flink-taskmanager | 2> {"product_id": "shirt-001", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 117, "cart_additions": 32, "purchases": 8, "revenue": 599.8000000000001, "conversion_rate": 0.06837606837606838}
flink-taskmanager | 2> {"product_id": "tablet-002", "window_start": "2025-09-21T13:45:00", "window_end": "2025-09-21T13:46:00", "view_count": 39, "cart_additions": 8, "purchases": 2, "revenue": 4799.96, "conversion_rate": 0.05128205128205128}You will see new records every minute ✅
Detect suspicious user behavior in real-time to prevent fraud
- Identify users making too many purchases quickly
- Flag unusually high spending amounts
- Detect rapid successive high-value purchases
- Alert immediately when patterns are detected
Note
Stateful processing: We need to remember user behaviour across multiple events. No windowing as fraud detection needs to be done immediately.
Exmaple Result:
flink-taskmanager | 2> {"user_id": "user_056", "alert_type": "high_spending", "reason": "$5399.94 spent in 5 minutes", "risk_score": 0.539994, "timestamp": 1758461139530}
flink-taskmanager | 2> {"user_id": "user_056", "alert_type": "successive_high_value", "reason": "Multiple high-value purchases", "risk_score": 0.9, "timestamp": 1758461139530}
flink-taskmanager | 1> {"user_id": "user_018", "alert_type": "high_spending", "reason": "$442650.94 spent in 5 minutes", "risk_score": 1.0, "timestamp": 1758461143339}
flink-taskmanager | 1> {"user_id": "user_018", "alert_type": "high_spending", "reason": "$455550.13 spent in 5 minutes", "risk_score": 1.0, "timestamp": 1758461145632}
flink-taskmanager | 1> {"user_id": "user_020", "alert_type": "high_spending", "reason": "$8599.95 spent in 5 minutes", "risk_score": 0.8599950000000001, "timestamp": 1758461155209}
flink-taskmanager | 1> {"user_id": "user_018", "alert_type": "high_spending", "reason": "$612941.33 spent in 5 minutes", "risk_score": 1.0, "timestamp": 1758461156239}
flink-taskmanager | 1> {"user_id": "user_018", "alert_type": "successive_high_value", "reason": "Multiple high-value purchases", "risk_score": 0.9, "timestamp": 1758461156239}Scalability:
- Adding more TaskManagers increases parallelism.
- Increase Kafka partitions to improve throughput.
Fault Tolerance:
- Checkpointing: Flink saves its progress every 5 seconds. If a task fails, Flink can recover from the checkpoint.
- Automatic recovery: Flink can recover from task failures.
- Exactly-once processing: Flink can process events exactly once.
Low Latency:
- Results available within 1 minute + watermark delay
- Real-time business insights
- Make File Sink Work
- Add more flink job examples. Ex - ML prediction, etc