- Not Financial Advice! This is Experimental Software. Use At Your Own Risk!
- Parser for market data processing using Apache Flink
- Vibe coded with help from Google Gemini
- This is a work in progress
- docker-compose managaged network,
- react typescript frontend, spring-boot backend, flink-jobmanager, flink-taskmanager, dynamodb
- rest api middle tier and service api consumer
This repository is optimized for "lean and mean" development. You do not need Java, Maven, or Node installed on your host machine; everything executes securely inside Docker environments.
To verify code serialization, data boundary handling (dropping file comments and headers), and aggregation math, fire up the isolated test profile:
docker compose run --rm backend-testTo completely flush image layers and rebuild the production application binaries from a completely scratch state:
docker compose build --no-cacheSpin up your distributed network—including Spring Boot, the Apache Flink JobManager, TaskManagers, Postgres, and DynamoDB Local:
docker compose up
├── docker-compose.yml
│
├── backend
│ ├── Dockerfile
│ ├── pom.xml
│ ├── src
│ ├── main
│ │ ├── java
│ │ │ └── net
│ │ │ └── daveray
│ │ │ ├── BlazeStreamApplication.java
│ │ │ └── flink
│ │ │ │ └── FlinkConfig.java
│ │ │ ├── executor
│ │ │ │ ├── FlinkExecutor.java
│ │ │ │ └── FlinkJobRunner.java
│ │ │ ├── pipeline
│ │ │ │ ├── Orchestrator.java
│ │ │ │ └── StreamingJson.java
│ │ │ ├── pojo
│ │ │ │ └── CandlestickPojo.java
│ │ │ ├── provider
│ │ │ ├── FlinkSourceProvider.java
│ │ │ ├── FlinkSourceProviderFactory.java
│ │ │ ├── file
│ │ │ │ │ ├── FileBacktestSourceProvider.java
│ │ │ │ │ └── FileLineReaderSource.java
│ │ │ │ └── sigr
│ │ │ │ └── SignalRSourceProvider.java
│ │ │ └── transformer
│ │ │ ├── JsonCandleWrapper.java
│ │ │ ├── TimeSeriesAggregator.java
│ │ │ └── Transformer.java
│ │ └── resources
│ │ └── application.yaml
│ └── test
│ ├── java
│ │ └── net
│ │ └── daveray
│ │ └── flink
│ │ ├── JsonCandleWrapperTest.java
│ │ ├── OrchestratorTest.java
│ │ └── TimeSeriesAggregatorTest.java
│ └── resources
│ └── logback-test.xml
│
├── data
│ └── NQ_futures_data.tsf
│
├── frontend
│ ├── Dockerfile
│ ├── index.html
│ ├── nginx.conf
│ ├── package.json
│ ├── src
│ │ ├── App.tsx
│ │ ├── main.tsx
│ │ ├── react
│ │ │ ├── jsx
│ │ │ └── jsx-runtime
│ │ └── StreamingData.tsx
│ ├── tsconfig.json
│ └── vite.config.ts
│
├── log4j-console.xml
├── README.md
The backend code is designed with enterprise-grade architecture:
- Pipeline Orchestrator Pattern: Implements decoupled, pluggable
Transformer<I, O>interfaces. Operations can be sequentially chained together (Orchestrator.add(step)) within an isolated execution pipeline. - Spring IoC Factory Lookup: Features a dynamic
FlinkSourceProviderFactorythat queries Spring’sApplicationContextat runtime. This allows you to toggle data sources (e.g., file-backtests vs. live web sockets) purely via configuration changes. - Cluster Serialization Safety: Refactored entirely into static inner functions to eliminate
NotSerializableExceptionfaults across remote cluster nodes.
A high-performance, distributed stream processing backend built with Spring Boot and Apache Flink (v1.19). This engine orchestrates real-time financial market data streams, normalizes mixed historical time-series cadences, and prepares standard data payloads for automated trade strategy execution.
During backtesting integration, mixed historical data sources introduced a severe time-series anomaly:
- Source A (yfinance) natively provides 2-minute candlestick bars.
- Source B (ProjectX API
retrieveBars()) natively outputs 1-minute candlestick bars.
When blended together, raw sequential processing caused downstream calculation errors due to intermediate, unfinalized 1-minute blocks leaking into consumers.
This project implements a custom, look-back state machine using Flink's Managed Keyed State (ValueState).
- Dynamic Time-Delta Tracking: It calculates the physical
Durationtime-gap between the current bar and the previous bar dynamically, avoiding fragile hardcoded timestamp rules. - Stateful Buffering: If a 1-minute bar arrives, it is safely buffered in state. When its companion bar arrives (delta < 2m), they are mathematically aggregated into a single, finalized 2-minute candlestick and immediately emitted.
- Standalone Pass-through: If a native 2-minute bar arrives (delta >= 2m), the previous buffered bar is flushed, and the native bar passes through untouched.
- SignalR Streaming Aggregator: Integrate ProjectX's live SignalR web socket stream and implement a 1-minute tumbling/sliding window function to process live ticks into uniform bars.
- Execution Layer Gateway: Establish automated API endpoints to pass order signals directly to ProjectX broker placement endpoints.
- Strategy Engine Port: Migrate the core backtesting mathematical strategy routines from Python over to native, stateful Flink operators for sub-millisecond execution matching.
- Front End: build out with react to manage streaming market data and live trade order management.
Author: daveray@daveray.net
Date: 2026-05-05
License: Apache 2.0
- Not Financial Advice! This is Experimental Software. Use At Your Own Risk!
Copyright 2026 David Ray Reizes <daveray@daveray.net>
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.