This project presents a reinforcement learning-based trading agent that operates on simulated historical stock market data, with the aim of reducing psychological bias and impulsive decision-making in trading. A 1D convolutional neural network (CNN) is trained on labeled open, high, low, close, and volume (OHLCV) data to predict support and resistance lines, which are then used to define the observation states for two Q-learning agents, one operating on 4-hour intervals and one on 5-minute intervals. A bucketing strategy discretizes the state spaces, and the 5-minute agent takes the 4-hour agent's decisions as an input to emulate long-term memory. Several Q-learning reward structures are explored. The model achieves consistent profits in simulation, demonstrating the viability of combining CNN-based pattern recognition with reinforcement learning for trading. Developed in Python 3.11.
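As a rough illustration of the bucketing and Q-learning steps described above, the sketch below discretizes a price's position between a support and a resistance level, folds a higher-timeframe decision into the state index, and applies the standard tabular Q-learning update. It is not taken from this repository: the function names, the bucket count, the three-action set, and the values of `alpha` and `gamma` are all assumptions chosen for the example.

```python
# Illustrative sketch only; names and hyperparameters are assumptions,
# not the repository's actual implementation.

N_BUCKETS = 10   # assumed number of price buckets
N_ACTIONS = 3    # assumed action set: 0 = hold, 1 = buy, 2 = sell


def bucket_state(price, support, resistance, n_buckets=N_BUCKETS):
    """Map a price's position between support and resistance to a bucket index."""
    span = resistance - support
    if span <= 0:
        raise ValueError("resistance must be greater than support")
    # Relative position in [0, 1]; prices outside the band are clamped.
    position = min(max((price - support) / span, 0.0), 1.0)
    return min(int(position * n_buckets), n_buckets - 1)


def combined_state(bucket, higher_tf_action, n_actions=N_ACTIONS):
    """Fold the higher-timeframe agent's last action into the state index."""
    return bucket * n_actions + higher_tf_action


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update in place and return the table."""
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table


# One row per combined state, one column per action, initialized to zero.
q_table = [[0.0] * N_ACTIONS for _ in range(N_BUCKETS * N_ACTIONS)]

state = combined_state(bucket_state(105.0, 100.0, 110.0), higher_tf_action=2)
next_state = combined_state(bucket_state(106.0, 100.0, 110.0), higher_tf_action=2)
q_update(q_table, state, action=1, reward=1.0, next_state=next_state)
```

Encoding the higher-timeframe action as part of the state index is one simple way to let a short-interval agent condition on a longer-interval decision; the project may combine the two agents differently.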
jgubbens/CS5100_StockPrediction