ForexFactoryScrapper is a Python-based web scraping tool designed to extract financial event data from the ForexFactory website. This project provides a simple and effective way to scrape calendar events, forecast data, actual values, and other relevant information for forex trading analysis.
- Scrape calendar events, including date, time, currency, event name, forecast, actual, and previous values.
- Export or process extracted data in structured formats suitable for analysis.
- Simple and customizable scraping logic using BeautifulSoup.
- Includes examples for extracting data and creating basic reports.
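As an illustration of the "structured formats" feature above, here is a minimal sketch that writes scraped records to CSV using only the standard library. The helper name `records_to_csv` is hypothetical and not part of this project; the column names are taken from the example record shown later in this README.

```python
import csv
import io

# Column names taken from the example record in this README.
FIELDS = ["Time", "Currency", "Event", "Forecast", "Actual", "Previous"]


def records_to_csv(records):
    """Serialize a list of event dicts to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing keys become empty cells instead of raising KeyError.
        writer.writerow({name: record.get(name, "") for name in FIELDS})
    return buffer.getvalue()


sample = [
    {
        "Time": "01/01/2020 00:00",
        "Currency": "USD",
        "Event": "NFP",
        "Forecast": "100k",
        "Actual": "120k",
        "Previous": "90k",
    }
]
csv_text = records_to_csv(sample)
print(csv_text)
```

The same list of dicts could just as well be passed to `json.dump` or loaded into a DataFrame; CSV is used here only because it needs no extra dependencies.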
- Python 3.9 or newer
- See requirements.txt for dependency versions used during development and testing.
- Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate
- Install dependencies:
pip install -r requirements.txt
Start the application locally:
python app.py
By default this will start the app on 0.0.0.0:5000. Example endpoints you can call:
- GET /api/hello
- GET /api/health
- GET /api/forex/daily?day=1&month=1&year=2020
(Adjust host/port or endpoint parameters as needed in main.py.)
Below are simple example requests you can use to interact with the running application. Replace localhost:5000 with the host/port where your app is listening if different.
Curl:
curl -sS http://localhost:5000/api/hello
Expected JSON response (HTTP 200):
{
  "message": "Hello, World!",
  "status": "success"
}
Curl:
curl -sS http://localhost:5000/api/health
Expected JSON response (HTTP 200):
{
  "status": "ok"
}
- Missing parameters (HTTP 400):
curl -sS http://localhost:5000/api/forex/daily
Response body:
{ "error": "Missing one or more required parameters: day, month, year" }
- Invalid (non-integer) parameters (HTTP 400):
curl -sS "http://localhost:5000/api/forex/daily?day=aa&month=bb&year=cc"
Response body:
{ "error": "Parameters day, month and year must be integers" }
- Out-of-range parameters (HTTP 400):
curl -sS "http://localhost:5000/api/forex/daily?day=99&month=99&year=3000"
Response body:
{ "error": "Parameters out of reasonable range" }
Curl (example):
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020"
Expected JSON response (HTTP 200): a JSON array of records. Example record format:
[
  {
    "Time": "01/01/2020 00:00",
    "Currency": "USD",
    "Event": "NFP",
    "Forecast": "100k",
    "Actual": "120k",
    "Previous": "90k"
  }
]
Python requests example:
import requests
# Query the daily endpoint for 1 January 2020.
resp = requests.get(
    'http://localhost:5000/api/forex/daily',
    params={'day': 1, 'month': 1, 'year': 2020},
    timeout=10,  # avoid hanging forever if the app is not running
)
print(resp.status_code)
print(resp.json())
The /api/forex/daily endpoint also supports optional paging through two query parameters, limit and offset:
- offset (optional): integer >= 0, default 0. Skip this many records from the start.
- limit (optional): integer >= 0, default is unlimited. Return at most this many records after applying the offset.
Behavior and validation:
- Both limit and offset must be integers. Non-integer values return HTTP 400.
- Negative values return HTTP 400.
- If offset is greater than or equal to the number of available records, the endpoint returns an empty list and HTTP 200.
- limit=0 returns an empty list (valid request).
- If the scraper returns a non-list structure, paging is not applied and the raw response is returned.
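The rules above can be sketched as a small pure function. This is an illustration under stated assumptions, not the code in this repository: the name `apply_paging`, the error messages, and the order of the checks are all hypothetical. It takes the raw query-string values and returns a `(body, status)` pair.

```python
def apply_paging(records, limit=None, offset=None):
    """Apply limit/offset to scraper output (hypothetical helper).

    limit and offset are raw query-string values: strings, or None if absent.
    """
    # Non-list scraper results are returned as-is, without paging.
    if not isinstance(records, list):
        return records, 200

    try:
        offset_value = 0 if offset is None else int(offset)
        limit_value = None if limit is None else int(limit)
    except (TypeError, ValueError):
        # Placeholder message; the real wording is not documented here.
        return {"error": "limit and offset must be integers"}, 400

    if offset_value < 0 or (limit_value is not None and limit_value < 0):
        # Placeholder message; the real wording is not documented here.
        return {"error": "limit and offset must be >= 0"}, 400

    # Slicing past the end of a list yields [], which covers the
    # "offset >= number of records" and "limit=0" cases.
    end = None if limit_value is None else offset_value + limit_value
    return records[offset_value:end], 200


events = list(range(10))  # stand-in for ten scraped records
print(apply_paging(events, limit="3", offset="4"))
print(apply_paging(events, limit="abc"))
```

Relying on list slicing keeps the edge cases (large offset, zero limit) free of special-case branches.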
Examples:
- First 10 records:
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&limit=10"
- Start from the 5th record and return up to 3 records:
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&offset=4&limit=3"
- Non-integer or negative paging params (example, HTTP 400):
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&limit=abc"
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&offset=-1"
Notes and suggestions:
- There is no enforced maximum limit in the current implementation. For production use you may want to cap limit (for example 500 or 1000) to avoid large responses or memory spikes.
- Consider returning a pagination wrapper like { "total": N, "offset": X, "limit": Y, "results": [...] } if clients benefit from metadata. The current response remains a plain JSON array for backward compatibility.
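Both suggestions could be combined roughly as follows. This is a sketch of a possible future design, not current behavior: `paginate` and `MAX_LIMIT` are hypothetical names, and the cap of 500 is just one of the example values mentioned above. It assumes `offset` and `limit` have already been validated as non-negative integers.

```python
MAX_LIMIT = 500  # assumed cap; pick a value that suits your deployment


def paginate(records, offset=0, limit=None, max_limit=MAX_LIMIT):
    """Wrap a page of records with metadata (hypothetical helper)."""
    # An absent limit falls back to the cap; a large one is clamped to it.
    effective_limit = max_limit if limit is None else min(limit, max_limit)
    page = records[offset:offset + effective_limit]
    return {
        "total": len(records),
        "offset": offset,
        "limit": effective_limit,
        "results": page,
    }


wrapped = paginate(list(range(1200)), offset=4, limit=1000)
print(wrapped["total"], wrapped["limit"], len(wrapped["results"]))
```

Reporting the effective limit, rather than the requested one, lets clients detect that their request was clamped.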
Notes:
- The exact fields and values depend on the parser and target site's HTML structure. When running the real scraper, values reflect what is parsed from ForexFactory for the given date.
- The examples above match the app behavior implemented in main.py and the test fixtures in tests/test_app.py.
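For readers who want to see how the three HTTP 400 cases for day, month, and year fit together, here is a minimal sketch. The error messages are the ones shown in the examples earlier; the function name `validate_date_params` and the exact range bounds are assumptions, since this README does not state what the app treats as a "reasonable range".

```python
REQUIRED = ("day", "month", "year")


def validate_date_params(args):
    """Validate raw query-string values (hypothetical helper).

    Returns (parsed_values, 200) on success or (error_body, 400) on failure.
    """
    if any(args.get(name) is None for name in REQUIRED):
        return {"error": "Missing one or more required parameters: day, month, year"}, 400

    try:
        day, month, year = (int(args[name]) for name in REQUIRED)
    except ValueError:
        return {"error": "Parameters day, month and year must be integers"}, 400

    # Assumed bounds; the real application may use different ones.
    if not (1 <= day <= 31 and 1 <= month <= 12 and 1900 <= year <= 2100):
        return {"error": "Parameters out of reasonable range"}, 400

    return {"day": day, "month": month, "year": year}, 200


print(validate_date_params({"day": "1", "month": "1", "year": "2020"}))
print(validate_date_params({"day": "aa", "month": "bb", "year": "cc"}))
```

Checking presence, then type, then range means each request gets the most basic applicable error first.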
Run the test suite with pytest:
pytest -q
Unit tests are located in the tests/ folder. Network calls and external dependencies are isolated using monkeypatching to keep tests deterministic.
- The scraper depends on the target site's HTML structure. If ForexFactory changes its markup, the parsing code will need updating.
- requirements.txt pins versions that were used during development; consider updating or pinning further for deployments.
- Respect the target site's robots.txt and terms of service when scraping.
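One way to act on the robots.txt note is to check a URL against the site's rules before fetching it. The sketch below uses the standard library's `urllib.robotparser`; the helper name `is_allowed` is hypothetical, and the rules and URLs are made-up examples, not ForexFactory's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(EXAMPLE_RULES, "https://example.com/calendar"))
print(is_allowed(EXAMPLE_RULES, "https://example.com/private/page"))
```

Against a live site, `RobotFileParser.set_url(...)` followed by `read()` downloads the real file instead of parsing a string. Note that robots.txt does not cover terms of service, which need to be reviewed separately.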
Contributions, bug reports, and feature requests are welcome. Please open an issue or a pull request.
This project is licensed under the MIT License — see the LICENSE file for details.