AtaCanYmc/ForexFactoryScrapper

ForexFactoryScrapper

CI Python License: MIT

ForexFactoryScrapper is a Python-based web scraping tool designed to extract financial event data from the ForexFactory website. This project provides a simple and effective way to scrape calendar events, forecast data, actual values, and other relevant information for forex trading analysis.

Features

  • Scrape calendar events, including date, time, currency, event name, forecast, actual, and previous values.
  • Export or process extracted data in structured formats suitable for analysis.
  • Simple and customizable scraping logic using BeautifulSoup.
  • Includes examples for extracting data and creating basic reports.

Requirements

  • Python 3.9 or newer
  • See requirements.txt for dependency versions used during development and testing.

Installation

  1. Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate
  2. Install dependencies:
pip install -r requirements.txt

Running locally

Start the application locally:

python app.py

By default, this starts the app on 0.0.0.0:5000, so it listens on port 5000 on all network interfaces. Example endpoints you can call:

  • GET /api/hello
  • GET /api/health
  • GET /api/forex/daily?day=1&month=1&year=2020

(Adjust host/port or endpoint parameters as needed in main.py.)

Example requests

Below are simple example requests you can use to interact with the running application. Replace localhost:5000 with the host/port where your app is listening if different.

1) Hello

Curl:

curl -sS http://localhost:5000/api/hello

Expected JSON response (HTTP 200):

{
  "message": "Hello, World!",
  "status": "success"
}

2) Health

Curl:

curl -sS http://localhost:5000/api/health

Expected JSON response (HTTP 200):

{
  "status": "ok"
}

3) Forex daily — missing or invalid parameters

  • Missing parameters (HTTP 400):
curl -sS http://localhost:5000/api/forex/daily

Response body:

{ "error": "Missing one or more required parameters: day, month, year" }
  • Invalid (non-integer) parameters (HTTP 400):
curl -sS "http://localhost:5000/api/forex/daily?day=aa&month=bb&year=cc"

Response body:

{ "error": "Parameters day, month and year must be integers" }
  • Out-of-range parameters (HTTP 400):
curl -sS "http://localhost:5000/api/forex/daily?day=99&month=99&year=3000"

Response body:

{ "error": "Parameters out of reasonable range" }
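
The three error cases above can be sketched as a single validation helper. This is a minimal illustration, not the code in the repository: the function name `validate_date_params`, the year bounds, and the use of `datetime.date` to reject impossible dates are all assumptions, and `args` is any mapping of query-string values (a plain dict works).

```python
from datetime import date

# Assumed bounds for illustration only; the app's real limits may differ.
MIN_YEAR, MAX_YEAR = 1970, 2100

REQUIRED = ("day", "month", "year")


def validate_date_params(args):
    """Return (parsed_date, None) on success, or (None, (body, status)) on failure."""
    # Case 1: any of the three parameters is absent or empty.
    if any(args.get(name) in (None, "") for name in REQUIRED):
        body = {"error": "Missing one or more required parameters: day, month, year"}
        return None, (body, 400)

    # Case 2: a parameter is present but not an integer (e.g. "aa").
    try:
        day, month, year = (int(args[name]) for name in REQUIRED)
    except (TypeError, ValueError):
        body = {"error": "Parameters day, month and year must be integers"}
        return None, (body, 400)

    # Case 3: integers that do not form a plausible calendar date.
    try:
        if not MIN_YEAR <= year <= MAX_YEAR:
            raise ValueError("year outside assumed bounds")
        parsed = date(year, month, day)  # raises ValueError for day=99, month=99
    except ValueError:
        return None, ({"error": "Parameters out of reasonable range"}, 400)

    return parsed, None
```

Building a `date` object also rejects combinations such as 31 February, which may be stricter than the checks the app actually performs.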

4) Forex daily — success

Curl (example):

curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020"

Expected JSON response (HTTP 200): a JSON array of records. Example record format:

[
  {
    "Time": "01/01/2020 00:00",
    "Currency": "USD",
    "Event": "NFP",
    "Forecast": "100k",
    "Actual": "120k",
    "Previous": "90k"
  }
]

Python requests example:

import requests

resp = requests.get(
    'http://localhost:5000/api/forex/daily',
    params={'day': 1, 'month': 1, 'year': 2020},
)
print(resp.status_code)
print(resp.json())
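
Once the records are in hand, exporting them to a structured format takes only the standard library. The sketch below assumes the six field names from the example record above; the helper name `records_to_csv` is hypothetical and not part of the project.

```python
import csv
import io

# Field names taken from the example record format; real output may differ.
FIELDS = ["Time", "Currency", "Event", "Forecast", "Actual", "Previous"]


def records_to_csv(records):
    """Serialize a list of record dicts to CSV text, one row per event."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing keys become empty cells instead of raising an error.
        writer.writerow({field: record.get(field, "") for field in FIELDS})
    return buffer.getvalue()


sample = [
    {
        "Time": "01/01/2020 00:00",
        "Currency": "USD",
        "Event": "NFP",
        "Forecast": "100k",
        "Actual": "120k",
        "Previous": "90k",
    }
]
print(records_to_csv(sample))
```

To write a file instead, pass the returned text to `open(path, "w", newline="")` and call `write` on it.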

5) Forex daily — paging (limit & offset)

The /api/forex/daily endpoint supports optional paging through two query parameters, limit and offset.

  • offset (optional): integer >= 0, default 0. Skips this many records from the start.
  • limit (optional): integer >= 0, no limit by default. Returns at most this many records after the offset is applied.

Behavior and validation:

  • Both limit and offset must be integers. Non-integer values return HTTP 400.
  • Negative values return HTTP 400.
  • If offset is greater than or equal to the number of available records, the endpoint returns an empty list and HTTP 200.
  • limit=0 returns an empty list (valid request).
  • If the scraper returns a non-list structure, paging is not applied and the raw response is returned.
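
The rules above map closely onto Python list slicing. The following is a minimal sketch of that behavior, not the project's actual code: the helper names `parse_paging` and `apply_paging` are hypothetical, and a caller would translate the `ValueError` into an HTTP 400 response.

```python
def parse_paging(args):
    """Return (offset, limit) from a mapping of query values.

    limit is None when the parameter is absent (no limit).
    Raises ValueError for non-integer or negative values.
    """
    offset = int(args.get("offset", 0))  # int("abc") raises ValueError
    raw_limit = args.get("limit")
    limit = None if raw_limit is None else int(raw_limit)
    if offset < 0 or (limit is not None and limit < 0):
        raise ValueError("limit and offset must be non-negative integers")
    return offset, limit


def apply_paging(data, offset=0, limit=None):
    """Slice a list of records; return any non-list payload unchanged."""
    if not isinstance(data, list):
        return data
    end = None if limit is None else offset + limit
    # Slicing past the end of a list yields [], so a large offset is safe.
    return data[offset:end]
```

For example, `apply_paging(records, *parse_paging({"offset": "4", "limit": "3"}))` returns the 5th through 7th records when at least seven are available.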

Examples:

  • First 10 records:
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&limit=10"
  • Start from the 5th record and return up to 3 records:
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&offset=4&limit=3"
  • Non-integer or negative paging parameters (each request returns HTTP 400):
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&limit=abc"
curl -sS "http://localhost:5000/api/forex/daily?day=1&month=1&year=2020&offset=-1"

Notes and suggestions:

  • There is no enforced maximum limit in the current implementation. For production use you may want to cap limit (for example 500 or 1000) to avoid large responses or memory spikes.
  • Consider returning a pagination wrapper like { "total": N, "offset": X, "limit": Y, "results": [...] } if clients benefit from metadata. The current response remains a plain JSON array for backward compatibility.
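
Both suggestions can be combined in a few lines. This sketch is hypothetical and describes behavior the current implementation does not have: the name `paginate` and the cap of 500 are assumptions, and applying it would replace the plain-array response with a wrapper object.

```python
MAX_LIMIT = 500  # assumed cap, one of the example values suggested above


def paginate(records, offset=0, limit=None, max_limit=MAX_LIMIT):
    """Wrap a slice of records with paging metadata, capping the page size."""
    # A missing limit falls back to the cap instead of "unlimited".
    effective = max_limit if limit is None else min(limit, max_limit)
    return {
        "total": len(records),
        "offset": offset,
        "limit": effective,
        "results": records[offset:offset + effective],
    }
```

Reporting the effective limit rather than the requested one lets clients detect that their request was capped.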

Notes:

  • The exact fields and values depend on the parser and target site's HTML structure. When running the real scraper, values reflect what is parsed from ForexFactory for the given date.
  • The examples above match the app behavior implemented in main.py and the test fixtures in tests/test_app.py.

Tests

Run the test suite with pytest:

pytest -q

Unit tests are located in the tests/ folder. Network calls and external dependencies are isolated using monkeypatching to keep tests deterministic.

Notes and caveats

  • The scraper depends on the target site's HTML structure. If ForexFactory changes its markup, the parsing code will need updating.
  • requirements.txt pins versions that were used during development; consider updating or pinning further for deployments.
  • Respect the target site's robots.txt and terms of service when scraping.
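
A robots.txt check can be automated with the standard library's `urllib.robotparser`. The sketch below parses rules from a string so it runs offline; the sample rules, the user agent name, and the example.com URLs are invented for illustration and are not ForexFactory's actual policy.

```python
from urllib.robotparser import RobotFileParser

# Invented sample rules; fetch the real file from the target site in practice.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_ROBOTS_TXT, "my-scraper", "https://example.com/calendar"))
print(is_allowed(SAMPLE_ROBOTS_TXT, "my-scraper", "https://example.com/private/data"))
```

To check a live site, call `set_url` with the address of its robots.txt and then `read` on the parser instead of `parse`. Terms of service still need to be reviewed by hand.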

Contributing

Contributions, bug reports, and feature requests are welcome. Please open an issue or a pull request.

License

This project is licensed under the MIT License — see the LICENSE file for details.
