We built an interactive 3D world map in which countries are colored according to statistical indicators from a dataset. Users can rotate the globe and click a country to view detailed information and related plots.
curl -o- https://githubusercontent.com | bash
# Restart your terminal
nvm install --lts
# Install Node.js if not installed
brew install node
# Clone the repo
git clone <REPOSITORY-URL>
- Download: Go to the official Node.js website and download the LTS (Long-Term Support) version for Windows.
- Run Setup: Open the downloaded `.msi` file and follow the installation wizard. Keep the default settings, and make sure the "Add to PATH" option is checked.
- Finish: Click "Install", then "Finish" once the process completes.
Launch two terminals
cd <PATH-TO-PROJECT>
cd client
npm install
npm start
cd <PATH-TO-PROJECT>
cd server
npm install
npm start
- Zamir Safin: Frontend Visualization (React, D3.js)
- Denis Beliaev: Backend Database (Node.js, PostgreSQL)
- Rustem Gilmetdinov: Data Pipelining & GenAI Agents (Python, LLMs)
- Frontend: React
- Backend: Node.js
- Database: PostgreSQL, SQLite
- DevOps: Docker
- Structured Data: Scraped from worldometers.info
- Unstructured Data: Extracted from PDFs (Demographic Yearbooks)
- Processing: GenAI agents used for data extraction and cleaning
- PyPDF - Page pre-extraction from PDF reports
- Docling - PDF to Markdown conversion
- Qwen2.5-VL - Table extraction to CSV
- Identify table pages using PDF inspector
- Extract specific pages with PyPDF (reduces context by ~90%)
- Convert extracted pages to Markdown with Docling
- Send Markdown to Qwen2.5-VL for structured CSV extraction
- Validate output with pandas
- Linearly interpolate missing values
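
The last step above, linear interpolation of missing values, can be illustrated with a small standard-library sketch. The function name `interpolate_linear` and the sample series are hypothetical, and leaving leading and trailing gaps untouched is an assumption of this sketch rather than documented pipeline behavior. In a pandas-based pipeline, `Series.interpolate(method="linear")` would be the natural equivalent.

```python
from typing import List, Optional


def interpolate_linear(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior None gaps by linear interpolation between known neighbours.

    Leading and trailing gaps are left as None (an assumption of this sketch).
    """
    result = list(values)
    known = [i for i, v in enumerate(result) if v is not None]
    # Walk over each pair of consecutive known points and fill the gap between them.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            fraction = (i - left) / span
            result[i] = result[left] + fraction * (result[right] - result[left])
    return result


# Hypothetical yearly indicator values; None marks a missing year.
series = [100.0, None, 120.0, None, None, 150.0, None]
filled = interpolate_linear(series)
print(filled)
```

Each missing value is placed on the straight line between its nearest known neighbours, so a gap between 120 and 150 spanning two years is filled in equal steps.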
- Scrape the data from the website (worldometers.info) and save it to CSV
- Clean the CSV by removing redundant symbols and columns
- Linearly interpolate missing values
- Merge the resulting CSV tables with an INNER JOIN
- Load the resulting `pd.DataFrame` into a SQLite database file (`.db`)
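
The merge-and-load steps above can be sketched with only the standard library. The table name `indicators`, the column names, and the sample rows are all hypothetical, and an in-memory database stands in for the real `.db` file. With pandas, `DataFrame.merge(how="inner")` followed by `DataFrame.to_sql` would play the same roles.

```python
import sqlite3
from typing import Dict, List, Tuple

Key = Tuple[str, int]  # (country, year)

# Hypothetical cleaned tables, keyed by (country, year).
population: Dict[Key, float] = {("CountryA", 2020): 100.0, ("CountryB", 2020): 200.0}
fertility: Dict[Key, float] = {("CountryA", 2020): 1.5, ("CountryC", 2020): 3.0}


def inner_join(left: Dict[Key, float], right: Dict[Key, float]) -> List[tuple]:
    """Keep only the keys present in both tables, as SQL INNER JOIN does."""
    return [
        (country, year, left[(country, year)], right[(country, year)])
        for (country, year) in sorted(left.keys() & right.keys())
    ]


rows = inner_join(population, fertility)

# ":memory:" is used for illustration; a file path would create a .db file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE indicators "
    "(country TEXT, year INTEGER, population REAL, fertility REAL)"
)
conn.executemany("INSERT INTO indicators VALUES (?, ?, ?, ?)", rows)
conn.commit()

stored = conn.execute(
    "SELECT country, year, population, fertility FROM indicators ORDER BY country"
).fetchall()
print(stored)
conn.close()
```

An inner join drops any country-year pair that is missing from either table, so only rows with a complete set of indicators reach the database.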
- Page pre-extraction reduces processing time and improves accuracy
- Qwen2.5-VL was chosen for its ability to read text in varied layouts (including multiple orientations) and to interpret tables, charts, and diagrams
# Clone the repository
git clone <REPOSITORY_URL>
cd "DWaV Project"
# Create a user-specific `.env` file
chmod +x create_env_unix.sh
./create_env_unix.sh
# Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Launch the data pipeline
python3 data_pipeline/src/main.py
REM Clone the repository
git clone <REPOSITORY_URL>
cd "DWaV Project"
REM Create a user-specific `.env` file
create_env_windows.bat
REM Create and activate a virtual environment
python -m venv .venv
.venv\Scripts\activate
REM Install dependencies
pip install -r requirements.txt
REM Launch the data pipeline
python data_pipeline/src/main.py
[!warn] Do not forget to replace `DB_USER` and `DB_PASSWORD` with your own values
