
CadanHu/data-analyse-system


DataPulse AI Data Analysis System (v3.1)

DataPulse is an open-source, full-stack AI data analysis platform designed for building automated data pipelines, professional visualization workflows, and intelligent business insights.




🌟 Key Features (v3.1)

  • Multi-Provider AI Models: Bring your own API key for DeepSeek, Qwen, MiniMax, OpenAI, Gemini, or Claude. Providers are grouped by region.
  • Extended Thinking: First-class reasoning chain support for DeepSeek R1, Claude Opus/Sonnet, QwQ-32B, and Gemini Pro.
  • AI Data Scientist Agent: Secure Python sandbox for complex modeling, multi-table analysis, and capture of Matplotlib/Seaborn charts.
  • Map Chart & 16+ Visualizations: Geographic heatmaps, ECharts dynamic dashboards, and professional chart types.
  • Mobile Local Knowledge Base (v3.1 New): Three PDF modes that run fully on-device: PDF.js local parsing, MinerU deep parsing, and LLM knowledge graph extraction. Retrieval uses layered RAG: vector search first, falling back to FTS5 full-text search.
  • Knowledge Graph Visualization (v3.1 New): Interactive ECharts force graph of entities and relations extracted from documents.
  • Bring Your Own Data (BYOD): External agents can provide private datasets via API for instant analysis.
  • Enterprise-Ready: Automated database initialization with large simulated datasets (160k+ records).
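
The layered retrieval mentioned for the mobile knowledge base (vector search first, FTS5 as fallback) can be illustrated with a small sketch. This is not the project's implementation, which runs on-device; it is a Python illustration using only the standard library. The function names (`build_fts_index`, `layered_search`), the table name `chunks_fts`, the `min_score` threshold, and the toy two-dimensional embeddings are all assumptions made for the example, and it requires a SQLite build with FTS5 enabled.

```python
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_fts_index(chunks):
    """Create an in-memory FTS5 table holding the text of each chunk.

    `chunks` is a list of (text, embedding) pairs (hypothetical layout).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE chunks_fts USING fts5(body)")
    conn.executemany(
        "INSERT INTO chunks_fts(rowid, body) VALUES (?, ?)",
        [(i, text) for i, (text, _) in enumerate(chunks, start=1)],
    )
    return conn


def layered_search(conn, chunks, query_text, query_vec, min_score=0.75, k=3):
    """Return (layer_name, hits): vector hits if any pass the threshold,
    otherwise FTS5 keyword hits."""
    # Layer 1: rank chunks by cosine similarity and keep confident matches.
    scored = sorted(
        ((cosine(query_vec, emb), text) for text, emb in chunks),
        reverse=True,
    )
    hits = [text for score, text in scored[:k] if score >= min_score]
    if hits:
        return "vector", hits

    # Layer 2: fall back to full-text search. Each term is quoted so that
    # user input is treated as a literal phrase, not as FTS5 query syntax.
    terms = " OR ".join(
        '"' + term.replace('"', '""') + '"' for term in query_text.split()
    )
    if not terms:
        return "fts5", []
    rows = conn.execute(
        "SELECT body FROM chunks_fts WHERE chunks_fts MATCH ? "
        "ORDER BY rank LIMIT ?",
        (terms, k),
    ).fetchall()
    return "fts5", [row[0] for row in rows]


if __name__ == "__main__":
    docs = [
        ("Quarterly revenue grew in the APAC region", [1.0, 0.0]),
        ("Churn rate dropped after the pricing change", [0.0, 1.0]),
    ]
    index = build_fts_index(docs)
    print(layered_search(index, docs, "revenue", [0.9, 0.1]))
    print(layered_search(index, docs, "churn", [0.5, 0.5]))
```

In this sketch the keyword layer is only consulted when no chunk clears the similarity threshold, so a weak or ambiguous embedding match does not hide documents that contain the exact query terms.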

🚀 Quick Start (Docker)

# Clone the repository
git clone https://github.com/CadanHu/data-analyse-system.git
cd data-analyse-system

# Copy environment config (AI API keys are configured inside the app)
cp .env.example .env

# Start the entire stack
docker-compose up --build

Configure your AI provider API keys (DeepSeek / OpenAI / Gemini / Claude) in the Model/Key settings panel inside the app after startup.

🛠️ Tech Stack

  • Frontend: React, TypeScript, Tailwind CSS, Vite, Capacitor 6.
  • Backend: FastAPI, LangChain, SQLAlchemy.
  • AI: DeepSeek, OpenAI, Google Gemini, Anthropic Claude (multi-provider).
  • Analysis: Pandas, Matplotlib, Seaborn, Scikit-learn.
  • Database: MySQL (business data) + PostgreSQL (knowledge base / vector store).

📄 License

This project is licensed under the MIT License.

About

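A full-stack AI data analysis platform built for modern enterprises. Beyond traditional SQL queries, it uses a Python sandbox environment for complex mathematical modeling, automated data cleaning, and high-resolution chart rendering. Open-source AI data analysis platform with multi-provider LLM support, automated pipelines, a mobile offline RAG knowledge base, and knowledge graph visualization.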
