DataPulse is an open-source, full-stack AI data analysis platform designed for building automated data pipelines, professional visualization workflows, and intelligent business insights.
Choose your language:
- English Documentation: Full project guide, setup, and features.
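- Chinese Documentation (中文说明文档): Core feature overview, quick deployment, and a guide to the AI Data Scientist mode.
- Mobile Knowledge Base Specification: Complete specification for local PDF parsing, RAG retrieval, and the knowledge graph.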
- Multi-Provider AI Models: Bring your own API key for DeepSeek, Qwen, MiniMax, OpenAI, Gemini, or Claude. Providers are organized by region.
- Extended Thinking: First-class reasoning chain support for DeepSeek R1, Claude Opus/Sonnet, QwQ-32B, and Gemini Pro.
- AI Data Scientist Agent: Secure Python sandbox for complex modeling, multi-table analysis, and Matplotlib/Seaborn chart capturing.
- Map Chart & 16+ Visualizations: Geographic heatmaps, ECharts dynamic dashboards, and professional chart types.
- Mobile Local Knowledge Base (v3.1 New): Three PDF modes that run fully on-device: PDF.js local parsing, MinerU deep parsing, and LLM knowledge graph extraction. Retrieval uses layered RAG: vector search → FTS5 fallback.
- Knowledge Graph Visualization (v3.1 New): Interactive ECharts force graph of entities and relations extracted from documents.
- Bring Your Own Data (BYOD): External agents can provide private datasets via API for instant analysis.
- Enterprise-Ready: Automated database initialization with large simulated datasets (160k+ records).
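
The layered RAG retrieval mentioned in the feature list (vector search first, FTS5 as the fallback) can be illustrated with a minimal Python sketch. This is not DataPulse's actual implementation, which runs on-device in the mobile app. The function names (`build_index`, `retrieve`), the `min_score` threshold, and the idea of falling back when no chunk clears that threshold are all assumptions made for illustration, and the example expects embeddings to be supplied by the caller. It also assumes the local SQLite build includes the FTS5 extension.

```python
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Create an in-memory SQLite FTS5 table holding the chunk texts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE chunks USING fts5(body)")
    conn.executemany("INSERT INTO chunks(body) VALUES (?)", [(c,) for c in chunks])
    return conn


def retrieve(query_text, query_vec, chunks, vectors, conn, k=2, min_score=0.5):
    """Hypothetical layered retrieval: vector search, then FTS5 keyword fallback."""
    # Layer 1: rank chunks by cosine similarity to the query embedding.
    scored = sorted(
        ((cosine(query_vec, vec), i) for i, vec in enumerate(vectors)),
        reverse=True,
    )
    hits = [chunks[i] for score, i in scored[:k] if score >= min_score]
    if hits:
        return "vector", hits

    # Layer 2: no chunk was similar enough, so fall back to full-text search.
    # Each term is quoted so user input cannot break the FTS5 query syntax.
    terms = ['"' + t.replace('"', '""') + '"' for t in query_text.split()]
    if not terms:
        return "fts5", []
    rows = conn.execute(
        "SELECT body FROM chunks WHERE chunks MATCH ? ORDER BY rank LIMIT ?",
        (" OR ".join(terms), k),
    ).fetchall()
    return "fts5", [row[0] for row in rows]
```

In this sketch the caller gets back both the layer that answered (`"vector"` or `"fts5"`) and the matching chunks, which makes it easy to log how often the fallback is used. A real deployment would likely tune the threshold and might merge results from both layers rather than choosing one.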
# Clone the repository
git clone https://github.com/CadanHu/data-analyse-system.git
cd data-analyse-system
# Copy environment config (AI API keys are configured inside the app)
cp .env.example .env
# Start the entire stack
docker-compose up --build

Configure your AI provider API keys (DeepSeek / OpenAI / Gemini / Claude) in the Model/Key settings panel inside the app after startup.
- Frontend: React, TypeScript, Tailwind CSS, Vite, Capacitor 6.
- Backend: FastAPI, LangChain, SQLAlchemy.
- AI: DeepSeek, OpenAI, Google Gemini, Anthropic Claude (multi-provider).
- Analysis: Pandas, Matplotlib, Seaborn, Scikit-learn.
- Database: MySQL (business data) + PostgreSQL (knowledge base / vector store).
This project is licensed under the MIT License.