A comprehensive Chinese language learning platform that specializes in ancient Chinese character explanations and AI-powered study assistance.
Note: This application requires an OpenAI-compatible API key. It supports any provider with an OpenAI-compatible API (DashScope for Qwen models, OpenRouter, OpenAI, etc.). Configure via server/.env.
- AI-Powered Quick Answers: Get instant responses to Chinese language questions
- Deep Text Analysis: Comprehensive analysis of ancient Chinese texts with detailed annotations
- Character Dictionary Integration: Real-time character explanations and definitions from ZDIC
- Textbook Integration: Verbatim definitions of related characters drawn from textbooks
- Frequency Analysis: Statistical analysis of character usage across different contexts
- Multiple AI Models: Three configurable models for dictionary preprocessing, quick answering, and deep thinking
- Provider Agnostic: Supports any OpenAI-compatible API (Qwen via DashScope, OpenRouter, OpenAI, etc.)
- Contextual Understanding: AI models trained specifically for Chinese language nuances
- Real-time Processing: Streaming SSE responses for better user experience
- OAuth Login: Sign in with GitHub or Gitee
- Daily Quota: Configurable query limits for authenticated users and guests
- History Sync: Per-user query history backed by SQLite
- History Tracking: Keep track of your queries and learning progress
- Export: Export history to JSON, Word (.docx), or Anki (.apkg) format
- Legacy Migration: Import history from the old localStorage-based version
- Corpus Collection: Automatically builds a searchable corpus from user queries
- Interactive Interface: Modern Vue.js frontend with responsive design
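The "Real-time Processing" feature above relies on Server-Sent Events: the backend frames each model chunk as an SSE `data:` event and the browser renders the answer as it streams. A minimal sketch of that framing (the `delta` field name is an illustrative choice, not the server's actual wire schema; FastAPI serves such a generator via `StreamingResponse` with `media_type="text/event-stream"`):

```python
import json

def sse_stream(chunks):
    """Yield model-output chunks framed as Server-Sent Events.

    Each frame is 'data: <json>\n\n'; the client appends deltas as
    frames arrive, ending on the [DONE] sentinel.
    """
    for chunk in chunks:
        yield f"data: {json.dumps({'delta': chunk}, ensure_ascii=False)}\n\n"
    yield "data: [DONE]\n\n"

frames = list(sse_stream(["学", "而时习之"]))
```

Streaming per chunk rather than waiting for the full completion is what makes long deep-analysis answers feel responsive.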
- Vue 3 (3.5+) with TypeScript
- Vite (v8) for fast development and building
- TailwindCSS (v4) with the @tailwindcss/vite plugin
- Pinia (v3) for state management
- Vue Router (v5) for navigation
- VueUse for composable utilities
- Vue DevTools integration via Vite plugin
- pnpm for package management
- FastAPI for high-performance API
- Python 3.12+
- Uvicorn ASGI server
- OpenAI API compatible client (DashScope, OpenRouter, OpenAI, etc.)
- SQLite for local database and caching
- httpx for async HTTP requests
- BeautifulSoup4 for ZDIC dictionary scraping
- PyJWT for authentication tokens
- Docker & Docker Compose for containerization
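SQLite's role as a cache fits together with the scraping stack above: a character is scraped from ZDIC once (httpx + BeautifulSoup4 in spider.py), then served from the local database. A sketch of that cache-aside pattern, with a stand-in scraper and an assumed table shape rather than the project's real schema:

```python
import sqlite3

# In-memory stand-in for the dictionary cache table in db/data.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dict_cache (char TEXT PRIMARY KEY, definition TEXT)")
scrape_calls = {"n": 0}

def scrape_zdic(char: str) -> str:
    """Stand-in for spider.py: the real code fetches the character's ZDIC
    page with httpx and extracts the entry with BeautifulSoup4."""
    scrape_calls["n"] += 1
    return f"definition of {char}"

def lookup(char: str) -> str:
    """Serve from the SQLite cache; scrape ZDIC only on a miss."""
    row = conn.execute(
        "SELECT definition FROM dict_cache WHERE char = ?", (char,)
    ).fetchone()
    if row:
        return row[0]
    definition = scrape_zdic(char)
    conn.execute("INSERT INTO dict_cache VALUES (?, ?)", (char, definition))
    return definition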
- Python 3.12+
- Node.js 20+
- pnpm
- Docker (optional, for containerized deployment)
- Clone the repository

  ```bash
  git clone https://github.com/AsithaKanchana1/exam-char-key.git
  cd exam-char-key
  ```

- Set up environment variables

  ```bash
  cp server/.env.example server/.env
  # Edit server/.env with your API keys
  ```

- Run with Docker Compose

  ```bash
  docker-compose up --build
  ```

- Access the application: open your browser and navigate to http://localhost:5173
- Clone and set up the environment

  ```bash
  git clone https://github.com/AsithaKanchana1/exam-char-key.git
  cd exam-char-key
  cp server/.env.example server/.env
  # Edit server/.env with your API keys
  ```

- Install backend dependencies

  ```bash
  pip install -r server/requirements.txt
  ```

- Install frontend dependencies

  ```bash
  cd client
  pnpm install
  cd ..
  ```

- Run the development server

  ```bash
  python run_dev.py
  ```
This starts both the Vite dev server (frontend on port 5173) and the FastAPI backend (port 8000). The Vite dev server proxies /api requests to the backend automatically.
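Conceptually, run_dev.py just supervises the two processes. A minimal sketch of what such a launcher might look like; the exact commands and flags are assumptions, not a transcript of the real script:

```python
import subprocess
import sys

# Assumed commands: Vite dev server inside client/, Uvicorn inside server/.
FRONTEND_CMD = ["pnpm", "dev"]  # serves the frontend on :5173
BACKEND_CMD = [sys.executable, "-m", "uvicorn", "main:app", "--reload", "--port", "8000"]

def launch_dev() -> int:
    """Start both processes; stop the frontend when the backend exits."""
    backend = subprocess.Popen(BACKEND_CMD, cwd="server")
    frontend = subprocess.Popen(FRONTEND_CMD, cwd="client")
    try:
        return backend.wait()
    finally:
        frontend.terminate()
```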
Create a server/.env file based on server/.env.example:
```env
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_API_KEY=sk-xxxxxxxx
MODEL_DICT_PREPROCESS=xiaomi/mimo-v2.5
MODEL_QUICK_ANSWER=xiaomi/mimo-v2.5
MODEL_DEEP_THINK=xiaomi/mimo-v2.5
QUOTA_USER_DAILY=50
QUOTA_GUEST_DAILY=50
APP_BASE_URL=http://localhost:5173
JWT_SECRET=
DB_PATH=../db/data.db
GITHUB_CLIENT_ID=
GITHUB_CLIENT_SECRET=
GITEE_CLIENT_ID=
GITEE_CLIENT_SECRET=
```

The application supports any OpenAI-compatible API service. Configure it via server/.env:
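On the backend, config.py reads these variables through Pydantic settings. As a dependency-free illustration of what that loading amounts to, here is a minimal .env parser (`load_env` is a hypothetical helper for this sketch, not part of the project's API):

```python
import tempfile
from pathlib import Path

def load_env(path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; no quoting or variable expansion."""
    env = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

# Demo with a throwaway .env file.
demo = Path(tempfile.mkdtemp()) / ".env"
demo.write_text(
    "LLM_BASE_URL=https://openrouter.ai/api/v1\nQUOTA_USER_DAILY=50\n# comment\n",
    encoding="utf-8",
)
settings = load_env(demo)
```

Values such as QUOTA_USER_DAILY still need type coercion to integers, which is exactly what Pydantic settings adds on top of raw parsing like this.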
| Variable | Description | Example |
|---|---|---|
| `LLM_BASE_URL` | API endpoint base URL | `https://dashscope.aliyuncs.com/compatible-mode/v1` (Qwen), `https://openrouter.ai/api/v1` |
| `LLM_API_KEY` | Your API key | `sk-...` |
| `MODEL_DICT_PREPROCESS` | Model for structuring raw dictionary data into JSON | `qwen-turbo`, `gpt-3.5-turbo` |
| `MODEL_QUICK_ANSWER` | Model for fast inline character explanations | `qwen3-8b-ft-202508031744-1c46` |
| `MODEL_DEEP_THINK` | Model for comprehensive deep analysis | `qwen3-8b-ft-202508031744-1c46`, `gpt-4o` |
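All three model slots send the same OpenAI-compatible chat completion request shape, which is why any provider behind `LLM_BASE_URL` works interchangeably. A sketch of that request body (the system prompt here is illustrative, not the one in prompt.py):

```python
import json

def chat_request(model: str, question: str, stream: bool = True) -> dict:
    """Build an OpenAI-compatible /chat/completions payload; the same
    shape is accepted by DashScope, OpenRouter, and OpenAI."""
    return {
        "model": model,
        "stream": stream,  # streamed over SSE for quick answers
        "messages": [
            {"role": "system", "content": "解释该字在古文语境中的含义。"},
            {"role": "user", "content": question},
        ],
    }

payload = chat_request("qwen-turbo", "「走」在古文中作何解？")
body = json.dumps(payload, ensure_ascii=False)
```

Swapping providers is then purely a matter of changing `LLM_BASE_URL`, `LLM_API_KEY`, and the model names.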
JWT_SECRET is used to sign authentication tokens. Generate a secure random value:

```bash
# Option 1: openssl (Linux/macOS/Git Bash)
openssl rand -hex 32

# Option 2: Python (cross-platform)
python -c "import secrets; print(secrets.token_hex(32))"
```

Copy the output and set it as JWT_SECRET in your server/.env or Coolify environment variables. If left empty, the server falls back to LLM_API_KEY as the JWT secret.
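The server signs tokens with PyJWT; the stdlib sketch below shows what HS256 signing with JWT_SECRET amounts to (real tokens also carry registered claims such as `exp`, omitted here for brevity):

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Unpadded base64url, as required by the JWT format."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_hs256(payload: dict, secret: str) -> str:
    """Produce header.payload.signature with an HMAC-SHA256 signature."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    sig = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(sig)}"

def verify_hs256(token: str, secret: str) -> bool:
    """Recompute the signature and compare in constant time."""
    header, body, sig = token.split(".")
    expected = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), sig)

token = sign_hs256({"sub": "user-42"}, "change-me")
```

This is why JWT_SECRET must be random and private: anyone who knows it can mint valid tokens.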
To enable GitHub/Gitee login, register OAuth applications and set the corresponding *_CLIENT_ID and *_CLIENT_SECRET environment variables. The callback URL should be {APP_BASE_URL}/api/oauth2-redirect.
- Text Input: Add Chinese text or characters you want to analyze
- AI Analysis: Get instant AI-powered explanations and interpretations
- Character Lookup: Click on characters for detailed dictionary definitions (sourced from ZDIC and AI-structured)
- Deep Analysis: Access comprehensive annotations and contextual information
- History Review: Track your learning progress through the history feature
- Click "Login" to sign in via GitHub or Gitee
- Authenticated users get per-user history, higher quota, and export capabilities
- Guest users can still query with a shared daily pool
- Export your query history to JSON, Word (.docx), or Anki (.apkg) format
- Anki export is powered by an external service
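The daily quota can be pictured as a per-user counter in SQLite that resets by keying on the date. This is a sketch of the idea behind db_helper.py's quota handling, with an assumed table layout, not the project's actual schema:

```python
import sqlite3
from datetime import date

QUOTA_USER_DAILY = 50  # mirrors QUOTA_USER_DAILY in server/.env

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quota (user_id TEXT, day TEXT, used INTEGER, "
    "PRIMARY KEY (user_id, day))"
)

def try_consume(user_id: str, limit: int = QUOTA_USER_DAILY) -> bool:
    """Count one query against today's quota; False once it is exhausted."""
    today = date.today().isoformat()
    conn.execute(
        "INSERT INTO quota VALUES (?, ?, 0) ON CONFLICT(user_id, day) DO NOTHING",
        (user_id, today),
    )
    cur = conn.execute(
        "UPDATE quota SET used = used + 1 WHERE user_id = ? AND day = ? AND used < ?",
        (user_id, today, limit),
    )
    return cur.rowcount == 1  # 0 rows updated means the limit was reached
```

Guests would share one counter (the "shared daily pool") while each authenticated user gets their own row.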
exam-char-key/
├── client/ # Vue 3 frontend
│ ├── src/
│ │ ├── App.vue # Root component
│ │ ├── main.ts # Entry point
│ │ ├── types.ts # TypeScript type definitions
│ │ ├── assets/
│ │ │ └── main.css # Global TailwindCSS styles
│ │ ├── components/
│ │ │ ├── DictDisplay.vue # Dictionary lookup display
│ │ │ ├── QueryPanel.vue # Main query interface
│ │ │ ├── SelectionTooltip.vue # Text selection tooltip
│ │ │ └── TextContent.vue # Text content viewer
│ │ ├── router/
│ │ │ └── index.ts # Vue Router configuration
│ │ ├── stores/
│ │ │ ├── auth.ts # Authentication state (Pinia)
│ │ │ └── words.ts # Query/words state (Pinia)
│ │ └── views/
│ │ ├── HomeView.vue # Main search/query view
│ │ ├── HistoryView.vue # Query history view
│ │ └── ProfileView.vue # User profile & export
│ ├── index.html
│ ├── package.json
│ ├── vite.config.ts # Vite config with /api proxy
│ └── tsconfig*.json
├── server/ # FastAPI backend
│ ├── main.py # App entry, routes, CORS, SSE pipeline
│ ├── config.py # Pydantic settings from .env
│ ├── auth.py # GitHub/Gitee OAuth + JWT auth
│ ├── db_helper.py # SQLite database helpers (dict cache, quota, history, corpus)
│ ├── spider.py # ZDIC scraping + AI structuring pipeline
│ ├── prompt.py # LLM prompt templates
│ ├── log_helper.py # Logging configuration
│ ├── import_corpus.py # Corpus import utility
│ ├── requirements.txt
│ └── .env.example # Environment variable template
├── train/ # ML training pipeline
│ ├── extractor/ # PDF textbook extraction & classification
│ │ ├── textbook_extractor.py
│ │ ├── ancient_classifier.py
│ │ ├── model_extractor.py
│ │ └── unmarker.py
│ ├── dataset_generator/ # Training dataset generation
│ │ ├── flash_dataset_generator.py
│ │ ├── thinking_prompt_generator.py
│ │ ├── thinking_batch_converter.py
│ │ ├── evaluate_prompt_generator.py
│ │ └── filter_dataset_generator.py
│ ├── evaluator/ # Model evaluation
│ │ ├── evaluators.py
│ │ ├── evaluation_concluder.py
│ │ ├── evaluation_prompts_generator.py
│ │ ├── graph_maker.py
│ │ └── xlsx_converter.py
│ ├── frequency_statistics.py # Character frequency analysis
│ ├── models.py # Data models
│ └── utils.py # Shared utilities
├── db/ # SQLite database (auto-created)
├── logs/ # Application logs
├── scripts/
│ └── install.sh # Setup script
├── docker-compose.yml # Docker Compose configuration
├── Dockerfile # Multi-stage build (frontend → backend)
├── run_dev.py # Development launcher (Vite + Uvicorn)
└── README.md
The project includes training scripts for:
- Ancient Text Classification: Automatically detect classical Chinese texts and extract them from Chinese textbooks
- Character Frequency Analysis: Statistical analysis of character usage
- Dataset Generation: Create training data from textbooks and literature
If you'd like to train on your own, run the scripts in the following order:
```bash
python -m train.extractor.textbook_extractor
python -m train.extractor.ancient_classifier
python -m train.extractor.model_extractor
python -m train.extractor.textbook_extractor  # Run again; this time non-ancient works are filtered out.
python -m train.dataset_generator.flash_dataset_generator
python -m train.dataset_generator.thinking_prompt_generator
python -m train.dataset_generator.thinking_batch_converter
# Run batch reasoning and place the results in train/result/dataset-thinking-batch-completion-{1,2,3}.json
python -m train.dataset_generator.evaluate_prompt_generator
# Run batch reasoning and place the results in train/result/dataset-thinking-evaluation-completion-{1,2}.json
python -m train.dataset_generator.thinking_dataset_generator
```

Fine-tune: using the SFT algorithm, use the datasets train/result/dataset-flash.jsonl and train/result/dataset-thinking.jsonl to fine-tune two separate Qwen3-8b models.
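SFT datasets in JSONL form typically hold one chat-style example per line. The record below illustrates that shape; the prompt wording and field names are assumptions for illustration, not the generators' actual output format:

```python
import json

def sft_record(source: str, character: str, explanation: str) -> dict:
    """One chat-format SFT training example (assumed schema)."""
    return {
        "messages": [
            {"role": "user", "content": f"《{source}》中「{character}」作何解？"},
            {"role": "assistant", "content": explanation},
        ]
    }

# One JSONL line of the (assumed) training set.
line = json.dumps(sft_record("论语", "习", "温习、实践"), ensure_ascii=False)
```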
- Follow TypeScript best practices for frontend code
- Use Python type hints in backend code
- Maintain consistent code formatting
For support, questions, or feature requests: create an issue on GitHub, or contact the development team
- Qwen AI Models by Alibaba Cloud for advanced language processing
- ZDIC for comprehensive Chinese character dictionary
- Vue.js and FastAPI communities for excellent frameworks
- Leximory for UI/UX & system design inspiration
- 中国哲学书电子化计划 (ctext.org), 识典古籍 (shidianguji.com), and 古文岛 (guwendao.net) for providing extensive classical Chinese text repositories
- Contributors to the Chinese language learning community