Let AI think and talk like you. This framework provides a complete pipeline: from noise-cleaning raw chat logs, through a multi-model adapter layer you can switch seamlessly, to local lightweight fine-tuning (LoRA).
This project splits AI personalisation into three core stages:
- Data Alchemy (Extract):
Convert messy chat logs (including emojis, links, and spam) into high-quality conversation pairs.
- Adapter Hub (Adapt):
Through a unified interface, load your personal style into GPT-4, Claude, or a local Llama 3.
- Style Evolution (Evolve):
Use fine-tuning to bake your language habits into the model's weights, rather than relying on prompts alone.
Input (raw chat.txt):

```text
User A: Have you eaten yet?
Me: Eaten, [emoji] I was really hungry, https://link.com
User A: Then let's go.
Me: Sure, let's go.
```

Output (cleaned JSONL):

```json
{
  "messages": [
    {"role": "user", "content": "Have you eaten yet?"},
    {"role": "assistant", "content": "Eaten, I was really hungry"}
  ]
}
```
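The cleaning rules illustrated above can be sketched in a few lines of Python. The function names and regexes here are illustrative, not the framework's actual API:

```python
import json
import re

def clean_message(text: str) -> str:
    """Strip URLs and [emoji]-style sticker tags from one raw message."""
    text = re.sub(r"https?://\S+", "", text)   # drop links
    text = re.sub(r"\[[^\]]*\]", "", text)     # drop bracketed sticker/emoji tags
    return re.sub(r"\s+", " ", text).strip(" ,")  # tidy whitespace and stray commas

def to_training_pair(user_msg: str, my_msg: str) -> str:
    """Build one JSONL record in the chat format shown above."""
    record = {
        "messages": [
            {"role": "user", "content": clean_message(user_msg)},
            {"role": "assistant", "content": clean_message(my_msg)},
        ]
    }
    return json.dumps(record, ensure_ascii=False)

# One JSONL line matching the example above
print(to_training_pair("Have you eaten yet?",
                       "Eaten, [emoji] I was really hungry, https://link.com"))
```

A real preprocessing pass would also need to merge consecutive messages from the same speaker and drop empty turns, but the core transformation is this small.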
```text
PersonalStyleAI-Framework/
├── data/
│   ├── raw/            # Raw chat logs (e.g. chat.txt)
│   └── processed/      # Cleaned JSONL training dataset
├── src/                # Source code
│   ├── core/           # Adapter logic and factory-pattern implementation
│   ├── utils/          # Data preprocessing and string-cleaning tools
│   └── trainers/       # Model fine-tuning scripts (based on PEFT/LoRA)
├── pyproject.toml      # Modern Python dependency and project configuration
├── preprocess_data.py  # Data-processing entry script
├── main.py             # Style-dialogue test entry point
└── .env.example        # Environment-variable template
```
- Core adapter (src/core/)
Uses the factory pattern. If you want to switch from OpenAI to a local Ollama, you only change one line of configuration, without rewriting any business logic.
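A minimal sketch of that factory pattern follows. The class and backend names are illustrative; the project's real adapters would wrap the actual SDK or HTTP calls:

```python
from abc import ABC, abstractmethod

class StyleAdapter(ABC):
    """Common interface every backend adapter implements."""
    @abstractmethod
    def chat(self, prompt: str) -> str: ...

class OpenAIAdapter(StyleAdapter):
    def chat(self, prompt: str) -> str:
        return f"[openai] reply to: {prompt}"   # a real adapter would call the OpenAI SDK

class OllamaAdapter(StyleAdapter):
    def chat(self, prompt: str) -> str:
        return f"[ollama] reply to: {prompt}"   # a real adapter would hit a local Ollama server

_BACKENDS = {"openai": OpenAIAdapter, "ollama": OllamaAdapter}

def create_adapter(name: str) -> StyleAdapter:
    """Factory: callers only ever write create_adapter(config['backend'])."""
    try:
        return _BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown backend: {name}") from None
```

Because every backend satisfies the same `StyleAdapter` interface, swapping providers really is a one-line config change.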
- Cleaning toolbox (src/utils/)
Ships preset regular expressions, optimised for text exported from messaging apps.
- Environment isolation
Sensitive information is managed through .env, and the dependency hierarchy through pyproject.toml.
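For illustration, a minimal stdlib-only .env loader might look like the sketch below. In practice the project would more likely use a library such as python-dotenv; this function is hypothetical:

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: KEY=VALUE lines, '#' comments, no quoting rules.
    Existing environment variables take precedence over file values."""
    try:
        with open(path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, _, value = line.partition("=")
                os.environ.setdefault(key.strip(), value.strip())
    except FileNotFoundError:
        pass  # no .env present; rely on the real environment
```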
1. Basic version (API calls only)
```shell
# Clone the project
git clone https://github.com/your-username/PersonalStyleAI-Framework.git
cd PersonalStyleAI-Framework

# Create a virtual environment and install core dependencies
python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate
pip install -e .
```
- Configure the key
Create a .env file and fill in your API key:

```shell
cp .env.example .env
```
- Build your style
Collect data: put your chat logs or articles into data/raw/chat.txt, then run the cleaning step:

```shell
python preprocess_data.py
```

The script generates data/processed/train.jsonl, the "textbook" from which the AI learns your style.
- Run the dialogue

```shell
python main.py
```
If you have a CUDA-capable graphics card, you can install the fine-tuning components for local training:

```shell
pip install -e ".[train]"

# Run the fine-tuning script (configure parameters in src/trainers first)
python run_train.py
```
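For orientation, typical LoRA hyperparameters for a small personal-style run look like the sketch below. The values and target module names are illustrative defaults, not this project's actual configuration; with Hugging Face PEFT they would populate a LoraConfig:

```python
# Illustrative LoRA settings (assumed values, not the project's shipped config).
lora_settings = {
    "r": 8,                  # rank of the low-rank update matrices
    "lora_alpha": 16,        # scaling factor applied to the update
    "lora_dropout": 0.05,    # dropout on the LoRA layers during training
    "target_modules": ["q_proj", "v_proj"],  # attention projections commonly adapted
    "task_type": "CAUSAL_LM",
}
```

Small ranks (r = 8–16) are usually enough for style transfer, which is what keeps this trainable on a single consumer GPU.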
If you have suggestions for improvement, or want to add more AI adapters (such as Anthropic or DeepSeek), feel free to submit a Pull Request or open an Issue for discussion.