Complete training and inference pipeline for fine-tuning Mistral-7B-Instruct with LoRA for domain-specific study-abroad guidance using Kaggle GPUs and Weights & Biases integration.
| Component | Link |
|---|---|
| Dataset | millat/StudyAbroadGPT-Dataset |
| Model Card | millat/StudyAbroadGPT-7B-LoRa-Kaggle |
| Dataset Generation | codermillat/study-abroad-dataset |
| Evaluation Companion | LoRA Paper workspace |
| Paper | arXiv:2504.15610 |
- Base Model: mistralai/Mistral-7B-Instruct-v0.3
- Quantization: 4-bit NF4 via Unsloth
- Training Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 16, Alpha: 32
- Target Modules: Attention (q_proj, k_proj, v_proj, o_proj) + FFN (gate_proj, up_proj, down_proj)
- Training Data: 2,274 synthetic study-abroad conversations
- Training Duration: ~4-5 epochs
- Memory Footprint: ~8GB model + 4GB training buffer
- Hardware: Tesla T4 or P100 GPU (Kaggle/Colab)
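As a rough illustration of what rank 16 and alpha 32 mean for a single adapted weight matrix, the sketch below computes the LoRA scaling factor and the adapter parameters added to one projection. The 4096 x 4096 shape and the helper names are illustrative assumptions, not figures or code taken from this repository.

```python
def lora_scaling(alpha: int, rank: int) -> float:
    """Factor applied to the low-rank update B @ A before it is added to W."""
    return alpha / rank


def lora_params_per_matrix(d_in: int, d_out: int, rank: int) -> int:
    """Adapter parameters for one matrix: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 16, 32        # values from the configuration list above
D_IN, D_OUT = 4096, 4096    # assumed shape of one projection, for illustration only

scaling = lora_scaling(ALPHA, RANK)
adapter_params = lora_params_per_matrix(D_IN, D_OUT, RANK)
full_params = D_IN * D_OUT

print(f"scaling = {scaling}")
print(f"adapter params = {adapter_params:,} vs full matrix = {full_params:,}")
```

Only the two small matrices A and B are trained for each target module; the quantized base weights stay frozen.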
```
StudyAbroadGPT/
├── Study_Abroad_GPT.ipynb               # Main training notebook (Colab)
├── Study_Abroad_GPT_Kaggle-T4.ipynb     # Kaggle T4 optimized training
├── Study_Abroad_GPT_Kaggle-P100.ipynb   # Kaggle P100 optimized training
├── StudyAbroadGPT_Inference.ipynb       # Inference and testing notebook
├── architecture.md                      # Technical architecture documentation
├── WANDB.md                             # Weights & Biases integration guide
├── training_analysis.md                 # Training metrics and loss analysis
├── paper.md / paper_v2.md               # Paper drafts and methodology
├── conclusions.md                       # Research findings and conclusions
├── documentation.md                     # General documentation
├── readme-dataset.md                    # Dataset-specific notes
├── Report_T4/                           # WandB reports from T4 runs
└── Report_P100/                         # WandB reports from P100 runs
```
```bash
pip install transformers peft torch bitsandbytes unsloth
pip install wandb  # for training monitoring
```

- Open Study_Abroad_GPT_Kaggle-T4.ipynb (or the P100 variant) in Kaggle
- Add your Hugging Face and WandB API keys as Kaggle secrets
- Run all cells
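The secrets step can be sketched as a small helper that reads a key from Kaggle's secret store and falls back to an environment variable when run elsewhere. This is a minimal sketch, not the notebooks' actual code; the helper name `get_secret` and the secret labels in the usage comments are hypothetical.

```python
import os


def get_secret(name: str) -> str:
    """Return a secret from Kaggle if available, otherwise from the environment."""
    try:
        # kaggle_secrets is only importable inside a Kaggle notebook session.
        from kaggle_secrets import UserSecretsClient
        return UserSecretsClient().get_secret(name)
    except ImportError:
        value = os.environ.get(name)
        if value is None:
            raise KeyError(f"Secret {name!r} not found in Kaggle secrets or environment")
        return value


# HF_TOKEN and WANDB_API_KEY are hypothetical labels; use the ones you created.
# hf_token = get_secret("HF_TOKEN")
# os.environ["WANDB_API_KEY"] = get_secret("WANDB_API_KEY")
```

Setting `WANDB_API_KEY` in the environment lets WandB authenticate without an interactive login prompt.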
```python
from transformers import TrainingArguments

# Key training parameters
training_args = TrainingArguments(
    per_device_train_batch_size=2,   # examples per forward/backward pass
    gradient_accumulation_steps=4,   # effective batch size of 8
    warmup_ratio=0.03,
    num_train_epochs=4,
    learning_rate=2e-4,
    logging_steps=1,
    optim="adamw_8bit",
    max_grad_norm=0.3,               # gradient clipping threshold
    output_dir="outputs",
    report_to="wandb",
)
```

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import AutoPeftModelForCausalLM  # only needed to load the unmerged LoRA adapter

# Load merged model
model = AutoModelForCausalLM.from_pretrained(
    "millat/StudyAbroadGPT-7B-LoRa-Kaggle",
    subfolder="merged",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "millat/StudyAbroadGPT-7B-LoRa-Kaggle",
    subfolder="merged",
)

# Generate response
prompt = "What documents do I need for a student visa?"
# Move the inputs to the model's device so generation does not fail on GPU
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

Or use the inference notebook: StudyAbroadGPT_Inference.ipynb
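For a causal language model, `generate` returns the prompt tokens followed by the new tokens, so the decoded string in the inference snippet above begins with the prompt itself. A minimal sketch of trimming it off is shown here; `strip_prompt` is a hypothetical helper, and the decoded text is a stand-in rather than real model output.

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    """Return only the newly generated text from a decoded causal-LM output."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].lstrip()
    # Fall back to the full text if the tokenizer round-trip altered the prompt.
    return decoded.strip()


prompt = "What documents do I need for a student visa?"
# Stand-in for tokenizer.decode(outputs[0], skip_special_tokens=True)
decoded = prompt + " You typically need a passport and an admission letter."
print(strip_prompt(decoded, prompt))
```

An alternative is to slice the output tensor before decoding, for example `outputs[0][inputs["input_ids"].shape[1]:]`, which avoids relying on string matching.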
| Component | Setting |
|---|---|
| Model | Mistral-7B-Instruct-v0.3 |
| Quantization | 4-bit NF4 (Unsloth) |
| Sequence Length | 2048 tokens |
| Model Memory | ~8 GB |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Trainable Parameters | ~4.7M (vs 7B total) |
- Model Layer: 8GB (4-bit quantized parameters + LoRA adapters)
- Training Layer: 4GB (gradient accumulation, optimizer states)
- Buffer Layer: 2GB (forward/backward computation, temp storage)
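The three layers above add up to a simple budget check against the 16 GB of a Tesla T4. The sketch below only restates that arithmetic; the function name is illustrative, and real usage varies with sequence length and batch size.

```python
# Memory layers in GB, taken from the list above.
MEMORY_LAYERS_GB = {"model": 8, "training": 4, "buffer": 2}
GPU_VRAM_GB = 16  # Tesla T4, as listed in this README


def memory_headroom(layers_gb: dict, vram_gb: float) -> float:
    """Return the VRAM left after all layers are allocated (negative means over budget)."""
    return vram_gb - sum(layers_gb.values())


total = sum(MEMORY_LAYERS_GB.values())
headroom = memory_headroom(MEMORY_LAYERS_GB, GPU_VRAM_GB)
print(f"planned usage: {total} GB, headroom on a {GPU_VRAM_GB} GB card: {headroom} GB")
```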
```
Data Pipeline → Preprocessing → Training Loop → Evaluation
                                      ↓
                       Gradient Accumulation (4 steps)
                       Learning Rate Scheduling
                       Loss Tracking (WandB)
```
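To make the gradient accumulation and learning rate scheduling stages concrete, the sketch below derives the effective batch size, an approximate step count, and a warmup schedule from the training parameters in this README. It assumes a single GPU, that all 2,274 conversations are used for training, and a linear warmup followed by linear decay; the README does not state the scheduler type, and the trainer may round step counts slightly differently.

```python
import math

NUM_EXAMPLES = 2274      # training conversations listed in this README
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 4
NUM_EPOCHS = 4
WARMUP_RATIO = 0.03
PEAK_LR = 2e-4

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS         # examples per optimizer step
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)   # approximate
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero (assumed schedule)."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = max(total_steps - step, 0)
    return PEAK_LR * remaining / (total_steps - warmup_steps)


print(f"effective batch: {effective_batch}, total steps: {total_steps}, warmup: {warmup_steps}")
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step):.2e}")
```

Accumulating gradients over 4 micro-batches of 2 gives the optimizer the same batch of 8 examples per update while only 2 examples are held in memory at a time.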
Training runs are logged to WandB project StudyAbroadGPT. Track:
- Training loss curves
- Learning rate schedule
- GPU memory usage
- Gradient norms
- Training speed (tokens/sec)
- Quality metrics per batch
Access reports in Report_T4/ and Report_P100/ directories.
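With `report_to="wandb"`, the trainer reports its standard metrics automatically. For custom values, a per-step metrics dictionary could be assembled along the lines of the sketch below; the key names and the numbers passed in are hypothetical and only illustrate how gradient norm and throughput are derived.

```python
import math


def step_metrics(loss, learning_rate, grads, tokens, seconds):
    """Assemble one step's metrics; the key names here are hypothetical."""
    grad_norm = math.sqrt(sum(g * g for g in grads))  # global L2 norm over all gradient values
    return {
        "train/loss": loss,
        "train/learning_rate": learning_rate,
        "train/grad_norm": grad_norm,
        "train/tokens_per_sec": tokens / seconds,
    }


# Made-up values for illustration only.
metrics = step_metrics(loss=1.25, learning_rate=2e-4, grads=[0.3, 0.4], tokens=4096, seconds=2.0)
print(metrics)
# Inside a live run this dict could be sent with: wandb.log(metrics, step=step)
```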
From companion evaluation artifacts (50-sample lightweight run):
- Base model avg response length: 1151.88 chars
- LoRA avg response length: 1178.74 chars (+26.86 chars)
- Domain-specific term coverage:
  - University: base 48%, LoRA 42%
  - Tuition: base 12%, LoRA 22%
  - Scholarship: base 20%, LoRA 16%

Status: Manual blinded scoring and factuality audit still pending.
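Metrics of this kind can be computed with a few lines of Python. The sketch below assumes that coverage means the share of responses containing the term as a case-insensitive substring; the evaluation artifacts may define it differently, and the toy responses are not samples from the evaluation run.

```python
def average_length(responses):
    """Mean response length in characters."""
    return sum(len(r) for r in responses) / len(responses)


def term_coverage(responses, term):
    """Percentage of responses that mention `term` (case-insensitive substring match)."""
    hits = sum(1 for r in responses if term.lower() in r.lower())
    return 100.0 * hits / len(responses)


# Toy responses for illustration only.
samples = [
    "Most schools publish tuition fees on their admissions page.",
    "A scholarship can cover part of your tuition.",
    "Book your visa appointment early.",
    "Check the University portal for deadlines.",
]

for term in ("university", "tuition", "scholarship"):
    print(f"{term}: {term_coverage(samples, term):.0f}%")
print(f"avg length: {average_length(samples):.2f} chars")
```

Substring matching is deliberately simple: it counts a response once no matter how often the term appears, and it says nothing about whether the surrounding statement is correct.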
- architecture.md: Technical details, memory management, model pipeline
- WANDB.md: Setting up monitoring and accessing training reports
- training_analysis.md: Loss analysis, convergence metrics, resource profiling
- paper.md / paper_v2.md: Research methodology and findings
- conclusions.md: Key insights and recommendations
- documentation.md: General setup and usage guide
- readme-dataset.md: Dataset-specific notes and preprocessing
- ✅ Parameter-Efficient: LoRA reduces trainable params to ~4.7M
- ✅ Memory-Efficient: Runs on T4 (16GB) with 4-bit quantization
- ✅ Production-Ready: Merged weights in Hugging Face model card
- ✅ Reproducible: Fixed seed, deterministic generation
- ✅ Monitored: Full WandB integration for training transparency
- ✅ Domain-Focused: 2,274 synthetic study-abroad conversations
- ✅ Inference-Optimized: Unsloth for faster generation
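The reproducibility item above relies on seeding every random number generator before training or sampling. A minimal sketch follows, assuming a seed of 42; the README states that the seed is fixed but not its value, and `set_seed` here is an illustrative helper rather than the notebooks' code.

```python
import random

SEED = 42  # hypothetical value; the README does not state which seed is used


def set_seed(seed: int) -> None:
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    except ImportError:
        pass


set_seed(SEED)
first = random.random()
set_seed(SEED)
second = random.random()
print(first == second)
```

For deterministic generation, passing `do_sample=False` to `generate` selects greedy decoding, so repeated calls with the same prompt follow the same path.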
| GPU | VRAM | Training Time | Status |
|---|---|---|---|
| Tesla T4 | 16GB | ~3-4 hours | ✅ Tested |
| Tesla P100 | 16GB+ | ~1-2 hours | ✅ Tested |
| A100 | 40GB+ | <1 hour | Should work |
Note: Training on CPU is not recommended. Kaggle provides free GPU access.
If you use this code or model, please cite:

```bibtex
@article{hosen2025lora,
  title={A LoRA-Based Approach to Fine-Tuning LLMs for Educational Guidance in Resource-Constrained Settings},
  author={Hosen, Md Millat},
  journal={arXiv preprint arXiv:2504.15610},
  year={2025},
  doi={10.48550/arXiv.2504.15610}
}
```

License: Apache 2.0
- This is a domain-adapted model for experimental use. Validate all outputs against official university and immigration sources before operational use.
- The model is not a replacement for official advising.
- Use in production should include additional validation and factuality checking.
Open an issue on GitHub or check the companion evaluation artifacts in the LoRA Paper workspace.