Structured Markdown note generation from learning videos: ASR + multimodal chapter segmentation (TextTiling × Chinese-CLIP × Qwen2.5-VL captions × LLM) + glossary + bilingual output. A structured note-generation system for learning videos (undergraduate thesis / course project).
markdown thesis nextjs video-summarization speech-recognition note-taking whisper bilingual multimodal llm faster-whisper qwen chinese-clip chapter-segmentation
Updated May 18, 2026 - Python