Support for a locally hosted AI service instead of Claude

Experiments with local "large" language models, run through Ollama for OCR and expense extraction, have proven unfruitful. Implementing this myself would need bigger models than my 8 GB of VRAM allows. It probably won't work reliably at all unless users have a really big GPU in their server, with at least 16 GB of VRAM.
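
For reference, this is roughly the kind of request such an experiment involves. It is a minimal sketch, not the code that was actually tried: it assumes Ollama is listening on its default local port and that a vision-capable model is pulled. The model name "llava", the prompt, and the merchant/date/total fields are placeholders chosen for illustration.

```python
import base64
import json
import urllib.request
from typing import Optional

# Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Hypothetical prompt and field names, chosen only for this sketch.
REQUIRED_KEYS = {"merchant", "date", "total"}
PROMPT = (
    "Read this receipt and reply with JSON only, using the keys "
    '"merchant", "date" and "total".'
)


def build_payload(image_bytes: bytes, model: str = "llava") -> dict:
    """Build the request body: prompt plus the receipt image as base64."""
    return {
        "model": model,  # assumption: any vision-capable model pulled locally
        "prompt": PROMPT,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "format": "json",  # ask Ollama to constrain the reply to JSON
        "stream": False,   # one complete reply instead of a token stream
    }


def parse_expense(response_text: str) -> Optional[dict]:
    """Return the extracted fields, or None if the reply is unusable."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS <= data.keys():
        return None
    return data


def extract_expense(image_bytes: bytes, model: str = "llava") -> Optional[dict]:
    """Send one receipt image to the local Ollama server and parse the reply."""
    body = json.dumps(build_payload(image_bytes, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        reply = json.load(response)
    # With streaming disabled, the generated text is in the "response" field.
    return parse_expense(reply.get("response", ""))
```

parse_expense returns None whenever the reply is not valid JSON or lacks a field, so unusable model output can be detected instead of being stored as an expense.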