This tool uses OpenAI's language model to answer questions about uploaded documents. It provides direct responses with sourced references, making document exploration faster and more efficient.
Like the CSV/PDF Analyzer, this tool originates from pilot projects that used OCR and OpenAI to extract and summarize data from documents. However, it handles one prompt at a time and uses retrieval-augmented generation (RAG) backed by chromadb, used here as a non-persistent database. With RAG, only the chunks of the uploaded document that are relevant to a given prompt are passed to OpenAI, which reduces processing time and cost.
Detailed documentation can be found here.
Note that this repository is strictly the backend logic for this tool. The frontend is hosted on the OCDS Educational AI Hub found here.
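The retrieval step of RAG described above can be illustrated with a minimal, dependency-free sketch. It is not this repository's implementation: the tool uses an embedding model with chromadb, whereas the sketch substitutes a bag-of-words cosine similarity so that it runs with the standard library alone. The function names and the chunk size are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, max_words: int = 40) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def _vectorize(text: str) -> Counter:
    """Stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question, best match first."""
    q = _vectorize(question)
    ranked = sorted(chunks, key=lambda c: _cosine(q, _vectorize(c)), reverse=True)
    return ranked[:k]
```

Only the chunks returned by `top_k_chunks` would be placed in the prompt, which is where the savings in processing time and cost come from.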
Requests are handled in the app/main.py file. The API exposes the following endpoints:
POST /di_extract_document/
Request that takes a PDF document, uses an Azure Document Intelligence prebuilt model to convert the PDF into a stringified JSON, and returns it.
WS /ws/chat_stream
WebSocket that builds chunk objects from the document string, retrieves the chunks relevant to the given question using an embedding model, then asks an LLM the question against the selected chunks. The response is returned as a stream (in chunks).
POST /di_chunk_single_document/
Request that takes a single PDF document and uses an Azure Document Intelligence prebuilt model to convert the PDF into markdown chunks. The chunks are combined into a JSON containing text chunks and metadata, which is then returned.
POST /di_chunk_multi_document/
Request that takes multiple PDF documents and uses an Azure Document Intelligence prebuilt model to convert each PDF into markdown chunks. The chunks are combined into a JSON containing text chunks and metadata, which is then returned.
All endpoints support CORS and are designed to be consumed by the associated frontend applications.
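As a rough illustration of calling one of the POST endpoints above, the sketch below builds a multipart upload for /di_extract_document/ using only the standard library. The form field name `file` is an assumption (check app/main.py for the actual parameter name), and the base URL assumes the API is reachable locally on port 8080. The helper name `build_extract_request` is hypothetical.

```python
import urllib.request
import uuid


def build_extract_request(
    pdf_bytes: bytes,
    filename: str,
    base_url: str = "http://localhost:8080",  # assumed local address
    field_name: str = "file",                 # assumed form field name
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST carrying one PDF."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/di_extract_document/",
        data=head + pdf_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

To send the request, pass the result to `urllib.request.urlopen` and read the response body, which the endpoint description says is a stringified JSON.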
Before this API can be run, the environment variables need to be initialized.
- Create a new file in /app called .env by copying app/.env.example and filling in the missing keys for the required Azure resources. Instead of adding the keys to the .env file, you can set KEY_VAULT_NAME to the name of a key vault containing the keys, either in the .env file, in a docker-compose.yml, or on the resource running the Docker instance. Note that a key vault for API secrets only works when the API is hosted on Azure resources with access to that key vault.
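The choice between .env keys and a key vault can be sketched as a small lookup helper. This is an illustration under stated assumptions, not the repository's actual loading code: the helper name `load_secret` and the example variable name are hypothetical, and the underscore-to-dash mapping is an assumption made because Key Vault secret names accept only letters, digits, and dashes. The Key Vault branch requires the azure-identity and azure-keyvault-secrets packages.

```python
import os


def load_secret(name: str) -> str:
    """Return a secret from the environment, else from Key Vault if configured."""
    value = os.environ.get(name)
    if value:
        return value

    vault_name = os.environ.get("KEY_VAULT_NAME")
    if not vault_name:
        raise KeyError(f"{name} is not set and KEY_VAULT_NAME is not configured")

    # Imported lazily so the .env-only path needs no Azure packages.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(
        vault_url=f"https://{vault_name}.vault.azure.net",
        credential=DefaultAzureCredential(),
    )
    # Assumed naming convention: EXAMPLE_API_KEY is stored as EXAMPLE-API-KEY.
    return client.get_secret(name.replace("_", "-")).value
```

`DefaultAzureCredential` relies on an identity available in the hosting environment, which is consistent with the note above that the key vault option only works on Azure resources with access to the vault.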
You can run the API by building the docker image then running a container for it. Use the following commands:
docker build -t pdf-chatbot:latest .
docker run -d -p 8080:8000 --name pdf-chatbot pdf-chatbot
This creates an image called pdf-chatbot, then runs it in a container called pdf-chatbot that is exposed on port 8080.
This tool is currently hosted in the OCDS SSC (Shared Services Canada) Azure environment to be used by the OCDS Educational AI Hub. To update the deployment, follow the steps below:
- Run the command docker build -t pdf-chatbot:latest . to update your local Docker instance with the latest changes.
- Update the Docker image hosted in the SSC Azure environment by running the following commands in a terminal (you will need to install the Azure CLI if you don't have it):
az login --tenant 8c1a4d93-d828-4d0e-9303-fd3bd611c822
az acr login --name AIPortal
docker tag pdf-chatbot aiportal.azurecr.io/ocds-ai-portal/pssi-chatbot:latest
docker push aiportal.azurecr.io/ocds-ai-portal/pssi-chatbot:latest
- Lastly, SSH into the VM hosting the API and rerun its docker-compose.yml to use the latest version of the Docker image.
Azure OpenAI provides the large language models that generate AI responses and the embedding models used to implement RAG. The PDF Chatbot, CSV/PDF Analyzer, Web Scraper, and Document OCR tools rely on OpenAI models.
These models interpret and respond to user queries and analyze text data extracted from documents. This integration allows for:
- Enhanced natural language processing for real-time conversation with documents.
- Sophisticated data extraction and analysis from various file formats, enabling deep insights into document content.
Document Intelligence is used to read and interpret the contents of documents uploaded to the AI Hub's PDF Chatbot, CSV/PDF Analyzer, PII Redactor, Sensitivity Score Calculator, French Translation, and Document OCR tools. It helps to:
- Automatically extract text and data from structured and unstructured documents, including non-machine-readable files.
- Organize extracted text into markdown format, allowing LLMs and embedding models to process document text easily.
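To show why markdown output is convenient for downstream models, here is a minimal sketch that splits markdown text into chunks with metadata. It assumes the markdown has already been produced by Document Intelligence and that chunks are split on headings; the actual chunking strategy and the JSON field names (`text`, `metadata`, `source`, `heading`, `index`) are hypothetical.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*)")


def chunk_markdown(markdown: str, source: str) -> list[dict]:
    """Split markdown on headings and attach simple metadata to each chunk."""
    sections: list[tuple[str, list[str]]] = []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match or not sections:
            # Start a new section at each heading (or at the first line).
            sections.append((match.group(1).strip() if match else "", [line]))
        else:
            sections[-1][1].append(line)

    chunks: list[dict] = []
    for heading, lines in sections:
        text = "\n".join(lines).strip()
        if text:  # skip empty sections
            chunks.append({
                "text": text,
                "metadata": {"source": source, "heading": heading, "index": len(chunks)},
            })
    return chunks
```

Each chunk keeps its heading alongside the text, so a retrieved passage can be traced back to the part of the document it came from.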