Skip to content

updated LLM endpoint adding note information#2431

Open
idkcoder93 wants to merge 5 commits into
openo-beta:developfrom
idkcoder93:alexBranch
Open

updated LLM endpoint adding note information#2431
idkcoder93 wants to merge 5 commits into
openo-beta:developfrom
idkcoder93:alexBranch

Conversation

@idkcoder93

@idkcoder93 idkcoder93 commented May 19, 2026

Copy link
Copy Markdown

LLM endpoints currently takes demographic, allergies, measurements, and medication. Notes are now added for extraction to use for our project. Development is still ongoing to see what other information for our endpoint is required.


Summary by cubic

Adds a new LLM export endpoint that returns a structured JSON export of a patient’s chart, including encounter notes and extracted text from PDF documents. Adds an “AI Summary” button in Chart Notes that opens the patient’s summary page.

  • New Features

    • REST endpoint at /llm/export?demographicId=... returning JSON with demographics, allergies, medications, prescriptions, measurements, documents, and encounter notes.
    • PDF text extraction for documents (PDFBox) with SHA-256 content-hash validation and a two-layer cache: llm_document_text table (persistent) and an in-memory cache; supports bulk preload per patient.
    • Enforces demographic read privilege; returns 403 when access is denied.
    • Adds an “AI Summary” button in Chart Notes that opens http://localhost:3000/patient/{demographicId} in a new tab.
  • Migration

    • Run database/mysql/updates/update-2026-05-31.sql to create the llm_document_text table.

Written for commit 9ef1f9a. Summary will update on new commits.

Review in cubic

Summary by CodeRabbit

  • New Features

    • Added LLM patient export endpoint to produce structured patient JSON for external use.
    • Automatic PDF text extraction with caching to include extracted text in exports.
  • User Interface

    • Added “AI Summary” button in chart notes to open the patient AI summary view.
  • Database

    • New table to store cached extracted PDF document text for faster exports.

@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@sourcery-ai sourcery-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @idkcoder93, your pull request is larger than the review limit of 150000 diff characters

@coderabbitai

coderabbitai Bot commented May 19, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds an AI patient export API producing LLM-ready JSON and a two-layer PDF text cache (in-memory + DB) with hash-based change detection; includes schema, JPA model/DAO, extraction service, REST endpoint, export assembly, Spring wiring, a ChartNotes UI button, a devcontainer Playwright tweak, and lockfile bumps.

Changes

AI Patient Export with LLM Integration

Layer / File(s) Summary
Database schema for extracted document text cache
database/mysql/oscarinit.sql, database/mysql/updates/update-2026-05-31.sql
New llm_document_text table stores cached extracted PDF text with document/demographic foreign keys, SHA-256 content hash for change detection, and extraction timestamp.
JPA entity and DAO
src/main/java/ca/openosp/openo/commn/model/LlmDocumentText.java, src/main/java/ca/openosp/openo/commn/dao/LlmDocumentTextDao.java, src/main/java/ca/openosp/openo/commn/dao/LlmDocumentTextDaoImpl.java, src/test-modern/resources/META-INF/persistence.xml
LlmDocumentText entity maps the cache table; DAO interface and implementation provide query/save/delete operations; test persistence unit registers the entity.
PDF extraction and dual-layer caching
src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java
In-memory L1 cache plus DB L2 validated by SHA-256 of PDF bytes; extracts plain text with PDFBox on miss/stale, persists/upserts extraction records, warms cache, and supports eviction.
REST endpoint for patient export
src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorEndpoint.java
JAX-RS GET /llm/export resource that enforces access control and returns generated patient export JSON or a 403 error payload.
Patient export aggregation and JSON generation
src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorService.java
Loads demographics, allergies, medications, prescriptions, documents (uses cached/extracted PDF text), measurements, and notes; builds structured payload with context and summary_hints, serializes to JSON with ISO-8601 UTC timestamps and helpers.
Spring/CXF wiring and config
src/main/resources/applicationContextREST.xml, src/main/resources/spring_ws.xml
Registers the new REST endpoint bean in CXF JAX-RS server and imports REST application context.
UI trigger for AI summary
src/main/webapp/casemgmt/ChartNotes.jsp
Adds an “AI Summary” button that opens the AI summary viewer for the current demographic.

DevContainer Playwright Configuration Update

Layer / File(s) Summary
Playwright browser installation update
.devcontainer/development/Dockerfile
Removes Chrome from Playwright installation; installs only Chromium and Firefox with --with-deps and updates the status message.

Dependency lock updates

Layer / File(s) Summary
Lockfile bumps for JNA
dependencies-lock-modern.json, dependencies-lock.json
Bumps net.java.dev.jna:jna from 5.18.1 to 5.19.0 and updates integrity checksums in both lockfiles.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • openo-beta/Open-O#1669: Related change to the Playwright browser install step in the devcontainer Dockerfile.

Suggested labels

Review effort 4/5

Suggested reviewers

  • D3V41
  • LiamStanziani

Poem

🐰
I hop through PDFs, nibble text with glee,
Caching bytes and hashes, keeping them tidy.
Exports hum softly, JSON bright and neat,
Buttons open portals to patient feats.
Hooray — the rabbit’s done another speedy deed!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 31.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title mentions 'LLM endpoint' and 'note information', which are present but represent only partial aspects of a much larger changeset that includes database schema, persistence layer, caching infrastructure, and UI updates.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch alexBranch

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new LLM export REST endpoint for AI-ready patient data, adds OMA uninsured service fee codes for Ontario billing, and implements the NRTF measurement type for diabetes flowsheet compliance. It also features a significant refactor of the prescription drug profile interface to support client-side filtering and includes security enhancements such as document validation for eReferrals. Review feedback emphasizes the need for an explicit authorization check on the LLM endpoint, improved idempotency in SQL scripts, and the removal of redundant imports and bean registrations. Additionally, it is recommended to use standardized date formatting for patient data exports to ensure consistency across locales.

Comment on lines +14 to +28
@Path("/llm")
public class LLMExtractorEndpoint {

@Autowired
private LLMExtractorService llmExtractorService;

@GET
@Path("/export")
@Produces(MediaType.TEXT_PLAIN)
public String exportPatient(
@QueryParam("demographicId") Integer demographicId,
@Context HttpServletRequest request) {
LoggedInInfo loggedInInfo = LoggedInInfo.getLoggedInInfoFromRequest(request);
return llmExtractorService.generatePatientExport(loggedInInfo, demographicId);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The new LLM export endpoint lacks an explicit security privilege check. Since this endpoint exports sensitive patient data (demographics, allergies, medications, notes), it should verify that the logged-in user has the necessary permissions (e.g., _read) before proceeding. Additionally, it is good practice to handle cases where loggedInInfo might be null.

@Path("/llm")
public class LLMExtractorEndpoint {

    @Autowired
    private LLMExtractorService llmExtractorService;

    @Autowired
    private ca.openosp.openo.managers.SecurityInfoManager securityInfoManager;

    @GET
    @Path("/export")
    @Produces(MediaType.TEXT_PLAIN)
    public String exportPatient(
            @QueryParam("demographicId") Integer demographicId,
            @Context HttpServletRequest request) {
        LoggedInInfo loggedInInfo = LoggedInInfo.getLoggedInInfoFromRequest(request);
        if (loggedInInfo == null || !securityInfoManager.hasPrivilege(loggedInInfo, "_read", ca.openosp.openo.managers.SecurityInfoManager.READ, null)) {
            return "Access Denied";
        }
        return llmExtractorService.generatePatientExport(loggedInInfo, demographicId);
    }
}

Comment on lines +9 to +13
ON DUPLICATE KEY UPDATE
`typeDisplayName`='Neurological exam: 128Hz tuning fork D1',
`typeDescription`='Neurological exam: 128Hz tuning fork D1',
`measuringInstruction`='Normal',
`validation`='7';

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The ON DUPLICATE KEY UPDATE clause is missing the createDate column. To ensure the statement is fully idempotent and consistent with the organization's general rules, all relevant columns from the INSERT should be included in the UPDATE clause.

ON DUPLICATE KEY UPDATE
  typeDisplayName='Neurological exam: 128Hz tuning fork D1',
  typeDescription='Neurological exam: 128Hz tuning fork D1',
  measuringInstruction='Normal',
  validation='7',
  createDate='2026-03-23 00:00:00';
References
  1. When using INSERT ... ON DUPLICATE KEY UPDATE, update all relevant columns in the ON DUPLICATE KEY UPDATE clause to ensure the statement is fully idempotent.

Comment on lines +3 to +30
//import ca.openosp.openo.PMmodule.model.SecUserRole;
import ca.openosp.openo.casemgmt.model.CaseManagementNote;
import ca.openosp.openo.casemgmt.service.NoteSelectionCriteria;
import ca.openosp.openo.casemgmt.service.NoteSelectionResult;
import ca.openosp.openo.casemgmt.service.NoteService;
import ca.openosp.openo.commn.model.Allergy;
import ca.openosp.openo.commn.model.Demographic;
import ca.openosp.openo.commn.model.Document;
import ca.openosp.openo.commn.model.Drug;
import ca.openosp.openo.commn.model.Measurement;
import ca.openosp.openo.commn.model.Prescription;
import ca.openosp.openo.managers.AllergyManager;
import ca.openosp.openo.managers.DemographicManager;
import ca.openosp.openo.managers.DocumentManager;
import ca.openosp.openo.managers.MeasurementManager;
import ca.openosp.openo.managers.PrescriptionManager;
import ca.openosp.openo.utility.LoggedInInfo;
import ca.openosp.openo.utility.MiscUtils;
import org.apache.logging.log4j.Logger;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import ca.openosp.openo.casemgmt.model.CaseManagementNote;
import ca.openosp.openo.casemgmt.service.NoteSelectionCriteria;
import ca.openosp.openo.casemgmt.service.NoteSelectionResult;
import ca.openosp.openo.casemgmt.service.NoteService;
import ca.openosp.openo.PMmodule.dao.SecUserRoleDao;
import ca.openosp.openo.PMmodule.model.SecUserRole;
import java.util.stream.Collectors;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

There are several duplicate and redundant imports in this file. Specifically, CaseManagementNote, NoteSelectionCriteria, NoteSelectionResult, and NoteService are imported twice (lines 4-7 and 24-27). Also, line 3 is a commented-out import that should be removed.

import ca.openosp.openo.casemgmt.model.CaseManagementNote;
import ca.openosp.openo.casemgmt.service.NoteSelectionCriteria;
import ca.openosp.openo.casemgmt.service.NoteSelectionResult;
import ca.openosp.openo.casemgmt.service.NoteService;
import ca.openosp.openo.commn.model.Allergy;
import ca.openosp.openo.commn.model.Demographic;
import ca.openosp.openo.commn.model.Document;
import ca.openosp.openo.commn.model.Drug;
import ca.openosp.openo.commn.model.Measurement;
import ca.openosp.openo.commn.model.Prescription;
import ca.openosp.openo.managers.AllergyManager;
import ca.openosp.openo.managers.DemographicManager;
import ca.openosp.openo.managers.DocumentManager;
import ca.openosp.openo.managers.MeasurementManager;
import ca.openosp.openo.managers.PrescriptionManager;
import ca.openosp.openo.utility.LoggedInInfo;
import ca.openosp.openo.utility.MiscUtils;
import org.apache.logging.log4j.Logger;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import ca.openosp.openo.PMmodule.dao.SecUserRoleDao;
import ca.openosp.openo.PMmodule.model.SecUserRole;
import java.util.stream.Collectors;

n.setSigned(nd.isSigned());
return n;
})
.collect(java.util.stream.Collectors.toList());

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The fully qualified name for Collectors.toList() is redundant since Collectors is already imported.

Suggested change
.collect(java.util.stream.Collectors.toList());
.collect(Collectors.toList());

Comment on lines +380 to +382
private String fmt(Date date) {
return date == null ? "N/A" : date.toString();
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using date.toString() for patient data export is discouraged as it produces a locale-dependent format. To maintain consistency with the repository's approach of avoiding global configurations, please use per-field configuration (such as @jsonformat or a specific formatter) to specify a standardized format like ISO 8601 (yyyy-MM-dd) for this export.

References
  1. Do not suggest applying a single, global date serialization configuration. The system intentionally uses multiple date formats, so per-field configuration like @jsonformat is the required approach.

<bean class="ca.openosp.openo.webserv.rest.MeasurementService" autowire="byName" />
<bean class="ca.openosp.openo.webserv.rest.DocumentService" autowire="byName" />
<bean class="ca.openosp.openo.webserv.rest.ConsentService" autowire="byName"/>
<bean class="ca.openosp.openo.webserv.rest.LLMExtractorEndpoint" autowire="byName"/>

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The LLMExtractorEndpoint bean is being registered in both applicationContextREST.xml and spring_ws.xml. Since spring_ws.xml imports applicationContextREST.xml, this double registration is redundant and could lead to configuration conflicts. It should be defined in only one of these files.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 49 files

Confidence score: 1/5

  • High risk due to a concrete security gap in src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorEndpoint.java: sensitive patient data can be exported without a privilege check, which is likely merge-blocking.
  • There is also a runtime stability concern in src/main/resources/spring_ws.xml because LLMExtractorEndpoint is registered twice after importing applicationContextREST.xml, which can trigger bean creation conflicts at startup.
  • src/main/java/ca/openosp/openo/commn/dao/DocumentDaoImpl.java introduces a data-consistency regression risk by filtering only CtlDocument.status and not Document.status, potentially surfacing deleted documents.
  • Pay close attention to src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorEndpoint.java and src/main/resources/spring_ws.xml - access control and duplicate bean wiring are the highest-impact risks.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/main/resources/spring_ws.xml">

<violation number="1" location="src/main/resources/spring_ws.xml:224">
P2: The `LLMExtractorEndpoint` bean is registered in both `applicationContextREST.xml` (line 193) and `spring_ws.xml` (line 223). Since `spring_ws.xml` now imports `applicationContextREST.xml` (line 305), this causes a double registration that can lead to duplicate JAX-RS resource registration or bean conflicts. Remove the bean definition from one of these files.</violation>
</file>

<file name="src/main/java/ca/openosp/openo/commn/dao/DocumentDaoImpl.java">

<violation number="1" location="src/main/java/ca/openosp/openo/commn/dao/DocumentDaoImpl.java:255">
P2: New lookup only checks CtlDocument.status and omits Document.status, risking inclusion of deleted documents if the two tables drift.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorEndpoint.java Outdated
Query query = entityManager.createQuery(
"SELECT c.id.documentNo FROM CtlDocument c " +
"WHERE c.id.module = 'demographic' AND c.id.moduleId = :demographicNo " +
"AND c.status != 'D' AND c.id.documentNo IN :docNos");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: New lookup only checks CtlDocument.status and omits Document.status, risking inclusion of deleted documents if the two tables drift.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/java/ca/openosp/openo/commn/dao/DocumentDaoImpl.java, line 255:

<comment>New lookup only checks CtlDocument.status and omits Document.status, risking inclusion of deleted documents if the two tables drift.</comment>

<file context>
@@ -245,6 +245,19 @@ public void subtractPages(String documentNo, Integer i) {
+        Query query = entityManager.createQuery(
+                "SELECT c.id.documentNo FROM CtlDocument c " +
+                "WHERE c.id.module = 'demographic' AND c.id.moduleId = :demographicNo " +
+                "AND c.status != 'D' AND c.id.documentNo IN :docNos");
+        query.setParameter("demographicNo", demographicNo);
+        query.setParameter("docNos", docNos);
</file context>

<bean class="ca.openosp.openo.webserv.rest.DiseaseRegistryService" autowire="byName" />
<!-- <bean class="org.oscarehr.webservbserv.rest.SystemInfoService" autowire="byName" /> -->
<!-- LLM endpoint added below -->
<bean class="ca.openosp.openo.webserv.rest.LLMExtractorEndpoint" autowire="byName" />

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: The LLMExtractorEndpoint bean is registered in both applicationContextREST.xml (line 193) and spring_ws.xml (line 223). Since spring_ws.xml now imports applicationContextREST.xml (line 305), this causes a double registration that can lead to duplicate JAX-RS resource registration or bean conflicts. Remove the bean definition from one of these files.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/resources/spring_ws.xml, line 224:

<comment>The `LLMExtractorEndpoint` bean is registered in both `applicationContextREST.xml` (line 193) and `spring_ws.xml` (line 223). Since `spring_ws.xml` now imports `applicationContextREST.xml` (line 305), this causes a double registration that can lead to duplicate JAX-RS resource registration or bean conflicts. Remove the bean definition from one of these files.</comment>

<file context>
@@ -220,6 +220,8 @@
 			<bean class="ca.openosp.openo.webserv.rest.DiseaseRegistryService" autowire="byName" />
 <!--			<bean class="org.oscarehr.webservbserv.rest.SystemInfoService" autowire="byName" /> -->
+										<!-- LLM endpoint added below -->
+			<bean class="ca.openosp.openo.webserv.rest.LLMExtractorEndpoint" autowire="byName" />
 			<bean class="ca.openosp.openo.webserv.rest.LabService" autowire="byName" />
 			<bean class="ca.openosp.openo.webserv.rest.MeasurementService" autowire="byName" />
</file context>

@lacarmen lacarmen changed the base branch from main to develop May 21, 2026 12:23

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.devcontainer/development/Dockerfile:
- Around line 74-75: The Dockerfile currently installs only Playwright's
chromium and firefox via the RUN npx playwright install command but the test
code creates new ChromeDriver(options) in BaseTest.java without setting
CHROME_BIN or using ChromeOptions#setBinary, so Selenium won't find a Chrome
binary; either install a Chrome/Chromium browser binary (or add chrome to the
playwright install on non-arm64 architectures) or update the test setup to point
ChromeOptions#setBinary to Playwright's installed Chromium executable; also fix
the echo message to reflect that Chrome is unconditionally omitted (or make it
conditional on architecture) rather than claiming an “arm64 Chrome on Linux”
limitation.

In `@database/mysql/oscarinit.sql`:
- Around line 1093-1096: Add a uniqueness constraint to prevent duplicate cache
rows for the same document/patient pair by adding a UNIQUE index on
(document_no, demographic_no); modify the table definition (near the existing
INDEX idx_document_no and INDEX idx_demographic_no lines) to include UNIQUE KEY
(document_no, demographic_no) or run ALTER TABLE ... ADD CONSTRAINT
unique_document_demographic UNIQUE (document_no, demographic_no) so the cache
enforces one row per (document_no, demographic_no).
- Around line 1086-1097: The llm_document_text table lacks the required audit
columns; add lastUpdateUser and lastUpdateDate to the CREATE TABLE for
llm_document_text (and to any ALTER if updating an existing table): add a
lastUpdateUser column (e.g., VARCHAR(64) NOT NULL) and a lastUpdateDate column
(DATETIME NOT NULL with appropriate default/ON UPDATE CURRENT_TIMESTAMP per
project conventions) and ensure indexes/constraints remain unchanged; update any
insert/update code or migrations that populate these fields if present.

In `@database/mysql/updates/update-2026-05-31.sql`:
- Around line 2-13: The new table llm_document_text is missing the required
audit-trail columns; add columns named lastUpdateUser (VARCHAR appropriate
length, NOT NULL) and lastUpdateDate (DATETIME NOT NULL) to the CREATE TABLE for
llm_document_text and ensure they follow existing repo conventions (types, NOT
NULL constraints, and ordering consistent with other tables) so the migration
satisfies the "All database tables MUST include lastUpdateUser and
lastUpdateDate" rule.
- Around line 2-13: The table llm_document_text currently permits duplicate
cache entries for the same logical key; add a UNIQUE constraint on the
document/patient pair to enforce one cached row per pair (e.g., unique on
(document_no, demographic_no)) by modifying the CREATE TABLE (or issuing ALTER
TABLE) to include a UNIQUE index (e.g., uniq_document_demographic) on those
columns; alternatively, if intended to allow multiple rows per document, update
the DAO that reads llm_document_text (which currently queries by document_no and
expects a single row) to return and handle multiple rows instead.

In `@src/main/java/ca/openosp/openo/commn/dao/LlmDocumentTextDao.java`:
- Around line 7-34: The interface LlmDocumentTextDao and its methods
(findByDocumentNo, findByDemographicNo, save, deleteByDocumentNo) are missing
required JavaDoc tags; update each method JavaDoc to include a `@since` tag with
the current module version or milestone, a `@param` entry documenting the
parameter (e.g., documentNo or demographicNo) and its meaning, and a `@return` tag
describing the returned value (including when Optional is empty or what the List
contains) where applicable; also add a class-level JavaDoc `@since` for
LlmDocumentTextDao itself so the public API meets the repository JavaDoc
standard.

In `@src/main/java/ca/openosp/openo/commn/dao/LlmDocumentTextDaoImpl.java`:
- Around line 15-54: Add comprehensive JavaDoc to the public class
LlmDocumentTextDaoImpl and to each public method (findByDocumentNo,
findByDemographicNo, save, deleteByDocumentNo): document the purpose of the
class and each method, include a `@since` tag, typed `@param` tags (e.g., `@param`
documentNo Integer, `@param` demographicNo Integer, `@param` entity LlmDocumentText)
and precise `@return` tags (e.g., `@return` Optional<LlmDocumentText>, `@return`
List<LlmDocumentText>, `@return` LlmDocumentText, `@return` void) plus any
thrown/runtime behavior; ensure descriptions describe inputs, outputs, and side
effects (persistence/merge/delete) to satisfy the project's JavaDoc
requirements.

In `@src/main/java/ca/openosp/openo/commn/model/LlmDocumentText.java`:
- Around line 21-87: Add comprehensive JavaDoc to the public class
LlmDocumentText and all its public methods (both constructors and all
getters/setters), describing purpose, parameters, return values and any
nullability behavior; include accurate `@since` tags (use 2026-06-01) on the class
and each public method/constructor and ensure the class-level description
expands on the existing summary (what the entity represents, meaning of fields
like contentHash and extractedAt) and each accessor documents its contract
(e.g., getExtractedText may return null), so tools and readers have full API
documentation.
- Around line 36-49: The extractedText field in LlmDocumentText is storing PHI
in plaintext; update persistence to encrypt/decrypt it using the repo's
EncryptionUtils (or existing encryption service) so extractedText is encrypted
before being saved and decrypted only when assembling an export; modify the
LlmDocumentText mapping/DAO/repository interactions (references: class
LlmDocumentText, field extractedText) to call EncryptionUtils.encrypt(...) on
set/save and EncryptionUtils.decrypt(...) when loading/assembling exports,
ensure column remains MEDIUMTEXT but stores ciphertext and add any necessary
null/exception handling for failed encryption/decryption; apply same change to
any other plaintext PHI fields mentioned (e.g., other extracted_* fields
referenced at 57-64).

In `@src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java`:
- Around line 168-177: The persist method writes extracted PDF text (PHI) in
plaintext to LlmDocumentText; change it to encrypt the text before saving by
calling EncryptionUtils.encrypt(...) and store the ciphertext via
record.setExtractedText(...), keep contentHash as-is, and ensure any callers
that read LlmDocumentText use EncryptionUtils.decrypt(...) when returning plain
text; also remove/avoid any logging of the plaintext in persist and reference
the persist method and LlmDocumentText record when making this change.
- Around line 145-150: The code reads a rendered Path directly (using
documentManager.renderDocument and Files.exists/readAllBytes) without validating
it; update LlmDocumentTextService to validate pdfPath via the project's
PathValidationUtils before accessing the file (e.g., call the appropriate
PathValidationUtils method to ensure the path is within allowed locations/not a
traversal attack), and if validation fails log a warning similar to the existing
logger.warn and return null; only after successful validation proceed to
Files.readAllBytes(pdfPath).
- Around line 81-84: The in-memory L1 cache is being trusted blindly
(memoryCache.containsKey(docNo) / memoryCache.get(docNo)) which allows stale
extracted text to be served after the backing PDF changes; update
getExtractedText() to validate cached entries by comparing a stored
checksum/etag with the current PDF bytes (e.g., compute hash on pdfBytes) before
returning and fall back to re-extraction when mismatched, and ensure
getPreloadedTexts() stores that same checksum/etag alongside the extracted text
in memoryCache so the validation can occur consistently.
- Around line 81-84: The memory cache returns the sentinel empty string written
by putInMemoryCache(), violating the LlmDocumentTextService contract; update the
L1 fast-path read and the other cache-read site (the read around lines
referenced: the memoryCache.get(docNo) in LlmDocumentTextService) to treat "" as
null by checking the retrieved value from memoryCache (e.g., result =
memoryCache.get(docNo); if (result != null && result.isEmpty()) return null;) so
callers receive null for documents with no extractable text; keep the sentinel
write in putInMemoryCache() unchanged.

In `@src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorEndpoint.java`:
- Around line 16-30: Add comprehensive JavaDoc to the public class
LLMExtractorEndpoint and its public method exportPatient: document the class
purpose and include a `@since` tag; for exportPatient add `@since`, `@param`
demographicId description, `@param` request description, `@return` describing the
Response payload and status codes, and `@throws` tags for any runtime or web
exceptions the method may propagate (e.g., WebApplicationException or
IOException as applicable); reference the injected dependencies
llmExtractorService and securityInfoManager in the class-level doc as their role
in the endpoint behavior.
- Around line 32-35: Replace the call to
LoggedInInfo.getLoggedInInfoFromRequest(request) with
LoggedInInfo.getLoggedInInfoFromSession(request), then immediately check
LoggedInInfo.isExpired() on the returned object; if expired, invalidate the
session (use LoggedInInfo.invalidateSession(request) or the project’s session
invalidation helper) and return an appropriate unauthorized response (e.g.,
Response.Status.UNAUTHORIZED) before calling
securityInfoManager.hasPrivilege(...). Ensure the subsequent privilege check
still uses securityInfoManager.hasPrivilege(loggedInInfo, "_demographic", "r",
demographicId).

In `@src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorService.java`:
- Around line 118-129: The system label in buildContext is using the old product
name; update the value put into ctx with key "system" (in method buildContext)
to the new approved label (e.g., "OpenO EMR patient export" or "OpenO") so
context.system no longer emits "OSCAR EMR patient export"; change only that
literal string in the ctx.put("system", ...) call to the approved name.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 46e79918-c99c-47ac-b9e2-c4968d93dbbb

📥 Commits

Reviewing files that changed from the base of the PR and between c9dbad8 and 2ce37d2.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (12)
  • .devcontainer/development/Dockerfile
  • database/mysql/oscarinit.sql
  • database/mysql/updates/update-2026-05-31.sql
  • src/main/java/ca/openosp/openo/commn/dao/LlmDocumentTextDao.java
  • src/main/java/ca/openosp/openo/commn/dao/LlmDocumentTextDaoImpl.java
  • src/main/java/ca/openosp/openo/commn/model/LlmDocumentText.java
  • src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java
  • src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorEndpoint.java
  • src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorService.java
  • src/main/resources/applicationContextREST.xml
  • src/main/resources/spring_ws.xml
  • src/test-modern/resources/META-INF/persistence.xml

Comment on lines +74 to +75
RUN npx playwright install --with-deps chromium firefox && \
echo "✅ Playwright browsers installed (Chromium, Firefox) -- does not support arm64 Chrome on Linux yet - removed Chrome from the list"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Search for Chrome browser references in development scripts and configs

# Search for references to 'chrome' browser in test configs and scripts
rg -i --type-add 'config:*.{json,yml,yaml,conf,config,properties}' --type config 'chrome' -g '!node_modules' -g '!.git'

# Search for Chrome-specific browser configurations in JavaScript/TypeScript test files
rg -i "browser.*chrome|launch.*chrome" --type js --type ts -g '!node_modules' -C3

Repository: openo-beta/Open-O

Length of output: 50373


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Limit search to likely dev/test automation areas to avoid vendor/static frontend noise.
rg -n -i "playwright|chromium\.launch|browserType: *'chromium'|launch\('chromium'|chromium\.launch\(|launch.*chrome|browser.*chrome|chrome\.launch|chromedriver|selenium-chrome-driver|ChromeDriver" \
  --glob '!.git/**' \
  --glob '!src/main/webapp/**' \
  --glob '!.*/node_modules/**' \
  --glob '!**/dist/**' \
  --glob '!**/build/**' \
  --glob '!.devcontainer/**/node_modules/**' \
  .

echo "----"
echo "Chrome-specific refs in package scripts (package*.json)"
rg -n -i '"[^"]*chrome[^"]*"|chromedriver|selenium-chrome-driver|playwright.*chrome' \
  --glob 'package*.json' \
  --glob '!.*/node_modules/**' \
  .

echo "----"
echo "Chrome refs in devcontainer/Docker/workflow configs"
rg -n -i "chrome|chromium" --glob '.devcontainer/**' --glob '.github/**' --glob '*.yml' --glob '*.yaml' --glob '*.sh' --glob '*.env' .

Repository: openo-beta/Open-O

Length of output: 6962


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Check whether Dockerfile already has any architecture logic and inspect surrounding lines.
FILE=".devcontainer/development/Dockerfile"
nl -ba "$FILE" | sed -n '50,110p'

Repository: openo-beta/Open-O

Length of output: 104


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".devcontainer/development/Dockerfile"

echo "=== Dockerfile lines 60-100 ==="
cat -n "$FILE" | sed -n '60,100p'

echo
echo "=== Dockerfile browser/chromedriver related lines ==="
rg -n -i "chrome|chromedriver|CHROME_BIN|chromium|google-chrome" "$FILE" || true

echo
echo "=== BaseTest.java (ChromeDriver setup) ==="
sed -n '1,140p' src/test/java/tests/BaseTest.java

echo
echo "=== Any other ChromeDriver usage ==="
rg -n "new ChromeDriver|ChromeDriver" src/test/java || true

Repository: openo-beta/Open-O

Length of output: 6471


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect the Dockerfile more broadly for any other Chrome/Chromium/CHROME_BIN/chromedriver setup.
FILE=".devcontainer/development/Dockerfile"
echo "=== Dockerfile browser-related lines (whole file) ==="
rg -n -i "google-chrome|chrome-linux|chromium|chromeDriver|chromedriver|CHROME_BIN|CHROME_HOME" "$FILE" || true

echo "=== Dockerfile lines 1-120 ==="
sed -n '1,120p' "$FILE"

# 2) Check Selenium configuration for binary paths / driver system properties.
echo
echo "=== Search for ChromeDriver binary/path configuration in src/test/java ==="
rg -n -i "webdriver\.chrome\.driver|CHROME_BIN|setBinary|binary.*chrome|chrome\.options|ChromeOptions" src/test/java || true

echo
echo "=== Search for chromedriver/selenium-chrome-driver references ==="
rg -n -i "chromedriver|selenium-chrome-driver|selenium.*chrome|ChromeDriver" . | head -n 200

echo
echo "=== Search UI test docs for Chrome vs Chromium ==="
rg -n -i "ChromeDriver|selenium|Chrome|Chromium|playwright.*chrome|playwright.*chromium" docs | head -n 200

Repository: openo-beta/Open-O

Length of output: 50375


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Where createHeadlessDriver() is used ==="
rg -n "createHeadlessDriver\\(" src/test/java

echo
echo "=== Where BaseTest.createHeadlessDriver() / ChromeDriver is used in tests ==="
rg -n "ChromeDriver|createHeadlessDriver\\(" src/test/java/tests

echo
echo "=== Any explicit Chrome binary path config in tests ==="
rg -n "setBinary\\(|CHROME_BIN|--binary|binary.*chrome|google-chrome|chromium" src/test/java || true

echo
echo "=== Maven: what runs tests, and any Selenium/ChromeDriver mentions ==="
rg -n -i "surefire|failsafe|testng|junit|selenium|chromedriver|ChromeDriver|playwright" pom.xml

echo
echo "=== Docs: any mention of Selenium UI tests requiring Chrome ==="
rg -n -i "selenium|ChromeDriver|chromedriver|webdriver\\.chrome\\.driver" docs src || true

echo
echo "=== Devcontainer scripts: anything running selenium tests ==="
rg -n -i "mvn .*test|mvn .*failsafe|selenium|ChromeDriver|chromedriver" .devcontainer || true

Repository: openo-beta/Open-O

Length of output: 50375


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Search for env vars / binary paths for Chrome or Chromium ==="
rg -n -i "CHROME_BIN|CHROME_HOME|PLAYWRIGHT_BROWSERS_PATH|setBinary\\(|--browser|google-chrome|chromium-browser" \
  .devcontainer src/test/java docs .mcp.json package.json pom.xml || true

echo
echo "=== Search for Playwright cache paths ==="
rg -n -i "ms-playwright|\\.cache/ms-playwright" .devcontainer src/test/java docs .mcp.json package*.json pom.xml || true

echo
echo "=== Check BaseTest.java for any binary/path overrides (setBinary, system properties) ==="
sed -n '1,120p' src/test/java/tests/BaseTest.java

echo
echo "=== Check pom.xml for any webdriver manager / browser config (driver+binary properties) ==="
rg -n -i "selenium-manager|webdriver|chromedriver|chrome\\.options|setBinary|CHROME_BIN|browserBinary|driverExecutable" pom.xml || true

Repository: openo-beta/Open-O

Length of output: 5502


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== .mcp.json ==="
cat .mcp.json

echo
echo "=== .devcontainer/scripts/verify-playwright-path.sh ==="
sed -n '1,140p' .devcontainer/scripts/verify-playwright-path.sh

echo
echo "=== .devcontainer/PLAYWRIGHT_SETUP.md (relevant sections) ==="
sed -n '1,120p' .devcontainer/PLAYWRIGHT_SETUP.md

echo
echo "=== Search for CHROME_BIN / CHROME_HOME usage in devcontainer scripts ==="
rg -n -i "CHROME_BIN|CHROME_HOME|google-chrome|chromium|chrome-linux/chrome" .devcontainer --glob '!**/node_modules/**' || true

echo
echo "=== Search for selenium manager / browser env vars used in scripts ==="
rg -n -i "selenium|CHROME_BIN|webdriver\\.chrome|--headless|chromedriver" .devcontainer --glob '!**/node_modules/**' || true

echo
echo "=== Search for symlinks or copy of playwright browser into /usr/bin ==="
rg -n -i "ln -s|/usr/bin|google-chrome|chromium-browser|chromium\\b" .devcontainer --glob '!**/node_modules/**' || true

echo
echo "=== Check selenium tests resources directory for any driver/binary config ==="
find src/test -maxdepth 3 -type f \( -name "*.properties" -o -name "*.xml" -o -name "*.json" -o -name "*.yml" -o -name "*.yaml" \) 2>/dev/null | head -n 50
rg -n -i "CHROME_BIN|CHROME_HOME|webdriver\\.chrome\\.driver|chromium|chrome-linux/chrome|google-chrome|chromedriver" src/test || true

Repository: openo-beta/Open-O

Length of output: 50377


Fix misleading arm64 message and ensure Selenium ChromeDriver still has a browser binary

  • .devcontainer/development/Dockerfile installs only Playwright chromium + firefox (no chrome), but the Selenium suite creates new ChromeDriver(options) in src/test/java/tests/BaseTest.java without setting CHROME_BIN / ChromeOptions#setBinary, so it won’t be pointed at Playwright’s installed Chromium executable.
  • The echo on Dockerfile line 75 says it’s an “arm64 Chrome on Linux” limitation, but Chrome is removed unconditionally for all builds; update the message or add architecture-specific logic.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.devcontainer/development/Dockerfile around lines 74 - 75, The Dockerfile
currently installs only Playwright's chromium and firefox via the RUN npx
playwright install command but the test code creates new ChromeDriver(options)
in BaseTest.java without setting CHROME_BIN or using ChromeOptions#setBinary, so
Selenium won't find a Chrome binary; either install a Chrome/Chromium browser
binary (or add chrome to the playwright install on non-arm64 architectures) or
update the test setup to point ChromeOptions#setBinary to Playwright's installed
Chromium executable; also fix the echo message to reflect that Chrome is
unconditionally omitted (or make it conditional on architecture) rather than
claiming an “arm64 Chrome on Linux” limitation.

Comment on lines +1086 to +1097
CREATE TABLE IF NOT EXISTS llm_document_text (
id INT AUTO_INCREMENT PRIMARY KEY,
document_no INT(6) NOT NULL,
demographic_no INT NOT NULL,
extracted_text MEDIUMTEXT,
content_hash VARCHAR(64) NOT NULL, -- SHA-256 of the PDF bytes
extracted_at DATETIME NOT NULL,
INDEX idx_document_no (document_no),
INDEX idx_demographic_no (demographic_no),
FOREIGN KEY (document_no) REFERENCES ctl_document(document_no),
FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add required audit columns to the new table.

llm_document_text is missing lastUpdateUser and lastUpdateDate, which are required for MySQL audit trail compliance.

Suggested fix
 CREATE TABLE IF NOT EXISTS llm_document_text (
     id              INT AUTO_INCREMENT PRIMARY KEY,
     document_no     INT(6) NOT NULL,
     demographic_no  INT NOT NULL,
     extracted_text  MEDIUMTEXT,
     content_hash    VARCHAR(64) NOT NULL,  -- SHA-256 of the PDF bytes
     extracted_at    DATETIME NOT NULL,
+    lastUpdateUser  VARCHAR(6),
+    lastUpdateDate  DATETIME NOT NULL,
     INDEX idx_document_no   (document_no),
     INDEX idx_demographic_no (demographic_no),
     FOREIGN KEY (document_no)   REFERENCES ctl_document(document_no),
     FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
 );

As per coding guidelines: “database/mysql/**/*.sql: Always include lastUpdateUser and lastUpdateDate fields in database tables for audit trail compliance”.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
CREATE TABLE IF NOT EXISTS llm_document_text (
id INT AUTO_INCREMENT PRIMARY KEY,
document_no INT(6) NOT NULL,
demographic_no INT NOT NULL,
extracted_text MEDIUMTEXT,
content_hash VARCHAR(64) NOT NULL, -- SHA-256 of the PDF bytes
extracted_at DATETIME NOT NULL,
INDEX idx_document_no (document_no),
INDEX idx_demographic_no (demographic_no),
FOREIGN KEY (document_no) REFERENCES ctl_document(document_no),
FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
);
CREATE TABLE IF NOT EXISTS llm_document_text (
id INT AUTO_INCREMENT PRIMARY KEY,
document_no INT(6) NOT NULL,
demographic_no INT NOT NULL,
extracted_text MEDIUMTEXT,
content_hash VARCHAR(64) NOT NULL, -- SHA-256 of the PDF bytes
extracted_at DATETIME NOT NULL,
lastUpdateUser VARCHAR(6),
lastUpdateDate DATETIME NOT NULL,
INDEX idx_document_no (document_no),
INDEX idx_demographic_no (demographic_no),
FOREIGN KEY (document_no) REFERENCES ctl_document(document_no),
FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@database/mysql/oscarinit.sql` around lines 1086 - 1097, The llm_document_text
table lacks the required audit columns; add lastUpdateUser and lastUpdateDate to
the CREATE TABLE for llm_document_text (and to any ALTER if updating an existing
table): add a lastUpdateUser column (e.g., VARCHAR(64) NOT NULL) and a
lastUpdateDate column (DATETIME NOT NULL with appropriate default/ON UPDATE
CURRENT_TIMESTAMP per project conventions) and ensure indexes/constraints remain
unchanged; update any insert/update code or migrations that populate these
fields if present.

Comment on lines +1093 to +1096
INDEX idx_document_no (document_no),
INDEX idx_demographic_no (demographic_no),
FOREIGN KEY (document_no) REFERENCES ctl_document(document_no),
FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce one cache row per document/patient pair.

Without a unique key on (document_no, demographic_no), duplicate cache rows can be inserted, which makes cache reads nondeterministic and stale-prone.

Suggested fix
     INDEX idx_document_no   (document_no),
     INDEX idx_demographic_no (demographic_no),
+    UNIQUE KEY uk_llm_document_text_doc_demo (document_no, demographic_no),
     FOREIGN KEY (document_no)   REFERENCES ctl_document(document_no),
     FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
INDEX idx_document_no (document_no),
INDEX idx_demographic_no (demographic_no),
FOREIGN KEY (document_no) REFERENCES ctl_document(document_no),
FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
INDEX idx_document_no (document_no),
INDEX idx_demographic_no (demographic_no),
UNIQUE KEY uk_llm_document_text_doc_demo (document_no, demographic_no),
FOREIGN KEY (document_no) REFERENCES ctl_document(document_no),
FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@database/mysql/oscarinit.sql` around lines 1093 - 1096, Add a uniqueness
constraint to prevent duplicate cache rows for the same document/patient pair by
adding a UNIQUE index on (document_no, demographic_no); modify the table
definition (near the existing INDEX idx_document_no and INDEX idx_demographic_no
lines) to include UNIQUE KEY (document_no, demographic_no) or run ALTER TABLE
... ADD CONSTRAINT unique_document_demographic UNIQUE (document_no,
demographic_no) so the cache enforces one row per (document_no, demographic_no).

Comment on lines +2 to +13
CREATE TABLE IF NOT EXISTS llm_document_text (
id INT AUTO_INCREMENT PRIMARY KEY,
document_no INT(6) NOT NULL,
demographic_no INT NOT NULL,
extracted_text MEDIUMTEXT,
content_hash VARCHAR(64) NOT NULL,
extracted_at DATETIME NOT NULL,
INDEX idx_document_no (document_no),
INDEX idx_demographic_no (demographic_no),
FOREIGN KEY (document_no) REFERENCES ctl_document(document_no),
FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add the required audit-trail columns to this new table.

Every new MySQL table in this repo needs lastUpdateUser and lastUpdateDate, and this migration omits both. That breaks the audit-trail contract for persisted patient-data-related records.

As per coding guidelines, database/**/*.sql and database/mysql/**/*.sql require “All database tables MUST include lastUpdateUser and lastUpdateDate columns for audit trails.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@database/mysql/updates/update-2026-05-31.sql` around lines 2 - 13, The new
table llm_document_text is missing the required audit-trail columns; add columns
named lastUpdateUser (VARCHAR appropriate length, NOT NULL) and lastUpdateDate
(DATETIME NOT NULL) to the CREATE TABLE for llm_document_text and ensure they
follow existing repo conventions (types, NOT NULL constraints, and ordering
consistent with other tables) so the migration satisfies the "All database
tables MUST include lastUpdateUser and lastUpdateDate" rule.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce a unique cache key for the document/patient pair.

This schema allows duplicate rows for the same logical cache entry, but the DAO layer reads only the first match by document_no. That makes cache reads nondeterministic once duplicates appear. Add a unique constraint on the intended key, or align the DAO/query model with a one-to-many design.

Suggested schema fix
 CREATE TABLE IF NOT EXISTS llm_document_text (
     id              INT AUTO_INCREMENT PRIMARY KEY,
     document_no     INT(6) NOT NULL,
     demographic_no  INT NOT NULL,
     extracted_text  MEDIUMTEXT,
     content_hash    VARCHAR(64) NOT NULL,
     extracted_at    DATETIME NOT NULL,
     INDEX idx_document_no   (document_no),
     INDEX idx_demographic_no (demographic_no),
+    UNIQUE KEY uq_llm_document_text_doc_demo (document_no, demographic_no),
     FOREIGN KEY (document_no)   REFERENCES ctl_document(document_no),
     FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
 );
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
CREATE TABLE IF NOT EXISTS llm_document_text (
id INT AUTO_INCREMENT PRIMARY KEY,
document_no INT(6) NOT NULL,
demographic_no INT NOT NULL,
extracted_text MEDIUMTEXT,
content_hash VARCHAR(64) NOT NULL,
extracted_at DATETIME NOT NULL,
INDEX idx_document_no (document_no),
INDEX idx_demographic_no (demographic_no),
FOREIGN KEY (document_no) REFERENCES ctl_document(document_no),
FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
);
CREATE TABLE IF NOT EXISTS llm_document_text (
id INT AUTO_INCREMENT PRIMARY KEY,
document_no INT(6) NOT NULL,
demographic_no INT NOT NULL,
extracted_text MEDIUMTEXT,
content_hash VARCHAR(64) NOT NULL,
extracted_at DATETIME NOT NULL,
INDEX idx_document_no (document_no),
INDEX idx_demographic_no (demographic_no),
UNIQUE KEY uq_llm_document_text_doc_demo (document_no, demographic_no),
FOREIGN KEY (document_no) REFERENCES ctl_document(document_no),
FOREIGN KEY (demographic_no) REFERENCES demographic(demographic_no)
);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@database/mysql/updates/update-2026-05-31.sql` around lines 2 - 13, The table
llm_document_text currently permits duplicate cache entries for the same logical
key; add a UNIQUE constraint on the document/patient pair to enforce one cached
row per pair (e.g., unique on (document_no, demographic_no)) by modifying the
CREATE TABLE (or issuing ALTER TABLE) to include a UNIQUE index (e.g.,
uniq_document_demographic) on those columns; alternatively, if intended to allow
multiple rows per document, update the DAO that reads llm_document_text (which
currently queries by document_no and expects a single row) to return and handle
multiple rows instead.

Comment on lines +7 to +34
/**
* Data access for pre-extracted PDF text used in LLM patient exports.
*/
public interface LlmDocumentTextDao {

/**
* Find cached extraction for a specific document.
* Returns empty if not yet extracted.
*/
Optional<LlmDocumentText> findByDocumentNo(Integer documentNo);

/**
* Find all cached extractions for a patient.
* Used to bulk-load all document texts for a demographic in one query
* rather than N individual lookups.
*/
List<LlmDocumentText> findByDemographicNo(Integer demographicNo);

/**
* Persist a new extraction record or update an existing one.
* Callers should use this for both insert and re-extraction on hash mismatch.
*/
LlmDocumentText save(LlmDocumentText entity);

/**
* Remove a cached record — used when the source document is deleted.
*/
void deleteByDocumentNo(Integer documentNo);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Complete the required JavaDoc contract tags.

This new public DAO API still needs the required @since, @param, and @return tags on the interface methods so it meets the repository’s public-JavaDoc standard.

As per coding guidelines, "src/main/java/**/*.java: All public classes and methods MUST have comprehensive JavaDoc documentation" and "Use @since tags in JavaDoc... Document @param tags... Document @return tags...".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/java/ca/openosp/openo/commn/dao/LlmDocumentTextDao.java` around
lines 7 - 34, The interface LlmDocumentTextDao and its methods
(findByDocumentNo, findByDemographicNo, save, deleteByDocumentNo) are missing
required JavaDoc tags; update each method JavaDoc to include a `@since` tag with
the current module version or milestone, a `@param` entry documenting the
parameter (e.g., documentNo or demographicNo) and its meaning, and a `@return` tag
describing the returned value (including when Optional is empty or what the List
contains) where applicable; also add a class-level JavaDoc `@since` for
LlmDocumentTextDao itself so the public API meets the repository JavaDoc
standard.

Comment on lines +145 to +150
Path pdfPath = documentManager.renderDocument(loggedInInfo, docNo.toString());
if (pdfPath == null || !Files.exists(pdfPath)) {
logger.warn("renderDocument returned no path for document {}", docNo);
return null;
}
return Files.readAllBytes(pdfPath);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate the rendered path before reading it.

This code reads the rendered document path with raw Files.exists/readAllBytes calls. For new Java code here, file-path operations are supposed to go through PathValidationUtils so the read is constrained to an approved location before the bytes are loaded.

As per coding guidelines, "src/main/java/**/*.java: Use PathValidationUtils for ALL file path operations to prevent path traversal attacks".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java` around
lines 145 - 150, The code reads a rendered Path directly (using
documentManager.renderDocument and Files.exists/readAllBytes) without validating
it; update LlmDocumentTextService to validate pdfPath via the project's
PathValidationUtils before accessing the file (e.g., call the appropriate
PathValidationUtils method to ensure the path is within allowed locations/not a
traversal attack), and if validation fails log a warning similar to the existing
logger.warn and return null; only after successful validation proceed to
Files.readAllBytes(pdfPath).

Comment on lines +168 to +177
private void persist(Integer docNo, Integer demographicNo,
String text, String hash,
Optional<LlmDocumentText> existing) {
try {
LlmDocumentText record = existing.orElseGet(() ->
new LlmDocumentText(docNo, demographicNo, null, hash));
record.setExtractedText(text);
record.setContentHash(hash);
record.setExtractedAt(new Date());
llmDocumentTextDao.save(record);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don’t persist extracted document text in plaintext.

text here is extracted PDF content and can contain PHI, but it is written directly to the cache table. This needs encryption-at-rest before save, with decryption only at read time, otherwise the new cache becomes a plaintext PHI store.

As per coding guidelines, "Encrypt sensitive PHI data before storage using EncryptionUtils - decrypt only when needed" and "PHI (Patient Health Information) must NEVER be logged or exposed".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java` around
lines 168 - 177, The persist method writes extracted PDF text (PHI) in plaintext
to LlmDocumentText; change it to encrypt the text before saving by calling
EncryptionUtils.encrypt(...) and store the ciphertext via
record.setExtractedText(...), keep contentHash as-is, and ensure any callers
that read LlmDocumentText use EncryptionUtils.decrypt(...) when returning plain
text; also remove/avoid any logging of the plaintext in persist and reference
the persist method and LlmDocumentText record when making this change.

Comment on lines +16 to +30
@Path("/llm")
public class LLMExtractorEndpoint {

@Autowired
private LLMExtractorService llmExtractorService;

@Autowired
private SecurityInfoManager securityInfoManager;

@GET
@Path("/export")
@Produces(MediaType.APPLICATION_JSON)
public Response exportPatient(
@QueryParam("demographicId") Integer demographicId,
@Context HttpServletRequest request) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add the required JavaDoc for this public REST API.

The new public resource class and endpoint method are missing the repository-required JavaDoc contract, including @since, @param, @return, and any relevant @throws tags.

As per coding guidelines, "src/main/java/**/*.java: All public classes and methods MUST have comprehensive JavaDoc documentation" and "Use @since tags in JavaDoc... Document @param tags... Document @return tags... Document all thrown exceptions with @throws tags...".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorEndpoint.java` around
lines 16 - 30, Add comprehensive JavaDoc to the public class
LLMExtractorEndpoint and its public method exportPatient: document the class
purpose and include a `@since` tag; for exportPatient add `@since`, `@param`
demographicId description, `@param` request description, `@return` describing the
Response payload and status codes, and `@throws` tags for any runtime or web
exceptions the method may propagate (e.g., WebApplicationException or
IOException as applicable); reference the injected dependencies
llmExtractorService and securityInfoManager in the class-level doc as their role
in the endpoint behavior.

Comment on lines +32 to +35
LoggedInInfo loggedInInfo = LoggedInInfo.getLoggedInInfoFromRequest(request);

if (!securityInfoManager.hasPrivilege(loggedInInfo, "_demographic", "r", demographicId)) {
return Response.status(Response.Status.FORBIDDEN)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use session-based LoggedInInfo and fail expired sessions before auth.

This endpoint pulls auth context with getLoggedInInfoFromRequest(request) and goes straight into the privilege check. The repo standard here is LoggedInInfo.getLoggedInInfoFromSession(request) plus an isExpired() check that invalidates expired sessions before any access decision.

As per coding guidelines, "Always use LoggedInInfo.getLoggedInInfoFromSession() for session management and validation" and "Validate session expiration using LoggedInInfo.isExpired() and invalidate expired sessions".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorEndpoint.java` around
lines 32 - 35, Replace the call to
LoggedInInfo.getLoggedInInfoFromRequest(request) with
LoggedInInfo.getLoggedInInfoFromSession(request), then immediately check
LoggedInInfo.isExpired() on the returned object; if expired, invalidate the
session (use LoggedInInfo.invalidateSession(request) or the project’s session
invalidation helper) and return an appropriate unauthorized response (e.g.,
Response.Status.UNAUTHORIZED) before calling
securityInfoManager.hasPrivilege(...). Ensure the subsequent privilege check
still uses securityInfoManager.hasPrivilege(loggedInInfo, "_demographic", "r",
demographicId).

Comment on lines +118 to +129
private Map<String, Object> buildContext(Integer demographicId) {
Map<String, Object> ctx = new LinkedHashMap<>();
ctx.put("system", "OSCAR EMR patient export");
ctx.put("export_version", "1.0");
ctx.put("exported_at", isoNow());
ctx.put("demographic_id", demographicId);
ctx.put("instructions",
"This payload contains structured clinical data for a single patient. " +
"Use it to answer physician questions accurately. " +
"Dates are ISO-8601 UTC. Null fields indicate data not recorded. " +
"Do not infer diagnoses. Cite specific records when relevant.");
return ctx;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Rename the exported system label to OpenO/OpenO EMR.

context.system is still emitting "OSCAR EMR patient export", which leaks the legacy product name into the new export contract.

As per coding guidelines, "**/*: Refer to this system as 'OpenO EMR' or 'OpenO' - NOT 'OSCAR EMR' or 'OSCAR McMaster'".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorService.java` around
lines 118 - 129, The system label in buildContext is using the old product name;
update the value put into ctx with key "system" (in method buildContext) to the
new approved label (e.g., "OpenO EMR patient export" or "OpenO") so
context.system no longer emits "OSCAR EMR patient export"; change only that
literal string in the ctx.put("system", ...) call to the approved name.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7 issues found across 9 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/main/java/ca/openosp/openo/commn/dao/LlmDocumentTextDaoImpl.java">

<violation number="1" location="src/main/java/ca/openosp/openo/commn/dao/LlmDocumentTextDaoImpl.java:26">
P2: `findByDocumentNo` selects `getResultList().get(0)` without `ORDER BY`, but no unique constraint exists on `document_no` and the service explicitly allows concurrent insert races. This yields non-deterministic results when duplicates exist and can trigger repeated re-extraction or stale cached text.</violation>
</file>

<file name="src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java">

<violation number="1" location="src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java:82">
P1: In-memory cache stores null extraction results as empty-string sentinel but does not decode them back to null on cache hits, violating the documented API contract that returns null when extraction fails or PDF has no text layer.</violation>

<violation number="2" location="src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java:125">
P1: getPreloadedTexts warms the L1 cache and returns DB-cached text without hash-revalidating against current PDF bytes, bypassing the hash-check safeguard in getExtractedText. Stale extraction results can be silently served to the LLM export if a document is modified without a matching evict() call (evict() is never invoked anywhere in the codebase).</violation>

<violation number="3" location="src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java:150">
P1: Validate the file path through `PathValidationUtils` before reading it with `Files.readAllBytes`. Without path validation, a manipulated document path could lead to path traversal, allowing reads from outside the approved document storage directory.</violation>

<violation number="4" location="src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java:186">
P2: Concurrent eviction logic can throw `NoSuchElementException` at runtime due to a race between `size()` and `nextElement()` on a concurrently mutating `ConcurrentHashMap`.</violation>
</file>

<file name="src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorService.java">

<violation number="1" location="src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorService.java:120">
P2: Rename `"OSCAR EMR patient export"` to `"OpenO EMR patient export"`. The legacy product name should not appear in new code or public-facing exports.</violation>

<violation number="2" location="src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorService.java:274">
P2: Synchronous per-PDF text extraction inside the REST request loop can cause severe latency and timeouts for charts with many uncached PDFs</violation>
</file>

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

for (LlmDocumentText row : rows) {
result.put(row.getDocumentNo(), row.getExtractedText());
// Warm up L1 while we're here
putInMemoryCache(row.getDocumentNo(), row.getExtractedText());

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: getPreloadedTexts warms the L1 cache and returns DB-cached text without hash-revalidating against current PDF bytes, bypassing the hash-check safeguard in getExtractedText. Stale extraction results can be silently served to the LLM export if a document is modified without a matching evict() call (evict() is never invoked anywhere in the codebase).

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java, line 125:

<comment>getPreloadedTexts warms the L1 cache and returns DB-cached text without hash-revalidating against current PDF bytes, bypassing the hash-check safeguard in getExtractedText. Stale extraction results can be silently served to the LLM export if a document is modified without a matching evict() call (evict() is never invoked anywhere in the codebase).</comment>

<file context>
@@ -0,0 +1,208 @@
+        for (LlmDocumentText row : rows) {
+            result.put(row.getDocumentNo(), row.getExtractedText());
+            // Warm up L1 while we're here
+            putInMemoryCache(row.getDocumentNo(), row.getExtractedText());
+        }
+        return result;
</file context>

Comment on lines +82 to +84
if (memoryCache.containsKey(docNo)) {
return memoryCache.get(docNo);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: In-memory cache stores null extraction results as empty-string sentinel but does not decode them back to null on cache hits, violating the documented API contract that returns null when extraction fails or PDF has no text layer.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java, line 82:

<comment>In-memory cache stores null extraction results as empty-string sentinel but does not decode them back to null on cache hits, violating the documented API contract that returns null when extraction fails or PDF has no text layer.</comment>

<file context>
@@ -0,0 +1,208 @@
+        Integer docNo = document.getDocumentNo();
+
+        // L1: in-memory hit
+        if (memoryCache.containsKey(docNo)) {
+            return memoryCache.get(docNo);
+        }
</file context>
Suggested change
if (memoryCache.containsKey(docNo)) {
return memoryCache.get(docNo);
}
if (memoryCache.containsKey(docNo)) {
String cached = memoryCache.get(docNo);
return cached.isEmpty() ? null : cached;
}

logger.warn("renderDocument returned no path for document {}", docNo);
return null;
}
return Files.readAllBytes(pdfPath);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Validate the file path through PathValidationUtils before reading it with Files.readAllBytes. Without path validation, a manipulated document path could lead to path traversal, allowing reads from outside the approved document storage directory.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java, line 150:

<comment>Validate the file path through `PathValidationUtils` before reading it with `Files.readAllBytes`. Without path validation, a manipulated document path could lead to path traversal, allowing reads from outside the approved document storage directory.</comment>

<file context>
@@ -0,0 +1,208 @@
+                logger.warn("renderDocument returned no path for document {}", docNo);
+                return null;
+            }
+            return Files.readAllBytes(pdfPath);
+        } catch (PDFGenerationException | IOException e) {
+            logger.warn("Could not load PDF bytes for document {}", docNo, e);
</file context>

"SELECT l FROM LlmDocumentText l WHERE l.documentNo = :docNo",
LlmDocumentText.class);
q.setParameter("docNo", documentNo);
List<LlmDocumentText> results = q.getResultList();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: findByDocumentNo selects getResultList().get(0) without ORDER BY, but no unique constraint exists on document_no and the service explicitly allows concurrent insert races. This yields non-deterministic results when duplicates exist and can trigger repeated re-extraction or stale cached text.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/java/ca/openosp/openo/commn/dao/LlmDocumentTextDaoImpl.java, line 26:

<comment>`findByDocumentNo` selects `getResultList().get(0)` without `ORDER BY`, but no unique constraint exists on `document_no` and the service explicitly allows concurrent insert races. This yields non-deterministic results when duplicates exist and can trigger repeated re-extraction or stale cached text.</comment>

<file context>
@@ -0,0 +1,55 @@
+            "SELECT l FROM LlmDocumentText l WHERE l.documentNo = :docNo",
+            LlmDocumentText.class);
+        q.setParameter("docNo", documentNo);
+        List<LlmDocumentText> results = q.getResultList();
+        return results.isEmpty() ? Optional.empty() : Optional.of(results.get(0));
+    }
</file context>

}

private void putInMemoryCache(Integer docNo, String text) {
if (memoryCache.size() >= MAX_CACHE_SIZE) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Concurrent eviction logic can throw NoSuchElementException at runtime due to a race between size() and nextElement() on a concurrently mutating ConcurrentHashMap.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/java/ca/openosp/openo/managers/LlmDocumentTextService.java, line 186:

<comment>Concurrent eviction logic can throw `NoSuchElementException` at runtime due to a race between `size()` and `nextElement()` on a concurrently mutating `ConcurrentHashMap`.</comment>

<file context>
@@ -0,0 +1,208 @@
+    }
+
+    private void putInMemoryCache(Integer docNo, String text) {
+        if (memoryCache.size() >= MAX_CACHE_SIZE) {
+            // Simple eviction: remove the first key we find
+            // For production, consider Caffeine/Guava with LRU + TTL
</file context>

m.put("extracted_text", preloaded.get(docNo));
} else {
// Not yet cached — extract now, service will persist it
String text = llmDocumentTextService.getExtractedText(loggedInInfo, d, demographicNo);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Synchronous per-PDF text extraction inside the REST request loop can cause severe latency and timeouts for charts with many uncached PDFs

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorService.java, line 274:

<comment>Synchronous per-PDF text extraction inside the REST request loop can cause severe latency and timeouts for charts with many uncached PDFs</comment>

<file context>
@@ -16,388 +15,491 @@
+                    m.put("extracted_text", preloaded.get(docNo));
+                } else {
+                    // Not yet cached — extract now, service will persist it
+                    String text = llmDocumentTextService.getExtractedText(loggedInInfo, d, demographicNo);
+                    m.put("extracted_text", text);
+                }
</file context>


private Map<String, Object> buildContext(Integer demographicId) {
Map<String, Object> ctx = new LinkedHashMap<>();
ctx.put("system", "OSCAR EMR patient export");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Rename "OSCAR EMR patient export" to "OpenO EMR patient export". The legacy product name should not appear in new code or public-facing exports.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/java/ca/openosp/openo/webserv/rest/LLMExtractorService.java, line 120:

<comment>Rename `"OSCAR EMR patient export"` to `"OpenO EMR patient export"`. The legacy product name should not appear in new code or public-facing exports.</comment>

<file context>
@@ -16,388 +15,491 @@
+
+    private Map<String, Object> buildContext(Integer demographicId) {
+        Map<String, Object> ctx = new LinkedHashMap<>();
+        ctx.put("system",          "OSCAR EMR patient export");
+        ctx.put("export_version",  "1.0");
+        ctx.put("exported_at",     isoNow());
</file context>
Suggested change
ctx.put("system", "OSCAR EMR patient export");
ctx.put("system", "OpenO EMR patient export");

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/main/webapp/casemgmt/ChartNotes.jsp`:
- Line 712: Replace the hardcoded
"http://localhost:3000/patient/<%=demographicNo%>" in the button's onclick with
an application-relative call to the LLM extractor endpoint; update the onclick
on the button in ChartNotes.jsp so it opens the path
"/llm/export?demographicId=<%=demographicNo%>" (or builds the URL from a
configured base if your app provides one) instead of embedding localhost,
ensuring it uses the demographicId query parameter expected by
LLMExtractorEndpoint.
- Line 712: In ChartNotes.jsp replace the direct injection of demographicNo in
the inline onclick string (the button with
onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>',
'_blank');") with an OWASP Encoder call: encode the demographicNo for URI
context and then for safe embedding in JavaScript (e.g., wrap demographicNo with
Encode.forUriComponent(...) and then Encode.forJavaScript(...) when rendering
into the onclick string) so the value is safely escaped before calling
window.open.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bead45aa-078b-4ee1-abfc-1868d978ae6d

📥 Commits

Reviewing files that changed from the base of the PR and between 2ce37d2 and 9ef1f9a.

📒 Files selected for processing (3)
  • dependencies-lock-modern.json
  • dependencies-lock.json
  • src/main/webapp/casemgmt/ChartNotes.jsp

<fmt:setBundle basename="oscarResources"/><fmt:message key="oscarEncounter.Index.BrowseNotes"/></button>

<!-- AI LINK -->
<button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Replace hardcoded localhost URL with app-relative endpoint contract.

Line 712 hardcodes http://localhost:3000/patient/..., which will break outside local dev and does not follow the backend contract added in this PR stack (/llm/export?demographicId=...). Use a context-relative URL (or a configured base URL) and the demographicId query parameter expected by LLMExtractorEndpoint.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/webapp/casemgmt/ChartNotes.jsp` at line 712, Replace the hardcoded
"http://localhost:3000/patient/<%=demographicNo%>" in the button's onclick with
an application-relative call to the LLM extractor endpoint; update the onclick
on the button in ChartNotes.jsp so it opens the path
"/llm/export?demographicId=<%=demographicNo%>" (or builds the URL from a
configured base if your app provides one) instead of embedding localhost,
ensuring it uses the demographicId query parameter expected by
LLMExtractorEndpoint.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Encode demographicNo before rendering into inline JavaScript.

Line 712 injects request-derived demographicNo directly into an onclick JavaScript string. This is an XSS sink in JSP output and should be encoded with OWASP Encoder for URI/JS context before rendering.

Suggested fix
+<%@page import="org.owasp.encoder.Encode" %>
...
-                <button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">
+                <button type="button"
+                        onclick="window.open('<%=request.getContextPath()%>/llm/export?demographicId=<%=Encode.forUriComponent(demographicNo)%>', '_blank');">
                     AI Summary
                 </button>

As per coding guidelines: “Use Encode.forHtml(), Encode.forJavaScript(), Encode.forHtmlAttribute(), Encode.forJavaScriptBlock(), Encode.forUriComponent(), and Encode.forCssUrl() from OWASP Encoder for ALL user inputs in web output.”

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">
<button type="button"
onclick="window.open('<%=request.getContextPath()%>/llm/export?demographicId=<%=Encode.forUriComponent(demographicNo)%>', '_blank');">
AI Summary
</button>
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/main/webapp/casemgmt/ChartNotes.jsp` at line 712, In ChartNotes.jsp
replace the direct injection of demographicNo in the inline onclick string (the
button with
onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>',
'_blank');") with an OWASP Encoder call: encode the demographicNo for URI
context and then for safe embedding in JavaScript (e.g., wrap demographicNo with
Encode.forUriComponent(...) and then Encode.forJavaScript(...) when rendering
into the onclick string) so the value is safely escaped before calling
window.open.

Source: Coding guidelines

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/main/webapp/casemgmt/ChartNotes.jsp">

<violation number="1" location="src/main/webapp/casemgmt/ChartNotes.jsp:712">
P1: window.open to external origin without noopener/noreferrer enables opener-based tabnabbing</violation>

<violation number="2" location="src/main/webapp/casemgmt/ChartNotes.jsp:712">
P1: Patient identifier sent to external URL over unencrypted HTTP, risking data exposure</violation>

<violation number="3" location="src/main/webapp/casemgmt/ChartNotes.jsp:712">
P1: Hardcoded localhost:3000 AI endpoint in production JSP makes the feature unusable outside local development.</violation>
</file>

Tip: Review your code locally with the cubic CLI to iterate faster.

Re-trigger cubic

<fmt:setBundle basename="oscarResources"/><fmt:message key="oscarEncounter.Index.BrowseNotes"/></button>

<!-- AI LINK -->
<button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: window.open to external origin without noopener/noreferrer enables opener-based tabnabbing

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/webapp/casemgmt/ChartNotes.jsp, line 712:

<comment>window.open to external origin without noopener/noreferrer enables opener-based tabnabbing</comment>

<file context>
@@ -707,6 +707,11 @@
                     <fmt:setBundle basename="oscarResources"/><fmt:message key="oscarEncounter.Index.BrowseNotes"/></button>
+                
+                <!-- AI LINK -->
+                <button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">
+                    AI Summary
+                </button>
</file context>
Suggested change
<button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">
<button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank', 'noopener,noreferrer');">

<fmt:setBundle basename="oscarResources"/><fmt:message key="oscarEncounter.Index.BrowseNotes"/></button>

<!-- AI LINK -->
<button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Patient identifier sent to external URL over unencrypted HTTP, risking data exposure

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/webapp/casemgmt/ChartNotes.jsp, line 712:

<comment>Patient identifier sent to external URL over unencrypted HTTP, risking data exposure</comment>

<file context>
@@ -707,6 +707,11 @@
                     <fmt:setBundle basename="oscarResources"/><fmt:message key="oscarEncounter.Index.BrowseNotes"/></button>
+                
+                <!-- AI LINK -->
+                <button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">
+                    AI Summary
+                </button>
</file context>

Comment on lines +712 to +714
<button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">
AI Summary
</button>

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Hardcoded localhost:3000 AI endpoint in production JSP makes the feature unusable outside local development.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/main/webapp/casemgmt/ChartNotes.jsp, line 712:

<comment>Hardcoded localhost:3000 AI endpoint in production JSP makes the feature unusable outside local development.</comment>

<file context>
@@ -707,6 +707,11 @@
                     <fmt:setBundle basename="oscarResources"/><fmt:message key="oscarEncounter.Index.BrowseNotes"/></button>
+                
+                <!-- AI LINK -->
+                <button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">
+                    AI Summary
+                </button>
</file context>
Suggested change
<button type="button" onclick="window.open('http://localhost:3000/patient/<%=demographicNo%>', '_blank');">
AI Summary
</button>
<% String aiServiceUrl = OscarProperties.getInstance().getProperty("ai.service.url", ""); %>
<% if (!aiServiceUrl.isEmpty()) { %>
<button type="button" onclick="window.open('<%=aiServiceUrl%>/patient/<%=demographicNo%>', '_blank');">
AI Summary
</button>
<% } %>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant