feat: inject browser cookies from env to skip OTP/2FA challenges#855
feat: inject browser cookies from env to skip OTP/2FA challenges#855shaiu wants to merge 1 commit intodaniel-hauser:mainfrom
Conversation
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 12 minutes and 24 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (6)
📝 WalkthroughWalkthroughThis PR adds support for cookie-based authentication in bank scraping by introducing an Changes
Sequence Diagram(s)sequenceDiagram
actor User as User (CLI)
participant ExportScript as export-cookies Script
participant Browser as Puppeteer Browser
participant Page as Browser Page
participant Stdout as Stdout
User->>ExportScript: Run with --company and --url
ExportScript->>ExportScript: Validate CLI args
ExportScript->>Browser: Launch non-headless browser
Browser-->>ExportScript: Browser instance
ExportScript->>Page: Create new page
ExportScript->>Page: Navigate to login URL
Page->>Page: Load bank login page
User->>User: Complete OTP/2FA in browser
User->>ExportScript: Press Enter (confirm auth)
ExportScript->>Page: getCookies()
Page-->>ExportScript: Array of cookies
ExportScript->>ExportScript: Format as JSON keyed by companyId
ExportScript->>Stdout: Print JSON
ExportScript->>Browser: Close browser
sequenceDiagram
participant Env as Environment Variable
participant BrowserModule as browser.ts
participant Parser as Cookie Parser
participant BrowserContext as BrowserContext
participant Page as Page
Env->>BrowserModule: MONEYMAN_BROWSER_COOKIES (JSON)
BrowserModule->>BrowserModule: createSecureBrowserContext(companyId)
BrowserModule->>BrowserModule: injectCookiesFromEnv(companyId)
BrowserModule->>Parser: Parse MONEYMAN_BROWSER_COOKIES
Parser->>BrowserModule: Cookies for this companyId
BrowserModule->>BrowserContext: Create new page
BrowserContext-->>Page: Page instance
BrowserModule->>Page: setCookie(cookies)
Page->>Page: Inject cookies into context
BrowserModule->>Page: Close temporary page
BrowserModule->>BrowserContext: Return ready context (cookies persisted)
BrowserContext->>BrowserContext: Navigate to scraper URL (cookies sent)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
b889fef to
df9a3a3
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
src/scraper/browser.ts (1)
79-81: Set cookies on the context directly.
Page.setCookie()is deprecated in Puppeteer, whileBrowserContext.setCookie()writes directly to the context. Switching APIs here also removes the temporary page, so a failed cookie write does not leave behind an extra blank page or trigger an unnecessarytargetcreatedcycle. (pptr.dev)♻️ Proposed refactor
- const page = await context.newPage(); - await page.setCookie(...cookies); - await page.close(); + await context.setCookie(...cookies);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/scraper/browser.ts` around lines 79 - 81, Replace the temporary page-based cookie write with a context-level cookie write: remove the context.newPage() / page.setCookie(...) / page.close() sequence and call context.setCookie(...cookies) instead, ensuring the cookies variable matches Puppeteer's BrowserContext.setCookie format; update any logic that depended on the temporary Page creation (e.g., no targetcreated side effects).package.json (1)
23-23: Pintsxlocally forexport-cookies.
export-cookiesnow depends ontsx, buttsxis not declared in the provided manifest. Please add it todevDependenciesand call the local binary here instead of resolving the runner dynamically at execution time.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@package.json` at line 23, The package.json script "export-cookies" currently invokes npx tsx which relies on a globally resolved runner; add "tsx" to devDependencies in package.json (pin to a specific version) and change the script to call the local binary (e.g., use "tsx src/scripts/export-cookies.ts" so npm/yarn will resolve ./node_modules/.bin/tsx). Update devDependencies to include "tsx": "<version>" and modify the "export-cookies" script to use the local tsx binary rather than "npx tsx".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/scripts/export-cookies.ts`:
- Around line 53-54: Replace the deprecated call to page.cookies() with
page.browserContext().cookies() to ensure cookies from the entire browser
context (all domains) are captured during the OTP flow; locate the usage where
cookies are awaited and logged (the variable cookies and the
console.log(JSON.stringify({ [company]: cookies })) statement) and change the
retrieval to await page.browserContext().cookies() so the rest of the code
(assignment to cookies and the JSON log keyed by company) continues to work
unchanged.
- Around line 34-39: The launch call is passing browserArgs (which currently
includes "--no-sandbox") to puppeteer.launch and the script still uses the
deprecated page.cookies(); remove or filter out "--no-sandbox" from browserArgs
before calling puppeteer.launch (so sandbox stays enabled) and keep passing
executablePath/ignoreDefaultArgs as before; replace usages of page.cookies()
with await page.context().cookies([url]) (use the page's BrowserContext and pass
the page URL(s) to BrowserContext.cookies) so cookie retrieval uses the modern
API.
---
Nitpick comments:
In `@package.json`:
- Line 23: The package.json script "export-cookies" currently invokes npx tsx
which relies on a globally resolved runner; add "tsx" to devDependencies in
package.json (pin to a specific version) and change the script to call the local
binary (e.g., use "tsx src/scripts/export-cookies.ts" so npm/yarn will resolve
./node_modules/.bin/tsx). Update devDependencies to include "tsx": "<version>"
and modify the "export-cookies" script to use the local tsx binary rather than
"npx tsx".
In `@src/scraper/browser.ts`:
- Around line 79-81: Replace the temporary page-based cookie write with a
context-level cookie write: remove the context.newPage() / page.setCookie(...) /
page.close() sequence and call context.setCookie(...cookies) instead, ensuring
the cookies variable matches Puppeteer's BrowserContext.setCookie format; update
any logic that depended on the temporary Page creation (e.g., no targetcreated
side effects).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3642ded5-f39c-44db-b92f-b63f33f7dd66
📒 Files selected for processing (7)
.github/workflows/scrape.ymldocs/cookie-injection.mdpackage.jsonsrc/config.schema.tssrc/scraper/browser.test.tssrc/scraper/browser.tssrc/scripts/export-cookies.ts
| const browser = await puppeteer.launch({ | ||
| headless: false, | ||
| args: browserArgs, | ||
| executablePath: browserExecutablePath, | ||
| ignoreDefaultArgs: ["--enable-automation"], | ||
| }); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -type f -name "*.ts" | grep -E "(export-cookies|browser)" | head -20Repository: daniel-hauser/moneyman
Length of output: 152
🏁 Script executed:
cat -n ./src/scripts/export-cookies.tsRepository: daniel-hauser/moneyman
Length of output: 2144
🏁 Script executed:
cat -n ./src/scraper/browser.tsRepository: daniel-hauser/moneyman
Length of output: 5294
🌐 Web query:
Puppeteer page.cookies() deprecated scope current page
💡 Result:
page.cookies() is deprecated/obsolete in modern Puppeteer; the docs say to use Browser.cookies() or BrowserContext.cookies() instead. These newer APIs return all cookies in the (default) browser context, not “cookies scoped to the current page URL” like page.cookies() did. [1][2][3]
To get “cookies for the current page”, do:
- read all cookies from the page’s context
- filter them by the page URL (domain + path)
import type { Cookie } from 'puppeteer';
function cookiesForUrl(all: Cookie[], urlString: string) {
const url = new URL(urlString);
const host = url.hostname;
const path = url.pathname || '/';
return all.filter(c => {
// domain match (handles ".example.com" and "example.com")
const cd = c.domain?.startsWith('.') ? c.domain.slice(1) : c.domain;
const domainOk = cd === host || host.endsWith(`.${cd}`);
// path match (cookie path is a prefix)
const cookiePath = c.path || '/';
const pathOk = path.startsWith(cookiePath);
return domainOk && pathOk;
});
}
// usage
const url = page.url();
const all = await page.browserContext().cookies(); // or: await browser.cookies()
const currentPageCookies = cookiesForUrl(all, url);Sources: Puppeteer docs for Page.cookies() deprecation and replacements [1], and that BrowserContext.cookies() returns all cookies in the context [2] (plus the cookies guide using browser.cookies() for retrieval) [3].
Remove --no-sandbox from launch args and modernize the deprecated page.cookies() call.
This script reuses browserArgs from the scraper module, which includes --no-sandbox—a container workaround that disables Chromium's sandbox. For a local manual banking session, the sandbox should remain enabled. Puppeteer explicitly discourages --no-sandbox except as a last resort.
Additionally, page.cookies() is deprecated in modern Puppeteer. Use BrowserContext.cookies() with URL-based filtering instead.
Suggested adjustments
const browser = await puppeteer.launch({
headless: false,
- args: browserArgs,
+ args: browserArgs.filter((arg) => arg !== "--no-sandbox"),
executablePath: browserExecutablePath,
ignoreDefaultArgs: ["--enable-automation"],
});Then replace line 53:
- const cookies = await page.cookies();
+ const context = page.browserContext();
+ const cookies = await context.cookies();📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const browser = await puppeteer.launch({ | |
| headless: false, | |
| args: browserArgs, | |
| executablePath: browserExecutablePath, | |
| ignoreDefaultArgs: ["--enable-automation"], | |
| }); | |
| const browser = await puppeteer.launch({ | |
| headless: false, | |
| args: browserArgs.filter((arg) => arg !== "--no-sandbox"), | |
| executablePath: browserExecutablePath, | |
| ignoreDefaultArgs: ["--enable-automation"], | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/scripts/export-cookies.ts` around lines 34 - 39, The launch call is
passing browserArgs (which currently includes "--no-sandbox") to
puppeteer.launch and the script still uses the deprecated page.cookies(); remove
or filter out "--no-sandbox" from browserArgs before calling puppeteer.launch
(so sandbox stays enabled) and keep passing executablePath/ignoreDefaultArgs as
before; replace usages of page.cookies() with await
page.context().cookies([url]) (use the page's BrowserContext and pass the page
URL(s) to BrowserContext.cookies) so cookie retrieval uses the modern API.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
df9a3a3 to
aa7eef1
Compare
Summary
MONEYMAN_BROWSER_COOKIESenv var support to inject pre-authenticated cookies into browser contexts, allowing scrapers to skip repeated OTP/2FA challenges{"hapoalim": [...], "visaCal": [...]}) or provided as a flat arrayexport-cookieshelper script for extracting cookies from manual browser sessionsbrowserProfilePathconfig option for persistent browser profilesMotivation
Many Israeli bank providers require OTP/2FA on each login from an unrecognized browser. For automated scheduled scraping (e.g., via GitHub Actions), this creates friction since each run uses a fresh browser. By exporting cookies from a manual authenticated session and injecting them via an environment variable, the scraper can reuse the session and skip OTP challenges.
Changes
src/scraper/browser.ts— NewinjectCookiesFromEnv()function that reads cookies fromMONEYMAN_BROWSER_COOKIES, parses the JSON (keyed or flat format), and injects matching cookies via a temporary page. Also addsbrowserProfilePathsupport for persistent profiles.src/config.schema.ts— New optionalbrowserProfilePathfield inScrapingOptionsSchemasrc/scripts/export-cookies.ts— Helper script that opens a visible browser, lets you log in manually, then exports cookies as JSON.github/workflows/scrape.yml— PassesMONEYMAN_BROWSER_COOKIESsecret to the Docker containerdocs/cookie-injection.md— User-facing documentationTest plan
npm run lintpassesnpm run buildpassesnpm run testpassesSummary by CodeRabbit
Release Notes
New Features
export-cookiescommand for capturing authenticated browser cookies.Documentation