feat: Timezone-aware queries (store in UTC, convert from the environment TZ on read) #168

Merged
endo-ly merged 4 commits into main from feat/timezone-aware-query
May 17, 2026

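endo-ly commented May 17, 2026

Overview

Date-range queries now take the environment timezone into account.
Storage stays in UTC, unchanged. At query time, dates are interpreted in the
timezone given by the TIMEZONE environment variable and converted to UTC before being compared against the Parquet data.

Problem solved

A Spotify play at 01:30 on 5/17 Japan time is stored as played_at_utc = 2024-05-16T16:30:00Z,
so WHERE played_at_utc::DATE = '2024-05-17' did not match it. This PR fixes that.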
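
The mismatch can be reproduced with the standard library alone. The sketch below is illustrative and not code from this PR: it converts the example play time to UTC, then compares the old UTC-calendar-date check with a half-open UTC range derived from the local day.

```python
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo

JST = ZoneInfo("Asia/Tokyo")

# The play from the example: 01:30 on 17 May, Japan time.
played_local = datetime(2024, 5, 17, 1, 30, tzinfo=JST)
played_at_utc = played_local.astimezone(timezone.utc)

# Old behaviour: compare the UTC calendar date with the requested local date.
requested = date(2024, 5, 17)
matches_by_utc_date = played_at_utc.date() == requested

# New behaviour: turn the local day into a half-open UTC range and compare instants.
utc_start = datetime(2024, 5, 17, tzinfo=JST).astimezone(timezone.utc)
utc_end = datetime(2024, 5, 18, tzinfo=JST).astimezone(timezone.utc)
matches_by_utc_range = utc_start <= played_at_utc < utc_end

print(played_at_utc.isoformat())
print(matches_by_utc_date, matches_by_utc_range)
```

The date comparison misses the play because its UTC calendar date is still 16 May, while the range comparison matches it.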

Changes

Configuration

  • .env.example: added TIMEZONE=Asia/Tokyo
  • BackendConfig: added a timezone: ZoneInfo field (default UTC)
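
As a rough illustration of how such a setting could be resolved, the sketch below reads TIMEZONE from an environment mapping and falls back to UTC. The helper name `load_timezone` is hypothetical; how BackendConfig actually loads the value is not shown in this PR.

```python
import os
from zoneinfo import ZoneInfo


def load_timezone(env=None):
    """Hypothetical helper: resolve TIMEZONE to a ZoneInfo, defaulting to UTC."""
    env = os.environ if env is None else env
    # An unknown key such as "Mars/Olympus" raises ZoneInfoNotFoundError here,
    # so a typo fails at startup instead of silently falling back to UTC.
    return ZoneInfo(env.get("TIMEZONE", "UTC"))


print(load_timezone({"TIMEZONE": "Asia/Tokyo"}))
print(load_timezone({}))
```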

Core logic

  • validators.to_utc_range(): interprets the input date in the environment TZ and converts it to naive UTC datetimes
    • Example: date(2026, 5, 17) + Asia/Tokyo → datetime(2026, 5, 16, 15, 0) to datetime(2026, 5, 17, 15, 0)
  • QueryParams: added utc_start/utc_end
  • SQL WHERE clause: _utc::DATE BETWEEN ? AND ? → _utc >= ? AND _utc < ?
  • strftime: uses AT TIME ZONE to build period buckets in the environment TZ
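
The conversion described above can be sketched as follows. This is a minimal reconstruction from the description, not the PR's implementation; it assumes `end_date` is inclusive (as it was with BETWEEN) and that the returned datetimes are naive UTC.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def to_utc_range(start_date, end_date, tz):
    """Sketch: map an inclusive local date range to a half-open naive UTC range."""
    # Local midnight at the start of the first day.
    local_start = datetime.combine(start_date, time.min, tzinfo=tz)
    # Local midnight after the last day, so end_date itself is fully covered.
    local_end = datetime.combine(end_date + timedelta(days=1), time.min, tzinfo=tz)
    # Convert to UTC, then drop tzinfo to compare against naive *_utc columns.
    utc_start = local_start.astimezone(timezone.utc).replace(tzinfo=None)
    utc_end = local_end.astimezone(timezone.utc).replace(tzinfo=None)
    return utc_start, utc_end


utc_start, utc_end = to_utc_range(date(2026, 5, 17), date(2026, 5, 17), ZoneInfo("Asia/Tokyo"))
print(utc_start, utc_end)

# In-memory equivalent of `_utc >= ? AND _utc < ?`.
rows = [datetime(2026, 5, 16, 14, 59), datetime(2026, 5, 16, 15, 0), datetime(2026, 5, 17, 15, 0)]
selected = [ts for ts in rows if utc_start <= ts < utc_end]
print(selected)
```

With a half-open interval, a row stamped exactly at `utc_end` belongs to the next local day, so adjacent days never double-count a boundary row.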

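Scope of impact

  • Added a tz parameter to the four Repositories (Spotify/GitHub/BrowserHistory/YouTube)
  • TZ is injected in all API endpoints, the MCP Server, and the Tool Factory
  • All 248 tests pass

What does not change

  • The Parquet schema and storage logic (still UTC)
  • The collection logic on the Pipelines side (still UTC)
  • The _utc column names (part of the storage format specification)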

Summary by CodeRabbit

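  • New features
    • Introduces the TIMEZONE environment variable; all data queries (Spotify/YouTube/GitHub/browser history, etc.) now interpret dates in the specified timezone and match them against UTC ranges.
  • Documentation
    • Documents the timezone-support approach and work steps.
  • Tests
    • Adds and updates unit/integration tests that verify timezone conversion.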

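Date-range queries now take the environment timezone into account.
Storage stays in UTC, unchanged. At query time, dates are interpreted in the
timezone given by the TIMEZONE environment variable and converted to UTC
before being compared against the Parquet data. As a result, data at JST 23:59
is now correctly treated as belonging to "that day".

Changes:
- Add TIMEZONE=Asia/Tokyo to .env.example
- Add a timezone field (ZoneInfo) to BackendConfig
- Convert dates to naive UTC datetimes in validators.to_utc_range()
- Add utc_start/utc_end to all QueryParams
- Change SQL WHERE clauses from ::DATE BETWEEN to >= / <
- Use AT TIME ZONE with strftime to build period buckets in the environment TZ
- Add a tz parameter to all Repositories
- Update all tests for utc_start/utc_end
- Add unit tests for to_utc_range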

coderabbitai Bot commented May 17, 2026

Walkthrough

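Introduces a change that converts a date range interpreted in a timezone into a half-open UTC interval (utc_start, utc_end) via to_utc_range, and propagates it consistently through QueryParams, partition generation, SQL binding, repository initialization, the API, and the tests.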

Changes

Timezone-aware query support

Layer / File(s) Summary
Configuration and UTC range conversion
egograph/backend/config.py, egograph/backend/validators.py, egograph/pipelines/.env.example
Adds timezone: ZoneInfo to BackendConfig (default UTC). Adds to_utc_range(start_date, end_date, tz), which interprets dates in the given TZ and returns a half-open interval of naive UTC datetimes. Documents TIMEZONE in .env.example.
QueryParams and partition helpers
egograph/backend/infrastructure/database/queries.py, .../github_queries.py, .../youtube_queries.py, .../parquet_paths.py
Adds utc_start: datetime, utc_end: datetime, tz_name: str = "UTC" to each QueryParams. Switches monthly partition generation and build_partition_paths/_iter_months to UTC datetime ranges.
Database SQL changes (Spotify / Listening)
egograph/backend/infrastructure/database/queries.py
Changes the WHERE clause of get_top_tracks to played_at_utc >= ? AND played_at_utc < ?. Changes period-label generation in get_listening_stats to go through AT TIME ZONE ... so that tz_name is reflected.
Database SQL changes (GitHub)
egograph/backend/infrastructure/database/github_queries.py
Changes the PR/Commit period filters from ::DATE BETWEEN to >= utc_start AND < utc_end. Changes get_activity_stats/get_repo_summary_stats to generate period keys using AT TIME ZONE params.tz_name. Updates the partition-generation arguments to datetime.
Database SQL changes (YouTube / Watch events)
egograph/backend/infrastructure/database/youtube_queries.py
Changes the filter condition of filtered_watch_events to watched_at_utc >= ? AND watched_at_utc < ?. Introduces AT TIME ZONE params.tz_name into period generation in get_watching_stats.
Database SQL changes (Browser history)
egograph/backend/infrastructure/database/browser_history_queries.py
Changes the WHERE clause of get_page_views / get_top_domains to started_at_utc >= ? AND started_at_utc < ?. Partition resolution now uses utc_start/utc_end.
Repositories: tz injection & param builders
egograph/backend/infrastructure/repositories/*_repository.py
In each Repository (Spotify/GitHub/YouTube/BrowserHistory, etc.), `tz: ZoneInfo
API endpoints wiring
egograph/backend/api/*.py
The Spotify/GitHub/BrowserHistory endpoints now call to_utc_range on the validator results and pass utc_start/utc_end/tz_name to QueryParams. The YouTube endpoint passes tz=config.timezone when creating the repository.
Dependency injection & MCP wiring
egograph/backend/usecases/tools/factory.py, egograph/backend/mcp_server.py
build_tool_registry accepts a tz argument and passes tz=effective_tz to each repository it creates. The MCP server creation code passes config.timezone.
Tests — validators / unit / integration / e2e
egograph/backend/tests/unit/test_validators.py, .../test_queries.py, .../test_youtube_queries.py, .../test_browser_history_queries.py, egograph/backend/tests/integration/test_compacted_parquet_reads.py, pipelines tests
Adds unit tests for to_utc_range. Unifies QueryParams creation behind the factories _qp/_yqp/_bqp. Updates the partition tests, integration tests, and E2E tests for utc_start/utc_end/tz_name, and adds a JST boundary test.
Documentation
docs/99.archive/plans/timezone-aware-query.md
Adds a planning document summarizing the implementation approach, target files, and steps.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

A celebratory poem from "CodeRabbit":
Storage hops along in UTC 🐇
Dates are gently read in the TZ 🌏
Cut out BETWEEN, compare half-open 🔪
The tests pass too, all smiles ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
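Title check ✅ Passed The PR title, "Timezone-aware queries (store in UTC, convert from the environment TZ on read)", clearly describes the main purpose of the change set. The core points, environment-timezone support and UTC storage, come across clearly.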
Docstring Coverage ✅ Passed Docstring coverage is 89.81% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (5)
egograph/backend/tests/unit/repositories/test_youtube_queries.py (1)

29-47: ⚡ Quick win

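Align _yqp's tz_name with the reference TZ used for the UTC range calculation

Currently, even when tz_name is overridden, utc_start/utc_end are always computed relative to UTC. This undermines the reliability of the TZ-aware tests, so to_utc_range should be called with the TZ derived from tz_name.

🧪 Suggested change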
-from datetime import date, timezone
+from datetime import date
+from zoneinfo import ZoneInfo
@@
 def _yqp(**overrides):
@@
-    utc_start, utc_end = to_utc_range(sd, ed, timezone.utc)
+    tz_name = defaults.get("tz_name", "UTC")
+    utc_start, utc_end = to_utc_range(sd, ed, ZoneInfo(str(tz_name)))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@egograph/backend/tests/unit/repositories/test_youtube_queries.py` around
lines 29 - 47, The helper _yqp currently always computes utc_start/utc_end using
timezone.utc, so overriding tz_name doesn't affect to_utc_range; change _yqp to
resolve the timezone from the provided tz_name (e.g. ZoneInfo(tz_name) or
equivalent) and pass that timezone object into to_utc_range instead of
timezone.utc, then construct YouTubeQueryParams with the resulting
utc_start/utc_end so tz_name and the UTC range calculation are consistent.
egograph/backend/infrastructure/repositories/youtube_repository.py (1)

34-34: ⚡ Quick win

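Move the ZoneInfo("UTC") default argument into initialization inside the body

The tz parameter on Line 34 contains a function call in its default argument, which triggers Ruff's B008 warning. A None default with initialization inside the body is more robust.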

-    def __init__(self, r2_config: R2Config, tz: ZoneInfo = ZoneInfo("UTC")):
+    def __init__(self, r2_config: R2Config, tz: ZoneInfo | None = None):
         """YouTubeRepository を初期化します。

         Args:
             r2_config: R2 設定
             tz: クエリ時の日付解釈に使用するタイムゾーン
         """
         self.r2_config = r2_config
-        self._tz = tz
+        self._tz = tz or ZoneInfo("UTC")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@egograph/backend/infrastructure/repositories/youtube_repository.py` at line
34, Change the __init__ signature to avoid calling ZoneInfo("UTC") at function
definition time: make the tz parameter default to None (update the type hint to
Optional[ZoneInfo]) and then inside the __init__ body set tz = ZoneInfo("UTC")
if tz is None; update any imports to include typing.Optional if needed. This
moves creation of ZoneInfo("UTC") from the function definition into runtime
initialization within the __init__ method (refer to the __init__ method and the
tz parameter).
egograph/backend/tests/integration/test_compacted_parquet_reads.py (1)

30-31: ⚡ Quick win

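Add one integration test for a non-UTC boundary case

The main subject of this change is interpreting dates in the environment TZ, so tests fixed at tz_name="UTC" are unlikely to catch regressions. Adding one date-crossing case such as Asia/Tokyo would strengthen coverage considerably.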

Also applies to: 73-83, 163-174, 231-241

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

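In `@egograph/backend/tests/integration/test_compacted_parquet_reads.py` around
lines 30 - 31, the test helper _utc_range always builds UTC-fixed ranges using
timezone.utc, which makes regressions in TZ interpretation hard to detect. Add
at least one integration test case for a non-UTC boundary (e.g.
tz_name='Asia/Tokyo') to the tests that use to_utc_range. Specifically,
following the existing _utc_range / to_utc_range call pattern, create a new
test that uses a date range crossing the TZ boundary (00:00 Japan time), specify
tz_name="Asia/Tokyo" in the test, and assert that the expected UTC range is
obtained. Apply the same change (adding the same non-UTC case) to the other
similar blocks in the file that use _utc_range/to_utc_range in the same way.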
egograph/backend/tests/unit/database/test_queries.py (1)

20-37: ⚡ Quick win

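_qp is fixed to UTC, which makes it hard to add TZ regression tests

Line 30 is hard-coded to timezone.utc, so the core of this feature (non-UTC interpretation) is hard to protect in this file. Adding a tz argument (default UTC) and including Asia/Tokyo for even a single month-crossing case would strengthen it considerably.

Suggested diff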
-def _qp(**overrides):
-    """テスト用 QueryParams ファクトリ(UTC で日付を解釈)。"""
+def _qp(*, tz=timezone.utc, **overrides):
+    """テスト用 QueryParams ファクトリ (指定TZで日付を解釈)。"""
@@
-    defaults = dict(
+    defaults = dict(
         bucket="test-bucket",
         events_path="events/",
-        tz_name="UTC",
+        tz_name=getattr(tz, "key", "UTC"),
     )
@@
-    utc_start, utc_end = to_utc_range(sd, ed, timezone.utc)
+    utc_start, utc_end = to_utc_range(sd, ed, tz)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@egograph/backend/tests/unit/database/test_queries.py` around lines 20 - 37,
The helper _qp currently calls to_utc_range with a hard-coded timezone.utc which
prevents writing non-UTC regression tests; modify _qp to accept an optional tz
(or tz_name) parameter defaulting to timezone.utc and pass that through to
to_utc_range and into the returned QueryParams (keep existing defaults
behavior), and update at least one test in this file to call _qp with
tz=ZoneInfo("Asia/Tokyo") (or tz_name="Asia/Tokyo") to validate month-crossing
behavior under JST. Ensure references: function _qp, to_utc_range, and the
QueryParams constructor are updated accordingly.
egograph/backend/infrastructure/repositories/spotify_repository.py (1)

31-31: ⚡ Quick win

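Avoid calling a function in a default argument

ZoneInfo("UTC") on line 31 is a function call in a default argument, which trips Ruff B008. Accepting None and handling it with a module constant or inside the method is cleaner.

Suggested fix along these lines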
+DEFAULT_TIMEZONE = ZoneInfo("UTC")
+
-    def __init__(self, r2_config: R2Config, tz: ZoneInfo = ZoneInfo("UTC")):
+    def __init__(self, r2_config: R2Config, tz: ZoneInfo | None = None):
         self.r2_config = r2_config
-        self._tz = tz
+        self._tz = tz or DEFAULT_TIMEZONE
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@egograph/backend/infrastructure/repositories/spotify_repository.py` at line
31, The __init__ signature currently calls ZoneInfo("UTC") as a default argument
which triggers Ruff B008; change the parameter to accept tz: ZoneInfo | None =
None in the SpotifyRepository.__init__ (or whatever class contains that
__init__) and inside the constructor set self.tz = tz or ZoneInfo("UTC") when tz
is None, or alternatively define a module-level constant UTC = ZoneInfo("UTC")
and use that inside the body; ensure no function calls occur in the default
argument.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/99.archive/plans/timezone-aware-query.md`:
- Around line 17-27:
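The document mixes two file path notations, "egograph/..." and "backend/...", which makes references hard to follow. Unify the root notation across the whole plan (e.g. use "backend/..." everywhere or "egograph/..." everywhere). Specifically, find every line that mentions the listed function/type name (to_utc_range) or the data classes to be changed (QueryParams,
GitHubQueryParams, BrowserHistoryQueryParams,
YouTubeQueryParams) and replace the paths with the same root prefix for consistency. Remember to update the table of contents and item numbers as well (multiple places, e.g.
around lines 48-52).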

In `@egograph/backend/infrastructure/database/queries.py`:
- Around line 239-243: The SQL time-zone conversion in the strftime expression
is reversed causing local buckets to shift; locate the strftime usage
referencing played_at_utc and params.tz_name in queries.py and change the
timezone handling to either apply UTC first then the target zone (i.e.,
played_at_utc AT TIME ZONE 'UTC' AT TIME ZONE '{params.tz_name}') or simplify to
a single conversion (played_at_utc AT TIME ZONE '{params.tz_name}'), ensuring
the resulting timestamp passed to strftime reflects the intended local timezone
using the date_format variable.

In `@egograph/backend/infrastructure/repositories/spotify_repository.py`:
- Around line 63-64: The partition-generation is using local start_date/end_date
which causes month-boundary shifts versus stored UTC partitions; update
_generate_partition_paths() to compute partition months from UTC-aligned
datetimes (e.g., convert start_date and end_date to UTC at midnight or use
.astimezone(timezone.utc) and then derive year/month ranges) so generated
partitions match stored year/month=YYYY/MM in UTC; ensure any loop or
inclusive/exclusive logic uses those UTC-derived year/month values and add tests
around Asia/Tokyo midnight transitions to validate.

In `@egograph/backend/tests/unit/test_validators.py`:
- Line 65: Replace the full-width Japanese parentheses in the docstring "返り値が
naive datetime(tzinfo=None)であること。" with ASCII parentheses to satisfy Ruff
(RUF002); locate the docstring in egograph/backend/tests/unit/test_validators.py
(search for that exact string) and change "(" and ")" to "(" and ")" so it
becomes "返り値が naive datetime (tzinfo=None) であること。".


📥 Commits

Reviewing files that changed from the base of the PR and between 6ee3a06 and 8e5f3c5.

📒 Files selected for processing (25)
  • docs/99.archive/plans/timezone-aware-query.md
  • egograph/backend/api/browser_history_data.py
  • egograph/backend/api/data.py
  • egograph/backend/api/github.py
  • egograph/backend/api/youtube.py
  • egograph/backend/config.py
  • egograph/backend/infrastructure/database/browser_history_queries.py
  • egograph/backend/infrastructure/database/github_queries.py
  • egograph/backend/infrastructure/database/queries.py
  • egograph/backend/infrastructure/database/youtube_queries.py
  • egograph/backend/infrastructure/repositories/browser_history_repository.py
  • egograph/backend/infrastructure/repositories/github_repository.py
  • egograph/backend/infrastructure/repositories/spotify_repository.py
  • egograph/backend/infrastructure/repositories/youtube_repository.py
  • egograph/backend/mcp_server.py
  • egograph/backend/tests/integration/test_compacted_parquet_reads.py
  • egograph/backend/tests/integration/test_mcp_endpoint.py
  • egograph/backend/tests/test_mcp_server.py
  • egograph/backend/tests/unit/database/test_browser_history_queries.py
  • egograph/backend/tests/unit/database/test_queries.py
  • egograph/backend/tests/unit/repositories/test_youtube_queries.py
  • egograph/backend/tests/unit/test_validators.py
  • egograph/backend/usecases/tools/factory.py
  • egograph/backend/validators.py
  • egograph/pipelines/.env.example
💤 Files with no reviewable changes (1)
  • egograph/backend/tests/integration/test_mcp_endpoint.py

Comment thread docs/99.archive/plans/timezone-aware-query.md
Comment thread egograph/backend/infrastructure/database/queries.py
Comment thread egograph/backend/infrastructure/repositories/spotify_repository.py
Comment thread egograph/backend/tests/unit/test_validators.py Outdated
endo-ly added 3 commits May 17, 2026 15:28
- Fix critical bug: strftime AT TIME ZONE was reversed. Naive timestamps
  are UTC, so must apply AT TIME ZONE 'UTC' first, then AT TIME ZONE
  '{tz}' to convert to local time for period bucketing.
- Fix B008: replace ZoneInfo('UTC') default arg with None + fallback
- Fix RUF002: replace full-width parentheses in docstring
- Add JST boundary integration test to verify timezone-aware querying
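
The ordering issue in the first bullet has a direct analogue in Python's datetime API, sketched below under the assumption that the stored values are naive UTC timestamps. This is not the DuckDB SQL from the PR, and the function names are hypothetical. The correct path first labels the naive value as UTC and then converts it to the target zone; the reversed path shows one way the wrong order can shift a bucket.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def period_label(naive_utc, tz, fmt="%Y-%m-%d"):
    """Bucket a naive UTC timestamp by its calendar date in tz."""
    # Step 1: declare that the naive value is UTC (clock digits unchanged).
    aware_utc = naive_utc.replace(tzinfo=timezone.utc)
    # Step 2: convert that instant to the target timezone, then format.
    return aware_utc.astimezone(tz).strftime(fmt)


def period_label_reversed(naive_utc, tz, fmt="%Y-%m-%d"):
    """Wrong order: treat the naive UTC value as if it were already local time."""
    as_local = naive_utc.replace(tzinfo=tz)
    return as_local.astimezone(timezone.utc).strftime(fmt)


jst = ZoneInfo("Asia/Tokyo")
played_at_utc = datetime(2024, 5, 16, 16, 30)  # 01:30 on 17 May in Japan
print(period_label(played_at_utc, jst))
print(period_label_reversed(played_at_utc, jst))
```

Only the first function places the play in the 17 May bucket, which is the local day the listener experienced.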

Tests were hardcoding '2026-04' for mock data dates, causing failures
when the current month changed. Updated to use datetime.now() so mock
PR/commit dates always fall within the backfill window.

… start_date/end_date

Codex review revealed a critical bug: partition path generation was still
using local start_date/end_date, so when querying JST 2024-01-01 the
WHERE clause correctly converted to UTC range [12/31 15:00, 01/01 15:00)
but only read year=2024/month=01 partition — missing data in
year=2023/month=12 (UTC pre-month partitions).

Changes:
- All _generate_partition_paths functions now accept utc_start/utc_end
  (naive UTC datetimes) instead of local date args
- _iter_months and build_partition_paths similarly updated
- JST integration test split data across year=2023/month=12 and
  year=2024/month=01 partitions to validate cross-partition reads
- Updated all test callers to use datetime args
- Added empty month-02 partition data in compacted_parquet_reads tests
  since utc_end at midnight of next month generates that partition too

No production schema change — data still stored in UTC partitions.
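
The partition fix described in this commit can be sketched as below. The helper names and the `events` prefix are hypothetical, and the `year=YYYY/month=MM` layout is taken from the commit message. Months are enumerated from the month of `utc_start` through the month of `utc_end` inclusive, which matches the note that a `utc_end` at midnight on the first of a month also pulls in that month's partition.

```python
from datetime import datetime


def iter_months(utc_start, utc_end):
    """Yield (year, month) from utc_start's month through utc_end's month."""
    year, month = utc_start.year, utc_start.month
    while (year, month) <= (utc_end.year, utc_end.month):
        yield year, month
        # Roll over to January of the next year after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def partition_paths(prefix, utc_start, utc_end):
    """Build one path per UTC month touched by the range (hypothetical layout)."""
    return [f"{prefix}/year={y}/month={m:02d}" for y, m in iter_months(utc_start, utc_end)]


# JST 2024-01-01 corresponds to the UTC range [2023-12-31 15:00, 2024-01-01 15:00).
jst_new_year = partition_paths("events", datetime(2023, 12, 31, 15), datetime(2024, 1, 1, 15))
print(jst_new_year)
```

Reading one extra, possibly empty, partition errs on the safe side, since the WHERE clause still filters rows to the exact half-open range.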

coderabbitai Bot left a comment

🧹 Nitpick comments (1)
egograph/backend/tests/unit/database/test_queries.py (1)

21-21: ⚡ Quick win

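Replace the full-width parentheses flagged by Ruff with ASCII ones

The full-width parentheses on Line 21 and Line 52 are flagged by RUF002/RUF003. Replace them with ( ) to reduce CI noise.

🔧 Example fix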
-def _qp(**overrides):
-    """テスト用 QueryParams ファクトリ(UTC で日付を解釈)。"""
+def _qp(**overrides):
+    """テスト用 QueryParams ファクトリ(UTC で日付を解釈)。"""
@@
-        # utc_end が 2/1 なので 2月も含まれる(安全側に倒す)
+        # utc_end が 2/1 なので 2月も含まれる(安全側に倒す)

Also applies to: 52-52

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@egograph/backend/tests/unit/database/test_queries.py` at line 21, The
docstrings use full-width parentheses which trigger Ruff RUF002/RUF003; update
the Japanese docstring text for the QueryParams test factory (the line
containing "テスト用 QueryParams ファクトリ(UTC で日付を解釈)。") and the similar docstring
around line 52 to use ASCII parentheses "(" and ")" instead of "(" and ")" so
the strings read "テスト用 QueryParams ファクトリ(UTC で日付を解釈)。" and the other docstring
is similarly normalized.

📥 Commits

Reviewing files that changed from the base of the PR and between 8e5f3c5 and c433483.

📒 Files selected for processing (18)
  • egograph/backend/infrastructure/database/browser_history_queries.py
  • egograph/backend/infrastructure/database/github_queries.py
  • egograph/backend/infrastructure/database/parquet_paths.py
  • egograph/backend/infrastructure/database/queries.py
  • egograph/backend/infrastructure/database/youtube_queries.py
  • egograph/backend/infrastructure/repositories/browser_history_repository.py
  • egograph/backend/infrastructure/repositories/github_repository.py
  • egograph/backend/infrastructure/repositories/spotify_repository.py
  • egograph/backend/infrastructure/repositories/youtube_repository.py
  • egograph/backend/tests/integration/test_compacted_parquet_reads.py
  • egograph/backend/tests/unit/database/test_browser_history_queries.py
  • egograph/backend/tests/unit/database/test_parquet_paths.py
  • egograph/backend/tests/unit/database/test_queries.py
  • egograph/backend/tests/unit/repositories/test_youtube_queries.py
  • egograph/backend/tests/unit/test_validators.py
  • egograph/pipelines/tests/e2e/test_browser_history_ingest.py
  • egograph/pipelines/tests/integration/github/test_enrichment.py
  • egograph/pipelines/tests/integration/github/test_ingest.py
🚧 Files skipped from review as they are similar to previous changes (8)
  • egograph/backend/infrastructure/repositories/browser_history_repository.py
  • egograph/backend/infrastructure/repositories/spotify_repository.py
  • egograph/backend/tests/unit/test_validators.py
  • egograph/backend/infrastructure/repositories/youtube_repository.py
  • egograph/backend/tests/unit/database/test_browser_history_queries.py
  • egograph/backend/infrastructure/repositories/github_repository.py
  • egograph/backend/tests/unit/repositories/test_youtube_queries.py
  • egograph/backend/infrastructure/database/github_queries.py

@endo-ly endo-ly merged commit 5ad3dbb into main May 17, 2026
4 checks passed