feat: Timezone-aware queries (store in UTC, convert from the environment TZ at read time) #168
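Date-range queries now take the environment timezone into account. Storage is unchanged and stays in UTC. At read time, dates are interpreted in the timezone given by the TIMEZONE environment variable and converted to UTC before being compared against the Parquet data. As a result, data at JST 23:59 is correctly treated as belonging to "that day".
Changes:
- Add TIMEZONE=Asia/Tokyo to .env.example
- Add a timezone field (ZoneInfo) to BackendConfig
- Convert dates to naive UTC datetimes with validators.to_utc_range()
- Add utc_start/utc_end to every QueryParams
- Change the SQL WHERE clause from ::DATE BETWEEN to >= / <
- Use AT TIME ZONE with strftime to build period buckets in the environment TZ
- Add a tz parameter to every Repository
- Update all tests for utc_start/utc_end
- Add unit tests for to_utc_range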
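The conversion described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the name to_utc_range and the naive-UTC return values come from the description, while the exact signature, the in_range helper, and the handling of the end date (exclusive bound at local midnight of the following day) are assumptions.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo


def to_utc_range(start_date: date, end_date: date, tz: tzinfo) -> tuple[datetime, datetime]:
    """Interpret an inclusive local date range in `tz` as a half-open naive-UTC range."""
    # Local midnight at the start of the first day.
    local_start = datetime.combine(start_date, time.min, tzinfo=tz)
    # Local midnight at the start of the day after the last day (exclusive bound).
    local_end = datetime.combine(end_date + timedelta(days=1), time.min, tzinfo=tz)
    # Convert to UTC, then drop tzinfo so the values compare against naive UTC columns.
    utc_start = local_start.astimezone(timezone.utc).replace(tzinfo=None)
    utc_end = local_end.astimezone(timezone.utc).replace(tzinfo=None)
    return utc_start, utc_end


def in_range(ts_utc: datetime, utc_start: datetime, utc_end: datetime) -> bool:
    """Hypothetical Python equivalent of `col >= ? AND col < ?`."""
    return utc_start <= ts_utc < utc_end
```

Called with ZoneInfo("Asia/Tokyo") and date(2026, 5, 17) for both bounds, this sketch yields datetime(2026, 5, 16, 15, 0) and datetime(2026, 5, 17, 15, 0). A row stored as 2024-05-16T16:30 UTC has a UTC calendar date of 5/16, so a date cast misses it when querying 5/17, while the half-open comparison against the converted range matches it.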
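Walkthrough
Date ranges interpreted in a timezone are converted by to_utc_range into a half-open UTC interval [utc_start, utc_end), and the change propagates that interval consistently through QueryParams, partition generation, SQL binding, repository initialization, the API, and the tests.
Changes
Timezone-aware query support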
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
Actionable comments posted: 4
🧹 Nitpick comments (5)
egograph/backend/tests/unit/repositories/test_youtube_queries.py (1)
29-47: ⚡ Quick win
Align _yqp's tz_name with the base TZ of the UTC range calculation
Right now, overriding tz_name still leaves utc_start/utc_end computed against UTC. That weakens the reliability of the TZ-aware tests, so to_utc_range should be called with the TZ derived from tz_name.
🧪 Suggested change
-from datetime import date, timezone
+from datetime import date
+from zoneinfo import ZoneInfo
@@ def _yqp(**overrides):
@@
- utc_start, utc_end = to_utc_range(sd, ed, timezone.utc)
+ tz_name = defaults.get("tz_name", "UTC")
+ utc_start, utc_end = to_utc_range(sd, ed, ZoneInfo(str(tz_name)))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@egograph/backend/tests/unit/repositories/test_youtube_queries.py` around lines 29 - 47, The helper _yqp currently always computes utc_start/utc_end using timezone.utc, so overriding tz_name doesn't affect to_utc_range; change _yqp to resolve the timezone from the provided tz_name (e.g. ZoneInfo(tz_name) or equivalent) and pass that timezone object into to_utc_range instead of timezone.utc, then construct YouTubeQueryParams with the resulting utc_start/utc_end so tz_name and the UTC range calculation are consistent.
egograph/backend/infrastructure/repositories/youtube_repository.py (1)
34-34: ⚡ Quick win
Move the ZoneInfo("UTC") default argument into the function body
The tz parameter on line 34 has a function call in its default argument, which triggers Ruff's B008 warning. Defaulting to None and initializing in the body is more robust.
- def __init__(self, r2_config: R2Config, tz: ZoneInfo = ZoneInfo("UTC")):
+ def __init__(self, r2_config: R2Config, tz: ZoneInfo | None = None):
 """YouTubeRepository を初期化します。
 Args:
 r2_config: R2 設定
 tz: クエリ時の日付解釈に使用するタイムゾーン
 """
 self.r2_config = r2_config
- self._tz = tz
+ self._tz = tz or ZoneInfo("UTC")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@egograph/backend/infrastructure/repositories/youtube_repository.py` at line 34, Change the __init__ signature to avoid calling ZoneInfo("UTC") at function definition time: make the tz parameter default to None (update the type hint to Optional[ZoneInfo]) and then inside the __init__ body set tz = ZoneInfo("UTC") if tz is None; update any imports to include typing.Optional if needed. This moves creation of ZoneInfo("UTC") from the function definition into runtime initialization within the __init__ method (refer to the __init__ method and the tz parameter).
egograph/backend/tests/integration/test_compacted_parquet_reads.py (1)
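30-31: ⚡ Quick win
Add one integration test for a non-UTC boundary case
The theme of this change is interpreting dates in the environment TZ, so tests pinned to tz_name="UTC" alone are unlikely to catch regressions. Adding a single date-crossing case such as Asia/Tokyo would strengthen the coverage considerably.
Also applies to: 73-83, 163-174, 231-241
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@egograph/backend/tests/integration/test_compacted_parquet_reads.py` around lines 30 - 31, the test helper _utc_range always uses timezone.utc and builds UTC-fixed ranges, which makes regressions in TZ interpretation hard to detect. Add at least one integration test case for a non-UTC boundary (e.g. tz_name='Asia/Tokyo') to the tests that use to_utc_range. Specifically, following the existing _utc_range / to_utc_range call pattern, create a new test whose date range crosses a TZ boundary (a range spanning 00:00 Japan time), specify tz_name="Asia/Tokyo" in the test, and assert that the expected UTC range is produced. Apply the same change (adding the same non-UTC case) to the other similar blocks in the file that use _utc_range/to_utc_range.
egograph/backend/tests/unit/database/test_queries.py (1)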
20-37: ⚡ Quick win
_qp is pinned to UTC, which makes it hard to add TZ regression tests
Line 30 is hard-coded to timezone.utc, so the core of this feature (non-UTC interpretation) is hard to protect in this file. Adding a tz argument (default UTC) and covering Asia/Tokyo in even a single month-crossing case would strengthen it considerably.
Suggested diff
-def _qp(**overrides):
- """テスト用 QueryParams ファクトリ（UTC で日付を解釈）。"""
+def _qp(*, tz=timezone.utc, **overrides):
+ """テスト用 QueryParams ファクトリ (指定TZで日付を解釈)。"""
@@
- defaults = dict(
+ defaults = dict(
 bucket="test-bucket",
 events_path="events/",
- tz_name="UTC",
+ tz_name=getattr(tz, "key", "UTC"),
 )
@@
- utc_start, utc_end = to_utc_range(sd, ed, timezone.utc)
+ utc_start, utc_end = to_utc_range(sd, ed, tz)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@egograph/backend/tests/unit/database/test_queries.py` around lines 20 - 37, The helper _qp currently calls to_utc_range with a hard-coded timezone.utc which prevents writing non-UTC regression tests; modify _qp to accept an optional tz (or tz_name) parameter defaulting to timezone.utc and pass that through to to_utc_range and into the returned QueryParams (keep existing defaults behavior), and update at least one test in this file to call _qp with tz=ZoneInfo("Asia/Tokyo") (or tz_name="Asia/Tokyo") to validate month-crossing behavior under JST. Ensure references: function _qp, to_utc_range, and the QueryParams constructor are updated accordingly.
egograph/backend/infrastructure/repositories/spotify_repository.py (1)
31-31: ⚡ Quick win
Avoid calling a function in a default argument
The ZoneInfo("UTC") on line 31 is a function call in a default argument, so it trips Ruff B008. Accepting None and resolving it through a module constant or inside the method is cleaner. A fix along these lines:
+DEFAULT_TIMEZONE = ZoneInfo("UTC")
+
- def __init__(self, r2_config: R2Config, tz: ZoneInfo = ZoneInfo("UTC")):
+ def __init__(self, r2_config: R2Config, tz: ZoneInfo | None = None):
 self.r2_config = r2_config
- self._tz = tz
+ self._tz = tz or DEFAULT_TIMEZONE
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@egograph/backend/infrastructure/repositories/spotify_repository.py` at line 31, The __init__ signature currently calls ZoneInfo("UTC") as a default argument which triggers Ruff B008; change the parameter to accept tz: ZoneInfo | None = None in the SpotifyRepository.__init__ (or whatever class contains that __init__) and inside the constructor set self._tz = tz or ZoneInfo("UTC"), or alternatively define a module-level constant UTC = ZoneInfo("UTC") and use that inside the body; ensure no function calls occur in the default argument.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/99.archive/plans/timezone-aware-query.md`:
- Around line 17-27:
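File paths in the document mix "egograph/..." and "backend/..." notation, which makes references hard to follow. Unify the root notation across the whole plan (e.g. use "backend/..." everywhere or "egograph/..." everywhere). Specifically, find every line that mentions the function/type name (to_utc_range) or the data classes being changed (QueryParams,
GitHubQueryParams, BrowserHistoryQueryParams,
YouTubeQueryParams) and replace the paths with the same root prefix so they stay consistent. Also remember to update the table of contents and section numbers (several places, e.g.
around lines 48-52).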
In `@egograph/backend/infrastructure/database/queries.py`:
- Around line 239-243: The SQL time-zone conversion in the strftime expression
is reversed causing local buckets to shift; locate the strftime usage
referencing played_at_utc and params.tz_name in queries.py and change the
timezone handling to either apply UTC first then the target zone (i.e.,
played_at_utc AT TIME ZONE 'UTC' AT TIME ZONE '{params.tz_name}') or simplify to
a single conversion (played_at_utc AT TIME ZONE '{params.tz_name}'), ensuring
the resulting timestamp passed to strftime reflects the intended local timezone
using the date_format variable.
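The intent behind the corrected conversion can be mirrored in plain Python. This is a sketch under the assumption that played_at_utc holds naive timestamps that are in fact UTC: the value is first tagged as UTC, then shifted to the target zone, and only then formatted into a bucket label. The function name local_bucket is hypothetical; date_format stands in for the variable mentioned in the comment.

```python
from datetime import datetime, timezone, tzinfo


def local_bucket(played_at_utc: datetime, tz: tzinfo, date_format: str = "%Y-%m") -> str:
    """Format a naive UTC timestamp as a period bucket in the local zone `tz`."""
    # Step 1: declare that the naive value is UTC (analogous to AT TIME ZONE 'UTC').
    aware_utc = played_at_utc.replace(tzinfo=timezone.utc)
    # Step 2: convert to local wall-clock time (analogous to AT TIME ZONE '<tz>').
    local = aware_utc.astimezone(tz)
    # Step 3: build the bucket label from the local time.
    return local.strftime(date_format)
```

Skipping step 1, or applying the two steps in the opposite order, shifts the timestamp in the wrong direction, which is the bucket drift the comment describes.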
In `@egograph/backend/infrastructure/repositories/spotify_repository.py`:
- Around line 63-64: The partition-generation is using local start_date/end_date
which causes month-boundary shifts versus stored UTC partitions; update
_generate_partition_paths() to compute partition months from UTC-aligned
datetimes (e.g., convert start_date and end_date to UTC at midnight or use
.astimezone(timezone.utc) and then derive year/month ranges) so generated
partitions match stored year/month=YYYY/MM in UTC; ensure any loop or
inclusive/exclusive logic uses those UTC-derived year/month values and add tests
around Asia/Tokyo midnight transitions to validate.
In `@egograph/backend/tests/unit/test_validators.py`:
- Line 65: Replace the full-width Japanese parentheses in the docstring "返り値が
naive datetime（tzinfo=None）であること。" with ASCII parentheses to satisfy Ruff
(RUF002); locate the docstring in egograph/backend/tests/unit/test_validators.py
(search for that exact string) and change "（" and "）" to "(" and ")" so it
becomes "返り値が naive datetime (tzinfo=None) であること。".
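For repeated occurrences, the replacement could be scripted rather than done by hand. The helpers below are a hypothetical sketch, not part of the PR or of Ruff: they only map the two full-width parenthesis code points (U+FF08, U+FF09) to ASCII and report which lines contain them. They do not add the surrounding spaces shown in the suggested fix.

```python
# Full-width parentheses mapped to their ASCII counterparts.
_FULLWIDTH_PARENS = str.maketrans({"\uff08": "(", "\uff09": ")"})


def normalize_parens(text: str) -> str:
    """Replace full-width parentheses with ASCII ones, leaving other text untouched."""
    return text.translate(_FULLWIDTH_PARENS)


def find_fullwidth_parens(source: str) -> list[int]:
    """Return 1-based line numbers that still contain full-width parentheses."""
    return [
        lineno
        for lineno, line in enumerate(source.splitlines(), start=1)
        if "\uff08" in line or "\uff09" in line
    ]
```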
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0f2ea8d5-d1f0-4b9a-b9a8-e6836a799ac7
📒 Files selected for processing (25)
- docs/99.archive/plans/timezone-aware-query.md
- egograph/backend/api/browser_history_data.py
- egograph/backend/api/data.py
- egograph/backend/api/github.py
- egograph/backend/api/youtube.py
- egograph/backend/config.py
- egograph/backend/infrastructure/database/browser_history_queries.py
- egograph/backend/infrastructure/database/github_queries.py
- egograph/backend/infrastructure/database/queries.py
- egograph/backend/infrastructure/database/youtube_queries.py
- egograph/backend/infrastructure/repositories/browser_history_repository.py
- egograph/backend/infrastructure/repositories/github_repository.py
- egograph/backend/infrastructure/repositories/spotify_repository.py
- egograph/backend/infrastructure/repositories/youtube_repository.py
- egograph/backend/mcp_server.py
- egograph/backend/tests/integration/test_compacted_parquet_reads.py
- egograph/backend/tests/integration/test_mcp_endpoint.py
- egograph/backend/tests/test_mcp_server.py
- egograph/backend/tests/unit/database/test_browser_history_queries.py
- egograph/backend/tests/unit/database/test_queries.py
- egograph/backend/tests/unit/repositories/test_youtube_queries.py
- egograph/backend/tests/unit/test_validators.py
- egograph/backend/usecases/tools/factory.py
- egograph/backend/validators.py
- egograph/pipelines/.env.example
💤 Files with no reviewable changes (1)
- egograph/backend/tests/integration/test_mcp_endpoint.py
- Fix critical bug: strftime AT TIME ZONE was reversed. Naive timestamps
are UTC, so must apply AT TIME ZONE 'UTC' first, then AT TIME ZONE
'{tz}' to convert to local time for period bucketing.
- Fix B008: replace ZoneInfo('UTC') default arg with None + fallback
- Fix RUF002: replace full-width parentheses in docstring
- Add JST boundary integration test to verify timezone-aware querying
Tests were hardcoding '2026-04' for mock data dates, causing failures when the current month changed. Updated to use datetime.now() so mock PR/commit dates always fall within the backfill window.
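One way to keep mock dates inside a rolling window is to derive them from the current time instead of a literal month. The helper below is a hypothetical sketch: the name recent_timestamp, the ISO-8601 output format, and the one-day offset are assumptions, since the commit message does not show the actual test code or the size of the backfill window.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def recent_timestamp(days_ago: int = 1, now: Optional[datetime] = None) -> str:
    """Return an ISO-8601 UTC timestamp `days_ago` days before `now` (default: current time)."""
    # `now` is injectable so the helper itself stays deterministic under test.
    reference = now or datetime.now(timezone.utc)
    return (reference - timedelta(days=days_ago)).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Mock PR or commit payloads built with such a helper move forward with the calendar, so they do not fall out of the window when the month changes.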
… start_date/end_date
Codex review revealed a critical bug: partition path generation was still using the local start_date/end_date. When querying JST 2024-01-01, the WHERE clause correctly converted to the UTC range [12/31 15:00, 01/01 15:00), but only the year=2024/month=01 partition was read, so data in year=2023/month=12 (the previous month's UTC partition) was missed.
Changes:
- All _generate_partition_paths functions now accept utc_start/utc_end (naive UTC datetimes) instead of local date args
- _iter_months and build_partition_paths similarly updated
- JST integration test splits data across the year=2023/month=12 and year=2024/month=01 partitions to validate cross-partition reads
- Updated all test callers to use datetime args
- Added empty month-02 partition data in the compacted_parquet_reads tests, since a utc_end at midnight of the next month generates that partition too
No production schema change: data is still stored in UTC partitions.
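The month enumeration described in this commit can be sketched as below. The function names echo _iter_months and build_partition_paths from the commit message, but the signatures, the base argument, and the exact path layout beyond year=YYYY/month=MM are assumptions. The end month is treated as inclusive, consistent with the note that a utc_end at midnight of the next month also generates that partition.

```python
from datetime import datetime
from typing import Iterator


def iter_months(utc_start: datetime, utc_end: datetime) -> Iterator[tuple[int, int]]:
    """Yield (year, month) for each UTC month from utc_start's month to utc_end's month, inclusive."""
    year, month = utc_start.year, utc_start.month
    while (year, month) <= (utc_end.year, utc_end.month):
        yield year, month
        month += 1
        if month > 12:
            # Roll over into January of the following year.
            year, month = year + 1, 1


def build_partition_paths(base: str, utc_start: datetime, utc_end: datetime) -> list[str]:
    """Build one partition path per UTC month covered by the range."""
    return [f"{base}/year={y}/month={m:02d}" for y, m in iter_months(utc_start, utc_end)]
```

For the JST 2024-01-01 query from the commit message, the UTC range starts on 2023-12-31 15:00, so both the December 2023 and January 2024 partitions are listed.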
🧹 Nitpick comments (1)
egograph/backend/tests/unit/database/test_queries.py (1)
21-21: ⚡ Quick win
Replace the full-width parentheses flagged by Ruff with ASCII ones
The full-width （） on line 21 and line 52 trigger RUF002/RUF003 warnings. Replace them with () to cut down on CI noise.
🔧 Example fix
-def _qp(**overrides):
- """テスト用 QueryParams ファクトリ（UTC で日付を解釈）。"""
+def _qp(**overrides):
+ """テスト用 QueryParams ファクトリ(UTC で日付を解釈)。"""
@@
- # utc_end が 2/1 なので 2月も含まれる（安全側に倒す）
+ # utc_end が 2/1 なので 2月も含まれる(安全側に倒す)
Also applies to: 52-52
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@egograph/backend/tests/unit/database/test_queries.py` at line 21, The docstrings use full-width parentheses which trigger Ruff RUF002/RUF003; update the Japanese docstring text for the QueryParams test factory (the line containing "テスト用 QueryParams ファクトリ（UTC で日付を解釈）。") and the similar comment around line 52 to use ASCII parentheses "(" and ")" instead of "（" and "）" so the string reads "テスト用 QueryParams ファクトリ(UTC で日付を解釈)。" and the other line is similarly normalized.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f8718b4f-5ba4-4cb9-8926-56433add69e3
📒 Files selected for processing (18)
- egograph/backend/infrastructure/database/browser_history_queries.py
- egograph/backend/infrastructure/database/github_queries.py
- egograph/backend/infrastructure/database/parquet_paths.py
- egograph/backend/infrastructure/database/queries.py
- egograph/backend/infrastructure/database/youtube_queries.py
- egograph/backend/infrastructure/repositories/browser_history_repository.py
- egograph/backend/infrastructure/repositories/github_repository.py
- egograph/backend/infrastructure/repositories/spotify_repository.py
- egograph/backend/infrastructure/repositories/youtube_repository.py
- egograph/backend/tests/integration/test_compacted_parquet_reads.py
- egograph/backend/tests/unit/database/test_browser_history_queries.py
- egograph/backend/tests/unit/database/test_parquet_paths.py
- egograph/backend/tests/unit/database/test_queries.py
- egograph/backend/tests/unit/repositories/test_youtube_queries.py
- egograph/backend/tests/unit/test_validators.py
- egograph/pipelines/tests/e2e/test_browser_history_ingest.py
- egograph/pipelines/tests/integration/github/test_enrichment.py
- egograph/pipelines/tests/integration/github/test_ingest.py
🚧 Files skipped from review as they are similar to previous changes (8)
- egograph/backend/infrastructure/repositories/browser_history_repository.py
- egograph/backend/infrastructure/repositories/spotify_repository.py
- egograph/backend/tests/unit/test_validators.py
- egograph/backend/infrastructure/repositories/youtube_repository.py
- egograph/backend/tests/unit/database/test_browser_history_queries.py
- egograph/backend/infrastructure/repositories/github_repository.py
- egograph/backend/tests/unit/repositories/test_youtube_queries.py
- egograph/backend/infrastructure/database/github_queries.py
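Overview
Date-range queries now take the environment timezone into account.
Storage is unchanged and stays in UTC. At read time, dates are interpreted in the timezone given by the TIMEZONE environment variable and converted to UTC before being compared against the Parquet data.
Problem being solved
Spotify playback data from 5/17 01:30 Japan time is stored as played_at_utc = 2024-05-16T16:30:00Z, so WHERE played_at_utc::DATE = '2024-05-17' did not match it. This PR fixes that.
Changes
Configuration
- Add TIMEZONE=Asia/Tokyo to .env.example
- Add a timezone: ZoneInfo field to BackendConfig (default UTC)
Core logic
- validators.to_utc_range(): interprets the input date in the environment TZ and converts it to naive UTC datetimes
- date(2026, 5, 17) + Asia/Tokyo → datetime(2026, 5, 16, 15, 0) to datetime(2026, 5, 17, 15, 0)
- Add utc_start/utc_end to QueryParams
- _utc::DATE BETWEEN ? AND ? → _utc >= ? AND _utc < ?
- Use AT TIME ZONE with strftime to build period buckets in the environment TZ
Scope of impact
- tz parameter added
Not changed
- _utc column names (part of the storage format specification)
Summary by CodeRabbit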