From 88991f7df1d76c8fd5b8dba9f7be3ba87a2c4c43 Mon Sep 17 00:00:00 2001
From: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Date: Mon, 23 Mar 2026 14:39:05 +0530
Subject: [PATCH 1/4] chore: update .gitignore and add Claude settings file

Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
---
 .claude/settings.json | 5 +++++
 .gitignore            | 4 ----
 2 files changed, 5 insertions(+), 4 deletions(-)
 create mode 100644 .claude/settings.json

diff --git a/.claude/settings.json b/.claude/settings.json
new file mode 100644
index 0000000000..0cbf4db508
--- /dev/null
+++ b/.claude/settings.json
@@ -0,0 +1,5 @@
+{
+  "enabledPlugins": {
+    "typescript-lsp@claude-plugins-official": true
+  }
+}
diff --git a/.gitignore b/.gitignore
index 54edc4f658..30715a7f85 100644
--- a/.gitignore
+++ b/.gitignore
@@ -196,10 +196,6 @@ services/libs/tinybird/.diff_tmp
 # custom cursor rules
 .cursor/rules/*.local.mdc
 
-# claude code rules
-CLAUDE.md
-.claude
-
 # git integration test repositories & output
 services/apps/git_integration/src/test/repos/
 services/apps/git_integration/src/test/outputs/custom/

From 0945ac6e623e961df0258a7e3739d7d840e564f1 Mon Sep 17 00:00:00 2001
From: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Date: Mon, 23 Mar 2026 15:22:59 +0530
Subject: [PATCH 2/4] chore: add pyright-lsp for py

Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
---
 .claude/settings.json | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/.claude/settings.json b/.claude/settings.json
index 0cbf4db508..fa7f612a80 100644
--- a/.claude/settings.json
+++ b/.claude/settings.json
@@ -1,5 +1,6 @@
 {
   "enabledPlugins": {
-    "typescript-lsp@claude-plugins-official": true
+    "typescript-lsp@claude-plugins-official": true,
+    "pyright-lsp@claude-plugins-official": true
   }
 }

From 503f112bdb6aeba17e24e47aad7791b322af7d59 Mon Sep 17 00:00:00 2001
From: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Date: Wed, 1 Apr 2026 21:00:08 +0530
Subject: [PATCH 3/4] feat: add .claude/settings.local.json to .gitignore and create CLAUDE.md

Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
---
 .gitignore |  3 +++
 CLAUDE.md  | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+)
 create mode 100644 CLAUDE.md

diff --git a/.gitignore b/.gitignore
index 30715a7f85..af439d5cb9 100644
--- a/.gitignore
+++ b/.gitignore
@@ -196,6 +196,9 @@ services/libs/tinybird/.diff_tmp
 # custom cursor rules
 .cursor/rules/*.local.mdc
 
+# claude
+.claude/settings.local.json
+
 # git integration test repositories & output
 services/apps/git_integration/src/test/repos/
 services/apps/git_integration/src/test/outputs/custom/
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000000..7f50c0b02d
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,69 @@
+# CDP — Community Data Platform
+
+CDP is a community data platform by the Linux Foundation. It ingests millions of
+activities and events daily from platforms like GitHub, GitLab, and many others
+(not just code hosting). Open-source projects get onboarded by connecting
+integrations, and data flows continuously at scale.
+
+The ingested data is often messy. A big part of what CDP does is improve data quality: deduplicating member and organization profiles through merge and unmerge operations, enriching data via third-party providers, and resolving identities across sources. The cleaned data powers analytics and insights for LFX products.
+
+The codebase started as crowd.dev, an open-source startup later acquired by the Linux Foundation. Speed was prioritized over standards, but the platform is now stable. The focus has shifted to maintainable patterns, scalability, and good developer experience. Performance matters at this scale, even small inefficiencies compound across millions of data points.
+
+## Tech stack
+
+TypeScript, Node.js, Express, PostgreSQL (pg-promise), Temporal, Kafka, Redis, OpenSearch, Zod, Bunyan, AWS S3.
+
+Package manager is **pnpm**. Monorepo managed via pnpm workspaces.
+
+## Codebase structure
+
+```
+backend/ -> APIs (public endpoints for LFX products + internal for CDP UI)
+frontend/ -> CDP Platform UI
+services/apps/ -> Microservices — Temporal workers, Node.js workers, webhook APIs
+services/libs/ -> Shared libraries used across services
+```
+
+`services/libs/common` holds shared utilities, error classes,
+and helpers. If a piece of logic is reusable (not business logic), it belongs there.
+
+`services/libs/data-access-layer` holds all
+database query functions. Check here before writing new ones — duplicates are
+already a problem.
+
+## Patterns in transition
+
+Old and new patterns coexist. Always use the new pattern.
+
+- **Sequelize -> pg-promise**: Sequelize is legacy (backend only). Use
+  `queryExecutor` from `@crowd/data-access-layer` for all new database code.
+- **Classes -> functions**: Class-based services and repos are legacy. Write
+  plain functions — composable, modular, easy to test.
+- **Multi-tenancy -> single tenant**: Multi-tenancy is being phased out. The
+  tenant table still exists. Code uses `DEFAULT_TENANT_ID` from `@crowd/common`.
+  Don't add new multi-tenant logic.
+- **Legacy auth -> Auth0**: Auth0 is the current auth system. Ignore old JWT
+  patterns.
+- **Zod for validation**: Public API endpoints use Zod schemas with
+  `validateOrThrow`. Follow this pattern for all new endpoints.
+
+## Working with the database
+
+Millions of rows. Every query matters.
+
+- Look up the table schema and indexes before writing any query. Don't select
+  or touch columns blindly.
+- Check existing functions in `data-access-layer` before writing new ones.
+  Weigh the blast radius of modifying a shared function — sometimes a new
+  function is safer.
+- Write queries with performance in mind. Think about what indexes exist, what
+  the query plan looks like, and whether you're scanning more rows than needed.
+
+## Code quality
+
+- Functional and modular. Code should be easy to plug in, pull out, and test
+  independently.
+- Think about performance at scale, even for small changes.
+- Define types properly — extend and reuse existing types. Don't sprinkle `any`.
+- Don't touch working code outside the scope of the current task.
+- Prefer doing less over introducing risk. Weigh trade-offs before acting.

From 9d2d8e44713f75d6dc28df2fda5d522110bdfd9b Mon Sep 17 00:00:00 2001
From: Yeganathan S <63534555+skwowet@users.noreply.github.com>
Date: Thu, 2 Apr 2026 15:06:56 +0530
Subject: [PATCH 4/4] Update .gitignore

Co-authored-by: Mouad BANI
Signed-off-by: Yeganathan S <63534555+skwowet@users.noreply.github.com>
---
 .gitignore | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/.gitignore b/.gitignore
index af439d5cb9..72d47967d3 100644
--- a/.gitignore
+++ b/.gitignore
@@ -198,6 +198,11 @@ services/libs/tinybird/.diff_tmp
 
 # claude
 .claude/settings.local.json
+.claude/cache/
+.claude/tmp/
+.claude/logs/
+.claude/sessions/
+.claude/todos/
 
 # git integration test repositories & output
 services/apps/git_integration/src/test/repos/