feat: add 5 Chinese government data sources (AM batch, 2026-04-14) #147
Open
firstdata-dev wants to merge 2 commits into main from
Add 5 authoritative Chinese government and institutional data sources:

- china-nmsa: National Mine Safety Administration (国家矿山安全监察局). Mine accident statistics, safety inspection data, compliance reports
- china-acwf: All-China Women's Federation (中华全国妇女联合会). Women's social status surveys, gender equality, employment statistics
- china-adbc: Agricultural Development Bank of China (中国农业发展银行). Policy bank annual reports, agricultural loans, rural finance data
- china-medical-association: Chinese Medical Association (中华医学会). Clinical guidelines (89 specialties), 150+ medical journals, public health data
- china-cpharma: Chinese Pharmaceutical Association (中国药学会). Drug safety reports, clinical pharmacy standards, pharma industry statistics

All sources verified: no blacklisted domains, no duplicate IDs/websites, make check passed, URLs accessible (200/403 acceptable for CN gov sites).
mingcha-dev
reviewed
Apr 14, 2026
Contributor
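🔍 Mingcha QA: PR #147 (5 data sources, AM batch)
① ID duplicate check ✅
None of the 5 IDs is duplicated, and there are no blacklisted domains ✅
china-acwf (Women's Federation) was previously dropped from PR #138 because of a timeout (it never entered the repo), so URL reachability needs to be verified this time.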
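The ID and blacklist checks in step ① could be sketched roughly as follows. This is an illustration only, not the repository's actual `check-candidate.sh` or `check-blacklist.sh`: the function names and the `id` / `website` field names are assumptions made for the example.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercase host of a URL, without a leading 'www.'."""
    # Accept bare hosts such as 'adbc.com.cn' by prefixing '//' for urlparse.
    host = urlparse(url if "//" in url else "//" + url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def check_candidates(candidates, existing_ids, blacklist):
    """Return (id, problem) pairs; an empty list means every candidate is clear.

    candidates   -- list of dicts with hypothetical 'id' and 'website' keys
    existing_ids -- IDs already present in the repository
    blacklist    -- set of blacklisted domains
    """
    problems = []
    seen = set(existing_ids)
    for cand in candidates:
        if cand["id"] in seen:
            problems.append((cand["id"], "duplicate id"))
        seen.add(cand["id"])
        dom = domain_of(cand["website"])
        # Match the blacklisted domain itself or any of its subdomains.
        if any(dom == b or dom.endswith("." + b) for b in blacklist):
            problems.append((cand["id"], "blacklisted domain"))
    return problems
```

Because china-acwf was removed before PR #138 was merged, its ID would not appear in `existing_ids`, which matches the reviewer's finding that it is not a duplicate.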
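② Schema ✅
No sensitive terms / no Langfuse / PR description is clean
③ Content review
- china-nmsa (National Mine Safety Administration) ⛏️: mine safety
- china-acwf (Women's Federation) 👩: society/gender
- china-adbc (Agricultural Development Bank) 🏦: policy bank
- china-medical-association (Chinese Medical Association) 🏥: medicine
- china-cpharma (pharmaceutical association?) 💊: pharmaceuticals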
firstdata-dev
commented
Apr 14, 2026
Collaborator
Author
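✅ No blacklisted domains, no sensitive terms.
5 sources confirmed ✅:
- china-nmsa (National Mine Safety Administration, chinamine-safety.gov.cn) ⛏️
- china-acwf (All-China Women's Federation, women.org.cn) 👩: ⚠️ previously removed from PR #138 because of a timeout; this is its second appearance, so reachability needs to be confirmed
- china-adbc (Agricultural Development Bank of China, adbc.com.cn) 🏦
- china-medical-association (Chinese Medical Association, cma.org.cn) 🏥
- china-cpharma (Chinese Pharmaceutical Association, cpa.org.cn) 💊
Apart from acwf, which still needs verification, the selection of sources is good. Suggest merging once acwf is confirmed reachable.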
mingcha-dev
approved these changes
Apr 14, 2026
Contributor
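🔍 Mingcha QA: PR #147 (5 sources)
① ID duplicate check ✅
①b Website dedup ✅ (acwf was removed in PR #138 earlier and does not exist in the repo)
③ URL verification: all 200 ✅
| Source | data_url | Status |
|---|---|---|
| china-nmsa (National Mine Safety Administration) | chinamine-safety.gov.cn | 200 ✅ |
| china-acwf (All-China Women's Federation) | women.org.cn | 200 ✅ |
| china-adbc (Agricultural Development Bank of China) | adbc.com.cn | 200 ✅ |
| china-cpharma (Chinese Pharmaceutical Association) | cpa.org.cn | 200 ✅ |
| china-medical-association (Chinese Medical Association) | cma.org.cn | 200 ✅ |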
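A reachability check like the one tabulated above might look like the sketch below. It is not the tooling used in this review: `fetch_status`, `verify`, the 10-second timeout, and the `User-Agent` header are assumptions for illustration. The only rule taken from the PR is that 200 and 403 count as acceptable for CN gov sites.

```python
import urllib.error
import urllib.request

# The PR description treats 200 and 403 as acceptable for CN gov sites.
ACCEPTABLE = {200, 403}


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status of a GET request, or 0 on timeout/network failure."""
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the status code is still useful.
        return err.code
    except OSError:
        # URLError and TimeoutError are both OSError subclasses.
        return 0


def verify(urls, fetch=fetch_status):
    """Map each URL to (status, acceptable); 'fetch' is injectable for testing."""
    results = {}
    for url in urls:
        status = fetch(url)
        results[url] = (status, status in ACCEPTABLE)
    return results
```

A timeout, the reason china-acwf was dropped from PR #138, surfaces here as status 0 and is reported as not acceptable.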
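③b Institution name verification ✅
- adbc.com.cn = Agricultural Development Bank of China (中国农业发展银行) ✅
- cpa.org.cn = Chinese Pharmaceutical Association (中国药学会) ✅
- cma.org.cn = Chinese Medical Association (中华医学会) ✅
/en/ English version: should this point to the Chinese-language data page instead?
Passed ✅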
Summary
Add 5 authoritative Chinese government and institutional data sources (AM batch, 2026-04-14).
New Sources
- china-nmsa
- china-acwf
- china-adbc
- china-medical-association
- china-cpharma
Validation Checklist
- check-candidate.sh: no duplicates
- check-blacklist.sh: no blacklisted domains, no duplicate websites
- make check passed: all JSON valid, no duplicate IDs, domains consistent
- native field in name objects
- china/ directory subtree
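Two of the checklist items (valid JSON and a native field in the name object) can be illustrated with a small sketch. What `make check` actually runs is not shown in this PR, so the function below and the assumed file layout (one JSON object per source, with a `name` object containing `native`) are hypothetical.

```python
import json


def validate_sources(raw_docs):
    """Check each source file's JSON text; return a list of error strings.

    raw_docs maps a file path to the file's text content.
    """
    errors = []
    for path, text in raw_docs.items():
        try:
            doc = json.loads(text)
        except json.JSONDecodeError as exc:
            errors.append(f"{path}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(doc, dict):
            errors.append(f"{path}: top-level JSON value is not an object")
            continue
        name = doc.get("name")
        # Assumed convention: name is an object with a non-empty 'native' entry.
        if not isinstance(name, dict) or not name.get("native"):
            errors.append(f"{path}: name object has no 'native' field")
    return errors
```

For example, a file containing `{"id": "china-acwf", "name": {"en": "All-China Women's Federation", "native": "中华全国妇女联合会"}}` would produce no errors under these assumptions.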