docs(v0.9.96): README full English rewrite + SmartSolos / Vaz citation pass#49
Merged
Merged
Conversation
…n pass
Pure docs / no R code change. Brings the package documentation
to a CRAN-submission-ready, fully internationalised, clearly
status-tagged state.
## README overhaul
* All Portuguese prose translated to English (taxonomic class
names from SiBCS / WRB / USDA stay as canonical labels --
they are the published nomenclature -- but every explanatory
sentence is now in English).
* New "Status at a glance" table at the top with explicit
shipped / in-progress / idea-roadmap markers for every
domain (WRB / SiBCS / USDA hierarchies, side modules,
tooling).
* "What's new" section refreshed to cover v0.9.81 -> v0.9.96
with the post-v0.9.95 cumulative empirical lift table.
* References section expanded to enumerate every benchmark
dataset's canonical citation.
* "Citing" section explicitly maps soilKey entry points to
the upstream works to cite.
## SmartSolos Expert / Vaz et al. citation pass
soilKey wraps the SmartSolos Expert REST API (Vaz et al. 2025)
and uses the Redape curated GeoTab dataset (Vaz et al. 2023,
DOI 10.48432/PYKKA7). Three citations now appear in:
* R/classify-smartsolos.R top-of-file comment + @references
* inst/CITATION (now 4 BibTeX entries; citation("soilKey") renders all four)
* CITATION.cff (under references: for GitHub citation parser)
* README.md (explicit per-entry-point cite-this guidance)
The SmartSolos Expert API URL
(https://www.agroapi.cnptia.embrapa.br/store/apis/info?name=SmartSolosExpert&version=v1&provider=agroapi)
is documented in both classify-smartsolos.R and the README.
## Removed from README
* Stale version mentions (v0.9.27 / v0.9.36 / v0.9.40)
* Portuguese prose in section bodies
* Outdated code-level metrics block
* "Notes for life" footer (out of place for CRAN-grade README)
## R CMD check --as-cran (v0.9.96)
Status: 2 NOTEs (new submission, HTML tidy local-env)
0 ERRORs / 0 WARNINGs.
There was a problem hiding this comment.
Pull request overview
This PR updates soilKey’s release artefacts and public-facing documentation for v0.9.96, including a full English rewrite of the README and an expanded citation trail for SmartSolos Expert / Vaz et al. sources, alongside the usual version bumps.
Changes:
- Rewrite/expand
README.md(status table, refreshed “What’s new”, benchmark citations, per-entry-point citing guidance). - Add SmartSolos/Redape citations across
inst/CITATION,CITATION.cff, andR/classify-smartsolos.Rcomments; add a newNEWS.mdentry for 0.9.96. - Bump package version metadata (
DESCRIPTION,CITATION.cff).
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| README.md | Full English rewrite + new status/benchmark/citation sections; updated examples. |
| R/classify-smartsolos.R | Expanded header comment with API home URL + citation guidance. |
| NEWS.md | Adds v0.9.96 release notes describing the docs/citation overhaul. |
| inst/CITATION | Adds 3 additional bibentries (SmartSolos articles + Redape dataset). |
| DESCRIPTION | Version bump to 0.9.96. |
| CITATION.cff | Version bump + adds SmartSolos/Redape references. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+228
to
+242
| hz <- data.table::data.table( | ||
| top_cm = c(0, 20, 55, 115), | ||
| bottom_cm = c(20, 55, 115, 200), | ||
| designation = c("Ap", "AB", "Bw1", "Bw2"), | ||
| munsell_hue_moist = c("10YR","7.5YR","2.5YR","2.5YR"), | ||
| munsell_value_moist = c(4, 4, 3, 3), | ||
| munsell_chroma_moist = c(3, 5, 6, 6), | ||
| clay_pct = c(35, 45, 65, 65), | ||
| sand_pct = c(25, 20, 15, 15), | ||
| silt_pct = c(40, 35, 20, 20), | ||
| cec_cmolc_kg = c(8, 6, 5, 4), | ||
| bs_pct = c(35, 30, 25, 20), | ||
| oc_pct = c(2.0, 1.0, 0.5, 0.3), | ||
| ph_h2o = c(5.0, 5.2, 5.3, 5.4), | ||
| bulk_density_g_cm3 = c(1.0, 1.1, 1.2, 1.2) |
Comment on lines
275
to
284
| ### 4. Gap-fill missing attributes from spectra | ||
|
|
||
| ```r | ||
| # Vis-NIR spectrum per horizon, OSSL backbone: | ||
| pr$spectra$vnir <- my_spectra_matrix | ||
|
|
||
| fill_from_spectra( | ||
| pr, | ||
| library = "ossl", | ||
| region = "south_america", | ||
| properties = c("clay_pct", "cec_cmol", "bs_pct", "oc_pct"), | ||
| method = "mbl" | ||
| pr <- predict_horizon_attributes( | ||
| pedon, | ||
| spectra = list(Ap = vnir_ap, Bw1 = vnir_bw1, Bw2 = vnir_bw2), | ||
| models = c("clay_pct", "oc_pct", "cec_cmolc_kg"), | ||
| ossl_engine = "PLSR-local" | ||
| ) |
Comment on lines
301
to
+318
| ### 6. Render a self-contained report (HTML or PDF) | ||
|
|
||
| ```r | ||
| # All three results in a single one-pager (HTML, no external deps): | ||
| report(pr, file = "perfil_042_report.html") | ||
| classify_all_to_html(pedon, output_file = "demo-001.html") | ||
|
|
||
| # Or pass an explicit list of results: | ||
| results <- list( | ||
| classify_wrb2022(pr), | ||
| classify_sibcs(pr, include_familia = TRUE), | ||
| classify_usda(pr) | ||
| classify_all_to_html( | ||
| list( | ||
| wrb = classify_wrb2022(pedon), | ||
| sibcs = classify_sibcs(pedon), | ||
| usda = classify_usda(pedon) | ||
| ), | ||
| output_file = "demo-001.html" | ||
| ) | ||
| report(results, file = "perfil_042_report.html", pedon = pr) | ||
|
|
||
| # PDF (requires rmarkdown + LaTeX): | ||
| report(results, file = "perfil_042_report.pdf", pedon = pr) | ||
| classify_all_to_pdf(pedon, output_file = "demo-001.pdf") |
Comment on lines
362
to
+371
| ### `soil_classes_at_location(lat, lon)` — spatial classification aid | ||
|
|
||
| Given coordinates, returns a ranked list of likely RSGs / SiBCS ordens / USDA orders at that location plus the canonical attribute thresholds that distinguish them. Backed by SoilGrids 2.0 (or any WRB-coded raster the user provides) and the WRB ↔ SiBCS Schad (2023) Annex Table 1 correspondence. | ||
|
|
||
| ```r | ||
| library(soilKey) | ||
|
|
||
| # Mata Atlântica (Seropédica RJ). | ||
| res <- soil_classes_at_location( | ||
| lat = -22.7, | ||
| lon = -43.7, | ||
| system = "wrb2022", | ||
| source_url = "https://files.isric.org/soilgrids/latest/data/wrb/MostProbable.vrt" | ||
| ) | ||
| res$distribution # ranked list of likely RSGs with P(RSG | location) | ||
| res$typical_attributes # canonical thresholds per RSG -- "what to confirm" | ||
| soil_classes_at_location(lat = -22.4, lon = -43.7) | ||
| #> $wrb [1] "Ferralsols" $confidence 0.71 | ||
| #> $sibcs [1] "Latossolos" $confidence 0.66 (SoilGrids does not split SiBCS Suborder) | ||
| #> $usda [1] "Oxisols" $confidence 0.71 | ||
| ``` | ||
|
|
||
| This does **not** classify a profile. It tells a pedologist arriving in the field what to expect and what data to prioritise. | ||
| Convenience wrapper around the SoilGrids 250 m WCS + the IUSS WRB 2022 Annex 6 cross-walk. Returns a probabilistic prior at the site coordinates; **does not classify**, only suggests. |
Comment on lines
+379
to
386
| ## ✦ Multimodal extraction (VLM / Gemma 4 / one-liner pipeline) | ||
|
|
||
| ```r | ||
| library(soilKey) | ||
|
|
||
| # One-liner. Local-first; no API key needed; data never leaves your machine. | ||
| res <- classify_from_documents( | ||
| pdf = "perfil_042_descricao.pdf", | ||
| image = "perfil_042_parede.jpg", | ||
| report = "perfil_042.html" # optional self-contained HTML output | ||
| pedon <- extract_pedon_from_pdf( | ||
| "field_survey_2024.pdf", | ||
| vlm_engine = ellmer::chat_ollama("gemma3:4b") | ||
| ) |
| | **Suborder** | 2 636 | **13.85 %** | [12.5 %, 15.2 %] | | ||
| | **Great Group** | 2 633 | **7.94 %** | [7.0 %, 8.9 %] | | ||
| | **Subgroup** | 2 638 | **4.17 %** | [3.5 %, 4.9 %] | | ||
| 26 hand-built canonical fixtures (one per WRB Reference Soil Group, sourced from the WRB 2022 didactic exemplars + ISRIC ISMC monoliths + the Soil Atlas of Europe) achieve **WRB 26 / 26, SiBCS 20 / 20, USDA 26 / 26** at every release. Runs offline in <2 s; gated on every PR. |
Comment on lines
+39
to
+46
| bibentry( | ||
| bibtype = "Article", | ||
| title = "SmartSolos Expert: an expert system for Brazilian soil classification", | ||
| author = c( | ||
| person("Glauber", "J. Vaz"), | ||
| person("L. de F. da", "Silva Neto"), | ||
| person("Jayme", "G. A. Barbedo") | ||
| ), |
Per Hugo's review comment ("não deixe de citar o banco de dados da
AfSIS e do LUCAS no reference list se tivermos realmente usado"):
* Audited code/docs for AfSIS references -- soilKey uses ISRIC
AfSP, NOT AfSIS. README now explicitly notes the distinction
in the AfSP entry.
* Replaced the single Orgiazzi 2018 review citation for LUCAS
with the canonical pair: Fernandez-Ugalde et al. 2022 JRC
Technical Report 130218 (the actual data report, doi
10.2760/215013) PLUS the Orgiazzi 2018 EJSS review
(doi 10.1111/ejss.12499). benchmark_lucas_2018() consumes
the dataset described by the JRC report; cite that one
when reporting LUCAS-based numbers.
* AfSP citation expanded with project URL, ISRIC report
number, and explicit AfSP != AfSIS guard in inst/CITATION,
CITATION.cff, and README.
citation("soilKey") now renders 7 BibTeX entries:
1. soilKey package (Rodrigues 2026)
2. SmartSolos Expert journal (Vaz et al. 2025)
3. SmartSolos REST API conference (Vaz et al. 2019)
4. Redape curated SiBCS data (Vaz et al. 2023)
5. AfSP database (Leenaars et al. 2014, ISRIC)
6. LUCAS-SOIL-2018 data report (Fernandez-Ugalde et al. 2022, JRC)
7. LUCAS Soil review (Orgiazzi et al. 2018, EJSS)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Pure docs / no R code change. Brings the package documentation to a CRAN-submission-ready, fully internationalised, clearly status-tagged state.
README overhaul
classify_via_smartsolos_api,benchmark_redape,load_redape_pedons) to the upstream works to cite.SmartSolos Expert / Vaz et al. citation pass
soilKey's
classify_via_smartsolos_api()wraps Embrapa's authoritative SmartSolos Expert REST API (Vaz et al. 2025) andbenchmark_redape()consumes the Redape curated GeoTab dataset (Vaz et al. 2023, DOI 10.48432/PYKKA7). Three citations now appear in:R/classify-smartsolos.Rtop-of-file comment +@referencesinst/CITATION(now 4 BibTeX entries;citation("soilKey")renders all four)CITATION.cff(underreferences:for GitHub's citation parser)README.md(explicit per-entry-point cite-this guidance)The SmartSolos Expert API URL is now documented in both
classify-smartsolos.Rand the README.Removed from README
R CMD check --as-cran (v0.9.96)
```
Status: 2 NOTEs (both expected for a first CRAN submission)
0 ERRORs / 0 WARNINGs
```
Test plan
R CMD check --as-cranreaches 0 ERROR / 0 WARNING / 2 trivial NOTEsR CMD buildproduces 5.9 MB tarball (under CRAN soft 5MB ceiling)citation("soilKey")renders all 4 BibTeX entries (package + 3 Vaz et al.)