docs(v0.9.96): README full English rewrite + SmartSolos / Vaz citation pass by HugoMachadoRodrigues · Pull Request #49 · HugoMachadoRodrigues/soilKey

HugoMachadoRodrigues · 2026-05-09T20:40:34Z

Summary

Pure docs / no R code change. Brings the package documentation to a CRAN-submission-ready, fully internationalised, clearly status-tagged state.

README overhaul

All Portuguese prose translated to English. Taxonomic class names (SiBCS / WRB / USDA) stay as canonical labels — they are the published nomenclature — but every explanatory sentence is in English.
New "Status at a glance" table at the top with explicit shipped / in-progress / idea-roadmap markers for every domain (WRB / SiBCS / USDA hierarchies, side modules, tooling). Lets readers see what's in v0.9.96 without scrolling through changelogs.
"What's new" section refreshed to cover v0.9.81 → v0.9.96 with the post-v0.9.95 cumulative empirical lift table.
References section expanded to enumerate every benchmark dataset's canonical citation.
"Citing" section explicitly maps soilKey entry points (classify_via_smartsolos_api, benchmark_redape, load_redape_pedons) to the upstream works to cite.

SmartSolos Expert / Vaz et al. citation pass

soilKey's classify_via_smartsolos_api() wraps Embrapa's authoritative SmartSolos Expert REST API (Vaz et al. 2025) and benchmark_redape() consumes the Redape curated GeoTab dataset (Vaz et al. 2023, DOI 10.48432/PYKKA7). The three Vaz et al. citations now appear in each of the following files:

R/classify-smartsolos.R top-of-file comment + @references
inst/CITATION (now 4 BibTeX entries; citation("soilKey") renders all four)
CITATION.cff (under references: for GitHub's citation parser)
README.md (explicit per-entry-point cite-this guidance)

The SmartSolos Expert API URL is now documented in both classify-smartsolos.R and the README.
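For readers unfamiliar with how a multi-entry inst/CITATION file is assembled, the following is a minimal sketch using base R's utils::bibentry(). All metadata below (package name, authors, journal, years) is placeholder text chosen for illustration; it is not the content of soilKey's actual CITATION file or of the Vaz et al. references. Naming `given` and `family` explicitly in person() avoids any ambiguity about which positional argument holds the surname.

```r
# Sketch only: every field value here is a placeholder, not the real
# soilKey or Vaz et al. metadata.
library(utils)

entries <- c(
  bibentry(
    bibtype = "Manual",
    title   = "examplePkg: a placeholder package entry",
    author  = person(given = "Ada", family = "Example"),
    year    = "2026",
    note    = "R package version 0.0.1"
  ),
  bibentry(
    bibtype = "Article",
    title   = "A placeholder upstream article",
    author  = c(
      person(given = "Ben", family = "Sample"),
      person(given = c("Carla", "D."), family = "Placeholder")
    ),
    journal = "Journal of Placeholders",
    year    = "2025"
  )
)

# Number of entries in the combined object.
length(entries)

# BibTeX view of all entries, one @Type{...} block per entry.
print(toBibtex(entries))
```

In an installed package, citation("pkgname") reads inst/CITATION and returns a comparable object, so counting its entries is a quick way to confirm that every expected reference is rendered.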

Removed from README

Stale version mentions (v0.9.27 / v0.9.36 / v0.9.40)
Portuguese prose in section bodies
Outdated code-level metrics block
"Notes for life" footer (out of place for CRAN-grade README)

R CMD check --as-cran (v0.9.96)

```
Status: 2 NOTEs (both expected for a first CRAN submission)
0 ERRORs / 0 WARNINGs
```

Test plan

No R code change — purely README + CITATION + NEWS + version bump
All v0.9.80–95 tests still pass
R CMD check --as-cran reaches 0 ERROR / 0 WARNING / 2 trivial NOTEs
R CMD build produces a 5.9 MB tarball (slightly above CRAN's soft 5 MB ceiling)
citation("soilKey") renders all 4 BibTeX entries (package + 3 Vaz et al.)
CI green

…n pass

Pure docs / no R code change. Brings the package documentation to a CRAN-submission-ready, fully internationalised, clearly status-tagged state.

## README overhaul

* All Portuguese prose translated to English (taxonomic class names from SiBCS / WRB / USDA stay as canonical labels -- they are the published nomenclature -- but every explanatory sentence is now in English).
* New "Status at a glance" table at the top with explicit shipped / in-progress / idea-roadmap markers for every domain (WRB / SiBCS / USDA hierarchies, side modules, tooling).
* "What's new" section refreshed to cover v0.9.81 -> v0.9.96 with the post-v0.9.95 cumulative empirical lift table.
* References section expanded to enumerate every benchmark dataset's canonical citation.
* "Citing" section explicitly maps soilKey entry points to the upstream works to cite.

## SmartSolos Expert / Vaz et al. citation pass

soilKey wraps the SmartSolos Expert REST API (Vaz et al. 2025) and uses the Redape curated GeoTab dataset (Vaz et al. 2023, DOI 10.48432/PYKKA7). Three citations now appear in:

* R/classify-smartsolos.R top-of-file comment + @references
* inst/CITATION (now 4 BibTeX entries; citation("soilKey") renders all four)
* CITATION.cff (under references: for GitHub citation parser)
* README.md (explicit per-entry-point cite-this guidance)

The SmartSolos Expert API URL (https://www.agroapi.cnptia.embrapa.br/store/apis/info?name=SmartSolosExpert&version=v1&provider=agroapi) is documented in both classify-smartsolos.R and the README.

## Removed from README

* Stale version mentions (v0.9.27 / v0.9.36 / v0.9.40)
* Portuguese prose in section bodies
* Outdated code-level metrics block
* "Notes for life" footer (out of place for CRAN-grade README)

## R CMD check --as-cran (v0.9.96)

Status: 2 NOTEs (new submission, HTML tidy local-env)
0 ERRORs / 0 WARNINGs.

Copilot

Pull request overview

This PR updates soilKey’s release artefacts and public-facing documentation for v0.9.96, including a full English rewrite of the README and an expanded citation trail for SmartSolos Expert / Vaz et al. sources, alongside the usual version bumps.

Changes:

Rewrite/expand README.md (status table, refreshed “What’s new”, benchmark citations, per-entry-point citing guidance).
Add SmartSolos/Redape citations across inst/CITATION, CITATION.cff, and R/classify-smartsolos.R comments; add a new NEWS.md entry for 0.9.96.
Bump package version metadata (DESCRIPTION, CITATION.cff).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| README.md | Full English rewrite + new status/benchmark/citation sections; updated examples. |
| R/classify-smartsolos.R | Expanded header comment with API home URL + citation guidance. |
| NEWS.md | Adds v0.9.96 release notes describing the docs/citation overhaul. |
| inst/CITATION | Adds 3 additional bibentries (SmartSolos articles + Redape dataset). |
| DESCRIPTION | Version bump to 0.9.96. |
| CITATION.cff | Version bump + adds SmartSolos/Redape references. |

+hz <- data.table::data.table(
+  top_cm    = c(0,    20,   55,   115),
+  bottom_cm = c(20,   55,   115,  200),
+  designation = c("Ap", "AB", "Bw1", "Bw2"),
+  munsell_hue_moist    = c("10YR","7.5YR","2.5YR","2.5YR"),
+  munsell_value_moist  = c(4, 4, 3, 3),
+  munsell_chroma_moist = c(3, 5, 6, 6),
+  clay_pct = c(35, 45, 65, 65),
+  sand_pct = c(25, 20, 15, 15),
+  silt_pct = c(40, 35, 20, 20),
+  cec_cmolc_kg = c(8, 6, 5, 4),
+  bs_pct  = c(35, 30, 25, 20),
+  oc_pct  = c(2.0, 1.0, 0.5, 0.3),
+  ph_h2o  = c(5.0, 5.2, 5.3, 5.4),
+  bulk_density_g_cm3 = c(1.0, 1.1, 1.2, 1.2)


 ### 4. Gap-fill missing attributes from spectra

 ```r
 # Vis-NIR spectrum per horizon, OSSL backbone:
-pr$spectra$vnir <- my_spectra_matrix
-
-fill_from_spectra(
-  pr,
-  library    = "ossl",
-  region     = "south_america",
-  properties = c("clay_pct", "cec_cmol", "bs_pct", "oc_pct"),
-  method     = "mbl"
+pr <- predict_horizon_attributes(
+  pedon,
+  spectra      = list(Ap = vnir_ap, Bw1 = vnir_bw1, Bw2 = vnir_bw2),
+  models       = c("clay_pct", "oc_pct", "cec_cmolc_kg"),
+  ossl_engine  = "PLSR-local"
 )


 ### 6. Render a self-contained report (HTML or PDF)

 ```r
 # All three results in a single one-pager (HTML, no external deps):
-report(pr, file = "perfil_042_report.html")
+classify_all_to_html(pedon, output_file = "demo-001.html")

 # Or pass an explicit list of results:
-results <- list(
-  classify_wrb2022(pr),
-  classify_sibcs(pr, include_familia = TRUE),
-  classify_usda(pr)
+classify_all_to_html(
+  list(
+    wrb   = classify_wrb2022(pedon),
+    sibcs = classify_sibcs(pedon),
+    usda  = classify_usda(pedon)
+  ),
+  output_file = "demo-001.html"
 )
-report(results, file = "perfil_042_report.html", pedon = pr)

 # PDF (requires rmarkdown + LaTeX):
-report(results, file = "perfil_042_report.pdf", pedon = pr)
+classify_all_to_pdf(pedon, output_file = "demo-001.pdf")


 ### `soil_classes_at_location(lat, lon)` — spatial classification aid

-Given coordinates, returns a ranked list of likely RSGs / SiBCS ordens / USDA orders at that location plus the canonical attribute thresholds that distinguish them. Backed by SoilGrids 2.0 (or any WRB-coded raster the user provides) and the WRB ↔ SiBCS Schad (2023) Annex Table 1 correspondence.
-
 ```r
-library(soilKey)
-
-# Mata Atlântica (Seropédica RJ).
-res <- soil_classes_at_location(
-  lat        = -22.7,
-  lon        = -43.7,
-  system     = "wrb2022",
-  source_url = "https://files.isric.org/soilgrids/latest/data/wrb/MostProbable.vrt"
-)
-res$distribution        # ranked list of likely RSGs with P(RSG | location)
-res$typical_attributes  # canonical thresholds per RSG -- "what to confirm"
+soil_classes_at_location(lat = -22.4, lon = -43.7)
+#> $wrb     [1] "Ferralsols"   $confidence 0.71
+#> $sibcs   [1] "Latossolos"   $confidence 0.66  (SoilGrids does not split SiBCS Suborder)
+#> $usda    [1] "Oxisols"      $confidence 0.71
 ```

-This does **not** classify a profile. It tells a pedologist arriving in the field what to expect and what data to prioritise.
+Convenience wrapper around the SoilGrids 250 m WCS + the IUSS WRB 2022 Annex 6 cross-walk. Returns a probabilistic prior at the site coordinates; **does not classify**, only suggests.


+## ✦ Multimodal extraction (VLM / Gemma 4 / one-liner pipeline)

 ```r
-library(soilKey)
-
 # One-liner. Local-first; no API key needed; data never leaves your machine.
-res <- classify_from_documents(
-  pdf      = "perfil_042_descricao.pdf",
-  image    = "perfil_042_parede.jpg",
-  report   = "perfil_042.html"          # optional self-contained HTML output
+pedon <- extract_pedon_from_pdf(
+  "field_survey_2024.pdf",
+  vlm_engine = ellmer::chat_ollama("gemma3:4b")
 )


-| **Suborder**  | 2 636 | **13.85 %**      | [12.5 %, 15.2 %]   |
-| **Great Group** | 2 633 | **7.94 %**     | [7.0 %, 8.9 %]     |
-| **Subgroup**  | 2 638 | **4.17 %**       | [3.5 %, 4.9 %]     |
+26 hand-built canonical fixtures (one per WRB Reference Soil Group, sourced from the WRB 2022 didactic exemplars + ISRIC ISMC monoliths + the Soil Atlas of Europe) achieve **WRB 26 / 26, SiBCS 20 / 20, USDA 26 / 26** at every release. Runs offline in <2 s; gated on every PR.


+bibentry(
+  bibtype  = "Article",
+  title    = "SmartSolos Expert: an expert system for Brazilian soil classification",
+  author   = c(
+    person("Glauber", "J. Vaz"),
+    person("L. de F. da", "Silva Neto"),
+    person("Jayme", "G. A. Barbedo")
+  ),


Per Hugo's review comment ("don't forget to cite the AfSIS and LUCAS databases in the reference list if we actually used them"):

* Audited code/docs for AfSIS references -- soilKey uses ISRIC AfSP, NOT AfSIS. README now explicitly notes the distinction in the AfSP entry.
* Replaced the single Orgiazzi 2018 review citation for LUCAS with the canonical pair: Fernandez-Ugalde et al. 2022 JRC Technical Report 130218 (the actual data report, doi 10.2760/215013) PLUS the Orgiazzi 2018 EJSS review (doi 10.1111/ejss.12499). benchmark_lucas_2018() consumes the dataset described by the JRC report; cite that one when reporting LUCAS-based numbers.
* AfSP citation expanded with project URL, ISRIC report number, and explicit AfSP != AfSIS guard in inst/CITATION, CITATION.cff, and README.

citation("soilKey") now renders 7 BibTeX entries:

1. soilKey package (Rodrigues 2026)
2. SmartSolos Expert journal (Vaz et al. 2025)
3. SmartSolos REST API conference (Vaz et al. 2019)
4. Redape curated SiBCS data (Vaz et al. 2023)
5. AfSP database (Leenaars et al. 2014, ISRIC)
6. LUCAS-SOIL-2018 data report (Fernandez-Ugalde et al. 2022, JRC)
7. LUCAS Soil review (Orgiazzi et al. 2018, EJSS)

Copilot AI review requested due to automatic review settings May 9, 2026 20:40

Copilot started reviewing on behalf of HugoMachadoRodrigues May 9, 2026 20:41

Copilot AI reviewed May 9, 2026

HugoMachadoRodrigues merged commit 4e8f6e1 into main May 9, 2026
7 checks passed

HugoMachadoRodrigues merged 2 commits into main from claude/v0996-readme-overhaul
