Audience: ML platform engineers and integrators embedding ForgeLM in pipeline orchestrators (Airflow, Prefect, Dagster, Argo, Kubeflow) or invoking it from Jupyter notebooks. This page enumerates every public symbol re-exported from `forgelm`, classifies it by stability tier, and lists the signatures that downstream consumers may pin against.

Mirror: `library_api_reference-tr.md`

Companion guide: `../guides/library_api.md` (three end-to-end worked examples).

Design source: `../design/library_api.md` (Phase 18).

ForgeLM ships a Python library API alongside the `forgelm` console script. The library surface is declared in `forgelm/__init__.py` via `__all__`, lazy-resolved through PEP 562 `__getattr__`, and type-hinted under `TYPE_CHECKING` so downstream `mypy --strict` consumers see real signatures. `forgelm/py.typed` ships in the wheel as the PEP 561 marker.

Three tiers govern the semver weight of every public symbol. A consumer that pins to a specific tier knows what to expect from a forgelm upgrade.

Stable: semver-protected. A breaking change to any signature below requires a major version bump of `__api_version__` (`MAJOR.MINOR.PATCH`; see Versioning and deprecation policy). New optional parameters with defaults are non-breaking; renamed required parameters or removed return-shape fields are breaking.

Stable symbols are documented here, are 100% type-hinted, have at least one integration test under `tests/test_library_api.py`, and follow the deprecation cadence (deprecate in N, keep working in N+1, remove in N+2).

Experimental: best-effort. The shape may change in a minor release without a major bump. Operator-facing copy at the call site flags the symbol's lifecycle status. Pin to a specific minor version if you depend on the current shape.

Internal: anything not in `forgelm.__all__` and not listed in the Public symbols tables. Reach-ins (`from forgelm._http import ...`, `forgelm.cli._run_audit_cmd`) work at the language level but carry zero stability guarantee. File an issue requesting promotion if your pipeline depends on an internal symbol.

The tables below are grouped by concern. Every symbol listed is a real attribute on the live `forgelm` package after `import forgelm`.

| Symbol | Tier | Type | Description |
|---|---|---|---|
| `forgelm.__version__` | Stable | `str` | PEP 396/8 release version, derived from `importlib.metadata` (single source of truth = `pyproject.toml`). |
| `forgelm.__api_version__` | Stable | `str` | Three-segment semver library-API version (`"MAJOR.MINOR.PATCH"`, e.g. `"1.0.0"`). Bumped per the rules in `forgelm/_version.py`: MAJOR for removed/signature-changed stable symbols, MINOR for new stable symbols, PATCH for implementation-only changes. Use for feature detection in downstream code. |

| Symbol | Tier | Signature | Description |
|---|---|---|---|
| `forgelm.load_config` | Stable | `(path: str) -> ForgeConfig` | Parse a YAML file into a validated `ForgeConfig`. Raises `ConfigError` on validation failure. |
| `forgelm.ForgeConfig` | Stable | Pydantic `BaseModel` | Root config schema. Construct directly via `ForgeConfig(**dict_payload)` for in-memory parametric sweeps. |
| `forgelm.ConfigError` | Stable | `Exception` subclass | Raised by `load_config` and `ForgeConfig(**dict)` on validation failure. CLI dispatchers catch it and exit with code 1. |

| Symbol | Tier | Signature | Description |
|---|---|---|---|
| `forgelm.ForgeTrainer` | Stable | `ForgeTrainer(config: ForgeConfig)` | Primary training entry point. Wraps TRL `SFTTrainer` / `DPOTrainer` / `KTOTrainer` / `ORPOTrainer` / `GRPOTrainer` selection. |
| `forgelm.ForgeTrainer.train` | Stable | `train() -> TrainResult` | Run the configured fine-tune. Returns `TrainResult.success` / `metrics` / `final_model_path`. Heavy deps (`torch`, `transformers`, `trl`) load only when this method is called. |
| `forgelm.TrainResult` | Stable | dataclass | Result of `ForgeTrainer.train()`. Canonical fields (per `forgelm/results.py`): `success: bool`, `metrics: Dict[str, float]`, `final_model_path: Optional[str]`, `reverted: bool`, `error: Optional[str]`, `benchmark_scores`, `benchmark_average`, `benchmark_passed`, `safety_passed`, `safety_score`, `safety_categories`, `safety_severity`, `safety_low_confidence`, `judge_score`, `judge_details`, `estimated_cost_usd`, `staging_path`, `resource_usage`. |

| Symbol | Tier | Signature | Description |
|---|---|---|---|
| `forgelm.prepare_dataset` | Experimental | `prepare_dataset(config: ForgeConfig, tokenizer: PreTrainedTokenizer) -> Dict[str, Any]` | Loads, format-detects, and tokenises the configured dataset. Returns a splits dict (e.g. `{"train": ..., "validation": ...}`). The `datasets` minor surface drifts periodically, hence Experimental. |
| `forgelm.get_model_and_tokenizer` | Experimental | `get_model_and_tokenizer(config: ForgeConfig) -> tuple[PreTrainedModel, PreTrainedTokenizerBase]` | Loads HF model + tokenizer with the configured PEFT / quantization setup. |

| Symbol | Tier | Signature | Description |
|---|---|---|---|
| `forgelm.audit_dataset` | Stable | `audit_dataset(source: str, *, output_dir: str \| None = None, near_dup_threshold: int = 3, dedup_method: str = "simhash", minhash_jaccard: float = 0.85, minhash_num_perm: int = 128, enable_quality_filter: bool = False, enable_pii_ml: bool = False, pii_ml_language: str = "en", emit_croissant: bool = False, workers: int = 1) -> AuditReport` | One-call data-audit entry point. Suitable for notebooks and CI gates. |
| `forgelm.AuditReport` | Stable | dataclass | Result of `audit_dataset`. Canonical fields (per `forgelm/data_audit/_types.py`): `generated_at`, `source_path`, `source_input`, `total_samples`, `splits`, `cross_split_overlap` (a dict-shaped row-level field, accessed by key), `pii_summary` (`Dict[str, int]`), `pii_severity`, `near_duplicate_summary` (key `pairs_per_split`, not `pairs`), `secrets_summary` (`Dict[str, int]`), `quality_summary`, `croissant`, `notes`. |
| `forgelm.detect_pii` | Stable | `detect_pii(text: str) -> Dict[str, int]` | Standalone PII detector. Returns a category → count dict (`{"email": 3, "phone": 1, ...}`). No language kwarg; the regex set is language-agnostic. |
| `forgelm.mask_pii` | Stable | `mask_pii(text: str, replacement: str = "[REDACTED]", *, return_counts: bool = False) -> str \| Tuple[str, Dict[str, int]]` | Mask detected PII spans in place. With `return_counts=True`, returns `(masked_text, counts_dict)`. |
| `forgelm.detect_secrets` | Stable | `detect_secrets(text: str) -> Dict[str, int]` | Standalone credential / API-key detector (AWS / GitHub / Slack / OpenAI / Google / JWT / private-key / Azure storage). Returns a family → count dict. |
| `forgelm.mask_secrets` | Stable | `mask_secrets(text: str) -> str` | Mask detected secrets in place. |
| `forgelm.compute_simhash` | Experimental | `compute_simhash(text: str) -> int` | 64-bit SimHash signature. Surface may collapse into a unified `compute_signature(method=...)` in a future release. |
| `forgelm.compute_minhash` | Experimental | `compute_minhash(text: str, *, num_perm: int = 128) -> Optional[Any]` | MinHash LSH signature object (returns `None` when the optional `datasketch` library is missing). Used internally by the `--dedup-method=minhash` audit path; the returned object is library-private, so operators consume it via `audit_dataset(...)` rather than directly. Same Experimental tier as `compute_simhash`; both may collapse into a unified `compute_signature(method=...)` API. |

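
For readers unfamiliar with SimHash, the sketch below shows the general technique behind a 64-bit signature and how an integer threshold can be read as a Hamming distance between two signatures. It is a generic illustration, not ForgeLM's implementation: the whitespace tokenisation, the `blake2b` token hash, and the helper names `simhash64` and `hamming` are all assumptions, as is the reading of `near_dup_threshold` as a Hamming-distance bound.

```python
import hashlib


def simhash64(text: str) -> int:
    """Generic 64-bit SimHash over whitespace tokens (illustrative only)."""
    weights = [0] * 64
    for token in text.lower().split():
        # Hash each token to 64 bits, then vote +1 / -1 per bit position.
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        for bit in range(64):
            weights[bit] += 1 if (h >> bit) & 1 else -1
    signature = 0
    for bit, weight in enumerate(weights):
        if weight > 0:  # a bit is set when the positive votes win
            signature |= 1 << bit
    return signature


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


a = simhash64("the quick brown fox jumps over the lazy dog")
b = simhash64("the quick brown fox jumps over the lazy cat")
c = simhash64("completely unrelated sentence about tensor shapes")
print(hamming(a, b), hamming(a, c))
```

Texts that share most tokens tend to produce signatures a few bits apart, which is why a small integer threshold is a natural knob for near-duplicate detection.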
| Symbol | Tier | Signature | Description |
|---|---|---|---|
| `forgelm.AuditLogger` | Stable | `AuditLogger(output_dir: str, run_id: str \| None = None)` | Append-only Article 12 audit logger. POSIX uses `fcntl.flock`; Windows uses `msvcrt.locking`. Each forked child must construct its own instance. |
| `forgelm.AuditLogger.log_event` | Stable | `log_event(event: str, **fields) -> None` | Append a structured event. The event vocabulary is documented in `audit_event_catalog.md`. |
| `forgelm.verify_audit_log` | Stable | `verify_audit_log(path: str, *, hmac_secret: str \| None = None, require_hmac: bool = False) -> VerifyResult` | Walk the SHA-256 hash chain. Returns `VerifyResult(valid=False, reason=...)` for chain failures (not an exception); raises `OSError` only for unreadable files. |
| `forgelm.VerifyResult` | Stable | dataclass | Canonical fields (per `forgelm/compliance.py:VerifyResult`): `valid: bool`, `entries_count: int`, `first_invalid_index: Optional[int]`, `reason: Optional[str]`. |

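
To make "walk the SHA-256 hash chain" concrete, here is a minimal sketch of the general idea: each entry stores the hash of the previous raw line, and verification recomputes every link and reports the first mismatch instead of raising. The entry layout (a `prev_hash` key, an all-zero genesis value) and the `ChainCheck`, `append_entry`, and `verify_chain` names are assumptions for illustration; ForgeLM's on-disk format and its HMAC handling are not shown here.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional

GENESIS = "0" * 64  # assumed sentinel for the first entry


@dataclass
class ChainCheck:  # hypothetical stand-in; field names borrowed from VerifyResult
    valid: bool
    entries_count: int
    first_invalid_index: Optional[int] = None
    reason: Optional[str] = None


def append_entry(lines: List[str], event: str) -> None:
    """Append an entry whose prev_hash is the SHA-256 of the previous raw line."""
    prev = hashlib.sha256(lines[-1].encode()).hexdigest() if lines else GENESIS
    lines.append(json.dumps({"event": event, "prev_hash": prev}, sort_keys=True))


def verify_chain(lines: List[str]) -> ChainCheck:
    """Recompute each link; report the first mismatch instead of raising."""
    expected = GENESIS
    for index, line in enumerate(lines):
        if json.loads(line).get("prev_hash") != expected:
            return ChainCheck(False, index, index, "prev_hash mismatch")
        expected = hashlib.sha256(line.encode()).hexdigest()
    return ChainCheck(True, len(lines))


log: List[str] = []
for name in ("training.started", "pipeline.completed"):
    append_entry(log, name)
print(verify_chain(log))

# Editing an earlier entry invalidates the link stored in the entry after it.
tampered = [log[0].replace("training.started", "training.skipped"), log[1]]
print(verify_chain(tampered))
```

Returning a result object for chain failures keeps "the log is invalid" (an expected outcome a CI gate can branch on) separate from "the file could not be read" (an exceptional condition).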
| Symbol | Tier | Signature | Description |
|---|---|---|---|
| `forgelm.verify_annex_iv_artifact` | Stable | `verify_annex_iv_artifact(path: str) -> VerifyAnnexIVResult` | Validate an Annex IV technical-documentation bundle (manifest + model card + audit log + governance report). |
| `forgelm.VerifyAnnexIVResult` | Stable | dataclass-like | Canonical attributes: `valid: bool`, `reason: Optional[str]`, `missing_fields: List[str]`, `manifest_hash_actual: Optional[str]`, `manifest_hash_expected: Optional[str]`. (Note: `verify_annex_iv_artifact(path)` accepts a JSON manifest path, not a ZIP; the function calls `json.load` on `path`.) |
| `forgelm.verify_gguf` | Stable | `verify_gguf(path: str) -> VerifyGgufResult` | Validate a GGUF export (header + tensor catalogue + tokenizer block). |
| `forgelm.VerifyGgufResult` | Stable | dataclass-like | Canonical attributes: `valid: bool`, `reason: Optional[str]`, `checks: Dict[str, Any]` (everything else, i.e. the header, tensor catalogue, and tokenizer block, lives inside `checks`). |

| Symbol | Tier | Signature | Description |
|---|---|---|---|
| `forgelm.run_benchmark` | Experimental | `run_benchmark(model, tokenizer, tasks: List[str], num_fewshot: Optional[int] = None, batch_size: str = "auto", limit: Optional[int] = None, output_dir: Optional[str] = None, min_score: Optional[float] = None) -> BenchmarkResult` | Wraps lm-eval-harness. Requires the `[eval]` extra. |
| `forgelm.BenchmarkResult` | Experimental | dataclass-like | Canonical fields: `scores: Dict[str, float]`, `average_score: float`, `passed: bool`, `failure_reason: Optional[str]`, `raw_results: Dict[str, Any]`. |
| `forgelm.SyntheticDataGenerator` | Experimental | `SyntheticDataGenerator(config: ForgeConfig)` | Teacher-distillation generator. The `teacher_backend in {"api", "local", "file"}` switch will likely grow new modes. |

| Symbol | Tier | Signature | Description |
|---|---|---|---|
| `forgelm.WebhookNotifier` | Experimental | `WebhookNotifier(config: ForgeConfig)` | Slack / Teams / generic-HTTP lifecycle notifications. Constructor schema may grow ISO/SOC 2 fields in a future release. |
| `forgelm.setup_authentication` | Experimental | `setup_authentication(token: Optional[str] = None) -> None` | Wrapper around `huggingface_hub.login`. Reads the `HUGGINGFACE_TOKEN` env var when `token` is `None`. |
| `forgelm.manage_checkpoints` | Experimental | `manage_checkpoints(checkpoint_dir: str, action: str = "keep") -> None` | Apply checkpoint-retention behaviour against an output directory. `action` controls retain/prune semantics. |

Worked snippets covering the most common library-mode entry points. All imports below resolve directly via `from forgelm import ...`.

```python
from forgelm import audit_dataset

# Audit a JSONL dataset with ML-based PII detection and Croissant output enabled.
report = audit_dataset(
    "data/customer_support.jsonl",
    output_dir="audit_out",
    enable_pii_ml=True,
    pii_ml_language="en",
    emit_croissant=True,
)
print(f"samples: {report.total_samples}")
# near_duplicate_summary is keyed by "pairs_per_split" (not "pairs").
print(f"near-duplicate pairs: {report.near_duplicate_summary.get('pairs_per_split', {})}")
# pii_summary is a category -> count dict.
print(f"pii findings: {sum(report.pii_summary.values())}")
# cross_split_overlap is dict[str, Any]; access by key.
print(f"split overlap pairs: {report.cross_split_overlap.get('pairs', {})}")
```

```python
from forgelm import verify_audit_log

# Chain failures are reported through result.valid rather than raised.
result = verify_audit_log(
    "outputs/run-001/audit_log.jsonl",
    hmac_secret=None,
    require_hmac=True,
)
if not result.valid:
    raise SystemExit(f"audit chain broken: {result.reason}")
print(f"verified {result.entries_count} entries")
```

```python
from forgelm import ForgeConfig, ForgeTrainer

# Build the config in memory instead of loading YAML.
config = ForgeConfig(
    model={"name_or_path": "TinyLlama/TinyLlama-1.1B-Chat-v1.0"},
    lora={"r": 8, "alpha": 16, "target_modules": ["q_proj", "v_proj"]},
    data={"dataset_name_or_path": "data/train.jsonl"},
    training={
        "trainer_type": "sft",
        "num_train_epochs": 1,
        "per_device_train_batch_size": 1,
        "output_dir": "./checkpoints/quick",
    },
)
trainer = ForgeTrainer(config)
result = trainer.train()
print(f"success={result.success} output={result.final_model_path}")
# TrainResult exposes reverted (bool) and error (Optional[str]).
if not result.success and result.reverted:
    print(f"reverted: {result.error}")
```

Of the keys above, `model.name_or_path`, the `lora:` block, `data.dataset_name_or_path`, and `training.{trainer_type, output_dir}` are required by the Pydantic schema; everything else falls back to `forgelm/config.py` defaults. `num_epochs` / `batch_size` are not the canonical names and would raise `ValidationError`.

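
The `ForgeConfig(**dict_payload)` route mentioned in the config table lends itself to parametric sweeps. The sketch below only builds the dict payloads, so it runs without ForgeLM installed; the swept axes (LoRA `r` and `alpha`), the helper name `sweep_payloads`, and the output-directory naming are illustrative assumptions rather than anything the library prescribes.

```python
import copy
import itertools
from typing import Any, Dict, Iterator

# Base payload using only the keys the section above names as required.
BASE: Dict[str, Any] = {
    "model": {"name_or_path": "TinyLlama/TinyLlama-1.1B-Chat-v1.0"},
    "lora": {"r": 8, "alpha": 16, "target_modules": ["q_proj", "v_proj"]},
    "data": {"dataset_name_or_path": "data/train.jsonl"},
    "training": {"trainer_type": "sft", "output_dir": "./checkpoints/quick"},
}


def sweep_payloads(ranks, alphas) -> Iterator[Dict[str, Any]]:
    """Yield one independent payload per (r, alpha) combination."""
    for r, alpha in itertools.product(ranks, alphas):
        payload = copy.deepcopy(BASE)  # never mutate the shared base
        payload["lora"].update({"r": r, "alpha": alpha})
        payload["training"]["output_dir"] = f"./checkpoints/r{r}-a{alpha}"
        yield payload


payloads = list(sweep_payloads(ranks=[4, 8], alphas=[16, 32]))
for payload in payloads:
    # With ForgeLM installed, each payload would be passed as ForgeConfig(**payload).
    print(payload["training"]["output_dir"], payload["lora"])
```

Deep-copying the base matters because the nested `lora` and `training` dicts would otherwise be shared between runs.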
```python
import os
from forgelm import AuditLogger

# Operator identity for the audit trail; setdefault keeps any value already exported.
os.environ.setdefault("FORGELM_OPERATOR", "airflow:dag-1234:run-5678")
logger = AuditLogger(output_dir="outputs/dag-1234")
logger.log_event(
    "training.started",
    trainer_type="sft",
    model="meta-llama/Llama-3.1-8B-Instruct",
    dataset="acme/customer-support-v3",
)
# ... your pipeline runs ...
logger.log_event(
    "pipeline.completed",
    exit_code=0,
    duration_seconds=4218.7,
    success=True,
    metrics_summary={"eval_loss": 0.43, "rouge_l": 0.61},
)
```

```python
from forgelm import detect_pii, detect_secrets, mask_pii, mask_secrets

text = "Contact alice@example.com or use AKIAIOSFODNN7EXAMPLE for the call."
# Both detectors return a {category: count} dict, not a list of findings.
pii = detect_pii(text)
secrets = detect_secrets(text)
print(f"pii: {pii}")
print(f"secrets: {secrets}")
# Mask PII first, then secrets; each call returns a new string.
masked = mask_secrets(mask_pii(text))
print(masked)
```

Importing the package facade is cheap by contract: `import forgelm` does not load `torch`, `transformers`, `trl`, `datasets`, `peft`, or any other heavy ML dependency. Only `importlib.metadata` and a tiny module-level state dict are touched.

Heavy attributes resolve on first access via PEP 562 `__getattr__`:

```python
import sys
import forgelm

assert "torch" not in sys.modules        # contract, pinned in CI
_ = forgelm.ForgeTrainer                  # imports forgelm.trainer, but
assert "torch" not in sys.modules        # forgelm.trainer also defers torch
trainer = forgelm.ForgeTrainer(config)    # constructor still cheap (config is a ForgeConfig)
result = trainer.train()                  # NOW torch loads
```

This invariant exists because lightweight CI runners, `forgelm doctor`, and `python -m forgelm.cli --help` must respond instantly. `tests/test_library_api.py::test_lazy_import_no_torch` regression-pins it.

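
The lazy-resolution pattern itself can be sketched with the standard library alone. The example below is a generic PEP 562 illustration, not the contents of `forgelm/__init__.py`: the `facade` module object, the `_LAZY` map, and the stdlib targets standing in for heavy dependencies are all assumptions made so the snippet runs on its own.

```python
import importlib
import types

# Map of public name -> (module, attribute); stdlib targets stand in for heavy deps.
_LAZY = {"rgb_to_hsv": ("colorsys", "rgb_to_hsv"), "mean": ("statistics", "mean")}
resolved = []  # records which names actually triggered the hook

facade = types.ModuleType("facade")  # hypothetical package facade
facade.__all__ = sorted(_LAZY)


def _lazy_getattr(name: str):
    """Called only when normal attribute lookup on the module fails."""
    if name not in _LAZY:
        raise AttributeError(f"module 'facade' has no attribute {name!r}")
    module_name, attr = _LAZY[name]
    value = getattr(importlib.import_module(module_name), attr)
    setattr(facade, name, value)  # cache so later lookups bypass this hook
    resolved.append(name)
    return value


facade.__getattr__ = _lazy_getattr  # the PEP 562 hook lives in the module namespace

print(resolved)                       # nothing has been resolved yet
print(facade.mean([1, 2, 3]))         # first access goes through the hook
print(facade.mean([4, 6]), resolved)  # second access is served from the cache
```

Caching the resolved attribute on the module means the hook's cost is paid once per name, while names that are never touched never trigger their import.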
| Symbol | Multi-threaded? | Fork-safe after construction? |
|---|---|---|
| `ForgeTrainer.train()` | No: TRL holds GPU state | No |
| `audit_dataset()` | Yes: each call is self-contained | Yes |
| `AuditLogger.log_event()` | Yes: `flock` on POSIX, `msvcrt.locking` on Windows | Construct a fresh logger per child; sharing handles across forks is unsupported |
| `verify_audit_log()` | Yes: read-only | Yes |
| `WebhookNotifier.notify_*()` | Yes: each call opens its own `requests` session | Yes |

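
As background for the locking column, the following sketch shows the general POSIX pattern of taking an exclusive `flock` around each append so that concurrent writers never interleave partial lines. It is not `AuditLogger`'s code: the `append_event` helper and the record shape are hypothetical, no hash chaining is performed, and `fcntl` is unavailable on Windows.

```python
import fcntl  # POSIX only; the table above names msvcrt.locking as the Windows counterpart
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def append_event(path: str, event: str, **fields) -> None:
    """Open a fresh handle, lock it, append one JSON line, then unlock."""
    with open(path, "a", encoding="utf-8") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until the exclusive lock is held
        try:
            handle.write(json.dumps({"event": event, **fields}) + "\n")
            handle.flush()  # push the line out before releasing the lock
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


log_path = os.path.join(tempfile.mkdtemp(), "audit_log.jsonl")
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(lambda i: append_event(log_path, "demo.event", seq=i), range(50)))

with open(log_path, encoding="utf-8") as handle:
    records = [json.loads(line) for line in handle]
print(len(records))
```

Opening a new handle per call sidesteps the shared-handle problem the table warns about: nothing opened before a fork is reused in the child.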
CLI dispatchers map exceptions to public exit codes (0/1/2/3/4). Library callers do not see exit codes; typed exceptions propagate instead.

| Symbol | Errors propagated as |
|---|---|
| `ForgeTrainer.train()` | `ConfigError` (validation), `RuntimeError` (CUDA / training-loop), `OSError` (I/O) |
| `audit_dataset()` | `ValueError` (invalid args), `OSError` (I/O), `OptionalDependencyError` (missing extra) |
| `verify_audit_log()` | Returns `VerifyResult(valid=False, reason=...)` for chain failures; raises `OSError` only for unreadable files |
| `AuditLogger.log_event()` | `OSError` on write failure (caller decides retry vs abort) |

Library code never calls sys.exit. Every exit-code mapping lives in CLI dispatchers.
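
Because exceptions rather than exit codes reach library callers, an orchestrator task can choose its own policy per exception type. The sketch below is one possible shape, using only the built-in exception types named in the table; the `run_step` helper, the outcome labels, and the retry-on-`OSError` policy are assumptions, and ForgeLM-specific types such as `ConfigError` are deliberately left out so the snippet runs standalone.

```python
from typing import Callable, Tuple


def run_step(step: Callable[[], object], max_io_retries: int = 2) -> Tuple[str, object]:
    """Run a library call and translate exceptions into orchestrator outcomes."""
    attempts = 0
    while True:
        try:
            return "ok", step()
        except ValueError as exc:    # invalid arguments: retrying cannot help
            return "failed", str(exc)
        except OSError as exc:       # I/O trouble: possibly transient, so retry
            attempts += 1
            if attempts > max_io_retries:
                return "failed", str(exc)
        except RuntimeError as exc:  # e.g. CUDA / training-loop errors
            return "failed", str(exc)


calls = {"n": 0}


def flaky_audit() -> str:
    """Hypothetical stand-in for a library call that fails once with OSError."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise OSError("disk briefly unavailable")
    return "report"


print(run_step(flaky_audit))
print(run_step(lambda: int("not-a-number")))
```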
`import forgelm` does not call `logging.basicConfig()`. Configure the consumer logger explicitly:

```python
import logging

logging.getLogger("forgelm").setLevel(logging.WARNING)  # quiet by default in libraries
```

The CLI does its own setup in `forgelm.cli._setup_logging`; the library leaves it to the caller (PEP 8 / logging HOWTO library hygiene).

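
A slightly fuller consumer-side setup might route the library's records to the pipeline's own sink. This is a sketch using only the standard `logging` module; it assumes the library logs under module-named children of `"forgelm"` (for example `"forgelm.trainer"`), and the `StringIO` sink and format string are placeholders.

```python
import io
import logging

stream = io.StringIO()  # stands in for the pipeline's real log sink
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

forgelm_logger = logging.getLogger("forgelm")
forgelm_logger.addHandler(handler)
forgelm_logger.setLevel(logging.WARNING)
forgelm_logger.propagate = False  # keep records out of the root logger's handlers

# Child loggers inherit the effective level and reach the parent's handler.
child = logging.getLogger("forgelm.trainer")
child.info("suppressed at WARNING level")
child.warning("checkpoint directory is almost full")
print(stream.getvalue(), end="")
```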
Two independent version strings:
| Variable | Bumps when... | Read by... |
|---|---|---|
| `forgelm.__version__` | Every release (CLI fix, library fix, doc-only release) | Downstream pinning, audit manifest stamp |
| `forgelm.__api_version__` | A stable-tier signature changes | Downstream feature detection |

`__api_version__` is a three-segment semver string (`"1.0.0"` at v0.5.5, the first publication of the formal Phase 19 contract). Bump rules live in `forgelm/_version.py`: MAJOR on removed / signature-changed stable symbols, MINOR on new stable symbols, PATCH on implementation-only changes that don't touch the public surface.

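
Downstream feature detection against this string could look like the sketch below. The helper names `parse_api_version` and `supports` are hypothetical, and the policy (same MAJOR, MINOR at least the required one, PATCH ignored) is one reasonable reading of the bump rules rather than an official ForgeLM utility.

```python
from typing import Tuple


def parse_api_version(value: str) -> Tuple[int, int, int]:
    """Split a "MAJOR.MINOR.PATCH" string into integers."""
    major, minor, patch = (int(part) for part in value.split("."))
    return major, minor, patch


def supports(api_version: str, required: str) -> bool:
    """True when MAJOR matches and MINOR is at least the required MINOR."""
    have, need = parse_api_version(api_version), parse_api_version(required)
    return have[0] == need[0] and have[1] >= need[1]


# In real code the first argument would be forgelm.__api_version__.
print(supports("1.2.0", "1.1.0"))
print(supports("2.0.0", "1.1.0"))
```

Comparing parsed integers avoids the string-comparison trap where `"1.10.0"` sorts before `"1.9.0"`.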
Deprecation cadence (per docs/standards/release.md):
- Mark the old symbol with a `DeprecationWarning` in release `N`. The warning must include the replacement symbol name and the planned-removal version.
- Keep it working in release `N+1`.
- Remove in release `N+2`.

A breaking change to a stable signature without following the cadence is a release-process bug.
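
A warning that satisfies the first step of the cadence can be emitted with a small decorator. The sketch below is illustrative only: the `deprecated` decorator and the `token_count` / `count_tokens` pair are hypothetical names, not ForgeLM helpers, and the removal version is passed as a plain string.

```python
import functools
import warnings


def deprecated(replacement: str, removal_version: str):
    """Decorator that warns on every call, naming the replacement and removal version."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead. "
                f"Planned removal: {removal_version}.",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)
        return wrapper
    return decorate


def count_tokens(text: str) -> int:
    """Hypothetical replacement symbol."""
    return len(text.split())


@deprecated(replacement="count_tokens", removal_version="N+2")
def token_count(text: str) -> int:  # hypothetical old name, kept working for now
    return count_tokens(text)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # DeprecationWarning is often hidden by default
    print(token_count("a b c"), caught[0].message)
```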

- `../guides/library_api.md`: three end-to-end worked examples.
- `audit_event_catalog.md`: full event vocabulary `AuditLogger.log_event` accepts.
- `configuration.md`: `ForgeConfig` field reference.
- `../design/library_api.md`: Phase 18 design + 16-row Phase 19 task plan.
- `../standards/release.md`: deprecation cadence and release process.