feat: implement strict type validation using strict from huggingface_hub#393
feat: implement strict type validation using strict from huggingface_hub#393
strict from huggingface_hub#393Conversation
| from kernels.compat import has_torch | ||
|
|
||
|
|
||
| @runtime_checkable |
There was a problem hiding this comment.
Because @strict validates field types using isinstance() checks at construction time. When Arch.backend is typed as Backend (a Protocol), @strict tries to do isinstance(value, Backend) — and Python's typing.Protocol raises TypeError unless it's decorated with @runtime_checkable.
| verified: bool | None = None # None = no verify fn, True = passed, False = failed | ||
| ref_mean_ms: float | None = None # Reference implementation mean time | ||
|
|
||
| def validate_iterations(self): |
There was a problem hiding this comment.
These are "magic" methods. @strict auto-discovers any method named validate_* and calls it during __init__ (and on __setattr__). They replaced the manual checks that were previously inside from_dict().
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
| python-self.httpx | ||
| python-self.shellingham | ||
| python-self.typer-slim |
There was a problem hiding this comment.
Can add their hash as well.
There was a problem hiding this comment.
Not needed if we use them from nixpkgs as-is.
danieldk
left a comment
There was a problem hiding this comment.
Nice!!! Added a bunch of comments.
| version=data.get("version"), | ||
| license=data.get("license"), | ||
| backends=data.get("backends"), | ||
| hub=HubConfig.from_dict(hub_data) if hub_data else None, |
There was a problem hiding this comment.
Note on the discussion from Slack, this is one thing that goes wrong with hand-rolled deserialization. What if the user put in something that makes hub_data truthy, but not a dict. Then HubConfig.from_dict will fail with a hard-to-understand error, whereas it should just say what the field/section is expected to be.
You can avoid this by programming very defensively (e.g. here checking that it is a dict first), but those things you get for free from a library that does deserialization for you.
I think for now it's ok, since we are going to replace this deserialization with Rust as discussed.
| raise ValueError( | ||
| f"min_capability ({self.min_capability}) must be <= max_capability ({self.max_capability})" | ||
| ) |
| python-self.httpx | ||
| python-self.shellingham | ||
| python-self.typer-slim |
There was a problem hiding this comment.
Not needed if we use them from nixpkgs as-is.
Co-authored-by: Daniël de Kok <me@danieldk.eu>
danieldk
left a comment
There was a problem hiding this comment.
Great work! Nice to have all the extra validation!
As discussed internally.
Note that I have chosen NOT to use
pydanticfor this as per https://huggingface.co/docs/huggingface_hub/package_reference/dataclasses#why-not-use-pydantic--or-attrs--or-marshmallowdataclass-. Plus, we don't have to add a new dependency and get away with what we already have (which is always nice). But I can switch topydanticshould we mutually decide.Some inline comments.
Notes
huggingface_hubminimum version to use@strict.@strictcan't validate packaging.version.Version (a third-party type). So, I had to remove@strictfrom theTorchclass, for example, invariants.py. If we decided to go the@strictway, we could follow up with the Hub team for adding support.overlay.nixdue to the version bump in Hugging Face Hub. I think that they should be quite safe.