Conversation
534af0b to
368adad
Compare
|
Some high-level thoughts first:
Should we somehow inform the users about it and take confirmation before proceeding?
Should this be Find the kernel at: https://hf.co/drbh/some-kernel/tree/{version}? This way, users can directly use that link to inspect files, etc. |
sayakpaul
left a comment
There was a problem hiding this comment.
Very minor comments. No RUST expertise yet.
template/CARD.md
Outdated
| @@ -0,0 +1,38 @@ | |||
| --- | |||
There was a problem hiding this comment.
Do we want to maintain two copies? We have one in https://github.com/huggingface/kernels/blob/main/kernels/src/kernels/cli/card_template.md
kernel-builder/src/build.rs
Outdated
|
|
||
| /// Nix flake target to run. | ||
| #[arg(long, default_value = "build-and-copy")] | ||
| pub target: BuildTarget, |
There was a problem hiding this comment.
I think top-level
kernel-builder build
kernel-builder build-and-copy
kernel-builder build-and-upload
are nicer, easier to discover. I think we can use #[command(flatten)] to share arguments between different build commands, something like:
// ...
Build {
#[command(flatten)]
pub common_build_args: CommonBuildArgs,
// ...
}
There was a problem hiding this comment.
makes sense, updated in latest and added those three commands at the top level
kernel-builder/src/build.rs
Outdated
|
|
||
| /// Additional arguments passed through to `nix run`. | ||
| #[arg(last = true)] | ||
| pub nix_args: Vec<String>, |
There was a problem hiding this comment.
yea I guess if the user wants more advance flags they should call nix run directly. removed in latest
| let bare_name = path.file_name().and_then(OsStr::to_str).ok_or_else(|| { | ||
| eyre::eyre!("Cannot determine directory name from `{path_str}`") | ||
| })?; | ||
| let owner = hf::whoami_username()?; |
kernel-builder/src/init.rs
Outdated
| for (from, to) in replacements { | ||
| text = text.replace(from, to); | ||
| } | ||
| fs::write(&destination, text).wrap_err_with(|| { |
There was a problem hiding this comment.
I possible (not sure if it supports everything needed), I think it would be nice to use the fileset API that is also used for generating pyproject files. The nice thing is that it is atomic, if something fails while still preparing the output, nothing is written.
I think we would have to extend it with the notion of files that should not be overwritten (if a project already exists and you only want to add new files). But I think making it impossible to have an operation half-completed is a nice feature.
There was a problem hiding this comment.
oh yea thats a nice improvement, updated to use the file set api in the latest changes. this is a better ux and pattern so we avoid getting into a strange state
kernel-builder/src/init.rs
Outdated
| .wrap_err_with(|| format!("Cannot parse TOML in `{}`", build_toml_path.display()))?; | ||
|
|
||
| // Update [general].backends | ||
| if let Some(general) = document.get_mut("general").and_then(|v| v.as_table_mut()) { |
There was a problem hiding this comment.
I forget if we discussed this before, but it would be worthwhile putting jinja templates in the template, so that we can render everything editing TOML (but maybe parse + write to format).
This has the benefit that if we ever change the build.toml format, we don't have to change it both in the template and here in the code.
E.g., suppose that backends moves somewhere else, we have to change it in the template and here.
There was a problem hiding this comment.
ah yes thats a great improvement. updated the kernel template to use jinja in the latest changes
kernel-builder/src/init.rs
Outdated
| let mut impl_block = String::new(); | ||
| for (i, (condition, device, ref_prefix)) in conditions.iter().enumerate() { | ||
| let directive = if i == 0 { "#if" } else { "#elif" }; | ||
| impl_block.push_str(directive); | ||
| impl_block.push(' '); | ||
| impl_block.push_str(condition); | ||
| impl_block.push_str("\n ops.impl(\""); | ||
| impl_block.push_str(func_name); | ||
| impl_block.push_str("\", "); | ||
| impl_block.push_str(device); | ||
| impl_block.push_str(", "); | ||
| impl_block.push_str(ref_prefix); | ||
| impl_block.push_str(func_name); | ||
| impl_block.push_str(");\n"); | ||
| } |
There was a problem hiding this comment.
Could be a jinja template with if blocks.
There was a problem hiding this comment.
updated to prefer jinja in latest
kernel-builder/src/init.rs
Outdated
| let marker = "ops.def(\""; | ||
| let start = content.find(marker)? + marker.len(); | ||
| let rest = &content[start..]; | ||
| let end = rest.find('(')?; | ||
| Some(&rest[..end]) | ||
| } | ||
|
|
||
| /// Replace `#if defined(CPU_KERNEL)...#endif` block with new content | ||
| fn replace_ifdef_block(content: &str, replacement: &str) -> String { | ||
| const START_MARKER: &str = "#if defined(CPU_KERNEL)"; | ||
| const END_MARKER: &str = "#endif"; | ||
|
|
||
| let Some(start) = content.find(START_MARKER) else { | ||
| return content.to_owned(); | ||
| }; | ||
|
|
||
| let search_region = &content[start..]; | ||
| let Some(end_offset) = search_region.find(END_MARKER) else { | ||
| return content.to_owned(); | ||
| }; | ||
|
|
||
| let end = start + end_offset + END_MARKER.len(); | ||
|
|
||
| let mut result = String::with_capacity(content.len()); | ||
| result.push_str(&content[..start]); | ||
| result.push_str(replacement); | ||
| result.push_str(&content[end..]); | ||
| result | ||
| } |
There was a problem hiding this comment.
Would be easier to maintain with a jinja template and some if blocks I think.
There was a problem hiding this comment.
updated to prefer jinja in latest
kernel-builder/src/main.rs
Outdated
| /// Upload kernel build artifacts to the Hugging Face Hub. | ||
| Upload(UploadArgs), | ||
|
|
||
| #[command(hide = true)] |
There was a problem hiding this comment.
If I understand correctly, this removes the subcommand from help. Sounds bad? Leftover from testing?
danieldk
left a comment
There was a problem hiding this comment.
Reviewed the upload subcommand now as well. Really excited to get batched uploads!
kernel-builder/src/upload.rs
Outdated
| #[arg(value_name = "KERNEL_DIR", default_value = ".")] | ||
| pub kernel_dir: PathBuf, |
There was a problem hiding this comment.
Only a small nit, I'd use
| #[arg(value_name = "KERNEL_DIR", default_value = ".")] | |
| pub kernel_dir: PathBuf, | |
| #[arg(value_name = "KERNEL_DIR")] | |
| pub kernel_dir: Option<PathBuf>, |
It expresses the semantics a bit better, there is now a helper function to get the path from that. (No strong opinions though.)
There was a problem hiding this comment.
agreed this is a more idiomatic, updated in latest
kernel-builder/src/upload.rs
Outdated
| let mut operations: Vec<CommitOperation> = Vec::new(); | ||
|
|
||
| collect_benchmark_ops(&kernel_dir, &existing_files, is_new_branch, &mut operations)?; | ||
| collect_readme_ops(&kernel_dir, &mut operations); |
There was a problem hiding this comment.
README should go to the main branch. I think probably the most elegant way is to make operations a HashMap<String, Vec<CommitOperation>> with the branch name as the key. That way we can have a nested loop to handle everything.
There was a problem hiding this comment.
For later: we may want to do the card generation and upload in one go as part of upload, since we cannot do the card generation in the Nix sandbox iff we need to fetch something from the main repo to do it (depends a bit on whether we still need to enumerate all the build variants or whether that information is handled by the Hub itself through the new kernel type, if the remaining information in the card is relatively static, we can handle it during build as well).
Not for this PR though.
There was a problem hiding this comment.
Some relevant internal discussions: https://huggingface.slack.com/archives/C090JN2P8NB/p1774236408982349
There was a problem hiding this comment.
@drbh I think the most sensible thing for now would be to keep the card contents limited to what we can fetch from the build (after the build is successful). We will render capabilities, frameworks etc. on the Hub side upon each commit, anyway. So, the card could contain an example snippet, available function interfaces, etc. This is all implemented in Python CLI.
Totally okay if we want to do the segregation in a separate PR.
There was a problem hiding this comment.
And then there is this other thing of always uploading the card to main revision.
kernel-builder/src/upload.rs
Outdated
| } | ||
|
|
||
| // Collect benchmark file operations: add matching files, delete stale ones. | ||
| fn collect_benchmark_ops( |
There was a problem hiding this comment.
Maybe we should call them commit_ops? I was initially a bit confused, because we already use the term ops for ops in the kernel itself. Adding the commit_ prefix may make it a bit clearer that we are referring to different ops here.
| fn collect_benchmark_ops( | |
| fn collect_benchmark_commit_ops( |
kernel-builder/src/upload.rs
Outdated
| fn read_repo_id_from_build_toml(kernel_dir: &Path) -> Result<String> { | ||
| let build_toml_path = kernel_dir.join("build.toml"); | ||
| let content = fs::read_to_string(&build_toml_path) | ||
| .wrap_err_with(|| format!("Cannot read `{}`", build_toml_path.display()))?; | ||
| let parsed: toml::Value = toml::from_str(&content) | ||
| .wrap_err_with(|| format!("Cannot parse `{}`", build_toml_path.display()))?; | ||
|
|
||
| parsed | ||
| .get("general") | ||
| .and_then(|g| g.get("hub")) | ||
| .and_then(|h| h.get("repo-id")) | ||
| .and_then(|v| v.as_str()) | ||
| .filter(|s| !s.is_empty()) | ||
| .map(|s| s.to_owned()) | ||
| .ok_or_else(|| { | ||
| eyre::eyre!( | ||
| "No `general.hub.repo-id` in `{}`. Use --repo-id to specify it.", | ||
| build_toml_path.display() | ||
| ) | ||
| }) | ||
| } |
There was a problem hiding this comment.
Use util::parse_build. We could also implement a fn repo_id(&self) -> Option<&str>) on Build if it is helpful and then we probably do not need this function at all.
There was a problem hiding this comment.
yea thats much cleaner, updated in latest! thanks
| for variant in variants { | ||
| let metadata_path = variant.join("metadata.json"); | ||
| let data = fs::read_to_string(&metadata_path) | ||
| .wrap_err_with(|| format!("Cannot read `{}`", metadata_path.display()))?; | ||
| let parsed: serde_json::Value = serde_json::from_str(&data) | ||
| .wrap_err_with(|| format!("Cannot parse `{}`", metadata_path.display()))?; | ||
| let version = parsed.get("version").and_then(|v| v.as_u64()); | ||
| versions.insert(version); | ||
| } | ||
|
|
||
| if versions.len() > 1 { | ||
| let version_strs: Vec<String> = versions | ||
| .iter() | ||
| .map(|v| match v { | ||
| Some(n) => n.to_string(), | ||
| None => "none".to_owned(), | ||
| }) | ||
| .collect(); | ||
| bail!( | ||
| "Found multiple versions in build variants: {}", | ||
| version_strs.join(", ") | ||
| ); | ||
| } |
There was a problem hiding this comment.
Use pyproject::Metadata to deserialize metadata.json. If helpful, we could add a function similar to parse_build.
There was a problem hiding this comment.
nice point, added a parse_metadata function in latest commits.
kernel-builder/src/upload.rs
Outdated
| fn walk_recursive(dir: &Path, files: &mut Vec<PathBuf>) -> Result<()> { | ||
| for entry in | ||
| fs::read_dir(dir).wrap_err_with(|| format!("Cannot read directory `{}`", dir.display()))? | ||
| { | ||
| let entry = entry?; | ||
| let path = entry.path(); | ||
| if path.is_dir() { | ||
| walk_recursive(&path, files)?; | ||
| } else { | ||
| files.push(path); | ||
| } | ||
| } | ||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
is_dir follows symbolic links, so if a build directory contains a link like ln -s . foo, this will get stuck in an infinite loop (until it runs out of stack or heap). So we either need to skip symbolic links or we need loop detection. Maybe it's worth considering BurntSushi's walkdir, since it covers a lot of edge cases and doesn't follow symlinks by default. It also returns an iterator, which is more efficient than pushing everything into a Vec. Not that we care too much about performance here, but it is a nice property.
There was a problem hiding this comment.
very good catch, updated to use walkdir in latest. thanks for pointing out
57523a2 to
f88776a
Compare
a50eff0 to
f6b9889
Compare
danieldk
left a comment
There was a problem hiding this comment.
Thanks! Added some more smaller comments.
kernel-builder/src/pyproject/card.rs
Outdated
| let content = fs::read_to_string(&init_path).ok()?; | ||
|
|
||
| let re = Regex::new(r#"__all__\s*=\s*\[\s*([^\]]*)\s*\]"#).ok()?; | ||
| let list_content = re.captures(&content)?.get(1)?.as_str(); |
There was a problem hiding this comment.
Regex parsing is really fragile. Not for this PR, but we should use a Python AST parser. Maybe tree-sitter-python or whatever ruff/ty uses (if it's available from crates.io). Maybe it's worthwhile creating a ticket?
There was a problem hiding this comment.
good point, I opt'ed to use https://crates.io/crates/rustpython-parser which is what ruff forked, only because ruff's parser has git dependencies and our use case is relatively simple
kernel-builder/src/build.rs
Outdated
| /// Maximum number of Nix build jobs. | ||
| #[arg(long, default_value = "4")] | ||
| pub max_jobs: u32, | ||
|
|
||
| /// Number of CPU cores per build job. | ||
| #[arg(long, default_value = "4")] | ||
| pub cores: u32, |
There was a problem hiding this comment.
Should use Option, we want to use the default from nix.conf when the user has not specified any arguments. E.g. I have these turned on my builder for the number of cores/RAM available.
There was a problem hiding this comment.
good point, agree that option is a better pattern (as noted earlier) thanks for catching. updated to prefer the NixArgs struct where these values are optional
kernel-builder/src/main.rs
Outdated
| use develop::{devshell, testshell}; | ||
|
|
||
| mod build; | ||
| use build::{run_build, CommonBuildArgs}; |
There was a problem hiding this comment.
| use build::{run_build, CommonBuildArgs}; | |
| use build::run_build; |
There was a problem hiding this comment.
removed in latest changes
kernel-builder/src/main.rs
Outdated
| /// Build the kernel locally (alias for build-and-copy). | ||
| Build { | ||
| #[command(flatten)] | ||
| args: CommonBuildArgs, |
There was a problem hiding this comment.
Use
kernels/kernel-builder/src/main.rs
Line 26 in 826d536
There was a problem hiding this comment.
nice! updated to use this struct
kernel-builder/src/main.rs
Outdated
| repo_type, | ||
| } => { | ||
| let kernel_dir = build_args.kernel_dir.clone(); | ||
| run_build(build_args, "build-and-copy")?; |
There was a problem hiding this comment.
| run_build(build_args, "build-and-copy")?; |
Remove, the idea behind build-and-upload is to upload directly from the Nix store without using duplicate storage for both the store version and the build version.
There was a problem hiding this comment.
good point, updated to only call build then upload from the symlinked result dir
kernel-builder/src/main.rs
Outdated
| Commands::Init(args) => run_init(args), | ||
| Commands::Upload(args) => run_upload(args), | ||
| Commands::Build { args } | Commands::BuildAndCopy { args } => { | ||
| run_build(args, "build-and-copy") |
There was a problem hiding this comment.
I think I would only do a nix build for this subcommand. This unnecessarily creates another copy, which can be very large (e.g. FA) and the kernels tools work with the result directory/symlink as well (at least kernels upload).
There was a problem hiding this comment.
makes sense, updated in latest
| .filter_map(|e| e.ok()) | ||
| .filter(|e| e.file_type().is_file()) | ||
| .map(|e| e.into_path()) | ||
| } |
There was a problem hiding this comment.
We should probably copy the upload test from the Python tests?
There was a problem hiding this comment.
agreed, copied in latest changes
3ed8a0f to
63dbeb1
Compare
sayakpaul
left a comment
There was a problem hiding this comment.
I left some comments regarding upload and the card components. Additionally:
- Can the users optionally specify
revisionandrepo_idinupload? I guess we always upload to arevisionduring our uploads (performed after the build) to respect our versioning system. - I see we have
init_e2e.rstests. But I think we should also have tests foruploadand the card generation bits, preferably mimicking how we do it in the current Python CLI? - Do we want to update the docs in a separate PR?
- I think we should require users to have
hf-xetinstalled to benefit from fast uploads and downloads if we're not doing in here already?
This PR is a WIP to add the init command to the kernel-builder rust cli. This branch mainly ports the init command from the python cli, with some improvements to the interface and updates to the templateThis PR adds an the
initanduploadcommand to the rust kernel-builder cli similar to implementation in the current python cli. It also adds abuildcommand that is a facade for thenix runso devs can interact with a single tool for the full creation lifecycle of a kernel.Mainly it aims to have better defaults to make the commands easier to use
initcommand only requires the name of the kernels (folder) and the owner is pulled via awhoamirequest if the user is already logged into hfinitcommand creates a new dir if a name is specified, if called within a dir, the cli will use the dir name as the kernel nameuploadcommand does not require a repo id if the id can be read from the builduploadcommand does not require a path to the build folder and will default to looking for thebuilddir if no explicit path is specifiedexample usage
init from inside of a existing dir
init from outside of a dir
build
upload