Using Mlflow and Jaxtrain with --data-generation-experiment-id #73

@hannahhafner

When I run jaxtrain with --data-generation-experiment-id but without --training-data-folder, it fails with this error:

ValueError: Failed to get training data from MLflow and no 
--training-data-folder provided: Could not determine training_data_folder from 
MLflow experiment 1. Please provide --training-data-folder explicitly.

Upon further inspection, I think the error comes from this line in jax_train:
264: mlflow_data_folder = first_run_info.get("data_output_folder")

The first_run_info dictionary has no 'data_output_folder' key, although it does have the keys 'run_id', 'run_name', 'num_files', 'total_size_mb', and 'files'. mlflow_lineage_info appears to have been created, since the logger message "MLflow reports 2 runs with 90 total files" was printed correctly, but data_output_folder is not stored in first_run_info.
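To make the failure mode concrete, here is a minimal sketch that reproduces it outside of jaxtrain. The dictionary values are placeholders, and `resolve_training_data_folder` is a hypothetical helper, not a function from the project; it only illustrates how the lookup could fall back to an explicit folder and report the keys that are actually present.

```python
def resolve_training_data_folder(first_run_info, cli_folder=None):
    """Hypothetical helper: pick the training data folder.

    Prefers the folder recorded in the run info, then an explicitly
    provided folder, and otherwise raises with the available keys.
    """
    # dict.get returns None when the key is absent, which is what happens here
    folder = first_run_info.get("data_output_folder")
    if folder:
        return folder
    if cli_folder:
        return cli_folder
    raise ValueError(
        "data_output_folder missing from run info; available keys: "
        + ", ".join(sorted(first_run_info))
    )


# Placeholder values; only the key names come from the observed dictionary.
first_run_info = {
    "run_id": "placeholder-run-id",
    "run_name": "placeholder-run-name",
    "num_files": 0,
    "total_size_mb": 0.0,
    "files": [],
}

try:
    resolve_training_data_folder(first_run_info)
except ValueError as err:
    message = str(err)
    print(message)
```

Listing the available keys in the error message would make it obvious that the run info was populated but the folder was never recorded.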
