Add Tools for Managing Agent Traces in Hugging Face Datasets #217
RohitP2005 wants to merge 11 commits into ServiceNow:main from
Conversation
src/agentlab/llm/traces/uploads.py
Outdated
}

# Load the existing index dataset and add new entry
dataset = load_dataset(INDEX_DATASET, split="train")
I'm having issues on this line when trying to test things on my side, as the dataset version that's online is empty. Would there be a way to initialize it first?
I guess we should have an online test dataset to verify the functionality.
That's a valid concern, RohitP2005. Having an online test dataset would indeed help us verify the functionality in real-world settings. Would it be possible to initialize and upload a minimal test dataset that we could use for these purposes? We can involve the team in generating sample data if needed.
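One way to unblock this could be to seed the index repo once before any load_dataset call. A minimal sketch, assuming the field names from the PR description (study_name, llm, benchmark, license, trace_pointer) and the INDEX_DATASET constant defined in this file; the placeholder row is only there so the column types are inferred as strings rather than null:

from datasets import Dataset

INDEX_DATASET = "/agent_traces_index"  # value from this module

# Hypothetical one-time initialization: push a single placeholder row so the
# hosted dataset exists and has concrete (string) column types.
seed = Dataset.from_dict(
    {
        "study_name": ["placeholder"],
        "llm": ["placeholder"],
        "benchmark": ["placeholder"],
        "license": ["placeholder"],
        "trace_pointer": ["placeholder"],
    }
)
seed.push_to_hub(INDEX_DATASET, split="train")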
src/agentlab/llm/traces/uploads.py
Outdated
def upload_index_data(index_df: pd.DataFrame):
    dataset = Dataset.from_pandas(index_df)
    dataset.push_to_hub(INDEX_DATASET, split="train")


def upload_trace(trace_file: str, exp_id: str):
    api.upload_file(
        path_or_fileobj=trace_file,
        path_in_repo=f"{exp_id}.zip",
        repo_id=TRACE_DATASET,
        repo_type="dataset",
    )
Ideally, we would approve new content on our datasets. Would there be a way to make new uploads into a PR?
I'm guessing that might be on the HuggingFace side, in the dataset settings.
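For what it's worth, this can also be done client-side: huggingface_hub's upload_file accepts a create_pr flag that opens a pull request on the Hub repo instead of committing directly. A sketch under that assumption, reusing the names from this file:

from huggingface_hub import HfApi

api = HfApi()
TRACE_DATASET = "/agent_traces_data"  # value from this module

def upload_trace_as_pr(trace_file: str, exp_id: str):
    api.upload_file(
        path_or_fileobj=trace_file,
        path_in_repo=f"{exp_id}.zip",
        repo_id=TRACE_DATASET,
        repo_type="dataset",
        create_pr=True,  # maintainers review and merge the upload on the Hub
    )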
src/agentlab/llm/traces/uploads.py
Outdated
# Hugging Face dataset names
INDEX_DATASET = "/agent_traces_index"
TRACE_DATASET = "/agent_traces_data"
This is a dev version so all good, but eventually we'd switch this to env variables.
Yeah, I understand that. Once this PR is completed, I will remove them and you can add them to your .env.
I agree, moving the Hugging Face dataset names INDEX_DATASET and TRACE_DATASET to environment variables would improve maintainability and security. It would also help us manage environment-specific settings. Thank you for addressing this.
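A minimal sketch of what that switch could look like; the AGENTLAB_* variable names are hypothetical, and the current dev values are kept as fallbacks:

import os

# Read dataset names from the environment; the fallbacks are the current dev
# values, so local runs keep working without a .env file.
INDEX_DATASET = os.environ.get("AGENTLAB_INDEX_DATASET", "/agent_traces_index")
TRACE_DATASET = os.environ.get("AGENTLAB_TRACE_DATASET", "/agent_traces_data")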
Hello @RohitP2005, this looks very interesting, thank you! Ideally, we would have a third table on top of this, with one entry per study (as in the reproducibility_journal.csv file), with a key. The entries in the experiment metadata table would point to that key.
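To make the idea concrete, here is a sketch of what that study-level table and its key could look like; the column names are illustrative, loosely following reproducibility_journal.csv, not a fixed schema:

import pandas as pd

# One row per study, identified by a study_id key.
study_df = pd.DataFrame(
    {
        "study_id": ["s-0001"],
        "study_name": ["example_study"],
        "agent": ["GenericAgent"],
        "benchmark": ["miniwob"],
        "date": ["2024-01-01"],
    }
)

# Experiment metadata entries would then carry the study key as a foreign key:
index_row = {"study_id": "s-0001", "exp_id": "exp-42", "llm": "gpt-4o", "license": "MIT"}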
src/agentlab/llm/traces/uploads.py
Outdated
def upload_trace(trace_file: str, exp_id: str):
    api.upload_file(
        path_or_fileobj=trace_file,
        path_in_repo=f"{exp_id}.zip",
        repo_id=TRACE_DATASET,
        repo_type="dataset",
    )
It could be interesting to compress the file if needed, instead of requiring it to be zipped already.
Yeah, understood. I will look into it as soon as possible.
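One possible shape for this, using only the standard library; ensure_zipped is a hypothetical helper that would be called before upload_trace:

import os
import shutil

def ensure_zipped(trace_path: str) -> str:
    """Return a path to a .zip archive, compressing a trace directory on the fly."""
    if trace_path.endswith(".zip"):
        return trace_path
    # shutil.make_archive appends ".zip" to the base name it is given and
    # returns the full path of the archive it created.
    return shutil.make_archive(trace_path.rstrip(os.sep), "zip", trace_path)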
I thought of restructuring the trace uploads and creating classes for Study and Experiment, with methods within them for their functionality. The functions are implemented in the utils files. Also, query functionality has been added. Kindly refer to Discord for a detailed description.
try:
    dataset = load_dataset(trace_dataset, use_auth_token=hf_token, split="train")
    existing_data = {"exp_id": dataset["exp_id"], "zip_file": dataset["zip_file"]}
except Exception as e:
    print(f"Could not load existing dataset: {e}. Creating a new dataset.")
    existing_data = None
Loading the traces dataset is going to be a problem, as the traces are really heavy (200GB for our TMLR paper).
Ideally we'd have something more similar to your original version:

def upload_trace(trace_file: str, exp_id: str):
    api.upload_file(
        path_or_fileobj=trace_file,
        path_in_repo=f"{exp_id}.zip",
        repo_id=TRACE_DATASET,
        repo_type="dataset",
    )
We would trust the index dataset to avoid duplicates, and use the trace dataset as a container in which we'd dump the zipfiles.
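A sketch of how that could fit together: only the lightweight index is ever loaded, and the trace repo is only written to. It assumes the index carries an exp_id column and reuses the INDEX_DATASET and TRACE_DATASET constants from this module:

from datasets import load_dataset
from huggingface_hub import HfApi

api = HfApi()

def upload_trace_if_new(trace_file: str, exp_id: str):
    # Duplicate detection relies on the small index dataset, never on the
    # multi-hundred-GB trace repo itself.
    index = load_dataset(INDEX_DATASET, split="train")
    if exp_id in set(index["exp_id"]):
        print(f"Trace {exp_id} already indexed; skipping upload.")
        return
    api.upload_file(
        path_or_fileobj=trace_file,
        path_in_repo=f"{exp_id}.zip",
        repo_id=TRACE_DATASET,
        repo_type="dataset",
    )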
I think we can safely remove this file now. It would be nice to have an equivalent of the upload method, to merge all three levels of upload (study, index, traces).
I'll update this to match the recent changes in the upload methods.
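A rough sketch of what that single entry point might look like; upload_study_data is a hypothetical stand-in for the study-level method, while upload_index_data and upload_trace are the functions from this PR:

def upload_all(study_row: dict, index_df, trace_files: dict):
    upload_study_data(study_row)          # hypothetical: one entry in the study table
    upload_index_data(index_df)           # experiment metadata index
    for exp_id, trace_file in trace_files.items():
        upload_trace(trace_file, exp_id)  # raw zipfiles into the trace repo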
[Sample PR] Add Tools for Managing Agent Traces in Hugging Face Datasets
reference: #53
Summary
This PR introduces a foundational implementation for managing and uploading agent traces to Hugging Face datasets. It provides tools to simplify adding traces, maintain an index dataset for easy retrieval, and enforce whitelist-based constraints for legal compliance.
Key Features
1. Hugging Face Dataset Structure
2. Upload System

Each index entry records study_name, llm, benchmark, and license, along with a pointer (trace_pointer) to the actual trace file.

Notes

Checklist