Prof-it/td_prediction_llm

td_prediction_llm

This repository contains the source code and experiment outputs associated with our MOC2025 workshop paper at the DECLARE conference.

The paper presents a novel LLM/AI-enabled workflow for automatic labeling, combined with XAI-based human-in-the-loop quality control. We derived this workflow from a case study on detecting technical debt, with the aim of helping software project managers make informed decisions.

This work extends a thesis study that used a primary LLM as the labeling judge alongside classical ML methods to predict technical debt. That study suffered from feature leakage, shortcut learning, and difficulty handling imbalanced data. Key contributions of this work include a curated workflow design, improved prompt engineering, and practical lessons on avoiding shortcut learning and feature leakage when using LLM-generated labels. We also evaluate performance on imbalanced datasets.
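To make the shortcut-learning concern concrete, here is a minimal sketch of one possible sanity check. It is not the code from this repository. It assumes binary labels (1 = technical debt) produced by an LLM and plain-text inputs, and it flags any single token whose presence alone reproduces the labels almost perfectly. Scores use balanced accuracy (the mean of per-class recall), so a majority class cannot inflate them. The function names, the example texts, and the 0.95 threshold are all hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally under imbalance."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def screen_single_token_shortcuts(texts, labels, threshold=0.95):
    """Return tokens whose mere presence predicts the label at or above threshold.

    A flagged token is a candidate shortcut or leakage feature that a human
    reviewer should inspect; it is not proof of leakage by itself.
    """
    tokenized = [set(text.lower().split()) for text in texts]
    vocabulary = set().union(*tokenized) if tokenized else set()
    flagged = {}
    for token in sorted(vocabulary):
        # Predict "technical debt" whenever the token occurs in the text.
        preds = [1 if token in tokens else 0 for tokens in tokenized]
        score = balanced_accuracy(labels, preds)
        if score >= threshold:
            flagged[token] = score
    return flagged


# Hypothetical toy data: every positive example happens to contain "todo".
example_texts = [
    "todo refactor this hack later",
    "add unit tests for parser",
    "todo remove temporary workaround",
    "update readme with usage notes",
    "fix typo in docs",
    "todo clean up duplicated logic",
]
example_labels = [1, 0, 1, 0, 0, 1]

suspects = screen_single_token_shortcuts(example_texts, example_labels)
```

In this toy data, `suspects` contains only the token `todo`, which suggests that a classifier trained on these labels could learn the keyword rather than the underlying concept. A check of this kind can complement SHAP-based inspection by a human reviewer.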

[Figure: workflow diagram]

About

Code and experiment outputs for our DECLARE MOC2025 paper on an LLM-based labeling workflow with XAI (SHAP) and human-in-the-loop QA. Built from a technical-debt detection case study using LLMs and scikit-learn, with a focus on avoiding shortcut learning and feature leakage and on handling class imbalance.
