216 td fix inverted reporter mode assignment #243
Conversation
Pull request overview
Refactors the autograder’s feedback/report generation flow by introducing a preference-driven reporter interface, wiring it into the FeedbackStep, and adding tests to validate reporter selection and basic report output.
Changes:
- Implemented a markdown-based `DefaultReporter.generate_report(focus, result_tree, preferences)` and updated `AiReporter` to the new interface.
- Added `ReporterService.generate_feedback()` to select/delegate report generation and parse feedback config into preference objects.
- Added `StepResult.success()`/`StepResult.fail()` helpers and updated `FeedbackStep`, plus new unit/playroom tests around the reporting flow.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `autograder/services/report/reporter_service.py` | Selects reporter by mode and parses feedback config into `FeedbackPreferences` before delegating report generation. |
| `autograder/services/report/default_reporter.py` | New structured markdown report generation using `ResultTreeFormatter` and preferences. |
| `autograder/services/report/ai_reporter.py` | Updated AI reporter stub to the new report-generation interface. |
| `autograder/steps/feedback_step.py` | Uses `ReporterService.generate_feedback()` and `StepResult.success/fail()` to store FEEDBACK step results. |
| `autograder/models/dataclass/step_result.py` | Adds `StepResult.success()` and `StepResult.fail()` factory methods and docstrings. |
| `tests/unit/services/test_reporter_service.py` | Unit tests for reporter selection, delegation, and preference parsing. |
| `tests/unit/services/test_default_reporter.py` | Unit tests for default reporter output, passed-test toggling, and empty-input handling. |
| `tests/unit/pipeline/test_feedback_step_integration.py` | Pipeline-level test ensuring the FEEDBACK step generates expected content. |
| `tests/playroom/test_reporter.py` | Manual "playroom" script demonstrating default reporter output with mock data and preferences. |
```python
class AiReporter:
    """Reporter that uses AI to generate feedback."""
    def generate_report(self, _focus, _result_tree, _preferences=None):
```
`AiReporter.generate_report` defines parameters named `_focus`, `_result_tree`, `_preferences`, but `ReporterService` calls it with the keyword arguments `focus=`, `result_tree=`, and `preferences=`. This will raise a `TypeError` in AI mode. Rename the parameters to `focus`, `result_tree`, `preferences` (or accept `**kwargs`) to match the service call signature.
Suggested change:
```diff
- def generate_report(self, _focus, _result_tree, _preferences=None):
+ def generate_report(self, focus, result_tree, preferences=None):
```
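A minimal sketch reproducing the mismatch (the class body is abbreviated; `"stub report"` is an illustrative return value, not the project's actual stub):

```python
# Reproduces the issue: the service passes keyword arguments, but the
# stub's parameters are underscore-prefixed, so the names don't match.
class AiReporter:
    def generate_report(self, _focus, _result_tree, _preferences=None):
        return "stub report"

reporter = AiReporter()
try:
    # This mirrors how ReporterService invokes the reporter.
    reporter.generate_report(focus="tests", result_tree={}, preferences=None)
except TypeError as e:
    print("call failed:", e)  # unexpected keyword argument 'focus'
```

Positional calls still work, which is why the bug only surfaces in AI mode where the service uses keywords.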
```python
# Simple manual parsing of the nested config dict into FeedbackPreferences
general_cfg = feedback_config.get("general", {})
ai_cfg = feedback_config.get("ai", {})
default_cfg = feedback_config.get("default", {})

preferences = FeedbackPreferences(
    general=GeneralPreferences(**general_cfg),
    ai=AiReporterPreferences(**ai_cfg),
    default=DefaultReporterPreferences(**default_cfg)
)
```
`ReporterService` builds `GeneralPreferences(**general_cfg)` directly from `feedback.json`. Existing configs in the repo (e.g., `examples/assets/*/feedback.json`) include keys like `show_test_details` that are not fields on `GeneralPreferences`, which will raise an unexpected-keyword `TypeError` and fail feedback generation. Consider filtering/whitelisting known keys (or adding backwards-compatible fields) so unknown config options don't break report generation.
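One way to implement the suggested whitelisting is to filter the config dict against the dataclass's declared fields before unpacking it. The `GeneralPreferences` fields below are illustrative stand-ins, not the project's real ones:

```python
from dataclasses import dataclass, fields

@dataclass
class GeneralPreferences:
    feedback_mode: str = "default"   # illustrative field
    show_score: bool = True          # illustrative field

def filter_known_keys(cls, cfg: dict) -> dict:
    """Drop config keys that are not declared fields on the dataclass."""
    known = {f.name for f in fields(cls)}
    return {k: v for k, v in cfg.items() if k in known}

# A legacy key like 'show_test_details' is silently ignored instead of
# raising an unexpected-keyword TypeError.
cfg = {"feedback_mode": "ai", "show_test_details": True}
prefs = GeneralPreferences(**filter_known_keys(GeneralPreferences, cfg))
print(prefs)  # GeneralPreferences(feedback_mode='ai', show_score=True)
```

Logging the dropped keys would make silently ignored config options easier to debug.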
```diff
      return pipeline_exec.add_step_result(feedback)  # Assuming feedback is a StepResult
- except Exception as e:
+ except (ValueError, KeyError, AttributeError, TypeError, RuntimeError) as e:
```
`FeedbackStep` now catches only a narrow set of exception types. The other pipeline steps consistently catch `Exception` and convert errors into a failed `StepResult`; with the narrowed catch, unexpected exceptions from reporters (e.g., network/SDK errors) will bubble up and interrupt the pipeline instead of producing a failed FEEDBACK step result. Align with the other steps by catching `Exception` here (while still logging details) so feedback failures are reported through `StepResult` consistently.
Suggested change:
```diff
- except (ValueError, KeyError, AttributeError, TypeError, RuntimeError) as e:
+ except Exception as e:
```
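A self-contained sketch of the suggested pattern. The `StepResult` stub and `execute_feedback_step` signature here are simplified stand-ins for the project's actual classes:

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)

@dataclass
class StepResult:
    ok: bool
    detail: str

    @classmethod
    def fail(cls, detail: str) -> "StepResult":
        return cls(ok=False, detail=detail)

def execute_feedback_step(generate_feedback, results: list) -> None:
    try:
        results.append(generate_feedback())
    except Exception as e:  # broad catch, mirroring the other pipeline steps
        # Any reporter failure (including network/SDK errors) becomes a
        # failed FEEDBACK StepResult instead of aborting the pipeline.
        logger.exception("Feedback generation failed")
        results.append(StepResult.fail(str(e)))

results = []
def broken_reporter():
    # ConnectionError is NOT in the narrowed tuple, so the PR's version
    # would let it escape; the broad catch converts it instead.
    raise ConnectionError("AI backend unreachable")

execute_feedback_step(broken_reporter, results)
print(results[0].ok)  # False
```

The pipeline keeps moving and the failure is surfaced through the step's result, consistent with the other steps.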
```python
import sys
from pathlib import Path
```
tests/playroom/test_reporter.py imports sys and Path but does not use them. Removing unused imports will avoid lint/test failures in environments that enforce import cleanliness.
Suggested change:
```diff
- import sys
- from pathlib import Path
```
```python
# Simple manual parsing of the nested config dict into FeedbackPreferences
general_cfg = feedback_config.get("general", {})
ai_cfg = feedback_config.get("ai", {})
default_cfg = feedback_config.get("default", {})

preferences = FeedbackPreferences(
    general=GeneralPreferences(**general_cfg),
    ai=AiReporterPreferences(**ai_cfg),
    default=DefaultReporterPreferences(**default_cfg)
)
```
`GeneralPreferences.online_content` is typed as `List[LearningResource]`, but `ReporterService` currently passes through whatever is in `feedback_config['general']['online_content']` via `GeneralPreferences(**general_cfg)`. Since `feedback_config` comes from JSON, `online_content` entries will be dicts, and `DefaultReporter` later expects each item to have `.description`/`.url` attributes. Parse `online_content` into `LearningResource` instances (or adjust `DefaultReporter` to handle dicts) to prevent a runtime `AttributeError` when online resources are configured.
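A sketch of the suggested fix, converting the JSON dicts into `LearningResource` objects before constructing the preferences. The dataclass definitions are simplified stand-ins assuming the `description`/`url` fields named in the comment:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LearningResource:
    description: str
    url: str

@dataclass
class GeneralPreferences:
    online_content: List[LearningResource] = field(default_factory=list)

def parse_general(cfg: dict) -> GeneralPreferences:
    # JSON yields plain dicts; convert them up front so DefaultReporter
    # can rely on .description/.url attributes instead of dict keys.
    resources = [LearningResource(**r) for r in cfg.get("online_content", [])]
    return GeneralPreferences(online_content=resources)

prefs = parse_general(
    {"online_content": [{"description": "Docs", "url": "https://example.com"}]}
)
print(prefs.online_content[0].url)  # https://example.com
```

Doing the conversion at the service boundary keeps the reporter free of JSON-shape assumptions.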
@ArthurCRodrigues ready for review :) |
ArthurCRodrigues left a comment:
@matheusmra can you provide some feedback examples in the PR description?
This pull request introduces a comprehensive refactor and enhancement of the feedback/report generation system in the autograder, including improvements to the reporting logic, service delegation, and integration with pipeline steps. The changes also add robust unit and playroom tests to ensure reliability and correctness. The most important changes are grouped below:
Reporting Logic Improvements
- `DefaultReporter.generate_report` now produces structured, customizable reports based on user preferences, including titles, summaries, categorized results, scores, and learning resources.
- `AiReporter.generate_report` accepts new parameters (`focus`, `result_tree`, `preferences`) for future extensibility and compatibility with the new reporting interface.

Service Delegation and Integration
- `ReporterService` selects the correct reporter based on `feedback_mode`, parses feedback config into preference objects, and delegates report generation to the active reporter. Added a new `generate_feedback` method for unified report creation.
- `FeedbackStep.execute` uses the new `ReporterService.generate_feedback` method, properly handles success/failure, and utilizes the new factory methods for `StepResult`. [1] [2]

Testing Enhancements
- A playroom script for `DefaultReporter` demonstrates report generation with mock data and preferences.
- Unit tests for `DefaultReporter` cover report structure, passed-test filtering, and graceful handling of empty input.
- Unit tests for `ReporterService` verify reporter selection, feedback delegation, and preference parsing.
- An integration test for `FeedbackStep` ensures correct execution and report content in the pipeline.