@Krishna200608

Issue: #81

Summary

This Pull Request restructures the existing class-separated astronomy image dataset into
standard train, test, and eval splits to enable reproducible training and evaluation.

Work Done

  • Implemented a reusable, class-based dataset splitting utility
  • Organized images into train, test, and eval directories with a 70% / 20% / 10% split
  • Preserved class-wise separation across all splits
  • Verified split correctness by printing per-class image counts
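
For readers without access to the notebook, here is a minimal sketch of how a per-class 70% / 20% / 10% split with a count check could look. It is not the code from the notebook: the function names (`split_class`, `split_dataset`, `count_per_class`), the seed value 42, and the class names in the demo are all assumptions made for illustration.

```python
import random


def split_class(filenames, percents=(70, 20, 10), seed=42):
    """Split one class's filenames into train/test/eval lists.

    Integer percentages avoid floating-point rounding surprises.
    Any remainder from integer division ends up in the eval split.
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    ordered = sorted(filenames)            # stable starting order
    random.Random(seed).shuffle(ordered)   # seeded, independent of global state
    n_train = len(ordered) * percents[0] // 100
    n_test = len(ordered) * percents[1] // 100
    return {
        "train": ordered[:n_train],
        "test": ordered[n_train:n_train + n_test],
        "eval": ordered[n_train + n_test:],
    }


def split_dataset(files_by_class, percents=(70, 20, 10), seed=42):
    """Apply the same split to every class so class separation is preserved."""
    return {
        cls: split_class(names, percents, seed)
        for cls, names in files_by_class.items()
    }


def count_per_class(splits):
    """Return {class: {split: number_of_images}} for a quick sanity check."""
    return {
        cls: {split: len(names) for split, names in parts.items()}
        for cls, parts in splits.items()
    }


if __name__ == "__main__":
    # Hypothetical class names and file names, for illustration only.
    demo = {
        "galaxy": [f"galaxy_{i:03d}.jpg" for i in range(100)],
        "nebula": [f"nebula_{i:03d}.jpg" for i in range(50)],
    }
    for cls, counts in count_per_class(split_dataset(demo)).items():
        print(cls, counts)
```

With 100 and 50 images, the counts come out as 70/20/10 and 35/10/5. For class sizes that do not divide evenly, the eval split absorbs the leftover images, so it can be slightly larger than 10%.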

Implementation Notes

  • The dataset restructuring is performed locally within the Kaggle environment
  • Symbolic links are used instead of copying files to avoid storage limitations
  • No dataset files are uploaded to the repository; only the splitting logic is shared
  • The implementation is added as a new section in the existing notebook under:
    participants/IIT2023139/data_exploration (version 2).ipynb
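
The symlink approach described above can be sketched as follows. This is an illustration under assumptions, not the notebook's code: the helper name `link_split`, the `split/class/file` output layout, and the demo paths are hypothetical. Only standard-library calls (`os.symlink`, `pathlib`, `tempfile`) are used.

```python
import os
import tempfile
from pathlib import Path


def link_split(src_root, dst_root, assignment):
    """Create dst_root/<split>/<class>/<file> symlinks pointing at src_root/<class>/<file>.

    `assignment` maps split name -> class name -> list of filenames.
    Returns the number of links created; existing entries are left untouched.
    """
    created = 0
    for split, classes in assignment.items():
        for cls, names in classes.items():
            out_dir = Path(dst_root) / split / cls
            out_dir.mkdir(parents=True, exist_ok=True)
            for name in names:
                target = (Path(src_root) / cls / name).resolve()
                link = out_dir / name
                if link.is_symlink() or link.exists():
                    continue  # re-running the cell should not fail
                os.symlink(target, link)  # no image bytes are copied
                created += 1
    return created


# Small self-contained demo in a temporary directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "dataset"
    (src / "galaxy").mkdir(parents=True)
    (src / "galaxy" / "img_001.jpg").write_bytes(b"fake image bytes")

    dst = Path(tmp) / "splits"
    plan = {"train": {"galaxy": ["img_001.jpg"]}}
    n_links = link_split(src, dst, plan)
    n_links_second_run = link_split(src, dst, plan)

    link = dst / "train" / "galaxy" / "img_001.jpg"
    demo_is_symlink = link.is_symlink()
    demo_same_bytes = link.read_bytes() == b"fake image bytes"
```

Because each link stores only a path, the split directories take up almost no extra space. The trade-off is that the links break if the source dataset is moved or unmounted.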

Reproducibility

  • A fixed random seed is used to ensure deterministic splits
  • The splitting logic is reusable and can be applied to similar class-structured datasets
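
One way to check that a seeded split really is deterministic is to fingerprint the resulting assignment and compare it across runs. The sketch below assumes the file list is sorted before shuffling, since directory listing order is not guaranteed to be stable. The names `seeded_order` and `fingerprint` and the seed value 42 are hypothetical and not taken from the notebook.

```python
import hashlib
import json
import random


def seeded_order(names, seed=42):
    """Return names in a shuffled order that depends only on the set of names and the seed."""
    ordered = sorted(names)               # remove any dependence on listing order
    random.Random(seed).shuffle(ordered)  # local RNG, global random state untouched
    return ordered


def fingerprint(assignment):
    """SHA-256 hex digest of a JSON-serializable split assignment."""
    payload = json.dumps(assignment, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical file names, listed in two different orders.
names_a = [f"img_{i:03d}.jpg" for i in range(20)]
names_b = list(reversed(names_a))

run_1 = {"train": seeded_order(names_a)[:14]}
run_2 = {"train": seeded_order(names_b)[:14]}

same_split = fingerprint(run_1) == fingerprint(run_2)
print("splits identical:", same_split)
```

Storing the digest alongside the results makes it easy to confirm later that a rerun used the same split.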

Environment

  • Platform: Kaggle Notebook
  • OS: Windows 11 (local development)
  • Editor: VS Code

This PR addresses only Issue #81 and does not overlap with any other issues.

@OpenGitBot

Hey @Krishna200608

Thanks for opening this PR 🚀. A mentor will review your pull request soon. Until then, keep contributing and stay calm.

Thanks for contributing to OpenCode'25 ✨✨!
