Add COVID-RED Dataset, Detection/Prediction Tasks, and Example
Summary
This PR adds support for the COVID-RED (Remote Early Detection of SARS-CoV-2 infections) dataset to PyHealth, including:
- Dataset loader class (COVIDREDDataset)
- Task functions (covidred_detection_fn, covidred_prediction_fn)
- Usage example (covidred_example.py)

This provides a clinically relevant wearable device dataset for PyHealth users and supports reproducible research in early infectious disease detection using consumer wearables.
Features
1. COVIDREDDataset
- Split selection: split="train" | "test" | "all"
- window_days parameter (window length in days)

2. Task Functions
covidred_detection_fn
Maps dataset samples into PyHealth task format for COVID-19 detection:

    {
        "patient_id": str,
        "visit_id": str,
        "signal": Tensor(n_features × window_days),
        "label": int(0 or 1),
        "metadata": dict
    }

covidred_prediction_fn
Maps dataset samples for early COVID-19 prediction (pre-symptomatic detection).

covidred_multiclass_fn (optional extension)
Extends to multiclass severity classification.
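The sample layout listed for covidred_detection_fn can be illustrated with a minimal sketch. This is not the code from this PR: the helper name `make_detection_sample`, the contents of `metadata`, the feature ordering, and all numeric values are assumptions made up for illustration, and a nested list stands in for the tensor so the snippet runs with the standard library only.

```python
from typing import Any, Dict, List


def make_detection_sample(
    patient_id: str,
    visit_id: str,
    daily_features: Dict[str, List[float]],
    label: int,
    window_days: int,
) -> Dict[str, Any]:
    """Hypothetical helper: build one sample in the dict layout described above."""
    if label not in (0, 1):
        raise ValueError("label must be 0 or 1")
    if window_days <= 0:
        raise ValueError("window_days must be positive")

    # Assumption: features are ordered alphabetically so the row order is stable.
    feature_names = sorted(daily_features)
    signal = []
    for name in feature_names:
        series = daily_features[name]
        if len(series) < window_days:
            raise ValueError(f"{name} has fewer than {window_days} days of data")
        # Keep only the most recent window_days values for this feature.
        signal.append(list(series[-window_days:]))

    return {
        "patient_id": patient_id,
        "visit_id": visit_id,
        # Nested list standing in for Tensor(n_features x window_days).
        "signal": signal,
        "label": label,
        # Assumption: the real metadata contents are not specified in this PR text.
        "metadata": {"features": feature_names, "window_days": window_days},
    }


# Made-up values, for illustration only.
sample = make_detection_sample(
    patient_id="p001",
    visit_id="p001_w0",
    daily_features={
        "heart_rate": [61.0, 62.0, 64.0, 70.0],
        "steps": [8000.0, 7500.0, 4000.0, 2500.0],
        "sleep": [7.5, 7.0, 8.5, 9.0],
    },
    label=1,
    window_days=3,
)
print(len(sample["signal"]), len(sample["signal"][0]))  # prints: 3 3
```

In a real pipeline the nested list would presumably be converted to a tensor before being passed to a model; the shape convention (features as rows, days as columns) follows the layout stated above.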
3. Example Script
Dataset Details
Dataset: COVID-RED - Remote Early Detection of SARS-CoV-2 infections
Source: Utrecht University, Netherlands
DOI: 10.34894/FW9PO7
URL: https://dataverse.nl/dataset.xhtml?persistentId=doi:10.34894/FW9PO7
Data characteristics:
Clinical significance:
Tests
Basic verification performed:
Note on Dataset Download
The COVID-RED dataset must be manually downloaded from DataverseNL.
Users must:
- Download the following files:
  - heart_rate.csv - Daily resting heart rate measurements
  - steps.csv - Daily step counts
  - sleep.csv - Daily sleep duration and efficiency
  - labels.csv - COVID-19 test results and symptom dates
- Place the files in a local directory (e.g. /data/covidred/)

Usage Example
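Because the download is manual, it can help to confirm that the directory is complete before constructing the dataset. The following is a minimal sketch, assuming the four CSV files named in the download note sit directly in the root directory; the helper `missing_covidred_files` is hypothetical and not part of PyHealth or this PR.

```python
from pathlib import Path
from typing import List

# File names taken from the download note above.
EXPECTED_FILES = ("heart_rate.csv", "steps.csv", "sleep.csv", "labels.csv")


def missing_covidred_files(root: str) -> List[str]:
    """Hypothetical helper: return the expected files that are absent from root."""
    root_path = Path(root)
    return [name for name in EXPECTED_FILES if not (root_path / name).is_file()]


# Example path from the download note; adjust to the local setup.
missing = missing_covidred_files("/data/covidred")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All expected COVID-RED files found.")
```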
Files Changed
This PR adds three new files to PyHealth:
- pyhealth/datasets/covidred.py - Dataset loader class
- pyhealth/tasks/covidred.py - Task functions for COVID-19 detection/prediction
- examples/covidred_example.py - Complete usage example with LSTM classifier

Citation
If you use this dataset implementation, please cite the original COVID-RED study: