Modular Motion is a framework designed to provide a modular, extensible pipeline for human motion analysis. It builds on a plugin-style architecture that allows interchangeable pose estimation backends (e.g., MediaPipe, OpenPose) and downstream analyzers (e.g., LSTM-based classifiers).
The project is structured with professional, consistent, self-documenting, and intuitive (PCSI) principles in mind:
- Clear module boundaries
- Well-documented APIs
- Config-driven behavior (via `config.json`)
- Centralized logging system
This ensures the codebase remains approachable while also being production-ready.
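As a rough illustration of the config-driven and centralized-logging ideas, the sketch below parses a JSON config, checks the fields required by the active backend, and builds a shared logger with level prefixes. It is not the project's actual `config_manager.py` or `logger.py`: the function names, the `pose_estimator` key, and the required-field table are all hypothetical, since this README does not show the real schema.

```python
import json
import logging
import sys

# Hypothetical schema: the keys each backend needs. The real layout of
# config.json is not shown in this README.
REQUIRED_FIELDS = {
    "mediapipe": ["model_complexity", "min_detection_confidence"],
}


def load_config(text: str) -> dict:
    """Parse JSON config text and validate the active estimator's fields."""
    config = json.loads(text)
    backend = config["pose_estimator"]  # assumed key naming the active backend
    if backend not in REQUIRED_FIELDS:
        raise ValueError(f"Unknown pose estimator: {backend!r}")
    settings = config.get(backend, {})
    missing = [key for key in REQUIRED_FIELDS[backend] if key not in settings]
    if missing:
        raise ValueError(f"Missing config fields for {backend}: {missing}")
    return config


def get_logger(name: str = "modular_motion") -> logging.Logger:
    """Return a shared logger that prefixes messages with [LEVEL]."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger
```

A caller would typically read the file contents, pass them to `load_config`, and log through `get_logger().info(...)` or `get_logger().warning(...)` instead of calling `print()`.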
Get up and running in just a few steps:
# Clone the repository
git clone <your-repo-url>
cd modular_motion
# Install dependencies
pip install -r requirements.txt
# Run a demo on a test video
python pose_demo.py --video videos/test_videos/sample_swing.mp4
# Or run with your webcam
python pose_demo.py --video 0

The pipeline is divided into the following stages:
- Config Manager (`config_manager.py`)
  - Loads `config/config.json`.
  - Validates required fields based on the active pose estimator.
  - Provides easy access to relevant configuration subsets.
- Logger (`logger.py`)
  - Replaces `print()` with structured, leveled logs.
  - Supports console and file output.
  - Includes standard prefixes (`[INFO]`, `[WARNING]`, `[ERROR]`).
- Pose Estimator Base (`pose_estimator_base.py`)
  - Abstract base class for all pose estimation backends.
- MediaPipe Estimator (`mediapipe_estimator.py`)
  - Wraps the MediaPipe Pose model.
  - Converts raw landmarks into standardized `PoseFrame` objects.
  - Canonical joint names are used across all backends.
- PoseFrame (`pose_frame.py`)
  - Stores frame ID and keypoints in a standardized format.
  - Keypoints: `dict[str, tuple[float, float, float]]`
  - Provides a consistent container for downstream analyzers and visualizers.
- Pose Visualizer (`pose_visualizer.py`)
  - Draws canonical keypoints from a `PoseFrame` onto the video frame.
  - Designed to be backend-agnostic.
- Pose Demo (`pose_demo.py`)
  - Driver program that ties everything together.
  - Reads input video or webcam stream.
  - Runs pose estimation, overlays keypoints, and displays annotated video.
- Manual Annotation Tools (to be migrated from BaseballAI)
  - Used to annotate frames for supervised training.
  - Defines swing impact ranges (or equivalent motion segments).
- LSTM Analyzer (planned)
  - Sequential model that ingests a series of `PoseFrame` keypoints.
  - Detects temporal motion patterns (e.g., a baseball swing).
  - This is the next milestone for Modular Motion.
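The relationship between the base class, a backend, and `PoseFrame` can be sketched as follows. This is a minimal sketch rather than the project's code: the method name `estimate`, the `frame_id` and `keypoints` field names, the joint names, and the stand-in `FixedPoseEstimator` backend are assumptions made for illustration. Only the keypoint type `dict[str, tuple[float, float, float]]` comes from the description above, and this README does not say what the three floats represent.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field

# Three floats per joint, matching the keypoint type described above.
Keypoint = tuple[float, float, float]


@dataclass
class PoseFrame:
    """Backend-independent container for one frame's keypoints."""
    frame_id: int
    keypoints: dict[str, Keypoint] = field(default_factory=dict)


class PoseEstimatorBase(ABC):
    """Interface every pose backend implements (method name is assumed)."""

    @abstractmethod
    def estimate(self, image, frame_id: int) -> PoseFrame:
        """Return a PoseFrame keyed by canonical joint names."""


class FixedPoseEstimator(PoseEstimatorBase):
    """Stand-in backend that returns the same pose for any input image."""

    def estimate(self, image, frame_id: int) -> PoseFrame:
        return PoseFrame(
            frame_id=frame_id,
            keypoints={
                "left_wrist": (0.25, 0.5, 0.0),
                "right_wrist": (0.75, 0.5, 0.0),
            },
        )
```

Because downstream code only sees `PoseFrame`, swapping `FixedPoseEstimator` for a MediaPipe or OpenPose wrapper would not require changes in a visualizer or analyzer written against this interface.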
Once the LSTM analyzer is integrated, the Modular Motion workflow will look like this:
- Pose Estimation → Extracts frame-level keypoints with MediaPipe or other backends.
- Detection/Annotation → Segments meaningful events (swings, strikes, guard drops, etc.).
- Sequence Modeling → LSTM analyzes temporal keypoint sequences.
- Results & Feedback → Outputs classifications, confidence scores, and performance summaries.
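Before a sequence model can consume keypoints, each frame is usually flattened into a fixed-order vector and the frames are grouped into fixed-length windows. The sketch below shows one way to do that in plain Python. The function names, the zero-fill for missing joints, and the window and stride parameters are illustrative assumptions, not part of the planned analyzer.

```python
def frame_to_vector(keypoints, joint_order):
    """Flatten one frame's keypoints into a fixed-order feature vector."""
    vector = []
    for joint in joint_order:
        # Missing joints are zero-filled here; that policy is an assumption.
        vector.extend(keypoints.get(joint, (0.0, 0.0, 0.0)))
    return vector


def sliding_windows(frames, joint_order, window, stride=1):
    """Group per-frame vectors into overlapping windows of `window` frames.

    `frames` is a list of keypoint dicts, one per video frame, in order.
    Returns a list of windows, each a list of `window` feature vectors.
    """
    vectors = [frame_to_vector(kp, joint_order) for kp in frames]
    last_start = len(vectors) - window
    return [
        vectors[start:start + window]
        for start in range(0, last_start + 1, stride)
    ]
```

Each window has shape (window, 3 × number of joints), which is the usual (time steps, features) layout expected by LSTM layers in common deep learning libraries. A clip shorter than `window` yields no windows.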
The long-term goal is a general-purpose sports action detection suite. Starting with baseball swings, the framework can be extended to MMA tendencies, basketball shooting form, or general athletic motion analysis.
Planned next steps:
- Migrate annotation pipeline from BaseballAI.
- Build first iteration of the LSTM analyzer.
- Integrate analyzer into `pose_demo` for real-time classification and logging.
- Expand test coverage with multiple sports datasets.
# Run demo on a test video
python pose_demo.py --video videos/test_videos/sample_swing.mp4
# Run demo with webcam
python pose_demo.py --video 0
# Run without overlay (logging only)
python pose_demo.py --video videos/test_videos/sample_swing.mp4 --no-overlay

modular_motion/
│
├── config/
│ └── config.json
│
├── core/
│ ├── pose_estimator_base.py
│ └── pose_frame.py
│
├── pose_estimators/
│ └── mediapipe_estimator.py
│
├── utils/
│ ├── logger.py
│ └── pose_visualizer.py
│
├── pose_demo.py
└── README.md
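
The demo commands in this README pass `--video` either a file path or a webcam index such as `0`, plus an optional `--no-overlay` flag. One way such arguments could be parsed is sketched below. The helper names are hypothetical, and the actual parsing in `pose_demo.py` may differ.

```python
import argparse


def parse_video_source(value: str):
    """Return an int webcam index for digit strings, else the path unchanged."""
    return int(value) if value.isdigit() else value


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the two flags used in the README commands."""
    parser = argparse.ArgumentParser(description="Pose estimation demo (sketch)")
    parser.add_argument(
        "--video",
        type=parse_video_source,
        required=True,
        help="Path to a video file, or a webcam index such as 0",
    )
    parser.add_argument(
        "--no-overlay",
        action="store_true",
        help="Skip drawing keypoints; log results only",
    )
    return parser
```

Converting digit strings to `int` matters if the demo reads frames with OpenCV (an assumption here), because `cv2.VideoCapture` treats an integer as a camera index and a string as a file path.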