How to install smdistributed? #2258

@rohan-varma

What did you find confusing? Please describe.
I installed sagemaker with pip install sagemaker --upgrade and am attempting to use distributed model parallel with PyTorch. However, I'm unable to import smdistributed.
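
For reference, the symptom described above can be confirmed with a short check like the one below. This is only a minimal sketch using the Python standard library: the helper name `is_importable` is hypothetical, and the check reports whether each top-level package is visible to the current interpreter, not why it is missing.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located by the current interpreter."""
    # find_spec returns None when no top-level module with this name is found.
    return importlib.util.find_spec(module_name) is not None


# Compare the package installed via pip with the one that fails to import.
for name in ("sagemaker", "smdistributed"):
    status = "found" if is_importable(name) else "NOT found"
    print(f"{name}: {status}")
```

If `sagemaker` is reported as found and `smdistributed` is not, that matches the situation in this issue: installing the SDK alone did not make `smdistributed` importable in that environment.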

The docs at https://sagemaker.readthedocs.io/en/stable/api/training/smp_versions/v1.2.0/smd_model_parallel_pytorch.html don't have installation instructions for smdistributed. How do I get smdistributed installed? Thank you!

I am also looking at https://docs.amazonaws.cn/en_us/sagemaker/latest/dg/model-parallel-customize-training-script-pt.html which directs me to https://sagemaker.readthedocs.io/en/stable/api/training/smp_versions/v1.2.0/smd_model_parallel_common_api.html#smp.init to initialize the sagemaker distributed environment. But again I'm not sure how to get the smdistributed library.

https://github.com/aws/amazon-sagemaker-examples has some smdistributed examples but doesn't provide any clear installation instructions. The environment.yml in that repo seems to indicate that all that's needed is sagemaker, which I have installed.

Describe how documentation can be improved
I could not find clear installation instructions for smdistributed. Would it be possible to add these?
