
Square LR-Schedule #70

@ClashLuke

Description

Our learning-rate scheduler currently uses a linear ramp-up followed by an exponential drop-off, so the learning-rate curve looks like this:
[figure: learning-rate curve with linear ramp-up and exponential decay]
where the durations of the initial ramp-up and the decay are tunable hyperparameters.
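A minimal sketch of the current schedule, assuming hypothetical parameter names (`peak_lr`, `warmup_steps`, `decay_rate` are illustrative, not the repo's actual identifiers):

```python
def lr_schedule(step: int, peak_lr: float = 1e-3,
                warmup_steps: int = 1000, decay_rate: float = 0.999) -> float:
    """Linear ramp-up to peak_lr, then exponential decay (illustrative sketch)."""
    if step < warmup_steps:
        # Linear increase from 0 to peak_lr over the warmup phase
        return peak_lr * step / warmup_steps
    # Exponential drop-off after the peak
    return peak_lr * decay_rate ** (step - warmup_steps)
```

The ramp duration (`warmup_steps`) and the decay speed (`decay_rate`) correspond to the tunable hyperparameters mentioned above.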

However, others have pointed out that a square ramp-up and square decay can perform significantly better, so we may want to support them as well. The modified curve (orange) would look like this:
[figure: comparison of the current schedule with the proposed square ramp-up/decay curve (orange)]
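One possible reading of "square" is a quadratic ramp-up followed by an inverse-square decay; the issue does not pin down the exact form, so the function below is a sketch under that assumption, with hypothetical parameter names:

```python
def square_lr_schedule(step: int, peak_lr: float = 1e-3,
                       warmup_steps: int = 1000, decay_steps: int = 10000) -> float:
    """Quadratic ramp-up to peak_lr, then inverse-square decay (one interpretation)."""
    if step < warmup_steps:
        # Quadratic increase: slow start, fast finish
        return peak_lr * (step / warmup_steps) ** 2
    # Inverse-square decay after the peak
    return peak_lr / (1 + (step - warmup_steps) / decay_steps) ** 2
```

Compared with the linear/exponential variant, this spends less of the warmup phase at high learning rates and decays with a heavier tail.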

Metadata

    Labels

    core: Improves core model while keeping core idea intact
    engineering: Software-engineering problems that don't require ML expertise
