Square LR-Schedule #70

Our learning rate scheduler currently uses a linear ramp-up followed by an exponential decay, so the learning rate curve looks like this:
![Current learning rate curve](https://user-images.githubusercontent.com/39779310/184471868-e5ae0497-e5ff-4013-baf7-8218cedffb46.png)
The durations of the initial ramp-up and of the decay are tunable hyperparameters.
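
For reference, the current behaviour can be sketched roughly as below. This is a minimal illustration, not the scheduler's actual code: the function name `linear_exp_lr` and the parameters `peak_lr`, `warmup_steps`, and `decay_rate` are hypothetical, and it assumes the decay is a fixed multiplicative factor per step.

```python
def linear_exp_lr(step: int, peak_lr: float, warmup_steps: int, decay_rate: float) -> float:
    """Linear ramp-up to peak_lr, then exponential decay (assumed form)."""
    if step < warmup_steps:
        # Linear ramp-up: grows proportionally with the step count.
        return peak_lr * step / warmup_steps
    # Exponential decay: multiply by decay_rate once per step after warmup.
    return peak_lr * decay_rate ** (step - warmup_steps)
```

With `warmup_steps=100`, the learning rate reaches `peak_lr` at step 100 and shrinks by a factor of `decay_rate` on every later step.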

However, [others](https://arxiv.org/abs/2102.06356) have reported that a square (quadratic) ramp-up and a square decay can perform significantly better, so we might want to support them as well. The modified curve (orange) would look like this:
![Proposed square learning rate curve (orange) compared with the current curve](https://user-images.githubusercontent.com/39779310/184472098-08f727dd-a072-444a-8b78-79e2e42dffe8.png)
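
One possible reading of the square schedule is sketched below. The exact functional form is an assumption, since the issue only asks for a square ramp-up and a square decay: here the ramp-up follows `(step / warmup_steps) ** 2` and the decay follows the squared fraction of remaining steps. The names `square_lr`, `total_steps`, and `end_lr` are hypothetical and not part of the existing scheduler.

```python
def square_lr(step: int, peak_lr: float, warmup_steps: int,
              total_steps: int, end_lr: float = 0.0) -> float:
    """Square ramp-up to peak_lr, then square decay to end_lr (assumed form)."""
    if step < warmup_steps:
        # Square ramp-up: starts flatter than linear, steepens toward the peak.
        return peak_lr * (step / warmup_steps) ** 2
    # Fraction of the decay phase still remaining, clamped to [0, 1].
    decay_steps = max(total_steps - warmup_steps, 1)
    remaining = max(1.0 - (step - warmup_steps) / decay_steps, 0.0)
    # Square decay: drops quickly after the peak, flattens near end_lr.
    return end_lr + (peak_lr - end_lr) * remaining ** 2
```

Unlike an exponential decay, this form needs a known end point (`total_steps`), which would be an extra hyperparameter or would have to be derived from the existing decay duration.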
