Using AIMV2 as the encoder, unfreezing it, and setting the learning rate to 2e-6 drives the LLaVA-NEXT loss to 0 #21

@1359347500cwc

Description

When using AIMV2 as the encoder, unfreezing it and setting the learning rate to 2e-6 leads to the LLaVA-NEXT model reaching a loss of 0 after 3000-4000 steps. The original paper kept the encoder frozen. Why is it not recommended to unfreeze it for training? If I decide to unfreeze it, what learning rate should I set?
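For anyone experimenting with an unfrozen encoder, one common pattern is to give the encoder its own optimizer parameter group so its learning rate can be tuned separately from the rest of the model. The following is only a minimal sketch of that pattern, not the setup used in the paper or in this repository. The helper name `build_param_groups`, the `vision_tower.` name prefix, and the `1e-5` base learning rate are hypothetical placeholders; only the `2e-6` encoder value comes from the question above.

```python
def build_param_groups(named_params, encoder_prefix, encoder_lr, base_lr):
    """Split (name, param) pairs into optimizer groups with separate learning rates.

    `encoder_prefix` is an assumption about how the encoder's parameters are
    named in your model; check the output of `model.named_parameters()`.
    """
    encoder_params, other_params = [], []
    for name, param in named_params:
        # Skip parameters that are still frozen. Objects without a
        # `requires_grad` attribute are treated as trainable.
        if not getattr(param, "requires_grad", True):
            continue
        if name.startswith(encoder_prefix):
            encoder_params.append(param)
        else:
            other_params.append(param)

    groups = [
        {"params": encoder_params, "lr": encoder_lr},
        {"params": other_params, "lr": base_lr},
    ]
    # Drop empty groups so the optimizer only sees groups that hold parameters.
    return [group for group in groups if group["params"]]


# Illustrative call with placeholder names and values (hypothetical):
example_groups = build_param_groups(
    [("vision_tower.blocks.0.weight", "w_enc"), ("language_model.layers.0.weight", "w_llm")],
    encoder_prefix="vision_tower.",
    encoder_lr=2e-6,
    base_lr=1e-5,
)
```

With a PyTorch model, the returned list can be passed directly to an optimizer that accepts per-parameter options, for example `torch.optim.AdamW(build_param_groups(model.named_parameters(), "vision_tower.", 2e-6, 1e-5))`. Keeping the encoder in its own group makes it easy to try different encoder learning rates without touching the rest of the configuration.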
