Auto-Curriculum: Adaptively adjusting the complexity of tasks #17

Open
andreaskoepf opened this issue Jan 29, 2025 · 4 comments · Fixed by #72

@andreaskoepf
Contributor

The idea of an auto-curriculum is to optimize the learning signal by adjusting the difficulty of tasks according to the model's current capabilities.

Training tasks should be neither too hard nor too easy; see, e.g., concepts from psychology such as the zone of proximal development:

Image

In a naive setup with a dataset that contains problems of all levels of difficulty, RL will in the beginning be exposed to many tasks that the model cannot solve, while at a later stage the model might ace many of the simpler ones, which then no longer provide any new information.

These parts are needed:

  1. measuring current model capabilities (see Tracking accuracy per dataset parameter combination #16)
  2. standard way to adjust the task difficulty
  3. curriculum decorator which adjusts difficulty (e.g. at the end of an epoch)
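
A minimal sketch of how the three parts could fit together. `ToyDataset`, `adjust_difficulty`, `CurriculumWrapper`, and the accuracy thresholds are all hypothetical names and values chosen for illustration; none of them is an existing API of this project.

```python
from dataclasses import dataclass


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a procedural dataset with one difficulty knob."""

    level: int = 1
    min_level: int = 1
    max_level: int = 10

    def adjust_difficulty(self, step: int) -> None:
        # Part 2: one uniform way to make the task harder (+) or easier (-).
        self.level = max(self.min_level, min(self.max_level, self.level + step))


class CurriculumWrapper:
    """Part 3: wraps a dataset and adapts its difficulty from observed accuracy."""

    def __init__(self, dataset, low: float = 0.3, high: float = 0.8):
        self.dataset = dataset
        self.low = low    # below this accuracy the tasks are assumed too hard
        self.high = high  # above this accuracy the tasks are assumed too easy
        self.correct = 0
        self.total = 0

    def record(self, solved: bool) -> None:
        # Part 1: track how well the model is currently doing.
        self.correct += int(solved)
        self.total += 1

    def end_epoch(self) -> float:
        """Adjust the difficulty once per epoch and reset the counters."""
        if self.total == 0:
            return 0.0  # nothing observed, so leave the difficulty unchanged
        accuracy = self.correct / self.total
        if accuracy > self.high:
            self.dataset.adjust_difficulty(+1)
        elif accuracy < self.low:
            self.dataset.adjust_difficulty(-1)
        self.correct = 0
        self.total = 0
        return accuracy
```

In this sketch the training loop would call `record` after scoring each answer and `end_epoch` once per epoch; the two thresholds define the band in which the difficulty is left alone.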

For task difficulty adjustment we could add another (abstract) method to the ProceduralDataset base class which could be implemented in a task/dataset dependent form in the derived classes.
An extended form of the curriculum decorator should be able to adjust the difficulty of a dataset-collection, potentially also adjusting the frequency at dataset level for batch sampling.
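
For the dataset-collection case, adjusting the sampling frequency per dataset could look roughly like the sketch below. The weighting rule `acc * (1 - acc)`, which favors datasets the model solves about half the time, is only one possible heuristic and an assumption of this example; the function names and the `floor` parameter are hypothetical as well.

```python
import random


def sampling_weights(accuracy_by_dataset: dict, floor: float = 0.05) -> dict:
    """Turn per-dataset accuracies into normalized sampling probabilities.

    Datasets with accuracy near 0.5 get the highest weight; `floor` keeps
    fully solved or fully unsolved datasets from disappearing entirely.
    """
    raw = {
        name: acc * (1.0 - acc) + floor
        for name, acc in accuracy_by_dataset.items()
    }
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}


def sample_batch_sources(weights: dict, batch_size: int, rng: random.Random) -> list:
    """Pick which dataset each item of a batch should be drawn from."""
    names = list(weights)
    probabilities = [weights[name] for name in names]
    return rng.choices(names, weights=probabilities, k=batch_size)
```

A curriculum over a collection could recompute these weights at the same point where it adjusts the per-dataset difficulty, for example at the end of an epoch.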

@andreaskoepf andreaskoepf added the planning Ideation phase / overview, needs breakdown label Jan 29, 2025
@andthattoo

Adding an abstract method to the ProceduralDataset class could resolve this issue. Task difficulty can be defined as a function of task-specific parameters that vary depending on the implementation.

WordSortingDataset: Complexity increases with longer sequences or a greater number of words.
LetterJumbleDataset: Difficulty scales with the length of the scrambled word.
FamilyRelationshipsDataset: Challenge depends on the min_family_size parameter.

The only question is whether this is applicable to every possible procedural dataset (possibly).
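
As a rough illustration of difficulty as a function of task-specific parameters, the sketch below maps an integer level to parameters for the three datasets named above. Only `min_family_size` is taken from this thread; `num_words`, `max_word_length`, `word_length`, and all numeric ranges are made-up placeholders, not the datasets' real config fields.

```python
def word_sorting_params(level: int) -> dict:
    # Hypothetical knobs: more words and longer words as the level rises.
    return {"num_words": 3 + 2 * level, "max_word_length": 4 + level}


def letter_jumble_params(level: int) -> dict:
    # Hypothetical knob: longer scrambled words are harder to unscramble.
    return {"word_length": 3 + level}


def family_relationships_params(level: int) -> dict:
    # min_family_size is the parameter mentioned in this thread.
    return {"min_family_size": 4 + level}


# Registry from dataset name to its difficulty mapping.
DIFFICULTY_PARAMS = {
    "WordSortingDataset": word_sorting_params,
    "LetterJumbleDataset": letter_jumble_params,
    "FamilyRelationshipsDataset": family_relationships_params,
}


def params_for(dataset_name: str, level: int) -> dict:
    """Return the parameters for a dataset at the given difficulty level."""
    if level < 0:
        raise ValueError("level must be non-negative")
    if dataset_name not in DIFFICULTY_PARAMS:
        # Datasets without a registered mapping do not support scaling.
        raise NotImplementedError(f"{dataset_name} has no difficulty mapping")
    return DIFFICULTY_PARAMS[dataset_name](level)
```

A dataset without an obvious scalar notion of difficulty would simply have no entry here, which is one way to handle the applicability question.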

@andreaskoepf
Contributor Author

We could, for example, add a method to ProceduralDataset which returns an adjusted Config class. Datasets which don't support difficulty scaling would fall back to the base-class variant, which raises an exception. We could pass a float value to indicate how much the difficulty should change relative to the current one: 1.0 = no change, 2.0 = ~twice as hard, 0.5 = ~half as hard. Or we define an absolute difficulty scale starting with 0 = trivial, 1 = default values, with an open-ended scale upwards. Probably the only difference is the base config (either the default values or the config of the current instance), so absolute or relative could be specified with a bool parameter.

A specific implementation would need to be created for each dataset. But it would be a clean way to get from a scalar value to a dataset config object.
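
A minimal sketch of this scalar-to-config idea, assuming a frozen dataclass config. `ChainSumLikeConfig`, its fields `num_terms` and `num_digits`, the method name `scaled_config`, and the linear scaling rule are hypothetical stand-ins, not the actual ChainSum or ProceduralDataset code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ChainSumLikeConfig:
    """Hypothetical config; the field names are illustrative only."""

    num_terms: int = 4
    num_digits: int = 2


class BaseDataset:
    """Stand-in for the base class: scaling is unsupported by default."""

    def __init__(self, config):
        self.config = config

    def scaled_config(self, factor: float, absolute: bool = False):
        raise NotImplementedError(
            f"{type(self).__name__} does not support difficulty scaling"
        )


class ChainSumLike(BaseDataset):
    def scaled_config(self, factor: float, absolute: bool = False):
        if factor < 0:
            raise ValueError("factor must be non-negative")
        # Absolute: scale the default values (0 = trivial, 1 = defaults).
        # Relative: scale the config of the current instance (1.0 = no change).
        base = ChainSumLikeConfig() if absolute else self.config
        return replace(
            base,
            num_terms=max(2, round(base.num_terms * factor)),
            num_digits=max(1, round(base.num_digits * factor)),
        )
```

Scaling every field linearly is a simplification: doubling both the number of terms and the number of digits probably makes a task more than twice as hard, so a real implementation would need a per-dataset calibration.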

I suggest we start with the implementation for something as simple as ChainSum.

@andthattoo let me know if you have time to give it a shot. :-)

@EduardDurech
Collaborator

EduardDurech commented Jan 30, 2025

@andreaskoepf
Contributor Author

OK, we will continue the planning under Discussions: Curriculum Crafting #27

@EduardDurech EduardDurech self-assigned this Feb 1, 2025
@andreaskoepf andreaskoepf linked a pull request Feb 6, 2025 that will close this issue
@andreaskoepf andreaskoepf reopened this Feb 7, 2025
@andreaskoepf andreaskoepf self-assigned this Feb 7, 2025