Auto-Curriculum: Adaptively adjusting the complexity of tasks #17

andreaskoepf · 2025-01-29T10:12:59Z

The idea of an auto-curriculum is to optimize the learning signal by adjusting the difficulty of tasks depending on the model's capabilities.

Training tasks should be neither too hard nor too easy; compare concepts from psychology such as the zone of proximal development.

In a naive setup with a dataset that contains problems of all difficulty levels, RL will initially be exposed to many tasks that the model cannot solve, while at a later stage the model might ace many of the simpler ones, which then no longer provide any new information.

These parts are needed:

measuring current model capabilities (see Tracking accuracy per dataset parameter combination #16)
standard way to adjust the task difficulty
curriculum decorator which adjusts difficulty (e.g. at the end of an epoch)

For task difficulty adjustment we could add another (abstract) method to the ProceduralDataset base class which could be implemented in a task/dataset dependent form in the derived classes.
An extended form of the curriculum decorator should be able to adjust the difficulty of a dataset-collection, potentially also adjusting the frequency at dataset level for batch sampling.
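To make the three parts above concrete, here is a minimal sketch of a curriculum wrapper that adjusts difficulty at the end of an epoch. Everything in it is an assumption for illustration: `ToyDataset`, `Curriculum`, the single integer `level` knob, and the 0.3 / 0.8 accuracy thresholds are hypothetical and not part of the existing ProceduralDataset API.

```python
from dataclasses import dataclass


class ToyDataset:
    """Hypothetical stand-in for a procedural dataset with one difficulty knob."""

    def __init__(self, level: int = 1):
        self.level = level

    def with_level(self, level: int) -> "ToyDataset":
        # Never go below the easiest level.
        return ToyDataset(level=max(1, level))


@dataclass
class Curriculum:
    """Wraps a dataset and moves its difficulty level between epochs."""

    dataset: ToyDataset
    too_easy: float = 0.8  # accuracy above this: switch to harder tasks (assumed value)
    too_hard: float = 0.3  # accuracy below this: switch to easier tasks (assumed value)

    def end_of_epoch(self, accuracy: float) -> None:
        if accuracy > self.too_easy:
            self.dataset = self.dataset.with_level(self.dataset.level + 1)
        elif accuracy < self.too_hard:
            self.dataset = self.dataset.with_level(self.dataset.level - 1)
        # Otherwise the tasks are in the useful range; keep the level.


curriculum = Curriculum(ToyDataset(level=1))
for measured_accuracy in [0.9, 0.95, 0.5, 0.1]:
    curriculum.end_of_epoch(measured_accuracy)
final_level = curriculum.dataset.level
```

The same pattern could probably be extended to a dataset collection by keeping one level (and one sampling weight) per dataset, but that part is not sketched here.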


andthattoo · 2025-01-30T02:54:32Z

Adding an abstract method to the ProceduralDataset class could resolve this issue. Task difficulty can be defined as a function of task-specific parameters that vary depending on the implementation.

WordSortingDataset: Complexity increases with longer sequences or a greater number of words.
LetterJumbleDataset: Difficulty scales with the length of the scrambled word.
FamilyRelationshipsDataset: Challenge depends on the min_family_size parameter.
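As a rough illustration of difficulty being a function of task-specific parameters, the sketch below keeps one "one step harder" rule per parameter type. Only `min_family_size` is a name taken from this thread; the parameter classes, the other field names, and all default values are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WordSortingParams:
    num_words: int = 3  # hypothetical field name and default


@dataclass(frozen=True)
class LetterJumbleParams:
    word_length: int = 4  # hypothetical field name and default


@dataclass(frozen=True)
class FamilyRelationshipsParams:
    min_family_size: int = 4  # name from the thread; default value is assumed


# One "make it one step harder" rule per parameter type.
HARDER_RULES = {
    WordSortingParams: lambda p: replace(p, num_words=p.num_words + 1),
    LetterJumbleParams: lambda p: replace(p, word_length=p.word_length + 1),
    FamilyRelationshipsParams: lambda p: replace(p, min_family_size=p.min_family_size + 1),
}


def make_harder(params):
    """Return a copy of params that is one difficulty step harder."""
    rule = HARDER_RULES.get(type(params))
    if rule is None:
        raise NotImplementedError(f"no difficulty rule for {type(params).__name__}")
    return rule(params)


harder_family = make_harder(FamilyRelationshipsParams())
```

A dataset without an entry in the rule table simply raises, which mirrors the open question of whether every procedural dataset has a meaningful difficulty knob.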

The only open question is whether this is applicable to every possible procedural dataset (possibly).

andreaskoepf · 2025-01-30T10:52:26Z

We could, for example, add a method to ProceduralDataset which returns an adjusted Config object. Datasets which don't support difficulty scaling fall back to the base class variant, which raises an exception. We could pass a float value to indicate how much the difficulty should change relative to the current one: 1.0 = no change, 2.0 = ~twice as hard, 0.5 = ~half as hard. Or we define an absolute difficulty scale with 0 = trivial and 1 = default values, open-ended upwards. Probably the only difference is the base config (either the default values or the config of the current instance), so absolute or relative could be selected with a bool parameter.

A specific implementation would need to be created for each dataset. But it would be a clean way to get from a scalar value to a dataset config object.

I suggest we start with the impl for something as simple as ChainSum.
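A minimal sketch of that scalar-to-config method, assuming a toy dataset loosely modelled on a chain-sum task. The names `BaseDataset`, `ToyChainSum`, `ToyChainSumConfig`, `scaled_config`, and the `num_terms` field are all hypothetical; the real ChainSum config may look different.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyChainSumConfig:
    num_terms: int = 4  # hypothetical field; 4 is an assumed default


class BaseDataset:
    """Stand-in for the base class: no difficulty scaling by default."""

    def __init__(self, config):
        self.config = config

    def scaled_config(self, difficulty: float, absolute: bool = False):
        raise NotImplementedError(
            f"{type(self).__name__} does not support difficulty scaling"
        )


class ToyChainSum(BaseDataset):
    def scaled_config(self, difficulty: float, absolute: bool = False):
        # Absolute scale: start from the defaults (1.0 = default values).
        # Relative scale: start from this instance's config (1.0 = no change).
        base = ToyChainSumConfig() if absolute else self.config
        # A sum needs at least two terms, so 0 maps to the trivial case.
        return replace(base, num_terms=max(2, round(base.num_terms * difficulty)))


dataset = ToyChainSum(ToyChainSumConfig(num_terms=6))
relative_harder = dataset.scaled_config(2.0)                 # based on current config
absolute_harder = dataset.scaled_config(2.0, absolute=True)  # based on defaults
absolute_trivial = dataset.scaled_config(0.0, absolute=True)
```

Scaling a single integer field linearly is only one possible reading of "~twice as hard"; a real implementation would likely have to choose which fields to scale per dataset.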

@andthattoo let me know if you have time to give it a shot. :-)

EduardDurech · 2025-01-30T12:47:50Z

Post-implementation https://github.com/open-thought/reasoning-gym/discussions/27

andreaskoepf · 2025-01-31T07:17:38Z

ok, we continue the planning under discussions: Curriculum Crafting #27

andreaskoepf added the planning Ideation phase / overview, needs breakdown label Jan 29, 2025

EduardDurech self-assigned this Feb 1, 2025

andreaskoepf linked a pull request Feb 6, 2025 that will close this issue

Add Coaching & ScoreBoard class (result tracking) #72

Merged

andreaskoepf closed this as completed in #72 Feb 6, 2025

andreaskoepf reopened this Feb 7, 2025

andreaskoepf self-assigned this Feb 7, 2025
