A reinforcement learning agent that uses the semi-gradient temporal-difference (TD) method for policy evaluation.

rahimi-mohammad/Semi-TD-Agent-in-Random-walk-environment

Semi-TD-Agent-in-Random-walk-environment

We first applied semi-gradient TD with state aggregation to a policy evaluation task. We then repeated the same evaluation using semi-gradient TD with a simple neural network.
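As a rough illustration of the first approach, the sketch below shows one semi-gradient TD(0) update with state aggregation. It is not the repository's code: the number of groups (`NUM_GROUPS = 10`), the step size, the discount factor, and the helper names `group_of`, `value`, and `td_update` are all assumptions chosen for illustration.

```python
import numpy as np

NUM_STATES = 500
NUM_GROUPS = 10                        # hypothetical: 10 groups of 50 states each
GROUP_SIZE = NUM_STATES // NUM_GROUPS


def group_of(state):
    """Map a state in 1..NUM_STATES to the index of its aggregation group."""
    return (state - 1) // GROUP_SIZE


def value(weights, state):
    """Approximate value: every state in a group shares a single weight."""
    return weights[group_of(state)]


def td_update(weights, state, reward, next_state, done, alpha=0.01, gamma=1.0):
    """Apply one semi-gradient TD(0) update in place and return the TD error.

    With state aggregation the feature vector is one-hot, so the gradient of
    the value estimate touches only the weight of the current state's group.
    """
    # The bootstrap term is dropped when the transition ends the episode.
    target = reward if done else reward + gamma * value(weights, next_state)
    td_error = target - value(weights, state)
    weights[group_of(state)] += alpha * td_error
    return td_error
```

For example, starting from `weights = np.zeros(NUM_GROUPS)`, calling `td_update(weights, 480, 1.0, 500, True, alpha=0.1)` moves only the weight of the last group toward the observed reward.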

Our agent evaluates a fixed policy in the 500-State Random Walk environment. Each episode starts with the agent at the center and ends when the agent reaches state 1 (far left) or state 500 (far right). At each step, the agent moves left or right with equal probability, and the environment determines how far it moves in the chosen direction.
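The environment dynamics described above could be sketched roughly as follows. Several details are assumptions not stated in this README: the start state (250), the maximum jump distance (`MAX_JUMP = 100`, drawn uniformly), the clamping of moves to the range 1..500, and the rewards (-1 at the left end, +1 at the right end, 0 otherwise). The function names are hypothetical as well.

```python
import random

NUM_STATES = 500
START_STATE = 250   # assumed "center" state
MAX_JUMP = 100      # hypothetical maximum distance moved in one step


def step(state, rng):
    """Take one step of the equiprobable random policy.

    Returns (next_state, reward, done).
    """
    direction = rng.choice((-1, 1))        # left or right with equal chance
    distance = rng.randint(1, MAX_JUMP)    # the environment picks the distance
    # Assumption: moves past either end are clamped to the end state.
    next_state = max(1, min(NUM_STATES, state + direction * distance))
    if next_state == 1:
        return next_state, -1.0, True      # assumed reward at the far left
    if next_state == NUM_STATES:
        return next_state, 1.0, True       # assumed reward at the far right
    return next_state, 0.0, False


def run_episode(seed=None):
    """Run one episode and return a list of (state, reward, next_state)."""
    rng = random.Random(seed)
    state, done = START_STATE, False
    trajectory = []
    while not done:
        next_state, reward, done = step(state, rng)
        trajectory.append((state, reward, next_state))
        state = next_state
    return trajectory
```

Calling `run_episode(seed=0)` returns a trajectory that begins at state 250 and ends at state 1 or state 500. Each transition in it could be fed to a TD update.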

This project is based on the assignments of the Coursera Reinforcement Learning Specialization: https://www.coursera.org/specializations/reinforcement-learning
