Burlington, Vermont
Students will program agents that learn to optimize a reward function using reinforcement learning. Topics include Markov Decision Processes with discrete states, Value Iteration, Policy Iteration, Q-learning and SARSA, and methods for value function approximation in complex domains using linear and non-linear methods.
Units: 3.0
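
As an illustration of two topics named in the description, the sketch below contrasts the tabular Q-learning and SARSA update rules. It is not taken from the course materials: the function names, the dictionary-based Q-table, the toy states and actions, and the values of `alpha` and `gamma` are all assumptions chosen for the example.

```python
from collections import defaultdict

def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """Off-policy TD update: bootstrap from the best-valued action in s_next."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken in s_next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])

# Hypothetical two-action problem with hand-set values for state 1.
actions = ["left", "right"]
Q_ql = defaultdict(float)
Q_ql[(1, "left")] = 1.0
Q_ql[(1, "right")] = 2.0
Q_sarsa = defaultdict(float, Q_ql)  # same starting table for both methods

# Same transition (state 0, action "right", reward 1.0, next state 1) for both.
q_learning_update(Q_ql, 0, "right", 1.0, 1, actions)
# SARSA also needs the next action; assume the agent explores with "left".
sarsa_update(Q_sarsa, 0, "right", 1.0, 1, "left")

print(Q_ql[(0, "right")], Q_sarsa[(0, "right")])
```

With identical experience, Q-learning bootstraps from the higher-valued next action while SARSA uses the action the agent actually chose, so the two estimates for the same state-action pair differ whenever the next action is exploratory.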