[Introduction to Reinforcement Learning with David Silver] #3. Planning by Dynamic Programming

by CheeseBro 2022. 10. 29.

Introduction

What is Dynamic Programming?

Requirements for Dynamic Programming

Policy Evaluation

Iterative Policy Evaluation

Example: Small Gridworld

Policy Iteration

Value Iteration

Principle of Optimality

Value Iteration

Summary

※ This post is a summary of reinforcement learning based on Professor David Silver's Introduction to Reinforcement Learning lectures.

Lecture video: https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PLqYmG7hTraZDM-OYHWgPebj2MfCFzFObQ
Lecture slides: https://www.davidsilver.uk/teaching/

License: Attribution, NonCommercial, NoDerivatives
