RAPID UNDERSTANDING AND BOND NEUROPLASTICITY FORMING
* Repeatedly look at it, re-look at it (multiple things at once)
* When I look at different things I feel more motivated. Why is this?
Follow the OpenAI guide; don't fall down too many rabbit holes.
Why does the model only seek to optimize its ability to win and not to defend?

Introduction

ESSAY: Spinning Up as a Deep RL Researcher: https://spinningup.openai.com/en/latest/spinningup/spinningup.html

The Right Background

Installation

Algorithms

On-Policy Algorithms

Off-Policy Algorithms

Code Format

Part 1: Key Concepts in RL (notations)

Link between Value and Action-Value functions: Taking an action, why do we need this?
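
A possible answer, as a rough sketch: the value function averages the action-value function over the policy's action choices, V(s) = sum over a of pi(a|s) * Q(s, a), so Q keeps the per-action detail that V throws away. That detail is what lets us pick an action directly (take the one with the largest Q) without needing a model of the environment. The numbers and the helper names `state_value` and `greedy_action` below are made up for illustration and do not come from the guide.

```python
# Toy illustration for a single state with three discrete actions.
# All values are hypothetical; they only demonstrate the identity.

def state_value(policy_probs, q_values):
    """V(s) as the policy-weighted average of Q(s, a)."""
    return sum(p * q for p, q in zip(policy_probs, q_values))

def greedy_action(q_values):
    """Index of the action with the largest Q(s, a)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])

q_values = [1.0, 3.0, 2.0]         # assumed Q(s, a) for actions 0, 1, 2
policy_probs = [0.5, 0.25, 0.25]   # assumed pi(a|s), sums to 1

v = state_value(policy_probs, q_values)
best = greedy_action(q_values)

print("V(s) under the policy:", v)
print("Greedy action from Q:", best)
```

V alone says how good the state is under the current policy; Q says which action to change to do better, which is why the action-value function matters once we actually have to act.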

Opportunities

Policy Optimization