What is the purpose of the REINFORCE algorithm in reinforcement learning?

Question

Machine Learning — Hard

What is the purpose of the REINFORCE algorithm in reinforcement learning?

Accepted Answer

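REINFORCE is a Monte Carlo policy gradient algorithm. Its purpose is to optimize a parameterized policy directly, by gradient ascent on the expected return. It estimates the gradient of the expected return by sampling trajectories from the current policy and, for each action taken, weighting the gradient of that action's log-probability by the return actually observed from that step onward. Because it uses sampled returns, it does not need a learned value function, although a baseline is often subtracted from the return to reduce the variance of the estimate.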
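To make the estimator concrete, here is a minimal NumPy sketch, not a reference implementation. It assumes a tabular softmax policy whose parameters `theta` are per-state action logits; the function names and the one-state, two-action toy task in `train` (reward 1 for action 1, reward 0 for action 0) are hypothetical and chosen only for illustration.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()


def discounted_returns(rewards, gamma):
    """Return G_t = sum over k >= t of gamma^(k-t) * r_k, for every step t."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_gradient(theta, states, actions, rewards, gamma=0.99):
    """Gradient estimate from one sampled trajectory.

    theta has shape (n_states, n_actions) and holds the logits of an
    assumed tabular softmax policy. The estimate is the sum over steps of
    G_t * grad log pi(a_t | s_t).
    """
    grad = np.zeros_like(theta)
    returns = discounted_returns(rewards, gamma)
    for s, a, g in zip(states, actions, returns):
        probs = softmax(theta[s])
        # For a softmax policy: d log pi(a|s) / d theta[s] = onehot(a) - probs
        grad_log = -probs
        grad_log[a] += 1.0
        grad[s] += g * grad_log
    return grad


def train(n_episodes=2000, lr=0.1, seed=0):
    """Gradient ascent on a hypothetical one-step task with two actions."""
    rng = np.random.default_rng(seed)
    theta = np.zeros((1, 2))  # one state, two actions
    for _ in range(n_episodes):
        probs = softmax(theta[0])
        a = rng.choice(2, p=probs)      # sample an action from the policy
        r = 1.0 if a == 1 else 0.0      # toy reward: only action 1 pays off
        theta += lr * reinforce_gradient(theta, [0], [a], [r], gamma=1.0)
    return softmax(theta[0])
```

Calling `train()` returns the learned action probabilities, which should put most of the mass on action 1. Note that no value function appears anywhere: the only learning signal is the sampled return, which is also why the estimate tends to be noisy on harder problems.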