What is the difference between actor-critic and pure policy gradient methods in RL?

Question

Machine Learning — Hard

What is the difference between actor-critic and pure policy gradient methods in RL?

Accepted Answer

Actor-critic methods combine a policy network (actor) with a value function (critic) to reduce variance in the policy gradient. Pure policy gradient methods like REINFORCE solely rely on returns from sampled trajectories, leading to higher variance.