It appears that the homework was due two days before this answer was written, but in case it is still relevant, the class notes (which would have been useful to include in the question along with the homework) are here.
The first task set for the student is, "Please show equation 12 by using the law of iterated expectations, breaking by decoupling the state-action marginal from the rest of the trajectory." Equation 12 is this.
The class notes identify the state-action marginal. What is sought is not a proof but a sequence of algebraic steps that performs the decoupling and shows the degree to which independence of the state-action marginal can be achieved.
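As a rough sketch of what such a decoupling looks like in general, the law of iterated expectations can be written as follows. The notation is an assumption made for illustration and is not taken from the class notes or the homework: $X$ and $Y$ are arbitrary random variables, $\tau$ is a trajectory, $(s_t, a_t)$ is the state-action pair at time $t$, $p(s_t, a_t)$ is its marginal, and $f$ is any function that depends only on $(s_t, a_t)$.

$$
\mathbb{E}[X] = \mathbb{E}_{Y}\big[\,\mathbb{E}[X \mid Y]\,\big]
$$

$$
\mathbb{E}_{\tau \sim p(\tau)}\big[f(s_t, a_t)\big]
= \mathbb{E}_{(s_t, a_t) \sim p(s_t, a_t)}\Big[\,\mathbb{E}_{\tau \sim p(\tau \mid s_t, a_t)}\big[f(s_t, a_t)\big]\Big]
= \mathbb{E}_{(s_t, a_t) \sim p(s_t, a_t)}\big[f(s_t, a_t)\big]
$$

The inner expectation drops out because $f(s_t, a_t)$ is constant once $(s_t, a_t)$ is fixed, so the rest of the trajectory integrates to one. The homework's actual notation and the terms inside the expectation may differ from this sketch.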
This exercise is preparation for the next step in the homework and draws only on the review of CS189, Berkeley's Introduction to Machine Learning course, which does not contain the Law of Total Expectation (another name for the law of iterated expectations) in its syllabus or class notes.
All the relevant information is in the class notes linked above, and the exercise requires only intermediate algebra.