Deep learning models: what do early layers vs. later layers learn?

Is it true that the early layers in a deep learning model identify simpler shapes and patterns, while the later layers identify more complex patterns?
Some people say this is only what we think the model is doing, as opposed to what the model is actually doing. Which one is it?
How can we verify which one is correct?