refactor(psci): unify coherency exit between AArch64 and AArch32

The procedure is fairly simple: if we have hardware assisted coherency,
call into the cpu driver and let it do its thing. If we don't, then we
must turn data caches off, handle the confusion that causes with the
stack, and call into the cpu driver which will flush the caches that
need flushing.

On AArch32 the above happens in common code. On AArch64, however, the
turning off of the caches happens in the cpu driver. Since we're dealing
with the stack, we must exercise control over it and implement this in
assembly. But as the two implementations are nominally different (in the
ordering of operations), the part that is in assembly is quite large as
jumping back to C to handle the difference might involve the stack.

Presumably, the AArch difference was introduced in order to cater for a
possible implementation where turning off the caches requires an IMP DEF
sequence. Well, Arm no longer makes cores without hardware assisted
coherency, so this eventually is not possible.

So take this part out of the cpu driver and put it into common code,
just like in AArch32. With this, there is no longer a need call
prepare_cpu_pwr_dwn() in a different order either - we can delay it a
bit to happen after the stack management. So the two AArch-s flows
become identical. We can convert prepare_cpu_pwr_dwn() to C and leave
psci_do_pwrdown_cache_maintenance() only to exercise control over stack.

Change-Id: Ie4759ebe20bb74b60533c6a47dbc2b101875900f
Signed-off-by: Boyan Karatotev <boyan.karatotev@arm.com>
14 files changed