When both inputs to a Layered Blend Per Bone node are instances of the same cached pose, we aren't sampling root motion correctly. In the attached example, the weight on the root bone is 0 and Blend Root Motion Based On Root Bone is enabled on the layered blend node, so we should take 100% of the root motion from Base Pose.
However, because of how the Cached Pose node works, this breaks. We pass two contexts to FAnimNode_SaveCachedPose::Update_AnyThread for the two sources of the layered blend. The context for Base Pose has a blend weight of 1 and a root motion weight of 1 (which is correct), and the context for Blend Poses 0 has a blend weight of 1 and a root motion weight of 0 (this is also correct, as we are taking root motion from Base Pose, not Blend Poses 0).
In FAnimNode_SaveCachedPose::PostGraphUpdate, when multiple contexts have the same weight, we just choose whichever was added first. In the case of the layered blend node, the context for Blend Poses 0 is added first (with the root motion weight of 0). This means that when we update the branch under the cached pose node, we set the sequence evaluators not to sample root motion. When the output of the cached pose node is then used for each source of the layered blend node, we have no root motion at all.
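The selection behaviour described above can be illustrated with a small standalone sketch. This is not engine code: CachedUpdateContext, PickUpdateContext and MakeLayeredBlendContexts are hypothetical names, and the sketch assumes the cached pose keeps the context with the highest blend weight and lets the first one added win on a tie, which is how the behaviour is described here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the per-caller update context.
// This is NOT the engine's actual type.
struct CachedUpdateContext {
    std::string Source;      // which layered blend input requested the update
    float BlendWeight;       // blend weight carried by that context
    float RootMotionWeight;  // root motion weight carried by that context
};

// Assumed selection rule: keep the context with the highest blend weight.
// The strict '>' means that on a tie the context added first wins.
inline const CachedUpdateContext& PickUpdateContext(
    const std::vector<CachedUpdateContext>& Contexts) {
    assert(!Contexts.empty());
    std::size_t Best = 0;
    for (std::size_t i = 1; i < Contexts.size(); ++i) {
        if (Contexts[i].BlendWeight > Contexts[Best].BlendWeight) {
            Best = i;
        }
    }
    return Contexts[Best];
}

// The two contexts from the report, in the order they are added:
// Blend Poses 0 first, then Base Pose.
inline std::vector<CachedUpdateContext> MakeLayeredBlendContexts() {
    return {
        {"Blend Poses 0", 1.0f, 0.0f},
        {"Base Pose", 1.0f, 1.0f},
    };
}
```

With both blend weights equal to 1, PickUpdateContext returns the Blend Poses 0 context, so the branch under the cached pose is updated with a root motion weight of 0 even though Base Pose asked for a weight of 1.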
Some potential solutions:
There's no existing public thread on this issue, so head over to Questions & Answers and mention UE-193706 in your post.