Google has been quietly doing that for more than 10 years; we just didn't start calling this stuff AI until 2022. Google has had offline speech-to-text (and always-on local hotword detection for "hey Google") since the 2013 Moto X, and added hardware support for processing images in the camera app as they were captured.
The tasks Google offloaded onto the Tensor chip starting in 2021 opened up more image-editing features (various algorithms for tuning and editing images), keyboard corrections and spelling/grammar suggestions that got better (and then worse), audio processing (better noise cancellation on calls, an always-on Shazam-like song recognition function that worked entirely offline), etc.
Apple pushed harder at applying those AI capabilities to on-device language processing, and at making it obvious. But personally, I think the tech industry as a whole has grossly overcorrected toward flashy AI, pushed beyond the limits of what the tech can competently do, instead of the quiet background stuff that just worked while using specialized hardware to efficiently crunch tensor math.
The actual key management and encryption protocols are published. Each new device generates its own key pair and registers its public key with an Apple-maintained directory. When a client wants to send a message, it queries the directory to learn which devices it should encrypt to, and the public key for each one.
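To make the fan-out concrete, here's a toy sketch of that directory lookup and per-device encryption. This is illustrative only: the names (`KeyDirectory`, `send_message`) are made up, and the "encryption" is faked as a tuple rather than real public-key crypto.

```python
import secrets

class KeyDirectory:
    """Stands in for the Apple-maintained directory of device public keys."""
    def __init__(self):
        self._devices = {}  # user_id -> {device_id: public_key}

    def register(self, user_id, device_id, public_key):
        # A new device generates its own key pair and registers the public half.
        self._devices.setdefault(user_id, {})[device_id] = public_key

    def lookup(self, user_id):
        # Sender asks: which devices does this user have, and what are
        # their public keys?
        return dict(self._devices.get(user_id, {}))

def send_message(directory, recipient, plaintext):
    # One ciphertext per registered device, each "encrypted" to that
    # device's public key (faked here; the real protocol uses per-device
    # public-key encryption).
    envelopes = []
    for device_id, pub in directory.lookup(recipient).items():
        envelopes.append((device_id, ("enc", pub, plaintext)))
    return envelopes

# A recipient with two registered devices gets two separate envelopes:
d = KeyDirectory()
d.register("alice", "phone", secrets.token_hex(8))
d.register("alice", "laptop", secrets.token_hex(8))
print(len(send_message(d, "alice", "hi")))  # 2
```

The point is just that the sender, not Apple, does the per-device encryption after consulting the directory.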
A newly added device can't retrieve old messages on its own, but history can be transferred from an old device if it's still working and online.
Basically, if you've configured things for maximum security, you will lose your message history if you lose or break your only logged-in device.
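That failure mode falls straight out of the fan-out model above. A toy illustration (made-up structure, no real crypto): each message was encrypted only to the device keys that existed at send time, so a key added later simply has no envelope to decrypt.

```python
# One device registered at send time.
directory = {"alice": {"phone": "pk_phone"}}

# The message fan-out captures the key set as of right now:
old_message = {dev: ("enc", pk, "hello")
               for dev, pk in directory["alice"].items()}

# Later, Alice adds a laptop with a freshly generated key:
directory["alice"]["laptop"] = "pk_laptop"

# The laptop has no envelope for the old message...
print("laptop" in old_message)  # False
# ...so history reaches it only if an existing device transfers it over.
```

If the phone is the only logged-in device and it dies, nothing holds a key that can open those old envelopes.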
There's no real way to audit whether Apple's implementation follows the protocols they've published, but we've seen no indicators that they aren't doing what they say.