It’s good at writing it, ideally 50-250 lines at a time
I find Claude Sonnet 4.5 to be good up to about 800 lines per chunk. If you structure your project into 800-ish-line chunks with well-defined interfaces, you can get 8 to 10 chunks working cooperatively pretty easily. Beyond about 2000 lines in a chunk, if it's not well defined, yeah, the hallucinations start to become seriously problematic.
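For what it's worth, here is a rough sketch of how you might keep an eye on those chunk sizes. The 800 and 2000 thresholds are just my rules of thumb from above, and the function name, bucket names, and sample file names are all made up for illustration; nothing here is part of any real tool.

```python
def classify_chunks(line_counts, comfortable=800, risky=2000):
    """Sort modules into rough buckets by line count.

    The default thresholds are rule-of-thumb assumptions, not measured limits.
    """
    report = {"ok": [], "watch": [], "split": []}
    for name, lines in sorted(line_counts.items()):
        if lines <= comfortable:
            report["ok"].append(name)      # comfortably within one chunk
        elif lines <= risky:
            report["watch"].append(name)   # workable if the interface is well defined
        else:
            report["split"].append(name)   # past the point where things tend to go wrong
    return report


# Hypothetical project: module name -> line count
sample = {"parser.py": 640, "scheduler.py": 1450, "everything.py": 2600}
print(classify_chunks(sample))
```

Anything that lands in "split" is a candidate for breaking into smaller pieces behind a clear interface before handing it to the model.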
The new Opus 4.5 may have a higher complexity limit; I haven't worked with it enough to characterize it. I do find that Opus 4.5 gets much slower than Sonnet 4.5 did on similar problems.
I think complicated software has been more art than science. For the past 30 years we have been developing formal processes to make it more of a procedural pursuit, but the art is still very much in there.
I think if AI-authored software is going to reach any level of valuable complexity, it's going to get there with the best of our current formal processes plus some more that are being (rapidly) developed specifically for LLM-based tools.
And how do we surpass those limits? Generally: research. And for the past 20+ years, where have we done most of that research? On the internet. And where were the LLMs trained, and what are they relatively good at doing quickly? Internet research.
So is semiconductor design, application of transistors to implement logic gates, etc. We still have people who can do that, not very many, but enough. Not many people work in assembly language anymore, either...