According to https://arxiv.org/abs/2405.21015
The absolute most monstrous, energy guzzling model tested needed 10 MW of power to train.
Most models need less than that, and non-frontier models can even be trained on gaming hardware with comparatively little energy consumption.
That paper by the way says there is a 2.4x increase YoY for model training compute, BUT that paper doesn't mention DeepSeek, which rocked the western AI world with comparatively little training cost (2.7 M GPU Hours in total)
Some companies offset their model training environmental damage with renewable and whatever bullshit, so the actual daily usage cost is more important than the huge cost at the start (Drop by drop is an ocean formed - Persian proverb)
I can't really provide any further insight without finding the damn paper again (academia is cooked) but Inference is famously low-cost, this is basically "average user damage to the environment" comparison, so for example if a user chats with ChatGPT they gobble less water comparatively than downloading 4K porn (at least according to this particular paper)
As with any science, statistics are varied and to actually analyze this with rigor we'd need to sit down and really go down deep and hard on the data. Which is more than I intended when I made a passing comment lol