• markovs_gun@lemmy.world · English · 23 upvotes, 3 downvotes · 19 hours ago

    If I understand correctly, this is essentially how condensed models like DeepSeek work, and how they’re able to attain similar performance on much cheaper hardware. It all still goes through the LLM, but the LLM is a lot lighter because it has this sort of thing built in. That’s a vast oversimplification, though.