There is no longer any CUDA dependency anywhere in its stack, which is probably the biggest deal of all. For those who don’t know, CUDA is Nvidia’s software layer, the foundation on which nearly every frontier AI model in the world is built. Every one, that is, except DeepSeek V4, which as of today can run entirely on Huawei Ascend chips via Huawei’s CANN framework. China now has its own domestic AI stack, top to bottom.
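
To give a sense of what “no CUDA dependency” means in practice: Huawei ships torch_npu, a PyTorch plugin that sits on top of CANN much the way PyTorch’s CUDA backend sits on top of CUDA. Here’s a minimal sketch of backend-agnostic device selection, assuming torch_npu is installed — illustrative only, not DeepSeek’s actual code:

```python
import torch

try:
    import torch_npu  # Huawei's PyTorch adapter; registers the "npu" backend over CANN
    HAS_NPU = torch.npu.is_available()
except ImportError:
    HAS_NPU = False

def pick_device() -> torch.device:
    """Prefer Ascend NPUs (CANN), then CUDA, then CPU."""
    if HAS_NPU:
        return torch.device("npu:0")
    if torch.cuda.is_available():
        return torch.device("cuda:0")
    return torch.device("cpu")

device = pick_device()
x = torch.randn(4, 4).to(device)  # identical tensor code runs on whichever backend was picked
print(device, (x @ x.T).sum().item())
```

The point being that the model code itself stays backend-neutral; the CUDA lock-in lives in the layer this swaps out.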

  • Che's Motorcycle@lemmygrad.ml
    link
    fedilink
    arrow-up
    15
    ·
    6 days ago

    This is it. This is the clearest sign yet that China can and will continue to pull ahead of the West, even in its most coveted and falsely glorified spaces.

    This was the second-to-last thing the West was hanging on to. All that’s left is chip speed, and even there only China is doing the fundamental research to advance things beyond the current paradigm. This is socialism outproducing capitalism, in our lifetimes!

    • ☆ Yσɠƚԋσʂ ☆@lemmygrad.ml (OP) · 6 days ago

      Exactly, AI and chips were the last areas of technology where the West could credibly claim to be ahead, and that lead is collapsing in real time. This was the final refuge of the Western technological supremacy narrative.

      • Comprehensive49@lemmygrad.ml · 6 days ago

        Unfortunately, DeepSeek V4 is not yet a full frontier model able to beat OpenAI’s or Anthropic’s latest, so there is still some ground to make up.

          • Comprehensive49@lemmygrad.ml · 5 days ago

            Fair, I’ve heard similar annoyances about GPT 5.5. I hope DeepSeek reinforcement-trains V4 a bit harder and, in two to three months, comes out with an earth-shattering V4.1.

            • ☆ Yσɠƚԋσʂ ☆@lemmygrad.ml (OP) · 5 days ago

              The quality of the training is really what it comes down to. I saw one approach, actually kind of obvious in retrospect, where a model was trained on the actual git history instead of repository snapshots, which taught it how code evolves over time. I think tricks like this will add a lot of polish and make for really competent coding models.
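
              Something like this, as a rough sketch of the data construction (the repo path and file below are placeholders, and this is just my guess at the general shape, not anyone’s actual pipeline): mine (before, after, commit message) triples straight out of git and train on the transitions.

              ```python
              import subprocess

              def git(repo: str, *args: str) -> str:
                  """Run a git command in `repo` and return its stdout as text."""
                  return subprocess.run(
                      ["git", "-C", repo, *args],
                      capture_output=True, text=True, check=True,
                  ).stdout

              def evolution_examples(repo: str, path: str, limit: int = 50):
                  """Yield (before, after, commit message) for commits that touched `path`."""
                  hashes = git(repo, "log", f"--max-count={limit}",
                               "--format=%H", "--", path).split()
                  for h in hashes:
                      msg = git(repo, "show", "-s", "--format=%s", h).strip()
                      after = git(repo, "show", f"{h}:{path}")
                      try:
                          before = git(repo, "show", f"{h}~1:{path}")
                      except subprocess.CalledProcessError:
                          continue  # file didn't exist before this commit (or root commit)
                      yield before, after, msg

              # Each triple becomes a training example: given the old code and the
              # commit message, predict the new code.
              for before, after, msg in evolution_examples(".", "README.md", limit=5):
                  prompt = f"Commit message: {msg}\n--- before ---\n{before}"
                  target = after
              ```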