☆ Yσɠƚԋσʂ ☆@lemmygrad.ml to Technology@lemmygrad.ml · English · 24 days ago

Granite 4.1: IBM's 8B Model Is Competing With Models Four Times Its Size - Firethering

firethering.com

10 comments
  • cross-posted to:
  • technology@hexbear.net
  • hackernews@lemmy.bestiver.se
IBM just released Granite 4.1, a family of open source language models built specifically for enterprise use. Three sizes, Apache 2.0 licensed and trained on 15 trillion tokens with a level of pipeline obsession that's worth understanding. But there's one result in the benchmarks I keep coming back to. The 8B model. Dense architecture, no MoE tricks, no extended reasoning chains. It matches or beats Granite 4.0-H-Small across basically every benchmark they ran. That older model has 32B parameters with 9B active. This one has 8 billion. Full stop. That result is either very impressive or it means the old model was underbuilt. Probably both. Here's how they built it, what the numbers actually say, and whether any of it matters for your use case.
  • Che's Motorcycle@lemmygrad.ml
    5 upvotes · 24 days ago

    I might try this out next week. Tired of burning my monthly token allowance in Cursor in a couple of weeks. :D

    • ☆ Yσɠƚԋσʂ ☆@lemmygrad.ml (OP)
      5 upvotes · 24 days ago

      If you have the memory, I can highly recommend Qwen3.6-35B-A3B-Q8. It’s hands down the best local model I’ve tried. Only about 3B params are active per token, so it can run with 16 GB if the rest of the weights are offloaded to system RAM, or you can drop to a lower quant too.

      • Che's Motorcycle@lemmygrad.ml
        3 upvotes · 23 days ago

        I think I tried qwen3.6 but the 8B version, and that tanked my 16GB. But I’ll give the smaller one a shot!

    • CriticalResist8@lemmygrad.ml
      4 upvotes · 24 days ago

      DeepSeek V4 Pro! You top up your credit as you go, and they’re having a sale until May 31st, but even without the sale 1M output tokens is “only” 3.48. Flash is only 0.28 per 1M output tokens.

      • Che's Motorcycle@lemmygrad.ml
        2 upvotes · 23 days ago

        Not sure if I could swing DeepSeek at my job tho. Surprisingly, Cursor still comes with Kimi2 as a model option, so there’s that.

    • Che's Motorcycle@lemmygrad.ml
      2 upvotes · 23 days ago

      Yep, it works on my machine. 😎

      I’ll compare it with the 3B qwen3.6 next week
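
To put the per-token prices quoted above in perspective, here is a minimal Python sketch of the arithmetic. It assumes the two figures from the comment (3.48 and 0.28 per 1M output tokens, currency not stated in the thread) and ignores input-token pricing and any sale discount. The dictionary keys `pro` and `flash` and the function name `output_cost` are labels made up for this illustration, not API identifiers.

```python
# Prices per 1M output tokens as quoted in the thread.
# The currency is not stated there, so none is assumed here.
PRICE_PER_MILLION_OUTPUT = {
    "pro": 3.48,    # hypothetical label for the "pro" tier
    "flash": 0.28,  # hypothetical label for the "flash" tier
}


def output_cost(model: str, output_tokens: int) -> float:
    """Cost of generating `output_tokens` tokens at the quoted per-1M rate."""
    return PRICE_PER_MILLION_OUTPUT[model] * output_tokens / 1_000_000


if __name__ == "__main__":
    # Example workload: 250k output tokens (an arbitrary figure for illustration).
    for name in PRICE_PER_MILLION_OUTPUT:
        print(f"{name}: {output_cost(name, 250_000):.2f} for 250k output tokens")
```

Real bills would also include input tokens, which the comment does not price, so treat this as a lower bound on spend.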

  • PeeOnYou [he/him]@lemmygrad.ml
    3 upvotes · 24 days ago

    I saw a comparison of the 8B model vs. the dense 30B model (IIRC) and it was almost the same… the 30B was slightly better on most tests, but only barely.

    • ☆ Yσɠƚԋσʂ ☆@lemmygrad.mlOP
      4 upvotes · 24 days ago

      It’s honestly incredible to see because 8b is getting to the point where it will run well on a lot of consumer hardware. If we can get current frontier performance at that size, then you really would be able to solve most tasks locally.

      • CriticalResist8@lemmygrad.ml
        5 upvotes · 24 days ago

        The 4-bit quantized GGUF for Granite 4.1 is under 5 GB, so it will probably run on any modern machine, even one that isn’t built with lots of VRAM… 6 gigs is what I had on my old 1080 GPU.

        https://huggingface.co/unsloth/granite-4.1-8b-GGUF/tree/main

        • ☆ Yσɠƚԋσʂ ☆@lemmygrad.mlOP
          4 upvotes · 24 days ago

          🎉
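
The file sizes discussed in the thread follow from simple arithmetic: weight size is roughly total parameters times bits per weight, divided by 8. The Python sketch below is a rough estimate under stated assumptions. It uses 4.5 and 8.5 bits per weight, which match the Q4_0 and Q8_0 block layouts (32 weights plus one 16-bit scale per block); other 4-bit variants, such as the K-quants, come out somewhat different. It counts weights only, not the KV cache or runtime overhead. The function name `weight_size_gb` is made up for this illustration.

```python
def weight_size_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in gigabytes (10^9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # An 8B dense model at ~4.5 bits per weight (Q4_0-style layout, assumed).
    print(f"8B @ 4.5 bpw:  {weight_size_gb(8e9, 4.5):.1f} GB")

    # A 35B-total mixture-of-experts model at ~8.5 bits per weight (Q8_0-style).
    # The total parameter count sets the file size; the active count per token
    # affects speed, not how much has to be stored.
    print(f"35B @ 8.5 bpw: {weight_size_gb(35e9, 8.5):.1f} GB")
```

Under these assumptions the 8B estimate lands at about 4.5 GB, consistent with the "sub 5GB" figure in the comment, while a 35B-total model at 8-bit needs well over 16 GB for its weights unless part of it is kept in system RAM or a lower quant is used.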
