Site has extremely detailed stats by day/week for every model. Programming is by far the largest consumer of tokens, and in fact entire token growth in 2025 was only from programming. Other categories very flat. It is also a category where you would pay for better performance.

IMO, its relevant to this sub in that one of the top models, minimax, fits in under 256gb, but also that the trends are for cost effectiveness rather than “the absolute best”. There is a tangent insight as to whether US datacenter frenzy is needed.

kimi k2.5 being free on openclaw is a big reason for its total dominance. In week of Feb 2, minimax was only other top model to increase token usage. Opus 4.6 release seems to be extremely flat in reception.

Agentic trend tends to make LLM models disposable, since better ones are released every week, and the agents/platforms that can switch on the fly while keeping context, is something you can invest in improving while not being obsolete next month.

  • SuspciousCarrot78@lemmy.world
    link
    fedilink
    English
    arrow-up
    2
    ·
    edit-2
    2 days ago

    Woof - the axes on that chart LOL. Suffice it to say, they’re all pretty dang close. Interesting. Maybe the easter bunny can bring me something with >8GB VRAM so I can actually run em locally. I’m guessing Kimi-2 eats about what…500GB+ for 128K context?

    • pkjqpg1h@lemmy.zip
      link
      fedilink
      English
      arrow-up
      2
      ·
      2 days ago

      The real reason is LLMs are still using the same architecture and there is no breakthrough at the end of the day their intelligence will become so close to each other, when this happens they will have to decrease the prices to compete with open-weight models and even with these prices they don’t generate revenue so instead of just scaling they will have to focus on optimization and innovation