Meta Open-Sourced Llama 4. It Kept Muse Spark.

By Maxim Starkweather

On April 8, Meta made two releases. Llama 4 Scout and Llama 4 Maverick landed on llama.com and Hugging Face for anyone to download, with Meta’s blog celebrating that “openness drives innovation and is good for developers, good for Meta, and good for the world.” The same day, Meta’s new Superintelligence Labs division published a separate announcement: Muse Spark, the division’s first model, described as “the first step” toward personal superintelligence. Muse Spark scored 58 percent on Humanity’s Last Exam using a multi-agent contemplating mode, and it achieved that result using, by Meta’s own account, “over an order of magnitude less compute” than Maverick. It is available at meta.ai and via private API preview to select users. There are no weights to download.

The two releases landed the same day in separate blog posts, and the press covered them the same way: separately. But one open release and one closed release from the same company, announced within hours of each other, are not a contradiction. They are a map. Meta has drawn the line between what it gives away and what it monetizes, and for the first time in its AI history, that line runs through the frontier.

The Ecosystem Meta Built With Open Weights

Meta has been the most consequential open-source actor in frontier AI since the original LLaMA model shipped in February 2023. That model wasn’t the strongest at the time, but the fact that it could be downloaded, modified, and run locally without an API key created a developer ecosystem that proprietary labs couldn’t replicate through any other means. Researchers fine-tuned it. Practitioners deployed it on hospital servers that couldn’t send patient data to external APIs. Startup founders built on it without committing to OpenAI’s or Anthropic’s billing structures. The Llama family became the substrate for more open-source AI applications than any other model lineage — a fact that matters more to Meta’s competitive position in the developer community than any benchmark could.

Llama 4 continues that line. Scout offers a ten-million-token context window and outperforms Gemma 3 and Gemini 2.0 Flash-Lite on multimodal benchmarks. Maverick matches GPT-4o and Gemini 2.0 Flash, with an LMArena Elo score of 1,417, while running on seventeen billion active parameters. Behemoth, at 288 billion active parameters, is still in training. All of it is open. Downloadable. No API key required. Meta’s blog post on the Llama 4 release reads like a mission statement for open AI, because it is one.

[Image: The efficiency threshold: same infrastructure, different access terms]

The Llama line exists because openness has a specific business logic for Meta that closed labs don’t share. Open-source AI keeps developers building within Meta’s ecosystem. It generates goodwill that compensates for Meta’s complicated consumer privacy record. It gives regulators in Brussels and Washington a reason to treat Meta differently from the labs that release nothing. And it provides a talent signal: researchers who want to publish and share work gravitate toward organizations that publish and share work. Meta’s open-source strategy is not altruism. It is a sophisticated competitive position, and it has been effective enough that the Llama model family is now the baseline against which practitioners measure every other open-weight release.

Why Muse Spark Stays Closed

Muse Spark is the product of Meta Superintelligence Labs — a new division, separate branding, separate announcement. The naming matters. MSL is not positioned as a research organization releasing tools for the community. It is positioned as a consumer-product organization with a declared destination: personal superintelligence. “The first step toward personal superintelligence” is not researcher language. It is roadmap language. It means: this is a product, and the product does not include weights.

The architecture tells part of the story. By Meta’s account, Muse Spark reaches its 58 percent score on Humanity’s Last Exam while using over an order of magnitude less compute than Llama 4 Maverick requires for equivalent capability. Meta is not running two models at the same capability tier. It is running two models built on two different architectural approaches. Llama 4 scales with parameters. Muse Spark does something more efficient. The efficient architecture is the one that stays closed. That is not a coincidence: the efficient frontier is where revenue lives, because efficiency determines what you can charge per API call and still make money.
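The link between efficiency and pricing can be made concrete with a little arithmetic. What follows is a minimal sketch, not Meta's cost model: it assumes the compute saving carries over to per-call serving cost, which Meta's announcement does not state, and every dollar figure and the `break_even_price` helper are hypothetical, chosen only to show the ratio.

```python
def break_even_price(compute_cost_per_call: float, target_margin: float) -> float:
    """Lowest price per call that still leaves target_margin as a fraction of price."""
    if not 0 <= target_margin < 1:
        raise ValueError("target_margin must be in [0, 1)")
    return compute_cost_per_call / (1 - target_margin)


# Hypothetical inputs: none of these numbers come from Meta.
baseline_cost = 0.010                 # dollars of compute per call, baseline model
efficient_cost = baseline_cost / 10   # "an order of magnitude less compute"
target_margin = 0.5                   # keep half of every dollar charged

baseline_price = break_even_price(baseline_cost, target_margin)
efficient_price = break_even_price(efficient_cost, target_margin)

# Margin the efficient model earns if it simply charges the baseline price.
margin_at_baseline_price = 1 - efficient_cost / baseline_price

print(f"baseline price per call:  ${baseline_price:.4f}")
print(f"efficient price per call: ${efficient_price:.4f}")
print(f"efficient margin at baseline price: {margin_at_baseline_price:.0%}")
```

Under these assumptions the efficient model can either undercut the baseline tenfold at the same margin or match its price and keep most of the revenue. Either option is an asset a company would hesitate to give away.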

The economic logic here is not difficult to follow. Open weights make sense when a model’s primary value is as infrastructure — when the developer community building on top of it generates more value for Meta than the model could capture directly. That calculation holds for a model that is capable but not frontier, or that has been superseded by a newer release. It inverts the moment a model is capable enough to anchor a consumer product that competes directly with ChatGPT and Google Gemini. At that point, releasing the weights is not generosity. It is writing a check to your competitors.

This is exactly the calculation OpenAI made with GPT-4 and has never departed from. Anthropic’s Claude weights have never been available for download — not Sonnet, not Opus, not Haiku. The rationale is identical regardless of whether it is articulated: frontier models are the revenue base. You do not give away the revenue base. Now Meta, which articulated a different philosophy for three years, has arrived at the same place. The open-weights champion kept the open weights. It just stopped issuing them at the frontier.

Martin Alderson, writing on May 6, put the economic stakes clearly: open-weights models function like generic pharmaceuticals, constraining the price that frontier labs can charge for equivalent capability. When open alternatives exist that can do approximately what the closed model does, the closed model’s pricing is tethered to that floor. As the open alternatives fall behind the frontier, the floor drops away. Anthropic’s Claude Max plan starts at $100 per month. The gap between what a flat subscriber can accomplish and what the same work costs at API list rates is maintained only as long as open alternatives remain close enough to the frontier to serve as a viable substitute. Muse Spark has broken that parity, and no open model from a US lab is positioned to restore it.
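The subscription gap is easy to estimate for any given workload. The sketch below is illustrative only: the $100 flat price is the figure cited above, while the monthly token volumes and the per-million-token rates are hypothetical placeholders, not any lab's published pricing.

```python
def api_equivalent_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Dollar cost of a workload billed at per-million-token list rates."""
    return (
        input_tokens / 1_000_000 * input_price_per_mtok
        + output_tokens / 1_000_000 * output_price_per_mtok
    )


FLAT_PLAN_PRICE = 100.0  # dollars per month, the plan price cited in the article

# Hypothetical heavy-user month and hypothetical list rates.
monthly_input_tokens = 200_000_000
monthly_output_tokens = 20_000_000
input_rate = 3.0    # dollars per million input tokens (assumed)
output_rate = 15.0  # dollars per million output tokens (assumed)

api_cost = api_equivalent_cost(
    monthly_input_tokens, monthly_output_tokens, input_rate, output_rate
)
gap = api_cost - FLAT_PLAN_PRICE

print(f"API-equivalent cost: ${api_cost:,.2f}")
print(f"Gap over flat plan:  ${gap:,.2f}")
```

With these placeholder inputs the same month of work costs several times the flat fee at list rates. That difference is the gap the paragraph above describes.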

DeepSeek Is the Exception That Explains the Rule

There is a substantive counter to the argument that frontier AI is closing. It is called DeepSeek.

DeepSeek-V3, released in December 2024, was genuinely frontier-competitive at launch, and it shipped openly under an MIT license for code and a permissive model license for weights. A 671-billion-parameter mixture-of-experts model trained on 14.8 trillion tokens, it scored 39.2 on AIME 2024, the same benchmark where GPT-4o scored 9.3 and Claude 3.5 Sonnet scored 16.0. On Arena-Hard it outscored both. On LiveCodeBench it outscored both. The model is downloadable, commercial use is supported, and this was frontier capability released publicly while it was still frontier. Not six months after the next release made it safe to share. While it was still the best available.

[Image: Two approaches to frontier AI, operating in parallel across a geographic divide]

The pattern has continued. Sean Goedecke, writing in mid-May 2026, described DeepSeek-V4-Flash as “what many engineers have been waiting for: a local model good enough to compete with at least the low end of frontier model agentic coding.” It’s runnable locally — antirez, the creator of Redis, built a stripped-down llama.cpp variant specifically to run it. Engineers who have been waiting for an open model capable enough to experiment with steering vectors, a technique that requires local access to model internals, now have one.

The question is why DeepSeek does this when Meta, Anthropic, and OpenAI do not. The answer is not idealism. DeepSeek is a Hangzhou-based lab operating without the US public market pressures that have defined the strategic calculus of every American lab at the frontier. There is no IPO timeline, no quarterly revenue expectations from institutional shareholders, no story to tell Wall Street about monetization in twelve months. Within that capital structure, publicly demonstrating technical parity with American labs, at the frontier and in a format that other engineers can download and verify, is worth more than the revenue that closing the weights might protect. DeepSeek’s openness is a strategic statement aimed at a different audience than Meta’s, and it works precisely because its incentives point in a different direction.

The open frontier of AI is not dying. It is becoming a Chinese phenomenon rather than an American one. The Llama family will continue to ship open weights, and those weights will remain capable by any reasonable standard. But “capable” and “frontier” are no longer the same thing in the open-source ecosystem, and the lab that most recently closed that gap did so from Hangzhou, not Menlo Park. Whether that framing concerns you depends on what you think the open-source AI ecosystem is ultimately for.

The two Meta announcements on April 8 tell a story that neither tells alone. Llama 4 is for developers. Muse Spark is for users. The weights go with the developers. The revenue model goes with the users. This bifurcation will not stay static: Behemoth is still in training and will presumably ship on the open track, because Llama 4 is the open track. But MSL’s next model will not ship on the open track, because MSL is the product track. Both organizations sit inside the same company. One is giving something away; the other is building something to sell. The line between them is not philosophical. It is the capability threshold at which a model stops being infrastructure and starts being the product. Meta just crossed it, and drew the line in the same place every other lab drew it: at the frontier.

AI-generated editorial illustration · TemperatureZero · May 18, 2026
