Without the 'secret sauce', I don't see how OAI's open model will beat Qwen-3 by a meaningful margin, the kind that actually encourages adoption.
It's like Gemma-3: slightly better on paper, but nobody cares, because the ecosystem and community support are behind the Qwen/Llama models.
But as you said, they're not going to spill the 'secret sauce': new long-context architectures, quantized training, new training losses (like MTP, rough sketch below), sparse MoE, etc.
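For anyone unfamiliar with MTP: here's a minimal, hypothetical sketch of what a multi-token-prediction style auxiliary loss looks like, predicting the next k tokens from each position with extra heads. All names, shapes, and hyperparameters here are made up for illustration; this is not any lab's actual implementation.

```python
# Minimal sketch of an MTP-style auxiliary loss (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MTPHeads(nn.Module):
    """Predicts the next k tokens from each position with separate linear heads."""
    def __init__(self, d_model: int, vocab_size: int, k: int = 2):
        super().__init__()
        self.k = k
        self.heads = nn.ModuleList(
            nn.Linear(d_model, vocab_size) for _ in range(k)
        )

    def forward(self, hidden: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, d_model) trunk outputs; targets: (batch, seq) token ids
        loss = hidden.new_zeros(())
        for i, head in enumerate(self.heads, start=1):
            logits = head(hidden[:, :-i])   # predict the token at position t + i
            shifted = targets[:, i:]        # ground truth i steps ahead
            loss = loss + F.cross_entropy(
                logits.reshape(-1, logits.size(-1)), shifted.reshape(-1)
            )
        return loss / self.k

# Toy usage with random hidden states standing in for a real transformer trunk.
if __name__ == "__main__":
    torch.manual_seed(0)
    hidden = torch.randn(2, 16, 64)            # (batch, seq, d_model)
    targets = torch.randint(0, 1000, (2, 16))  # fake token ids
    mtp = MTPHeads(d_model=64, vocab_size=1000, k=2)
    print(mtp(hidden, targets))                # scalar auxiliary loss
```

In practice this would be added on top of the usual next-token loss; the extra heads are usually dropped at inference time.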
Yeah, I agree; that's partially why I'm excited. They tend to deliver.