@ S!ayer
2025-05-11 21:43:01

Not from LLM-based models, no. RLVR models are the new method: reinforcement learning with verifiable rewards, combined with zero-data/zero-knowledge learning. In other words, they have AI teach AI and become self-aware: reinforced self-play reasoning with zero data. So basically it starts as an SI, iterates, teaches itself based on its own inputs/outputs, and iterates again, all without any human input (data or prompt instructions). This new method lets verifiable rewards be the tool that defines the AI reasoning model.
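To make the loop concrete, here's a rough toy sketch of the idea: the system proposes its own tasks, answers them, and a verifier (here just running the code) hands out the reward. Everything in it is an assumption for illustration: the names `PROGRAMS`, `propose_task`, `verify`, and `train` are made up, and the "model" is a simple score table updated bandit-style, not a neural network trained with policy gradients like a real RLVR setup would use.

```python
import random

# Hypothetical toy task space. The "proposer" picks a program and an input;
# executing the program gives a ground-truth output with no human labels.
PROGRAMS = {
    "double": lambda x: 2 * x,
    "square": lambda x: x * x,
    "negate": lambda x: -x,
}

def propose_task(rng):
    """Self-generated task: a program name plus an input value."""
    name = rng.choice(sorted(PROGRAMS))
    x = rng.randint(3, 9)  # range chosen so no two programs agree on an input
    return name, x

def verify(name, x, answer):
    """Verifiable reward: 1.0 if the answer matches executing the program."""
    return 1.0 if answer == PROGRAMS[name](x) else 0.0

def train(steps=2000, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    # Solver "policy": for each task name, a score per candidate strategy.
    scores = {t: {s: 0.0 for s in PROGRAMS} for t in PROGRAMS}
    for _ in range(steps):
        name, x = propose_task(rng)
        if rng.random() < epsilon:
            # Explore: try a random strategy.
            strategy = rng.choice(sorted(PROGRAMS))
        else:
            # Exploit: use the best-scoring strategy so far.
            strategy = max(sorted(scores[name]), key=scores[name].get)
        answer = PROGRAMS[strategy](x)
        # The only learning signal is the verifier's reward.
        scores[name][strategy] += verify(name, x, answer)
    return scores

def greedy_policy(scores):
    """Best-scoring strategy per task after training."""
    return {t: max(sorted(s), key=s.get) for t, s in scores.items()}

if __name__ == "__main__":
    print(greedy_policy(train()))
```

The point of the sketch is just the shape of the loop: no dataset and no prompts go in, and the only thing steering the updates is whether the verifier accepts the answer.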
