orrao.com
ARC-AGI-2: Leading AI models fail new test of artificial general intelligence

March 25, 2025
The ARC-AGI-2 benchmark is designed to be a difficult test for AI models

Just_super / Getty Images

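The most sophisticated AI models currently available have scored poorly on a new benchmark designed to measure their progress towards artificial general intelligence (AGI), and brute-force computing power will not be enough to improve the results, because the evaluation now takes into account the cost of running the model.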

There are competing definitions of AGI, but it generally refers to an AI that can perform any cognitive task that humans can do. To measure this, the ARC Prize Foundation previously launched a test of reasoning skills called ARC-AGI-1. Last December, OpenAI announced that its o3 model had scored highly on the test, leading some to ask whether the company was close to achieving AGI.

But now a new test, ARC-AGI-2, has raised the bar. It is hard enough that no current AI system on the market can achieve more than a single-digit score out of 100 on the test, even though every question has been solved by at least two humans in fewer than two attempts.

In a blog post announcing ARC-AGI-2, ARC president Greg Kamradt said the new benchmark was necessary to test different skills from the previous iteration. “To beat it, you need to show both a high level of adaptability and high efficiency,” he wrote.

ARC-AGI-2 differs from other benchmark tests in that it focuses on an AI model's ability to complete simple tasks. Current models are good at “deep learning”, which ARC-AGI-1 measured, but they are not as good at the seemingly simpler tasks in ARC-AGI-2, which require more challenging thinking and interaction. OpenAI’s o3-low model, for example, scores 75.7% on ARC-AGI-1, but just 4% on ARC-AGI-2.

The benchmark also adds a new dimension to measuring an AI's capabilities: efficiency, measured by the cost required to complete a task. For example, while ARC paid its human testers $17 per task, it estimates that o3-low costs OpenAI $200 per task in fees for the same work.
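To make the cost comparison concrete, here is a minimal Python sketch that uses only the two per-task figures quoted above ($17 for a human tester, $200 for o3-low). The `Solver` class and `cost_ratio` function are hypothetical names introduced for illustration; they are not part of any ARC tooling, and this is not how ARC itself computes efficiency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solver:
    """Hypothetical record: a solver and its estimated cost per task."""
    name: str
    cost_per_task_usd: float


def cost_ratio(a: Solver, b: Solver) -> float:
    """Return how many times more expensive per task `a` is than `b`."""
    if b.cost_per_task_usd <= 0:
        raise ValueError("baseline cost must be positive")
    return a.cost_per_task_usd / b.cost_per_task_usd


# Figures as reported in the article; everything else is illustrative.
human = Solver("human tester", 17.0)
o3_low = Solver("o3-low", 200.0)

ratio = cost_ratio(o3_low, human)
print(f"{o3_low.name} costs about {ratio:.1f}x as much per task as a {human.name}")
```

On these figures, o3-low is roughly twelve times as expensive per task as a human tester, which is why raw compute alone does not help a model under the new scoring.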

“I think it’s a big step towards a more realistic evaluation of AI models,” says Joseph Imperial at the University of Bath, UK. “This is a sign that we are moving away from one-dimensional evaluation tests based solely on performance and are also considering less compute power.”

Any model capable of passing ARC-AGI-2 would need to be not just highly competent, but also smaller and more lightweight, says Imperial, with the efficiency of the model being a key component of the new benchmark. This could help address concerns that AI models are becoming more energy-intensive, sometimes to the point of wastefulness, to achieve ever-greater results.

However, not everyone is convinced that the new measure is beneficial. “The whole framing of this as testing intelligence is not the right framing,” says Catherine Flick at Staffordshire University, UK. Instead, she says, these benchmarks only assess an AI's ability to complete a single task or set of tasks well, which is then extrapolated to mean general capabilities across a range of tasks.

Performing well on these benchmarks should not be seen as a major step towards AGI, says Flick.

And exactly what happens if or when ARC-AGI-2 is passed is another question: will we need yet another benchmark? “If they were to develop an ARC-AGI-3, I’m guessing they would add another axis denoting the minimum number of humans (expert or not) it would take to solve the tasks, in addition to performance and efficiency,” says Imperial. In other words, the debate over AGI is unlikely to be settled soon.


© 2026 All Rights Reserved - Orrao.com
