
    Platform allows AI to learn from constant, nuanced human feedback rather than large datasets

    By Kavish, December 15, 2024


    Training AI through Human Interactions Instead of Datasets
    GUIDE training consists of two stages. During the human guidance stage, the human trainer observes the state and the action taken by the agent and provides real-time continuous feedback. The feedback values are grounded into per-step dense rewards and combined with the environment reward. Concurrently, a human feedback simulator is trained that takes in state-action pairs and regresses the feedback values. During the automated guidance stage, the trained simulator stands in for the human and provides feedback to keep improving the policy, reducing human effort and cognitive load. Credit: arXiv (2024). DOI: 10.48550/arxiv.2410.15181

    During your first driving class, the instructor probably sat next to you, offering immediate advice on every turn, stop and minor adjustment. If it was a parent, they might have even grabbed the wheel a few times and shouted “Brake!” Over time, those corrections and insights developed experience and intuition, turning you into an independent, capable driver.

    Although advancements in artificial intelligence (AI) have made self-driving cars a reality, the teaching methods used to train them remain a far cry from even the most nervous side-seat driver. Rather than nuance and real-time instruction, AI learns primarily through massive datasets and extensive simulations, regardless of the application.

    Now, researchers from Duke University and the Army Research Laboratory have developed a platform to help AI learn to perform complex tasks more like humans. Nicknamed GUIDE, the AI framework will be showcased at the upcoming Conference on Neural Information Processing Systems (NeurIPS 2024), taking place Dec. 9–15 in Vancouver, Canada. The work is also available on the arXiv preprint server.

    “It remains a challenge for AI to handle tasks that require fast decision making based on limited learning information,” explained Boyuan Chen, professor of mechanical engineering and materials science, electrical and computer engineering, and computer science at Duke, where he also directs the Duke General Robotics Lab.

    “Existing training methods are often constrained by their reliance on extensive pre-existing datasets while also struggling with the limited adaptability of traditional feedback approaches,” Chen said. “We aimed to bridge this gap by incorporating real-time continuous human feedback.”







    GUIDE works by letting humans observe an AI's actions in real time and provide ongoing, nuanced feedback. It's like how a skilled driving coach wouldn't just shout "left" or "right," but would instead offer detailed guidance that fosters incremental improvements and deeper understanding.

    In its debut study, GUIDE helps AI learn how best to play hide-and-seek. The game involves two beetle-shaped players, one red and one green. While both are controlled by computers, only the red player is working to advance its AI controller.

    The game takes place on a square playing field with a C-shaped barrier in the center. Most of the playing field remains black and unknown until the red seeker enters new areas to reveal what they contain.

    As the red AI player chases the other, a human trainer provides feedback on its searching strategy. While previous attempts at this sort of training strategy have only allowed for three human inputs—good, bad or neutral—GUIDE has humans hover a mouse cursor over a gradient scale to provide real-time feedback.
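The difference between a three-level signal and a continuous one can be illustrated with a minimal sketch. It assumes a vertical gradient scale whose top means the most positive feedback, and it combines the feedback with the environment reward as a weighted sum. The function names, the pixel-based scale, the `dead_zone` threshold and the `feedback_weight` parameter are all hypothetical choices for illustration, not details taken from the GUIDE paper.

```python
def cursor_to_feedback(cursor_y: float, scale_height: float) -> float:
    """Map a cursor position on the gradient scale to a value in [-1, 1].

    Assumption: y = 0 is the top of the scale (most positive feedback)
    and y = scale_height is the bottom (most negative feedback).
    """
    if scale_height <= 0:
        raise ValueError("scale_height must be positive")
    # Clamp so a cursor that drifts off the scale still yields a valid value.
    y = min(max(cursor_y, 0.0), scale_height)
    return 1.0 - 2.0 * y / scale_height


def ternary_feedback(value: float, dead_zone: float = 0.33) -> int:
    """Collapse a continuous value to good (+1), neutral (0) or bad (-1).

    This mimics the coarser three-input schemes for comparison.
    """
    if value > dead_zone:
        return 1
    if value < -dead_zone:
        return -1
    return 0


def shaped_reward(env_reward: float, feedback: float,
                  feedback_weight: float = 1.0) -> float:
    """Combine the environment reward with per-step human feedback.

    A weighted sum is assumed here; the weight is a hypothetical knob.
    """
    return env_reward + feedback_weight * feedback


if __name__ == "__main__":
    for y in (0, 100, 200, 400):
        value = cursor_to_feedback(y, scale_height=400)
        print(y, value, ternary_feedback(value), shaped_reward(0.0, value))
```

In this sketch, cursor positions that a three-level scheme would lump together as "good" still produce different reward values, which is the extra nuance a gradient scale offers.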

    The experiment involved 50 adult participants with no prior training or specialized knowledge, making it by far the largest study of its kind. The researchers found that just 10 minutes of human feedback led to a significant improvement in the AI's performance. GUIDE achieved up to a 30% increase in success rates compared to current state-of-the-art human-guided reinforcement learning methods.

    “This strong quantitative and qualitative evidence highlights the effectiveness of our approach,” said Lingyu Zhang, the lead author and a first-year Ph.D. student in Chen’s lab. “It shows how GUIDE can boost adaptability, helping AI to independently navigate and respond to complex, dynamic environments.”

    The researchers also demonstrated that human trainers are only really needed for a short period of time. As participants provided feedback, the team created a simulated human trainer AI based on their insights within particular scenarios at particular points in time. This allows the seeker AI to continually train long after a human has grown weary of helping it learn. Training an AI “coach” that isn’t as good as the AI it’s coaching may sound counterintuitive, but as Chen explains, it’s actually a very human thing to do.
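The idea of a simulated trainer can be sketched as a regression problem: given logged state-action pairs and the feedback a human gave for them, fit a model that predicts the feedback, then query that model once the human has stopped. The sketch below is only an illustration under strong assumptions. It swaps in a tiny linear model trained by stochastic gradient descent, not whatever model the researchers actually used, and the "logged" feedback is synthetic data invented for the example. All class and function names are hypothetical.

```python
import random


class FeedbackSimulator:
    """Linear stand-in for a learned human-feedback model.

    Predicts feedback as w . (state, action) + b and is trained by SGD
    on squared error. A real system would likely use a richer model.
    """

    def __init__(self, n_features: int, learning_rate: float = 0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features) -> float:
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias

    def update(self, features, human_feedback: float) -> float:
        """Take one gradient step toward the human's feedback value."""
        error = self.predict(features) - human_feedback
        for i, x in enumerate(features):
            self.weights[i] -= self.learning_rate * error * x
        self.bias -= self.learning_rate * error
        return error * error


def make_logged_feedback(n_samples: int, seed: int = 0):
    """Synthetic stand-in for feedback logged during the human stage."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        state, action = rng.uniform(-1, 1), rng.uniform(-1, 1)
        # Invented rule for what this pretend trainer likes.
        feedback = 0.5 * state - 0.3 * action + 0.1
        data.append(((state, action), feedback))
    return data


def train_simulator(data, epochs: int = 50) -> FeedbackSimulator:
    """Fit the simulator on logged (state-action, feedback) pairs."""
    simulator = FeedbackSimulator(n_features=2)
    for _ in range(epochs):
        for features, feedback in data:
            simulator.update(features, feedback)
    return simulator


# Human guidance stage: collect feedback and fit the simulator.
simulator = train_simulator(make_logged_feedback(200))

# Automated guidance stage: the simulator now answers in the human's place.
if __name__ == "__main__":
    print(simulator.predict((0.2, -0.4)))
```

The design point is that the simulator only has to judge an action, not produce one, which is why a model fit on a short session of human feedback can keep supplying a training signal afterward.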

    “While it’s very difficult for someone to master a certain task, it’s not that hard for someone to judge whether or not they’re getting better at it,” Chen said. “Lots of coaches can guide players to championships without having been a champion themselves.”

    Another fascinating direction for GUIDE lies in exploring the individual differences among human trainers. Cognitive tests given to all 50 participants revealed that certain abilities, such as spatial reasoning and rapid decision-making, significantly influenced how effectively a person could guide an AI. These results highlight intriguing possibilities such as enhancing these abilities through targeted training and discovering other factors that might contribute to successful AI guidance.

    These questions point to an exciting potential for developing more adaptive training frameworks that not only focus on teaching AI but also on augmenting human capabilities to form future human-AI teams. By addressing these questions, researchers hope to create a future where AI learns not only more effectively but also more intuitively, bridging the gap between human intuition and machine learning, and enabling AI to operate more autonomously in environments with limited information.

    “As AI technologies become more prevalent, it’s crucial to design systems that are intuitive and accessible for everyday users,” said Chen. “GUIDE paves the way for smarter, more responsive AI capable of functioning autonomously in dynamic and unpredictable environments.”

    The team envisions future research that incorporates diverse communication signals, including language, facial expressions and hand gestures, to create a more comprehensive and intuitive framework for AI to learn from human interactions. Their work is part of the lab's mission to build next-level intelligent systems that team up with humans to tackle tasks that neither AI nor humans could solve alone.

    More information:
    Lingyu Zhang et al, GUIDE: Real-Time Human-Shaped Agents, arXiv (2024). DOI: 10.48550/arxiv.2410.15181

    Journal information:
    arXiv


    Provided by
    Duke University


    Citation:
    Platform allows AI to learn from constant, nuanced human feedback rather than large datasets (2024, December 3)
    retrieved 15 December 2024
    from https://techxplore.com/news/2024-12-platform-ai-constant-nuanced-human.html





