March 17, 2024

Q* - Clues to the Puzzle?

Unraveling the Mystery of OpenAI's Breakthrough: Clues and Speculations

🔍 Table of Contents:

1. Introduction

2. OpenAI's Denial That Sam Altman's Ouster Was Precipitated by the Safety Letter

3. Debunking Claims of Sam Altman Calling New Creation a Creature

4. Clues from AI Scientist Team's Work on Optimizing Existing AI Models

5. Let's Verify Step by Step: The Crux of the Video

6. Test Time Computation: Boosting Language Models' Problem-Solving Abilities

7. Q*: A New and Improved Let's Verify Step by Step?

8. Self-Improvement Beyond Math: The Possibility of General Self-Improvement

9. Reinforcement Learning: A Creative Solution to Problems

10. Positive News about Music Generation

Introduction

OpenAI's recent breakthrough in AI has been the subject of much speculation and concern. The company itself has been tight-lipped about the details, leading to a flurry of theories and rumors. In this article, we will attempt to unravel the mystery by examining the clues and speculations surrounding the breakthrough.

OpenAI's Denial That Sam Altman's Ouster Was Precipitated by the Safety Letter

One of the first things to note is that OpenAI has denied that Sam Altman's ouster was precipitated by the safety letter to the board. While the safety letter may have been a factor, there was certainly a lot else going on.

Debunking Claims of Sam Altman Calling New Creation a Creature

A clip has been circulating in which, people claim, Sam Altman called the new creation a creature, not just a tool. However, if you watch to the end, he is very much saying he's glad that people now think of it as part of the toolbox.

Clues from AI Scientist Team's Work on Optimizing Existing AI Models

Multiple sources have confirmed the existence of an AI scientist team formed by combining the earlier Code Gen and Math Gen teams at OpenAI. Their work on optimizing existing AI models to improve their reasoning was flagged in the letter to the board. While there is very little public information about either the Code Gen or Math Gen teams, a tweet from Sam Altman in September 2021 links to a critical paper called "Let's Verify Step by Step," which is the crux of the video.

Let's Verify Step by Step: The Crux of the Video

"Let's Verify Step by Step" is a paper that proposes training a verifier, or reward model, to judge the reasoning process step by step instead of judging only the final outcome. The base LLM generates hundreds of candidate solutions, and a separate verifier picks out the ones most likely to be correct. The authors noticed that investing more computing power in generating more solutions, and then taking a majority vote among the top verifier-ranked solutions, had a massive effect on performance.
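To make the process-versus-outcome idea concrete, here is a minimal sketch of how a verifier might rank candidate solutions. It assumes the verifier has already assigned each reasoning step a probability of being correct and that a solution's score is the product of those step probabilities. The hard-coded numbers, the `process_score` and `pick_best` helpers, and the sample answers are all hypothetical stand-ins for a trained reward model and real LLM output.

```python
from math import prod

def process_score(step_probs):
    """Score a solution as the product of its per-step correctness probabilities.

    One weak step drags the whole score down, which is the point of
    judging the process rather than only the final answer.
    """
    return prod(step_probs)

def pick_best(candidates):
    """Return the candidate whose reasoning steps score highest."""
    return max(candidates, key=lambda c: process_score(c["step_probs"]))

# Hypothetical verifier output for three sampled solutions (assumed values).
candidates = [
    {"answer": "42", "step_probs": [0.90, 0.80, 0.95]},
    {"answer": "41", "step_probs": [0.99, 0.20, 0.90]},  # one doubtful step
    {"answer": "42", "step_probs": [0.70, 0.90, 0.90]},
]

best = pick_best(candidates)
print(best["answer"], round(process_score(best["step_probs"]), 3))
```

The second candidate has the single most confident step, but its doubtful middle step sinks its overall score, so a process-based ranking passes it over.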

Test Time Computation: Boosting Language Models' Problem-Solving Abilities

Test-time computation means spending extra computing power at inference time, rather than during training, to generate many candidate solutions and take a majority vote among them. The method was described as a kind of search, and it generalized somewhat out of distribution: beyond mathematics, it also boosted performance in chemistry, physics, and other subjects.
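The voting step can be sketched in a few lines. This is an illustrative sketch only: the `vote_among_top_k` helper and the sample (answer, verifier score) pairs are assumptions, not anything taken from OpenAI's code. In a real system the pairs would come from sampling the model many times and scoring each sample with the verifier.

```python
from collections import Counter

def vote_among_top_k(samples, k):
    """Keep the k highest-scoring samples, then return the most common answer.

    samples: list of (answer, verifier_score) pairs.
    """
    top = sorted(samples, key=lambda s: s[1], reverse=True)[:k]
    counts = Counter(answer for answer, _ in top)
    return counts.most_common(1)[0][0]

# Hypothetical samples: more test-time compute means a longer list.
samples = [
    ("12", 0.91), ("15", 0.95), ("12", 0.88), ("12", 0.60),
    ("15", 0.40), ("7", 0.85), ("9", 0.10),
]

print(vote_among_top_k(samples, k=1))  # trust only the single top-ranked sample
print(vote_among_top_k(samples, k=5))  # vote among the five top-ranked samples
```

With `k=1` the result depends entirely on one verifier score. With a larger `k`, an answer that several well-ranked samples agree on can outvote a single high-scoring outlier, which is one way extra samples can translate into better accuracy.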

Q*: A New and Improved Let's Verify Step by Step?

The Information cites two top researchers at OpenAI building a model called Q* on top of Sutskever's method. While there is no clear explanation of what Q* stands for, it is likely a new and improved version of "Let's Verify Step by Step" drawing upon enhanced inference time compute to push the graph toward 100%.

Self-Improvement Beyond Math: The Possibility of General Self-Improvement

If reinforcement learning combined with any of these techniques helps models get good at generalization, it could lead to general self-improvement beyond math. That prospect cuts both ways, because reinforcement learning can come up with genuinely creative solutions to problems, and that creativity could be risky.

Reinforcement Learning: A Creative Solution to Problems

Reinforcement learning is a technique in which an agent learns to make better decisions by acting in its environment and observing the rewards it receives. Because the agent is optimizing for reward rather than imitating examples, it can arrive at solutions nobody anticipated, which is where the risk lies. If successful, however, it could be valuable for safety research as well.
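As a toy illustration of learning by exploration, the sketch below runs an epsilon-greedy agent on a three-armed bandit. It is a generic textbook-style example, not a description of Q* or of anything OpenAI has built. The payout probabilities, step count, and the `run_bandit` helper are all assumptions chosen for illustration.

```python
import random

def run_bandit(payout_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-looking arm, sometimes explore."""
    rng = random.Random(seed)
    n = len(payout_probs)
    estimates = [0.0] * n  # running average reward per arm
    pulls = [0] * n        # how often each arm was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)  # explore: pick a random arm
        else:
            arm = max(range(n), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        pulls[arm] += 1
        # Incremental mean update of the chosen arm's estimate.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return estimates, pulls

# Hypothetical environment: the agent is never told these probabilities.
estimates, pulls = run_bandit([0.1, 0.5, 0.9])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_arm, [round(e, 2) for e in estimates], pulls)
```

The agent receives only rewards, never the true payout probabilities, yet its estimates converge on the best arm through trial and error. The same reward-driven search is what lets larger systems find strategies their designers did not spell out.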

Positive News about Music Generation

Google DeepMind's new Lyria model can convert your hums into an orchestra, for example turning a sung melody into a horn section. This is a positive development in the field of music generation.

🎉 Highlights:

- OpenAI's breakthrough likely involves a combination of Let's Verify Step by Step and test time computation.

- Q* is likely a new and improved version of Let's Verify Step by Step drawing upon enhanced inference time compute to push the graph toward 100%.

- Reinforcement learning is a creative solution to problems but could be risky.

- Google DeepMind's Lyria model can convert hums into an orchestra.

❓ FAQ:

Q: What is Let's Verify Step by Step?

A: Let's Verify Step by Step is a paper that proposes a method of using a verifier or reward model to focus on the process instead of the outcome.

Q: What is test time computation?

A: Test time computation is a method of investing computing power during test time to generate potential solutions and take majority votes amongst them.

Q: What is Q*?

A: Q* is likely a new and improved version of Let's Verify Step by Step drawing upon enhanced inference time compute to push the graph toward 100%.

Q: What is reinforcement learning?

A: Reinforcement learning is a technique where an agent learns to make optimal decisions by exploring its environment.

Q: What is Google DeepMind's Lyria model?

A: Google DeepMind's Lyria model can convert hums into an orchestra, for example turning a sung melody into a horn section.
