OpenAI, the maker of ChatGPT, is working on a novel approach to its artificial intelligence models in a project known as Strawberry, according to a person familiar with the matter and internal documentation reviewed by Reuters.

The project, details of which have not been previously reported, comes as the Microsoft-backed startup races to show that the models it offers are capable of advanced reasoning.

Teams inside OpenAI are working on Strawberry, according to a copy of an internal OpenAI document reviewed by Reuters in May. Reuters could not ascertain the precise date of the document, which details OpenAI's plan for how it intends to use Strawberry in research. The source described the plan as a work in progress, and the timeline for Strawberry's public availability could not be determined.

How Strawberry works is a tightly held secret even within OpenAI, the source said.

The document describes a project that uses Strawberry models with the aim of enabling the company’s AI to not just generate answers to queries but to plan ahead enough to navigate the internet autonomously and reliably to perform what OpenAI terms “deep research,” according to the source.

This is something that has eluded AI models to date, according to interviews with more than a dozen AI researchers.

Asked about Strawberry and the details reported in this story, an OpenAI company spokesperson said in a statement: “We want our AI models to see and understand the world more like we do. Continuous research into new AI capabilities is a common practice in the industry, with a shared belief that these systems will improve in reasoning over time.”

The spokesperson did not directly address questions about Strawberry.

The Strawberry project was formerly known as Q*, which Reuters reported last year was already seen inside the company as a breakthrough.

Two sources described viewing earlier this year what OpenAI staffers told them were Q* demos, capable of answering tricky science and math questions out of reach of today's commercially available models.

On Tuesday at an internal all-hands meeting, OpenAI showed a demo of a research project that it claimed had new human-like reasoning skills, according to Bloomberg. An OpenAI spokesperson confirmed the meeting but declined to give details of the contents. Reuters could not determine if the project demonstrated was Strawberry.

OpenAI hopes the innovation will improve its AI models’ reasoning capabilities dramatically, the person familiar with it said, adding that Strawberry involves a specialized way of processing an AI model after it has been pre-trained on very large datasets.

Researchers Reuters interviewed say that reasoning is key to AI achieving human or super-human-level intelligence.

While large language models can already summarize dense texts and compose elegant prose far more quickly than any human, the technology often falls short on common sense problems whose solutions seem intuitive to people, such as recognizing logical fallacies and playing tic-tac-toe. When the model encounters these kinds of problems, it often provides incorrect information.

AI researchers interviewed by Reuters generally agree that reasoning, in the context of AI, involves the formation of a model that enables AI to plan ahead, reflect on how the physical world functions, and work through challenging multi-step problems reliably.

Improving reasoning in AI models is seen as the key to unlocking the ability for the models to do everything from making major scientific discoveries to planning and building new software applications.

OpenAI CEO Sam Altman said earlier this year that in AI “the most important areas of progress will be around reasoning ability.”

Other companies like Google, Meta and Microsoft are likewise experimenting with different techniques to improve reasoning in AI models, as are most academic labs that perform AI research. Researchers differ, however, on whether large language models (LLMs) are capable of incorporating ideas and long-term planning into how they do prediction. For instance, one of the pioneers of modern AI, Yann LeCun, who works at Meta, has frequently said that LLMs are not capable of humanlike reasoning.

AI CHALLENGES

Strawberry aims to address these challenges and is a key part of OpenAI's strategy, the source said. While the document reviewed by Reuters describes what Strawberry is meant to achieve, it does not specify how.

In recent months, OpenAI has quietly signaled to developers and other outside parties that it is close to releasing technology with significantly more advanced reasoning capabilities, according to four people who attended the company's presentations and requested anonymity because they were not authorized to discuss the matter.

Strawberry involves a specialized way of "post-training" OpenAI's generative AI models, or adapting the base models to hone their performance in specific ways after they have already been trained on reams of generalized data.

The post-training phase involves methods such as "fine-tuning," a process used widely today in which human evaluators give the model feedback on its responses and show it examples of good and bad answers.
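As a toy illustration of the feedback idea behind fine-tuning (not OpenAI's actual pipeline, whose details are not public; all names below are invented), the sketch below nudges a model's preference scores for candidate answers toward human "good"/"bad" labels over repeated rounds:

```python
# Toy sketch of fine-tuning from human feedback: each round, a
# response's score is shifted toward its human label (1.0 = good,
# 0.0 = bad). Real systems update neural network weights instead.

def finetune_scores(scores, feedback, lr=0.5):
    """Shift each response's score toward its human label."""
    return {
        resp: score + lr * (feedback[resp] - score)
        for resp, score in scores.items()
    }

scores = {"helpful answer": 0.4, "nonsense answer": 0.6}
labels = {"helpful answer": 1.0, "nonsense answer": 0.0}

for _ in range(10):  # repeated rounds of human feedback
    scores = finetune_scores(scores, labels)

# After enough rounds, scores approach the human judgments:
# "helpful answer" nears 1.0, "nonsense answer" nears 0.0.
```

The repeated small corrections are the point: no single round fixes the model, but iterated feedback steadily pulls its behavior toward what evaluators reward.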

Strawberry has similarities to a method developed at Stanford in 2022 called "Self-Taught Reasoner," or "STaR," one of the sources with knowledge of the matter said. STaR enables AI models to "bootstrap" themselves into higher intelligence levels by iteratively creating their own training data, and in theory could be used to get language models to transcend human-level intelligence, one of its creators, Stanford professor Noah Goodman, told Reuters.

“I think that is both exciting and terrifying…if things keep going in that direction we have some serious things to think about as humans,” Goodman said. Goodman is not affiliated with OpenAI and is not familiar with Strawberry.

Among the capabilities OpenAI is aiming Strawberry at is performing long-horizon tasks (LHT), the document says. LHT refers to complex tasks that require a model to plan ahead and carry out a series of actions over an extended period of time, the first source explained.

To do so, OpenAI is creating, training and evaluating the models on what it calls a "deep-research" dataset, according to the internal documentation. What is in that dataset, and what an extended period would mean, remain undisclosed.

OpenAI specifically intends its models to use these capabilities to conduct research by browsing the web autonomously with the assistance of a "CUA," or computer-using agent, that can take actions based on its findings. OpenAI also plans to test the models' ability to do work typically performed by software and machine learning engineers.
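The "act on what it finds" behavior attributed to the CUA is, structurally, an observe-plan-act loop. The sketch below is purely hypothetical, since the system's actual interface has not been disclosed; every function name and the toy "web" are invented for illustration:

```python
# Hypothetical observe-plan-act agent loop: the agent repeatedly
# chooses its next action based on everything it has seen so far,
# stopping when its goal is reached or a step budget runs out.

def run_agent(plan_step, take_action, goal_reached, max_steps=10):
    observations = []
    for _ in range(max_steps):
        if goal_reached(observations):
            break
        action = plan_step(observations)          # decide next step
        observations.append(take_action(action))  # act, record result
    return observations

# Toy "web": follow links from page to page until an answer appears.
pages = {"start": "see page2", "page2": "see page3", "page3": "answer: 42"}

history = run_agent(
    plan_step=lambda obs: obs[-1].split()[-1] if obs else "start",
    take_action=lambda page: pages[page],
    goal_reached=lambda obs: bool(obs) and obs[-1].startswith("answer"),
)
print(history[-1])  # answer: 42
```

The hard part, and what reportedly distinguishes "deep research" from today's models, is the planning step: deciding which action to take next over many steps without drifting off course.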