The Origin Story of GPT

  • 2025-10-15

The origin story of GPT. Excerpted and summarized from Chapter 10, "Size Matters," of 패권: 누가 AI 전쟁의 승자가 될 것인가.

Chapter 10. Size Matters

OpenAI wanted to differentiate itself from DeepMind:

OpenAI was about to differentiate itself from DeepMind in another way. Ilya Sutskever, OpenAI’s star scientist, couldn’t stop thinking about what the transformer could do with language. Google was using it to better understand text. What if OpenAI used it to generate text? … Although OpenAI is best known today for ChatGPT, back in 2017 it was still throwing spaghetti on the wall to see what would stick ….

Decoder-only transformer:

The transformer model that powered Google Translate used something called an encoder and a decoder to process words. The encoder would process the sentence coming in, perhaps in English, and the decoder would generate the output, like a sentence in French. …

Alec Radford and Sutskever figured out that they could get rid of [the encoder part], and instead just have one, the decoder, listen to you and talk back by itself. Early testing showed that the idea worked in practice, which meant they could build a more streamlined language model that was quicker and easier to troubleshoot and grow. And making it “decoder only” would also be a game-changer. By combining a model’s ability to “understand” and speak into one fluid process, it could ultimately generate more humanlike text.
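To make the "decoder only" idea concrete, here is a minimal sketch (mine, not the book's or OpenAI's) of the causal self-attention such a model is built on: each token can attend only to itself and to earlier tokens, so a single stack both reads the prompt and generates the continuation, with no separate encoder.

```python
# Minimal sketch of causal (masked) self-attention, the core of a decoder-only
# transformer. Illustrative only; not OpenAI's code.
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_head) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])            # (seq_len, seq_len)
    # Causal mask: position i may not look at positions j > i.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over allowed positions
    return weights @ v                                  # (seq_len, d_head)

# Toy usage: 5 tokens, 8-dim embeddings, one 4-dim attention head.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(causal_self_attention(x, w_q, w_k, w_v).shape)    # (5, 4)
```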

Scaling:

The next step was to vastly increase the amount of data, computing power, and capacity of their language model. Sutskever had long believed that “success was guaranteed” when you scaled everything up in AI, especially with language models.
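For a rough sense of what "scaling everything up" meant in practice, the sketch below combines a standard back-of-the-envelope estimate of transformer size, roughly 12 × n_layer × d_model² weights (attention plus MLP, ignoring embeddings), with the published GPT-1/2/3 layer counts and widths. It is an approximation for intuition only, not anything from the book.

```python
# Rough parameter-count estimate for decoder-only transformers.
# The 12 * n_layer * d_model**2 rule of thumb ignores embedding tables.
def approx_params(n_layer: int, d_model: int) -> float:
    return 12 * n_layer * d_model ** 2

for name, n_layer, d_model in [
    ("GPT-1 (2018)", 12, 768),
    ("GPT-2 (2019)", 48, 1600),
    ("GPT-3 (2020)", 96, 12288),
]:
    print(f"{name}: ~{approx_params(n_layer, d_model) / 1e9:.1f}B parameters")
```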

GPT:

He and his colleagues started working on a new language model they called a “generatively pre-trained transformer” or GPT for short.
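In practice, "generatively pre-trained" means the model is trained on raw, unlabeled text to predict each next token from the tokens before it. The sketch below is a toy illustration of that objective (my own, with a stand-in "model"), not OpenAI's training code.

```python
# Toy next-token prediction loss, the objective behind generative pre-training.
# The "model" here is just a random logit table (last token -> next-token scores);
# a real GPT puts a decoder-only transformer in its place.
import numpy as np

def next_token_loss(logits_fn, token_ids):
    """Average cross-entropy of predicting token_ids[t+1] from token_ids[:t+1]."""
    losses = []
    for t in range(len(token_ids) - 1):
        logits = logits_fn(token_ids[: t + 1])              # scores over the vocabulary
        log_probs = logits - np.log(np.exp(logits).sum())   # log-softmax
        losses.append(-log_probs[token_ids[t + 1]])         # NLL of the true next token
    return float(np.mean(losses))

# Toy usage: vocabulary of 50 tokens, "model" conditions only on the last token.
rng = np.random.default_rng(0)
table = rng.normal(size=(50, 50))
logits_fn = lambda prefix: table[prefix[-1]]
print(next_token_loss(logits_fn, rng.integers(0, 50, size=20)))
```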

Does it really understand?

Down the line, as [GPT] became more sophisticated, people at OpenAI and beyond would question whether these new large language models were actually understanding language and not just inferring it. …

This issue would come to divide people, even in the AI community. Did the increasing sophistication of these models mean they were becoming sentient? The answer was most likely no, but even experienced engineers and researchers would soon believe otherwise, with some falling under an emotional spell from AI-generated text that seemed loaded with empathy and personality.

OpenAI's fear of Google:

Sutskever meanwhile was keeping an eye on what was happening over at Google, where engineers were finally putting the transformer to use. Besides improving the company’s glitchy translation service, Google had used it to build a new program called BERT that would help improve its search engine. …

The search giant had every building block it needed to build AGI, too, if it wanted, from the transformer to the TPU, a powerful proprietary chip for training AI models.

“I would wake up, nervous that Google was just gonna go release something much better than us,” remembers one former OpenAI manager. By exploiting Google inventions like the transformer, it felt like OpenAI was playing with the search giant’s toys and somehow getting away with it. “We were like, there’s no way we’re gonna win.”