In February 2020, I read this article describing the initial vision of OpenAI and how it had evolved. I find the whole concept of OpenAI quite exciting, especially its vision focused on ethical AI and on making sure things do not go badly wrong once AGI (Artificial General Intelligence) arrives. AGI arriving would mean machines able to fully simulate a human mind, a prospect that has raised plenty of questions and worries to date. Hence, having an organisation on a mission to make sure “that the technology is developed safely and its benefits distributed evenly to the world” sounds good to me.
But the article was quite bold about innovation — or the lack of it — within OpenAI:
“Most of [OpenAI’s] breakthroughs have been the product of sinking dramatically greater computational resources into technical innovations developed in other labs.”
In other words, the article was saying that a lot of money had been invested in OpenAI, and all they had done was pay for servers on which to run someone else’s ‘innovation’, so they could get results they would then call a breakthrough.
Yet, a few months later, OpenAI released GPT-3, considered one of the biggest breakthroughs in AI in decades.
How is this possible?
What is wrong with ‘recycling algorithms’ in science?
The judgement of OpenAI’s innovation came from the fact that they were reusing what had already been built in other labs, even though they were, by definition, working on pushing the AI game forward. The question is: what exactly is wrong with reusing breakthroughs and building on them?
Over the years, there have been libraries I was deeply grateful to use, because they made me 90%+ more productive. Sometimes you have to do things from scratch, but there is a reward in reusing, and learning from, someone else’s work and experience: someone who has probably done it from scratch and had to figure out all those little things that always get in your way, the ones you have to keep overcoming before you can move on to the creative side of whatever you are trying to accomplish.
Can ‘recycling’ science then be considered innovation?
And would it have been wiser if OpenAI had ignored everything that had already happened in the field and started from scratch?
Any type of recycling is a positive, and even more so if it enables you to move things forward, if it enables others to make progress, and if people can learn from your ‘recycling’ to innovate further. Because you are not really recycling: you are building on top of someone else’s work.
Sometimes, innovation requires deleting everything you know and starting from scratch. Other times, innovation means building on what is already out there. Which is, I’d assume, why OpenAI was (and I’m hoping still is) “sinking dramatically greater computational resources into technical innovations developed in other labs”.
In conversation, Dr Xingyi Song, a speech and NLP scientist, mentioned that “recycling an algorithm” in AI is similar to a physicist running an experiment to test an unproven theory, and that models such as GPT-3 have shown us what deep learning, and transformer models in particular, are capable of.
Prior to GPT-3, the BERT model released by Google was a game-changer for anyone working in Natural Language Processing. It is estimated that a single training run of BERT costs Google up to $1.6m (see this paper for more details). Practically speaking, you would need a few runs, so training BERT from scratch for your own application might cost around $10m. Who else has that much money to spare? And who else has access to all the data Google has at hand?
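This is exactly where reuse pays off. Instead of repeating that $1.6m training run, you can download Google’s released weights and fine-tune them for your own task. A minimal sketch, assuming the Hugging Face transformers library and PyTorch (my choice of tooling here, not something the article prescribes; the model name and two-label setup are purely illustrative):

```python
# Minimal sketch: reuse BERT rather than training it from scratch.
# Assumes `pip install transformers torch`; model name and labels are illustrative.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # downloads Google's pre-trained weights: the expensive part, reused for free
    num_labels=2,         # e.g. a binary sentiment classifier on top
)

# The pre-trained weights already encode general language knowledge;
# all that is left is a comparatively cheap fine-tuning pass on your own data.
inputs = tokenizer("Recycling science is still science.", return_tensors="pt")
logits = model(**inputs).logits  # untuned scores for the two illustrative labels
```

Fine-tuning a setup like this typically takes hours on a single GPU, not months on a cluster: the expensive part has already been done, once, by someone else.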
What innovation isn’t?
Reinventing the wheel. Building things from scratch if you don’t have to.
To teach me how to properly serve in tennis, my coach said he first had to make me forget what I knew. This left me with two ‘left’ hands for a few weeks, but at least my serve then started to look like I was not playing badminton. Starting from a tabula rasa is often useful, especially when you realise that you won’t get very far unless you change something, no matter how many resources you pour in. But if my serve had been decent to start with, perhaps investing more time and practice would have got me a breakthrough?
When you have to start from scratch
Starting from scratch is what Musk had to do with Tesla. He made a car that isn’t really a car in the traditional sense. It’s a supercomputer. It feels alive. Other car manufacturers tried to make electric cars by building on the knowledge they already had from building millions of petrol and diesel cars. But that knowledge wasn’t helping them: they had spent too long in the car industry, and it was difficult to start over on a project that needed a full reset. An electric, self-driving car called for a new, innovative approach that would completely disrupt the car industry. It needed to be done from scratch. Musk made it so cool that even someone who doesn’t know anything about cars (me) can get really excited about Tesla, as it is obviously the future. As a technologist, I can see that there is no turning back. Musk has already made a dent; now it’s only a matter of how quickly it can be made affordable to everyone.
Even with Musk and Tesla, I would argue that, while Tesla can be seen as a completely “from scratch” disruptive project, Musk still had to invest all those hours, days and sleepless nights to build the foundations for his learning (see this article to appreciate why this worked). To start from scratch, you really need steady foundations.
Algorithms are only part of the puzzle
We haven’t really seen a massive change in the algorithms since the beginning of AI in the 1950s; what has changed is the amount of data suddenly available to us, and the computational resources we can use to process it. Most AI breakthroughs have been the result of building on knowledge we already had and figuring out how to apply it at scale (take, for example, IBM’s breakthrough with Watson outperforming humans to win Jeopardy! in 2011).
If an innovation has already been generating positive results, there is nothing wrong with trying to push it even further when you are after a breakthrough.
Plus, no “technical innovations developed in other labs” (assuming this refers to open-source libraries such as TensorFlow) are capable of solving a problem by themselves. You still have to put significant resources into the engineering and application of the actual algorithm; into cleaning and preparing your dataset; into deciding what your dataset even is, which part of it to use, and how; and, not least, into defining what you want your algorithm to do. All of these are massive decisions that have to be made before you can do anything with “technical innovations developed in other labs”.
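To make that concrete, here is a deliberately trivial TensorFlow sketch (the layer sizes, input shape and task are all hypothetical placeholders, not anything OpenAI or Google built). The library supplies the building blocks, but notice how every substantive decision is still left to you:

```python
import tensorflow as tf

# The open-sourced library gives you the blocks; it does not decide what you build.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),  # why 20 features? your decision
    tf.keras.layers.Dense(1, activation="sigmoid"),                   # what is being predicted? your decision
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# model.fit(x_train, y_train, epochs=5)  # which data? cleaned how? labelled by whom? your decision
```

Every commented question in that snippet is one of those massive decisions that no imported library makes for you.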
Speaking of TensorFlow, I remember Google’s announcement that it was going open source, and people asking them: “Why did you do this?” Their answer: “Because we can’t do it alone.” While arguably many other reasons could apply, it is 100% correct that even Google, with resources that dwarf almost anyone else’s, can’t do it alone.
Given the success of the recently released GPT-3, could we then argue that all we need is to “[sink] dramatically greater computational resources into technical innovations developed in other labs”?
If only it were that easy.