From a Q&A recap:

GPT-4 is coming, but currently the focus is on coding (i.e. Codex) and that’s also where the available compute is going. GPT-4 will be a text model (as opposed to multi-modal). It will not be much bigger than GPT-3, but it will use way more compute. People will be surprised how much better you can make models without making them bigger.

The progress will come from OpenAI working on all aspects of GPT (data, algos, fine-tuning, etc.). GPT-4 will likely be able to work with longer context and (possibly) be trained with a different loss function – OpenAI has “line of sight” for this. (Uncertain about “loss” function, I think he said something like “different value function”, so this might be a misinterpretation.)

GPT-5 might be able to pass the Turing test, but doing so is probably not worth the effort.

A 100-trillion-parameter model won’t be GPT-4 and is far off. They are getting much more performance out of smaller models. Maybe they will never need such a big model.

It is not yet obvious how to train a model to do things on the internet and to think for a long time about very difficult problems. A lot of current work is on how to make the model accurate and truthful.


