3 Comments
Jan 21, 2023 · edited Jan 21, 2023 · Liked by shako

This type of thinking is a step in the right direction, but both the ML research and the AI alignment communities already have some quite detailed predictions about the actual mechanisms of language modeling -> intelligence:

https://www.alignmentforum.org/posts/vJFdjigzmcXMhNTsx/simulators

https://twitter.com/jacobandreas/status/1600118539263741952 (paper submitted in May 2022)

Essentially, in the language of that Twitter thread, the LM learns a Guy Typology in order to next-token-predict text on the internet, including Guy Who Does Maths, Guy Who Knows Facts About The Roman Empire, and a zillion others. If we had an infinite amount of text produced by any single Guy, the low-loss limit of training would model what that Guy says perfectly.
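
To spell out that limit claim (my notation, not from either link): write $q$ for a single Guy's true next-token distribution and $p_\theta$ for the LM. The expected per-token loss decomposes as

$$
\mathbb{E}_{x_t \sim q(\cdot \mid x_{<t})}\big[-\log p_\theta(x_t \mid x_{<t})\big]
= H\big(q(\cdot \mid x_{<t})\big) + D_{\mathrm{KL}}\big(q(\cdot \mid x_{<t}) \,\big\|\, p_\theta(\cdot \mid x_{<t})\big),
$$

which is minimized exactly when $p_\theta = q$ on every context; the irreducible floor is the Guy's own entropy. So with unlimited text from one Guy and enough capacity, driving the loss down means reproducing that Guy's next-token distribution.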

And each Guy is only a view of the world model; by learning views from many different viewpoints, the LM develops a world model consistent with all of them. Sort of like the intuition behind neural radiance fields, where many 2D views jointly pin down a single 3D scene. A toy sketch of that mixture view is below.
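
Here is a toy sketch of how those views combine (the "Guys" and numbers are made up for illustration, not taken from either link): the corpus-level next-token distribution the LM is trained to match is a mixture over Guys, and a prompt reweights that mixture toward the Guys most likely to have written it.

```python
import numpy as np

# Hypothetical toy example: two "Guys" with different next-token
# distributions over a tiny vocabulary, mixed by their prevalence
# in the training corpus.
vocab = ["rome", "fell", "476", "therefore"]

guys = {
    "maths_guy": np.array([0.05, 0.05, 0.10, 0.80]),
    "rome_guy":  np.array([0.40, 0.30, 0.25, 0.05]),
}
prior = {"maths_guy": 0.5, "rome_guy": 0.5}

def corpus_next_token_dist(context_likelihood):
    """Marginal next-token distribution the LM is trained to match:
    a posterior-weighted mixture over Guys, where the posterior comes
    from how plausible each Guy makes the observed prompt."""
    # Bayes rule over Guys given the prompt
    post = {g: prior[g] * context_likelihood[g] for g in guys}
    z = sum(post.values())
    post = {g: p / z for g, p in post.items()}
    # Mixture of each Guy's next-token distribution
    mixture = sum(post[g] * guys[g] for g in guys)
    return mixture, post

# A prompt that looks much more like the Rome Guy wrote it:
mixture, posterior = corpus_next_token_dist(
    {"maths_guy": 0.1, "rome_guy": 0.9}
)
print(posterior)                          # Rome Guy dominates the mixture
print(dict(zip(vocab, mixture.round(3))))
```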

There are of course architectural constraints: more complex functionality requires more parameters, and a priori the optimization landscape might be difficult. But adaptive gradient methods on transformers seem to work well so far.


Great article, thank you.
