Thoughts

Thoughts after NeurIPS, 18 Dec 2023

This is my first NeurIPS - last year I had three papers and no visa.

The general atmosphere is contagious: the echo chamber of AI is indeed comfortable and exciting. When so many people get together to have low-latency conversations that plastically shape their views, an electrifying air ensues. I have been enlightened countless times during a mere week. However, it’s worthwhile to remember that it is still an echo chamber and we should ground ourselves firmly in reality >95% of the time to not be intoxicated and completely detached.

Two main themes dominate the conference for me: AI for maths and language models. Here I want to discuss them, first individually and then how they relate to each other.

AI for maths:

More people than ever are getting interested in AI for maths. The tutorial on machine learning for theorem proving captivated a room of ~300 people; For the first time I heard “theorem proving” being mentioned in the halls of a machine learning conference venue by random people.

Behind the popularity are two seemingly contradictory views: one is that AI is finally good enough that we have a glimpse of what they can do for maths; the other is that AI is still extremely stupid in reasoning that mathematical abilities are something that we desperately need to improve on. They are not really contradictory. Moravec’s paradox rules over mathematics too: while being very helpful in certain tasks such as embedding the entire formal library and enabling searching over them with hybrid natural and formal languages, even the most advanced LLMs by themselves can’t reliably verify abstract reasoning steps (even the very simple ones).

So, what are the promising directions?

There are many and my opinions here are likely to be quite limited and should not deter you from carrying out research in completely different directions.

I leave with more questions than when I came.

Language models

More thoughts on language models are a bit more fragmented:

Both together

If it wasn’t clear 2 years ago, it should be very clear now that language models are an essential component of building general AI mathematicians/assistants and mathematics is a great area to test language models in. It’s time we push hard in both directions.