
ProofWiki problem 24. Let \(\xi\) be an irrational number. Show that there are infinitely many relatively prime integers \(p, q \in \mathbb{N}_{>0}\) such that: $$\left| {\xi - \frac{p}{q}}\right| < \frac {1}{\sqrt{5} q^2}$$

Finally, Problem 24 is another difficult problem. Its solution on the ProofWiki website requires a number of lemmas and some subtle reasoning. Solving a problem of this kind would require some planning capability, or at the very least the ability to backtrack and experiment with various ideas. This is something that GPT-4 doesn't appear to possess beyond what can be "computed" within the model itself.

GPT-4 does make the completely reasonable first step of approaching this problem using a continued fraction expansion of the irrational number \(\xi\). Many approximation problems of this kind do indeed proceed this way. Continued fractions yield a sequence of convergents \(p_n/q_n\) that converge to the irrational number \(\xi\).
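The recurrence behind these convergents can be sketched in a few lines. This is a minimal illustration, not anything taken from the GPT-4 transcript; the function name `convergents` and the choice of \(\sqrt{2} = [1; 2, 2, 2, \dots]\) as the example are ours.

```python
from fractions import Fraction


def convergents(partial_quotients):
    """Yield the convergents p_n/q_n of the continued fraction [a0; a1, a2, ...].

    Uses the standard recurrence
        p_n = a_n * p_{n-1} + p_{n-2},   q_n = a_n * q_{n-1} + q_{n-2}
    with the seeds p_{-2} = 0, p_{-1} = 1, q_{-2} = 1, q_{-1} = 0.
    """
    p_prev2, p_prev1 = 0, 1
    q_prev2, q_prev1 = 1, 0
    for a in partial_quotients:
        p = a * p_prev1 + p_prev2
        q = a * q_prev1 + q_prev2
        yield Fraction(p, q)
        p_prev2, p_prev1 = p_prev1, p
        q_prev2, q_prev1 = q_prev1, q


# Illustrative input (our choice): sqrt(2) has partial quotients 1, 2, 2, 2, ...
first_five = list(convergents([1, 2, 2, 2, 2]))
print(", ".join(str(c) for c in first_five))  # 1, 3/2, 7/5, 17/12, 41/29
```

The fractions alternate below and above \(\sqrt{2} \approx 1.41421\) while closing in on it, which is the behaviour the approximation argument relies on.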

After picking a reasonable theorem from the theory of continued fractions and applying it, GPT-4 arrives at the following expression: $$q_n q_{n+1} > \sqrt{5} q_n^2$$ At this point it is clear that GPT-4 does not know how to proceed but knows what it should end up with, so it makes the unsubstantiated claim that this inequality is satisfied when \(q_{n+1} > \sqrt{5} q_n\).
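For context, here is one way an expression of this shape can arise. This is a sketch under an assumption: the theorem GPT-4 actually picked is not reproduced here, so the standard convergent bound below is only our guess at it.

$$\left| \xi - \frac{p_n}{q_n} \right| < \frac{1}{q_n q_{n+1}}$$

Under that bound, the target inequality for \(p_n/q_n\) follows whenever

$$\frac{1}{q_n q_{n+1}} < \frac{1}{\sqrt{5} q_n^2} \iff q_n q_{n+1} > \sqrt{5} q_n^2 \iff q_{n+1} > \sqrt{5} q_n$$

where the last step divides by \(q_n > 0\). The whole argument would therefore rest on showing that \(q_{n+1} > \sqrt{5} q_n\) holds for infinitely many \(n\).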

Nothing established at this point in the problem justifies the condition \(q_{n+1} > \sqrt{5} q_n\), and for the chosen approach to work it would have to be proved. Instead of doing so, GPT-4 just asserts that it is true without attempting to prove it.
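A quick numerical check suggests the condition cannot simply be assumed. The sketch below uses the golden ratio \(\varphi = [1; 1, 1, 1, \dots]\), whose convergent denominators are the Fibonacci numbers. The choice of \(\varphi\), the helper name `golden_convergents`, and the cut-off of 20 convergents are all ours, not part of GPT-4's output or the ProofWiki solution. For this number the growth condition \(q_{n+1} > \sqrt{5} q_n\) fails for every convergent tested, yet the target inequality still holds for every second one.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # ample precision for the small denominators used here

SQRT5 = Decimal(5).sqrt()
PHI = (1 + SQRT5) / 2  # golden ratio, continued fraction [1; 1, 1, 1, ...]


def golden_convergents(count):
    """Return the first `count` convergents (p_n, q_n) of the golden ratio."""
    result = []
    p_prev, q_prev = 1, 0
    p, q = 1, 1
    for _ in range(count):
        result.append((p, q))
        # All partial quotients equal 1, so the recurrence is Fibonacci-like.
        p, p_prev = p + p_prev, p
        q, q_prev = q + q_prev, q
    return result


convs = golden_convergents(20)

# The asserted condition q_{n+1} > sqrt(5) * q_n, tested exactly on integers
# as q_{n+1}^2 > 5 * q_n^2.
growth_holds = [
    q_next * q_next > 5 * q * q
    for (_, q), (_, q_next) in zip(convs, convs[1:])
]

# The target inequality |xi - p/q| < 1 / (sqrt(5) * q^2), tested in high precision.
target_holds = [
    abs(PHI - Decimal(p) / Decimal(q)) < 1 / (SQRT5 * q * q)
    for p, q in convs
]

print("growth condition ever holds:", any(growth_holds))
print("convergents meeting the target:", sum(target_holds), "of", len(convs))
```

Here the ratio \(q_{n+1}/q_n\) tends to \(\varphi \approx 1.618\), which stays below \(\sqrt{5} \approx 2.236\), so any correct proof has to get the constant \(\sqrt{5}\) from somewhere other than this growth condition.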

When asked directly how to prove this statement, GPT-4 clearly has no idea how to do so and makes the completely bogus claim that a sequence with linear growth will eventually outgrow a sequence with exponential growth. It seems to be common for GPT-4 to hallucinate details when things aren't working out or when it doesn't know a reasonable answer.