Pell Equations with Algorithms

While doing my math/CS undergraduate degree at the University of Alberta, I presented at the Seminar in Number Theory for Alberta Students on solving Pell equations using continued fractions. I also presented an O(n) method for finding the fundamental solution to the Pell equation \(x^2-dy^2=1\) during my talk. In this blog post, I’m going to present the essential parts of my talk on how this O(n) method works, show the results of it for up to \(d=99\), and also analyze the negative Pell equation \(x^2-dy^2=-1\).

As a note, the first several definitions are all rewritten from David Burton’s Elementary Number Theory, which greatly inspired this post. Burton’s text is one of the most well-written and comprehensive number theory books I know, and I strongly recommend it.

Definition: A finite continued fraction is of the form $$a_0+\frac{1}{a_1+\frac{1}{a_2+\frac{1}{a_3+\frac{1}{\ddots+\frac{1}{a_{n-1}+\frac{1}{a_n}}}}}},$$ where the \(a_i\)’s are positive real numbers, except possibly \(a_0\) which may be \(0\). The fraction is called simple if all the \(a_i\) are integers. We often denote such a fraction as \([a_0; a_1, \cdots, a_n]\)

For instance, $$[1; 2, 3, 4]=1+\frac{1}{2+\frac{1}{3+\frac{1}{4}}}=1+\frac{1}{2+\frac{1}{\frac{13}{4}}}=1+\frac{1}{2+\frac{4}{13}}=\cdots=\frac{43}{30}.$$

Definition: Given a continued fraction \([a_0; a_1, \cdots, a_n]\), we define the convergent \(C_k\) that is obtained by cutting off the fraction after \(a_k\) to be \(C_k=[a_0; a_1, \cdots, a_k], 1\le k\le n.\) By definition, \(C_0=a_0\).


Example: We can see that the continued fraction for \(\phi=\frac{1+\sqrt{5}}{2}\) is given by \(\phi=[1; 1, 1, \cdots]\). To see this, observe that \(\phi\) satisfies the equation \(\phi=1+\frac{1}{\phi}=1+\frac{1}{1+\frac{1}{\phi}}=\cdots\).


If we compute the \( k^{\mbox{th}}\) convergent of the continued fraction for \( \phi\), we get the Fibonacci numbers! Observe that \( C_0=\frac{1}{1}, C_1=\frac{2}{1}, C_2=1+\frac{1}{1+\frac{1}{1}}=\frac{3}{2}, C_3=1+\frac{2}{3}=\frac53\), and so forth. In general, we claim that \(C_k=\frac{F_{k+2}}{F_{k+1}}\), where \(F_1=F_2=1\) and \(F_n=F_{n-1}+F_{n-2}\) for \(n\ge 2\). To prove this, the base cases are given above. If it holds for \(k\ge 1\), then $$C_{k+1}=1+\frac{1}{C_k}=1+\frac{F_{k+1}}{F_{k+2}}=\frac{F_{k+2}+F_{k+1}}{F_{k+2}}=\frac{F_{k+3}}{F_{k+2}},$$ hence the pattern continues by induction.

This is related to the result \(\phi=\lim_{n\to \infty}\frac{F_{n+1}}{F_n}\).

Theorem: Given a continued fraction \([a_0; a_1, \cdots, a_n]\), we define the recurrence relationships given by \(p_0=a_0, q_0=1, p_1=a_1a_0+1, q_1=a_1\), and in general for \(2\le k\le n, p_k=a_kp_{k-1}+p_{k-2}, q_k=a_kq_{k-1}+q_{k-2}.\) Then the \(latex k^{\mbox{th}}\) convergent is given by \(C_k=\frac{p_k}{q_k}\).


Example: Consider \(\phi=[1; 1, 1, \cdots]\). Since \(a_i=1\) for every \(i\), we have \(p_0=1, q_0=1, p_1=2, q_1=1\), and in general \(p_k=p_{k-1}+p_{k-2}\) and \(q_k=q_{k-1}+q_{k-2}\). Thus \(p_k=F_{k+2}\) and \(q_k=F_{k+1}\) and we see \(C_k=\frac{p_k}{q_k}=\frac{F_{k+2}}{F_{k+1}}\) as before.


Proof: We see that \(C_0=a_0=\frac{p_0}{q_0}, C_1=a_0+\frac{1}{a_1}=\frac{a_1a_0+1}{a_1}=\frac{p_1}{q_1}\), and \(C_2=a_0+\frac{1}{a_1+\frac{1}{a_2}}=a_0+\frac{a_2}{a_1a_2+1}=\frac{a_0a_1a_2+a_0+a_2}{a_1a_2+1}=\frac{a_2(a_0a_1+1)+a_0}{a_1a_2+1}=\frac{p_2}{q_2}.\)
In general, assume the statement holds for every value \(2\le k\le m\), and we show it holds for \(k=m+1\). Observe that since \( p_{m-1}, q_{m-1}, p_{m-2}, q_{m-2}\) are independent of \(a_m\), we can write

Therefore \(p_{m+1}=a_{m+1}p_m+p_{m-1}\) and \(q_{m+1}=a_{m+1}q_m+q_{m-1}\) as desired.

Definition: In an infinite convergent sequence, if we have a block that repeats over and over, we then write \([a_0; a_1, \cdots, a_m, b_1, \cdots, b_n, b_1, \cdots, b_n, \cdots]=[a_0; a_1, \cdots, a_m, \overline{b_1, b_2, \cdots, b_n}]\).

Example: Convert \(\sqrt{3}\) to a continued fraction.

Solution: Let \sqrt{3}=1+\dfrac{1}{a_1+\frac{1}{a_2+\frac{1}{a_3+\cdots}}}. Subtracting \(1\) from both sides gives us \sqrt{3}-1=\dfrac{1}{a_1+\frac{1}{a_2+\frac{1}{a_3+\cdots}}}.

Taking the reciprocal of the equation gives us \dfrac{1}{\sqrt{3}-1}=\dfrac{\sqrt{3}+1}{2}=a_1+\frac{1}{a_2+\frac{1}{a_3+\cdots}}. We then see that a_1=1 since \sqrt{3}\approx 1.73. Now, substituting this into the equation and subtracting gives:

\dfrac{\sqrt{3}-1}{2}=\frac{1}{a_2+\frac{1}{a_3+\cdots}}\implies \dfrac{2}{\sqrt{3}-1}=\sqrt{3}+1=a_2+\dfrac{1}{a_3+\cdots}. Thus a_2=2 and we obtain:

\sqrt{3}-1=\dfrac{1}{a_3+\cdots}. But notice from this point on that it repeats from the original equation! Therefore, a_3=1, a_4=2, and in general \sqrt{3}=[1;\overline{1,2}].

Observe that the first convergent of \sqrt{3} is C_1 = \frac{p_1}{q_1}=\frac{2}{1}. Furthermore, (p_1, q_1)=(2,1) satisfy the equation x^2-3y^2=1. However, notice that the period length of \sqrt{3} is even and the negative Pell equation x^2-3y^2=-1 has no solution since -1 is not a quadratic residue mod 3. These results generalize as follows:

Theorem: If the period length, n, of the simple continued fraction for \sqrt{d} is even, then the fundamental (that is, the smallest) solution to the Pell equation x^2-dy^2=1 is given by the convergent (x,y)=(p_{n-1}, q_{n-1}). Furthermore, then the negative Pell equation x^2-dy^2=-1 has no solution.

Exercise: Compute the continued fraction for \sqrt{18}. Use this to compute the fundamental solution to the Pell equation x^2-18y^2=1 and to show that the negative Pell equation x^2-18y^2=-1 has no solutions.

Now, we’ll examine the case where the period length of the continued fraction is odd. In particular, continue the continued fraction for \sqrt{5}. We see that:

\sqrt{5}=2+\frac{1}{a_1+\frac{1}{a_2+\cdots}}\implies \sqrt{5}-2=\frac{1}{a_1+\frac{1}{a_2+\cdots}}\implies \frac{1}{\sqrt{5}-2}=a_1+\frac{1}{a_2+\cdots}.

Simplifying the left-hand side by conjugating the denominator, we get \sqrt{5}+2. Therefore, a_1=4 and we’re left with \sqrt{5}-2=\frac{1}{a_2+\cdots}. Notice this repeats, hence a_2=a_3=\cdots=4 and we have \sqrt{5}=[2;\overline{4}].

The zeroth convergent of this continued fraction is C_0=\frac{p_1}{q_1}=2\implies p_0=2, q_0=1. The first convergent of this continued fraction is C_1=2+\frac{1}{4}=\frac{9}{4}=\frac{p_1}{q_1}\implies p_1=9, q_1=4. Observe that (p_0, q_0)=(2,1) satisfy the negative Pell equation x^2-5y^2=-1 and that (p_1, q_1)=(9,4) satisfy the Pell equation x^2-5y^2=1.

This result also generalizes as follows:

Theorem: If the period length, n, of the simple continued fraction for \sqrt{d} is odd, then the fundamental solution to the Pell equation x^2-dy^2=1 is given by the convergent (x,y)=(p_{2n-1}, q_{2n-1}). Furthermore, then the negative Pell equation x^2-dy^2=-1 has a fundamental solution given by (x,y)=(p_{n-1}, q_{n-1}).

We can now write a computer program that solves Pell equations by combining these two theorems. Assume that we have a dictionary of continued fractions for each integer \sqrt{d} held in the dictionary continued_fracs. Then the code is:

for d in continued_fracs.keys():
  a = continued_fracs[d]
  n = len(a)-1
  p = [a[0], a[0]*a[1]+1]
  q = [1, a[1]]
  if(n%2 == 0):
    for k in range(2, n):
      p.append(a[k]*p[k-1]+p[k-2])
      q.append(a[k]*q[k-1]+q[k-2])
    print("Fundamental solution to x^2-dy^2=1 for d=%d is x=%d, %d" %(d, p[n-1], q[n-1]))
    print("No solution to negative Pell equation x^2-dy^2=-1 for d=%d" %(d))
else:
  end_a = a[1:]
  a += end_a
  for k in range(2, 2*n):
    p.append(a[k]*p[k-1]+p[k-2])
    q.append(a[k]*q[k-1]+q[k-2])
  print("Fundamental solution to x^2-dy^2=1 for d=%d is x=%d, y=%d:"%(d, p[2*n-1], q[2*n-1]))
  print("Fundamental solution to x^2-dy^2=-1 for d=%d is x=%d, y=%d"%(d, p[n-1], q[n-1]))

In lines 2 through 5, we compute the length of the continued fraction’s period length and the initial values of the recursive sequences p_0, p_1, q_0, q_1. In lines 6 through 11, if the period length is even, we compute each of the values of p_k, q_k according to the recursive definition and then print out the fundamental solution to the Pell equation.

In lines 13 through 19, if the period length is odd, we have to compute up to p_{2n-1}. Hence, we replicate the repeating block of the continued fraction in lines 13 and 14, before computing the values recursively as before in lines 16 and 17. We then print the fundamental solution to the Pell equation in line 18 and the negative Pell equation in line 19.

The full Python file can be found on GitHub, and the output of the code can also be found here. The output for up to d=10 is also printed below in the table:

Value of d Solution to Pell EquationSolution to Negative Pell Equation
2(3,2)(1,1)
3 (2,1)None
5 (9,4) (2,1)
6(5,2) None
7 (8,3) None
8 (3,1)None
10 (19,6)(3,1)

Large Language Model Exploration:
If you’ve made it this far, you may wonder how large language models would do in solving the negative Pell equation x^2-3y^2=-1. I experimented with this question a bit on https://lmarena.ai/ and obtained the following two answers:

The one on the left is a complete fabrication, while Grok-2 impressively solves it correctly. If you experiment with other large language models, I’d be curious to see which ones can and can’t solve questions on negative Pell equations correctly.

References:

  1. David Burton, “Elementary Number Theory”, 7th Edition, 2012.
  2. W. Sierpinski, “Elementary Theory of Numbers”, 1988

Leave a Reply