Pell Equations with Algorithms

While doing my math/CS undergraduate degree at the University of Alberta, I presented at the Seminar in Number Theory for Alberta Students on solving Pell equations using continued fractions. I also presented an O(n) method for finding the fundamental solution to the Pell equation \(x^2-dy^2=1\) during my talk. In this blog post, I’m going to present the essential parts of my talk on how this O(n) method works, show the results of it for up to \(d=99\), and also analyze the negative Pell equation \(x^2-dy^2=-1\).

As a note, the first several definitions are all rewritten from David Burton’s Elementary Number Theory, which greatly inspired this post. Burton’s text is one of the most well-written and comprehensive number theory books I know, and I strongly recommend it.

Definition: A finite continued fraction is of the form $$a_0+\frac{1}{a_1+\frac{1}{a_2+\frac{1}{a_3+\frac{1}{\ddots+\frac{1}{a_{n-1}+\frac{1}{a_n}}}}}},$$ where the \(a_i\)’s are positive real numbers, except possibly \(a_0\) which may be \(0\). The fraction is called simple if all the \(a_i\) are integers. We often denote such a fraction as \([a_0; a_1, \cdots, a_n]\)

For instance, $$[1; 2, 3, 4]=1+\frac{1}{2+\frac{1}{3+\frac{1}{4}}}=1+\frac{1}{2+\frac{1}{\frac{13}{4}}}=1+\frac{1}{2+\frac{4}{13}}=\cdots=\frac{43}{30}.$$

Definition: Given a continued fraction \([a_0; a_1, \cdots, a_n]\), we define the convergent \(C_k\) that is obtained by cutting off the fraction after \(a_k\) to be \(C_k=[a_0; a_1, \cdots, a_k], 1\le k\le n.\) By definition, \(C_0=a_0\).


Example: We can see that the continued fraction for \(\phi=\frac{1+\sqrt{5}}{2}\) is given by \(\phi=[1; 1, 1, \cdots]\). To see this, observe that \(\phi\) satisfies the equation \(\phi=1+\frac{1}{\phi}=1+\frac{1}{1+\frac{1}{\phi}}=\cdots\).


If we compute the \( k^{\mbox{th}}\) convergent of the continued fraction for \( \phi\), we get the Fibonacci numbers! Observe that \( C_0=\frac{1}{1}, C_1=\frac{2}{1}, C_2=1+\frac{1}{1+\frac{1}{1}}=\frac{3}{2}, C_3=1+\frac{2}{3}=\frac53\), and so forth. In general, we claim that \(C_k=\frac{F_{k+2}}{F_{k+1}}\), where \(F_1=F_2=1\) and \(F_n=F_{n-1}+F_{n-2}\) for \(n\ge 2\). To prove this, the base cases are given above. If it holds for \(k\ge 1\), then $$C_{k+1}=1+\frac{1}{C_k}=1+\frac{F_{k+1}}{F_{k+2}}=\frac{F_{k+2}+F_{k+1}}{F_{k+2}}=\frac{F_{k+3}}{F_{k+2}},$$ hence the pattern continues by induction.

This is related to the result \(\phi=\lim_{n\to \infty}\frac{F_{n+1}}{F_n}\).

Theorem: Given a continued fraction \([a_0; a_1, \cdots, a_n]\), we define the recurrence relationships given by \(p_0=a_0, q_0=1, p_1=a_1a_0+1, q_1=a_1\), and in general for \(2\le k\le n, p_k=a_kp_{k-1}+p_{k-2}, q_k=a_kq_{k-1}+q_{k-2}.\) Then the \(latex k^{\mbox{th}}\) convergent is given by \(C_k=\frac{p_k}{q_k}\).


Example: Consider \(\phi=[1; 1, 1, \cdots]\). Since \(a_i=1\) for every \(i\), we have \(p_0=1, q_0=1, p_1=2, q_1=1\), and in general \(p_k=p_{k-1}+p_{k-2}\) and \(q_k=q_{k-1}+q_{k-2}\). Thus \(p_k=F_{k+2}\) and \(q_k=F_{k+1}\) and we see \(C_k=\frac{p_k}{q_k}=\frac{F_{k+2}}{F_{k+1}}\) as before.


Proof: We see that \(C_0=a_0=\frac{p_0}{q_0}, C_1=a_0+\frac{1}{a_1}=\frac{a_1a_0+1}{a_1}=\frac{p_1}{q_1}\), and \(C_2=a_0+\frac{1}{a_1+\frac{1}{a_2}}=a_0+\frac{a_2}{a_1a_2+1}=\frac{a_0a_1a_2+a_0+a_2}{a_1a_2+1}=\frac{a_2(a_0a_1+1)+a_0}{a_1a_2+1}=\frac{p_2}{q_2}.\)
In general, assume the statement holds for every value \(2\le k\le m\), and we show it holds for \(k=m+1\). Observe that since \( p_{m-1}, q_{m-1}, p_{m-2}, q_{m-2}\) are independent of \(a_m\), we can write

Therefore \(p_{m+1}=a_{m+1}p_m+p_{m-1}\) and \(q_{m+1}=a_{m+1}q_m+q_{m-1}\) as desired.

Definition: In an infinite convergent sequence, if we have a block that repeats over and over, we then write \([a_0; a_1, \cdots, a_m, b_1, \cdots, b_n, b_1, \cdots, b_n, \cdots]=[a_0; a_1, \cdots, a_m, \overline{b_1, b_2, \cdots, b_n}]\).

Example: Convert \(\sqrt{3}\) to a continued fraction.

Solution: Let \sqrt{3}=1+\dfrac{1}{a_1+\frac{1}{a_2+\frac{1}{a_3+\cdots}}}. Subtracting \(1\) from both sides gives us \sqrt{3}-1=\dfrac{1}{a_1+\frac{1}{a_2+\frac{1}{a_3+\cdots}}}.

Taking the reciprocal of the equation gives us \dfrac{1}{\sqrt{3}-1}=\dfrac{\sqrt{3}+1}{2}=a_1+\frac{1}{a_2+\frac{1}{a_3+\cdots}}. We then see that a_1=1 since \sqrt{3}\approx 1.73. Now, substituting this into the equation and subtracting gives:

\dfrac{\sqrt{3}-1}{2}=\frac{1}{a_2+\frac{1}{a_3+\cdots}}\implies \dfrac{2}{\sqrt{3}-1}=\sqrt{3}+1=a_2+\dfrac{1}{a_3+\cdots}. Thus a_2=2 and we obtain:

\sqrt{3}-1=\dfrac{1}{a_3+\cdots}. But notice from this point on that it repeats from the original equation! Therefore, a_3=1, a_4=2, and in general \sqrt{3}=[1;\overline{1,2}].

Observe that the first convergent of \sqrt{3} is C_1 = \frac{p_1}{q_1}=\frac{2}{1}. Furthermore, (p_1, q_1)=(2,1) satisfy the equation x^2-3y^2=1. However, notice that the period length of \sqrt{3} is even and the negative Pell equation x^2-3y^2=-1 has no solution since -1 is not a quadratic residue mod 3. These results generalize as follows:

Theorem: If the period length, n, of the simple continued fraction for \sqrt{d} is even, then the fundamental (that is, the smallest) solution to the Pell equation x^2-dy^2=1 is given by the convergent (x,y)=(p_{n-1}, q_{n-1}). Furthermore, then the negative Pell equation x^2-dy^2=-1 has no solution.

Exercise: Compute the continued fraction for \sqrt{18}. Use this to compute the fundamental solution to the Pell equation x^2-18y^2=1 and to show that the negative Pell equation x^2-18y^2=-1 has no solutions.

Now, we’ll examine the case where the period length of the continued fraction is odd. In particular, continue the continued fraction for \sqrt{5}. We see that:

\sqrt{5}=2+\frac{1}{a_1+\frac{1}{a_2+\cdots}}\implies \sqrt{5}-2=\frac{1}{a_1+\frac{1}{a_2+\cdots}}\implies \frac{1}{\sqrt{5}-2}=a_1+\frac{1}{a_2+\cdots}.

Simplifying the left-hand side by conjugating the denominator, we get \sqrt{5}+2. Therefore, a_1=4 and we’re left with \sqrt{5}-2=\frac{1}{a_2+\cdots}. Notice this repeats, hence a_2=a_3=\cdots=4 and we have \sqrt{5}=[2;\overline{4}].

The zeroth convergent of this continued fraction is C_0=\frac{p_1}{q_1}=2\implies p_0=2, q_0=1. The first convergent of this continued fraction is C_1=2+\frac{1}{4}=\frac{9}{4}=\frac{p_1}{q_1}\implies p_1=9, q_1=4. Observe that (p_0, q_0)=(2,1) satisfy the negative Pell equation x^2-5y^2=-1 and that (p_1, q_1)=(9,4) satisfy the Pell equation x^2-5y^2=1.

This result also generalizes as follows:

Theorem: If the period length, n, of the simple continued fraction for \sqrt{d} is odd, then the fundamental solution to the Pell equation x^2-dy^2=1 is given by the convergent (x,y)=(p_{2n-1}, q_{2n-1}). Furthermore, then the negative Pell equation x^2-dy^2=-1 has a fundamental solution given by (x,y)=(p_{n-1}, q_{n-1}).

We can now write a computer program that solves Pell equations by combining these two theorems. Assume that we have a dictionary of continued fractions for each integer \sqrt{d} held in the dictionary continued_fracs. Then the code is:

for d in continued_fracs.keys():
  a = continued_fracs[d]
  n = len(a)-1
  p = [a[0], a[0]*a[1]+1]
  q = [1, a[1]]
  if(n%2 == 0):
    for k in range(2, n):
      p.append(a[k]*p[k-1]+p[k-2])
      q.append(a[k]*q[k-1]+q[k-2])
    print("Fundamental solution to x^2-dy^2=1 for d=%d is x=%d, %d" %(d, p[n-1], q[n-1]))
    print("No solution to negative Pell equation x^2-dy^2=-1 for d=%d" %(d))
else:
  end_a = a[1:]
  a += end_a
  for k in range(2, 2*n):
    p.append(a[k]*p[k-1]+p[k-2])
    q.append(a[k]*q[k-1]+q[k-2])
  print("Fundamental solution to x^2-dy^2=1 for d=%d is x=%d, y=%d:"%(d, p[2*n-1], q[2*n-1]))
  print("Fundamental solution to x^2-dy^2=-1 for d=%d is x=%d, y=%d"%(d, p[n-1], q[n-1]))

In lines 2 through 5, we compute the length of the continued fraction’s period length and the initial values of the recursive sequences p_0, p_1, q_0, q_1. In lines 6 through 11, if the period length is even, we compute each of the values of p_k, q_k according to the recursive definition and then print out the fundamental solution to the Pell equation.

In lines 13 through 19, if the period length is odd, we have to compute up to p_{2n-1}. Hence, we replicate the repeating block of the continued fraction in lines 13 and 14, before computing the values recursively as before in lines 16 and 17. We then print the fundamental solution to the Pell equation in line 18 and the negative Pell equation in line 19.

The full Python file can be found on GitHub, and the output of the code can also be found here. The output for up to d=10 is also printed below in the table:

Value of d Solution to Pell EquationSolution to Negative Pell Equation
2(3,2)(1,1)
3 (2,1)None
5 (9,4) (2,1)
6(5,2) None
7 (8,3) None
8 (3,1)None
10 (19,6)(3,1)

Large Language Model Exploration:
If you’ve made it this far, you may wonder how large language models would do in solving the negative Pell equation x^2-3y^2=-1. I experimented with this question a bit on https://lmarena.ai/ and obtained the following two answers:

The one on the left is a complete fabrication, while Grok-2 impressively solves it correctly. If you experiment with other large language models, I’d be curious to see which ones can and can’t solve questions on negative Pell equations correctly.

References:

  1. David Burton, “Elementary Number Theory”, 7th Edition, 2012.
  2. W. Sierpinski, “Elementary Theory of Numbers”, 1988

A Pythagorean Triple for 2025

Happy 2025 to everyone! As one exciting update, I plan to work more on AI for math education this year, including more work in number theory. I hope to publish some updates to my number theory book with more completed chapters if all goes well.

The following puzzle was posed to me in an email by Professor Paul Buckingham. Paul has been a big inspiration to me for a while for his mentoring of me in Algebraic Number Theory, so I knew it’d be a good problem. It led to some twists and turns in number theory and computer science, so I’m sharing it below with his permission.

Problem: Show that there are pairwise coprime positive integers x,y, and z such that: x^2+y^2=z^2 and \dfrac{x}{x+y+z}=\dfrac{1}{2}-\dfrac{1}{2025}.

Solution: We must consider two cases based on whether x is odd or even. In the first case, we know that the primitive Pythagorean triple can be generated by (x,y,z) = (m^2-n^2, 2mn, m^2+n^2) where m and n are pairwise relatively prime and have opposite parity. If you’re unfamiliar with this formula, please see my notes here.

Substituting these values into the equation gives us:

 \frac{x}{x+y+z} = \frac{m^2-n^2}{(m^2-n^2)+2mn+(m^2+n^2)}=\frac{(m-n)(m+n)}{2m^2+2mn}=\frac{(m-n)(m+n)}{2m(m+n)}=\frac{m-n}{2m}=\frac{1}{2}-\frac{n}{2m}

Therefore, we have that \dfrac{n}{2m}=\dfrac{1}{2025}. The only coprime positive integers that satisfy this are n=2 and m=2025. These generate the solution triplet (x,y,z)=(4100621, 8100, 4100629).

In the second case when x is even, we arrive at the equation:

\dfrac{x}{x+y+z} = \dfrac{2mn}{2m^2+2mn} = \dfrac{n}{m+n} = \dfrac{1}{2}-\dfrac{1}{2025}=\dfrac{2023}{4050}.

Therefore, we have that n=2023 and m=2027. However, this gives a non-primitive solution since n and m are the same parity (it actually gives double the solution found above).

Hence, the only primitive Pythagorean triple that satisfies this equation is (x,y,z)=\boxed{(4100621, 8100, 4100629)}.

I then shared this puzzle with the professor who got me into computer science, Dr. Richard Kelley. He was recently learning the Lean Proof Assistant and decided it’d be a fun exercise to code up the solution in Lean. Below is a screenshot of the code he wrote, showing how the proof assistant checks the math to ensure the solution is valid.

-- Code written by Richard Kelley
import Mathlib.NumberTheory.PythagoreanTriples
def x : ℤ := 4100621
def y : ℤ := 8100
def z : ℤ := 4100629

def FractionalEq (x y z : ℚ) : Prop :=
( x / (x + y + z) ) = ( (1/2) - (1/2025) )

-- The coercion from ℤ to ℚ is necessary to be able to use the ring_nf tactic.
example : FractionalEq (x : ℚ) (y : ℚ) (z : ℚ) := by
rw[x,y,z]
rw[FractionalEq]
ring_nf

example : (PythagoreanTriple x y z) ∧ (FractionalEq (x:ℚ) (y:ℚ) (z:ℚ)) := by
rw[x,y,z]
constructor
case left =>
rw[PythagoreanTriple]
simp
case right =>
rw[FractionalEq]
ring_nf

Extension of the problem:

Can a computer search to find the primitive Pythagorean triple we found to this? My suspicion is no, unless we’re able to put tight bounds on x,y,z. Since my master’s thesis used a lot of logic programming, I attempted to program it in Clingo (a language for answer set programming). Unfortunately, my code failed to return a solution in a reasonable amount of time. If anyone has suggestions for how to get a computer search to find this solution, I’d love to hear them in the comments below!


pythagoreanTriple(X,Y,Z) :- X=1..n, Y=1..X, Z=X..n, (X**2)+(Y**2)=(Z**2).
divisionCondition(X,Y,Z) :- X=1..n, Y=1..X, Z=X..n, X/(X+Y+Z) = 1/2 - 1/2025.
satisfiesTriple(X,Y,Z) :- X=1..n, Y=1..X, Z=X..n, pythagoreanTriple(X,Y,Z), divisionCondition(X,Y,Z).
#show satisfiesTriple/3.

Happy year 2025 = 45^2 = (20+25)^2 = \displaystyle \sum_{n=1}^{9}n^3 to you all.