
Newton’s method


Newton’s method, also known as the Newton-Raphson method, is an iterative numerical method for solving equations of the form $f(x)=0$. It relies on the fixed-point method and on a particular function $g$ built from $f$ and its derivative.

Definition

Newton’s method is a fixed-point method based on the function $g$:

$$g(x)=x-\displaystyle\frac{f(x)}{f'(x)}$$


It is easily seen that finding a fixed point of $g$ amounts to finding a solution of the following equation

$$f(x)=0\quad (E).$$


Recall that, in order to find the fixed point, we use the iterative algorithm defined by the sequence

$$x_{n+1}=g(x_n)$$


The numerical scheme for Newton’s method is

$$x_{n+1}=x_n-\displaystyle\frac{f(x_n)}{f'(x_n)}$$
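
As a concrete illustration, here is a minimal Python sketch of this scheme; the function name, tolerance and iteration cap are illustrative choices, not part of the method itself.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Solve f(x) = 0 by iterating x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:   # close enough to a root
            break
        x -= fx / df(x)     # one Newton step
    return x

# Example: the positive root of x^2 - 2 = 0 is sqrt(2).
print(newton(lambda x: x * x - 2, lambda x: 2 * x, x0=1.0))
```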

Geometrical interpretation

The tangent to the graph of $f$ at the point $(x_n,f(x_n))$ has the following equation

$$y=f(x_n)+f'(x_n)(x-x_n)$$


$x_{n+1}$ is nothing other than the abscissa of the point of intersection of this tangent with the $x$-axis. Indeed, setting $y=0$ and solving for $x$:

$$ \begin{array}{lcl} y&=&0\\ x&=&x_n-\displaystyle\frac{f(x_n)}{f'(x_n)} \end{array} $$


We then set $x_{n+1}$:

$$x_{n+1}=x_n-\displaystyle\frac{f(x_n)}{f'(x_n)}$$
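
For instance, with $f(x)=x^2-2$ and $x_n=1.5$, we have $f(1.5)=0.25$ and $f'(1.5)=3$, so the tangent $y=0.25+3(x-1.5)$ crosses the $x$-axis at $x_{n+1}=1.5-0.25/3\approx 1.4167$, already close to $\sqrt{2}\approx 1.4142$.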

Convergence of Newton’s method

Theorem.

Let $f\in\mathcal{C}^2([a,b])$ and $x_*\in]a,b[$ such that

$$f(x_*)=0$$


and

$$f'(x_*)\neq 0$$


Then, there exists $\delta>0$ such that Newton’s method converges for all $x_0\in I_\delta=[x_*-\delta,x_*+\delta]$.

Proof.

By assumption, $f'$ is continuous and $f'(x_*)\neq 0$. Therefore, there exists $\eta>0$ such that:

$$f'(x)\neq 0 \quad \forall x \in[x_*-\eta,x_*+\eta]\subset [a,b]$$

The derivative of $g$ is defined by:

$$g'(x)=\left(x-\displaystyle\frac{f(x)}{f'(x)}\right)'=1-\displaystyle\frac{f'(x)^2-f(x)f''(x)}{f'(x)^2}=\displaystyle\frac{f(x)f''(x)}{f'(x)^2}$$

By assumption, $f(x_*)=0$ and $f'(x_*)\neq 0$. Consequently:

$$g'(x_*)=0$$


Moreover, $g'$ is continuous on $[x_*-\eta,x_*+\eta]$ since

$$f'(x)\neq 0 \quad \forall x \in[x_*-\eta,x_*+\eta]\subset [a,b]$$


Writing out the continuity of $g'$ at $x_*$ on the interval $[x_*-\eta,x_*+\eta]$ yields:

$$\forall \varepsilon >0,\ \exists \eta_\varepsilon\in]0,\eta]\ :\quad |x-x_*|\leq \eta_\varepsilon \Longrightarrow |g'(x)-g'(x_*)|\leq\varepsilon$$


that is, since $g'(x_*)=0$,

$$\forall \varepsilon >0,\ \exists \eta_\varepsilon\in]0,\eta]\ :\quad |x-x_*|\leq \eta_\varepsilon \Longrightarrow |g'(x)|\leq\varepsilon$$


Therefore, choosing $\varepsilon=\tau$ with $\tau\in]0,1[$, there exists $\delta\leq\eta$ such that

$$|g'(x)|\leq\tau,\quad\forall x\in I_\delta=[x_*-\delta,x_*+\delta].$$


Up to now, we have proved one of the assumptions of the fixed-point theorem. We now need to prove that the interval $I_\delta$ is $g$-invariant. That is:

$$g(I_\delta)\subset I_\delta$$


By means of the mean value theorem, for every $x\in I_\delta$ there exists $\xi$ between $x$ and $x_*$ such that

$$ \begin{array}{ccl} |g(x)-g(x_*)|&=&|g'(\xi)||x-x_*|\leq\tau|x-x_*|<|x-x_*|\\ |g(x)-x_*|&<&|x-x_*|\leq\delta \end{array} $$

where the second line uses $g(x_*)=x_*$;


hence

$$g(I_\delta)\subset I_\delta$$


To sum up, we have proven that:
- $g\in\mathcal{C}^1(I_\delta)$ and $g(I_\delta)\subset I_\delta$
- there exists a constant $\tau\in]0,1[$ such that $|g'(x)|\leq \tau$ for all $x\in I_\delta$.

Via the fixed-point theorem, we conclude that the sequence defined by $x_{n+1} = g(x_n)$ converges to $x_*$, the fixed point of $g$, for all $x_0\in I_\delta$.

Order of convergence of Newton’s method

Theorem.

If
- $g'(x_*)=0$
- $g''$ is continuous on an open set $\mathcal{O}$ containing $x_*$
- $g''(x_*)\neq 0$

then there exists $\delta>0$ such that the sequence defined by $x_{n+1} = g(x_n)$ converges to $x_*$, the fixed point of $g$, for all $x_0\in I_\delta=[x_*-\delta,x_*+\delta]$. Newton’s method has $2^{nd}$-order convergence: the convergence is quadratic.

Proof.

As previously, we show that there exist $\delta>0$ and $\tau\in]0,1[$ such that

$$|g'(x)|\leq\tau,\quad\forall x \in I_\delta=[x_*-\delta,x_*+\delta] \subset\mathcal{O}$$


Since $g'(x_*)=0$, we expand $g$ in a second-order Taylor series (with Lagrange remainder) in the neighborhood of $x_*$, using the fact that $e_n=x_n-x_*$ approaches 0:

$$ \begin{array}{ccl} x_{n+1}&=&g(x_n)\\ &=&g(e_n+x_*)\\ &=&g(x_*)+e_ng'(x_*)+\displaystyle\frac{e_n^2}{2!}g''(\xi_n)\\ &=&x_*+0+\displaystyle\frac{e_n^2}{2!}g''(\xi_n)\\ &=&x_*+\displaystyle\frac{e_n^2}{2!}g''(\xi_n) \end{array} $$


where $\xi_n$ lies between $x_*$ and $x_n$.
Hence:

$$e_{n+1}=x_{n+1}-x_*=\displaystyle\frac{e_n^2}{2!}g''(\xi_n)$$


and

$$\lim_{n\to\infty}\frac{|e_{n+1}|}{|e_{n}|^2}=\left|\frac{g''(x_*)}{2}\right|$$


The asymptotic error constant is $C=\left|\displaystyle\frac{g''(x_*)}{2}\right|$. The convergence is quadratic, that is, of $2^{nd}$ order, since by assumption $g''(x_*)\neq0$.
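
As a quick numerical sketch of this quadratic behavior: for $f(x)=x^2-2$ one finds $g''(x_*)=f''(x_*)/f'(x_*)=1/\sqrt{2}$ (see the simple-root section below), so the ratios $|e_{n+1}|/|e_n|^2$ should approach $1/(2\sqrt{2})\approx 0.3536$. The starting point and iteration count are illustrative choices.

```python
import math

f  = lambda x: x * x - 2    # simple root at x_* = sqrt(2)
df = lambda x: 2 * x

x, x_star = 1.0, math.sqrt(2)
for n in range(5):
    e = x - x_star              # current error e_n
    x -= f(x) / df(x)           # Newton step
    e_next = x - x_star         # next error e_{n+1}
    if e != 0 and e_next != 0:
        print(n, abs(e_next) / e**2)   # tends to ~0.3536
```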

Multiplicity of a root

Let $p\in\mathbf{N}$. $f$ is said to have a root $x_*$ of multiplicity (or order) $p$ if

$$ \begin{array}{lcl} f(x_*)&=&f'(x_*)=f''(x_*)=f^{(3)}(x_*)=\ldots=f^{(p-1)}(x_*)=0\\ f^{(p)}(x_*)&\neq& 0 \end{array} $$


and $f$ can be written as follows:

$$f(x)=(x-x_*)^ph(x)$$


where $h(x_*)\neq0$.
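
For example, $f(x)=(x-1)^2(x+2)$ has a root of multiplicity $p=2$ at $x_*=1$, with $h(x)=x+2$ and $h(1)=3\neq0$.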

Convergence of Newton’s method: simple root

Theorem.

If $f\in\mathcal{C}^2([a,b])$ and $f$ has a simple root $x_*$, then Newton’s method converges at least quadratically ($2^{nd}$ order).

Proof.

If $f$ has a simple root, then:

$$ \begin{array}{lcl} f(x_*)&=&0\\ f'(x_*)&\neq& 0 \end{array} $$


so Newton’s method converges by the previous theorem.
The derivative of $g$ is given by:

$$g'(x)=\left(x-\displaystyle\frac{f(x)}{f'(x)}\right)'=\displaystyle\frac{f(x)f''(x)}{f'(x)^2}$$


The second derivative of $g$ is defined by:

$$g''(x)=\displaystyle\frac{f'(x)^3f''(x) + f'(x)^2f(x)f^{(3)}(x) - 2 f'(x) f''(x)^2f(x)}{f'(x)^4}$$


and, since $f(x_*)=0$,

$$g''(x_*)=\displaystyle\frac{f''(x_*)}{f'(x_*)}$$


If $f''(x_*)\neq 0$,
then
- $g'(x_*)=0$
- $g''$ is continuous on an open set $\mathcal{O}$ containing $x_*$
- $g''(x_*)\neq0$

and the previous theorem applies: the convergence is quadratic.

If $f''(x_*)=0$, then $g''(x_*)=0$, and we must deepen the investigation by introducing a $3^{rd}$-order Taylor series of $g$ in the neighborhood of $x_*$. As before, $e_n$ approaches 0. This gives

$$ \begin{array}{ccl} x_{n+1}&=&g(x_n)\\ &=&g(e_n+x_*)\\ &=&g(x_*)+e_ng'(x_*)+\displaystyle\frac{e_n^2}{2!}g''(x_*) +\displaystyle\frac{e_n^3}{3!}g^{(3)}(\xi_n)\\ &=&x_*+0+0 +\displaystyle\frac{e_n^3}{3!}g^{(3)}(\xi_n)\\ &=&x_*+\displaystyle\frac{e_n^3}{3!}g^{(3)}(\xi_n) \end{array} $$


where $\xi_n$ lies between $x_*$ and $x_n$. Hence:

$$e_{n+1}=x_{n+1}-x_*=\displaystyle\frac{e_n^3}{3!}g^{(3)}(\xi_n)$$


and

$$\lim_{n\to\infty}\frac{|e_{n+1}|}{|e_{n}|^3}=\left|\displaystyle\frac{g^{(3)}(x_*)}{3!}\right|$$


The asymptotic error constant is $C=\left|\displaystyle\frac{g^{(3)}(x_*)}{3!}\right|$ and the convergence is cubic, that is, of $3^{rd}$ order, if $g^{(3)}(x_*)\neq 0$. If $g^{(3)}(x_*)=0$, a $4^{th}$-order Taylor series of $g$ in the neighborhood of $x_*$ is required.

We have therefore proved that the convergence of Newton’s method is at least quadratic.

Convergence of Newton’s method: multiple root

Theorem.

Assume that $f$ has a root of multiplicity $p$. Then, Newton’s method converges linearly ($1^{st}$-order) and the asymptotic error constant is:

$$C=1-\frac{1}{p}$$

Proof.

$f$ has a root $x_*$ of multiplicity $p\geq 2$. Consequently, there exists a function $h$ such that:

$$f(x)=(x-x_*)^ph(x)$$


where $h(x_*)\neq0$.

Since $f'(x)=(x-x_*)^{p-1}\left(ph(x)+(x-x_*)h'(x)\right)$, the function $g$ reads

$$g(x)=x-\displaystyle\frac{(x-x_*)h(x)}{ph(x)+(x-x_*)h'(x)}$$

and its derivative $g'$ is given by:

$$ \begin{array}{ccl} g'(x)=1&-&\displaystyle\frac{(ph(x)+(x-x_*)h'(x))(h(x)+(x-x_*)h'(x))}{(ph(x)+(x-x_*)h'(x))^2}\\ &+& \displaystyle\frac{(x-x_*)h(x)(ph'(x)+h'(x)+(x-x_*)h''(x))}{(ph(x)+(x-x_*)h'(x))^2} \end{array} $$


It follows that:

$$g'(x_*)=1-\frac{1}{p}$$


therefore $g'(x_*)\neq 0$ since $p\geq 2$. A $1^{st}$-order Taylor expansion of $g$ in the neighborhood of $x_*$ gives:

$$ \begin{array}{ccl} x_{n+1}&=&g(x_n)\\ &=&g(e_n+x_*)\\ &=&g(x_*)+e_ng'(\xi_n)\\ &=&x_*+e_ng'(\xi_n) \end{array} $$


where $\xi_n$ lies between $x_*$ and $x_n$. Hence:

$$e_{n+1}=x_{n+1}-x_*=e_ng'(\xi_n)$$


The convergence is of $1^{st}$ order, i.e. linear, since:

$$\lim_{n\to\infty}\frac{|e_{n+1}|}{|e_{n}|}=|g'(x_*)|=\left|1-\frac{1}{p}\right|=1-\frac{1}{p}$$


with an asymptotic constant

$$C=1-\frac{1}{p}.$$
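
As a numerical sketch of this linear behavior: for $f(x)=(x-1)^2(x+2)$, which has a double root at $x_*=1$ (so $p=2$), the error ratios $|e_{n+1}|/|e_n|$ should approach $C=1-1/2=0.5$. The starting point and iteration count are illustrative.

```python
f  = lambda x: (x - 1)**2 * (x + 2)   # double root at x_* = 1
df = lambda x: 3 * (x - 1) * (x + 1)  # f'(x) = 3(x-1)(x+1)

x, x_star = 2.0, 1.0
for n in range(8):
    e = x - x_star
    x -= f(x) / df(x)                     # plain Newton step
    print(n, abs(x - x_star) / abs(e))    # tends to ~0.5
```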

Modified Newton’s method

We have previously seen that when the function $f$ has a multiple root, the order of convergence deteriorates: only linear convergence is attained. How can we recover quadratic (order 2) convergence when the root has multiplicity $p\geq 2$? This is the aim of the modified Newton’s method.

Definition.
The modified Newton’s method of order $p$ is the fixed-point method applied to the function

$$g(x)=x-p\displaystyle\frac{f(x)}{f'(x)}$$

Theorem.

Assume that $f$ has a root of multiplicity $p$. Then the modified Newton’s method converges at least quadratically (order 2).

Proof.
The derivative $g'$ of $g$ is now:

$$ \displaystyle \begin{array}{ccl} g'(x)&=&1 - p\big((ph(x)+(x-x_*)h'(x))(h(x)+(x-x_*)h'(x))\\ &-& (x-x_*)h(x)(ph'(x)+h'(x)+(x-x_*)h''(x))\big)\big/(ph(x)+(x-x_*)h'(x))^2 \end{array} $$

It follows that:

$$g’(x_*)=0$$

We thus conclude, just as before, that the convergence of the modified Newton’s method is at least quadratic.
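
A comparative sketch, reusing the double-root example from the multiplicity section: plain Newton converges linearly while the modified method with $p=2$ converges quadratically. Functions, starting point and step count are illustrative choices.

```python
f  = lambda x: (x - 1)**2 * (x + 2)   # double root at x_* = 1 (p = 2)
df = lambda x: 3 * (x - 1) * (x + 1)

def iterate(p, x, steps=6):
    """Run x_{n+1} = x_n - p*f(x_n)/f'(x_n) and record the errors |x_n - 1|."""
    errors = []
    for _ in range(steps):
        if df(x) == 0:        # exact root reached (in floating point)
            break
        x -= p * f(x) / df(x)
        errors.append(abs(x - 1.0))
    return errors

print("plain Newton   :", iterate(p=1, x=2.0))   # errors shrink by ~0.5 each step
print("modified (p=2) :", iterate(p=2, x=2.0))   # errors square each step
```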
