<p><strong>Finding the nth Fibonacci number via an eigenvector change of basis</strong><br />Dan Davison, 2016-10-17 (http://dandavison.github.io/)</p>
<style type="text/css">
body {color: black;}
</style>
<div class="math">$$
\newcommand{\i}{\mathbf{i}}
\newcommand{\j}{\mathbf{j}}
\newcommand{\cvec}[2]{\begin{pmatrix}#1\\#2\end{pmatrix}}
\newcommand{\mat}[4]{\begin{bmatrix}#1 & #2\\#3 & #4\\ \end{bmatrix}}
\newcommand{\scvec}[2]{\tiny{\cvec{#1}{#2}}}
\newcommand{\smat}[4]{\tiny{\mat{#1}{#2}{#3}{#4}}}
\newcommand{\nth}{n^{\text{th}}}
$$</div>
<p>This is the problem given at the end of the eigenvectors video in the
<a href="https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab">Essence of Linear Algebra</a>
series by <a href="http://www.3blue1brown.com/">3blue1brown</a>.</p>
<hr />
<h4 id="introduction"><strong>Introduction</strong></h4>
<p>Consider the matrix</p>
<div class="math">$$
A = \mat{0}{1}
{1}{1}
$$</div>
<p>The first few powers are</p>
<div class="math">\begin{align*}
&A^{1} &= \mat{0}{1}
{1}{1}
\\
&A^{2} = \mat{0}{1}
{1}{1} \mat{0}{1}
{1}{1} &= \mat{1}{1}
{1}{2}
\\
&A^{3} = \mat{0}{1}
{1}{1} \mat{1}{1}
{1}{2} &= \mat{1}{2}
{2}{3}
\\
&A^{4} = \mat{0}{1}
{1}{1} \mat{1}{2}
{2}{3} &= \mat{2}{3}
{3}{5}
\end{align*}</div>
<p>The Fibonacci sequence is the sequence you get by starting with <span class="math">\(0,
1\)</span> and after that always forming the next number by adding the two previous ones:
<span class="math">\(F_0, F_1, F_2, F_3, F_4, F_5, F_6, F_7, ...\)</span> = <span class="math">\(0, 1, 1, 2, 3, 5, 8, 13, ...\)</span>.</p>
<p>The matrix powers are generating the Fibonacci sequence:</p>
<div class="math">$$
A^{n} = \mat{F_{n-1} }{F_n }
{F_n }{F_{n+1} }
$$</div>
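<p>A quick way to convince yourself of this pattern is to multiply the matrices by brute force and compare the entries against the usual recurrence. This is only a minimal sketch: the helper names <code>matmul2</code>, <code>matrix_power</code> and <code>fib_list</code> are made up for illustration.</p>

```python
def matmul2(X, Y):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matrix_power(A, n):
    """Compute A**n for a positive integer n by repeated multiplication."""
    result = A
    for _ in range(n - 1):
        result = matmul2(A, result)
    return result


def fib_list(count):
    """Return [F_0, F_1, ...] with `count` entries, using the recurrence."""
    seq = [0, 1]
    for _ in range(count - 2):
        seq.append(seq[-1] + seq[-2])
    return seq[:count]


A = [[0, 1], [1, 1]]
F = fib_list(12)
for n in range(1, 11):
    # The n-th power should hold F_{n-1}, F_n, F_n, F_{n+1}.
    assert matrix_power(A, n) == [[F[n - 1], F[n]], [F[n], F[n + 1]]]
```

<p>Note that this still performs one matrix multiplication per step, which is exactly the cost the rest of the post tries to avoid.</p>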
<p>So if there were a way to compute the <span class="math">\(\nth\)</span> power of that matrix "directly",
that would also be a way to compute the <span class="math">\(\nth\)</span> Fibonacci number "directly",
i.e. without computing all the preceding Fibonacci numbers <em>en route</em>.</p>
<p>How can we do this? To state the problem in a different way, we need to
construct a new matrix that performs exactly the same transformation as <span class="math">\(A^n\)</span>,
but which somehow does the exponentiation step "in one go" rather than by
multiplying <span class="math">\(A\)</span> with itself <span class="math">\(n\)</span> times.</p>
<h4 id="solution-outline"><strong>Solution outline</strong></h4>
<p>Matrices represent transformations, so we can talk about them as taking in some
vector and producing some other vector. The approach we're going to take is to
re-express the <span class="math">\(A^n\)</span> transformation as follows:</p>
<ol>
<li>Convert the input vector to its representation in an alternative basis which
uses the eigenvectors as the basis vectors (it's called an "eigenbasis").</li>
<li>In this alternative basis, compute the new position of the vector after
carrying out the <span class="math">\(A^n\)</span> transformation.</li>
<li>Convert the resulting vector back to its representation in our original
basis.</li>
</ol>
<p>I.e., we're going to compute the overall transformation as this product of
matrices (remember that one reads these things right-to-left):</p>
<div class="math">$$
\begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix}
\begin{bmatrix}\text{matrix that does the A transformation}\\\text{in the alternative basis} \\ \end{bmatrix}^n
\begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix}
$$</div>
<p>The crux of all this is that the exponentiation is efficient in the
eigenbasis. That's because, in the eigenbasis, the transformation is just
stretching space in the directions of the two basis vectors. So to do the
transformation <span class="math">\(n\)</span> times in the eigenbasis, you just stretch by the
stretch-factor raised to the <span class="math">\(\nth\)</span> power, rather than doing <span class="math">\(n\)</span> matrix
multiplications.</p>
<h4 id="solution-details"><strong>Solution details</strong></h4>
<p>Let's suppose we've already found the eigenvectors, and that there are two of
them, and that we've arranged them as the two columns of a matrix <span class="math">\(V\)</span>. <span class="math">\(V\)</span> holds
the basis vectors of the alternative basis, and therefore we know from the
<a href="./linear-algebra.html#change-of-basis">change of basis</a> notes that <span class="math">\(V\)</span> is the
matrix that takes as input a vector expressed in the alternative basis and
outputs its representation in our basis.</p>
<p>So, step (3) is done by <span class="math">\(V\)</span>, and step (1) is done by <span class="math">\(V^{-1}\)</span>, and the matrix
performing all three steps is going to look like</p>
<div class="math">$$
V
\begin{bmatrix}\text{matrix that does the A transformation}\\\text{in the alternative basis} \\ \end{bmatrix}^n
V^{-1}
$$</div>
<p>OK, so what is the matrix in the middle? The
<a href="./linear-algebra.html#change-of-basis">change of basis</a> notes tell us that we
can compute it as</p>
<div class="math">$$
\begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix}
A
\begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix}
$$</div>
<p>In other words the matrix in the middle is</p>
<div class="math">$$
V^{-1}AV
$$</div>
<p>and the entire transformation is</p>
<div class="math">$$
V
\Big(V^{-1}AV\Big)^n
V^{-1}
$$</div>
<p>Put back into words, that's</p>
<div class="math">$$
\begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix}
\Bigg(
\begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix}
A
\begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix}
\Bigg)^n
\begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix}
$$</div>
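<p>Here is a rough numerical sketch of that three-step product. It borrows the matrices <span class="math">\(V\)</span> and <span class="math">\(V^{-1}\)</span> and the two stretch factors from the answer section of this post, and the names <code>LAMBDA</code>, <code>V_INV</code> and <code>power_via_eigenbasis</code> are my own. It works in floating point, so the entries are only approximately integers.</p>

```python
from math import sqrt, isclose

SQRT5 = sqrt(5)
# Stretch factors (eigenvalues) along the two eigenvector directions.
LAMBDA = ((1 + SQRT5) / 2, (1 - SQRT5) / 2)
# Columns of V are the eigenvectors; V_INV is its inverse.
V = [[2.0, 2.0], [1 + SQRT5, 1 - SQRT5]]
SCALE = -1 / (4 * SQRT5)
V_INV = [[SCALE * (1 - SQRT5), SCALE * -2],
         [SCALE * -(1 + SQRT5), SCALE * 2]]


def matmul2(X, Y):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def power_via_eigenbasis(n):
    """Return V D^n V^{-1}; D^n needs only two scalar powers."""
    Dn = [[LAMBDA[0] ** n, 0.0], [0.0, LAMBDA[1] ** n]]
    return matmul2(V, matmul2(Dn, V_INV))


P = power_via_eigenbasis(10)
# The tenth power of A is [[34, 55], [55, 89]] up to rounding error.
assert isclose(P[0][1], 55, abs_tol=1e-9)
```

<p>However large <span class="math">\(n\)</span> is, the function does two matrix multiplications and two scalar exponentiations.</p>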
<p>Recall that above we observed that the <span class="math">\(\nth\)</span> power of <span class="math">\(A\)</span> is a matrix with the
nth Fibonacci number in its bottom left and top right entries. So the following
tasks remain:</p>
<ol>
<li>Find the eigenvectors and put them in a matrix <span class="math">\(V\)</span>.</li>
<li>Find the inverse of <span class="math">\(V\)</span>.</li>
<li>Compute the matrix product <span class="math">\(V^{-1}AV\)</span>.</li>
<li>Compute the result of raising that to the <span class="math">\(\nth\)</span> power.</li>
<li>Plug the result of that into the overall expression.</li>
<li>Take the entry in the bottom left or top right (they should be the same!).</li>
</ol>
<p>The result should be an expression giving the <span class="math">\(\nth\)</span> Fibonacci number as a
function of <span class="math">\(n\)</span>. It should be possible to give as input to that function the
number one million, and have it output the one millionth Fibonacci number
directly, without it having to go through the preceding 999,999 Fibonacci
numbers.</p>
<h4 id="the-answer-without-showing-the-calculations"><strong>The answer without showing the calculations</strong></h4>
<div class="math">\begin{align*}
&\text{The eigenvectors are the columns of}
\\\\
&V &= \mat{2 }{2 }
{1 + \sqrt 5}{1 - \sqrt 5}
\\\\
&\text{which has inverse}
\\\\
&V^{-1} &= \frac{-1}{4\sqrt 5} \mat{1 - \sqrt 5 }{-2}
{-1 - \sqrt 5}{2}
\\\\
&\text{Therefore}
\\\\
&V^{-1}AV &= \frac{1}{2} \mat{1 + \sqrt 5}{0 }
{0 }{1 - \sqrt 5}
\\\\
&\text{and}
\\\\
&(V^{-1}AV)^n &= \frac{1}{2^n} \mat{(1 + \sqrt 5)^n}{0 }
{0 }{(1 - \sqrt 5)^n}
\\\\
&\text{and}
\\\\
&V \Big(V^{-1}AV\Big)^n V^{-1} &=
\mat{\frac{\big((1 + \sqrt 5)^{n-1} - (1 - \sqrt 5)^{n-1}\big)}{2^{n-1}\sqrt 5}}{\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n \sqrt 5}}
{\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n \sqrt 5}}{\frac{\big((1 + \sqrt 5)^{n+1} - (1 - \sqrt 5)^{n+1}\big)}{2^{n+1}\sqrt 5}}
\\\\
&\text{Therefore the nth Fibonacci number is}
\\\\
&F_n &= \frac{(1 + \sqrt 5)^n - (1 - \sqrt 5)^n}
{2^n \sqrt 5}
\end{align*}</div>
<h4 id="does-this-actually-work"><strong>Does this actually work?</strong></h4>
<p>Yes.</p>
<div class="codehilite"><pre><span class="kn">from</span> <span class="nn">math</span> <span class="kn">import</span> <span class="n">sqrt</span>

<span class="k">def</span> <span class="nf">fib</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
    <span class="c1"># Closed-form expression for F_n, evaluated in floating point</span>
    <span class="k">return</span> <span class="p">(</span>
        <span class="p">(</span> <span class="p">(</span><span class="mi">1</span> <span class="o">+</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">5</span><span class="p">))</span><span class="o">**</span><span class="n">n</span> <span class="o">-</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">5</span><span class="p">))</span><span class="o">**</span><span class="n">n</span> <span class="p">)</span>
        <span class="o">/</span>
        <span class="nb">float</span><span class="p">(</span><span class="mi">2</span><span class="o">**</span><span class="n">n</span> <span class="o">*</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">5</span><span class="p">)))</span>

<span class="c1"># Print each value to one decimal place</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">):</span>
    <span class="nb">print</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="s2">&quot;%.1f&quot;</span> <span class="o">%</span> <span class="n">fib</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
<span class="mi">0</span> <span class="mf">0.0</span>
<span class="mi">1</span> <span class="mf">1.0</span>
<span class="mi">2</span> <span class="mf">1.0</span>
<span class="mi">3</span> <span class="mf">2.0</span>
<span class="mi">4</span> <span class="mf">3.0</span>
<span class="mi">5</span> <span class="mf">5.0</span>
<span class="mi">6</span> <span class="mf">8.0</span>
<span class="mi">7</span> <span class="mf">13.0</span>
<span class="mi">8</span> <span class="mf">21.0</span>
<span class="mi">9</span> <span class="mf">34.0</span>
</pre></div>
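<p>A slightly stronger check is to compare the rounded closed form against the plain recurrence over a range of <span class="math">\(n\)</span>. This is only a sketch with made-up function names. Floating-point numbers carry about 16 significant digits, so I would only expect the agreement to hold for moderate <span class="math">\(n\)</span>; the range tested here stops at 40.</p>

```python
from math import sqrt


def fib_closed_form(n):
    """Closed-form expression in floating point, rounded to an integer."""
    s = sqrt(5)
    return round(((1 + s) ** n - (1 - s) ** n) / (2 ** n * s))


def fib_iterative(n):
    """Reference value from the recurrence F_n = F_{n-1} + F_{n-2}."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


# Collect every n in the tested range where the two methods disagree.
mismatches = [n for n in range(41) if fib_closed_form(n) != fib_iterative(n)]
assert mismatches == []
```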
<h4 id="history"><strong>History</strong></h4>
<p>The formula is known as
<a href="https://en.wikipedia.org/wiki/Fibonacci_number#Closed-form_expression">Binet's formula</a>
(1843) but was apparently known to Euler, Daniel Bernoulli and de Moivre more
than a century earlier. It can be derived without using linear algebra
techniques; I don't know when the style of proof attempted here would first
have been done. The result can be written as</p>
<div class="math">$$
F_n = \frac{\phi^n - (1-\phi)^n}{\sqrt{5}}
$$</div>
<p>where <span class="math">\(\phi = \frac{1+\sqrt{5}}{2}\)</span> is the
<a href="https://en.wikipedia.org/wiki/Golden_ratio">golden ratio</a>.</p>
<h4 id="calculations"><strong>Calculations</strong></h4>
<h5 id="1-find-the-eigenvectors"><strong>1. Find the eigenvectors</strong></h5>
<p>We follow the textbook approach: We have
</p>
<div class="math">$$
A = \mat{0}{1}
{1}{1}
$$</div>
<p>An eigenvector <span class="math">\(\vec v\)</span> is a non-zero vector satisfying <span class="math">\(A\vec v = \lambda \vec v\)</span> for some scalar <span class="math">\(\lambda\)</span>
(the corresponding eigenvalue). That equation can be rearranged as follows</p>
<div class="math">\begin{align*}
A\vec v &= \lambda I\vec v
\\
A\vec v - \lambda I\vec v &= \vec 0
\\
(A - \lambda I)\vec v &= \vec 0
\end{align*}</div>
<p>which means that the matrix <span class="math">\(A - \lambda I\)</span> is a transformation that takes some
non-zero vector <span class="math">\(\vec v\)</span> to the zero vector (i.e. it has a non-trivial "null
space"). This means that the transformation cannot be reversed, i.e. the matrix
has no inverse, i.e. its determinant is zero. So, use that last fact to find
the eigenvalues <span class="math">\(\lambda\)</span>:</p>
<div class="math">\begin{align*}
\det (A - \lambda I) &= 0
\\
\\
\det \mat{-\lambda}{1}
{1 }{1 - \lambda} &= 0
\\
\\
(-\lambda)(1 - \lambda) - 1 &= 0
\\
\\
\lambda^2 - \lambda - 1 &= 0
\end{align*}</div>
<p>Using the quadratic formula we have <span class="math">\(a=1, b=-1, c=-1\)</span> and</p>
<div class="math">\begin{align*}
\lambda
= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
= \frac{1 \pm \sqrt{5}}{2}
\end{align*}</div>
<p>which are the two eigenvalues.</p>
<p>To find eigenvectors associated with the eigenvalues, go back to the equations</p>
<div class="math">\begin{align*}
(A - \lambda I)\vec v &= \vec 0
\\
\\
\mat{-\lambda}{1}
{1 }{1 - \lambda} \vec v &= \vec 0
\end{align*}</div>
<p>Let an eigenvector <span class="math">\(v\)</span> be <span class="math">\(\scvec{v_1}{v_2}\)</span>. The matrix equation corresponds
to this system of equations:</p>
<div class="math">$$
\begin{cases}
-\lambda v_1 &+ v_2 &= 0\\
v_1 &+ (1 - \lambda) v_2 &= 0
\end{cases}
$$</div>
<p>From the first equation we have <span class="math">\(v_2 = \lambda v_1\)</span>. There are infinitely many
eigenvectors (a line of them) associated with any given eigenvalue, so we can
pick an arbitrary value for <span class="math">\(v_1\)</span>. If we choose <span class="math">\(v_1=2\)</span> then we have
eigenvectors <span class="math">\(\scvec{2}{1+\sqrt 5}\)</span> and <span class="math">\(\scvec{2}{1-\sqrt 5}\)</span>. The matrix
containing the eigenvectors is</p>
<div class="math">$$
V = \mat{2 }{2 }
{1 + \sqrt 5}{1 - \sqrt 5}
$$</div>
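<p>It is easy to check numerically that these really are eigenvectors, i.e. that <span class="math">\(A\vec v = \lambda \vec v\)</span> holds for each pair. A minimal sketch, with hypothetical helper names and a small tolerance because <span class="math">\(\sqrt 5\)</span> is only approximated in floating point:</p>

```python
from math import sqrt, isclose

A = [[0, 1], [1, 1]]


def apply(M, v):
    """Apply a 2x2 matrix to a 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]


def eigenpairs():
    """Return the two (eigenvalue, eigenvector) pairs derived in the text."""
    s = sqrt(5)
    pairs = []
    for sign in (+1, -1):
        lam = (1 + sign * s) / 2      # root of lambda^2 - lambda - 1 = 0
        v = [2.0, 2.0 * lam]          # choose v_1 = 2, then v_2 = lambda * v_1
        pairs.append((lam, v))
    return pairs


def is_eigenpair(lam, v):
    """True if A v equals lam * v, up to rounding error."""
    Av = apply(A, v)
    return all(isclose(Av[i], lam * v[i], abs_tol=1e-12) for i in range(2))


assert all(is_eigenpair(lam, v) for lam, v in eigenpairs())
```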
<h5 id="2-find-inverse-of-v"><strong>2. Find inverse of <span class="math">\(V\)</span></strong></h5>
<p>The inverse of a 2x2 matrix is given by</p>
<div class="math">$$
\mat{a}{c}
{b}{d} ^ {-1}
=
\frac{1}{\text{det}} \mat{d}{-c}
{-b}{a}
$$</div>
<p>where <span class="math">\(\text{det} = ad - cb\)</span>. Therefore</p>
<div class="math">\begin{align*}
V^{-1}
&= \frac{1}{2(1 - \sqrt 5) - 2(1 + \sqrt 5)} \mat{1 - \sqrt 5 }{-2}
{-(1 + \sqrt 5)}{2}
\\\\
&= \frac{-1}{4\sqrt 5} \mat{1 - \sqrt 5 }{-2}
{-(1 + \sqrt 5)}{2}
\end{align*}</div>
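<p>As a sanity check, the same 2x2 inverse formula can be coded up and multiplied back against <span class="math">\(V\)</span>; the product should be the identity up to rounding. The function names here are illustrative only.</p>

```python
from math import sqrt, isclose


def inverse2(M):
    """Invert [[a, c], [b, d]] as (1/det) * [[d, -c], [-b, a]]."""
    (a, c), (b, d) = M
    det = a * d - c * b
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -c / det], [-b / det, a / det]]


def matmul2(X, Y):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


s = sqrt(5)
V = [[2.0, 2.0], [1 + s, 1 - s]]
V_inv = inverse2(V)
product = matmul2(V, V_inv)   # expected to be close to the identity matrix
```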
<h5 id="3-find-the-matrix-product-v-1av"><strong>3. Find the matrix product <span class="math">\(V^{-1}AV\)</span></strong></h5>
<p>Before we get lost in the calculation, let's remember what this is. It's a
matrix that does the <span class="math">\(A\)</span> transformation, but <em>in the coordinate system defined
by <span class="math">\(A\)</span>'s eigenvectors</em>. So, the resulting matrix <em>must</em> do nothing other than
stretch space in the direction of one or both basis vectors in that coordinate
system. That's because (1) we represent a transformation with a matrix saying
where each of the basis vectors is taken to, (2) the definition of an
eigenvector of a transformation is that it is a vector which is simply
stretched by the transformation with no change in direction, therefore (3) if
the eigenvectors are the basis vectors, then the matrix representing the
transformation must just stretch space in the two directions. A matrix which
stretches space in the direction of the basis vectors looks like
<span class="math">\(\smat{a}{0}{0}{b}\)</span>, i.e. it is diagonal. Therefore, <span class="math">\(V^{-1}AV\)</span> <em>must</em> be
diagonal.</p>
<div class="math">\begin{align*}
V^{-1}AV &=
\frac{-1}{4\sqrt 5}
\mat{1 - \sqrt 5 }{-2}
{-(1 + \sqrt 5)}{2}
\mat{0}{1}
{1}{1}
\mat{2 }{2 }
{1 + \sqrt 5}{1 - \sqrt 5}
\\\\
&=
\frac{-1}{4\sqrt 5}
\mat{1 - \sqrt 5 }{-2}
{-(1 + \sqrt 5)}{2}
\mat{1 + \sqrt 5}{1 - \sqrt 5}
{3 + \sqrt 5}{3 - \sqrt 5}
\\\\
&=
\frac{-1}{4\sqrt 5}
\mat{-4 - 2(3 + \sqrt 5) }{6 - 2\sqrt 5 - 2(3 - \sqrt 5)}
{-(6 + 2\sqrt 5) + 2(3 + \sqrt 5)}{4 + 2(3 - \sqrt 5)}
\\\\
&=
\frac{-1}{2\sqrt 5}
\mat{-2 - 3 - \sqrt 5}{3 - \sqrt 5 - 3 + \sqrt 5}
{-3 - \sqrt 5 + 3 + \sqrt 5}{2 + 3 - \sqrt 5}
\\\\
&=
\frac{-1}{2\sqrt 5}
\mat{-5 - \sqrt 5}{0 }
{0 }{5 - \sqrt 5}
\\\\
&=
\frac{1}{2}
\mat{1 + \sqrt 5}{0 }
{0 }{1 - \sqrt 5}
\end{align*}</div>
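<p>The algebra is fiddly, so here is a small floating-point check that the product really comes out diagonal with the two eigenvalues on the diagonal. This is a sketch only; the variable names are mine.</p>

```python
from math import sqrt, isclose

s = sqrt(5)
A = [[0.0, 1.0], [1.0, 1.0]]
V = [[2.0, 2.0], [1 + s, 1 - s]]
k = -1 / (4 * s)
V_inv = [[k * (1 - s), k * -2], [k * -(1 + s), k * 2]]


def matmul2(X, Y):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k_] * Y[k_][j] for k_ in range(2)) for j in range(2)]
            for i in range(2)]


# The A transformation expressed in the eigenbasis.
D = matmul2(V_inv, matmul2(A, V))

# Off-diagonal entries vanish (up to rounding); the diagonal holds the eigenvalues.
assert isclose(D[0][1], 0.0, abs_tol=1e-12)
assert isclose(D[1][0], 0.0, abs_tol=1e-12)
```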
<h5 id="4-compute-v-1avn"><strong>4. Compute <span class="math">\((V^{-1}AV)^n\)</span></strong></h5>
<p>The matrix is diagonal so this is straightforward. Note that this is the whole
point of converting to the eigenbasis: the exponentiation at this step just
involves the usual operations of raising scalar numbers to a power; no need to
multiply matrices together. A computer will be able to compute the <span class="math">\(\nth\)</span> power
of a diagonal matrix much faster than that of a non-diagonal matrix.</p>
<div class="math">$$
(V^{-1}AV)^n = \frac{1}{2^n} \mat{(1 + \sqrt 5)^n}{0 }
{0 }{(1 - \sqrt 5)^n}
$$</div>
<h5 id="5-plug-the-nth-power-into-the-overall-expression"><strong>5. Plug the <span class="math">\(\nth\)</span> power into the overall expression</strong></h5>
<div class="math">\begin{align*}
V \Big(V^{-1}AV\Big)^n V^{-1}
&=
\frac{-1}{4\sqrt 5}
\frac{1}{2^n}
\mat{2 }{2 }
{1 + \sqrt 5}{1 - \sqrt 5}
\mat{(1 + \sqrt 5)^n}{0 }
{0 }{(1 - \sqrt 5)^n}
\mat{1 - \sqrt 5 }{-2}
{-(1 + \sqrt 5)}{2}
\\\\
&=
\frac{-1}{4\sqrt 5}
\frac{1}{2^n}
\mat{2 }{2 }
{1 + \sqrt 5}{1 - \sqrt 5}
\mat{(1 - \sqrt 5)(1 + \sqrt 5)^n}{-2(1 + \sqrt 5)^n}
{-(1 + \sqrt 5)(1 - \sqrt 5)^n}{2(1 - \sqrt 5)^n}
\\\\
&=
\frac{-1}{4\sqrt 5}
\frac{1}{2^n}
\mat{2(-4)\big((1 + \sqrt 5)^{n-1} - (1 - \sqrt 5)^{n-1}\big)}{-4\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}
{ -4\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{-2\big((1 + \sqrt 5)^{n+1} - (1 - \sqrt 5)^{n+1}\big)}
\\\\
&=
\frac{1}{4\sqrt 5}
\mat{4\frac{\big((1 + \sqrt 5)^{n-1} - (1 - \sqrt 5)^{n-1}\big)}{2^{n-1}}}{4\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n }}
{4\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n }}{ \frac{\big((1 + \sqrt 5)^{n+1} - (1 - \sqrt 5)^{n+1}\big)}{2^{n-1}}}
\\\\
&=
\mat{\frac{\big((1 + \sqrt 5)^{n-1} - (1 - \sqrt 5)^{n-1}\big)}{2^{n-1}\sqrt 5}}{\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n \sqrt 5}}
{\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n \sqrt 5}}{\frac{\big((1 + \sqrt 5)^{n+1} - (1 - \sqrt 5)^{n+1}\big)}{2^{n+1}\sqrt 5}}
\end{align*}</div>
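<p>One caveat if you want to evaluate this for large <span class="math">\(n\)</span>: floating-point arithmetic cannot hold all the digits. A possible workaround, which is not part of the derivation in this post, is to do the arithmetic exactly on numbers of the form <span class="math">\(a + b\sqrt 5\)</span> with rational <span class="math">\(a\)</span> and <span class="math">\(b\)</span>. The sketch below stores such a number as the pair <code>(a, b)</code> and raises <span class="math">\(\frac{1+\sqrt 5}{2}\)</span> to the <span class="math">\(\nth\)</span> power by repeated squaring; all names are hypothetical.</p>

```python
from fractions import Fraction


def mul(x, y):
    """Multiply a + b*sqrt(5) by c + d*sqrt(5), each stored as a pair."""
    a, b = x
    c, d = y
    return (a * c + 5 * b * d, a * d + b * c)


def power(x, n):
    """Raise x to a non-negative integer power by repeated squaring."""
    result = (Fraction(1), Fraction(0))
    while n:
        if n % 2:
            result = mul(result, x)
        x = mul(x, x)
        n //= 2
    return result


def fib_exact(n):
    """Evaluate the closed form exactly."""
    phi = (Fraction(1, 2), Fraction(1, 2))   # (1 + sqrt 5) / 2
    a, b = power(phi, n)
    # phi^n = a + b*sqrt(5) and the conjugate power is a - b*sqrt(5),
    # so their difference divided by sqrt(5) is 2b.
    return int(2 * b)
```

<p>This is not quite "in one go": it takes a number of squarings that grows with the number of binary digits of <span class="math">\(n\)</span>, but it never visits the intermediate Fibonacci numbers one by one.</p>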
<script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {
var align = "center",
indent = "0em",
linebreak = "false";
if (false) {
align = (screen.width < 768) ? "left" : align;
indent = (screen.width < 768) ? "0em" : indent;
linebreak = (screen.width < 768) ? 'true' : linebreak;
}
var mathjaxscript = document.createElement('script');
var location_protocol = (false) ? 'https' : document.location.protocol;
if (location_protocol !== 'http' && location_protocol !== 'https') location_protocol = 'https:';
mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';
mathjaxscript.type = 'text/javascript';
mathjaxscript.src = location_protocol + '//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
mathjaxscript[(window.opera ? "innerHTML" : "text")] =
"MathJax.Hub.Config({" +
" config: ['MMLorHTML.js']," +
" TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' }, Macros: {} }," +
" jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
" extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
" displayAlign: '"+ align +"'," +
" displayIndent: '"+ indent +"'," +
" showMathMenu: true," +
" messageStyle: 'normal'," +
" tex2jax: { " +
" inlineMath: [ ['\\\\(','\\\\)'] ], " +
" displayMath: [ ['$$','$$'] ]," +
" processEscapes: true," +
" preview: 'TeX'," +
" }, " +
" 'HTML-CSS': { " +
" styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," +
" linebreaks: { automatic: "+ linebreak +", width: '90% container' }," +
" }, " +
"}); " +
"if ('default' !== 'default') {" +
"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"}";
(document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);
}
</script>
<p><strong>Linear Algebra</strong><br />Dan Davison, 2016-08-14</p>
<style type="text/css">
body {color: black;}
</style>
<div class="math">$$
\newcommand{\i}{\mathbf{i}}
\newcommand{\j}{\mathbf{j}}
\newcommand{\cvec}[2]{\begin{pmatrix}#1\\#2\end{pmatrix}}
\newcommand{\mat}[4]{\begin{bmatrix}#1 & #2\\#3 & #4\\ \end{bmatrix}}
\newcommand{\scvec}[2]{\tiny{\cvec{#1}{#2}}}
\newcommand{\smat}[4]{\tiny{\mat{#1}{#2}{#3}{#4}}}
\newcommand{\nth}{n^{\text{th}}}
$$</div>
<p>Notes from the
<a href="https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab">Essence of Linear Algebra</a>
video series by <a href="http://www.3blue1brown.com/">3blue1brown</a>.</p>
<hr />
<h3 id="linear-transformations-and-matrices">Linear transformations and matrices</h3>
<p>A linear transformation is completely specified by</p>
<ol>
<li>Some basis vectors <span class="math">\(\i\)</span> and <span class="math">\(\j\)</span></li>
<li>Where those basis vectors are taken to by the transformation.</li>
</ol>
<p>How the transformation affects any other point follows from those two pieces of
information.</p>
<p>So <span class="math">\(\i\)</span> might be taken to <span class="math">\(a\i + b\j\)</span>, and <span class="math">\(\j\)</span> might be taken to <span class="math">\(c\i + d\j\)</span>.
In this case we would use the following matrix to describe the
transformation:</p>
<div class="math">$$
\mat{a}{c}
{b}{d}
$$</div>
<p>Some examples are</p>
<div class="math">$$
\begin{array}{ll}
\text{stretch by a in the i-direction} & \mat{a}{0}
{0}{1}
\\\\
\text{stretch by a in the i-direction and shear right} & \mat{a}{b}
{0}{1}
\\\\
\text{rotate anticlockwise 90°} & \mat{0}{-1}
{1}{ 0}
\end{array}
$$</div>
<p>Note that we haven't said what <span class="math">\(\i\)</span> and <span class="math">\(\j\)</span> are yet; they <em>define</em> the
2-dimensional space that we're considering. But, we can think of them for now
as the usual orthogonal unit vectors in 2D space.</p>
<p>So the matrix tells us where the basis vectors have been taken to. Any other
vector <span class="math">\(f\i + g\j\)</span> is taken to wherever that is using the transformed basis
vectors:</p>
<div class="math">$$
f\i + g\j \longrightarrow f\cvec{a}{b} + g\cvec{c}{d} = \cvec{fa + gc}{fb + gd}
$$</div>
<p>And that's how matrix multiplication is defined:</p>
<div class="math">$$
\mat{a}{c}
{b}{d} \cvec{f}{g} = \cvec{fa + gc}{fb + gd}
$$</div>
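<p>In code, that definition is just "take <span class="math">\(f\)</span> copies of where <span class="math">\(\i\)</span> lands plus <span class="math">\(g\)</span> copies of where <span class="math">\(\j\)</span> lands". A minimal sketch, with a made-up function name <code>transform</code>:</p>

```python
def transform(matrix, vector):
    """Apply the 2x2 matrix [[a, c], [b, d]] to the vector (f, g)."""
    (a, c), (b, d) = matrix
    f, g = vector
    # f copies of the first column plus g copies of the second column.
    return (f * a + g * c, f * b + g * d)


rotate_90 = [[0, -1], [1, 0]]   # anticlockwise quarter turn

# The columns of the matrix are where the basis vectors end up.
assert transform(rotate_90, (1, 0)) == (0, 1)
assert transform(rotate_90, (0, 1)) == (-1, 0)
```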
<p>A matrix represents a linear transformation by showing where the basis vectors
are taken to.</p>
<hr />
<h3 id="change-of-basis">Change of basis</h3>
<p>Suppose person B uses some other basis vectors to describe locations in
space. Specifically, in our coordinates, their basis vectors are
<span class="math">\(\scvec{2}{1}\)</span> and <span class="math">\(\scvec{-1}{1}\)</span>.</p>
<p><strong>When they state a vector, what is it in our coordinates?</strong></p>
<p>If they say <span class="math">\(\scvec{-1}{2}\)</span>, what is that in our coordinates?</p>
<p>Well, if they say <span class="math">\(\scvec{1}{0}\)</span>, that's <span class="math">\(\scvec{2}{1}\)</span> in our coordinates. And
if they say <span class="math">\(\scvec{0}{1}\)</span>, that's <span class="math">\(\scvec{-1}{1}\)</span> in our coordinates. So the
matrix containing <em>their basis vectors expressed using our coordinate system</em>
transforms a point expressed in their coordinate system into one expressed in
ours. That last sentence is critical, so hopefully it makes sense! So, the answer is</p>
<div class="math">$$
\mat{2}{-1}
{1}{ 1} \cvec{-1}{2} = \cvec{-4}{1}.
$$</div>
<p><strong>When we state a vector, what is it in their coordinates?</strong></p>
<p>We give the vector <span class="math">\(\scvec{3}{2}\)</span>. What is that in their coordinate system? By
definition, the answer is the pair of weights that scale their basis vectors to hit
<span class="math">\(\scvec{3}{2}\)</span>. So, the solution to</p>
<div class="math">$$
\mat{2}{-1}
{1}{1} \cvec{a}{b} = \cvec{3}{2}.
$$</div>
<p>Computationally, we can see that we can get the solution by multiplying both
sides by the inverse:</p>
<div class="math">$$
\cvec{a}{b} = \mat{2}{-1}
{1}{1}^{-1} \cvec{3}{2}.
$$</div>
<p>Conceptually, we have</p>
<div class="math">$$
\mat{2}{-1}
{1}{1} =
\begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix}
$$</div>
<p>where "their representation" means the vector expressed using their coordinate
system. So the role played by the inverse is</p>
<div class="math">$$
\cvec{a}{b} =
\begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix}
\cvec{3}{2}.
$$</div>
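<p>Both directions of the conversion can be sketched in a few lines. Exact fractions are used so that the answers come out as thirds rather than rounded decimals; the names <code>THEIR_BASIS</code>, <code>theirs_to_ours</code> and <code>ours_to_theirs</code> are mine.</p>

```python
from fractions import Fraction

# Columns are their basis vectors, written in our coordinates.
THEIR_BASIS = [[2, -1], [1, 1]]


def apply(M, v):
    """Apply a 2x2 matrix to a 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]


def inverse2(M):
    """Invert [[a, c], [b, d]] exactly using fractions."""
    (a, c), (b, d) = M
    det = Fraction(a * d - c * b)
    return [[d / det, -c / det], [-b / det, a / det]]


def theirs_to_ours(v):
    """Convert a vector stated in their coordinates into ours."""
    return apply(THEIR_BASIS, v)


def ours_to_theirs(v):
    """Convert a vector stated in our coordinates into theirs."""
    return apply(inverse2(THEIR_BASIS), v)


assert theirs_to_ours([-1, 2]) == [-4, 1]
assert ours_to_theirs([3, 2]) == [Fraction(5, 3), Fraction(1, 3)]
```

<p>So our <span class="math">\(\scvec{3}{2}\)</span> is <span class="math">\(\scvec{5/3}{1/3}\)</span> in their coordinates, and converting it back recovers the original vector.</p>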
<p><strong>When we state a transformation, what is it in their coordinates?</strong></p>
<p>We state a 90° anticlockwise rotation of 2D space:</p>
<div class="math">$$
\mat{0}{-1}
{1}{0}
$$</div>
<p>what is that transformation in their coordinates? The answer is</p>
<div class="math">$$
\begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix}
\mat{0}{-1}
{1}{0}
\begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix}
$$</div>
<p>since the composition of those three transformations defines a single
transformation that takes in a vector expressed in their coordinate system,
converts it to our coordinate system, transforms it as requested, and then
converts back to theirs.</p>
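<p>For the basis used in this section, that product can be evaluated directly. The sketch below uses exact fractions and hypothetical helper names; it also checks that applying the converted rotation twice gives a half turn, as it must in any coordinate system.</p>

```python
from fractions import Fraction


def matmul2(X, Y):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse2(M):
    """Invert [[a, c], [b, d]] exactly using fractions."""
    (a, c), (b, d) = M
    det = Fraction(a * d - c * b)
    return [[d / det, -c / det], [-b / det, a / det]]


def in_their_coordinates(transformation, their_basis):
    """Read right to left: theirs to ours, transform, ours back to theirs."""
    return matmul2(inverse2(their_basis), matmul2(transformation, their_basis))


rotation = [[0, -1], [1, 0]]
their_basis = [[2, -1], [1, 1]]
T = in_their_coordinates(rotation, their_basis)

assert T == [[Fraction(1, 3), Fraction(-2, 3)], [Fraction(5, 3), Fraction(-1, 3)]]
# Two quarter turns make a half turn, whichever coordinates are used.
assert matmul2(T, T) == [[-1, 0], [0, -1]]
```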
<p><strong>Classical Mechanics by John R. Taylor</strong><br />Dan Davison, 2016-05-29</p>
<p>Notes on <a href="http://www.amazon.com/Classical-Mechanics-John-R-Taylor/dp/189138922X">Classical Mechanics</a> by John R. Taylor.</p>
<style type="text/css">
body {color: black;}
</style>
<div class="math">$$
\newcommand{\xhat}{\vec{e_x}}
\newcommand{\yhat}{\vec{e_y}}
\newcommand{\rhat}{\vec{e_r}}
\newcommand{\phihat}{\vec{e_\phi}}
\newcommand{\r}{\vec{r}}
\newcommand{\v}{\vec{v}}
\newcommand{\p}{\vec{p}}
\newcommand{\a}{\vec{a}}
\newcommand{\F}{\vec{F}}
\newcommand{\vector}[1]{\begin{bmatrix}#1\end{bmatrix}}
$$</div>
<h1 id="chapter-1-newtons-laws-of-motion">Chapter 1 - Newton's Laws of Motion</h1>
<h4 id="basics">Basics</h4>
<p>The basic object of interest is a moving particle. Its position at time <span class="math">\(t\)</span> is
<span class="math">\(\r\)</span>. It has that arrow over it because it is a vector. A vector is something
that specifies a direction and a magnitude. Think of <span class="math">\(\r\)</span> as an arrow from the
origin pointing to the current position. Don't think of <span class="math">\(\r\)</span> yet as a column
vector containing numbers, because we haven't said what coordinate system we're
using. Regardless of what coordinate system we use, <span class="math">\(\r\)</span> is always a vector
pointing from the origin to the current position.</p>
<p>The particle is moving, i.e. the position changes over time. So instead of just
writing <span class="math">\(\r\)</span>, we write <span class="math">\(\r(t)\)</span> which says that it's a function of time. Think
of that as giving the answer to a question: "At a given time <span class="math">\(t\)</span>, what is the
position?". The answer (position) is a vector, so we can say that this is a
"vector-valued function" (i.e. whatever output it gives, it's always a vector).</p>
<p>Its velocity is a function <span class="math">\(\v(t)\)</span> whose value is also a vector (at time <span class="math">\(t\)</span>
it's going at some speed in some direction). The velocity function <span class="math">\(\v(t)\)</span> is
the derivative with respect to time of the position function <span class="math">\(\r(t)\)</span>. That
sounds very familiar, but what exactly is the derivative of a vector-valued
function?</p>
<p>In normal, non-vector, calculus we imagine some curve like <span class="math">\(y = x^2\)</span>. So <span class="math">\(y\)</span> is
a function of <span class="math">\(x\)</span>. The value of that function is not a vector; it's just a
number (a scalar). The derivative of that function with respect to <span class="math">\(x\)</span> is
saying: at a particular point along the x-axis, if I start advancing <span class="math">\(x\)</span> a tiny
bit, how fast is <span class="math">\(y\)</span> changing? So, it's the slope of the curve at that point
(also just a number, not a vector).</p>
<p>In vector calculus, the derivative of <span class="math">\(\r(t)\)</span> with respect to <span class="math">\(t\)</span> is saying: at
some particular time <span class="math">\(t\)</span>, if I start advancing time a tiny bit, where is the
position going and how fast is it going there? So the derivative of a
vector-valued function is a vector -- an arrow with direction and magnitude
(speed).</p>
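<p>To make that concrete, here is a rough numerical sketch. The trajectory (a particle going round the unit circle) is a made-up example, not one from the book, and the derivative is approximated by nudging time slightly forwards and backwards and looking at how the position arrow changes.</p>

```python
from math import cos, sin, isclose


def position(t):
    """Hypothetical trajectory: a particle moving round the unit circle."""
    return (cos(t), sin(t))


def velocity(r, t, dt=1e-6):
    """Approximate dr/dt component by component with a central difference."""
    ahead = r(t + dt)
    behind = r(t - dt)
    return tuple((p - q) / (2 * dt) for p, q in zip(ahead, behind))


v = velocity(position, 1.0)
# For this trajectory the exact derivative is (-sin t, cos t).
assert isclose(v[0], -sin(1.0), abs_tol=1e-6)
assert isclose(v[1], cos(1.0), abs_tol=1e-6)
```

<p>The result is again a pair of numbers, i.e. a vector: its direction says where the particle is heading and its length is the speed.</p>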
<h4 id="coordinate-systems">Coordinate systems</h4>
<p>Thinking of <span class="math">\(\r(t)\)</span> as an arrow with direction and magnitude is correct but a
bit abstract. How specifically do we use numbers to represent position? The
chapter covers two main coordinate systems. Let's say the particle is moving in
2D space for now.</p>
<ul>
<li>
<p><strong>Cartesian coordinates</strong>: we write down how far the particle currently is in
the x-direction, <span class="math">\(x(t)\)</span>, and how far it currently is in the y-direction,
<span class="math">\(y(t)\)</span>.</p>
</li>
<li>
<p><strong>Polar coordinates</strong>: we write down how far the particle currently is from
the origin, <span class="math">\(r(t)\)</span>, measured along the current direction from the origin to the particle.</p>
</li>
</ul>
<p>Note that <span class="math">\(x(t)\)</span>, <span class="math">\(y(t)\)</span>, and <span class="math">\(r(t)\)</span> were not written with arrows. They are
just numbers, saying how far the particle is <em>in some direction</em>. The "in some
direction" part corresponds to the concept of a <em>unit vector</em>. A "unit vector"
is basically a vector where the direction is of interest, but the magnitude is
just set to 1 for convenience.</p>
<p>Cartesian coordinates use two directions to specify the position. We'll write
these directions as the unit vectors <span class="math">\(\xhat\)</span> and <span class="math">\(\yhat\)</span>. So in Cartesian
coordinates, the position is</p>
<table style="width:100%">
<tbody><tr>
<td> $$\r(t) = x(t)\xhat + y(t)\yhat$$ </td>
<td> $$\mathrm{Go~} x(t) \mathrm{~units~in~the~} \xhat \mathrm{~direction} \mathrm{~and~} y(t) \mathrm{~units~in~the~} \yhat \mathrm{~direction}$$ </td>
</tr>
</tbody></table>
<p>In contrast, polar coordinates just use one direction to specify the position:
the direction of a direct line to the particle's current position. This
direction is the unit vector <span class="math">\(\rhat(t)\)</span>. So in polar
coordinates, the position is</p>
<table style="width:100%">
<tbody><tr>
<td> $$\r(t) = r(t)\rhat(t)$$ </td>
<td> $$\mathrm{Go~} r(t) \mathrm{~units~in~the~} \rhat(t) \mathrm{~direction}$$ </td>
</tr>
</tbody></table>
<p>Notice (and this is pretty important; it's basically the reason the chapter is
covering polar coordinates) that in polar coordinates the unit vector
<span class="math">\(\rhat(t)\)</span> is a function of time (its direction changes as the particle moves);
in contrast, in Cartesian coordinates, <span class="math">\(\xhat\)</span> and <span class="math">\(\yhat\)</span> are constant; they
always point in the same direction. The polar unit vector is a function of time
because it is the direction to wherever-the-particle-currently-is. The
Cartesian unit vectors are not functions of time because they are just the
x-axis direction and the y-axis direction and these do not change.</p>
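<p>To make the contrast concrete, here is a minimal Python sketch. It assumes the direction to the particle is described by an angle phi measured from the x-axis (the angle only appears in the velocity section of these notes), and all the function and constant names are made up for illustration. The Cartesian unit vectors are fixed constants, whereas the radial unit vector has to be recomputed from the particle's current angle.</p>

```python
import math

X_HAT = (1.0, 0.0)  # constant: always the x-axis direction
Y_HAT = (0.0, 1.0)  # constant: always the y-axis direction

def r_hat(phi):
    """Radial unit vector for a particle at angle phi, in Cartesian components."""
    return (math.cos(phi), math.sin(phi))

def position_from_polar(r, phi):
    """r * r_hat(phi): go r units in the r_hat direction."""
    ux, uy = r_hat(phi)
    return (r * ux, r * uy)

def position_from_cartesian(x, y):
    """x * X_HAT + y * Y_HAT: go x units along X_HAT and y units along Y_HAT."""
    return (x * X_HAT[0] + y * Y_HAT[0], x * X_HAT[1] + y * Y_HAT[1])

# The same point described both ways: distance 2 from the origin at 60 degrees.
phi = math.radians(60)
point_polar = position_from_polar(2.0, phi)
point_cartesian = position_from_cartesian(2.0 * math.cos(phi), 2.0 * math.sin(phi))
```

<p>Both descriptions land on the same point, and r_hat gives a different (but always length-1) vector for every angle.</p>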
<h4 id="velocity">Velocity</h4>
<p>We can now differentiate these position functions to get the velocity. Recall
that the answer is going to be a vector because it is the derivative of a
vector-valued function.</p>
<p><strong>Cartesian coordinates</strong></p>
<p>Because <span class="math">\(\xhat\)</span> and <span class="math">\(\yhat\)</span> are not functions of time, differentiating is
straightforward:</p>
<div class="math">$$\v(t) = \frac{d}{dt}\bigg(x(t)\xhat + y(t)\yhat\bigg) = \frac{d x(t)}{dt}\xhat + \frac{d y(t)}{dt} \yhat$$</div>
<p>Physicists use a dot to represent derivative-with-respect-to-time. So they
might write this as</p>
<div class="math">$$\v(t) = \dot x(t) \xhat + \dot y(t) \yhat$$</div>
<p>Either way, what this is saying is that in Cartesian coordinates, the velocity
is a vector made up of the current x-speed in the x-direction plus the
current y-speed in the y-direction. In other words, it's what you expect.</p>
<p><strong>Polar coordinates</strong></p>
<div class="math">$$\v(t) = \frac{d}{dt}\bigg(r(t)\rhat(t)\bigg)$$</div>
<p>That's a product of two things that are both functions of time, so we use the
"product rule"<sup id="sf-taylor-classical-mechanics-1-back"><a href="#sf-taylor-classical-mechanics-1" class="simple-footnote" title=" The product rule is the thing when you studied differentiation that says: when you're differentiating the product of two functions you differentiate one and keep the other as-is, then you differentiate the other while keeping the first as-is, and you add the two things together: \(\frac{d(f(t)g(t))}{dt} = \dot f(t) g(t) + f(t) \dot g(t)\) ">1</a></sup> to differentiate it:</p>
<div class="math">$$\frac{d}{dt}\bigg(r(t)\rhat(t)\bigg) = \dot r(t) \rhat(t) + r(t)\frac{d \rhat(t)}{dt}$$</div>
<p>There are quite a few <span class="math">\(r\)</span>s there and it's important at this stage not to get lost
in the symbols. We know that the answer (velocity) is a vector. That means we
can write it as a bunch of things added together, where each thing is a number
times some unit vector. And we're using polar coordinates, so the unit vectors
are going to be the polar unit vectors. So the thing on the left <span class="math">\(\dot r(t)
\rhat(t)\)</span> is fine: that's saying that the velocity has one component which is
the current radial speed (a number <span class="math">\(\dot r(t)\)</span>) in the current radial direction
(the unit vector <span class="math">\(\rhat(t)\)</span>).</p>
<p>What about the thing on the right? It's the current radial distance times the
current derivative of the unit vector function. We've said that in polar
coordinates the unit vector <span class="math">\(\rhat(t)\)</span> changes over time, so it does make sense
that we could ask what its derivative with respect to time is. So what is it?
The answer is that it's a vector-valued function whose current value always
points at right-angles to the current radial direction, but that requires
explaining:</p>
<p>Going back to the informal definition of derivatives above, we're at some point
<span class="math">\(t\)</span> in time, and we imagine starting to advance time a tiny bit, and we look at
the change in where the unit vector points, after this infinitesimally small
amount of time passes. A unit vector always has length 1, so it can't grow in
length. There's only one thing it can do: it can point in a slightly different
direction. What direction has it gone in? It's basically like the hand of a
clock. It's not too hard to see that if the hand of a clock changes just a tiny
bit, then the tip moves in a direction that's almost a tangent to the
circle. Change "tiny" to "infinitesimally small" and the "almost" goes away: so
the time derivative of the radial unit vector is a vector pointing at right
angles to the radial vector. The unit vector in that direction is called
<span class="math">\(\phihat\)</span>, because it points in the direction that you go in when you increase
the angle <span class="math">\(\phi\)</span>, as opposed to <span class="math">\(\rhat\)</span> which points in the direction you go in
if you increase the radius <span class="math">\(r\)</span>. How fast does the radial unit vector move in
the <span class="math">\(\phihat\)</span> direction? The answer is that it moves at the speed that the
angle is increasing, so <span class="math">\(\dot \phi\)</span><sup id="sf-taylor-classical-mechanics-2-back"><a href="#sf-taylor-classical-mechanics-2" class="simple-footnote" title="You can prove this by writing the unit vector in Cartesian coordinates, \(cos(\phi) \xhat + sin(\phi) \yhat\), and then differentiating it to give \(\dot \phi\big(-sin(\phi)\xhat + cos(\phi)\yhat\big)\) which is \(\dot \phi\) times a vector orthogonal to the original one.">2</a></sup>. In other words, the time derivative of the radial unit
vector is <span class="math">\(\dot \phi(t) \phihat(t)\)</span>.</p>
<p>The conclusion of all that is that in polar coordinates, the velocity vector is</p>
<div class="math">$$\v(t) = \dot r(t) \rhat(t) + r(t) \dot \phi(t) \phihat(t)$$</div>
<p>Compare this with the expression for velocity in Cartesian coordinates</p>
<div class="math">$$\v(t) = \dot x(t) \xhat + \dot y(t) \yhat$$</div>
<p>and we see it's a bit more complicated in polar coordinates.</p>
<p>I understand the polar coordinates version as follows. At time <span class="math">\(t\)</span> the particle
might be moving radially, and its angle might also be changing. The velocity
vector has two components, one in the radial direction, and one in the tangent
direction. In the radial direction, it's moving at whatever speed the radius is
changing with. In the tangent direction it's moving at the speed that the angle
is changing, multiplied by the current radius. That multiplication by radius
makes sense informally, because if you are further out from the center of a
circle, and the circle rotates by a few degrees, then you move further in space
than if you were closer in to the center.</p>
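<p>One way to convince yourself of the polar formula is to check it numerically. The sketch below uses a made-up trajectory (a spiral with <span class="math">\(r(t) = 1 + 0.5t\)</span> and <span class="math">\(\phi(t) = 2t\)</span>, chosen only for illustration) and compares the polar velocity formula against a plain finite-difference derivative of the Cartesian position. The function names are hypothetical.</p>

```python
import math

def position(t):
    """Made-up trajectory: a spiral with r(t) = 1 + 0.5 t and phi(t) = 2 t."""
    r, phi = 1.0 + 0.5 * t, 2.0 * t
    return (r * math.cos(phi), r * math.sin(phi))

def velocity_polar_formula(t):
    """v = r_dot * r_hat + r * phi_dot * phi_hat, in Cartesian components."""
    r, phi = 1.0 + 0.5 * t, 2.0 * t
    r_dot, phi_dot = 0.5, 2.0  # derivatives of r(t) and phi(t) above
    r_hat = (math.cos(phi), math.sin(phi))
    phi_hat = (-math.sin(phi), math.cos(phi))
    return (r_dot * r_hat[0] + r * phi_dot * phi_hat[0],
            r_dot * r_hat[1] + r * phi_dot * phi_hat[1])

def velocity_finite_difference(t, dt=1e-6):
    """Central-difference estimate of the derivative of position(t)."""
    (x0, y0), (x1, y1) = position(t - dt), position(t + dt)
    return ((x1 - x0) / (2 * dt), (y1 - y0) / (2 * dt))

v_formula = velocity_polar_formula(0.7)
v_numeric = velocity_finite_difference(0.7)
```

<p>The two results agree to within the accuracy of the finite difference. Because the radial and tangent directions are at right angles, the speed is the Pythagorean combination of the radial part (0.5) and the tangent part (radius 1.35 times angular speed 2).</p>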
<h4 id="acceleration">Acceleration</h4>
<p>The acceleration function is the derivative of the velocity function with
respect to time. Therefore, it is also a vector: at time <span class="math">\(t\)</span> the particle is
accelerating by some amount, in some direction.</p>
<p><strong>Cartesian coordinates</strong></p>
<p>Again, because the unit vectors do not change with time, it's as you expect:
there's an x-acceleration in the x-direction, and a y-acceleration in the
y-direction.</p>
<div class="math">$$\a(t) = \ddot x(t) \xhat + \ddot y(t) \yhat$$</div>
<p><strong>Polar coordinates</strong></p>
<p>Above we saw that because, in polar coordinates, the directions of the
coordinate system change with time, the function for velocity was more
complicated than when using Cartesian coordinates. For acceleration, we
differentiate the velocity expression and of course it gets even more
complicated. But basically the answer is still a function of the form</p>
<div class="math">$$\a(t) = \bigg( \text{Some function of } t \bigg) \rhat(t) + \bigg( \text{Another function of } t \bigg) \phihat(t)$$</div>
<p>The functions of <span class="math">\(t\)</span> involve the current radius length, the speed and
acceleration in the current radius direction, and the speed and acceleration of
the angle parameter <span class="math">\(\phi\)</span>. The full expression is in the footnote<sup id="sf-taylor-classical-mechanics-3-back"><a href="#sf-taylor-classical-mechanics-3" class="simple-footnote" title="In polar coordinates, if you suppose that you know functions \(r(t)\) and \(\phi(t)\) giving the distance and angle at time \(t\), then the acceleration at time \(t\), resolved into the two orthogonal directions, is \(\a(t) = \bigg( \ddot r(t) - r(t) \dot\phi(t)^2 \bigg) \rhat(t) + \bigg( 2\dot r(t) \dot \phi(t) + r(t) \ddot \phi(t)\bigg) \phihat(t)\) ">3</a></sup>.</p>
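<p>For anyone who wants to see where the footnote's expression comes from, here is a sketch of the differentiation. It relies on one fact not derived above: the tangent unit vector also changes with time, with <span class="math">\(\frac{d \phihat(t)}{dt} = -\dot \phi(t) \rhat(t)\)</span> (you can check this by writing <span class="math">\(\phihat\)</span> in Cartesian coordinates and differentiating, just as for <span class="math">\(\rhat\)</span>). Applying the product rule to each term of the polar velocity, and dropping the <span class="math">\((t)\)</span>s to keep it readable:</p>
<div class="math">$$\begin{aligned}
\a &= \frac{d}{dt}\bigg(\dot r \rhat + r \dot \phi \phihat\bigg) \\
&= \ddot r \rhat + \dot r \frac{d \rhat}{dt} + \dot r \dot \phi \phihat + r \ddot \phi \phihat + r \dot \phi \frac{d \phihat}{dt} \\
&= \ddot r \rhat + \dot r \dot \phi \phihat + \dot r \dot \phi \phihat + r \ddot \phi \phihat - r \dot \phi^2 \rhat \\
&= \bigg(\ddot r - r \dot \phi^2\bigg) \rhat + \bigg(2 \dot r \dot \phi + r \ddot \phi\bigg) \phihat
\end{aligned}$$</div>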
<h3 id="newtons-second-law-as-a-differential-equation">Newton's second law as a differential equation</h3>
<p>A key point seems to be: view Newton's second law <span class="math">\(\F = m\a\)</span> as a differential
equation<sup id="sf-taylor-classical-mechanics-4-back"><a href="#sf-taylor-classical-mechanics-4" class="simple-footnote" title='The dot means "differentiated with respect to time". So if \(r\) is position as a function of time then \(\dot r\) is velocity and \(\ddot r\) is acceleration.'>4</a></sup>:</p>
<div class="math">$$m \ddot \r(t) = \F$$</div>
<p>I'm understanding this as follows: You know what forces are acting on the body
in question. You want to know how the position of the body will evolve through
time: <span class="math">\(\r(t)\)</span>. This is a function satisfying the following differential
equation: the second derivative with respect to time of <span class="math">\(\r(t)\)</span>, times <span class="math">\(m\)</span>, is
equal to the net force acting on the body.</p>
<p>In practice: in a typical problem you have some expression for <span class="math">\(\F\)</span> derived from
consideration of a diagram showing forces acting on the body. You might be able
to discover <span class="math">\(\r(t)\)</span> by finding a function whose second derivative is <span class="math">\(\F/m\)</span>.</p>
<h4 id="example-problems">Example problems</h4>
<p><strong>Cartesian coordinates</strong></p>
<blockquote>
<p>1.37 A student kicks a frictionless puck with initial speed <span class="math">\(v_0\)</span>, so that it
slides up a plane that is inclined at an angle <span class="math">\(\theta\)</span> above the
horizontal. <strong>(a)</strong> Write down Newton's second law for the puck and solve to
give its position as a function of time.</p>
</blockquote>
<p>This is a simple example of using the Second Law as a differential equation. We
write down the forces acting on the particle, set them equal to <span class="math">\(m\ddot r(t)\)</span>
and integrate twice to get position.</p>
<p>Two forces act on the puck: its weight, i.e. its mass times acceleration due
to gravity, <span class="math">\(mg\)</span>, and the normal force from the plane. The normal force is
perpendicular to the plane, and the puck can only move along the surface of
the plane, so we are only interested in the component of the weight that acts
parallel to the plane. This component is <span class="math">\(-mg \sin(\theta)\)</span>. So taking <span class="math">\(x\)</span> as the
direction up the plane, Newton's second law is</p>
<div class="math">$$ m\ddot x(t) = -mg\sin(\theta)$$</div>
<p>Integrating once gives velocity (the constant of integration is the initial
speed <span class="math">\(v_0\)</span>)</p>
<div class="math">$$ \dot x(t) = -g \sin(\theta) t + v_0$$</div>
<p>Integrating again gives position</p>
<div class="math">$$ x(t) = -\frac{1}{2} g \sin(\theta) t^2 + v_0t + x_0$$</div>
<p>and <span class="math">\(x_0=0\)</span> since we start measuring from its starting position.</p>
<blockquote>
<p><strong>(b)</strong> How long will the puck take to return to its starting point?</p>
</blockquote>
<p>The puck is at its starting point whenever <span class="math">\(x = 0\)</span>:</p>
<div class="math">$$0 = t\bigg(-\frac{1}{2} g \sin(\theta) t + v_0\bigg)$$</div>
<p>The solutions of that are either <span class="math">\(t=0\)</span> (which we already knew) or (the solution
we want)</p>
<p><span class="math">\(t = \frac{2v_0}{g \sin(\theta)}\)</span></p>
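<p>As a quick sanity check, here is the solution written out in Python. The numbers are made up for illustration (an initial speed of 3 m/s, a 30 degree incline, and g taken as 9.8 m/s^2), and the function names are hypothetical.</p>

```python
import math

G = 9.8  # m/s^2, assumed value for the acceleration due to gravity

def puck_position(t, v0, theta):
    """x(t) = v0 t - (1/2) g sin(theta) t^2, measured up the plane from the start."""
    return v0 * t - 0.5 * G * math.sin(theta) * t ** 2

def puck_velocity(t, v0, theta):
    """x_dot(t) = v0 - g sin(theta) t."""
    return v0 - G * math.sin(theta) * t

def return_time(v0, theta):
    """The non-zero solution of x(t) = 0."""
    return 2 * v0 / (G * math.sin(theta))

v0, theta = 3.0, math.radians(30)
t_back = return_time(v0, theta)          # roughly 1.22 seconds
x_back = puck_position(t_back, v0, theta)
v_back = puck_velocity(t_back, v0, theta)
```

<p>At the return time the position is back to zero and the velocity is the initial velocity with its sign flipped: the puck passes its starting point at the same speed, heading down the plane.</p>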
<p><strong>Polar coordinates</strong></p>
<blockquote>
<p>A "halfpipe" at a skateboard park consists of a concrete trough with a
semicircular cross section of radius <span class="math">\(R = 5\,\mathrm{m}\)</span>. I hold a frictionless
skateboard on the side of the trough pointing down toward the bottom and
release it. Discuss the subsequent motion using Newton's second law. In
particular, if I release the skateboard just a short way from the bottom, how
long will it take to come back to the point of release?</p>
</blockquote>
<p>Conceptually, we do the same thing as for the problem using Cartesian
coordinates: we write down Newton's second law resolved into two orthogonal
directions. It's just that with polar coordinates, these orthogonal directions
are constantly changing.</p>
<p>The weight of the skateboard acts downwards. This results in a tangent force
causing the skateboard to move along the halfpipe, and also presses the
skateboard into the halfpipe a bit, with an associated reaction force. We
ignore the force/reaction force between the skateboard and the pipe and focus
only on the tangent force: <span class="math">\(-mg \sin(\phi)\)</span>, where <span class="math">\(\phi\)</span> is the angle of the
skateboard's position around the halfpipe, measured from the bottom.</p>
<p>The equation for acceleration says that, at time <span class="math">\(t\)</span>, acceleration in the
current tangent direction is <span class="math">\(R\ddot \phi(t)\)</span> (halfpipe radius times current
angular acceleration<sup id="sf-taylor-classical-mechanics-5-back"><a href="#sf-taylor-classical-mechanics-5" class="simple-footnote" title="To see this, start with the \(\phihat(t)\) (tangent direction) part of the full expression for acceleration and note that the radial distance of the skateboard is fixed by the presence of the half-pipe, so speed \(\dot r(t)\) (and acceleration) in the radial direction is zero.">5</a></sup>). So Newton's second law in this context is the differential
equation</p>
<div class="math">$$mR \ddot \phi(t) = -mg \sin(\phi(t))$$</div>
<p>We read this as saying:</p>
<blockquote>
<p>We don't know how the angle changes over time, <span class="math">\(\phi(t)\)</span> -- that is
precisely what we want to know. But what we do know is that whatever that
function is, its second derivative at time <span class="math">\(t\)</span> is equal to the sine of the
current angle (times <span class="math">\(g/R\)</span> and with a minus sign because the weight force
always pushes the skateboard back towards the bottom, where the angle is zero).</p>
</blockquote>
<p>Once we've got to that point, finding the angle function <span class="math">\(\phi(t)\)</span> is just
math. It turns out that, for a skateboard released from rest, the only function
for which it is true that the second derivative has this property<sup id="sf-taylor-classical-mechanics-6-back"><a href="#sf-taylor-classical-mechanics-6" class="simple-footnote" title="Actually the cosine is the solution of a slightly different equation, \(\ddot \phi(t) = -\frac{g}{R}\phi(t)\), which is what you get by replacing \(\sin(\phi)\) with \(\phi\). The two are very similar as long as we're restricting ourselves to the angle being fairly small.">6</a></sup> is</p>
<div class="math">$$\phi(t) = \phi_0 \cos\bigg(\sqrt{\frac{g}{R}}\,t\bigg)$$</div>
<p>where <span class="math">\(\phi_0\)</span> is the angle at which the skateboard was released at time <span class="math">\(t=0\)</span>.
This is the "solution" of the differential equation: a function matching the
criteria that the differential equation specified.</p>
<p>So we have our answer: the forces acting on the skateboard imply (via Newton's
second law) that the way the angle of the skateboard changes is a cosine
function of time. So the skateboard angle does what cosines do: it starts off
at its maximum, decreases to zero, crosses zero and becomes negative for a
while, starts turning back towards zero, crosses zero and becomes positive
again and gets back to its maximum where it turns around again.</p>
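<p>The problem also asks how long the skateboard takes to come back to the point of release. The cosine first returns to its starting value when <span class="math">\(\sqrt{g/R}\,t = 2\pi\)</span>, i.e. after <span class="math">\(2\pi\sqrt{R/g}\)</span>, which is about 4.49 seconds for <span class="math">\(R = 5\,\mathrm{m}\)</span> if we take <span class="math">\(g = 9.8\,\mathrm{m/s^2}\)</span> (an assumed value). The Python sketch below computes that, and also steps the full equation with <span class="math">\(\sin(\phi)\)</span> forward in time to see how good the small-angle answer is. The integrator (velocity Verlet) and the function names are my own choices for illustration, not something from the book.</p>

```python
import math

G = 9.8  # m/s^2, assumed value for the acceleration due to gravity
R = 5.0  # m, halfpipe radius from the problem statement

def small_angle_period():
    """Time for phi_0 cos(sqrt(g/R) t) to complete one full cycle."""
    return 2 * math.pi * math.sqrt(R / G)

def simulated_period(phi_0, dt=1e-4):
    """Integrate phi'' = -(g/R) sin(phi) from rest at phi_0 and time one full swing.

    Uses the velocity Verlet scheme. The swing is complete when the angular
    velocity changes from positive back to non-positive, i.e. the skateboard
    is momentarily at rest at the release point again.
    """
    def accel(phi):
        return -(G / R) * math.sin(phi)

    phi, omega, t = phi_0, 0.0, 0.0
    a = accel(phi)
    while True:
        phi_new = phi + omega * dt + 0.5 * a * dt * dt
        a_new = accel(phi_new)
        omega_new = omega + 0.5 * (a + a_new) * dt
        t += dt
        if omega > 0.0 and omega_new <= 0.0:
            # Linear interpolation for the instant omega passes through zero.
            return t - dt * omega_new / (omega_new - omega)
        phi, omega, a = phi_new, omega_new, a_new

period_formula = small_angle_period()
period_small_release = simulated_period(0.1)  # released 0.1 rad from the bottom
period_large_release = simulated_period(1.5)  # released most of the way up the side
```

<p>For a release "just a short way from the bottom" the simulated time is within a few thousandths of a second of the formula. For a release high up the side the real swing takes noticeably longer, which is the sense in which the cosine is only a small-angle solution.</p>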
<h3 id="conservation-of-momentum">Conservation of momentum</h3>
<p>Momentum is mass times velocity, <span class="math">\(\p(t) = m\dot \r(t)\)</span>, so another way of
stating the second law is: rate of change of momentum is equal to force. In a
multi-particle system the forces-and-reaction-forces of the third law cancel
each other out when summing the rate of change of momentum of the whole
system. So, total momentum doesn't change due to internal forces (but it does
if there are external forces).</p>
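<p>For two particles this can be sketched as follows. The labels are assumptions for illustration: <span class="math">\(\F_{12}\)</span> is the force on particle 1 from particle 2, <span class="math">\(\F_{21}\)</span> is the force on particle 2 from particle 1, and <span class="math">\(\F_1^{\text{ext}}\)</span>, <span class="math">\(\F_2^{\text{ext}}\)</span> are the external forces on each. The second law for each particle, followed by the third law <span class="math">\(\F_{12} = -\F_{21}\)</span>, gives</p>
<div class="math">$$\begin{aligned}
\dot \p_1 &= \F_{12} + \F_1^{\text{ext}} \\
\dot \p_2 &= \F_{21} + \F_2^{\text{ext}} \\
\frac{d}{dt}\bigg(\p_1 + \p_2\bigg) &= \F_{12} + \F_{21} + \F_1^{\text{ext}} + \F_2^{\text{ext}} = \F_1^{\text{ext}} + \F_2^{\text{ext}}
\end{aligned}$$</div>
<p>With no external forces the right-hand side is zero, so <span class="math">\(\p_1 + \p_2\)</span> stays constant.</p>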
<p>pp 21-23 show that conservation of momentum does not hold when considering
magnetic and electrostatic forces between charged particles moving close to the
speed of light. However I am unfamiliar with those forces and with the
"right-hand rule" for fields/forces and I haven't understood this section.</p>
<hr>
<script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#$@#')) {
var align = "center",
indent = "0em",
linebreak = "false";
if (false) {
align = (screen.width < 768) ? "left" : align;
indent = (screen.width < 768) ? "0em" : indent;
linebreak = (screen.width < 768) ? 'true' : linebreak;
}
var mathjaxscript = document.createElement('script');
var location_protocol = (false) ? 'https' : document.location.protocol;
if (location_protocol !== 'http' && location_protocol !== 'https') location_protocol = 'https:';
mathjaxscript.id = 'mathjaxscript_pelican_#%@#$@#';
mathjaxscript.type = 'text/javascript';
mathjaxscript.src = location_protocol + '//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML';
mathjaxscript[(window.opera ? "innerHTML" : "text")] =
"MathJax.Hub.Config({" +
" config: ['MMLorHTML.js']," +
" TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' }, Macros: {} }," +
" jax: ['input/TeX','input/MathML','output/HTML-CSS']," +
" extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," +
" displayAlign: '"+ align +"'," +
" displayIndent: '"+ indent +"'," +
" showMathMenu: true," +
" messageStyle: 'normal'," +
" tex2jax: { " +
" inlineMath: [ ['\\\\(','\\\\)'] ], " +
" displayMath: [ ['$$','$$'] ]," +
" processEscapes: true," +
" preview: 'TeX'," +
" }, " +
" 'HTML-CSS': { " +
" styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," +
" linebreaks: { automatic: "+ linebreak +", width: '90% container' }," +
" }, " +
"}); " +
"if ('default' !== 'default') {" +
"MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" +
"var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" +
"VARIANT['normal'].fonts.unshift('MathJax_default');" +
"VARIANT['bold'].fonts.unshift('MathJax_default-bold');" +
"VARIANT['italic'].fonts.unshift('MathJax_default-italic');" +
"VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" +
"});" +
"}";
(document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript);
}
</script><ol class="simple-footnotes"><li id="sf-taylor-classical-mechanics-1"> The product rule is the thing when you studied
differentiation that says: when you're differentiating the product of two
functions you differentiate one and keep the other as-is, then you
differentiate the other while keeping the first as-is, and you add the two
things together: <span class="math">\(\frac{d(f(t)g(t))}{dt} = \dot f(t) g(t) + f(t) \dot g(t)\)</span>
<a href="#sf-taylor-classical-mechanics-1-back" class="simple-footnote-back">↩</a></li><li id="sf-taylor-classical-mechanics-2">You can prove this by writing the unit
vector in Cartesian coordinates, <span class="math">\(\cos(\phi) \xhat + \sin(\phi) \yhat\)</span>, and then
differentiating it to give <span class="math">\(\dot \phi\big(-\sin(\phi)\xhat +
\cos(\phi)\yhat\big)\)</span> which is <span class="math">\(\dot \phi\)</span> times a vector orthogonal to the
original one. <a href="#sf-taylor-classical-mechanics-2-back" class="simple-footnote-back">↩</a></li><li id="sf-taylor-classical-mechanics-3">In polar
coordinates, if you suppose that you know functions <span class="math">\(r(t)\)</span> and <span class="math">\(\phi(t)\)</span> giving
the distance and angle at time <span class="math">\(t\)</span>, then the acceleration at time <span class="math">\(t\)</span>,
resolved into the two orthogonal directions, is
<span class="math">\(\a(t) = \bigg( \ddot r(t) - r(t) \dot\phi(t)^2 \bigg) \rhat(t) + \bigg( 2\dot r(t) \dot \phi(t) + r(t) \ddot \phi(t)\bigg) \phihat(t)\)</span>
<a href="#sf-taylor-classical-mechanics-3-back" class="simple-footnote-back">↩</a></li><li id="sf-taylor-classical-mechanics-4">The dot means "differentiated with respect to time". So if <span class="math">\(r\)</span> is
position as a function of time then <span class="math">\(\dot r\)</span> is velocity and <span class="math">\(\ddot r\)</span> is
acceleration. <a href="#sf-taylor-classical-mechanics-4-back" class="simple-footnote-back">↩</a></li><li id="sf-taylor-classical-mechanics-5">To see this, start with the <span class="math">\(\phihat(t)\)</span> (tangent
direction) part of the full expression for acceleration and note that the
radial distance of the skateboard is fixed by the presence of the half-pipe, so
speed <span class="math">\(\dot r(t)\)</span> (and acceleration) in the radial direction is
zero. <a href="#sf-taylor-classical-mechanics-5-back" class="simple-footnote-back">↩</a></li><li id="sf-taylor-classical-mechanics-6">Actually the cosine is the solution of a
slightly different equation, <span class="math">\(\ddot \phi(t) = -\frac{g}{R}\phi(t)\)</span>, which is what you get by
replacing <span class="math">\(\sin(\phi)\)</span> with <span class="math">\(\phi\)</span>. The two are very similar as long as we're
restricting ourselves to the angle being fairly small. <a href="#sf-taylor-classical-mechanics-6-back" class="simple-footnote-back">↩</a></li></ol>