http://dandavison.github.io/ 2016-10-17T00:00:00-07:00 Finding the nth Fibonacci number via an eigenvector change of basis 2016-10-17T00:00:00-07:00 Dan Davison tag:dandavison.github.io,2016-10-17:fibonacci-eigenbasis.html <style type="text/css"> body {color: black;} </style> <div class="math">$$\newcommand{\i}{\mathbf{i}} \newcommand{\j}{\mathbf{j}} \newcommand{\cvec}[2]{\begin{pmatrix}#1\\#2\end{pmatrix}} \newcommand{\mat}[4]{\begin{bmatrix}#1 &amp; #2\\#3 &amp; #4\\ \end{bmatrix}} \newcommand{\scvec}[2]{\tiny{\cvec{#1}{#2}}} \newcommand{\smat}[4]{\tiny{\mat{#1}{#2}{#3}{#4}}} \newcommand{\nth}{n^{\text{th}}}$$</div> <p>This is the problem given at the end of the eigenvectors video in the <a href="https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab">Essence of Linear Algebra</a> series by <a href="http://www.3blue1brown.com/">3blue1brown</a>.</p> <hr /> <h4 id="introduction"><strong>Introduction</strong></h4> <p>Consider the matrix</p> <div class="math">$$A = \mat{0}{1} {1}{1}$$</div> <p>The first few powers are</p> <div class="math">\begin{align*} &amp;A^{1} &amp;= \mat{0}{1} {1}{1} \\ &amp;A^{2} = \mat{0}{1} {1}{1} \mat{0}{1} {1}{1} &amp;= \mat{1}{1} {1}{2} \\ &amp;A^{3} = \mat{0}{1} {1}{1} \mat{1}{1} {1}{2} &amp;= \mat{1}{2} {2}{3} \\ &amp;A^{4} = \mat{0}{1} {1}{1} \mat{1}{2} {2}{3} &amp;= \mat{2}{3} {3}{5} \end{align*}</div> <p>The Fibonacci sequence is the sequence you get by starting with <span class="math">$$0, 1$$</span> and after that always forming the next number by adding the two previous ones: <span class="math">$$F_0, F_1, F_2, F_3, F_4, F_5, F_6, F_7, ...$$</span> = <span class="math">$$0, 1, 1, 2, 3, 5, 8, 13, ...$$</span>.</p> <p>The matrix powers are generating the Fibonacci sequence:</p> <div class="math">$$A^{n} = \mat{F_{n-1} }{F_n } {F_n }{F_{n+1} }$$</div> <p>So if there were a way to compute the <span class="math">$$\nth$$</span> power of that matrix "directly", that would also be a way to compute the <span class="math">$$\nth$$</span> 
Fibonacci number "directly", i.e. without computing all the preceding Fibonacci numbers <em>en route</em>.</p> <p>How can we do this? To state the problem in a different way, we need to construct a new matrix that performs exactly the same transformation as <span class="math">$$A^n$$</span>, but which somehow does the exponentiation step "in one go" rather than by multiplying <span class="math">$$A$$</span> with itself <span class="math">$$n$$</span> times.</p> <h4 id="solution-outline"><strong>Solution outline</strong></h4> <p>Matrices represent transformations, so we can talk about them as taking in some vector and producing some other vector. The approach we're going to take is to re-express the <span class="math">$$A^n$$</span> transformation as follows:</p> <ol> <li>Convert the input vector to its representation in an alternative basis which uses the eigenvectors as the basis vectors (it's called an "eigenbasis").</li> <li>In this alternative basis, compute the new position of the vector after carrying out the <span class="math">$$A^n$$</span> transformation.</li> <li>Convert the resulting vector back to its representation in our original basis.</li> </ol> <p>I.e., we're going to compute the overall transformation as this product of matrices (remember that one reads these things right-to-left):</p> <div class="math">$$\begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix} \begin{bmatrix}\text{matrix that does the A transformation}\\\text{in the alternative basis} \\ \end{bmatrix}^n \begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix}$$</div> <p>The crux of all this is that the exponentiation is efficient in the eigenbasis. That's because, in the eigenbasis, the transformation is just stretching space in the directions of the two basis vectors. 
So to do the transformation <span class="math">$$n$$</span> times in the eigenbasis, you just stretch by the stretch-factor raised to the <span class="math">$$\nth$$</span> power, rather than doing <span class="math">$$n$$</span> matrix multiplications.</p> <h4 id="solution-details"><strong>Solution details</strong></h4> <p>Let's suppose we've already found the eigenvectors, and that there are two of them, and that we've arranged them as the two columns of a matrix <span class="math">$$V$$</span>. <span class="math">$$V$$</span> holds the basis vectors of the alternative basis, and therefore we know from the <a href="./linear-algebra.html#change-of-basis">change of basis</a> notes that <span class="math">$$V$$</span> is the matrix that takes as input a vector expressed in the alternative basis and outputs its representation in our basis.</p> <p>So, step (3) is done by <span class="math">$$V$$</span>, and step (1) is done by <span class="math">$$V^{-1}$$</span>, and the matrix performing all three steps is going to look like</p> <div class="math">$$V \begin{bmatrix}\text{matrix that does the A transformation}\\\text{in the alternative basis} \\ \end{bmatrix}^n V^{-1}$$</div> <p>OK, so what is the matrix in the middle? 
The <a href="./linear-algebra.html#change-of-basis">change of basis</a> notes tell us that we can compute it as</p> <div class="math">$$\begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix} A \begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix}$$</div> <p>In other words the matrix in the middle is</p> <div class="math">$$V^{-1}AV$$</div> <p>and the entire transformation is</p> <div class="math">$$V \Big(V^{-1}AV\Big)^n V^{-1}$$</div> <p>Put back into words, that's</p> <div class="math">$$\begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix} \Bigg( \begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix} A \begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix} \Bigg)^n \begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix}$$</div> <p>Recall that above we observed that the <span class="math">$$\nth$$</span> power of <span class="math">$$A$$</span> is a matrix with the nth Fibonacci number in its bottom left and top right entries. So the following tasks remain:</p> <ol> <li>Find the eigenvectors and put them in a matrix <span class="math">$$V$$</span>.</li> <li>Find the inverse of <span class="math">$$V$$</span>.</li> <li>Compute the matrix product <span class="math">$$V^{-1}AV$$</span>.</li> <li>Compute the result of raising that to the <span class="math">$$\nth$$</span> power.</li> <li>Plug the result of that into the overall expression.</li> <li>Take the entry in the bottom left or top right (they should be the same!).</li> </ol> <p>The result should be an expression giving the <span class="math">$$\nth$$</span> Fibonacci number as a function of <span class="math">$$n$$</span>. 
It should be possible to give as input to that function the number one million, and have it output the one millionth Fibonacci number directly, without it having to go through the preceding 999,999 Fibonacci numbers.</p> <h4 id="the-answer-without-showing-the-calculations"><strong>The answer without showing the calculations</strong></h4> <div class="math">\begin{align*} &amp;\text{The eigenvectors are} \\\\ &amp;V &amp;= \mat{2 }{2 } {1 + \sqrt 5}{1 - \sqrt 5} \\\\ &amp;\text{which has inverse} \\\\ &amp;V^{-1} &amp;= \frac{-1}{4\sqrt 5} \mat{1 - \sqrt 5 }{-2} {-1 - \sqrt 5}{2} \\\\ &amp;\text{Therefore} \\\\ &amp;V^{-1}AV &amp;= \frac{1}{2} \mat{1 + \sqrt 5}{0 } {0 }{1 - \sqrt 5} \\\\ &amp;\text{and} \\\\ &amp;(V^{-1}AV)^n &amp;= \frac{1}{2^n} \mat{(1 + \sqrt 5)^n}{0 } {0 }{(1 - \sqrt 5)^n} \\\\ &amp;\text{and} \\\\ &amp;V \Big(V^{-1}AV\Big)^n V^{-1} &amp;= \mat{\frac{\big((1 + \sqrt 5)^{n-1} - (1 - \sqrt 5)^{n-1}\big)}{2^{n-1}\sqrt 5}}{\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n \sqrt 5}} {\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n \sqrt 5}}{\frac{\big((1 + \sqrt 5)^{n+1} - (1 - \sqrt 5)^{n+1}\big)}{2^{n+1}\sqrt 5}} \\\\ &amp;\text{Therefore the nth Fibonacci number is} \\\\ &amp;F_n &amp;= \frac{(1 + \sqrt 5)^n - (1 - \sqrt 5)^n} {2^n \sqrt 5} \end{align*}</div> <h4 id="does-this-actually-work"><strong>Does this actually work?</strong></h4> <p>Yes.</p> <div class="codehilite"><pre><span class="kn">from</span> <span class="nn">math</span> <span class="kn">import</span> <span class="n">sqrt</span>


<span class="k">def</span> <span class="nf">fib</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
    <span class="c1"># Closed-form expression for the nth Fibonacci number derived above</span>
    <span class="k">return</span> <span class="p">(</span>
        <span class="p">(</span>
            <span class="p">(</span><span class="mi">1</span> <span class="o">+</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">5</span><span class="p">))</span><span class="o">**</span><span class="n">n</span> <span class="o">-</span>
            <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">5</span><span class="p">))</span><span class="o">**</span><span class="n">n</span>
        <span class="p">)</span> <span class="o">/</span>
        <span class="nb">float</span><span class="p">(</span><span class="mi">2</span><span class="o">**</span><span class="n">n</span> <span class="o">*</span> <span class="n">sqrt</span><span class="p">(</span><span class="mi">5</span><span class="p">)))</span>


<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">10</span><span class="p">):</span>
    <span class="nb">print</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="n">fib</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>

<span class="mi">0</span> <span class="mf">0.0</span>
<span class="mi">1</span> <span class="mf">1.0</span>
<span class="mi">2</span> <span class="mf">1.0</span>
<span class="mi">3</span> <span class="mf">2.0</span>
<span class="mi">4</span> <span class="mf">3.0</span>
<span class="mi">5</span> <span class="mf">5.0</span>
<span class="mi">6</span> <span class="mf">8.0</span>
<span class="mi">7</span> <span class="mf">13.0</span>
<span class="mi">8</span> <span class="mf">21.0</span>
<span class="mi">9</span> <span class="mf">34.0</span>
</pre></div> <h4 id="history"><strong>History</strong></h4> <p>The formula is known as <a href="https://en.wikipedia.org/wiki/Fibonacci_number#Closed-form_expression">Binet's formula</a> (1843) but was apparently known to Euler, Daniel Bernoulli and de Moivre more than a century earlier. It can be derived without using linear algebra techniques; I don't know when the style of proof attempted here would first have been done. 
The result can be written as</p> <div class="math">$$F_n = \frac{\phi^n - (1-\phi)^n}{\sqrt{5}}$$</div> <p>where <span class="math">$$\phi = \frac{1+\sqrt{5}}{2}$$</span> is the <a href="https://en.wikipedia.org/wiki/Golden_ratio">golden ratio</a>.</p> <h4 id="calculations"><strong>Calculations</strong></h4> <h5 id="1-find-the-eigenvectors"><strong>1. Find the eigenvectors</strong></h5> <p>We follow the textbook approach. We have</p> <div class="math">$$A = \mat{0}{1} {1}{1}$$</div> <p>An eigenvector <span class="math">$$\vec v$$</span> satisfies <span class="math">$$A\vec v = \lambda \vec v$$</span> for some scalar <span class="math">$$\lambda$$</span>. That equation can be rearranged as follows</p> <div class="math">\begin{align*} A\vec v &amp;= \lambda I\vec v \\ A\vec v - \lambda I\vec v &amp;= \vec 0 \\ (A - \lambda I)\vec v &amp;= \vec 0 \end{align*}</div> <p>which means that the matrix <span class="math">$$A - \lambda I$$</span> is a transformation that takes some non-zero vector <span class="math">$$\vec v$$</span> to the zero vector (i.e. it has a non-trivial "null space": one containing more than just the zero vector). This means that the transformation cannot be reversed, i.e. the matrix has no inverse, i.e. its determinant is zero. 
So, use that last fact to find the eigenvalues <span class="math">$$\lambda$$</span>:</p> <div class="math">\begin{align*} \det (A - \lambda I) &amp;= 0 \\ \\ \det \mat{-\lambda}{1} {1 }{1 - \lambda} &amp;= 0 \\ \\ (-\lambda)(1 - \lambda) - 1 &amp;= 0 \\ \\ \lambda^2 - \lambda - 1 &amp;= 0 \end{align*}</div> <p>Using the quadratic formula we have <span class="math">$$a=1, b=-1, c=-1$$</span> and</p> <div class="math">\begin{align*} \lambda = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{1 \pm \sqrt{5}}{2} \end{align*}</div> <p>which are the two eigenvalues.</p> <p>To find eigenvectors associated with the eigenvalues, go back to the equations</p> <div class="math">\begin{align*} (A - \lambda I)\vec v &amp;= \vec 0 \\ \\ \mat{-\lambda}{1} {1 }{1 - \lambda} \vec v &amp;= \vec 0 \end{align*}</div> <p>Let an eigenvector <span class="math">$$\vec v$$</span> be <span class="math">$$\scvec{v_1}{v_2}$$</span>. The matrix equation corresponds to this system of equations:</p> <div class="math">$$\begin{cases} -\lambda v_1 &amp;+ v_2 &amp;= 0\\ v_1 &amp;+ (1 - \lambda) v_2 &amp;= 0 \end{cases}$$</div> <p>From the first equation we have <span class="math">$$v_2 = \lambda v_1$$</span>. There are infinitely many eigenvectors (a line of them) associated with any given eigenvalue, so we can pick an arbitrary non-zero value for <span class="math">$$v_1$$</span>. If we choose <span class="math">$$v_1=2$$</span> then we have eigenvectors <span class="math">$$\scvec{2}{1+\sqrt 5}$$</span> and <span class="math">$$\scvec{2}{1-\sqrt 5}$$</span>. The matrix containing the eigenvectors is</p> <div class="math">$$V = \mat{2 }{2 } {1 + \sqrt 5}{1 - \sqrt 5}$$</div> <h5 id="2-find-inverse-of-v"><strong>2. Find inverse of <span class="math">$$V$$</span></strong></h5> <p>The inverse of a 2x2 matrix is given by</p> <div class="math">$$\mat{a}{c} {b}{d} ^ {-1} = \frac{1}{\text{det}} \mat{d}{-c} {-b}{a}$$</div> <p>where <span class="math">$$\text{det} = ad - cb$$</span>. 
Therefore</p> <div class="math">\begin{align*} V^{-1} &amp;= \frac{1}{2(1 - \sqrt 5) - 2(1 + \sqrt 5)} \mat{1 - \sqrt 5 }{-2} {-(1 + \sqrt 5)}{2} \\\\ &amp;= \frac{-1}{4\sqrt 5} \mat{1 - \sqrt 5 }{-2} {-(1 + \sqrt 5)}{2} \end{align*}</div> <h5 id="3-find-the-matrix-product-v-1av"><strong>3. Find the matrix product <span class="math">$$V^{-1}AV$$</span></strong></h5> <p>Before we get lost in the calculation, let's remember what this is. It's a matrix that does the <span class="math">$$A$$</span> transformation, but <em>in the coordinate system defined by <span class="math">$$A$$</span>'s eigenvectors</em>. So, the resulting matrix <em>must</em> do nothing other than stretch space in the direction of one or both basis vectors in that coordinate system. That's because (1) we represent a transformation with a matrix saying where each of the basis vectors is taken to, (2) the definition of an eigenvector of a transformation is that it is a vector which is simply stretched by the transformation with no change in direction, therefore (3) if the eigenvectors are the basis vectors, then the matrix representing the transformation must just stretch space in the two directions. A matrix which stretches space in the direction of the basis vectors looks like <span class="math">$$\smat{a}{0}{0}{b}$$</span>, i.e. it is diagonal. 
Therefore, <span class="math">$$V^{-1}AV$$</span> <em>must</em> be diagonal.</p> <div class="math">\begin{align*} V^{-1}AV &amp;= \frac{-1}{4\sqrt 5} \mat{1 - \sqrt 5 }{-2} {-(1 + \sqrt 5)}{2} \mat{0}{1} {1}{1} \mat{2 }{2 } {1 + \sqrt 5}{1 - \sqrt 5} \\\\ &amp;= \frac{-1}{4\sqrt 5} \mat{1 - \sqrt 5 }{-2} {-(1 + \sqrt 5)}{2} \mat{1 + \sqrt 5}{1 - \sqrt 5} {3 + \sqrt 5}{3 - \sqrt 5} \\\\ &amp;= \frac{-1}{4\sqrt 5} \mat{-4 - 2(3 + \sqrt 5) }{6 - 2\sqrt 5 - 2(3 - \sqrt 5)} {-(6 + 2\sqrt 5) + 2(3 + \sqrt 5)}{4 + 2(3 - \sqrt 5)} \\\\ &amp;= \frac{-1}{2\sqrt 5} \mat{-2 - 3 - \sqrt 5}{3 - \sqrt 5 - 3 + \sqrt 5} {-3 - \sqrt 5 + 3 + \sqrt 5}{2 + 3 - \sqrt 5} \\\\ &amp;= \frac{-1}{2\sqrt 5} \mat{-5 - \sqrt 5}{0 } {0 }{5 - \sqrt 5} \\\\ &amp;= \frac{1}{2} \mat{1 + \sqrt 5}{0 } {0 }{1 - \sqrt 5} \end{align*}</div> <h5 id="4-compute-v-1avn"><strong>4. Compute <span class="math">$$(V^{-1}AV)^n$$</span></strong></h5> <p>The matrix is diagonal so this is straightforward. Note that this is the whole point of converting to the eigenbasis: the exponentiation at this step just involves the usual operations of raising scalar numbers to a power; no need to multiply matrices together. A computer will be able to compute the <span class="math">$$\nth$$</span> power of a diagonal matrix much faster than that of a non-diagonal matrix.</p> <div class="math">$$(V^{-1}AV)^n = \frac{1}{2^n} \mat{(1 + \sqrt 5)^n}{0 } {0 }{(1 - \sqrt 5)^n}$$</div> <h5 id="5-plug-the-nth-power-into-the-overall-expression"><strong>5. 
Plug the <span class="math">$$\nth$$</span> power into the overall expression</strong></h5> <div class="math">\begin{align*} V \Big(V^{-1}AV\Big)^n V^{-1} &amp;= \frac{-1}{4\sqrt 5} \frac{1}{2^n} \mat{2 }{2 } {1 + \sqrt 5}{1 - \sqrt 5} \mat{(1 + \sqrt 5)^n}{0 } {0 }{(1 - \sqrt 5)^n} \mat{1 - \sqrt 5 }{-2} {-(1 + \sqrt 5)}{2} \\\\ &amp;= \frac{-1}{4\sqrt 5} \frac{1}{2^n} \mat{2 }{2 } {1 + \sqrt 5}{1 - \sqrt 5} \mat{(1 - \sqrt 5)(1 + \sqrt 5)^n}{-2(1 + \sqrt 5)^n} {-(1 + \sqrt 5)(1 - \sqrt 5)^n}{2(1 - \sqrt 5)^n} \\\\ &amp;= \frac{-1}{4\sqrt 5} \frac{1}{2^n} \mat{2(-4)\big((1 + \sqrt 5)^{n-1} - (1 - \sqrt 5)^{n-1}\big)}{-4\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)} { -4\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{-2\big((1 + \sqrt 5)^{n+1} - (1 - \sqrt 5)^{n+1}\big)} \\\\ &amp;= \frac{1}{4\sqrt 5} \mat{4\frac{\big((1 + \sqrt 5)^{n-1} - (1 - \sqrt 5)^{n-1}\big)}{2^{n-1}}}{4\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n }} {4\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n }}{ \frac{\big((1 + \sqrt 5)^{n+1} - (1 - \sqrt 5)^{n+1}\big)}{2^{n-1}}} \\\\ &amp;= \mat{\frac{\big((1 + \sqrt 5)^{n-1} - (1 - \sqrt 5)^{n-1}\big)}{2^{n-1}\sqrt 5}}{\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n \sqrt 5}} {\frac{\big((1 + \sqrt 5)^n - (1 - \sqrt 5)^n \big)}{2^n \sqrt 5}}{\frac{\big((1 + \sqrt 5)^{n+1} - (1 - \sqrt 5)^{n+1}\big)}{2^{n+1}\sqrt 5}} \end{align*}</div> <script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#@#')) { var align = "center", indent = "0em", linebreak = "false"; if (false) { align = (screen.width < 768) ? "left" : align; indent = (screen.width < 768) ? "0em" : indent; linebreak = (screen.width < 768) ? 'true' : linebreak; } var mathjaxscript = document.createElement('script'); var location_protocol = (false) ? 
'https' : document.location.protocol; if (location_protocol !== 'http' && location_protocol !== 'https') location_protocol = 'https:'; mathjaxscript.id = 'mathjaxscript_pelican_#%@#@#'; mathjaxscript.type = 'text/javascript'; mathjaxscript.src = location_protocol + '//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; mathjaxscript[(window.opera ? "innerHTML" : "text")] = "MathJax.Hub.Config({" + " config: ['MMLorHTML.js']," + " TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' }, Macros: {} }," + " jax: ['input/TeX','input/MathML','output/HTML-CSS']," + " extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," + " displayAlign: '"+ align +"'," + " displayIndent: '"+ indent +"'," + " showMathMenu: true," + " messageStyle: 'normal'," + " tex2jax: { " + " inlineMath: [ ['\\\$$','\\\$$'] ], " + " displayMath: [ ['$$','$$'] ]," + " processEscapes: true," + " preview: 'TeX'," + " }, " + " 'HTML-CSS': { " + " styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! 
important'} }," + " linebreaks: { automatic: "+ linebreak +", width: '90% container' }," + " }, " + "}); " + "if ('default' !== 'default') {" + "MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "}"; (document.body || document.getElementsByTagName('head')).appendChild(mathjaxscript); } </script>Linear Algebra 2016-08-14T00:00:00-07:00 Dan Davison tag:dandavison.github.io,2016-08-14:linear-algebra.html <style type="text/css"> body {color: black;} </style> <div class="math">$$\newcommand{\i}{\mathbf{i}} \newcommand{\j}{\mathbf{j}} \newcommand{\cvec}[2]{\begin{pmatrix}#1\\#2\end{pmatrix}} \newcommand{\mat}[4]{\begin{bmatrix}#1 &amp; #2\\#3 &amp; #4\\ \end{bmatrix}} \newcommand{\scvec}[2]{\tiny{\cvec{#1}{#2}}} \newcommand{\smat}[4]{\tiny{\mat{#1}{#2}{#3}{#4}}} \newcommand{\nth}{n^{\text{th}}}$$</div> <p>Notes from the <a href="https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab">Essence of Linear Algebra</a> video series by <a href="http://www.3blue1brown.com/">3blue1brown</a>.</p> <hr /> <h3 id="linear-transformations-and-matrices">Linear transformations and matrices</h3> <p>A linear transformation is completely specified by</p> <ol> <li>Some basis vectors <span class="math">$$\i$$</span> and <span class="math">$$\j$$</span></li> <li>Where those basis 
vectors are taken to by the transformation.</li> </ol> <p>How the transformation affects any other point follows from those two pieces of information.</p> <p>So <span class="math">$$\i$$</span> might be taken to <span class="math">$$a\i + b\j$$</span>, and <span class="math">$$\j$$</span> might be taken to <span class="math">$$c\i + d\j$$</span>. In this case we would use the following matrix to describe the transformation:</p> <div class="math">$$\mat{a}{c} {b}{d}$$</div> <p>Some examples are</p> <div class="math">$$\begin{array}{ll} \text{stretch by a in the i-direction} &amp; \mat{a}{0} {0}{1} \\\\ \text{stretch by a in the i-direction and shear right} &amp; \mat{a}{b} {0}{1} \\\\ \text{rotate anticlockwise 90°} &amp; \mat{0}{-1} {1}{ 0} \end{array}$$</div> <p>Note that we haven't said what <span class="math">$$\i$$</span> and <span class="math">$$\j$$</span> are yet; they <em>define</em> the 2-dimensional space that we're considering. But we can think of them for now as the usual orthogonal unit vectors in 2D space.</p> <p>So the matrix tells us where the basis vectors have been taken to. Any other vector <span class="math">$$f\i + g\j$$</span> is taken to the same combination of the transformed basis vectors:</p> <div class="math">$$f\i + g\j \longrightarrow f\cvec{a}{b} + g\cvec{c}{d} = \cvec{fa + gc}{fb + gd}$$</div> <p>And that's how matrix multiplication is defined:</p> <div class="math">$$\mat{a}{c} {b}{d} \cvec{f}{g} = \cvec{fa + gc}{fb + gd}$$</div> <p>A matrix represents a linear transformation by showing where the basis vectors are taken to.</p> <hr /> <h3 id="change-of-basis">Change of basis</h3> <p>Suppose person B uses some other basis vectors to describe locations in space. 
Specifically, in our coordinates, their basis vectors are <span class="math">$$\scvec{2}{1}$$</span> and <span class="math">$$\scvec{-1}{1}$$</span>.</p> <p><strong>When they state a vector, what is it in our coordinates?</strong></p> <p>If they say <span class="math">$$\scvec{-1}{2}$$</span>, what is that in our coordinates?</p> <p>Well, if they say <span class="math">$$\scvec{1}{0}$$</span>, that's <span class="math">$$\scvec{2}{1}$$</span> in our coordinates. And if they say <span class="math">$$\scvec{0}{1}$$</span>, that's <span class="math">$$\scvec{-1}{1}$$</span> in our coordinates. So the matrix containing <em>their basis vectors expressed using our coordinate system</em> transforms a point expressed in their coordinate system into one expressed in ours. That last sentence is critical, so hopefully it makes sense! So, the answer is</p> <div class="math">$$\mat{2}{-1} {1}{ 1} \cvec{-1}{2} = \cvec{-4}{1}.$$</div> <p><strong>When we state a vector, what is it in their coordinates?</strong></p> <p>We give the vector <span class="math">$$\scvec{3}{2}$$</span>. What is that in their coordinate system? By definition, the answer is the pair of weights that scale their basis vectors to hit <span class="math">$$\scvec{3}{2}$$</span>. So, it is the solution to</p> <div class="math">$$\mat{2}{-1} {1}{1} \cvec{a}{b} = \cvec{3}{2}.$$</div> <p>Computationally, we can get the solution by multiplying both sides by the inverse:</p> <div class="math">$$\cvec{a}{b} = \mat{2}{-1} {1}{1}^{-1} \cvec{3}{2}.$$</div> <p>Conceptually, we have</p> <div class="math">$$\mat{2}{-1} {1}{1} = \begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix}$$</div> <p>where "their representation" means the vector expressed using their coordinate system. 
So the role played by the inverse is</p> <div class="math">$$\cvec{a}{b} = \begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix} \cvec{3}{2}.$$</div> <p><strong>When we state a transformation, what is it in their coordinates?</strong></p> <p>We state a 90° anticlockwise rotation of 2D space:</p> <div class="math">$$\mat{0}{-1} {1}{0}$$</div> <p>what is that transformation in their coordinates? The answer is</p> <div class="math">$$\begin{bmatrix}\text{matrix converting our}\\\text{representation to theirs} \\ \end{bmatrix} \mat{0}{-1} {1}{0} \begin{bmatrix}\text{matrix converting their}\\\text{representation to ours} \\ \end{bmatrix}$$</div> <p>since the composition of those three transformations defines a single transformation that takes in a vector expressed in their coordinate system, converts it to our coordinate system, transforms it as requested, and then converts back to theirs.</p> <script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#@#')) { var align = "center", indent = "0em", linebreak = "false"; if (false) { align = (screen.width < 768) ? "left" : align; indent = (screen.width < 768) ? "0em" : indent; linebreak = (screen.width < 768) ? 'true' : linebreak; } var mathjaxscript = document.createElement('script'); var location_protocol = (false) ? 'https' : document.location.protocol; if (location_protocol !== 'http' && location_protocol !== 'https') location_protocol = 'https:'; mathjaxscript.id = 'mathjaxscript_pelican_#%@#@#'; mathjaxscript.type = 'text/javascript'; mathjaxscript.src = location_protocol + '//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; mathjaxscript[(window.opera ? 
"innerHTML" : "text")] = "MathJax.Hub.Config({" + " config: ['MMLorHTML.js']," + " TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' }, Macros: {} }," + " jax: ['input/TeX','input/MathML','output/HTML-CSS']," + " extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," + " displayAlign: '"+ align +"'," + " displayIndent: '"+ indent +"'," + " showMathMenu: true," + " messageStyle: 'normal'," + " tex2jax: { " + " inlineMath: [ ['\\\$$','\\\$$'] ], " + " displayMath: [ ['$$','$$'] ]," + " processEscapes: true," + " preview: 'TeX'," + " }, " + " 'HTML-CSS': { " + " styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! important'} }," + " linebreaks: { automatic: "+ linebreak +", width: '90% container' }," + " }, " + "}); " + "if ('default' !== 'default') {" + "MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "}"; (document.body || document.getElementsByTagName('head')).appendChild(mathjaxscript); } </script>Classical Mechanics by John R. 
Taylor 2016-05-29T00:00:00-07:00 Dan Davison tag:dandavison.github.io,2016-05-29:taylor-classical-mechanics.html <p>Notes on <a href="http://www.amazon.com/Classical-Mechanics-John-R-Taylor/dp/189138922X">Classical Mechanics</a> by John R. Taylor.</p> <style type="text/css"> body {color: black;} </style> <div class="math">$$\newcommand{\xhat}{\vec{e_x}} \newcommand{\yhat}{\vec{e_y}} \newcommand{\rhat}{\vec{e_r}} \newcommand{\phihat}{\vec{e_\phi}} \newcommand{\r}{\vec{r}} \newcommand{\v}{\vec{v}} \newcommand{\p}{\vec{p}} \newcommand{\a}{\vec{a}} \newcommand{\F}{\vec{F}} \newcommand{\vector}[1]{\begin{bmatrix}#1\end{bmatrix}}$$</div> <h1 id="chapter-1-newtons-laws-of-motion">Chapter 1 - Newton's Laws of Motion</h1> <h4 id="basics">Basics</h4> <p>The basic object of interest is a moving particle. Its position at time <span class="math">$$t$$</span> is <span class="math">$$\r$$</span>. It has that arrow over it because it is a vector. A vector is something that specifies a direction and a magnitude. Think of <span class="math">$$\r$$</span> as an arrow from the origin pointing to the current position. Don't think of <span class="math">$$\r$$</span> yet as a column vector containing numbers, because we haven't said what coordinate system we're using. Regardless of what coordinate system we use, <span class="math">$$\r$$</span> is always a vector pointing from the origin to the current position.</p> <p>The particle is moving, i.e. the position changes over time. So instead of just writing <span class="math">$$\r$$</span>, we write <span class="math">$$\r(t)$$</span> which says that it's a function of time. Think of that as giving the answer to a question: "At a given time <span class="math">$$t$$</span>, what is the position?". The answer (position) is a vector, so we can say that this is a "vector-valued function" (i.e. 
whatever output it gives, it's always a vector).</p> <p>Its velocity is a function <span class="math">$$\v(t)$$</span> whose value is also a vector (at time <span class="math">$$t$$</span> it's going at some speed in some direction). The velocity function <span class="math">$$\v(t)$$</span> is the derivative with respect to time of the position function <span class="math">$$\r(t)$$</span>. That sounds very familiar, but what exactly is the derivative of a vector-valued function?</p> <p>In normal, non-vector, calculus we imagine some curve like <span class="math">$$y = x^2$$</span>. So <span class="math">$$y$$</span> is a function of <span class="math">$$x$$</span>. The value of that function is not a vector; it's just a number (a scalar). The derivative of that function with respect to <span class="math">$$x$$</span> is saying: at a particular point along the x-axis, if I start advancing <span class="math">$$x$$</span> a tiny bit, how fast is <span class="math">$$y$$</span> changing? So, it's the slope of the curve at that point (also just a number, not a vector).</p> <p>In vector calculus, the derivative of <span class="math">$$\r(t)$$</span> with respect to <span class="math">$$t$$</span> is saying: at some particular time <span class="math">$$t$$</span>, if I start advancing time a tiny bit, where is the position going and how fast is it going there? So the derivative of a vector-valued function is a vector -- an arrow with direction and magnitude (speed).</p> <h4 id="coordinate-systems">Coordinate systems</h4> <p>Thinking of <span class="math">$$\r(t)$$</span> as an arrow with direction and magnitude is correct but a bit abstract. How specifically do we use numbers to represent position? The chapter covers two main coordinate systems. 
Let's say the particle is moving in 2D space for now.</p> <ul> <li> <p><strong>Cartesian coordinates</strong>: we write down how far the particle currently is in the x-direction, <span class="math">$$x(t)$$</span>, and how far it currently is in the y-direction, <span class="math">$$y(t)$$</span>.</p> </li> <li> <p><strong>Polar coordinates</strong>: we write down how far the particle currently is, <span class="math">$$r(t)$$</span>, in the current direction to the particle.</p> </li> </ul> <p>Note that <span class="math">$$x(t)$$</span>, <span class="math">$$y(t)$$</span>, and <span class="math">$$r(t)$$</span> were not written with arrows. They are just numbers, saying how far the particle is <em>in some direction</em>. The "in some direction" part corresponds to the concept of a <em>unit vector</em>. A "unit vector" is basically a vector where the direction is of interest, but the magnitude is just set to 1 for convenience.</p> <p>Cartesian coordinates use two directions to specify the position. We'll write these directions as the unit vectors <span class="math">$$\xhat$$</span> and <span class="math">$$\yhat$$</span>. So in Cartesian coordinates, the position is</p> <table style="width:100%"> <tbody><tr> <td> $$\r(t) = x(t)\xhat + y(t)\yhat$$ </td> <td> $$\mathrm{Go~} x(t) \mathrm{~units~in~the~} \xhat \mathrm{~direction} \mathrm{~and~} y(t) \mathrm{~units~in~the~} \yhat \mathrm{~direction}$$ </td> </tr> </tbody></table> <p>In contrast, polar coordinates just use one direction to specify the position: the direction of a direct line to the particle's current position. This direction is the unit vector <span class="math">$$\rhat(t)$$</span>. 
So in polar coordinates, the position is</p> <table style="width:100%"> <tbody><tr> <td> $$\r(t) = r(t)\rhat(t)$$ </td> <td> $$\mathrm{Go~} r(t) \mathrm{~units~in~the~} \rhat(t) \mathrm{~direction}$$ </td> </tr> </tbody></table> <p>Notice (and this is pretty important; it's basically the reason the chapter is covering polar coordinates) that in polar coordinates the unit vector <span class="math">$$\rhat(t)$$</span> is a function of time (its direction changes as the particle moves); in contrast, in Cartesian coordinates, <span class="math">$$\xhat$$</span> and <span class="math">$$\yhat$$</span> are constant; they always point in the same direction. The polar unit vector is a function of time because it is the direction to wherever-the-particle-currently-is. The Cartesian unit vectors are not functions of time because they are just the x-axis direction and the y-axis direction and these do not change.</p> <h4 id="velocity">Velocity</h4> <p>We can now differentiate these position functions to get the velocity. Recall that the answer is going to be a vector because it is the derivative of a vector-valued function.</p> <p><strong>Cartesian coordinates</strong></p> <p>Because <span class="math">$$\xhat$$</span> and <span class="math">$$\yhat$$</span> are not functions of time, differentiating is straightforward:</p> <div class="math">$$\v(t) = \frac{d}{dt}\bigg(x(t)\xhat + y(t)\yhat\bigg) = \frac{d x(t)}{dt}\xhat + \frac{d y(t)}{dt} \yhat$$</div> <p>Physicists use a dot to represent derivative-with-respect-to-time. So they might write this as</p> <div class="math">$$\v(t) = \dot x(t) \xhat + \dot y(t) \yhat$$</div> <p>Either way, what this is saying is that in Cartesian coordinates, the velocity function is a vector comprised of current x-speed in the x-direction and current y-speed in the y-direction. 
In other words, it's what you expect.</p> <p><strong>Polar coordinates</strong></p> <div class="math">$$\v(t) = \frac{d}{dt}\bigg(r(t)\rhat(t)\bigg)$$</div> <p>That's a product of two things that are both functions of time, so we use the "product rule"<sup id="sf-taylor-classical-mechanics-1-back"><a href="#sf-taylor-classical-mechanics-1" class="simple-footnote" title=" The product rule is the thing from when you studied differentiation that says: when you're differentiating the product of two functions you differentiate one and keep the other as-is, then you differentiate the other while keeping the first as-is, and you add the two things together: $$\frac{d(f(t)g(t))}{dt} = \dot f(t) g(t) + f(t) \dot g(t)$$ ">1</a></sup> to differentiate it:</p> <div class="math">$$\frac{d}{dt}\bigg(r(t)\rhat(t)\bigg) = \dot r(t) \rhat(t) + r(t)\frac{d \rhat(t)}{dt}$$</div> <p>There are quite a few <span class="math">$$r$$</span>s there and it's important at this stage not to get lost in the symbols. We know that the answer (velocity) is a vector. That means we can write it as a bunch of things added together, where each thing is a number times some unit vector. And we're using polar coordinates, so the unit vectors are going to be the polar unit vectors. So the thing on the left <span class="math">$$\dot r(t) \rhat(t)$$</span> is fine: that's saying that the velocity has one component which is the current radial speed (a number <span class="math">$$\dot r(t)$$</span>) in the current radial direction (the unit vector <span class="math">$$\rhat(t)$$</span>).</p> <p>What about the thing on the right? It's the current radial distance times the current derivative of the unit vector function. We've said that in polar coordinates the unit vector <span class="math">$$\rhat(t)$$</span> changes over time, so it does make sense that we could ask what its derivative with respect to time is. So what is it? 
The answer is that it's a vector-valued function whose current value always points at right-angles to the current radial direction, but that requires explaining:</p> <p>Going back to the informal definition of derivatives above, we're at some point <span class="math">$$t$$</span> in time, and we imagine starting to advance time a tiny bit, and we look at the change in where the unit vector points, after this infinitesimally small amount of time passes. A unit vector always has length 1, so it can't grow in length. There's only one thing it can do: it can point in a slightly different direction. What direction has it gone in? It's basically like the hand of a clock. It's not too hard to see that if the hand of a clock changes just a tiny bit, then the tip moves in a direction that's almost a tangent to the circle. Change "tiny" to "infinitesimally small" and the "almost" goes away: so the time derivative of the radial unit vector is a vector pointing at right angles to the radial vector. The unit vector in that direction is called <span class="math">$$\phihat$$</span>, because it points in the direction that you go in when you increase the angle <span class="math">$$\phi$$</span>, as opposed to <span class="math">$$\rhat$$</span> which points in the direction you go in if you increase the radius <span class="math">$$r$$</span>. How fast does the radial unit vector move in the <span class="math">$$\phihat$$</span> direction? The answer is that it moves at the rate at which the angle is increasing, i.e. <span class="math">$$\dot \phi$$</span><sup id="sf-taylor-classical-mechanics-2-back"><a href="#sf-taylor-classical-mechanics-2" class="simple-footnote" title="You can prove this by writing the unit vector in Cartesian coordinates, $$\cos(\phi) \xhat + \sin(\phi) \yhat$$, and then differentiating it to give $$\dot \phi\big(-\sin(\phi)\xhat + \cos(\phi)\yhat\big)$$ which is $$\dot \phi$$ times a vector orthogonal to the original one.">2</a></sup>. 
In other words, the time derivative of the radial unit vector is <span class="math">$$\dot \phi(t) \phihat(t)$$</span>.</p> <p>The conclusion of all that is that in polar coordinates, the velocity vector is</p> <div class="math">$$\v(t) = \dot r(t) \rhat(t) + r(t) \dot \phi(t) \phihat(t)$$</div> <p>Compare this with the expression for velocity in Cartesian coordinates</p> <div class="math">$$\v(t) = \dot x(t) \xhat + \dot y(t) \yhat$$</div> <p>and we see it's a bit more complicated in polar coordinates.</p> <p>I understand the polar coordinates version as follows. At time <span class="math">$$t$$</span> the particle might be moving radially, and its angle might also be changing. The velocity vector has two components, one in the radial direction, and one in the tangent direction. In the radial direction, it's moving at whatever rate the radius is changing. In the tangent direction it's moving at the rate at which the angle is changing, multiplied by the current radius. That multiplication by radius makes sense informally, because if you are further out from the center of a circle, and the circle rotates by a few degrees, then you move further in space than if you were closer in to the center.</p> <h4 id="acceleration">Acceleration</h4> <p>The acceleration function is the derivative of the velocity function with respect to time. 
Therefore, it is also a vector: at time <span class="math">$$t$$</span> the particle is accelerating by some amount, in some direction.</p> <p><strong>Cartesian coordinates</strong></p> <p>Again, because the unit vectors do not change with time, it's as you expect: there's an x-acceleration in the x-direction, and a y-acceleration in the y-direction.</p> <div class="math">$$\a(t) = \ddot x(t) \xhat + \ddot y(t) \yhat$$</div> <p><strong>Polar coordinates</strong></p> <p>Above we saw that because, in polar coordinates, the directions of the coordinate system change with time, the function for velocity was more complicated than when using Cartesian coordinates. For acceleration, we differentiate the velocity expression and of course it gets even more complicated. But basically the answer is still a function of the form</p> <div class="math">$$\a(t) = \bigg( \text{Some function of } t \bigg) \rhat(t) + \bigg( \text{Another function of } t \bigg) \phihat(t)$$</div> <p>The functions of <span class="math">$$t$$</span> involve the current radius length, the speed and acceleration in the current radius direction, and the speed and acceleration of the angle parameter <span class="math">$$\phi$$</span>. 
The full expression is in the footnote<sup id="sf-taylor-classical-mechanics-3-back"><a href="#sf-taylor-classical-mechanics-3" class="simple-footnote" title="In polar coordinates, if you suppose that you know functions $$r(t)$$ and $$\phi(t)$$ giving the distance and angle at time $$t$$, then the accelerations in the two orthogonal directions at time $$t$$ are $$\a(t) = \bigg( \ddot r(t) - r(t) \dot\phi(t)^2 \bigg) \rhat(t) + \bigg( 2\dot r(t) \dot \phi(t) + r(t) \ddot \phi(t)\bigg) \phihat(t)$$ ">3</a></sup>.</p> <h3 id="newtons-second-law-as-a-differential-equation">Newton's second law as a differential equation</h3> <p>A key point seems to be: view Newton's second law <span class="math">$$\F = m\a$$</span> as a differential equation<sup id="sf-taylor-classical-mechanics-4-back"><a href="#sf-taylor-classical-mechanics-4" class="simple-footnote" title='The dot means "differentiated with respect to time". So if $$r$$ is position as a function of time then $$\dot r$$ is velocity and $$\ddot r$$ is acceleration.'>4</a></sup>:</p> <div class="math">$$m \ddot \r(t) = \F$$</div> <p>I understand this as follows: you know what forces are acting on the body in question. You want to know how the position of the body will evolve through time: <span class="math">$$\r(t)$$</span>. This is a function satisfying the following differential equation: the second derivative with respect to time of <span class="math">$$\r(t)$$</span>, times <span class="math">$$m$$</span>, is equal to the net force acting on the body.</p> <p>In practice: in a typical problem you have some expression for <span class="math">$$\F$$</span> derived from consideration of a diagram showing forces acting on the body. 
You might be able to discover <span class="math">$$\r(t)$$</span> by finding a function whose second derivative is <span class="math">$$\F/m$$</span>.</p> <h4 id="example-problems">Example problems</h4> <p><strong>Cartesian coordinates</strong></p> <blockquote> <p>1.37 A student kicks a frictionless puck with initial speed <span class="math">$$v_0$$</span>, so that it slides up a plane that is inclined at an angle <span class="math">$$\theta$$</span> above the horizontal. <strong>(a)</strong> Write down Newton's second law for the puck and solve to give its position as a function of time.</p> </blockquote> <p>This is a simple example of using the Second Law as a differential equation. We write down the forces acting on the particle, set them equal to <span class="math">$$m\ddot r(t)$$</span> and integrate twice to get position.</p> <p>The forces acting on the puck are its weight, i.e. its mass times acceleration due to gravity, <span class="math">$$mg$$</span>, and the normal force from the plane (there is no friction). The normal force is perpendicular to the plane and the puck can only move along the surface of the plane, so we are only interested in the component of the weight that acts parallel to the plane. This component is <span class="math">$$-mg \sin(\theta)$$</span>. 
So taking <span class="math">$$x$$</span> as the direction up the plane, Newton's second law is</p> <div class="math">$$m\ddot x(t) = -mg\sin(\theta)$$</div> <p>Integrating once gives velocity</p> <div class="math">$$\dot x(t) = -g \sin(\theta) t + v_0$$</div> <p>Integrating again gives position</p> <div class="math">$$x(t) = -\frac{1}{2} g \sin(\theta) t^2 + v_0t + x_0$$</div> <p>and <span class="math">$$x_0=0$$</span> since we start measuring from its starting position.</p> <blockquote> <p><strong>(b)</strong> How long will the puck take to return to its starting point?</p> </blockquote> <p>The puck is at its starting point whenever <span class="math">$$x = 0$$</span>:</p> <div class="math">$$0 = t\bigg(-\frac{1}{2} g \sin(\theta) t + v_0\bigg)$$</div> <p>The solutions of that are either <span class="math">$$t=0$$</span> (which we already knew) or (the solution we want)</p> <p><span class="math">$$t = \frac{2v_0}{g \sin(\theta)}$$</span></p> <p><strong>Polar coordinates</strong></p> <blockquote> <p>A "halfpipe" at a skateboard park consists of a concrete trough with a semicircular cross section of radius <span class="math">$$R = 5\,\mathrm{m}$$</span>. I hold a frictionless skateboard on the side of the trough pointing down toward the bottom and release it. Discuss the subsequent motion using Newton's second law. In particular, if I release the skateboard just a short way from the bottom, how long will it take to come back to the point of release?</p> </blockquote> <p>Conceptually, we do the same thing as for the problem using Cartesian coordinates: we write down Newton's second law resolved into two orthogonal directions. It's just that with polar coordinates, these orthogonal directions are constantly changing.</p> <p>The weight of the skateboard acts downwards. Its tangent component moves the skateboard along the halfpipe, and its radial component presses the skateboard into the halfpipe a bit, with an associated reaction force. 
We ignore the force/reaction force between the skateboard and the pipe and focus only on the tangent force: <span class="math">$$-mg \sin(\phi)$$</span>, where <span class="math">$$\phi$$</span> is the angle of the skateboard's position measured from the bottom of the trough.</p> <p>The equation for acceleration says that, at time <span class="math">$$t$$</span>, acceleration in the current tangent direction is <span class="math">$$R\ddot \phi(t)$$</span> (halfpipe radius times current angular acceleration<sup id="sf-taylor-classical-mechanics-5-back"><a href="#sf-taylor-classical-mechanics-5" class="simple-footnote" title="To see this, start with the $$\phihat(t)$$ (tangent direction) part of the full expression for acceleration and note that the radial distance of the skateboard is fixed at $$R$$ by the presence of the half-pipe, so $$\dot r(t)$$ is zero and only the $$r(t) \ddot \phi(t)$$ term remains, with $$r(t) = R$$.">5</a></sup>). So Newton's second law in this context is the differential equation</p> <div class="math">$$mR \ddot \phi(t) = -mg \sin(\phi(t))$$</div> <p>We read this as saying:</p> <blockquote> <p>We don't know how the angle changes over time, <span class="math">$$\phi(t)$$</span> -- that is precisely what we want to know. But what we do know is that whatever that function is, its second derivative at time <span class="math">$$t$$</span> is equal to the sine of the current angle (times <span class="math">$$g/R$$</span> and with a minus sign because the tangent force always pushes the skateboard back towards the bottom, where the angle is zero).</p> </blockquote> <p>Once we've got to that point, finding the angle function <span class="math">$$\phi(t)$$</span> is just math. 
It turns out that, for a skateboard released from rest, the function whose second derivative has this property<sup id="sf-taylor-classical-mechanics-6-back"><a href="#sf-taylor-classical-mechanics-6" class="simple-footnote" title="Actually this function solves the approximate equation $$\ddot \phi(t) = -(g/R) \phi(t)$$, obtained by replacing $$\sin(\phi)$$ with $$\phi$$. That is a different property, but one which is very similar to the desired property as long as we're restricting ourselves to the angle being fairly small.">6</a></sup> is</p> <div class="math">$$\phi(t) = \phi_0 \cos\bigg(\sqrt{\frac{g}{R}}\,t\bigg)$$</div> <p>where <span class="math">$$\phi_0$$</span> is the angle at which the skateboard was released at time <span class="math">$$t=0$$</span>. This is the "solution" of the differential equation: a function matching the criteria that the differential equation specified.</p> <p>So we have our answer: the forces acting on the skateboard imply (via Newton's second law) that the way the angle of the skateboard changes is a cosine function of time. So the skateboard angle does what cosines do: it starts off at its maximum, decreases to zero, crosses zero and becomes negative for a while, starts turning back towards zero, crosses zero and becomes positive again and gets back to its maximum where it turns around again. The cosine first gets back to its maximum when <span class="math">$$\sqrt{g/R}\,t = 2\pi$$</span>, so the skateboard comes back to the point of release after <span class="math">$$t = 2\pi\sqrt{R/g}$$</span>, which is about 4.5 seconds for <span class="math">$$R = 5\,\mathrm{m}$$</span> and <span class="math">$$g = 9.8\,\mathrm{m/s^2}$$</span>.</p> <h3 id="conservation-of-momentum">Conservation of momentum</h3> <p>Momentum is mass times velocity, <span class="math">$$\p(t) = m\dot \r(t)$$</span>, so another way of stating the second law is: rate of change of momentum is equal to force. In a multi-particle system the forces-and-reaction-forces of the third law cancel each other out when summing the rate of change of momentum of the whole system. So, total momentum doesn't change due to internal forces (but it does if there are external forces).</p> <p>pp 21-23 show that conservation of momentum does not hold when considering magnetic and electrostatic forces between charged particles moving close to the speed of light. 
However I am unfamiliar with those forces and with the "right-hand rule" for fields/forces and I haven't understood this section.</p> <hr> <script type="text/javascript">if (!document.getElementById('mathjaxscript_pelican_#%@#@#')) { var align = "center", indent = "0em", linebreak = "false"; if (false) { align = (screen.width < 768) ? "left" : align; indent = (screen.width < 768) ? "0em" : indent; linebreak = (screen.width < 768) ? 'true' : linebreak; } var mathjaxscript = document.createElement('script'); var location_protocol = (false) ? 'https' : document.location.protocol; if (location_protocol !== 'http' && location_protocol !== 'https') location_protocol = 'https:'; mathjaxscript.id = 'mathjaxscript_pelican_#%@#@#'; mathjaxscript.type = 'text/javascript'; mathjaxscript.src = location_protocol + '//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'; mathjaxscript[(window.opera ? "innerHTML" : "text")] = "MathJax.Hub.Config({" + " config: ['MMLorHTML.js']," + " TeX: { extensions: ['AMSmath.js','AMSsymbols.js','noErrors.js','noUndefined.js'], equationNumbers: { autoNumber: 'AMS' }, Macros: {} }," + " jax: ['input/TeX','input/MathML','output/HTML-CSS']," + " extensions: ['tex2jax.js','mml2jax.js','MathMenu.js','MathZoom.js']," + " displayAlign: '"+ align +"'," + " displayIndent: '"+ indent +"'," + " showMathMenu: true," + " messageStyle: 'normal'," + " tex2jax: { " + " inlineMath: [ ['\\\$$','\\\$$'] ], " + " displayMath: [ ['$$','$$'] ]," + " processEscapes: true," + " preview: 'TeX'," + " }, " + " 'HTML-CSS': { " + " styles: { '.MathJax_Display, .MathJax .mo, .MathJax .mi, .MathJax .mn': {color: 'inherit ! 
important'} }," + " linebreaks: { automatic: "+ linebreak +", width: '90% container' }," + " }, " + "}); " + "if ('default' !== 'default') {" + "MathJax.Hub.Register.StartupHook('HTML-CSS Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax['HTML-CSS'].FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "MathJax.Hub.Register.StartupHook('SVG Jax Ready',function () {" + "var VARIANT = MathJax.OutputJax.SVG.FONTDATA.VARIANT;" + "VARIANT['normal'].fonts.unshift('MathJax_default');" + "VARIANT['bold'].fonts.unshift('MathJax_default-bold');" + "VARIANT['italic'].fonts.unshift('MathJax_default-italic');" + "VARIANT['-tex-mathit'].fonts.unshift('MathJax_default-italic');" + "});" + "}"; (document.body || document.getElementsByTagName('head')[0]).appendChild(mathjaxscript); } </script><ol class="simple-footnotes"><li id="sf-taylor-classical-mechanics-1"> The product rule is the thing from when you studied differentiation that says: when you're differentiating the product of two functions you differentiate one and keep the other as-is, then you differentiate the other while keeping the first as-is, and you add the two things together: <span class="math">$$\frac{d(f(t)g(t))}{dt} = \dot f(t) g(t) + f(t) \dot g(t)$$</span> <a href="#sf-taylor-classical-mechanics-1-back" class="simple-footnote-back">↩</a></li><li id="sf-taylor-classical-mechanics-2">You can prove this by writing the unit vector in Cartesian coordinates, <span class="math">$$\cos(\phi) \xhat + \sin(\phi) \yhat$$</span>, and then differentiating it to give <span class="math">$$\dot \phi\big(-\sin(\phi)\xhat + \cos(\phi)\yhat\big)$$</span> which is <span class="math">$$\dot \phi$$</span> times a vector orthogonal to the original one. 
<a href="#sf-taylor-classical-mechanics-2-back" class="simple-footnote-back">↩</a></li><li id="sf-taylor-classical-mechanics-3">In polar coordinates, if you suppose that you know functions <span class="math">$$r(t)$$</span> and <span class="math">$$\phi(t)$$</span> giving the distance and angle at time <span class="math">$$t$$</span>, then the accelerations in the two orthogonal directions at time <span class="math">$$t$$</span> are <span class="math">$$\a(t) = \bigg( \ddot r(t) - r(t) \dot\phi(t)^2 \bigg) \rhat(t) + \bigg( 2\dot r(t) \dot \phi(t) + r(t) \ddot \phi(t)\bigg) \phihat(t)$$</span> <a href="#sf-taylor-classical-mechanics-3-back" class="simple-footnote-back">↩</a></li><li id="sf-taylor-classical-mechanics-4">The dot means "differentiated with respect to time". So if <span class="math">$$r$$</span> is position as a function of time then <span class="math">$$\dot r$$</span> is velocity and <span class="math">$$\ddot r$$</span> is acceleration. <a href="#sf-taylor-classical-mechanics-4-back" class="simple-footnote-back">↩</a></li><li id="sf-taylor-classical-mechanics-5">To see this, start with the <span class="math">$$\phihat(t)$$</span> (tangent direction) part of the full expression for acceleration and note that the radial distance of the skateboard is fixed at <span class="math">$$R$$</span> by the presence of the half-pipe, so <span class="math">$$\dot r(t)$$</span> is zero and only the <span class="math">$$r(t) \ddot \phi(t)$$</span> term remains, with <span class="math">$$r(t) = R$$</span>. <a href="#sf-taylor-classical-mechanics-5-back" class="simple-footnote-back">↩</a></li><li id="sf-taylor-classical-mechanics-6">Actually this function solves the approximate equation <span class="math">$$\ddot \phi(t) = -(g/R) \phi(t)$$</span>, obtained by replacing <span class="math">$$\sin(\phi)$$</span> with <span class="math">$$\phi$$</span>. That is a different property, but one which is very similar to the desired property as long as we're restricting ourselves to the angle being fairly small. <a href="#sf-taylor-classical-mechanics-6-back" class="simple-footnote-back">↩</a></li></ol>