Electromagnetism & Relativity [PHYS10093]
Semester 1, 2014/15

Brian Pendleton
• Email: [email protected]
• Office: JCMB 4413
• Phone: 0131-650-5241
• Web: http://www.ph.ed.ac.uk/∼bjp/emr/

November 11, 2014

Books

The course should be self-contained, but it's always good to read textbooks to broaden your education . . .

• David J Griffiths, Introduction to Electrodynamics (Prentice Hall)
• JD Jackson, Classical Electrodynamics (Wiley) – advanced, good for next year
• KF Riley, MP Hobson and SJ Bence, Mathematical Methods for Physics and Engineering (CUP 1998)
• PC Matthews, Vector Calculus (Springer 1998)
• ML Boas, Mathematical Methods in the Physical Sciences (Wiley 2006)
• GB Arfken and HJ Weber, Mathematical Methods for Physicists (Academic Press 2001)
• DE Bourne and PC Kendall, Vector Analysis and Cartesian Tensors (Chapman and Hall 1993)

Griffiths is the main course text; Jackson is pretty advanced, but it will also be good for Classical Electrodynamics next year. The other books are useful for the first part of the course, which will cover vectors, matrices and tensors.

Syllabus

From DRPS. The figures in square brackets are the estimated number of lectures for each topic. This will be a very poor estimate in many cases!

Semester 1: Kinematics, Electrostatics and Magnetostatics

• Vectors, bases, Einstein summation convention, the delta & epsilon symbols, matrices, determinants. [1]
• Rotations of bases, composition of two rotations, reflections, projection operators, passive and active transformations, the rotational symmetry group. [2]
• Cartesian tensors: definition/transformation properties and rank, quotient theorem, pseudo-tensors, the delta and epsilon symbols as tensors. [2]
• Examples of tensors: moment of inertia tensor, rotation of solid bodies, stress and strain tensors and elastic deformations of solid bodies, ideal fluid flow. [3]
• Electric charge and charge density: Coulomb's law: linear superposition. Electrostatic potential: equipotentials: derivation of Gauss' law in integral and differential form. Electrostatic energy: energy in the electric field. Electric dipoles: force, torque and energy for a dipole: the multipole expansion. [3]
• Perfect conductors: surface charge: pill-box boundary conditions at the surface of a conductor: uniqueness theorem: boundary-value problems. Linear dielectrics: D and E, boundaries between dielectrics, boundary-value problems. [3]
• Currents in bulk, surfaces, and wires, current conservation: Ohm's law, conductivity tensor: EMF. [2]
• Forces between current loops: Biot-Savart law for the magnetic field, Ampère's law in differential and integral form, pill-box boundary conditions with surface currents. [2]
• The vector potential: gauge ambiguity: magnetic dipoles: magnetic moment and angular momentum: force and torque on magnetic dipoles. [2]
• Magnetization: B and H, boundaries between magnetic materials, boundary-value problems. [2]

Semester 2: Dynamics, Electromagnetism and Relativity

• Dynamics of point particles in gravitational, electric and magnetic fields, inertial systems, invariance under Galilean translations and rotations. [1]
• Motional EMF: Lenz's law: Faraday's law in integral and differential form. Mutual inductance: self-inductance: energy stored in inductance: energy in the magnetic field. Simple AC circuits (LCR): use of complex notation for oscillating solutions, impedance. [3]
• The displacement current and charge conservation: Maxwell's equations. Energy conservation from Maxwell's equations: Poynting vector. Momentum conservation for EM fields: stress tensor: angular momentum. [3]
• Plane-wave solutions of the free Maxwell equations: prediction of the speed of light. Polarization, linear and circular, in complex notation: energy and momentum for EM waves. [2]
• Plane waves in conductors: skin depth: reflection of plane waves from conductors. Waveguides and cavities: lasers. Reflection and refraction at dielectric boundaries: derivation of the Fresnel equations. Interference and diffraction, single and double slits. [3]
• Physical basis of special relativity: the Michelson-Morley experiment, Einstein's postulates, Lorentz transformations, time dilation and Fitzgerald contraction, addition of velocities, rapidity, Doppler effect and aberration, Minkowski diagrams. [3]
• Non-orthogonal co-ordinates, covariant and contravariant tensors, covariant formulation of classical mechanics, position, velocity, momentum and force 4-vectors, particle collisions. [2]
• Relativistic formulation of electromagnetism from the Lorentz force, Maxwell tensor, covariant formulation of Maxwell's equations, Lorentz transformation of the electric and magnetic fields, invariants, stress-energy tensor, the electromagnetic potential, Lorenz gauge. [3]
• Generation of radiation by oscillating charges: wave equations for potentials: spherical waves: causality: the Hertzian dipole. [2]

Contents

1 Vectors, matrices & determinants
  1.1 Cartesian vectors, δ and ε symbols
    1.1.1 Cartesian components of a vector
    1.1.2 Index (or suffix) notation
    1.1.3 Scalar product
    1.1.4 Free indices and repeated indices
    1.1.5 The vector product and the Levi-Civita symbol εijk
    1.1.6 Relation between ε and δ
    1.1.7 Grad, div and curl in index notation
  1.2 Matrices and determinants
    1.2.1 Matrices
    1.2.2 Determinants
    1.2.3 Linear equations
    1.2.4 Orthogonal matrices
2 Rotations, reflections & inversions
  2.1 Rotation of basis (or axes)
    2.1.1 Composition of two rotations
  2.2 Rotation of vectors
    2.2.1 Rotation about an arbitrary axis
  2.3 Reflections and inversions
  2.4 Projection operators
  2.5 Active and passive transformations
3 Cartesian tensors
  3.1 Definition and transformation properties
    3.1.1 Fields
    3.1.2 Definition of a tensor
    3.1.3 Dyadic notation
    3.1.4 Internal consistency in the definition of a tensor
    3.1.5 Properties of Cartesian tensors
    3.1.6 The quotient theorem
  3.2 Pseudotensors, pseudovectors & pseudoscalars
    3.2.1 Vectors
    3.2.2 Pseudovectors
  3.3 Some examples
  3.4 Invariant/Isotropic Tensors
4 Taylor expansions
  4.1 The one-dimensional case
    4.1.1 Examples
    4.1.2 A precursor to the three-dimensional case
  4.2 The three-dimensional case
5 The moment of inertia tensor
  5.1 Angular momentum and kinetic energy
    5.1.1 Angular momentum
    5.1.2 Kinetic energy
    5.1.3 The parallel axes theorem
    5.1.4 Diagonalisation of rank-two tensors
6 Electrostatics
  6.1 The Dirac delta function in three dimensions
  6.2 Coulomb's law
  6.3 The electric field
    6.3.1 Field lines
    6.3.2 The principle of superposition
  6.4 The electrostatic potential for a point charge
  6.5 The static Maxwell equations
    6.5.1 The curl equation
    6.5.2 Conservative fields and potential theory
    6.5.3 The divergence equation
  6.6 Electric dipole
    6.6.1 Potential and electric field due to a dipole
    6.6.2 Force, torque and energy
  6.7 The multipole expansion
    6.7.1 Worked example
    6.7.2 Interaction energy of a charged distribution
    6.7.3 A brute-force calculation - the circular disc
  6.8 Gauss' law
  6.9 Boundaries
    6.9.1 Normal component
    6.9.2 Tangential component
    6.9.3 Conductors

Chapter 1

Vectors, matrices & determinants

This course will run for the first time in academic year 2014/15. It begins with a lengthy treatment of vectors, matrices and tensors, which covers a variety of topics that we didn't (or couldn't) do in last year's Vector Calculus course.

1.1 Cartesian vectors, δ and ε symbols

We shall work mostly in three-dimensional real space, but we'll generalise to an arbitrary number of dimensions when appropriate. We have scalars, denoted by one number, e.g. temperature, and vectors, characterised by a direction and a length, e.g. velocity. We shall consider two classes of vectors:

• Displacement vectors (from an arbitrary point A to a point B), written a or AB.
• Position vectors (from some fixed origin O), written x or r.

1.1.1 Cartesian components of a vector

The Cartesian components of a vector are its projections onto 3 orthogonal axes, with basis vectors e1 (also written ex or i), e2 (ey or j) and e3 (ez or k).

We generally denote the set of 3 orthonormal basis vectors by {e1, e2, e3}, occasionally by {ex, ey, ez}, but never by {i, j, k}.¹ We use right-handed (RH) axes: by convention, e3 is chosen as the direction of a screw turned from e1 towards e2. We write

  a = a1 e1 + a2 e2 + a3 e3 ≡ (a1, a2, a3)

i.e. the vector can be denoted by a set of three numbers (a1, a2, a3). Its length is

  |a| ≡ a = √(a1² + a2² + a3²)

1.1.2 Index (or suffix) notation

Index/suffix notation and the Einstein summation convention save us a huge amount of writing in long expressions involving vectors, matrices and tensors. Example: we can write the vector a in Cartesians as

  a = a1 e1 + a2 e2 + a3 e3
    = Σ_{i=1}^{3} ai ei
    ≡ ai ei ≡ aj ej ≡ ap ep ≡ · · ·

In the third line we introduced the Einstein summation convention, in which we sum over all repeated indices: we omit the summation sign, and instead there is an implicit sum from 1 to 3 over the index that occurs exactly twice in an expression. We can use any letter for the index i, j, p, etc, that we sum over.
For this reason, repeated indices are sometimes known as 'dummy' indices.

[¹ The latter notation {i, j, k} can be very confusing when using indices and the Einstein summation convention.]

Einstein summation convention

The basic rules are:

• An index that occurs only once is not summed over, and is called a free index.
• Two identical (repeated, dummy) indices are summed over.
• Three identical indices are not permitted when using the summation convention.

The last rule arises because there is rarely a need for three (or more) identical indices in vector or matrix algebra, or in vector calculus – as we shall see.

1.1.3 Scalar product

Geometrical version: Define the scalar product as

  a · b ≡ ab cos θ

where a = |a|, b = |b|, and θ is the angle between a and b. Clearly a · b = b · a. Consider also three non-parallel vectors a, b and c, which form the sides of a triangle, with

  c = b − a

(we return to this below).

In index notation, the scalar product is

  a · b = (ai ei) · (bj ej) = ai bj ei · ej

The parentheses are for clarity only – we don't require them. For orthonormal basis vectors {ei}, we have

  e1 · e1 = e2 · e2 = e3 · e3 = 1   [θ = 0]
  e1 · e2 = e2 · e3 = e3 · e1 = 0   [θ = π/2]

These relations can be expressed succinctly using the Kronecker delta symbol δij. For all i, j = 1, 2, 3:

  δij ≡ 1 if i = j ,  0 otherwise

The orthonormality conditions can then be expressed as

  ei · ej = δij

and the expression for the scalar product above becomes

  a · b = ai bj δij = ai δij bj

The second equality holds because δij and bj are just numbers. Let's evaluate δij bj. Consider first

  δ1j bj = δ11 b1 + δ12 b2 + δ13 b3 = b1 + 0 + 0

Similarly δ2j bj = b2 and δ3j bj = b3, and hence

  δij bj = bi

which is known as the sifting property of the Kronecker delta symbol. We can now obtain the expected result

  a · b = ai (δij bj) = ai bi
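The sifting property and the collapse of the double sum ai bj δij to ai bi are easy to check by brute force. The following short Python sketch (an illustration added here, not part of the original notes) spells out the implicit sums, with indices running 0–2 rather than 1–3:

```python
# Kronecker delta, sifting property, and the index-notation scalar product.
# Indices run 0..2 (Python) instead of 1..3 (the notes).

def delta(i, j):
    """Kronecker delta: 1 if i = j, 0 otherwise."""
    return 1 if i == j else 0

def dot(a, b):
    """a . b = a_i b_j delta_ij, with both implicit sums written out."""
    return sum(a[i] * b[j] * delta(i, j) for i in range(3) for j in range(3))

a = [1, 2, 3]
b = [4, -5, 6]

# sifting: delta_ij b_j = b_i for each free index i
assert all(sum(delta(i, j) * b[j] for j in range(3)) == b[i] for i in range(3))

# the double sum collapses to the single sum a_i b_i
assert dot(a, b) == sum(a[i] * b[i] for i in range(3))
```

The double loop mirrors the expression ai bj δij term by term; the delta kills every term with i ≠ j, leaving the familiar component formula.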
The length-squared of c is

  c² ≡ |c|² = |b − a|² = (b − a) · (b − a) = b² + a² − 2 a · b

[which is the cosine rule]. Hence

  a · b = ½ (b² + a² − c²)
        = ½ [ b1² + b2² + b3² + a1² + a2² + a3² − (b1 − a1)² − (b2 − a2)² − (b3 − a3)² ]
        = a1 b1 + a2 b2 + a3 b3 = Σ_{i=1}^{3} ai bi

Using the summation convention, we omit the summation sign and write the scalar product in Cartesian coordinates as

  a · b = ai bi

Now consider

  a · ei = (aj ej) · ei = aj (ej · ei) = aj δji = ai

So the ith Cartesian component of a is

  (a)i = ai = a · ei

Example: For vectors a, b, c and d,

  (a · b)(c · d) = (a1 b1 + a2 b2 + a3 b3)(c1 d1 + c2 d2 + c3 d3)

Expanding this out would give 9 terms. In index notation with the summation convention, it becomes simply

  (a · b)(c · d) = (ai bi)(cj dj)

where again the parentheses are for clarity only. Omitting them gives

  (a · b)(c · d) = ai bi cj dj

Note that there is an implicit sum over both pairs of repeated indices i and j in the last two expressions. Similarly, for 6 vectors,

  (a · b)(c · d)(f · g) = ai bi cj dj fk gk

1.1.4 Free indices and repeated indices

Consider, for example, the vector equation

  a − (b · c) d + 3n = 0

The basis vectors are linearly independent, so this equation must hold for each component separately:

  ai − (b · c) di + 3ni = 0

for all values of i = 1, 2, 3.

We have seen how δij can be used to express the orthonormality of basis vectors succinctly. We now seek a similarly succinct expression for their vector products.²

Levi-Civita symbol: Define the Levi-Civita symbol εijk (pronounced 'epsilon i j k'), where the indices i, j and k can take on the values 1 to 3, such that

  εijk = +1 if ijk is an even permutation of 123
       = −1 if ijk is an odd permutation of 123
       =  0 otherwise (i.e. 2 or more indices are the same)
If we wish to write the equation ai − (b · c) di + 3ni = 0 in index notation with the summation convention, we must use a different dummy index in the scalar product:

  ai − bk ck di + 3ni = 0        (1.1)

The free index or unsummed index i occurs once and once only in each term of the equation. In general, every term in an equation must be of the same kind, i.e. have the same free indices. Since the free index i can take on each of the three values i = 1, 2, 3, equation (1.1) actually represents three equations, one for each value of i.

1.1.5 The vector product and the Levi-Civita symbol εijk

An even permutation consists of an even number of transpositions of two indices; an odd permutation consists of an odd number of transpositions of two indices. Hence

  ε123 = ε231 = ε312 = +1
  ε213 = ε321 = ε132 = −1
  all others = 0

The Levi-Civita symbol is also called the alternating symbol, or the epsilon symbol.

Geometrical definition of the vector product (also known as cross product) of two vectors a and b:

  a × b ≡ ab sin θ n

where n is a unit vector orthogonal to both a and b, and the vectors {a, b, n} form a right-handed set. Geometrically, ab sin θ is the area of the parallelogram with sides a and b. Clearly b × a = −a × b.

From the geometry of right-handed axes, the Cartesian basis vectors {e1, e2, e3} obey²

  e1 × e2 = e3 ,  e2 × e3 = e1 ,  e3 × e1 = e2

The equations satisfied by the vector products of the orthonormal basis vectors {ei} can now be written uniformly as

  ei × ej = εijk ek   ∀ i, j = 1, 2, 3

where there is an implicit sum over the 'dummy' or 'repeated' index k, and i and j are free indices, so there are 9 equations in total. For example

  e1 × e2 = ε12k ek = ε121 e1 + ε122 e2 + ε123 e3 = e3
  e1 × e1 = ε11k ek = ε111 e1 + ε112 e2 + ε113 e3 = 0
  e2 × e1 = ε21k ek = ε211 e1 + ε212 e2 + ε213 e3 = −e3

plus 6 more equations. Clearly, εijk very neatly encapsulates the ±1 information.

Further properties of εijk: note the symmetry of εijk under cyclic permutations,

  εijk = εkij = εjki = −εjik = −εikj = −εkji        (1.2)

This holds for all values of i, j and k.
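The definition of εijk can be made concrete in a few lines of Python (an illustrative addition, not part of the original notes): the sign is the parity of the permutation, computed by counting inversions, and the cyclic symmetries (1.2) can then be checked for every choice of indices.

```python
# Levi-Civita symbol from permutation parity, plus a brute-force check of
# the cyclic symmetry eps_ijk = eps_kij = eps_jki = -eps_jik.
from itertools import product

def eps(i, j, k):
    """Levi-Civita symbol for indices in {1, 2, 3}."""
    if len({i, j, k}) < 3:
        return 0                    # two or more indices the same
    seq = (i, j, k)
    # parity of (i, j, k) relative to (1, 2, 3): count inversions
    inversions = sum(seq[p] > seq[q] for p in range(3) for q in range(p + 1, 3))
    return -1 if inversions % 2 else +1

assert (eps(1, 2, 3), eps(2, 3, 1), eps(3, 1, 2)) == (1, 1, 1)     # even
assert (eps(2, 1, 3), eps(3, 2, 1), eps(1, 3, 2)) == (-1, -1, -1)  # odd
assert eps(1, 1, 2) == 0                                           # repeated

# cyclic symmetry holds for all 27 index choices
for i, j, k in product((1, 2, 3), repeat=3):
    assert eps(i, j, k) == eps(k, i, j) == eps(j, k, i) == -eps(j, i, k)
```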
To understand the counting in (1.2), note that

(i) if any two of the free indices i, j, k are the same, all terms vanish;
(ii) if (ijk) is an even (odd) permutation of (123), then so are (jki) and (kij), but (jik), (ikj) and (kji) are odd (even) permutations of (123).

Each of the equations (1.2) has three free indices, so each represents 3 × 3 × 3 = 27 equations. For example, in εijk = εkij, 3 equations say '1 = 1', 3 equations say '−1 = −1', and 21 equations say '0 = 0'.

For any vector, a × a = 0 (because sin θ = 0), and hence

  e1 × e1 = e2 × e2 = e3 × e3 = 0

[² The basis vectors are taken here to be right-handed. This will be important later.]

Vector product: Using explicit Cartesian coordinates, the vector product of two arbitrary vectors a and b in the {ei} basis is

  a × b = (Σ_{i=1}^{3} ai ei) × (Σ_{j=1}^{3} bj ej)
        = (a1 e1 + a2 e2 + a3 e3) × (b1 e1 + b2 e2 + b3 e3)
        = e1 (a2 b3 − a3 b2) + e2 (a3 b1 − a1 b3) + e3 (a1 b2 − a2 b1)

Note that each of 1, 2, 3 appears as an index exactly once in each product, which is why we can write the vector product as the determinant of a 3 × 3 matrix,

  a × b = | e1 e2 e3 ; a1 a2 a3 ; b1 b2 b3 |

More on determinants later. Using index notation and the summation convention, the vector product becomes

  a × b = (ai ei) × (bj ej) = ai bj (ei × ej) = ai bj εijk ek

We can reorder the terms in the last expression because εijk is just a number:

  a × b = εijk ai bj ek

Since i, j, k are 'dummy' indices, which are summed over, we can call them anything we like, so we can write

  a × b = εijk ai bj ek = εjki aj bk ei

Using εjki = εijk, we can rewrite this as

  a × b = ei εijk aj bk

Since the coefficient of ei in this expression is the ith component of a × b, we have

  (a × b)i = εijk aj bk

which is an important and useful identity. Let's check the first component in gory detail:

  (a × b)1 = ε1jk aj bk
           = ε111 a1 b1 + ε112 a1 b2 + ε113 a1 b3
           + ε121 a2 b1 + ε122 a2 b2 + ε123 a2 b3
           + ε131 a3 b1 + ε132 a3 b2 + ε133 a3 b3
           = a2 b3 − a3 b2

as required. This is an extremely important result, which you must know by heart, as it enables all vector identities (including those in vector calculus involving the operator ∇) to be derived easily, i.e. mechanically.

1.1.6 Relation between ε and δ

The 'bac-cab' rule for the vector triple product is

  a × (b × c) = (a · c) b − (a · b) c

which may be proved by considering explicit components of each side of the equation. Equating the ith component of each side, we get

  (a × (b × c))i = εijk aj (b × c)k = εijk aj εklm bl cm = εijk εklm aj bl cm

and

  ((a · c) b − (a · b) c)i = (a · c) bi − (a · b) ci = (δil δjm − δim δjl) aj bl cm

This is true for all vectors a, b, c, which means that it must hold for each component individually. This gives an expression for the product of two epsilon symbols with one summed index:

  εijk εklm = δil δjm − δim δjl

To verify it, one can check all possible cases. For example

  ε12k εk12 = ε121 ε112 + ε122 ε212 + ε123 ε312 = 1 = δ11 δ22 − δ12 δ21

However, as we have 3⁴ = 81 equations – 6 saying '1 = 1', 6 saying '−1 = −1', and 69 saying '0 = 0' – this will take some time. Taking a more systematic approach, note that the left-hand side of the boxed equation may be written out in full as

  εij1 ε1lm + εij2 ε2lm + εij3 ε3lm

where i, j, l, m are free indices. Then:

• for the result to be non-zero we must have i ≠ j and l ≠ m;
• for the result to be non-zero, none of i, j, l, m can be equal to the summed index k, so only one term of the three in the sum can be non-zero;
• if i = l and j = m we have +1; if i = m and j = l we have −1;
• all other terms are zero.

This provides an outline of an alternative derivation of the relation.
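Both the component formula (a × b)i = εijk aj bk and the contracted-epsilon identity can be verified by brute force. A Python sketch (illustrative, not from the notes; indices run 0–2, and the closed form (i−j)(j−k)(k−i)/2 for εijk is a standard shortcut for three-valued indices):

```python
# Cross product from the Levi-Civita symbol, and a check over all 81 cases of
# eps_ijk eps_klm = delta_il delta_jm - delta_im delta_jl.
from itertools import product

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2} (closed-form shortcut)."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    return 1 if i == j else 0

def cross(a, b):
    """(a x b)_i = eps_ijk a_j b_k, sums written out explicitly."""
    return [sum(eps(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3)) for i in range(3)]

assert cross([1, 0, 0], [0, 1, 0]) == [0, 0, 1]    # e1 x e2 = e3
assert cross([1, 2, 3], [4, 5, 6]) == [-3, 6, -3]  # matches the component formula

# the contracted epsilon identity, all 3^4 = 81 free-index choices
for i, j, l, m in product(range(3), repeat=4):
    lhs = sum(eps(i, j, k) * eps(k, l, m) for k in range(3))
    assert lhs == delta(i, l) * delta(j, m) - delta(i, m) * delta(j, l)
```

The 81-case loop is exactly the "check all possible cases" verification described above, done mechanically.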
1.1.7 Grad, div and curl in index notation

Index notation provides a fast and succinct method for evaluating most of the important results in vector calculus – thereby eliminating most of the tedium in the proofs of last year's Vector Calculus course!

Define the vector³ operator 'del' in Cartesian coordinates,

  ∇ ≡ e1 ∂/∂x1 + e2 ∂/∂x2 + e3 ∂/∂x3 ≡ ei ∂/∂xi

where in the last expression the repeated index i is summed over. ∇ is a vector operator, with components

  (∇)i = ∂/∂xi ≡ ∂i

We will always use the notation ∂/∂xi rather than ∂/∂ri, although the two notations are interchangeable. In electromagnetism, we will sometimes use the longhand notation

  ∇ ≡ ex ∂/∂x + ey ∂/∂y + ez ∂/∂z

and we (might) occasionally write

  ∇ = (∂/∂x1, ∂/∂x2, ∂/∂x3)   or   ∇ = (∂/∂x, ∂/∂y, ∂/∂z)

We can then define each of gradient, divergence and curl (or their ith components) in index notation:

  ∇φ, the gradient of a scalar field φ:    ∇φ = ei ∂i φ ,   (∇φ)i = ∂i φ
  ∇ · a, the divergence of a vector field a:   ∇ · a = ∂i ai
  ∇ × a, the curl of a vector field a:    ∇ × a = ei εijk ∂j ak ,   (∇ × a)i = εijk ∂j ak

Example 1: Evaluate ∇r, where r = √(x1² + x2² + x3²) = (xj xj)^{1/2} is the length of the position vector. The ith component of ∇r is

  (∇r)i = ∂/∂xi (x1² + x2² + x3²)^{1/2} = ½ (x1² + x2² + x3²)^{−1/2} 2xi = xi/r

Therefore

  ∇r = r/r

More formally,

  ∇r = ei ∂/∂xi (xj xj)^{1/2} = ei ½ (xj xj)^{−1/2} ∂/∂xi (xj xj) = ei (1/2r) 2 δij xj = r/r

where we used ∂xj/∂xi = δij in the second-last step.

Example 2:

  ∇(a · r) = ei ∂i (aj xj) = ei aj δij = a

Example 3:

  ∇ × r = ei εijk ∂j xk = ei εijk δjk = ei εijj = 0

Note that the result is zero simply because εijj = 0 has two identical indices.

1.2 Matrices and determinants

1.2.1 Matrices

Index notation and the Einstein summation convention are also useful in matrix (and tensor) algebra. The set of quantities {aij}, with aij ≡ Aij for all 1 ≤ i ≤ M, 1 ≤ j ≤ N, are the elements of the matrix A.
[³ As ∇ is always a vector operator, some people drop the vector symbol and just write ∇, but this may be regarded as sloppy, so we won't do it here.]

An M × N matrix is a rectangular array of numbers with M rows and N columns,

  A = ( a11      a12      · · ·   a1,N−1      a1N
        a21      a22      · · ·   a2,N−1      a2N
         ·                                     ·
         ·                                     ·
        aM−1,1   aM−1,2   · · ·   aM−1,N−1    aM−1,N
        aM,1     aM,2     · · ·   aM,N−1      aM,N  )  ≡  {aij}

A square matrix has N = M. We'll mostly work with 3 × 3 matrices, but the majority of what we'll do generalises to N × N matrices rather easily.

• We can add & subtract same-dimensional matrices and multiply a matrix by a scalar. Component forms are obvious, e.g. A = B + λC becomes aij = bij + λcij in index notation. Since both i and j are free indices, this represents 9 equations (for 3 × 3 matrices).

• The unit matrix I, defined by

  I = ( 1 0 0 ; 0 1 0 ; 0 0 1 )

has components δij, i.e. Iij = δij.

• The trace of a square matrix is the sum of its diagonal elements,

  Tr A = aii

Note the implicit sum over i due to our use of the summation convention.

• The transpose of a square matrix A with components aij is defined by swapping its rows with its columns, so

  (A^T)ij = aji

• If A = A^T, i.e. aji = aij, then A is symmetric; if A = −A^T, i.e. aji = −aij, then A is antisymmetric. Examples:

  ( 1 4 2 ; 4 3 6 ; 2 6 7 )

is symmetric – we have reflection symmetry in the diagonal – whereas

  ( 0 3 5 ; −3 0 2 ; −5 −2 0 )

is antisymmetric – we have reflection antisymmetry in the diagonal, and zeros along the diagonal. (Recall that the first index labels rows and the second index labels columns.)

1.2.2 Determinants

The determinant det A (or |A| or ||A||) of a 3 × 3 matrix A may be defined by

  det A = εlmn a1l a2m a3n

This is equivalent to the 'usual' recursive definition,

  det A = | a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 |

Expanding the determinant gives

  det A = a11 | a22 a23 ; a32 a33 | − a12 | a21 a23 ; a31 a33 | + a13 | a21 a22 ; a31 a32 |
        = a11 (a22 a33 − a23 a32) − a12 (a21 a33 − a23 a31) + a13 (a21 a32 − a22 a31)
        = (ε123 a11 a22 a33 + ε132 a11 a23 a32) + · · ·
        = ε1mn a11 a2m a3n + · · ·
        = εlmn a1l a2m a3n

Thus the two forms are equivalent. The ε form is convenient for the derivation of various properties of determinants. Note that only one term from each row and column appears in the determinant sum, which is why the determinant can be expressed in terms of the ε symbol.

The determinant is only defined for a square matrix, but the definition can be generalised to N × N matrices,

  det A = ε_{i1 ... iN} a_{1 i1} · · · a_{N iN}        (1.3)

where the epsilon symbol with N indices is defined by

  ε_{i1 ... iN} = +1 if i1, . . . , iN is an even permutation of 1, . . . , N
              = −1 if i1, . . . , iN is an odd permutation of 1, . . . , N
              =  0 otherwise

We shall usually consider N = 3, but most results generalise to arbitrary N.

Product of matrices

We can very easily implement the usual 'row into column' matrix multiplication rule in index notation. If A (with elements aij) is an M × N matrix and B (with elements bij) is an N × P matrix, then C = AB is an M × P matrix with elements cij = aik bkj. Since we're using the summation convention, there is an implicit sum k = 1, . . . , N in this expression. For example,

  ( 3 7 1 ; 6 2 4 ) ( 8 6 ; 1 2 ; 4 0 ) = ( 35 32 ; 66 40 )

Matrix multiplication is associative and distributive,

  A(BC) = (AB)C ≡ ABC   and   A(B + C) = AB + AC

respectively, but it's not commutative: AB ≠ BA in general. An important result is

  (AB)^T = B^T A^T

which follows because

  ((AB)^T)ij = (AB)ji = ajk bki = (B^T)ik (A^T)kj = (B^T A^T)ij

We may use these results to derive several alternative (and equivalent) expressions for the determinant. First define the quantity

  Xijk = εlmn ail ajm akn

It follows that

  Xjik = εlmn ajl aim akn = εmln ajm ail akn (where we relabelled l ↔ m) = −εlmn ail ajm akn = −Xijk

Thus the symmetry of Xijk is dictated by the symmetry of εlmn, and we must have Xijk = c εijk, where c is some constant.
To determine c, set i = 1, j = 2, k = 3, which gives ε123 c = X123, so

  c = εlmn a1l a2m a3n = det A

and hence

  εijk det A = εlmn ail ajm akn

Multiplying by εijk and using εijk εijk = 6 gives the symmetrical form for det A:

  det A = (1/3!) εijk εlmn ail ajm akn

This elegant expression isn't of practical use, because the number of terms in the sum increases from 3³ to 3⁶ (overcounting).

• Interchanging any two adjacent rows of a matrix changes the sign of the determinant. Example: interchanging the first and second rows gives

  εlmn a2l a1m a3n = εmln a2m a1l a3n = −εlmn a1l a2m a3n = −det A

In the first step we simply relabelled l ↔ m. It follows that when A has two identical rows, det A = 0.

• Interchanging any two adjacent columns of a matrix also changes the sign of the determinant (use the other form for the determinant, det A = εijk ai1 aj2 ak3).

We can obtain a result similar to the boxed expression above by defining

  Ylmn = εijk ail ajm akn

Using the same argument as before [tutorial] gives Ylmn = εlmn [εijk ai1 aj2 ak3]. Since det A = (1/3!) εlmn Ylmn, the analogous column results hold, and combining the row and column forms gives

  εijk εlmn det A = | ail aim ain ; ajl ajm ajn ; akl akm akn |        (1.4)

To derive (1.4) directly, start with the original definition of det A as |···| and permute rows and columns; this produces ± signs equivalent to permutations.

1.2.3 Linear equations

A standard use of matrices & determinants is to solve (for x) the matrix-vector equation

  Ax = y

where A is a square 3 × 3 matrix.
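Before writing out the components, the ε-symbol formula for det A – and the ε-symbol formula for the inverse quoted in the next paragraph, (A−1)ij = (1/(2! det A)) εimn εjpq apm aqn – can be checked numerically. A Python sketch (an illustrative addition, not from the notes; indices run 0–2):

```python
# Determinant and inverse of a 3x3 matrix from epsilon-symbol formulas.
from itertools import product

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def det3(A):
    """det A = eps_lmn a_0l a_1m a_2n (rows fixed, columns summed)."""
    return sum(eps(l, m, n) * A[0][l] * A[1][m] * A[2][n]
               for l, m, n in product(range(3), repeat=3))

def inv3(A):
    """(A^-1)_ij = (1 / (2 det A)) eps_imn eps_jpq a_pm a_qn."""
    d = det3(A)
    return [[sum(eps(i, m, n) * eps(j, p, q) * A[p][m] * A[q][n]
                 for m, n, p, q in product(range(3), repeat=4)) / (2 * d)
             for j in range(3)] for i in range(3)]

A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
assert det3(A) == 25            # agrees with cofactor expansion by hand

# A multiplied by inv3(A) should give the identity (to rounding error)
B = inv3(A)
for i, j in product(range(3), repeat=2):
    entry = sum(A[i][k] * B[k][j] for k in range(3))
    assert abs(entry - (1.0 if i == j else 0.0)) < 1e-12
```

The inverse function is a direct transcription of the boxed formula; the final loop is the "explicit multiplication" check AA−1 = I mentioned in the text.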
Representing x and y by column matrices, and writing out the components, this becomes

  a11 x1 + a12 x2 + a13 x3 = y1
  a21 x1 + a22 x2 + a23 x3 = y2
  a31 x1 + a32 x2 + a33 x3 = y3

In index notation,

  aij xj = yi

With a suitable definition of the inverse A−1 of the matrix A, we can write the solution as

  x = A−1 y   or   xi = (A−1)ij yj

where (A−1)ij ≡ (A−1)ij is the ijth element of the inverse of A [tutorial]:

  (A−1)ij = (1/(2! det A)) εimn εjpq apm aqn

By explicit multiplication [tutorial] we can show AA−1 = I = A−1A, as required. Alternatively [tutorial],

  A−1 = C^T / det A

where C = {cij} is the co-factor matrix of A, and cij = (−1)^{i+j} × the determinant formed by omitting the row and column containing aij. Note that a solution exists if and only if det A ≠ 0. These results generalise to N × N matrices.

Properties of Determinants

We can easily derive familiar properties of determinants from the definitions above, including the column forms

  det A = εijk ai1 aj2 ak3   and   εlmn det A = εijk ail ajm akn

• Adding a multiple of one row to another does not change the value of the determinant. Example: adding a multiple of the second row to the first row,

  εlmn a1l a2m a3n → εlmn (a1l + λ a2l) a2m a3n = εlmn a1l a2m a3n + 0

and det A is unaltered. The last term is zero because εlmn a2l a2m = 0.

• Adding a multiple of one column to another does not change the value of the determinant (use the other form for the determinant, det A = εijk ai1 aj2 ak3).

Determinant of the transpose

  det A^T = εlmn (A^T)l1 (A^T)m2 (A^T)n3 = εlmn a1l a2m a3n

and hence

  det A^T = det A

Below, by contracting the product of two epsilon symbols, we will re-obtain an old friend, which we now note can be written as the determinant of a 2 × 2 matrix:

  εijk εklm = δil δjm − δim δjl = | δil δim ; δjl δjm |

1.2.4 Orthogonal matrices

Let A be a 3 × 3 matrix with elements aij, and define the row vectors

  a(1) = (a11, a12, a13) ,  a(2) = (a21, a22, a23) ,  a(3) = (a31, a32, a33) ,

so that (a(i))j = aij.
If we choose the vectors a(i) to be orthonormal,

  a(1) · a(2) = a(2) · a(3) = a(3) · a(1) = 0   and   |a(1)|² = |a(2)|² = |a(3)|² = 1 ,

i.e. a(i) · a(j) = δij, then A is an orthogonal matrix. The rows of A form an orthonormal triad.

Properties of orthogonal matrices

• Consider

  AA^T = ( a11 a12 a13 ; a21 a22 a23 ; a31 a32 a33 ) ( a11 a21 a31 ; a12 a22 a32 ; a13 a23 a33 ) = ( 1 0 0 ; 0 1 0 ; 0 0 1 )

since the ij element of the product is a(i) · a(j) = δij. Therefore AA^T = I.

• Taking the determinant of both sides of AA^T = I gives det A det A^T = 1. Since det A^T = det A, then (det A)² = 1, and therefore

  det A = ±1

• Since det A ≠ 0, the inverse matrix A−1 always exists, and therefore

  A−1 = A^T

Multiplying this equation on the right by A gives A^T A = I, so the columns of A also form an orthonormal triad.

Product of determinants

If C = AB, so that cij = aik bkj, then

  det C = εijk c1i c2j c3k
        = εijk a1l bli a2m bmj a3n bnk
        = [εijk bli bmj bnk] a1l a2m a3n
        = εlmn det B a1l a2m a3n

and hence

  det AB = det A det B

Product of two epsilons

The product of two epsilon symbols with no identical indices may be written as

  εijk εlmn = | δil δim δin ; δjl δjm δjn ; δkl δkm δkn |

This equation has 6 free indices, so it represents 3⁶ = 729 identities: 18 say '1 = 1', 18 say '−1 = −1', so 693 say '0 = 0'. The proof follows almost trivially by setting A = I in equation (1.4),

  εijk εlmn det A = | ail aim ain ; ajl ajm ajn ; akl akm akn | ,

whence det I = 1 and aij = δij. Unfortunately, the full result isn't particularly useful. If, however, we set n = k and sum over k,

  εijk εklm = εijk εlmk = | δil δim δik ; δjl δjm δjk ; δkl δkm δkk |   (sum over k, with δkk = 3)
            = δil (3δjm − δjk δkm) − δim (3δjl − δjk δkl) + δik (δjl δkm − δjm δkl)
            = 2δil δjm − 2δim δjl + δim δjl − δil δjm
            = δil δjm − δim δjl

we obtain the old friend noted earlier.

A concrete example of an orthogonal matrix – a rotation through angle α in an anticlockwise direction about the z-axis (anticlockwise when looking at the tip of the arrow of the z-axis) – is worked out in the next chapter.
Chapter 2

Rotations, reflections & inversions

2.1 Rotation of basis (or axes)

Consider two right-handed (RH) orthonormal bases¹ S and S′ with a common origin. Denote the basis vectors in S by {ei}, and in S′ by {e′i}. The vector a has components ai in S, and a′i in S′:

  a = a1 e1 + a2 e2 + a3 e3 = ai ei = a′1 e′1 + a′2 e′2 + a′3 e′3 = a′i e′i

We can relate the components in the two bases:

  a′i = a · e′i = (e′i · ej) aj

which we write as

  a′i = ℓij aj

where the transformation matrix L has components ℓij:

  L = {ℓij} = {e′i · ej} = ( e′1·e1  e′1·e2  e′1·e3 ; e′2·e1  e′2·e2  e′2·e3 ; e′3·e1  e′3·e2  e′3·e3 )

• ℓij is the cosine of the angle between the ith axis of S′ and the jth axis of S.
• L describes a rotation of the basis vectors about some axis, in which case it is often called the rotation matrix. Note that the vector a is not rotated – it remains fixed in space – only the basis vectors are rotated. More on this point later.

[¹ Bases are sometimes known as frames, or frames of reference, in physics applications.]

For example, consider a rotation through angle α in an anticlockwise direction about the z-axis (anticlockwise when looking at the tip of the arrow of the z-axis). In this case

  e′1 · e1 = e′2 · e2 = cos α ,  e′1 · e2 = cos(π/2 − α) = sin α ,  e′2 · e1 = cos(π/2 + α) = −sin α , etc

The rotation matrix is

  Lz(α) = (  cos α  sin α  0
            −sin α  cos α  0
              0      0     1 )

Since Lz(−α) Lz(α) = I and Lz^T(α) = Lz(−α) (by inspection), we deduce that Lz^T(α) Lz(α) = I, i.e. Lz is an orthogonal matrix. Similarly, for rotations about the x and y axes – either by direct evaluation or by cyclic relabelling of axes in our previous result –

  Lx(β) = ( 1   0      0
            0   cos β  sin β
            0  −sin β  cos β )

  Ly(γ) = ( cos γ  0  −sin γ
            0      1   0
            sin γ  0   cos γ )

Orthogonality is a general result, which holds because the length of the vector a is unchanged by the rotation:

  a² = a · a = ai ai = δij ai aj = a′k a′k = (ℓki ai)(ℓkj aj)

Since this is true for all ai, we must have

  ℓki ℓkj = δij
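The orthogonality condition ℓki ℓkj = δij, and the invariance of the length ai ai under a′i = ℓij aj, are easy to confirm numerically for Lz(α). A Python sketch (an illustrative addition, not part of the notes):

```python
# Check that Lz(alpha) is orthogonal and preserves lengths of components.
import math

def Lz(alpha):
    """Passive rotation matrix about the z-axis, as in the notes."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(L, a):
    """New components a'_i = l_ij a_j."""
    return [sum(L[i][j] * a[j] for j in range(3)) for i in range(3)]

L = Lz(0.7)

# l_ki l_kj = delta_ij, i.e. L^T L = I
for i in range(3):
    for j in range(3):
        s = sum(L[k][i] * L[k][j] for k in range(3))
        assert abs(s - (1.0 if i == j else 0.0)) < 1e-12

# the length a_i a_i is unchanged by the change of basis
a = [1.0, -2.0, 0.5]
ap = apply(L, a)
assert abs(sum(x * x for x in ap) - sum(x * x for x in a)) < 1e-12
```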
Thus we have Lᵀ L = I, or Lᵀ = L⁻¹, or L Lᵀ = I.

We can write the new basis vectors in terms of the old ones using ℓ_ij. Start with a′_i e′_i = a_j e_j, and use a′_i = ℓ_ij a_j to write ℓ_ij a_j e′_i = a_j e_j. This holds for all a_j, so ℓ_ij e′_i = e_j. Multiplying this last expression by ℓ_kj gives ℓ_kj ℓ_ij e′_i = ℓ_kj e_j. But ℓ_kj ℓ_ij = δ_ki, which gives e′_k = ℓ_kj e_j. Relabelling k → i gives

  e′_i = ℓ_ij e_j

NB: det L is always +1 for a rotation. This is because we must have L → I continuously as the angle of rotation α → 0. Rotations are called proper transformations.

Improper transformations: If the original basis {e_1, e_2, e_3} is right handed (RH), but the new basis {e′_1, e′_2, e′_3} is left handed (LH), then det L = −1. A simple example is an inversion of the axes through the origin, e′_i = −e_i, in which case

  L = ( −1  0  0 )
      (  0 −1  0 )
      (  0  0 −1 )

so det L = −1. Basis transformations with det L = −1 are called improper transformations.

2.1.1 Composition of two rotations

Consider a rotation of the axes {e_i} described by matrix L_1, followed by a rotation about the new axes {e′_i} described by L_2. Then e′_i = (L_1)_ij e_j and e″_i = (L_2)_ij e′_j, which gives

  e″_i = (L_2 L_1)_ij e_j

Note the "reverse" ordering of L_1 and L_2 in this expression — L_1 acts first.

Example:

  L_1 = (  0  1  0 )      L_2 = ( 0  0 −1 )
        ( −1  0  0 )            ( 0  1  0 )
        (  0  0  1 )            ( 1  0  0 )

L_1 represents a rotation about Oz through π/2, and L_2 a rotation about Oy of π/2.

For L_1 followed by L_2:

  L_2 L_1 = (  0  0 −1 )
            ( −1  0  0 )   ⇒  e″_1 = −e_3,  e″_2 = −e_1,  e″_3 = e_2
            (  0  1  0 )

Alternatively, L_2 followed by L_1 gives:

  L_1 L_2 = ( 0  1  0 )
            ( 0  0  1 )   ⇒  e″_1 = e_2,  e″_2 = e_3,  e″_3 = e_1
            ( 1  0  0 )

So L_2 L_1 ≠ L_1 L_2, i.e. rotations are non-commutative.

Euler angles: Making three rotations, the first about the 3-axis (through angle α), the second about the 2′-axis (β), and the third about the 3″-axis (γ), gives the most general rotation:

  L(α, β, γ) = L_z″(γ) L_y′(β) L_z(α)

Multiplying out gives:

  L = ( cos γ  sin γ  0 )( cos β  0  − sin β )(  cos α  sin α  0 )
      ( − sin γ cos γ 0 )(   0    1     0    )( − sin α cos α  0 )
      (   0      0    1 )( sin β  0   cos β  )(    0      0    1 )

    = (  cos β cos α cos γ − sin α sin γ    cos β sin α cos γ + cos α sin γ   − sin β cos γ )
      ( − cos β cos α sin γ − sin α cos γ  − cos β sin α sin γ + cos α cos γ    sin β sin γ )
      (  sin β cos α                        sin β sin α                         cos β       )

which is a rather complicated result! [There's a nice picture in the Mathematical Methods book by Mathews and Walker.] Beware: conventions for choosing Euler angles vary very widely. Euler angles are used in rigid body dynamics — see the Lagrangian Dynamics course.

2.2 Rotation of vectors

The transformations of the previous section, in which we rotate (or reflect or invert) the basis vectors keeping the vector fixed, are called passive transformations.

Alternatively, we can keep the basis fixed and rotate (or reflect or invert) the vector. These are called active transformations.

2.2.1 Rotation about an arbitrary axis

Consider a rotation of a rigid body, through angle θ, about an axis which points in the direction of the unit vector n. The axis passes through a fixed origin O. The point P is rotated to Q; the position vector x is rotated to y.

[Figure: the axis n through O; S is the foot of the perpendicular from P to the axis; T lies along SP; the circle of rotation carries P to Q through angle θ.]

In the first diagram, OS is the projection of x onto the n direction, i.e. (x·n) n. In the second diagram, n × x is parallel to TQ. Then, since SP = SQ and |n × x| = SP = SQ,

  y = OS + ST + TQ
    = (x·n) n + SQ cos θ [x − (x·n) n]/SP + SQ sin θ (n × x)/|n × x|
    = (x·n) n + cos θ [x − (x·n) n] + sin θ (n × x)

This gives the important result

  y = x cos θ + (1 − cos θ)(n·x) n + (n × x) sin θ

In index notation, this is

  y_i = x_i cos θ + (1 − cos θ) n_j x_j n_i + ε_ikj n_k x_j sin θ,  or  y_i = R_ij(θ, n) x_j

where the rotation matrix R(θ, n) has components

  R_ij(θ, n) = δ_ij cos θ + (1 − cos θ) n_i n_j − ε_ijk n_k sin θ

Some properties of the rotation matrix:

(i) R is orthogonal, with det R = 1. Proofs [tutorial].

(ii) It is straightforward to show that

  Tr R = 1 + 2 cos θ  and  −½ ε_kij R_ij = n_k sin θ

If R is known, then the angle θ and the axis of rotation n can be determined from these. Note: R has 1 + (3 − 1) = 3 independent parameters (one for θ, two for the unit vector n); c.f. the 3 Euler angles.

(iii) The product of two rotations x → y → z is given by z = S R x.

(iv) Consider a small (infinitesimal) rotation δθ, for which cos δθ = 1 + O(δθ²) and sin δθ = δθ + O(δθ³); then

  R_ij = δ_ij − ε_ijk n_k δθ

A quicker (and sufficient) graphical proof follows directly from the diagram, which gives y − x = (n × x) δθ, from which the result above follows directly.

(v) For θ ≠ 0, π, R has only one real eigenvalue, +1, with one real eigenvector n. [tutorial]

2.3 Reflections and inversions

Consider reflection of a vector x → y in a plane with unit normal n. From the figure,

  y = x − 2 (x·n) n

In index notation, this becomes y_i = σ_ij x_j, where

  σ_ij = δ_ij − 2 n_i n_j

Inversion of a vector in the origin is given by y = −x. This defines the parity operator P:

  y_i = P_ij x_j  where  P_ij = −δ_ij

For reflections and inversions, det σ = det P = −1. Note that for reflections and inversions, performing the operation twice yields the original vector, i.e. σ² = I, P² = I.
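The formula for R(θ, n) and properties (i), (ii) and (v) can be verified numerically. A sketch (mine, not from the notes):

```python
import numpy as np

def R(theta, n):
    """R_ij = d_ij cos(t) + (1 - cos(t)) n_i n_j - eps_ijk n_k sin(t)."""
    n = np.asarray(n, dtype=float)
    # K_ij = -eps_ijk n_k is the cross-product matrix: K @ x = n x x
    K = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    return (np.cos(theta) * np.eye(3)
            + (1 - np.cos(theta)) * np.outer(n, n)
            + np.sin(theta) * K)

theta = 1.2
n = np.array([1.0, 2.0, 2.0]) / 3.0          # a unit vector
Rm = R(theta, n)

assert np.allclose(Rm.T @ Rm, np.eye(3))     # (i) orthogonal ...
assert np.isclose(np.linalg.det(Rm), 1.0)    # ... with det R = 1
assert np.isclose(np.trace(Rm), 1 + 2 * np.cos(theta))   # (ii) Tr R
# (ii) axis recovery from the antisymmetric part: n_k sin(theta)
axis = 0.5 * np.array([Rm[2, 1] - Rm[1, 2],
                       Rm[0, 2] - Rm[2, 0],
                       Rm[1, 0] - Rm[0, 1]])
assert np.allclose(axis, n * np.sin(theta))
assert np.allclose(Rm @ n, n)                # (v) n is the +1 eigenvector
```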
2.4 Projection operators

P is a parallel projection operator onto a vector u if

  P u = u  and  P v = 0

where v is any vector orthogonal to u, i.e. v·u = 0. Similarly, Q is an orthogonal projection with respect to u if

  Q u = 0  and  Q v = v

so that Q = I − P. Suitable operators are (exercise: check this!)

  P_ij = u_i u_j / u²  and  Q_ij = δ_ij − u_i u_j / u²

These have the properties

  P² = P,  Q² = Q,  P Q = Q P = 0

They're also unique. For example, if there exists another operator T with the same properties as P, i.e. T u = u and T v = 0, then for any vector w ≡ μ u + ν v + λ u × v we have

  (P − T) w = μ u + 0 + 0 − μ u − 0 − 0 = 0

because

  (P (u × v))_i = P_ij (u × v)_j = (u_i u_j / u²) ε_jkl u_k v_l = 0

This holds for all vectors w, so T = P.

2.5 Active and passive transformations

• Rotation of a vector in a fixed basis is called an active transformation: x → y, with y_i = R_ij x_j in the {e_i} basis.
• Rotation of the basis whilst keeping the vector fixed is called a passive transformation: {e_i} → {e′_i} and x_i → x′_i = ℓ_ij x_j.

If we set R_ij = ℓ_ij, then numerically y_i = x′_i. Consider a simple example of both types of rotation.

Rotation of a vector about the z-axis: using n_i = δ_i3,

  R_ij(θ, e_3) = δ_ij cos θ + (1 − cos θ) δ_i3 δ_j3 − ε_ijk δ_k3 sin θ = ( cos θ  − sin θ  0 )
                                                                        ( sin θ   cos θ   0 )
                                                                        (   0       0     1 )

This is an active rotation through angle θ.

Rotation of the basis about the z-axis: e′_i = ℓ_ij e_j ≡ R_ij e_j. In components,

  e′_1 = cos θ e_1 − sin θ e_2,  e′_2 = sin θ e_1 + cos θ e_2,  e′_3 = e_3

This is a passive rotation through angle −θ.

We conclude that an active rotation of the vector x through angle θ is equivalent to a passive rotation of the basis vectors by an equal and opposite amount. Colloquially, rotating a vector in one direction is equivalent to rotating the basis in the opposite direction. The general case can be built from three rotations (Euler angles).
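The projection operators and their properties can be checked with a few lines of NumPy (my own sketch; `u` and `v` are arbitrary illustrative vectors):

```python
import numpy as np

u = np.array([1.0, 2.0, -2.0])        # any non-zero vector
P = np.outer(u, u) / (u @ u)          # P_ij = u_i u_j / u^2
Q = np.eye(3) - P                     # Q = I - P

v = np.array([2.0, 0.0, 1.0])         # chosen so that v . u = 0
assert np.isclose(v @ u, 0.0)

assert np.allclose(P @ u, u) and np.allclose(P @ v, 0.0)   # P projects onto u
assert np.allclose(Q @ u, 0.0) and np.allclose(Q @ v, v)   # Q projects away from u
assert np.allclose(P @ P, P)          # P^2 = P
assert np.allclose(Q @ Q, Q)          # Q^2 = Q
assert np.allclose(P @ Q, 0.0)        # PQ = QP = 0
```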
Chapter 3

Cartesian tensors

3.1 Definition and transformation properties

Consider a rotation of the {e_i} basis (frame S) to the {e′_i} basis (frame S′). This is a passive rotation. The rotation matrix L, with components ℓ_ij, satisfies L Lᵀ = I = Lᵀ L, and it has unit determinant, det L = +1. The components of two arbitrary vectors a and b in the two frames are related by

  a′_i = ℓ_ij a_j,  b′_i = ℓ_ij b_j

We now define a vector a as an entity whose 3 components a_i in S are related to its 3 components a′_i in S′ by a′_i = ℓ_ij a_j.

Now consider the 9 quantities a_i b_j. Under the change of basis, these transform to

  a′_i b′_j = ℓ_ir a_r ℓ_js b_s = (ℓ_ir ℓ_js)(a_r b_s)

Clearly, these 9 quantities obey a particular transformation law under the change of frame S → S′. This motivates our definition of a tensor.

3.1.1 Definition of a tensor

Following on from our new definition of a vector, we define a tensor of rank 2, T, as an entity whose 3² = 9 components T_ij in S are related to its 9 components T′_ij in S′ by

  T′_ij = ℓ_ip ℓ_jq T_pq

where L is the rotation matrix with components ℓ_ij which takes S → S′. Since there are 2 free indices in the above expression, it represents 9 equations.

Similarly, a tensor of rank n, T, is defined to be an entity whose 3ⁿ components T_ijk···no (n indices) in S are related to its 3ⁿ components T′_ijk···no (n indices) in S′ by

  T′_ijk···no = ℓ_ir ℓ_js ℓ_kt ··· ℓ_nv ℓ_ow T_rst···vw

In this new language:

• A scalar is a tensor of rank 0 (i.e. a′ = a).
• A vector is a tensor of rank 1.

We shall often be sloppy and say T_ijk···rs is a tensor, when what we really mean is that T is a tensor with components T_ijk···rs in a particular frame S. The expressions tensor of rank 2 and second-rank tensor are used interchangeably; similarly for tensor of rank n and nᵗʰ-rank tensor.

Note that a rank-n tensor is more general than the 'product' of n vectors, i.e. not every tensor has components that can be written as a_i b_j c_k ··· p_r q_s.
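The rank-2 transformation law can be checked numerically for the outer-product tensor T_ij = a_i b_j: its components transformed with two factors of ℓ must agree with the outer product of the transformed vectors. A sketch (mine, not from the notes):

```python
import numpy as np

def Lz(t):
    # passive rotation of the basis about the z-axis
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

L = Lz(0.4)
rng = np.random.default_rng(1)
a, b = rng.normal(size=3), rng.normal(size=3)

T = np.outer(a, b)                        # T_ij = a_i b_j
Tp = np.einsum('ip,jq,pq->ij', L, L, T)   # T'_ij = l_ip l_jq T_pq
assert np.allclose(Tp, np.outer(L @ a, L @ b))   # = a'_i b'_j
```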
For example, a_i b_j + a_j b_i is a rank-2 tensor which is not of this form. Another explicit counterexample for n = 2 will be given in section 3.1.5.

3.1.2 Fields

A scalar or vector or tensor quantity is called a field when it is a function of position:

• Temperature T(r) is a scalar field.
• The electric field E_i(r) is a vector field.
• The stress field P_ij(r) is a tensor field (see later).

In the latter case the transformation law is

  P′_ij(r) = ℓ_ip ℓ_jq P_pq(r)  or  P′_ij(x′_k) = ℓ_ip ℓ_jq P_pq(x_k)  with  x′_k = ℓ_kp x_p

These two expressions mean the same thing, but the latter form is perhaps better.

3.1.3 Dyadic notation

In some (mostly older) books you will see dyadic notation. This is rather clumsy for tensors — but it works well for vectors, of course!

  dyadic notation        index notation
  a                      a_i
  a · b                  a_i b_i
  A                      A_ij (or a_ij)
  a · A · b              a_i A_ij b_j
  ···                    ···

We will not use dyadic notation for tensors.

3.1.4 Internal consistency in the definition of a tensor

Let T_ij, T′_ij, T″_ij be the components of a tensor in frames S, S′, S″ respectively. Let L = {ℓ_ij} be the rotation matrix for S → S′, and let M = {m_ij} be the rotation matrix for S′ → S″. Then

  T″_ij = m_ip m_jq T′_pq = m_ip m_jq (ℓ_pk ℓ_ql T_kl) = (ML)_ik (ML)_jl T_kl = n_ik n_jl T_kl

where N = ML is the rotation matrix for S → S″, so the definition of a tensor is self-consistent.

Symmetric and anti-symmetric tensors: If T_ij = T_ji in S, then T′_ij = T′_ji in S′:

  T′_ij = ℓ_ip ℓ_jq T_pq = ℓ_iq ℓ_jp T_qp  (relabelling p ↔ q)  = ℓ_jp ℓ_iq T_pq = T′_ji

T_ij is a symmetric tensor — the symmetry is preserved under a change of basis. [The notation p ↔ q refers to relabelling dummy indices.] Similarly, if T_ij = −T_ji, then T′_ij = −T′_ji; T_ij is an anti-symmetric tensor.

Given any second-rank tensor T, we can always decompose it into symmetric and anti-symmetric parts:

  T_ij = ½ (T_ij + T_ji) + ½ (T_ij − T_ji)
• We can re-write the tensor transformation law for rank-2 tensors (only) using matrix notation:

  T′_ij = ℓ_ip ℓ_jq T_pq = (L T Lᵀ)_ij,  so  T′ = L T Lᵀ ≡ L T L⁻¹  (for rank 2 only)

3.1.5 Properties of Cartesian tensors

• If T_ij···p and U_ij···p are two tensors with the same rank n, i.e. both have n indices, then

  V_ij···p = T_ij···p + U_ij···p

is also a tensor of rank n. The proof is straightforward.

• If T_ij···s and U_lm···r are the components of tensors T and U of rank n and m respectively, then

  V_ij···s lm···r = T_ij···s U_lm···r

are the components of a tensor TU of rank n + m, which has 3ⁿ × 3ᵐ = 3ᵐ⁺ⁿ components. This is because

  V′_i···s l···r = T′_i···s U′_l···r = (ℓ_iα ··· ℓ_sδ T_α···δ)(ℓ_lλ ··· ℓ_rρ U_λ···ρ) = ℓ_iα ··· ℓ_rρ V_α···ρ

We occasionally use Greek letters α, β, etc. for dummy indices when we have a large number of them!

Example: If U = λ is a scalar, then λT is a tensor of rank n + 0 = n.

• If T_ijk···s is a tensor of rank n, then T_iik···s (i.e. with n − 2 free indices) is a tensor of rank n − 2. The process of setting two indices equal and summing is called contraction.

Example: If T_ij = a_i b_j is a tensor of rank 2, then T_ii = a_i b_i is a tensor of rank 0 (a scalar).

The process of multiplying two tensors and contracting over a pair (or pairs) of indices on different tensors is (sometimes) called taking the scalar product.

• The Kronecker delta, δ_ij, is a second-rank tensor. Since we defined

  δ_ij = 1 (i = j),  0 (i ≠ j)

in all frames, so that a · b = δ_ij a_i b_j = δ_ij a′_i b′_j, we may write

  δ′_ij = δ_ij = ℓ_ip ℓ_jq δ_pq

which is the definition of a second-rank tensor. A tensor which has the same components in all frames is called an invariant or isotropic tensor [more later].

δ_ij is an example of a tensor that cannot be written in the form T_ij = a_i b_j. [The diagonal terms require all of the a_i and b_i to be non-zero, while the off-diagonal terms a_i b_j = 0 (i ≠ j) then cannot vanish — a contradiction.]
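Two of these properties — contraction producing a scalar, and the isotropy of δ_ij — are easy to confirm numerically. A sketch (mine, not from the notes):

```python
import numpy as np

def Lz(t):
    # passive rotation of the basis about the z-axis
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

L = Lz(1.1)
rng = np.random.default_rng(2)
T = rng.normal(size=(3, 3))               # an arbitrary rank-2 tensor

Tp = L @ T @ L.T                          # T' = L T L^T (rank-2 law)
assert np.isclose(np.trace(Tp), np.trace(T))        # T_ii is a scalar
assert np.allclose(L @ np.eye(3) @ L.T, np.eye(3))  # delta_ij is isotropic
```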
• If T_ij is a tensor, then so is T_ji [the components of the transpose Tᵀ of T].

3.1.6 The quotient theorem

Let a be an arbitrary vector, and let b_i = T_ij a_j, where T is an entity with 9 components in any frame, say T_ij in S and T′_ij in S′. If b always transforms as a vector, then T is a second-rank tensor.

To prove this, we determine the transformation properties of T. In S′ we have

  b′_i = T′_ij a′_j = T′_ij ℓ_jk a_k  and  b′_i ≡ ℓ_ij b_j = ℓ_ij T_jk a_k

Equate the last expression on each line and rearrange:

  (T′_ij ℓ_jk − ℓ_ij T_jk) a_k = 0

This expression holds for all vectors a, for example a = (1, 0, 0), etc., therefore

  T′_ij ℓ_jk = ℓ_ij T_jk
  ⇒ T′_ij ℓ_jk ℓ_mk = ℓ_ij ℓ_mk T_jk
  ⇒ T′_im = ℓ_ij ℓ_mk T_jk

where we used ℓ_jk ℓ_mk = δ_jm in the last step. Thus T transforms as a tensor.

Example: If there is a linear relationship between two vectors a and b, so that a_i = T_ij b_j, it follows from the quotient theorem that T is a tensor. This is an alternative (mathematical) definition of a second-rank tensor, and it's also the way that tensors arise in nature.

Quotient theorem (generalised): If R_ij···s is an arbitrary tensor of rank m, and T_ij···r is a set of 3ⁿ numbers (with n > m) such that T_ij···r R_ij···s (contracted over m indices) is a tensor of rank n − m, then T_ij···r is a tensor of rank n. The proof is similar to the rank-2 case, but it isn't illuminating.

3.2 Pseudotensors, pseudovectors & pseudoscalars

Suppose that we now allow reflection and inversion (as well as rotation) of the basis vectors, and represent them all by a transformation matrix L with

  det L = +1 for rotations,  det L = −1 for reflections and inversions.

3.2.1 Vectors

Inversion of the basis vectors is defined by

  (e_1, e_2, e_3) → (e′_1, e′_2, e′_3) = (−e_1, −e_2, −e_3),  so that  L = ( −1  0  0 )
                                                                           (  0 −1  0 )
                                                                           (  0  0 −1 )

which has components ℓ_ij = −δ_ij.

Let a be a vector (also called a tensor of rank 1, or a polar vector, or a true vector). We showed previously that the components a_i transform as a′_i = ℓ_ij a_j, so we have

  a′_1 = −a_1,  a′_2 = −a_2,  a′_3 = −a_3

The vector itself therefore transforms as

  a′ ≡ a′_i e′_i = (−a_i)(−e_i) = a_i e_i = a

and thus a (true/polar) vector remains unchanged by inversion of the basis.
In the second case (det L = −1), the handedness of the basis is changed: if S is right-handed (RH), then S′ is left-handed (LH), and vice versa. [Figure: a reflection taking a RH basis {e_i} to a LH basis {e′_i}, with the vector a unchanged.]

Before introducing pseudotensors, we note that the basis vectors always transform as e′_i = ℓ_ij e_j.

A second-rank tensor, or true tensor, T obeys the transformation law

  T′_ij = ℓ_ip ℓ_jq T_pq

for all transformations, i.e. rotations, reflections and inversions. A second-rank pseudotensor T obeys the transformation law

  T′_ij = (det L) ℓ_ip ℓ_jq T_pq

There is a change of sign for the components of a pseudotensor, relative to a true tensor, when we make a reflection or inversion of the reference axes. We can similarly define pseudovectors, pseudoscalars and rank-n pseudotensors.

Note: det L = −1 for a basis transformation that consists of any combination of an odd number of reflections or inversions, plus any number of rotations.

3.2.2 Pseudovectors

If a and b are (true) vectors, then c = a × b is a pseudovector (also known as a pseudotensor of rank 1, or an axial vector). Let's illustrate this by considering an inversion:

  c′_1 = a′_2 b′_3 − a′_3 b′_2 = (−a_2)(−b_3) − (−a_3)(−b_2) = +c_1,  etc.

Therefore

  c′ ≡ c′_i e′_i = (+c_i)(−e_i) = −c_i e_i = −c

Thus the direction of c = a × b is reversed in this LH basis. [Figure: under a reflection of the basis, the vectors a and b are unchanged, but a × b flips.] The vectors a, b and a × b form a LH triad in the S′ basis, i.e. they always have the same 'handedness' as the underlying basis.

Now consider the general case. We define the epsilon symbol in all bases, regardless of their handedness, in the same way:

  ε_ijk = +1 if ijk is an even permutation of 123
          −1 if ijk is an odd permutation of 123
           0 otherwise

The components of c = a × b are c_i = ε_ijk a_j b_k in S, and c′_i = ε_ijk a′_j b′_k in S′. We can now determine the components of c = a × b in S′.
Computing directly,

  c′_i = ε_ijk a′_j b′_k
       = ε_ijk ℓ_jp a_p ℓ_kq b_q
       = ε_mjk ℓ_ir ℓ_mr ℓ_jp ℓ_kq a_p b_q   (inserting δ_im = ℓ_ir ℓ_mr)
       = det L ℓ_ir ε_rpq a_p b_q
       = det L ℓ_ir (a × b)_r

where we used ε_mjk ℓ_mr ℓ_jp ℓ_kq = det L ε_rpq (the definition of the determinant) to get to the fourth line. So, finally,

  c′_i = det L ℓ_ir c_r

This is our definition of a pseudovector. Equivalently, since e′_i = ℓ_ij e_j, then

  c′ ≡ c′_i e′_i = (det L ℓ_ir c_r) ℓ_ij e_j = det L c_j e_j = det L c

Therefore a pseudovector changes sign under any improper transformation, such as inversion of the basis, or reflection.

Pseudotensors: ε is a pseudotensor of rank 3. The proof is simple and uses the fact that ε is defined to be the same in all bases:

  ε′_ijk ≡ ε_ijk = det L det L ε_ijk = det L ℓ_ip ℓ_jq ℓ_kr ε_pqr

where we used (det L)² = 1 and the definition of the determinant in the last step. Furthermore, since ε is the same in all bases, it's a rank-3 isotropic or invariant (pseudo)tensor. (More on this later.)

We can build pseudotensors of higher rank using a combination of vectors, tensors, δ and ε. For example:

• If a_i and b_i are tensors of rank 1, then the 3⁵ quantities ε_ijk a_l b_m are the components of a pseudotensor of rank 5.
• ε_ijk δ_pq is a pseudotensor of rank 5.

In general:

• The product of two tensors is a tensor.
• The product of two pseudotensors is a tensor.
• The product of a tensor and a pseudotensor is a pseudotensor.

3.3 Some examples

  • Velocity             v = dr(t)/dt       vector
  • Acceleration         a = v̇              vector
  • Force                F = m a            vector
  • Electric field       E = F/q            vector (F is the force on a test charge q)
  • Torque               G = r × F          pseudovector
  • Angular velocity ω   v = ω × r          pseudovector
  • Angular momentum     L = r × p          pseudovector
  • Magnetic field B     F = q v × B        pseudovector

For the magnetic field, see also the Biot–Savart law (more later):

  B(r) = (μ₀/4π) ∮_C I dr′ × (r − r′) / |r − r′|³

Pseudoscalars: A pseudoscalar changes sign under an improper transformation, a′ = (det L) a. Examples:

• E · B is a pseudoscalar: one can easily show that (E · B)′ = det L (E · B).
• Helicity h = p · s is a pseudoscalar. Here the pseudovector s is the angular momentum, or spin, of a particle/ball. As the ball spins, a point on it traces out a RH helix; in the figure, p is parallel to s.

3.4 Invariant/isotropic tensors

A tensor T is invariant or isotropic¹ if it has the same components T_ijk··· in any Cartesian basis (or frame of reference), so that

  T_ijk··· = ℓ_iα ℓ_jβ ℓ_kγ ··· T_αβγ···

for every (orthogonal) transformation matrix L = {ℓ_ij}. Similarly, T is an invariant pseudotensor if

  T_ijk··· = det L ℓ_iα ℓ_jβ ℓ_kγ ··· T_αβγ···

Theorem: If a_ij is a second-rank invariant tensor, then a_ij = λ δ_ij.

Proof: For a rotation of π/2 about the z-axis,

  L = (  0  1  0 )
      ( −1  0  0 )
      (  0  0  1 )

Since the only non-zero elements are ℓ_12 = 1, ℓ_21 = −1, ℓ_33 = 1, using a_ij = ℓ_iα ℓ_jβ a_αβ we find

  a_11 = ℓ_1α ℓ_1β a_αβ = ℓ_12 ℓ_12 a_22 = a_22
  a_13 = ℓ_1α ℓ_3β a_αβ = ℓ_12 ℓ_33 a_23 = a_23
  a_23 = ℓ_2α ℓ_3β a_αβ = ℓ_21 ℓ_33 a_13 = −a_13

Therefore a_11 = a_22, and a_13 = 0 = a_23. Similarly, for a rotation of π/2 about the y-axis, we find a_11 = a_33 and a_12 = 0 = a_32; and for a rotation of π/2 about the x-axis, a_22 = a_33 and a_21 = 0 = a_31.

The only solution to these equations is a_ij = λ δ_ij. We've already shown that δ_ij is an invariant second-rank tensor, therefore λ δ_ij is the most general invariant second-rank tensor.

One can use a similar argument to show that the only invariant vector is the zero vector. It's obvious that no non-zero vector has the same components in all bases!

Theorem: There is no invariant tensor of rank 3. The most general invariant pseudotensor of rank 3 has components

  a_ijk = λ ε_ijk

Theorem: The most general 4ᵗʰ-rank invariant tensor has components

  a_ijkl = λ δ_ij δ_kl + μ δ_ik δ_jl + ν δ_il δ_jk

The proof is long and not very illuminating. See, for example, Matthews, Vector Calculus, (Springer) or Jeffreys, Cartesian Tensors, (CUP). However, it's easy to show that the expression above is indeed an invariant tensor [tutorial: the proof is similar to the rank-2 case].

The most general rank-5 invariant pseudotensor has components

  a_ijklm = λ δ_ij ε_klm + ···

The most general rank-6 invariant tensor has components

  a_ijklmn = λ δ_ij δ_kl δ_mn + ···

Note that invariant tensors involving ε can always be rewritten as sums of products of δ's, using the expression for the product of two epsilons.

In general, the most general invariant tensor of rank 2n is a sum of constants times products of n Kronecker deltas; there is no invariant pseudotensor of rank 2n. Similarly, the most general invariant pseudotensor of rank 2n + 1 is a sum of constants times products of one epsilon symbol and n − 1 Kronecker deltas.

¹ Isotropic means "the same in all directions".
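The pseudovector transformation law c′_i = det L ℓ_ir c_r can be demonstrated numerically for an inversion of the basis. A sketch (mine, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(3)
a, b = rng.normal(size=3), rng.normal(size=3)
c = np.cross(a, b)

L = -np.eye(3)                         # inversion: an improper transformation
detL = np.linalg.det(L)                # = -1
ap, bp = L @ a, L @ b                  # true vectors: a'_i = l_ij a_j
cp = np.cross(ap, bp)                  # components computed with the same eps in S'

assert np.allclose(cp, detL * (L @ c))  # pseudovector law holds ...
assert not np.allclose(cp, L @ c)       # ... but the true-vector law does not
```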
There is no invariant tensor of rank 2n + 1.

Chapter 4

Taylor expansions

Taylor expansion is one of the most important and fundamental techniques in the physicist's toolkit. It allows a differentiable function to be expressed as a power series in its argument(s). This is useful when approximating a function, it often allows the problem to be 'solved' in some range of interest, and it's used in deriving fundamental differential equations. We shall use the expression 'Taylor's theorem' interchangeably with 'Taylor expansion'.

We shall assume familiarity with Taylor expansions of functions of one variable, so we won't cover this in lectures. However, we include some notes here for completeness. You may have seen the multivariate Taylor expansion beyond leading order, but possibly not quite like this. . .

4.1 The one-dimensional case

Let f(x) have a continuous mᵗʰ-order derivative f⁽ᵐ⁾(x) in a ≤ x ≤ b, so that

  ∫ₐ^{x₁} f⁽ᵐ⁾(x₀) dx₀ = f⁽ᵐ⁻¹⁾(x₁) − f⁽ᵐ⁻¹⁾(a)

Integrating a total of m times gives

  ∫ₐ^{x_m} ··· ∫ₐ^{x₁} f⁽ᵐ⁾(x₀) dx₀ ··· dx_{m−1}
    = ∫ₐ^{x_m} ··· ∫ₐ^{x₂} [f⁽ᵐ⁻¹⁾(x₁) − f⁽ᵐ⁻¹⁾(a)] dx₁ ··· dx_{m−1}
    = ∫ₐ^{x_m} ··· ∫ₐ^{x₃} [f⁽ᵐ⁻²⁾(x₂) − f⁽ᵐ⁻²⁾(a) − (x₂ − a) f⁽ᵐ⁻¹⁾(a)] dx₂ ··· dx_{m−1}
    = f(x_m) − f(a) − (x_m − a) f′(a) − (1/2!)(x_m − a)² f″(a) − ··· − (1/(m−1)!)(x_m − a)^{m−1} f⁽ᵐ⁻¹⁾(a)

where we used the basic integral

  ∫ₐ^x (y − a)^{n−1} dy = (1/n)(x − a)ⁿ

Now let x_m → x, which gives

  f(x) = f(a) + (x − a) f′(a) + (1/2!)(x − a)² f″(a) + ··· + (1/(m−1)!)(x − a)^{m−1} f⁽ᵐ⁻¹⁾(a) + R_m

where n! = n(n−1)···1 is the usual factorial function, with 0! = 1, and the remainder R_m is

  R_m = ∫ₐ^x ··· ∫ₐ^{x₁} f⁽ᵐ⁾(x₀) dx₀ ··· dx_{m−1}

But from the mean value theorem applied to f⁽ᵐ⁾, we have

  ∫ₐ^x f⁽ᵐ⁾(x₀) dx₀ = (x − a) f⁽ᵐ⁾(ξ),  a ≤ ξ ≤ x

which gives the "Lagrange form" for the remainder:

  R_m(x) = (1/m!)(x − a)ᵐ f⁽ᵐ⁾(ξ),  a ≤ ξ ≤ x

Notes

• We can repeat the proof above for x < a, where x ∈ [c, a] with c ≤ a ≤ b. Since nothing changes, we can talk about expansion in a region about x = a.

• If lim_{m→∞} R_m = 0 (as usually assumed here), we have an infinite series

  f(x) = Σ_{n=0}^∞ (1/n!) f⁽ⁿ⁾(a) (x − a)ⁿ

This is the Taylor expansion of f(x) about x = a. The set of x values for which the series converges is called the region of convergence of the Taylor expansion.

• If a = 0, then

  f(x) = Σ_{n=0}^∞ (1/n!) f⁽ⁿ⁾(0) xⁿ

The Taylor expansion about x = 0 is called the Maclaurin expansion.

Physicist's "proof": We can bypass the formal proof above by assuming that a power series expansion of f(x) exists (i.e. the polynomials xⁿ form a complete basis), so that

  f(x) = Σ_{n=0}^∞ aₙ xⁿ

Now differentiate p times, and equate coefficients for each p, to obtain

  f⁽ᵖ⁾(0) = 0 + ··· + p! a_p + ··· + 0,  which gives  a_p = (1/p!) f⁽ᵖ⁾(0)  (as before)
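The Lagrange form of the remainder gives a concrete, checkable error bound. For f = sin (every derivative bounded by 1), |R_m| ≤ |x|ᵐ/m!; a small check of this bound (mine, not from the notes):

```python
import math

# Partial sum of the Maclaurin series of sin x, terms up to x^19 / 19!,
# compared against the Lagrange remainder bound |R_m| <= |x|^m / m!.
x = 2.0
s = sum((-1)**n * x**(2*n + 1) / math.factorial(2*n + 1) for n in range(10))

m = 20                                  # polynomial is exact through order m - 1
bound = abs(x)**m / math.factorial(m)   # valid since |sin^(m)(xi)| <= 1
assert abs(s - math.sin(x)) <= bound
```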
4.1.1 Examples

Example 1: Expand the function f(x) = sin x about x = 0. We need

  f⁽²ⁿ⁾(0) = (−1)ⁿ sin 0 = 0,  f⁽²ⁿ⁺¹⁾(0) = (−1)ⁿ cos 0 = (−1)ⁿ

Now, since |f⁽ᵐ⁾(ξ)| ≤ 1, then, for fixed x,

  |R_m| = |(1/m!) xᵐ f⁽ᵐ⁾(ξ)| ≤ |x|ᵐ/m! → 0  as m → ∞

Therefore

  sin x = Σ_{n=0}^∞ (−1)ⁿ x^{2n+1}/(2n+1)! = x − x³/3! + x⁵/5! − ···

[Figure: sin x compared with the 'small x' approximation x − x³/3!.]

Example 2: Expand the function f(x) = (1 + x)^α about x = 0. In this case

  f⁽ⁿ⁾(0) = α(α−1)···(α−n+1) ≡ α!/(α−n)!

giving

  (1 + x)^α = Σ_{n=0}^∞ [α!/(n!(α−n)!)] xⁿ ≡ Σ_{n=0}^∞ (α choose n) xⁿ

The Taylor expansion includes the binomial expansion; α need not be a positive integer.

Example 3: a 'problem' case. Consider the (apparently) well-behaved function

  f(x) = exp(−1/x²)

Now f(0) = 0, and f⁽ⁿ⁾(0) = 0 for all n, so the Taylor series gives

  f(x) = 0 + 0 + 0 + ··· = 0  ∀x

which is clearly wrong for x ≠ 0. Beware of essential singularities — not all functions with an infinite number of derivatives can be expressed as a Taylor series. See "Laurent series" in courses on complex variables/analysis.

4.1.2 A precursor to the three-dimensional case

If we regard f(x + a) ≡ g(a) temporarily as a function of a only, we can write g(a) as a Maclaurin series in powers of a:

  f(x + a) ≡ g(a) = Σ_{n=0}^∞ (1/n!) g⁽ⁿ⁾(0) aⁿ

Since g⁽ⁿ⁾(0) = f⁽ⁿ⁾(x), we can rewrite this as

  f(x + a) = Σ_{n=0}^∞ (1/n!) aⁿ (d/dx)ⁿ f(x) ≡ exp(a d/dx) f(x)

The differential operator exp(a (d/dx)) is defined by its power-series expansion. This is the form that we shall generalise to three dimensions in an elegant way.

It can also be obtained by first defining F(t) ≡ f(x + at), which we regard as a function of t. We need the expansion in powers of t about t = 0, namely

  F(t) = Σ_{n=0}^∞ (tⁿ/n!) F⁽ⁿ⁾(0)     (4.1)

Noting that F⁽ⁿ⁾(0) = aⁿ f⁽ⁿ⁾(x), and setting t = 1, we find

  f(x + a) = F(1) = Σ_{n=0}^∞ (1/n!) aⁿ f⁽ⁿ⁾(x)

as before.
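The shift-operator form f(x + a) = exp(a d/dx) f(x) can be checked directly for f = sin, whose derivatives cycle through sin, cos, −sin, −cos. A sketch (mine, not from the notes):

```python
import math

# f(x + a) = sum_n a^n f^(n)(x) / n!, checked for f = sin.
x, a = 0.5, 1.3
derivs = [math.sin(x), math.cos(x), -math.sin(x), -math.cos(x)]  # cycle of period 4
total = sum(a**n / math.factorial(n) * derivs[n % 4] for n in range(40))
assert abs(total - math.sin(x + a)) < 1e-12
```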
4.2 The three-dimensional case

With this trick, we can use the one-dimensional result to find the Taylor expansion of φ(r + a) in powers of a about the point r. Let

  F(t) ≡ φ(r + ta) ≡ φ(u)   (where we defined u = r + ta)

so that, by equation (4.1), F(t) = Σ_{n=0}^∞ (tⁿ/n!) F⁽ⁿ⁾(0). We want φ(r + a), which is F(1). Using the chain rule, the first derivative of F(t) with respect to t is

  F⁽¹⁾(t) = (∂φ(u)/∂u_i)(∂u_i/∂t) = a_i ∂φ/∂u_i ≡ (a · ∇_u) φ(u)

where we used

  ∂u_i/∂t = ∂(x_i + t a_i)/∂t = a_i  and defined  a · ∇_u ≡ a_i ∂/∂u_i

The nᵗʰ derivative of F(t) is

  F⁽ⁿ⁾(t) = (a · ∇_u)ⁿ φ(u),  and hence  F⁽ⁿ⁾(0) = (a · ∇_r)ⁿ φ(r)     (4.2)

For F(1) we have

  φ(r + a) = Σ_{n=0}^∞ (1/n!) (a · ∇_r)ⁿ φ(r) ≡ exp(a · ∇_r) φ(r)

This is the Taylor expansion of a scalar field in three dimensions, in a rather elegant form. Generalisation to an arbitrary tensor field is easy: simply replace φ(r) by T_ij···(r) in the above expression.

Example: Find the Taylor expansion of φ(r + a) = 1/|r + a| for r ≫ a. Since φ(r) = 1/|r| = 1/r, we have

  1/|r + a| = Σ_{n=0}^∞ (1/n!) (a · ∇_r)ⁿ (1/r)
            = 1/r + (1/1!)(a_i ∂_i)(1/r) + (1/2!)(a_i ∂_i)(a_j ∂_j)(1/r) + ···
            = 1/r − a·r/r³ + [3(a·r)² − a²r²]/(2r⁵) + O(1/r⁴)

Exercise: check this explicitly. This result is used in the multipole expansion in electrostatics.

Chapter 5

The moment of inertia tensor

5.1 Angular momentum and kinetic energy

Suppose a rigid body of arbitrary shape rotates with angular velocity ω = ω n about a fixed axis, parallel to the unit vector n, which passes through the origin O. Consider a small element of mass dm at the point P, with position vector r relative to O. If the rigid body has density (mass per unit volume) ρ(r), then dm = ρ dV. The velocity of the element is

  v = ω × r
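The first three terms of the multipole-style expansion of 1/|r + a| can be checked numerically: the truncation error should be O(a³/r⁴). A sketch (mine, not from the notes; the particular vectors are arbitrary):

```python
import numpy as np

r = np.array([3.0, 4.0, 0.0])            # |r| = 5
a = np.array([0.01, -0.02, 0.015])       # |a| << |r|
rn = np.linalg.norm(r)

exact = 1.0 / np.linalg.norm(r + a)
approx = (1/rn
          - (a @ r) / rn**3
          + (3*(a @ r)**2 - (a @ a) * rn**2) / (2 * rn**5))

# The neglected terms are O(|a|^3 / r^4); allow a generous constant.
assert abs(exact - approx) < 10 * np.linalg.norm(a)**3 / rn**4
```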
We can see this geometrically from the figure. The distance |δr| moved by the element in time δt is

  |δr| = r sin φ δθ = |n × r| δθ

where φ is the angle between r and the axis n. So its velocity is

  v = δr/δt = ω × r,  where  ω = n δθ/δt

Alternatively, we can use the rotation matrix R(θ, n):

  δx_i = [R_ij(δθ, n) − δ_ij] x_j = −ε_ijk n_k δθ x_j + O((δθ)²)

so δx_i/δt = (n × x)_i δθ/δt, which again gives v = ω × r.

5.1.1 Angular momentum

The angular momentum L of a point particle of mass m at position r, moving with velocity v = ṙ, is L = r × p, where the momentum p = mv. The angular momentum dL of an element of mass dm = ρ dV at r is

  dL = ρ(r) dV r × v

The angular momentum of the whole rotating body is then

  L = ∫_body ρ r × (ω × r) dV

In components,

  L_i = ∫_body ρ ε_ijk x_j ε_klm ω_l x_m dV
      = ∫_body ρ (δ_il δ_jm − δ_im δ_jl) x_j ω_l x_m dV
      = ∫_body ρ (r² ω_i − x_i x_j ω_j) dV

Thus

  L_i = I_ij ω_j  with  I_ij = ∫_body ρ (r² δ_ij − x_i x_j) dV

The geometric quantity I(O) (where O refers to the origin) is called the moment of inertia tensor.¹ It is a tensor because L is a pseudovector, ω is a pseudovector, and hence from the quotient theorem I is a tensor. Note also that I_ij is symmetric, and it is independent of the axis of rotation n.

¹ We will often somewhat sloppily call it the inertia tensor.

5.1.2 Kinetic energy

The kinetic energy, dT, of an element of mass dm is dT = ½ (ρ dV)(ω × r)². The kinetic energy of the body is then

  T = ½ ∫_body ρ ε_ijk ω_j x_k ε_ilm ω_l x_m dV
    = ½ ∫_body ρ (δ_jl δ_km − δ_jm δ_kl) ω_j x_k ω_l x_m dV
    = ½ ∫_body ρ [ω² r² − (r·ω)²] dV
    = ½ [∫_body ρ (r² δ_ij − x_i x_j) dV] ω_i ω_j

which gives

  T = ½ I_ij ω_i ω_j = ½ L · ω

Alternative (more familiar) forms: Recalling that the angular velocity may be written as ω = ω n, consider

  n · L = n_i I_ij ω_j = I_ij n_i n_j ω ≡ I⁽ⁿ⁾ ω

where L · n is the component of angular momentum parallel to the axis n, and

  I⁽ⁿ⁾ = I_ij n_i n_j = ∫_body ρ [r² − (r·n)²] dV ≡ ∫_body ρ r⊥² dV

is the moment of inertia about the axis n, with r⊥ the perpendicular distance of the mass element from the n-axis. Similarly for the kinetic energy, so that

  L⁽ⁿ⁾ = I⁽ⁿ⁾ ω  (with L⁽ⁿ⁾ = L · n)  and  T = ½ I⁽ⁿ⁾ ω²

Example: Consider a cube of side a, of constant density ρ and mass M = ρa³, with one corner at the origin O and edges along the axes:

  I_ij(O) = ρ ∫ (r² δ_ij − x_i x_j) dV

In this case

  I_11 = ρ ∫₀ᵃ dx dy dz (x² + y² + z² − x²) = ρ [⅓ y³ x z + ⅓ z³ x y]₀ᵃ = ⅔ ρa⁵ = ⅔ Ma²

  I_12 = ρ ∫₀ᵃ dx dy dz (−xy) = −ρ [½x² ½y² z]₀ᵃ = −¼ ρa⁵ = −¼ Ma²

By symmetry,

  I(O) = Ma² (  ⅔  −¼  −¼ )
             ( −¼   ⅔  −¼ )
             ( −¼  −¼   ⅔ )

5.1.3 The parallel axes theorem

It's often more useful, and also simpler, to find the moment of inertia tensor about the centre of mass G, rather than about an arbitrary point O. There is, however, a simple relationship between them.
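The cube result can be confirmed by numerical integration of I_ij = ∫ρ(r²δ_ij − x_i x_j) dV on a midpoint-rule grid. A sketch (mine, not from the notes; unit density and side are illustrative choices):

```python
import numpy as np

# Cube of side a with a corner at the origin, constant density rho = 1.
a, N = 1.0, 60
h = a / N
pts = (np.arange(N) + 0.5) * h            # midpoint-rule sample points
x, y, z = np.meshgrid(pts, pts, pts, indexing='ij')
dV = h**3
M = a**3                                  # total mass for rho = 1

r2 = x**2 + y**2 + z**2
X = [x, y, z]
I = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        I[i, j] = np.sum(((i == j) * r2 - X[i] * X[j]) * dV)

exact = M * a**2 * np.array([[ 2/3, -1/4, -1/4],
                             [-1/4,  2/3, -1/4],
                             [-1/4, -1/4,  2/3]])
assert np.allclose(I, exact, atol=1e-3)
```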
Expanding the determinant, or writing it as det (Iij − λδij ) = 16 ijk lmn (Iil − λδil ) (Ijm − λδjm ) (Ikn − λδkn ) = 0 and then expanding gives P − Qλ + Rλ2 − λ3 = 0 where P = 2 where R⊥ (with R⊥ ≡ R2 − (R · n)2 ) is the perpendicular distance from the n axis. = det I Q = Example (revisited): In our previous example, the centre of mass G is at the centre of the cube with position vector R = ( 21 a, 12 a, 12 a). Then Z a/2 I11 (G) = ρ dx dy dz x2 + y 2 + z 2 − x2 = = R = −a/2 o n a/2 a/2 a/2 a/2 a/2 a/2 = ρ 31 y 3 −a/2 [x]−a/2 [z]−a/2 + 13 z 3 −a/2 [x]−a/2 [y]−a/2 = ρ 31 · 2(a/2)3 2(a/2) 2(a/2) · 2 = 61 ρa5 = 16 M a2 I12 (G) = ρ Z Since det A, TrA are invariant (the same in any basis), then the quantities P , Q, R are invariants of the tensor I (i.e. their values are also the same in any basis). = 0 The three values of λ (i.e. the solutions of the cubic equation) are the eigenvalues of the rank-two tensor, and the vectors ω are its eigenvectors.2 We will generally call the eigenvectors e. a/2 dx dy dz (−xy) −a/2 because 1 2x = 1 6 ijk lmn (δil Ijm Ikn + Iil δjm Ikn + Iil Ijm δkn ) 1 6 (δjm δkn − δjn δkm ) Ijm Ikn × 3 2 2 1 2 (TrI) − Tr I 1 6 ijk lmn (δil δjm Ikn + δil Ijm δkn + Iil δjm δkn ) 1 6 2 δkn Ikn × 3 = Tr I z and 1 6 ijk lmn Iil Ijm Ikn 2 a/2 −a/2 y = 0 G a Similarly for the other components. x O Eigenvectors and eigenvalues: If we take Iij ωj = λ ωi , and multiply on the left by L, we obtain (in matrix notation) T ⇒ Iij0 L ω j = λ L ω i LI L L ij ωj = λ L ω i |{z} =1 The inertia tensor about the centre of mass is then 1 6 0 0 Iij (G) = M a2 0 16 0 1 0 0 6 ij and Since 1 2 + 1 6 M R2 δij − Xi Xj = M a2 1 2 − 14 − 14 In the primed basis, we have by definition Iij0 ωj0 = λ0 ωi0 . Comparing with the second equation above, we see that eigenvectors ω are vectors, i.e. they transform as vectors because ωi0 = `ij ωj . Similarly, eigenvalues are scalars, i.e. they transform as scalars: λ0 = λ . 
Since $\tfrac{1}{2} + \tfrac{1}{6} = \tfrac{2}{3}$, adding the two matrices reproduces our previous result for $I_{ij}(O)$.

Note that only the direction (up to a $\pm$ sign) of the eigenvectors is determined by the eigenvalue equation; the magnitude is arbitrary.

The answer to our original question is that we must find the eigenvalues $\lambda^{(i)}$, $i = 1, 2, 3$, and the corresponding eigenvectors $\omega^{(i)}$, whence $L^{(i)} = \lambda^{(i)}\,\omega^{(i)}$ (no sum on $i$).

Eigenvalues and eigenvectors of a real symmetric tensor

Theorem:

• The eigenvalues of a real symmetric matrix are real.

• The eigenvectors corresponding to distinct eigenvalues are orthogonal.

• If a subset of the eigenvalues is degenerate (eigenvalues are equal), the corresponding eigenvectors can be chosen to be orthogonal, because the eigenvector subspace corresponding to the degenerate eigenvalues is orthogonal to the other eigenvectors, and within this subspace the eigenvectors can be chosen to be orthogonal by the Gram-Schmidt procedure.

Proofs will not be given here – see books or lecture notes from mathematics courses.

Moment of inertia tensor

When studying rigid body dynamics, it's (usually) best to work in a basis in which the moment of inertia tensor is diagonal. The eigenvectors of $I$ define the principal axes of the tensor. In this (primed) basis

$I' = \begin{pmatrix} A & 0 & 0 \\ 0 & B & 0 \\ 0 & 0 & C \end{pmatrix}$

where the (positive) quantities $A$, $B$, $C$ are called the principal moments of inertia. In this basis, the angular momentum and kinetic energy take the form

$L = A\,\omega'_1\,e'_1 + B\,\omega'_2\,e'_2 + C\,\omega'_3\,e'_3 \qquad\qquad T = \tfrac{1}{2}\bigl(A\,\omega_1'^2 + B\,\omega_2'^2 + C\,\omega_3'^2\bigr)$

For a free body (i.e. no external forces), $L$ and $T$ are conserved (time-independent), but $\omega$ will in general be time-dependent.

Diagonalisation of a real symmetric tensor

Let $T$ be a real second-rank symmetric tensor with real eigenvalues $\lambda^{(1)}, \lambda^{(2)}, \lambda^{(3)}$ and orthonormal eigenvectors $\ell^{(1)}, \ell^{(2)}, \ell^{(3)}$, so that $T\,\ell^{(i)} = \lambda^{(i)}\,\ell^{(i)}$ (no summation) and $\ell^{(i)}\cdot\ell^{(j)} = \delta_{ij}$.
Let the matrix $L$ have elements

$\ell_{ij} = \ell^{(i)}_j = \ell^{(i)}\cdot e_j\,, \qquad\text{i.e.}\qquad L = \begin{pmatrix} \ell^{(1)}_1 & \ell^{(1)}_2 & \ell^{(1)}_3 \\ \ell^{(2)}_1 & \ell^{(2)}_2 & \ell^{(2)}_3 \\ \ell^{(3)}_1 & \ell^{(3)}_2 & \ell^{(3)}_3 \end{pmatrix}$

i.e. the $i^{\rm th}$ row of $L$ is the $i^{\rm th}$ eigenvector of $T$. $L$ is an orthogonal matrix:

$(L L^T)_{ij} = \ell_{im}\,\ell_{jm} = \ell^{(i)}_m\,\ell^{(j)}_m = \delta_{ij}$

We can always choose the normalised eigenvectors $\ell^{(i)}$ to form a right-handed basis:

$\det L = \epsilon_{ijk}\,\ell_{1i}\,\ell_{2j}\,\ell_{3k} = \ell^{(3)}\cdot\ell^{(1)}\times\ell^{(2)} = +1$

With this choice, $L$ is a rotation matrix which transforms $S$ to $S'$. The tensor $T$ transforms as (summing over the indices $p$, $q$ only)

$T'_{ij} = \ell_{ip}\,\ell_{jq}\,T_{pq} = \ell^{(i)}_p\,T_{pq}\,\ell^{(j)}_q = \lambda^{(j)}\,\ell^{(i)}_p\,\ell^{(j)}_p$

or

$T'_{ij} = \lambda^{(j)}\,\delta_{ij} = \begin{pmatrix} \lambda^{(1)} & 0 & 0 \\ 0 & \lambda^{(2)} & 0 \\ 0 & 0 & \lambda^{(3)} \end{pmatrix}_{ij}$

Thus we have found a basis or frame of reference, $S'$, in which the tensor $T$ takes a diagonal form; the diagonal elements are the eigenvalues of $T$. (Thus tensors may be diagonalised in much the same way as matrices.) This is called the normal form.

A geometrical picture is provided by the inertia ellipsoid, which is defined by

$I_{ij}\,\omega_i\,\omega_j = 1$

(A factor of $\sqrt{2T}$ is absorbed into $\omega$ by convention.) In the principal axes basis, where $\omega'_i = \ell_{i\alpha}\,\omega_\alpha$, we have

$A\,\omega_1'^2 + B\,\omega_2'^2 + C\,\omega_3'^2 = 1$

This describes an ellipsoid because $A$, $B$, $C$ are all positive. (This follows from the definition; for example $A = \int \rho\,(y^2+z^2)\,dV$.)

In any basis, a small displacement $\omega \to \omega + d\omega$ on the ellipsoidal surface at the point $P$, with normal $n$, obeys

$I_{ij}\,\omega_i\,d\omega_j = L_j\,d\omega_j = 0$

for all $d\omega$. Therefore $L$ is orthogonal to $d\omega$ and parallel to $n$, i.e. $L$ is always orthogonal to the surface of the ellipsoid at $P$. [The angular momentum $L$ is labelled $h$ in the accompanying figure.]

In the principal axes basis, $L = A\,\omega'_1\,e'_1 + B\,\omega'_2\,e'_2 + C\,\omega'_3\,e'_3$. The directions for which $L$ is parallel to $\omega$ are obviously the directions of the principal axes of the ellipsoid. For example, if $\omega = \omega'_1\,e'_1$ then

$L = A\,\omega'_1\,e'_1$

In this case, the body is rotating about a principal axis which passes through its centre of mass.
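The construction above (rows of $L$ = orthonormal eigenvectors, sign of one eigenvector flipped if needed so that $\det L = +1$) can be sketched numerically as follows; this is my own illustration, not part of the notes:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
T = A + A.T                      # a random real symmetric tensor

lam, V = np.linalg.eigh(T)       # columns of V are orthonormal eigenvectors
L = V.T.copy()                   # rows of L are the eigenvectors l^(i)

if np.linalg.det(L) < 0:         # eigenvector signs are arbitrary, so flip one
    L[2] *= -1                   # to make the basis right-handed: det L = +1

Tp = L @ T @ L.T                 # T'_ij = l_ip l_jq T_pq : the normal form
```

`Tp` is diagonal with the eigenvalues on the diagonal, and `L` is a genuine rotation matrix.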
This gives a 'geometrical' answer to our original question.

Notes:

• If two principal moments are identical, $(A, A, C)$, the ellipsoid becomes a spheroid. If all three principal moments are identical, the ellipsoid becomes a sphere, and $L$ is always parallel to $\omega$.

• The principal axes basis is used in the Lagrangian Dynamics course to study the rotational motion of a free rigid body in the Newtonian approach to dynamics, and the motion of a symmetric spinning top with principal moments $(A, A, C)$ in the Lagrangian approach.

• The principal axes basis/frame is 'fixed to the body', i.e. it moves with the rotating body, and is therefore a non-inertial frame.

Chapter 6: Electrostatics

6.1 The Dirac delta function in three dimensions

Consider a body with density $\rho(r)$ occupying a volume $V$. The mass of the body is

$M = \int_V \rho(r)\,dV$

How can we use this general expression for the case of a single particle? What is the 'density' of a single 'point' particle with mass $M$ at $r_0$? We need a 'function' $\rho(r)$ with the properties

$\rho(r) = 0 \;\;\forall\; r \neq r_0 \qquad\text{and}\qquad M = \int_V \rho(r)\,dV \quad (r_0 \in V)$

We write (this defines the notation)

$\rho(r) = M\,\delta(r - r_0)$

Generalising slightly, we define the delta function to pick out the value of a function $f(r)$ at one point $r_0$ in the range of integration, so that

$\int_V f(r)\,\delta(r - r_0)\,dV = \begin{cases} f(r_0) & r_0 \in V \\ 0 & \text{otherwise} \end{cases}$

(In the mass example above, $f(r_0) = 1$.) Similarly, the total charge on a body with charge density (charge per unit volume) $\rho(r)$ is

$Q = \int_V \rho(r)\,dV$

The one-dimensional delta function

The delta function may be defined by a sequence of functions $\delta_\epsilon(x-a)$, each of 'area' unity, which have the desired limit when integrated over. We give a number of examples of how this may be done.

• Top hat:

$\delta_\epsilon(x-a) = \begin{cases} \dfrac{1}{2\epsilon} & a-\epsilon < x < a+\epsilon \\ 0 & \text{otherwise} \end{cases}$

For the top hat, we need to evaluate

$\int_{-\infty}^{+\infty} x^n\,\delta_\epsilon(x)\,dx = \frac{1}{2\epsilon}\int_{-\epsilon}^{+\epsilon} x^n\,dx = \begin{cases} \dfrac{\epsilon^n}{n+1} & n = 0, 2, 4, \ldots \\ 0 & n = 1, 3, 5, \ldots \end{cases} \;\xrightarrow[\;\epsilon\to 0\;]{}\; \begin{cases} 1 & n = 0 \\ 0 & \text{otherwise} \end{cases}$

Hence

$\int_{-\infty}^{+\infty} f(x)\,\delta_\epsilon(x-a)\,dx \;\xrightarrow[\;\epsilon\to 0\;]{}\; f(a)$

i.e.
$\delta_\epsilon(x-a) \to \delta(x-a)$

Similarly for the other representations. The Gaussian representation below is the cleanest, because it's a smooth function.

• Witch's hat:

$\delta_\epsilon(x-a) = \begin{cases} \dfrac{1}{\epsilon^2}\,\bigl[\epsilon - |x-a|\bigr] & a-\epsilon < x < a+\epsilon \\ 0 & \text{otherwise} \end{cases}$

• Gaussian:

$\delta_\epsilon(x-a) = \frac{1}{\epsilon\sqrt{\pi}}\,\exp\Bigl(-\frac{(x-a)^2}{\epsilon^2}\Bigr)$

In each case

$\int_{-\infty}^{+\infty} f(x)\,\delta_\epsilon(x-a)\,dx = \int_{-\infty}^{+\infty} f(x+a)\,\delta_\epsilon(x)\,dx = \int_{-\infty}^{+\infty} \bigl[f(a) + x\,f'(a) + \tfrac{x^2}{2}\,f''(a) + \ldots\bigr]\,\delta_\epsilon(x)\,dx \;\to\; f(a)$

where we shifted the integration variable in the first step, and Taylor-expanded the integrand in the second. The function $f(x)$ is a 'good' test function, i.e. one for which the integral is convergent for all $\epsilon$.

Notes:

(i) The Dirac delta 'function' isn't a function, it's a distribution or generalised function.

(ii) Colloquially, it's an infinitely-tall, infinitely-thin spike of unit area.

(iii) The delta function is the continuous-variable analogue of the Kronecker delta symbol. If we let $i \to x$,

$u_i\,\delta_{ij} = u_j \quad\to\quad \int dx\;x\,\delta(x - x_0) = x_0$

(iv) An important identity is

$\int_{-\infty}^{+\infty} f(x)\,\delta\bigl(g(x)\bigr)\,dx = \sum_i \frac{f(x_i)}{|g'(x_i)|}$

where $g(x_i) = 0$, i.e. the $x_i$ are the simple zeroes of $g(x)$ [tutorial].

The three-dimensional delta function

In Cartesian coordinates $(x, y, z)$,

$\delta^{(3)}(r - r_0) \equiv \delta(r - r_0) = \delta(x - x_0)\,\delta(y - y_0)\,\delta(z - z_0)$

In orthogonal curvilinear coordinates $(u_1, u_2, u_3)$,

$\delta(r - a) = \frac{1}{h_1 h_2 h_3}\,\delta(u_1 - a_1)\,\delta(u_2 - a_2)\,\delta(u_3 - a_3)$

where $h_1$, $h_2$, $h_3$ are the usual scale factors [tutorial]. (In the last equation, we set $r_0 = a$ to avoid double subscripts on the RHS.)

6.2 Coulomb's law

Experimentally, the force between two point charges $q$ and $q_1$ at positions $r$ and $r_1$, respectively, is given by Coulomb's law:

$F_1 = \frac{1}{4\pi\epsilon_0}\,\frac{q\,q_1\,(r - r_1)}{|r - r_1|^3}$

$F_1$ is the force on the charge $q$ at $r$, produced by the charge $q_1$ at $r_1$. Charges can be positive or negative. For $q q_1 > 0$ we have repulsion, and for $q q_1 < 0$ we have attraction: like charges repel and opposite charges attract.
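The sifting property of the $\delta_\epsilon$ sequences can be seen numerically. The sketch below (an illustration, not from the notes) integrates a smooth test function against the Gaussian representation and watches the result converge to $f(a)$ as $\epsilon \to 0$; the grid and cutoff are arbitrary numerical choices.

```python
import math

def delta_eps(x, a, eps):
    # Gaussian representation: (1/(eps*sqrt(pi))) * exp(-(x-a)^2 / eps^2)
    return math.exp(-((x - a) / eps) ** 2) / (eps * math.sqrt(math.pi))

def integral(f, a, eps, L=10.0, N=200001):
    # simple Riemann sum over [-L, L]; the integrand decays fast at the ends
    h = 2 * L / (N - 1)
    return sum(f(x) * delta_eps(x, a, eps) * h
               for x in (-L + i * h for i in range(N)))

f, a = math.cos, 0.5
vals = [integral(f, a, eps) for eps in (0.5, 0.1, 0.01)]
errs = [abs(v - f(a)) for v in vals]     # should shrink as eps -> 0
```

For this test function the smearing error is $O(\epsilon^2)$, so the error drops by orders of magnitude along the sequence.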
In SI units, charge is measured in Coulombs (C), and $\epsilon_0$ is defined to be $\epsilon_0 = 10^7/(4\pi c^2)\;{\rm C^2\,N^{-1}\,m^{-2}}$.

Aside: Newton's law of gravitation is similar,

$F_1 = -G\,\frac{m\,m_1\,(r - r_1)}{|r - r_1|^3}$

but gravity is always attractive (hence the negative sign, so that $G$, $m$, $m_1$ are all positive). In SI units, $G = 6.672\times 10^{-11}\;{\rm N\,m^2\,kg^{-2}}$.

6.3 The electric field

The electric field $E$ is 'produced' by a charge configuration, and is defined in terms of the force on a small positive test charge $q$:

$E(r) = \lim_{q\to 0}\,\frac{1}{q}\,F$

Clearly, $E$ is a vector field. Thus for our two charges $q$ and $q_1$ we have $F_1 = q\,E(r)$: the particle at $P$ 'feels' the electrostatic field as a force $q\,E(r)$, with

$E(r) = \frac{1}{4\pi\epsilon_0}\,\frac{q_1\,(r - r_1)}{|r - r_1|^3} \qquad (6.1)$

i.e. particle 1 'produces' an electrostatic field $E(r)$. [The figure shows the field lines produced by a negative charge.]

6.3.1 Field lines

Field lines are the 'lines of force' on the test charge. Newton's equations imply that the motion of a (test) particle is unique, which implies that the field lines do not cross, and thus that they are well-defined and can be measured.

6.3.2 The principle of superposition

Consider a set of charges $q_i$ situated at positions $r_i$. The principle of superposition states that the total electric field at $r$ is the vector sum of the fields due to the individual charges at $r_i$:

$E(r) = \frac{1}{4\pi\epsilon_0}\,\sum_i \frac{q_i\,(r - r_i)}{|r - r_i|^3}$

In the limit of (infinitely) many charges, we introduce a continuous charge density (charge/volume) $\rho(r')$, so that the charge in $dV'$ at position $r'$ is $\rho(r')\,dV'$. The electric field is then

$E(r) = \frac{1}{4\pi\epsilon_0}\int_V dV'\;\rho(r')\,\frac{r - r'}{|r - r'|^3}$

To return to our original example of a single charge $q_1$ at position $r_1$, we simply set the charge density $\rho(r') = q_1\,\delta(r' - r_1)$, which recovers the result in equation (6.1).

6.4 The electrostatic potential for a point charge

Since

$\nabla\Bigl(\frac{1}{|r - r_1|}\Bigr) = -\,\frac{r - r_1}{|r - r_1|^3} \qquad (6.2)$

where $\nabla$ operates on $r$ (not $r_1$), then for a point charge $q_1$ at $r_1$

$E(r) = \frac{1}{4\pi\epsilon_0}\,\frac{q_1\,(r - r_1)}{|r - r_1|^3} = -\,\frac{q_1}{4\pi\epsilon_0}\,\nabla\Bigl(\frac{1}{|r - r_1|}\Bigr)$

i.e. we may write

$E(r) = -\nabla\,\phi(r) \qquad\text{with}\qquad \phi(r) = \frac{1}{4\pi\epsilon_0}\,\frac{q_1}{|r - r_1|} \qquad (6.3)$

$\phi(r)$ is the electrostatic potential for the electric field $E(r)$.
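The relation $E = -\nabla\phi$ for a point charge is easy to verify by finite differences. A minimal sketch (my own check, in units where $1/(4\pi\epsilon_0) = 1$ and the charge sits at the origin):

```python
import math

def phi(x, y, z):
    # potential of a unit point charge at the origin, 1/(4*pi*eps0) = 1
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def E_exact(x, y, z):
    # E = q (r - r1) / |r - r1|^3 with q = 1, r1 = 0
    r3 = (x * x + y * y + z * z) ** 1.5
    return (x / r3, y / r3, z / r3)

p, h = (0.3, -0.4, 1.2), 1e-5
grad = ((phi(p[0] + h, p[1], p[2]) - phi(p[0] - h, p[1], p[2])) / (2 * h),
        (phi(p[0], p[1] + h, p[2]) - phi(p[0], p[1] - h, p[2])) / (2 * h),
        (phi(p[0], p[1], p[2] + h) - phi(p[0], p[1], p[2] - h)) / (2 * h))
E = E_exact(*p)
```

Central differences are accurate to $O(h^2)$, so $-\nabla\phi$ and the Coulomb field agree to many digits.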
6.5 The static Maxwell equations

6.5.1 The curl equation

For a continuous charge distribution, we again use equation (6.2) to write the electric field as a gradient:

$E(r) = \frac{1}{4\pi\epsilon_0}\int_V dV'\;\rho(r')\,\frac{r - r'}{|r - r'|^3} = -\nabla\Bigl(\frac{1}{4\pi\epsilon_0}\int_V dV'\;\frac{\rho(r')}{|r - r'|}\Bigr) \qquad (6.4)$

But the curl of the gradient of a scalar field is always zero, which implies

$\nabla\times E = 0$

for all static electric fields. This is the second (static) Maxwell equation.

6.5.2 Conservative fields and potential theory

A vector field that satisfies $\nabla\times E = 0$ is said to be conservative. Consider the integral of $\nabla\times E$ over an open surface $S$ bounded by the closed curve $C_1 - C_2$ (two paths from $a$ to $b$). Using Stokes' theorem,

$0 = \int_S (\nabla\times E)\cdot dS = \oint_{C_1 - C_2} E\cdot dr \qquad\Rightarrow\qquad \int_{C_1} E\cdot dr = \int_{C_2} E\cdot dr$

Since the line integral is independent of the path from $a$ to $b$, it can only depend on the end points. So, for some scalar field $\phi$, we must have

$-\int_a^b E\cdot dr = \phi(b) - \phi(a)$

Now let $a = r$ and $b = r + \delta r$, where $\delta r$ is small, so we can approximate the integral:

$-E(r)\cdot\delta r + \ldots = \phi(r + \delta r) - \phi(r) = \nabla\phi\cdot\delta r + \ldots$

where we used the definition of the gradient in the last step. Therefore

$E(r) = -\nabla\,\phi(r)$

$\phi(r)$ is called the potential for the vector field $E(r)$.

An explicit expression for $\phi(r)$ can be obtained from (6.4). We have $E = -\nabla\phi$ with

$\phi(r) = \frac{1}{4\pi\epsilon_0}\int_V dV'\;\frac{\rho(r')}{|r - r'|}$

This is linear superposition for potentials. As in the case of the electric field, if we set $\rho(r') = q_1\,\delta(r' - r_1)$, we recover the potential for a single charge, equation (6.3):

$\phi(r) = \frac{1}{4\pi\epsilon_0}\,\frac{q_1}{|r - r_1|}$

Notes:

• For a surface charge distribution, with charge/unit-area $\sigma(r)$, the electric field and potential produced are

$E(r) = \frac{1}{4\pi\epsilon_0}\int_S dS'\;\sigma(r')\,\frac{r - r'}{|r - r'|^3} \qquad\text{and}\qquad \phi(r) = \frac{1}{4\pi\epsilon_0}\int_S dS'\;\frac{\sigma(r')}{|r - r'|}$

where $dS'$ is the infinitesimal (scalar) element of area on the surface $S$.
• For a line distribution of charge, with charge/unit-length $\lambda(r)$,

$E(r) = \frac{1}{4\pi\epsilon_0}\int_C dl'\;\lambda(r')\,\frac{r - r'}{|r - r'|^3} \qquad\text{and}\qquad \phi(r) = \frac{1}{4\pi\epsilon_0}\int_C dl'\;\frac{\lambda(r')}{|r - r'|}$

where $dl'$ is the infinitesimal element of length along the line (or curve) $C$.

• In SI units, the potential is measured in Volts (V). In terms of other units, $\rm V = C/(C^2\,N^{-1}\,m^{-1}) = N\,m\,C^{-1} = J\,C^{-1}$.

• Field lines are perpendicular to surfaces of constant potential $\phi$, called equipotentials or equipotential surfaces. Let $dr$ be a small displacement of the position vector $r$ of a point within the equipotential surface $\phi = \text{constant}$. Then

$0 = d\phi = \nabla\phi\cdot dr$

so $E = -\nabla\phi$ is perpendicular to $dr$. Thus electric field lines $E$ are everywhere perpendicular to the surfaces $\phi = \text{constant}$.

• The potential $\phi$ is only defined up to an overall constant. If we let $\phi \to \phi + c$, the electric field $E = -\nabla\phi$ (and hence the force) is unchanged. So only potential differences have physical significance. In most physical situations, $\phi \to \text{constant}$ as $r \to \infty$, and we usually choose the constant to be zero.

• So far we've defined the potential in purely mathematical terms. Physically, the potential difference $V_{AB}$ between two points $A$ and $B$ is defined as the energy per unit charge required to move a small test charge $q$ from $A$ to $B$:

$V_{AB} \equiv \lim_{q\to 0}\frac{1}{q}\,W_{AB} = -\frac{1}{q}\int_C F\cdot dr = -\int_C E\cdot dr = \int_C \nabla\phi\cdot dr = \int_A^B d\phi = \phi_B - \phi_A$

The minus sign is because this is the work done against the force $F$. Since the field is conservative, the integral is independent of the path – it depends only on the end points.

• The potential energy of a charge $q$ at position $r$ is given by $q\,\phi(r)$. We may generalise this to a charge distribution in an external electric field $E_{\rm ext}(r) = -\nabla\phi_{\rm ext}$. In this case, the (interaction) energy is

$W = \int_V dV\;\rho(r)\,\phi_{\rm ext}(r)$

Note that this does not include the self-energy of the charge distribution. To emphasise this we write $\phi_{\rm ext}$.
[More on this later.]

6.5.3 The divergence equation

Let's return to the potential for an arbitrary charge distribution,

$\phi(r) = \frac{1}{4\pi\epsilon_0}\int_V dV'\;\frac{\rho(r')}{|r - r'|}$

Since $E = -\nabla\phi$, we have $\nabla\cdot E = -\nabla^2\phi$, and hence

$\nabla\cdot E(r) = -\frac{1}{4\pi\epsilon_0}\int_V dV'\;\rho(r')\,\nabla^2\Bigl(\frac{1}{|r - r'|}\Bigr) \qquad (6.5)$

Note that $\nabla$ acts only on $r$ (not on $r'$), so we can take it inside the integral over $r'$.

Theorem:

$\nabla^2\Bigl(\frac{1}{r}\Bigr) = -4\pi\,\delta(r) \qquad \forall\, r$

Proof: We first prove it for $r \neq 0$:

$\nabla^2\Bigl(\frac{1}{r}\Bigr) \overset{r\neq 0}{=} -\partial_i\,\frac{x_i}{r^3} = -\frac{3}{r^3} - x_i\Bigl(-\frac{3}{2}\Bigr)r^{-5}\,2x_i = -\frac{3}{r^3} + \frac{3r^2}{r^5} = 0$

To prove the result for $r = 0$, we integrate $\nabla^2(1/r)$ over an arbitrary volume $V$ containing the origin $r = 0$:

$\int_V \nabla^2\Bigl(\frac{1}{r}\Bigr)\,dV = \int_{V_\varepsilon} \nabla^2\Bigl(\frac{1}{r}\Bigr)\,dV = -\int_{V_\varepsilon} \nabla\cdot\Bigl(\frac{r}{r^3}\Bigr)\,dV = -\int_{S_\varepsilon} \frac{r}{r^3}\cdot dS = -\frac{\varepsilon}{\varepsilon^3}\,4\pi\varepsilon^2 = -4\pi$

In the first step, we used our previous result that $\nabla^2(1/r) = 0$ everywhere away from the origin to replace the original integral by an integral over a sphere of radius $\varepsilon$ centred on the origin, with volume $V_\varepsilon$ and surface $S_\varepsilon$ respectively. We then used the divergence theorem. On the surface $S_\varepsilon$, we have $r = \varepsilon\,e_r$ and $dS = e_r\,dS$, where $e_r$ is a unit vector in the direction of $r$, so the integral over the surface of the sphere is straightforward – check it! [Alternatively, we may write the surface integral as an integral over solid angle: $\int_S r\cdot dS/r^3 = \int_S d\Omega = 4\pi$.]

We can now take the limit $\varepsilon \to 0$, which simply shrinks the sphere down to the origin, leaving the integral unchanged. Since our result for the integral holds for an arbitrary volume $V$ containing the origin, and $\int_V \delta(r)\,dV = 1$, we deduce that $\nabla^2(1/r) = -4\pi\,\delta(r)$. Similarly,

$\nabla^2\Bigl(\frac{1}{|r - r'|}\Bigr) = -4\pi\,\delta(r - r')$

Substituting this result into equation (6.5) gives

$\nabla\cdot E(r) = -\frac{1}{4\pi\epsilon_0}\int_V dV'\;\rho(r')\,\bigl(-4\pi\,\delta(r - r')\bigr)$

Using the delta function to perform the integral on the right-hand side, we get

$\nabla\cdot E(r) = \frac{\rho(r)}{\epsilon_0}$

We now have the two electrostatic Maxwell equations:

$\nabla\cdot E = \frac{\rho}{\epsilon_0} \qquad\qquad \nabla\times E = 0$

In terms of the potential ($E = -\nabla\phi$),

$\nabla^2\phi = -\frac{\rho}{\epsilon_0}$

The second equation is called Poisson's equation.
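The first half of the theorem, $\nabla^2(1/r) = 0$ away from the origin, can be checked with a finite-difference Laplacian; a quick sketch (my own illustration, with an arbitrarily chosen test point and step size):

```python
import math

def f(x, y, z):
    # f = 1/r
    return 1.0 / math.sqrt(x * x + y * y + z * z)

# 7-point finite-difference Laplacian at a point well away from the origin
x, y, z, h = 0.8, -0.5, 0.3, 3e-4
lap = (f(x + h, y, z) + f(x - h, y, z)
       + f(x, y + h, z) + f(x, y - h, z)
       + f(x, y, z + h) + f(x, y, z - h) - 6 * f(x, y, z)) / h**2
```

The stencil error is $O(h^2)$, so `lap` is numerically zero; the delta-function contribution at $r = 0$ is of course invisible to a pointwise check like this, which is exactly why the proof needs the surface-integral argument.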
6.6 Electric dipole

Physically, an electric dipole consists of two nearby equal and opposite (point) charges, with charge $-q$ situated at $r_0$ and charge $+q$ at $r_0 + d$. Define the dipole moment $p = q\,d$.

It will turn out to be useful to consider the dipole limit, in which

$p = \lim_{\substack{q\to\infty \\ d\to 0}}\, q\,d$

with $p$ finite (and constant). This is sometimes called a point dipole or an ideal dipole.

6.6.1 Potential and electric field due to a dipole

Dipole potential: The electrostatic potential $\phi(r)$ produced by the dipole is

$\phi(r) = \frac{q}{4\pi\epsilon_0}\Bigl[\frac{1}{|r - r_0 - d|} - \frac{1}{|r - r_0|}\Bigr] = \frac{q}{4\pi\epsilon_0}\Bigl[\frac{1}{|r - r_0|} + \frac{d\cdot(r - r_0)}{|r - r_0|^3} + O(d^2) - \frac{1}{|r - r_0|}\Bigr]$

where we Taylor (or binomially) expanded the first term about $r - r_0$ [tutorial]. In the dipole limit, the terms of $O(q d^2)$ vanish, and the potential is simply

$\phi(r) = \frac{1}{4\pi\epsilon_0}\,\frac{p\cdot(r - r_0)}{|r - r_0|^3}$

For a dipole at the origin we have

$\phi(r) = \frac{1}{4\pi\epsilon_0}\,\frac{p\cdot r}{r^3}$

Note that $\phi(r)$ falls off as $1/r^2$.

Consider spherical polar coordinates $(r, \theta, \chi)$, with the $z$-axis chosen parallel to the dipole moment, i.e. $p = p\,e_z$. [We use $\chi$ instead of $\phi$ for the azimuthal angle in order to avoid confusion with the potential $\phi$.] Then

$\phi(r) = \frac{1}{4\pi\epsilon_0}\,\frac{p\cos\theta}{r^2} \qquad\qquad E(r) = \frac{1}{4\pi\epsilon_0}\,\frac{p}{r^3}\,\bigl(3\cos\theta\,e_r - e_z\bigr)$

We can also obtain this result using the expression for $\nabla\phi$ in spherical polar coordinates:

$E = -\nabla\phi = -\Bigl(e_r\,\frac{\partial\phi}{\partial r} + e_\theta\,\frac{1}{r}\frac{\partial\phi}{\partial\theta} + e_\chi\,\frac{1}{r\sin\theta}\frac{\partial\phi}{\partial\chi}\Bigr) = -\frac{p}{4\pi\epsilon_0}\Bigl(-\frac{2\cos\theta}{r^3}\,e_r - \frac{\sin\theta}{r^3}\,e_\theta\Bigr) = \frac{p}{4\pi\epsilon_0\,r^3}\,\bigl(2\cos\theta\,e_r + \sin\theta\,e_\theta\bigr)$

The second form can be obtained from the first by substituting $e_z = e_r\cos\theta - e_\theta\sin\theta$ [exercise: show this] into the latter.

[The sketch shows the electric field (full lines) and the equipotentials (dashed lines) for the dipole.] This picture holds in the dipole limit, but it's also valid when $r \gg d$, the 'far zone'.
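The claim that the point-dipole potential is the far-zone limit of the two-charge potential can be tested directly. A small sketch (my own check, in units with $1/(4\pi\epsilon_0) = 1$, with $-q$ at the origin and $+q$ at $(0, 0, d)$):

```python
import math

q, d = 1.0, 1e-3          # small separation, so p = q*d along z

def phi_pair(x, y, z):
    # exact two-charge potential: +q at (0,0,d), -q at the origin
    rp = math.sqrt(x * x + y * y + (z - d) ** 2)
    rm = math.sqrt(x * x + y * y + z * z)
    return q / rp - q / rm

def phi_dip(x, y, z):
    # point-dipole potential p.r / r^3 with p = q*d*e_z
    r = math.sqrt(x * x + y * y + z * z)
    return q * d * z / r**3

pt = (0.7, -0.2, 1.1)      # a far-zone point, r >> d
rel = abs(phi_pair(*pt) - phi_dip(*pt)) / abs(phi_dip(*pt))
```

The leading correction is the quadrupole term, of relative size $O(d/r)$, so `rel` is at the sub-percent level here.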
Electric field: The $i^{\rm th}$ component of the electric field due to a dipole of moment $p$ situated at the origin is

$E_i(r) = -\partial_i\phi = -\frac{1}{4\pi\epsilon_0}\,\partial_i\Bigl(\frac{p_j x_j}{r^3}\Bigr) = -\frac{1}{4\pi\epsilon_0}\,p_j\Bigl[\frac{\delta_{ij}}{r^3} + x_j\Bigl(-\frac{3}{2}\Bigr)r^{-5}\,2x_i\Bigr]$

Therefore

$E(r) = \frac{1}{4\pi\epsilon_0}\Bigl[\frac{3\,(p\cdot r)\,r}{r^5} - \frac{p}{r^3}\Bigr]$

which falls off as $1/r^3$.

6.6.2 Force, torque and energy

Force on a dipole: The force on a dipole at position $r$ due to an external field $E_{\rm ext}$ is

$F(r) = -q\,E_{\rm ext}(r) + q\,E_{\rm ext}(r + d) = -q\,E_{\rm ext}(r) + q\,\bigl[E_{\rm ext}(r) + (d\cdot\nabla)\,E_{\rm ext}(r) + \cdots\bigr]$

In the point-dipole limit,

$F(r) = (p\cdot\nabla)\,E_{\rm ext}(r)$

Torque on a dipole: The torque (or couple, or moment) on a dipole about the point $r$ where the dipole is located, due to the external electric field, is

$G(r) = -q\,\bigl(0\times E_{\rm ext}(r)\bigr) + q\,d\times E_{\rm ext}(r + d) = q\,d\times\bigl[E_{\rm ext}(r) + (d\cdot\nabla)\,E_{\rm ext}(r) + \cdots\bigr]$

Taking the dipole limit (i.e. ignoring terms of order $O(q d^2)$), we find

$G(r) = p\times E_{\rm ext}(r)$

Energy of a dipole: The energy of a dipole in an external electric field $E_{\rm ext}$ is

$W = -q\,\phi_{\rm ext}(r) + q\,\phi_{\rm ext}(r + d) = -q\,\phi_{\rm ext}(r) + q\,\bigl[\phi_{\rm ext}(r) + (d\cdot\nabla)\,\phi_{\rm ext}(r) + \cdots\bigr]$

In the dipole limit, using $E_{\rm ext} = -\nabla\phi_{\rm ext}$, we find

$W = -p\cdot E_{\rm ext}$

Is this consistent with our previous expression for the force on the dipole, $F(r) = (p\cdot\nabla)\,E_{\rm ext}(r)$? Recall the following identity for vector fields $a$ and $b$:

$\nabla(a\cdot b) = (a\cdot\nabla)\,b + (b\cdot\nabla)\,a + a\times(\nabla\times b) + b\times(\nabla\times a)$

If we set $a = p = \text{constant}$, and $b = E_{\rm ext}$, then

$\nabla(p\cdot E_{\rm ext}) = (p\cdot\nabla)\,E_{\rm ext} + p\times(\nabla\times E_{\rm ext})$

Now $\nabla\times E_{\rm ext} = 0$, so $p\times(\nabla\times E_{\rm ext}) = 0$, and hence

$F(r) = (p\cdot\nabla)\,E_{\rm ext} = \nabla(p\cdot E_{\rm ext}) = -\nabla W$

The force on the dipole is minus the gradient of the potential energy, as expected.

Examples:

• If the dipole at $r$ has dipole moment $p_1$, and the electric field $E_{\rm ext}(r)$ is due to a second dipole of moment $p_2$ at the origin, then $W = -p_1\cdot E_{\rm ext}$ with

$E_{\rm ext} = \frac{1}{4\pi\epsilon_0}\Bigl[\frac{3\,(p_2\cdot r)\,r}{r^5} - \frac{p_2}{r^3}\Bigr]$

Therefore

$W = \frac{1}{4\pi\epsilon_0}\Bigl[\frac{p_1\cdot p_2}{r^3} - \frac{3\,(r\cdot p_1)(r\cdot p_2)}{r^5}\Bigr]$

The interaction energy depends not only on the distance between the dipoles, but also on their relative orientations.

• For the case of a homogeneous (i.e. constant, independent of $r$) external field, $E_{\rm ext}(r) = E_0$, we have

$F = 0 \qquad\text{and}\qquad G(r) = p\times E_0$

So a stable or equilibrium position (i.e. position of minimum energy) occurs when $p$ is parallel to $E_0$.

6.7 The multipole expansion

Consider the case of a charge distribution, $\rho(r)$, localised in a volume $V$.
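The consistency argument $F = (p\cdot\nabla)E_{\rm ext} = \nabla(p\cdot E_{\rm ext})$ relies on $\nabla\times E_{\rm ext} = 0$, and can be checked numerically for a curl-free field. A sketch (my own illustration; the external field is taken to be that of a unit point charge at the origin, in units with $1/(4\pi\epsilon_0) = 1$, and $p$ and the evaluation point are arbitrary choices):

```python
import math

def E(x, y, z):
    # curl-free external field: unit point charge at the origin
    r3 = (x * x + y * y + z * z) ** 1.5
    return (x / r3, y / r3, z / r3)

p = (0.2, -0.5, 0.3)       # dipole moment (arbitrary)
pt = (1.0, 0.7, -0.4)      # where the dipole sits
h = 1e-5

def p_dot_grad_E(pt):
    # components of (p.grad) E by central differences
    out = []
    for comp in range(3):
        d = 0.0
        for axis in range(3):
            e1 = list(pt); e1[axis] += h
            e2 = list(pt); e2[axis] -= h
            d += p[axis] * (E(*e1)[comp] - E(*e2)[comp]) / (2 * h)
        out.append(d)
    return out

def grad_p_dot_E(pt):
    # components of grad(p.E) by central differences
    out = []
    for axis in range(3):
        e1 = list(pt); e1[axis] += h
        e2 = list(pt); e2[axis] -= h
        f1 = sum(pi * Ei for pi, Ei in zip(p, E(*e1)))
        f2 = sum(pi * Ei for pi, Ei in zip(p, E(*e2)))
        out.append((f1 - f2) / (2 * h))
    return out

a, b = p_dot_grad_E(pt), grad_p_dot_E(pt)
```

The two expressions agree component by component, as the vector identity demands for a static (curl-free) external field.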
For convenience we will take the origin inside $V$. The potential at the point $P$ (position $r$) is

$\phi(r) = \frac{1}{4\pi\epsilon_0}\int_V dV'\;\frac{\rho(r')}{|r - r'|}$

For $|r|$ much larger than the extent of $V$, i.e. $|r| \gg |r'|$ for all $|r'|$ such that $\rho(r') \neq 0$, we can expand the denominator using the binomial theorem $(1+x)^n = 1 + nx + \tfrac{n(n-1)}{2}\,x^2 + O(x^3)$:

$\frac{1}{|r - r'|} = \bigl(r^2 - 2\,r\cdot r' + r'^2\bigr)^{-1/2} = \frac{1}{r}\Bigl(1 - \frac{2\,r\cdot r'}{r^2} + \frac{r'^2}{r^2}\Bigr)^{-1/2} = \frac{1}{r}\Bigl[1 + \frac{r\cdot r'}{r^2} - \frac{1}{2}\,\frac{r'^2}{r^2} + \frac{3}{8}\Bigl(\frac{-2\,r\cdot r'}{r^2}\Bigr)^2 + O\Bigl(\frac{r'^3}{r^3}\Bigr)\Bigr]$

This can also be obtained by Taylor expansion [exercise]. Then

$\phi(r) = \frac{1}{4\pi\epsilon_0}\int_V dV'\;\rho(r')\,\Bigl[\frac{1}{r} + \frac{r'\cdot r}{r^3} + \frac{3\,(r'\cdot r)^2 - r^2\,r'^2}{2\,r^5} + \ldots\Bigr]$

This gives the multipole expansion for the potential:

$\phi(r) = \frac{1}{4\pi\epsilon_0}\,\frac{Q}{r} + \frac{1}{4\pi\epsilon_0}\,\frac{p\cdot r}{r^3} + \frac{1}{4\pi\epsilon_0}\,\frac{Q_{ij}\,x_i x_j}{2\,r^5} + \ldots$

where

$Q = \int_V dV'\;\rho(r')$ is the total charge within $V$;

$p = \int_V dV'\;r'\,\rho(r')$ is the dipole moment about the origin;

$Q_{ij} = \int_V dV'\;\bigl(3\,x'_i x'_j - r'^2\,\delta_{ij}\bigr)\,\rho(r')$ is the quadrupole tensor.

The multipole expansion is valid in the far zone, i.e. when $r \gg r_0$, with $r_0$ the size of the charge distribution.

• If $Q \neq 0$, the monopole term dominates:

$\phi(r) = \frac{1}{4\pi\epsilon_0}\,\frac{Q}{r}$

and in the far zone, $r \gg r_0$, the $E$ field is that of a point charge at the origin.

• When the total charge $Q = 0$, the dipole term dominates:

$\phi(r) = \frac{1}{4\pi\epsilon_0}\,\frac{p\cdot r}{r^3}$

If the charge density is given by two equal but oppositely-charged particles close together, i.e. $\rho(r') = q\,\bigl[\delta(r' - d) - \delta(r')\bigr]$, then

$p = \int dV'\;r'\,q\,\bigl[\delta(r' - d) - \delta(r')\bigr] = q\,d$

which is the dipole moment as defined previously, and hence justifies the name.

• If $Q = 0$ and $p = 0$, the quadrupole term dominates:

$\phi(r) = \frac{1}{4\pi\epsilon_0}\,\frac{Q_{ij}\,x_i x_j}{2\,r^5}$

The quadrupole tensor $Q_{ij}$ is symmetric, $Q_{ij} = Q_{ji}$, and traceless, $Q_{ii} = 0$.

Why quadrupole? A simple linear quadrupole is defined by placing two dipoles (so four charges) 'back to back' with equal and opposite dipole moments: charge $-2q$ at $r'$ and charges $+q$ at $r' \pm d$. From the figure,

$\phi(r) = \frac{1}{4\pi\epsilon_0}\Bigl[\frac{q}{|r - r' - d|} + \frac{q}{|r - r' + d|} - \frac{2q}{|r - r'|}\Bigr]$

Expanding the denominators in the usual way, and defining $\rho = r - r'$, the leading ($1/\rho$) and dipole terms cancel, so for large $\rho$

$\phi(r) = \frac{q}{4\pi\epsilon_0}\,\frac{3\,(\rho\cdot d)^2 - d^2\rho^2}{\rho^5} = \frac{1}{4\pi\epsilon_0}\,\frac{Q_{ij}\,\rho_i\,\rho_j}{\rho^5}$

where $Q_{ij} = q\,(3\,d_i d_j - \delta_{ij}\,d^2)$ is a (traceless, symmetric) quadrupole tensor, as above. The quadrupole moment is sometimes defined to be $Q = 2qd^2$, where the '2' is conventional.

6.7.1 Worked example

The region inside the sphere $r < a$ contains a charge density

$\rho(x, y, z) = f\,z\,(a^2 - r^2)$
where $f$ is a constant. Show that at large distances from the origin, the potential due to the charge distribution is given approximately by

$\phi(r) = \frac{2 f a^7\,z}{105\,\epsilon_0\,r^3}$

The multipole expansion gives

$\phi(r) = \frac{1}{4\pi\epsilon_0}\Bigl[\frac{Q}{r} + \frac{P\cdot r}{r^3} + O\Bigl(\frac{1}{r^3}\Bigr)\Bigr]$

In spherical polars $(r, \theta, \chi)$: $x = r\sin\theta\cos\chi$, $y = r\sin\theta\sin\chi$, $z = r\cos\theta$.

The total charge $Q$ is (we drop the primes in this calculation for brevity)

$Q = \int_V \rho(r)\,dV = f\int_0^{2\pi}\!\!\int_0^\pi\!\!\int_0^a r\cos\theta\,(a^2 - r^2)\;r^2\sin\theta\;dr\,d\theta\,d\chi = 0$

This integral vanishes because $\displaystyle\int_0^\pi \cos\theta\,\sin\theta\;d\theta = \frac{1}{2}\int_0^\pi \sin(2\theta)\;d\theta = 0$.

The total dipole moment $P$ about the origin is

$P = \int_V r\,\rho(r)\,dV = \int_V r\,e_r\,\rho(r)\,dV = f\int_0^{2\pi}\!\!\int_0^\pi\!\!\int_0^a r\,\bigl(\sin\theta\cos\chi\,e_1 + \sin\theta\sin\chi\,e_2 + \cos\theta\,e_3\bigr)\;r\cos\theta\,(a^2 - r^2)\;r^2\sin\theta\;dr\,d\theta\,d\chi$

The $x$ and $y$ components vanish upon doing the $\chi$ integral.
The $z$ component factorises:

$P_z = f\int_0^{2\pi} d\chi\;\int_0^\pi \sin\theta\cos^2\theta\;d\theta\;\int_0^a r^4\,(a^2 - r^2)\;dr = f\cdot 2\pi\cdot\frac{2}{3}\cdot\frac{2a^7}{35} = \frac{8\pi a^7 f}{105}$

Putting it all together, we obtain

$\phi(r) = \frac{1}{4\pi\epsilon_0}\,\frac{8\pi a^7 f}{105}\,\frac{e_3\cdot r}{r^3} = \frac{2 f a^7\,z}{105\,\epsilon_0\,r^3}$

6.7.2 Interaction energy of a charge distribution

Let's consider the interaction energy $W$ of an arbitrary (but bounded) charge distribution in an external electric field $E_{\rm ext} = -\nabla\phi_{\rm ext}$:

$W = \int_V dV\;\rho(r)\,\phi_{\rm ext}(r)$

Taylor-expand $\phi_{\rm ext}(r)$ about the origin:

$\phi_{\rm ext}(r) = \phi_{\rm ext}(0) + r\cdot\nabla\phi_{\rm ext}(0) + \frac{1}{2!}\,(r\cdot\nabla)^2\,\phi_{\rm ext}(0) + \ldots = \phi_{\rm ext}(0) - r\cdot E_{\rm ext}(0) - \frac{1}{2}\,x_i x_j\,\partial_j E_{{\rm ext}\,i}(0) + \ldots$

The last term may be rewritten as $-\tfrac{1}{6}\,(3\,x_i x_j - r^2\delta_{ij})\,\partial_j E_{{\rm ext}\,i}(0)$, because the external field satisfies $\nabla\cdot E_{\rm ext} = 0$. Therefore

$W = Q\,\phi_{\rm ext}(0) - p\cdot E_{\rm ext}(0) - \tfrac{1}{6}\,Q_{ij}\,\partial_j E_{{\rm ext}\,i}(0) + \ldots$

The physical picture is that the (total) charge interacts with the external potential $\phi_{\rm ext}$, the dipole moment with the external field $E_{\rm ext}$, and the quadrupole moment with the (spatial) derivative of the external field.

6.7.3 A brute-force calculation – the circular disc

The electric field $E$ and potential $\phi$ can be evaluated exactly for a number of interesting symmetric charge distributions. We give one example before moving on to more powerful techniques.

A circular disc of radius $a$ carries uniform surface charge density $\sigma$. Find the electric field and the potential due to the disc on the axis of symmetry.

Electric field: Start with the general expression

$E(r) = \frac{1}{4\pi\epsilon_0}\int_S dS'\;\sigma(r')\,\frac{r - r'}{|r - r'|^3}$

Choose the $z$ axis parallel to the axis of symmetry, with the origin at the centre of the disc, so that $r'$ lies in the $x$-$y$ plane. On the $z$ axis, in cylindrical coordinates $(\rho, \chi, z)$, $|r - r'| = (\rho^2 + z^2)^{1/2}$.

By symmetry, the electric field on the $z$ axis will be parallel to the $z$ axis, because the contributions to the components $E_x$ and $E_y$ cancel (i.e. integrate to zero). Therefore we need only calculate $E_z$:

$E(z) = \frac{1}{4\pi\epsilon_0}\int_0^a\!\!\int_0^{2\pi} \rho\,d\rho\,d\chi\;\sigma\,\frac{z}{(\rho^2 + z^2)^{3/2}}\;e_z = \frac{\sigma}{4\pi\epsilon_0}\,2\pi\int_0^a \frac{z\,\rho\,d\rho}{(\rho^2 + z^2)^{3/2}}\;e_z = \frac{\sigma z}{2\epsilon_0}\Bigl[\frac{-1}{(\rho^2 + z^2)^{1/2}}\Bigr]_{\rho=0}^{\rho=a}\,e_z$

$\phantom{E(z)} = \frac{\sigma z}{2\epsilon_0}\Bigl[\frac{1}{|z|} - \frac{1}{(a^2 + z^2)^{1/2}}\Bigr]\,e_z = \frac{\sigma}{2\epsilon_0}\Bigl[\frac{z}{|z|} - \frac{z}{(a^2 + z^2)^{1/2}}\Bigr]\,e_z$

Consider two limits:

(i) $z \gg a$: Expand the second term in square brackets:

$\frac{1}{(a^2 + z^2)^{1/2}} = \frac{1}{|z|}\,\bigl(1 + a^2/z^2\bigr)^{-1/2} = \frac{1}{|z|}\Bigl[1 - \frac{1}{2}\,\frac{a^2}{z^2} + O\Bigl(\frac{a^4}{z^4}\Bigr)\Bigr]$

Keeping only the leading term, we have

$E(z) = {\rm sgn}(z)\,\frac{\sigma a^2}{4\epsilon_0\,z^2}\;e_z = {\rm sgn}(z)\,\frac{Q}{4\pi\epsilon_0\,z^2}\;e_z$

where the signum (or sign) function ${\rm sgn}(z) \equiv z/|z|$ is $+1$ for $z > 0$, and $-1$ for $z < 0$; and $Q = \sigma\pi a^2$ is the total charge on the disc.
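The worked example's integrals are a good target for a numerical sanity check. The sketch below (my own check, not from the notes) does the $r$ and $\theta$ integrals by the midpoint rule, with the trivial $\chi$ integral supplying a factor $2\pi$, and confirms $Q = 0$ and $P_z = 8\pi a^7 f/105$:

```python
import math

a, fconst = 1.0, 1.0          # sphere radius and the constant f (arbitrary)
Nr, Nt = 400, 400
dr, dt = a / Nr, math.pi / Nt

Q = Pz = 0.0
for i in range(Nr):
    r = (i + 0.5) * dr
    for j in range(Nt):
        th = (j + 0.5) * dt
        rho = fconst * r * math.cos(th) * (a * a - r * r)   # z = r cos(theta)
        dV = 2 * math.pi * r * r * math.sin(th) * dr * dt   # chi done already
        Q += rho * dV
        Pz += r * math.cos(th) * rho * dV                   # z-moment of rho

exact = 8 * math.pi * a**7 * fconst / 105
```

`Q` vanishes by the $\theta$-symmetry of the grid, and `Pz` matches the analytic value to midpoint-rule accuracy.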
In the far zone, we recover the field of a point charge, as expected.

(ii) $a \gg z$: In this case, the leading behaviour is obtained by dropping the second term in square brackets:

$E(z) = {\rm sgn}(z)\,\frac{\sigma}{2\epsilon_0}\;e_z$

This is the electric field due to an infinite charged surface – see later.

Potential: Start with the general expression

$\phi(r) = \frac{1}{4\pi\epsilon_0}\int_S dS'\;\frac{\sigma(r')}{|r - r'|}$

For the disc, we have

$\phi(z) = \frac{\sigma}{4\pi\epsilon_0}\int_0^a\!\!\int_0^{2\pi}\frac{\rho\,d\rho\,d\chi}{(\rho^2 + z^2)^{1/2}} = \frac{\sigma}{2\epsilon_0}\,\Bigl[(\rho^2 + z^2)^{1/2}\Bigr]_{\rho=0}^{\rho=a} = \frac{\sigma}{2\epsilon_0}\,\Bigl[(a^2 + z^2)^{1/2} - |z|\Bigr]$

Note: It's often very much easier to find $\phi$ than $E$!

(i) $z \gg a$: Expanding as before, we find

$\phi(z) = \frac{\sigma a^2}{4\epsilon_0\,|z|} = \frac{Q}{4\pi\epsilon_0\,|z|}$

as expected.

(ii) $a \gg z$: In this case we find a linear potential:

$\phi(z) = \frac{\sigma}{2\epsilon_0}\,\bigl(a - |z|\bigr) = -\frac{\sigma}{2\epsilon_0}\,|z| + \text{constant}$

Exercise: Check that $E(z) = -\dfrac{\partial\phi}{\partial z}\,e_z$ in each case.

6.8 Gauss' law

We showed previously that the electric field, $E(r)$, due to a charge distribution, $\rho(r)$, satisfies $\nabla\cdot E = \rho/\epsilon_0$. This is the differential form of Maxwell's first equation. Integrating $\nabla\cdot E = \rho/\epsilon_0$ over a volume $V$, bounded by a closed surface $S$, and using the divergence theorem $\int_V \nabla\cdot E\;dV = \int_S E\cdot dS$, we obtain Gauss' law:

$\int_S E\cdot dS = \frac{1}{\epsilon_0}\int_V \rho\;dV = \frac{Q}{\epsilon_0}$

where $Q$ is the total charge enclosed by the volume $V$. Gauss' law is extremely useful, particularly for problems with a symmetry, and also for solving problems in potential theory. [Gauss' law is also known as Gauss' theorem.]
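The closed-form on-axis disc field can be checked against a direct numerical surface integral; a short sketch (my own check, in units with $1/(4\pi\epsilon_0) = 1$, in which $\sigma/2\epsilon_0 \to 2\pi\sigma$):

```python
import math

a, sigma, z = 1.0, 1.0, 0.5     # disc radius, charge density, field point
Nr = 2000
dr = a / Nr

# numerical integral: E_z = integral sigma * 2*pi*rho*drho * z/(rho^2+z^2)^{3/2}
Ez = 0.0
for i in range(Nr):
    rho = (i + 0.5) * dr
    Ez += sigma * 2 * math.pi * rho * dr * z / (rho * rho + z * z) ** 1.5

sgn = 1.0 if z > 0 else -1.0
closed = 2 * math.pi * sigma * (sgn - z / math.sqrt(a * a + z * z))
```

The midpoint rule reproduces the closed form to high accuracy; repeating with $z \gg a$ or $z \ll a$ exhibits the two limits discussed above.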
Examples:

• Consider a sphere of radius $a$, centred on the origin, with uniform charge density $\rho_0 = Q/(\tfrac{4}{3}\pi a^3)$. By symmetry, the electric field will point radially outwards, so that $E(r) = E_r(r)\,e_r$. Integrating over a sphere of radius $r$, Gauss' law gives

$\int_S E\cdot dS = E_r(r)\;4\pi r^2 = \begin{cases} \tfrac{4}{3}\pi r^3\,\rho_0/\epsilon_0 & r \le a \\ \tfrac{4}{3}\pi a^3\,\rho_0/\epsilon_0 & r \ge a \end{cases}$

This gives

$E(r) = \frac{Q}{4\pi\epsilon_0}\times\begin{cases} \dfrac{r}{a^3} & r \le a \\[1ex] \dfrac{r}{r^3} & r \ge a \end{cases}$

Outside the sphere, the electric field appears to come from a point source, and inside it increases linearly with $r$. [The sketch shows $|E|$ rising linearly to $r = a$, then falling off as $1/r^2$.]

We can obtain the electrostatic potential from $E = E_r\,e_r = -\nabla\phi = -\dfrac{\partial\phi}{\partial r}\,e_r$. Integrating with respect to $r$ gives

$\phi(r) = \frac{Q}{4\pi\epsilon_0}\times\begin{cases} \dfrac{3a^2 - r^2}{2a^3} & r \le a \\[1ex] \dfrac{1}{r} & r \ge a \end{cases}$

where for $r > a$ we choose the constant of integration to be zero so that $\phi \to 0$ as $r \to \infty$. The potential outside the sphere is again that of a point charge. For $r < a$, we choose the constant of integration so that $\phi$ is continuous across the boundary $r = a$. Note that the derivative of $E_r$ is discontinuous at the boundary, so the graph of $|E|$ has a cusp there. (Note that here we computed $E$ first and then $\phi$; this goes against the general rule that it is easier to compute the potential (a scalar) first rather than the electric field (a vector)!)

• Consider a long (infinite) straight wire with constant charge/unit length $\lambda$. Using cylindrical coordinates with the $z$ axis parallel to the wire, we integrate over a cylinder of length $L$ and radius $\rho$ with its axis along the wire. By symmetry we must have $E = E_\rho(\rho)\,e_\rho$. Using Gauss' law, we get

$\int_S E\cdot dS = E_\rho(\rho)\,2\pi\rho L + \underbrace{0}_{\rm ends} = \frac{\lambda L}{\epsilon_0}$

This gives

$E = \frac{\lambda}{2\pi\epsilon_0}\,\frac{1}{\rho}\;e_\rho$

The potential can be found by integrating the electric field with respect to $\rho$:

$\phi(r) = -\frac{\lambda}{2\pi\epsilon_0}\,\ln\Bigl(\frac{\rho}{\rho_0}\Bigr)$

where $\rho_0$ is a constant of integration.

• Infinite flat sheet of charge with constant charge density $\sigma$ per unit area: integrate over a cylindrical 'Gaussian pill box' with axis perpendicular to the sheet. See tutorial, and below.
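Gauss' law also holds for surfaces without the symmetry of the source, which makes a nice numerical test. The sketch below (my own illustration, in units with $1/(4\pi\epsilon_0) = 1$, so the total flux is $q/\epsilon_0 = 4\pi q$) integrates the flux of a point-charge field through one face of a cube centred on the charge; by symmetry that face carries one sixth of the total:

```python
import math

q, L, N = 1.0, 2.0, 400      # charge at the origin, cube of side L
h = L / N

# flux of E = q r / r^3 through the face z = L/2, by the midpoint rule
F = 0.0
for i in range(N):
    for j in range(N):
        x = -L / 2 + (i + 0.5) * h
        y = -L / 2 + (j + 0.5) * h
        z = L / 2
        r3 = (x * x + y * y + z * z) ** 1.5
        F += q * z / r3 * h * h          # E_z * dS on this face
```

Six times the one-face flux recovers $4\pi q$, independent of the cube size `L`, exactly as Gauss' law requires.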
6.9 Boundaries

Useful results for the changes in the normal and tangential components of the electric field across a boundary may be obtained using Gauss' law and Stokes' theorem. Consider a surface carrying surface charge density $\sigma$. The electric field on one side of the boundary is $E_1$, and on the other $E_2$. The unit normal to the surface, $n$, points from side 1 to side 2.

6.9.1 Normal component

Applying Gauss' law to a small cylindrical 'Gaussian pillbox' of cross-sectional area $A$ and infinitesimal height $\delta l$ straddling the surface,

$\int_S E\cdot dS = \frac{1}{\epsilon_0}\int_V \rho\;dV$

gives

$\bigl(E_{2\perp} - E_{1\perp}\bigr)\,A = \frac{\sigma A}{\epsilon_0}$

where $E_{1\perp}$ and $E_{2\perp}$ are the components of the electric field perpendicular to the surface. This gives

$n\cdot\bigl(E_2 - E_1\bigr) = \frac{\sigma}{\epsilon_0}$

Thus the normal component of $E$ is discontinuous across a charged boundary, with discontinuity proportional to the surface charge density.

6.9.2 Tangential component

Let us apply Stokes' theorem to a rectangle of length $l$ and infinitesimal width $\delta l$ straddling the surface:

$\int_S (\nabla\times E)\cdot dS = \oint_C E\cdot dr$

which gives

$0 = \oint_C E\cdot dr = \bigl(E_{2\parallel} - E_{1\parallel}\bigr)\,l$

where $E_{1\parallel}$ and $E_{2\parallel}$ are the components of the electric field parallel to the boundary. This can be written as

$n\times E_1 = n\times E_2$

where we used the fact that the cross product of the electric field $E$ with $n$ picks out the tangential component $E_\parallel$ of the electric field, since

$E = E_\perp\,n + E_\parallel$

Thus the tangential component of $E$ is continuous across a charged boundary.

Taking the potential $\phi$ to be continuous across the boundary means that the gradient, and hence the electric field $E$, can be discontinuous across the boundary when $\sigma \neq 0$.

6.9.3 Conductors

Physically, a conductor is a material in which 'free' or 'surplus' electrons can move (or flow) freely when an electric field is applied. In electrostatics:

• For a conductor in equilibrium, all the charge resides on the surface of the conductor, i.e. $\rho = 0$ inside a conductor.
This holds because if $\rho \neq 0$ then, by Maxwell's first equation $\nabla\cdot E = \rho/\epsilon_0$ (or Gauss' law), we must have $E \neq 0$, and hence the charge would move and we wouldn't have equilibrium – a contradiction. So $E = 0$, and hence $\phi = \text{constant}$, everywhere inside a conductor.

• The electric field on the surface of a conductor is normal to the surface, i.e. $E \parallel n$; otherwise charge would move along the surface. Thus if $dr$ is a displacement on the surface of a conductor, $E\cdot dr = -d\phi = 0$, so $\phi = \text{constant}$ on the surface of a conductor, i.e. the surface is an equipotential.

Therefore, on the surface of a conductor, we have

$E_t = 0 \qquad\qquad E_n = \frac{\sigma}{\epsilon_0}$

An external electric field induces a charge on the surface of the conductor, which in turn deforms the external field so that it is perpendicular to the conductor's surface. In the case of a conductor, the surface charge is calculated from the electric field (and not vice versa, as is the usual case).

For insulators we have the opposite situation – the charges are fixed, and we must calculate the electric field and the potential from the charge density.

© Copyright 2018