# Electromagnetism & Relativity Books Brian Pendleton

```Electromagnetism & Relativity
[PHYS10093] Semester 1, 2014/15
Brian Pendleton
• Email: [email protected]
• Office: JCMB 4413
• Phone: 0131-650-5241
• Web: http://www.ph.ed.ac.uk/~bjp/emr/

Books
The course should be self-contained, but it’s always good to read textbooks to
• David J Griffiths, Introduction to Electrodynamics (Prentice Hall).
• JD Jackson, Classical Electrodynamics (Wiley) – advanced, good for next year.
• KF Riley, MP Hobson and SJ Bence, Mathematical Methods for Physics and Engineering (CUP 1998).
• PC Matthews, Vector Calculus (Springer 1998).
• ML Boas, Mathematical Methods in the Physical Sciences (Wiley 2006).
• GB Arfken and HJ Weber, Mathematical Methods for Physicists (Academic Press 2001).
• DE Bourne and PC Kendall, Vector Analysis and Cartesian Tensors (Chapman and Hall 1993).
Griffiths is the main course text; Jackson is pretty advanced, but it will also be good for Classical Electrodynamics next year.
The other books are useful for the first part of the course, which will cover vectors, matrices and tensors.
November 11, 2014
Syllabus
From DRPS...

Semester 1: Kinematics, Electrostatics and Magnetostatics
• Vectors, bases, Einstein summation convention, the delta & epsilon symbols, matrices, determinants. [1]
• Rotations of bases, composition of two rotations, reflections, projection operators, passive and active transformations, the rotational symmetry group. [2]
• Cartesian tensors: definition/transformation properties and rank, quotient theorem, pseudo-tensors, the delta and epsilon symbols as tensors. [2]
• Examples of tensors: moment of inertia tensor, rotation of solid bodies, stress and strain tensors, and elastic deformations of solid bodies, ideal fluid flow. [3]
• Electric charge and charge density: Coulomb’s law: linear superposition, Electrostatic potential: equipotentials: derivation of Gauss’ Law in integral and differential form, Electrostatic Energy: Energy in the electric field, Electric dipoles: Force, Torque and Energy for a Dipole: the Multipole expansion. [3]
• Perfect conductors: surface charge: pill box boundary conditions at the surface of a conductor: uniqueness theorem: boundary value problems, Linear dielectrics: D and E, boundaries between dielectrics, boundary value problems. [3]
• Currents in bulk, surfaces, and wires, current conservation: Ohm’s Law, conductivity tensor: EMF. [2]
• Forces between current loops: Biot-Savart Law for the magnetic field, Ampere’s Law in differential and integral form, pill-box boundary conditions with surface currents. [2]
• The vector potential: gauge ambiguity: magnetic dipoles: magnetic moment and angular momentum: force and torque on magnetic dipoles. [2]
• Magnetization: B and H, boundaries between magnetic materials, boundary-value problems. [2]

Semester 2: Dynamics, Electromagnetism and Relativity
• Dynamics of point particles in gravitational, electric and magnetic fields, inertial systems, Invariance under Galilean translations and rotations. [1]
• Motional EMF: Lenz’s Law: Faraday’s Law in integral and differential form, mutual Inductance: Self Inductance: Energy stored in inductance: Energy in the magnetic field, simple AC circuits (LCR): use of complex notation for oscillating solutions, impedance. [3]
• The displacement current and charge conservation: Maxwell’s Equations, Energy conservation from Maxwell’s eqns: Poynting vector, Momentum conservation for EM fields: stress tensor: angular momentum. [3]
• Plane Wave solutions of free Maxwell equations: prediction of speed of light, Polarization, linear and circular, in complex notation: energy and momentum for EM waves. [2]
• Plane waves in conductors: skin depth: reflection of plane waves from conductors, Waveguides and cavities: lasers, Reflection and refraction at dielectric boundaries: derivation of the Fresnel equations, Interference and diffraction, single and double slits. [3]
• Physical basis of Special Relativity: the Michelson-Morley experiment, Einstein’s postulates, Lorentz transformations, time dilation and Fitzgerald contraction, addition of velocities, rapidity, Doppler effect and aberration, Minkowski diagrams. [3]
• Non-orthogonal co-ordinates, covariant and contravariant tensors, covariant formulation of classical mechanics, position, velocity, momentum and force 4-vectors, particle collisions. [2]
• Relativistic formulation of electromagnetism from the Lorentz force, Maxwell tensor, covariant formulation of Maxwell’s equations, Lorentz transformation of the electric and magnetic fields, invariants, stress energy tensor, the electromagnetic potential, Lorenz gauge. [3]
• Generation of radiation by oscillating charges: wave equations for potentials: spherical waves: causality: the Hertzian dipole. [2]

The figures in square brackets are the estimated number of lectures for each topic. This will be a very poor estimate in many cases!
Contents

1 Vectors, matrices & determinants
  1.1 Cartesian vectors, δ and ε symbols
    1.1.1 Cartesian components of a vector
    1.1.2 Index (or suffix) notation
    1.1.3 Scalar product
    1.1.4 Free indices and repeated indices
    1.1.5 The vector product and the Levi-Civita symbol ε_ijk
    1.1.6 Relation between ε and δ
    1.1.7 Grad, div and curl in index notation
  1.2 Matrices and determinants
    1.2.1 Matrices
    1.2.2 Determinants
    1.2.3 Linear equations
    1.2.4 Orthogonal matrices

2 Rotations, reflections & inversions
  2.1 Rotation of basis (or axes)
    2.1.1 Composition of two rotations
  2.2 Rotation of vectors
    2.2.1 Rotation about an arbitrary axis
  2.3 Reflections and inversions
  2.4 Projection operators
  2.5 Active and passive transformations

3 Cartesian tensors
  3.1 Definition and transformation properties
    3.1.1 Definition of a tensor
    3.1.2 Fields
    3.1.3 Dyadic notation
    3.1.4 Internal consistency in the definition of a tensor
    3.1.5 Properties of Cartesian tensors
    3.1.6 The quotient theorem
  3.2 Pseudotensors, pseudovectors & pseudoscalars
    3.2.1 Vectors
    3.2.2 Pseudovectors
  3.3 Some examples
  3.4 Invariant/Isotropic Tensors

4 Taylor expansions
  4.1 The one-dimensional case
    4.1.1 Examples
    4.1.2 A precursor to the three-dimensional case
  4.2 The three-dimensional case

5 The moment of inertia tensor
  5.1 Angular momentum and kinetic energy
    5.1.1 Angular momentum
    5.1.2 Kinetic energy
    5.1.3 The parallel axes theorem
    5.1.4 Diagonalisation of rank-two tensors

6 Electrostatics
  6.1 The Dirac delta function in three dimensions
  6.2 Coulomb’s law
  6.3 The electric field
    6.3.1 Field lines
    6.3.2 The principle of superposition
  6.4 The electrostatic potential for a point charge
  6.5 The static Maxwell equations
    6.5.1 The curl equation
    6.5.2 Conservative fields and potential theory
    6.5.3 The divergence equation
  6.6 Electric dipole
    6.6.1 Potential and electric field due to a dipole
    6.6.2 Force, torque and energy
  6.7 The multipole expansion
    6.7.1 Worked example
    6.7.2 Interaction energy of a charged distribution
    6.7.3 A brute-force calculation - the circular disc
  6.8 Gauss’ law
  6.9 Boundaries
    6.9.1 Normal component
    6.9.2 Tangential component
    6.9.3 Conductors
Chapter 1
Vectors, matrices & determinants

This course will run for the first time in academic year 2014/15. It begins with a lengthy treatment of vectors, matrices and tensors, which covers a variety of topics that we didn’t (or couldn’t) do in last year’s Vector Calculus course.

1.1 Cartesian vectors, δ and ε symbols

We shall work mostly in three-dimensional real space, but we’ll generalise to an arbitrary number of dimensions when appropriate.
We have scalars, denoted by one number (e.g. temperature), and vectors, characterised by a direction and length (e.g. velocity). We shall consider two classes of vectors:
• Displacement vector (from an arbitrary point A to B): $\vec{a}$ or $\overrightarrow{AB}$
• Position vector (from some fixed origin O to a point P): $\vec{x}$ or $\vec{r}$

1.1.1 Cartesian components of a vector

The Cartesian components of a vector are projections onto 3 orthogonal axes:
[Figure: right-handed Cartesian axes, labelled $\vec{e}_1$ (also $\vec{e}_x$, $\vec{i}$), $\vec{e}_2$ (also $\vec{e}_y$, $\vec{j}$) and $\vec{e}_3$ (also $\vec{e}_z$, $\vec{k}$).]
We generally denote the set of 3 orthonormal basis vectors by $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$, occasionally by $\{\mathbf{e}_x, \mathbf{e}_y, \mathbf{e}_z\}$, but never by $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$.¹
We use right-handed (RH) axes. By convention, $\mathbf{e}_3$ is chosen as the direction of a screw turned from $\mathbf{e}_1$ towards $\mathbf{e}_2$.
We write
\[ \mathbf{a} = a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + a_3 \mathbf{e}_3 \equiv (a_1, a_2, a_3) \]
i.e. the vector can be denoted by a set of three numbers $(a_1, a_2, a_3)$. Its length is
\[ |\mathbf{a}| \equiv a = \sqrt{a_1^2 + a_2^2 + a_3^2} \]

¹ The latter notation can be very confusing when using indices and the Einstein summation convention.

1.1.2 Index (or suffix) notation

Index/suffix notation and the Einstein summation convention save us a huge amount of writing in long expressions involving vectors, matrices and tensors.
Example: We can write the vector $\mathbf{a}$ in Cartesians as
\[
\begin{aligned}
\mathbf{a} &= a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + a_3 \mathbf{e}_3 \\
&= \sum_{i=1}^{3} a_i \mathbf{e}_i \\
&\equiv a_i \mathbf{e}_i \equiv a_j \mathbf{e}_j \equiv a_p \mathbf{e}_p \equiv \cdots
\end{aligned}
\]
In the third line we introduced the Einstein summation convention, in which we sum over all repeated indices. We omit the summation sign, and instead there is an implicit sum from 1 to 3 over the index that occurs exactly twice in an expression.
We can use any letter for the index ($i$, $j$, $p$, etc.) that we sum over. For this reason, repeated indices are sometimes known as ‘dummy’ indices.
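Einstein summation convention
The basic rules are
• An index that occurs only once is not summed over, and is called a free index.
• Two identical (repeated, dummy) indices are summed over.
• Three identical indices are not permitted when using the summation convention.
The last rule arises because there is rarely a need for three (or more) identical indices in vector or matrix algebra, or in vector calculus - as we shall see.

1.1.3 Scalar product

Geometrical version: Consider three non-parallel vectors $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$, which form the sides of a triangle.
[Figure: triangle with sides $\vec{a}$, $\vec{b}$ and $\vec{c} = \vec{b} - \vec{a}$; $\theta$ is the angle between $\vec{a}$ and $\vec{b}$.]
Define the scalar product as
\[ \mathbf{a}\cdot\mathbf{b} \equiv a b \cos\theta \]
where $a = |\mathbf{a}|$ and $b = |\mathbf{b}|$. Clearly $\mathbf{a}\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{a}$.
The length-squared of $\mathbf{c}$ is
\[ c^2 \equiv |\mathbf{c}|^2 = |\mathbf{b}-\mathbf{a}|^2 = (\mathbf{b}-\mathbf{a})\cdot(\mathbf{b}-\mathbf{a}) = b^2 + a^2 - 2\,\mathbf{a}\cdot\mathbf{b} \]
[which is the cosine rule]
Hence
\[
\begin{aligned}
\mathbf{a}\cdot\mathbf{b} &= \tfrac{1}{2}\left( b^2 + a^2 - c^2 \right) \\
&= \tfrac{1}{2}\left( b_1^2 + b_2^2 + b_3^2 + a_1^2 + a_2^2 + a_3^2 - (b_1-a_1)^2 - (b_2-a_2)^2 - (b_3-a_3)^2 \right) \\
&= a_1 b_1 + a_2 b_2 + a_3 b_3 = \sum_{i=1}^{3} a_i b_i
\end{aligned}
\]
Using the summation convention, we omit the summation sign and write the scalar product in Cartesian coordinates as
\[ \mathbf{a}\cdot\mathbf{b} = a_i b_i \]

In index notation:
\[ \mathbf{a}\cdot\mathbf{b} = (a_i \mathbf{e}_i)\cdot(b_j \mathbf{e}_j) = a_i b_j\, \mathbf{e}_i\cdot\mathbf{e}_j \]
The parentheses are for clarity only – we don’t require them.
For orthonormal basis vectors $\{\mathbf{e}_i\}$, we have
\[ \mathbf{e}_1\cdot\mathbf{e}_1 = \mathbf{e}_2\cdot\mathbf{e}_2 = \mathbf{e}_3\cdot\mathbf{e}_3 = 1 \qquad [\theta = 0] \]
and
\[ \mathbf{e}_1\cdot\mathbf{e}_2 = \mathbf{e}_2\cdot\mathbf{e}_3 = \mathbf{e}_3\cdot\mathbf{e}_1 = 0 \qquad [\theta = \pi/2] \]
These relations can be expressed succinctly using the Kronecker delta symbol $\delta_{ij}$. For all $i, j = 1, 2, 3$:
\[ \delta_{ij} \equiv \begin{cases} 1 & i = j \\ 0 & \text{otherwise} \end{cases} \]
The orthonormality conditions can then be expressed as
\[ \mathbf{e}_i\cdot\mathbf{e}_j = \delta_{ij} \]
The index-notation expression for the scalar product above becomes
\[ \mathbf{a}\cdot\mathbf{b} = a_i b_j \delta_{ij} = a_i \delta_{ij} b_j \]
The second equality holds because $\delta_{ij}$ and $b_j$ are just numbers.
Let’s evaluate $\delta_{ij} b_j$. Consider first
\[ \delta_{1j} b_j = \delta_{11} b_1 + \delta_{12} b_2 + \delta_{13} b_3 = b_1 + 0 + 0 \]
Similarly $\delta_{2j} b_j = b_2$ and $\delta_{3j} b_j = b_3$ and hence
\[ \delta_{ij} b_j = b_i \]
which is known as the sifting property of the Kronecker delta symbol.
We can now obtain the expected result
\[ \mathbf{a}\cdot\mathbf{b} = a_i (\delta_{ij} b_j) = a_i b_i \]
Now consider $\mathbf{a}\cdot\mathbf{e}_i = (a_j \mathbf{e}_j)\cdot\mathbf{e}_i = a_j\, \mathbf{e}_j\cdot\mathbf{e}_i = a_j \delta_{ji} = a_i$.
So the $i$th Cartesian component of $\mathbf{a}$ is
\[ (\mathbf{a})_i = a_i = \mathbf{a}\cdot\mathbf{e}_i \]
Example: For vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ and $\mathbf{d}$
\[ (\mathbf{a}\cdot\mathbf{b})(\mathbf{c}\cdot\mathbf{d}) = (a_1 b_1 + a_2 b_2 + a_3 b_3)(c_1 d_1 + c_2 d_2 + c_3 d_3) \]
Expanding this out would give 9 terms. In index notation with the summation convention, it becomes simply
\[ (\mathbf{a}\cdot\mathbf{b})(\mathbf{c}\cdot\mathbf{d}) = (a_i b_i)(c_j d_j) \]
where again the parentheses are for clarity only. Omitting them gives
\[ (\mathbf{a}\cdot\mathbf{b})(\mathbf{c}\cdot\mathbf{d}) = a_i b_i c_j d_j \]
Note that there is an implicit sum over both pairs of repeated indices $i$ and $j$ in the last two expressions. Similarly, for 6 vectors $(\mathbf{a}\cdot\mathbf{b})(\mathbf{c}\cdot\mathbf{d})(\mathbf{f}\cdot\mathbf{g}) = a_i b_i c_j d_j f_k g_k$.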
1.1.4 Free indices and repeated indices

Consider, for example, the vector equation
\[ \mathbf{a} - (\mathbf{b}\cdot\mathbf{c})\,\mathbf{d} + 3\mathbf{n} = 0 \]
The basis vectors are linearly independent, so this equation must hold for each component separately
\[ a_i - (\mathbf{b}\cdot\mathbf{c})\, d_i + 3 n_i = 0 \]
for all values of $i = 1, 2, 3$.
If we wish to write this equation in index notation with the summation convention, we must use a different index in the scalar product
\[ a_i - b_k c_k d_i + 3 n_i = 0 \qquad (1.1) \]
The free index or unsummed index $i$ occurs once and once only in each term of the equation. In general every term in the equation must be of the same kind, i.e. have the same free indices.
Since the free index $i$ can take on each of the three values $i = 1, 2, 3$, equation (1.1) actually represents three equations, one for each value of $i$.

1.1.5 The vector product and the Levi-Civita symbol $\epsilon_{ijk}$

Geometrical definition of the vector product (also known as cross product) of two vectors $\mathbf{a}$ and $\mathbf{b}$:
\[ \mathbf{a}\times\mathbf{b} \equiv a b \sin\theta\; \mathbf{n} \]
[Figure: parallelogram with sides $\vec{a}$ and $\vec{b}$ at angle $\theta$, with unit normal $\vec{n}$.]
$\mathbf{n}$ is a unit vector orthogonal to both $\mathbf{a}$ and $\mathbf{b}$, and the vectors $\{\mathbf{a}, \mathbf{b}, \mathbf{n}\}$ form a right-handed set. Geometrically, $ab\sin\theta$ is the area of the parallelogram shown.
Clearly $\mathbf{b}\times\mathbf{a} = -\mathbf{a}\times\mathbf{b}$.
[Figure: right-handed Cartesian basis vectors $\vec{e}_1$, $\vec{e}_2$, $\vec{e}_3$.]
From the diagram, the Cartesian basis vectors $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ obey
\[ \mathbf{e}_1\times\mathbf{e}_2 = \mathbf{e}_3 \qquad \mathbf{e}_2\times\mathbf{e}_3 = \mathbf{e}_1 \qquad \mathbf{e}_3\times\mathbf{e}_1 = \mathbf{e}_2 \]
For any vector, $\mathbf{a}\times\mathbf{a} = 0$ (because $\sin\theta = 0$), hence
\[ \mathbf{e}_1\times\mathbf{e}_1 = \mathbf{e}_2\times\mathbf{e}_2 = \mathbf{e}_3\times\mathbf{e}_3 = 0 \]

We have seen how $\delta_{ij}$ can be used to express the orthonormality of basis vectors succinctly. We now seek a similar succinct expression for their vector products.²
Levi-Civita symbol: Define the Levi-Civita symbol $\epsilon_{ijk}$ (pronounced ‘epsilon i j k’), where the indices $i$, $j$ and $k$ can take on the values 1 to 3, such that
\[ \epsilon_{ijk} = \begin{cases} +1 & \text{if } ijk \text{ is an even permutation of } 123 \\ -1 & \text{if } ijk \text{ is an odd permutation of } 123 \\ \phantom{+}0 & \text{otherwise (i.e. 2 or more indices are the same)} \end{cases} \]
An even permutation consists of an even number of transpositions of two indices; an odd permutation consists of an odd number of transpositions of two indices.
\[ \epsilon_{123} = \epsilon_{231} = \epsilon_{312} = +1 \qquad \epsilon_{213} = \epsilon_{321} = \epsilon_{132} = -1 \qquad \text{all others} = 0 \]
The Levi-Civita symbol is also called the alternating symbol, or the epsilon symbol.
The equations satisfied by the vector products of the orthonormal basis vectors $\{\mathbf{e}_i\}$ can now be written uniformly as
\[ \mathbf{e}_i\times\mathbf{e}_j = \epsilon_{ijk}\, \mathbf{e}_k \qquad \forall\; i, j = 1, 2, 3 \]
where there is an implicit sum over the ‘dummy’ or ‘repeated’ index $k$, and $i$ and $j$ are free indices. So there are 9 equations in total.
\[
\begin{aligned}
\mathbf{e}_1\times\mathbf{e}_2 &= \epsilon_{12k}\mathbf{e}_k = \epsilon_{121}\mathbf{e}_1 + \epsilon_{122}\mathbf{e}_2 + \epsilon_{123}\mathbf{e}_3 = \mathbf{e}_3 \\
\mathbf{e}_1\times\mathbf{e}_1 &= \epsilon_{11k}\mathbf{e}_k = \epsilon_{111}\mathbf{e}_1 + \epsilon_{112}\mathbf{e}_2 + \epsilon_{113}\mathbf{e}_3 = 0 \\
\mathbf{e}_2\times\mathbf{e}_1 &= \epsilon_{21k}\mathbf{e}_k = \epsilon_{211}\mathbf{e}_1 + \epsilon_{212}\mathbf{e}_2 + \epsilon_{213}\mathbf{e}_3 = -\mathbf{e}_3
\end{aligned}
\]
plus 6 more equations. Clearly, $\epsilon_{ijk}$ very neatly encapsulates the $\pm 1$ information.
Further properties of $\epsilon_{ijk}$: Note the symmetry of $\epsilon_{ijk}$ under cyclic permutations.
\[ \epsilon_{ijk} = \epsilon_{kij} = \epsilon_{jki} = -\epsilon_{jik} = -\epsilon_{ikj} = -\epsilon_{kji} \qquad (1.2) \]
This holds for all values of $i$, $j$ and $k$. To understand it, note that
(i) If any two of the free indices $i, j, k$ are the same, all terms vanish.
(ii) If $(ijk)$ is an even (odd) permutation of $(123)$, then so are $(jki)$ and $(kij)$, but $(jik)$, $(ikj)$ and $(kji)$ are odd (even) permutations of $(123)$.
Each of equations (1.2) has three free indices so they each represent $3\times3\times3 = 27$ equations.
For example, in $\epsilon_{ijk} = \epsilon_{kij}$, 3 equations say ‘$1 = 1$’, 3 equations say ‘$-1 = -1$’, and 21 equations say ‘$0 = 0$’.

² The basis vectors are taken here to be right handed. This will be important later.
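Vector product: Using explicit Cartesian coordinates, the vector product of two arbitrary vectors $\mathbf{a}$ and $\mathbf{b}$ in the $\{\mathbf{e}_i\}$ basis is
\[
\begin{aligned}
\mathbf{a}\times\mathbf{b} &= \left( \sum_{i=1}^{3} a_i \mathbf{e}_i \right) \times \left( \sum_{j=1}^{3} b_j \mathbf{e}_j \right) \\
&= (a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + a_3 \mathbf{e}_3)\times(b_1 \mathbf{e}_1 + b_2 \mathbf{e}_2 + b_3 \mathbf{e}_3) \\
&= \mathbf{e}_1 (a_2 b_3 - a_3 b_2) + \mathbf{e}_2 (a_3 b_1 - a_1 b_3) + \mathbf{e}_3 (a_1 b_2 - a_2 b_1)
\end{aligned}
\]
Note that each of 1, 2, 3 appears as an index exactly once in each product, which is why we can write the vector product as the determinant of a $3\times3$ matrix
\[ \mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} \]
More on determinants later.
Using index notation and the summation convention, the vector product becomes
\[ \mathbf{a}\times\mathbf{b} = (a_i \mathbf{e}_i)\times(b_j \mathbf{e}_j) = a_i b_j\, \mathbf{e}_i\times\mathbf{e}_j = a_i b_j\, \epsilon_{ijk}\, \mathbf{e}_k \]
We can reorder the terms in the last expression because $\epsilon_{ijk}$ is just a number:
\[ \mathbf{a}\times\mathbf{b} = \epsilon_{ijk}\, a_i b_j\, \mathbf{e}_k \]
Since $i, j, k$ are ‘dummy’ indices, which are summed over, we can call them anything we like, so we can write
\[ \mathbf{a}\times\mathbf{b} = \epsilon_{ijk}\, a_i b_j\, \mathbf{e}_k = \epsilon_{jki}\, a_j b_k\, \mathbf{e}_i \]
Using $\epsilon_{jki} = \epsilon_{ijk}$, we can rewrite this as
\[ \mathbf{a}\times\mathbf{b} = \mathbf{e}_i\, \epsilon_{ijk}\, a_j b_k \]
Since the coefficient of $\mathbf{e}_i$ in this expression is the $i$th component of $\mathbf{a}\times\mathbf{b}$, we have
\[ (\mathbf{a}\times\mathbf{b})_i = \epsilon_{ijk}\, a_j b_k \]
which is an important and useful identity.
Let’s check the first component in gory detail:
\[
\begin{aligned}
(\mathbf{a}\times\mathbf{b})_1 = \epsilon_{1jk}\, a_j b_k &= \epsilon_{111} a_1 b_1 + \epsilon_{112} a_1 b_2 + \epsilon_{113} a_1 b_3 \\
&\;+ \epsilon_{121} a_2 b_1 + \epsilon_{122} a_2 b_2 + \epsilon_{123} a_2 b_3 \\
&\;+ \epsilon_{131} a_3 b_1 + \epsilon_{132} a_3 b_2 + \epsilon_{133} a_3 b_3 \\
&= a_2 b_3 - a_3 b_2
\end{aligned}
\]
as required.

1.1.6 Relation between ε and δ

The ‘bac-cab’ rule for the vector triple product is
\[ \mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = (\mathbf{a}\cdot\mathbf{c})\,\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\,\mathbf{c} \]
which may be proved by considering explicit components of each side of the equation.
Equating the $i$th component of each side of this equation, we get
\[
\begin{aligned}
\left( \mathbf{a}\times(\mathbf{b}\times\mathbf{c}) \right)_i &= \epsilon_{ijk}\, a_j\, (\mathbf{b}\times\mathbf{c})_k \\
&= \epsilon_{ijk}\, a_j\, \epsilon_{klm}\, b_l c_m \\
\left( (\mathbf{a}\cdot\mathbf{c})\,\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\,\mathbf{c} \right)_i &= (\mathbf{a}\cdot\mathbf{c})\, b_i - (\mathbf{a}\cdot\mathbf{b})\, c_i \\
&= (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\, a_j b_l c_m
\end{aligned}
\]
This is true for all vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, which means that it must hold for each component individually. This gives an expression for the product of two epsilon symbols with one summed index:
\[ \boxed{\; \epsilon_{ijk}\,\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl} \;} \]
This is an extremely important result, which you must know by heart, as it enables all vector identities (including those in vector calculus involving the operator $\boldsymbol{\nabla}$) to be derived easily, i.e. mechanically.
To verify it, one can check all possible cases. For example
\[ \epsilon_{12k}\,\epsilon_{k12} = \epsilon_{121}\epsilon_{112} + \epsilon_{122}\epsilon_{212} + \epsilon_{123}\epsilon_{312} = 1 = \delta_{11}\delta_{22} - \delta_{12}\delta_{21} \]
However as we have $3^4 = 81$ equations, 6 saying ‘$1 = 1$’, 6 saying ‘$-1 = -1$’, and 69 saying ‘$0 = 0$’, this will take some time.
Taking a more systematic approach, note that the left hand side of the boxed equation may be written out in full as
• $\epsilon_{ij1}\epsilon_{1lm} + \epsilon_{ij2}\epsilon_{2lm} + \epsilon_{ij3}\epsilon_{3lm}$ where $i, j, l, m$ are free indices;
• for the result to be non-zero we must have $i \neq j$ and $l \neq m$;
• for the result to be non-zero, none of $i, j, l, m$ can be equal to $k$, so only one term of the three in the sum can be non-zero;
• if $i = l$ and $j = m$ we have $+1$, if $i = m$ and $j = l$ we have $-1$;
• all other terms are zero.
This provides an outline of an alternative derivation of the relation.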
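As a short illustration of how the relation $\epsilon_{ijk}\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}$ produces vector identities mechanically, here is a sketch (an added worked example, not part of the original notes) for the scalar product of two vector products. It uses only $(\mathbf{a}\times\mathbf{b})_k = \epsilon_{kij} a_i b_j$, the cyclic symmetry $\epsilon_{kij} = \epsilon_{ijk}$, and the sifting property of $\delta_{ij}$; the extra vector $\mathbf{d}$ is introduced purely for the example.
\[
\begin{aligned}
(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) &= \epsilon_{kij}\, a_i b_j\; \epsilon_{klm}\, c_l d_m \\
&= (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\, a_i b_j c_l d_m \\
&= a_i c_i\, b_j d_j - a_i d_i\, b_j c_j \\
&= (\mathbf{a}\cdot\mathbf{c})(\mathbf{b}\cdot\mathbf{d}) - (\mathbf{a}\cdot\mathbf{d})(\mathbf{b}\cdot\mathbf{c})
\end{aligned}
\]
Setting $\mathbf{c} = \mathbf{a}$ and $\mathbf{d} = \mathbf{b}$ gives $|\mathbf{a}\times\mathbf{b}|^2 = a^2 b^2 - (\mathbf{a}\cdot\mathbf{b})^2 = a^2 b^2 \sin^2\theta$, consistent with the geometrical definition of the vector product.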
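1.1.7 Grad, div and curl in index notation

Define the vector³ operator ‘del’ in Cartesian coordinates
\[ \boldsymbol{\nabla} \equiv \mathbf{e}_1 \frac{\partial}{\partial x_1} + \mathbf{e}_2 \frac{\partial}{\partial x_2} + \mathbf{e}_3 \frac{\partial}{\partial x_3} \equiv \mathbf{e}_i \frac{\partial}{\partial x_i} \]
where in the last expression the repeated index $i$ is summed over.
$\boldsymbol{\nabla}$ is a vector operator, with components
\[ (\boldsymbol{\nabla})_i = \frac{\partial}{\partial x_i} \equiv \partial_i \]
We will always use the notation $\partial/\partial x_i$ rather than $\partial/\partial r_i$, although the two notations are interchangeable.
In electromagnetism, we will sometimes use the longhand notation
\[ \boldsymbol{\nabla} \equiv \mathbf{e}_x \frac{\partial}{\partial x} + \mathbf{e}_y \frac{\partial}{\partial y} + \mathbf{e}_z \frac{\partial}{\partial z} \]
and we (might) occasionally write
\[ \boldsymbol{\nabla} = \left( \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3} \right) \quad\text{or}\quad \boldsymbol{\nabla} = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right) \]
We can then define each of gradient, divergence and curl (or their $i$th components) in index notation:
• $\boldsymbol{\nabla}\phi$, the gradient of a scalar field $\phi$: $\boldsymbol{\nabla}\phi = \mathbf{e}_i\, \partial_i \phi$, i.e. $(\boldsymbol{\nabla}\phi)_i = \partial_i \phi$
• $\boldsymbol{\nabla}\cdot\mathbf{a}$, the divergence of a vector field $\mathbf{a}$: $\boldsymbol{\nabla}\cdot\mathbf{a} = \partial_i a_i$
• $\boldsymbol{\nabla}\times\mathbf{a}$, the curl of a vector field $\mathbf{a}$: $\boldsymbol{\nabla}\times\mathbf{a} = \mathbf{e}_i\, \epsilon_{ijk}\, \partial_j a_k$, i.e. $(\boldsymbol{\nabla}\times\mathbf{a})_i = \epsilon_{ijk}\, \partial_j a_k$

Example 1: Evaluate $\boldsymbol{\nabla} r$, where $r = \sqrt{x_1^2 + x_2^2 + x_3^2} = (x_j x_j)^{1/2}$ is the length of the position vector. The $i$th component of $\boldsymbol{\nabla} r$ is
\[ (\boldsymbol{\nabla} r)_i = \frac{\partial}{\partial x_i} \sqrt{x_1^2 + x_2^2 + x_3^2} = \frac{1}{2} \left( x_1^2 + x_2^2 + x_3^2 \right)^{-1/2} 2 x_i = \frac{x_i}{r} \]
Therefore
\[ \boldsymbol{\nabla} r = \frac{1}{r}\, \mathbf{r} \]
More formally
\[ \boldsymbol{\nabla} r = \mathbf{e}_i \frac{\partial}{\partial x_i} (x_j x_j)^{1/2} = \mathbf{e}_i\, \frac{1}{2} (x_j x_j)^{-1/2} \frac{\partial}{\partial x_i}(x_j x_j) = \mathbf{e}_i\, \frac{1}{2r}\, 2\, \delta_{ij} x_j = \frac{1}{r}\, \mathbf{r} \]
where we used $\partial x_j / \partial x_i = \delta_{ij}$ in the second-last step.

Example 2: For a constant vector $\mathbf{a}$,
\[ \boldsymbol{\nabla}(\mathbf{a}\cdot\mathbf{r}) = \mathbf{e}_i\, \partial_i (a_j x_j) = \mathbf{e}_i\, a_j \delta_{ij} = \mathbf{a} \]

Example 3:
\[ \boldsymbol{\nabla}\times\mathbf{r} = \mathbf{e}_i\, \epsilon_{ijk}\, \partial_j x_k = \mathbf{e}_i\, \epsilon_{ijk}\, \delta_{jk} = \mathbf{e}_i\, \epsilon_{ijj} = 0 \]
Note that the result is zero simply because $\epsilon_{ijj}$ has two identical indices.
Index notation provides a fast and succinct method for evaluating most of the important results in vector calculus, thereby eliminating most of the tedium in the proofs of last year’s Vector Calculus course!

³ As $\boldsymbol{\nabla}$ is always a vector operator, some people drop the vector symbol and just write $\nabla$, but this may be regarded as sloppy, so we won’t do it here.

1.2 Matrices and determinants

1.2.1 Matrices

Index notation and the Einstein summation convention are also useful in matrix (and tensor) algebra.
An $M\times N$ matrix is a rectangular array of numbers with $M$ rows and $N$ columns,
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1,N-1} & a_{1N} \\
a_{21} & a_{22} & \cdots & a_{2,N-1} & a_{2N} \\
\vdots & \vdots & & \vdots & \vdots \\
a_{M-1,1} & a_{M-1,2} & \cdots & a_{M-1,N-1} & a_{M-1,N} \\
a_{M,1} & a_{M,2} & \cdots & a_{M,N-1} & a_{M,N}
\end{pmatrix} \equiv \{a_{ij}\}
\]
The set of quantities $\{a_{ij}\}$, with $a_{ij} \equiv A_{ij}$ for all $1 \le i \le M$, $1 \le j \le N$, are the elements of the matrix.
A square matrix has $N = M$. We’ll mostly work with $3\times3$ matrices, but the majority of what we’ll do generalises to $N\times N$ matrices rather easily.
• We can add & subtract same-dimensional matrices and multiply a matrix by a scalar. Component forms are obvious, e.g. $A = B + \lambda C$ becomes $a_{ij} = b_{ij} + \lambda c_{ij}$ in index notation. Since both $i$ and $j$ are free indices, this represents 9 equations (for $3\times3$ matrices).
• The unit matrix, $I$, defined by
\[ I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]
has components $\delta_{ij}$, i.e. $I_{ij} = \delta_{ij}$.
• The trace of a square matrix is the sum of its diagonal elements
\[ \mathrm{Tr}\, A = a_{ii} \]
Note the implicit sum over $i$ due to our use of the summation convention.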
• The transpose of a square matrix $A$ with components $a_{ij}$ is defined by swapping its rows with its columns, so
\[ (A^T)_{ij} = A^T_{ij} = a_{ji} \]
• If $A = A^T$ then $a_{ji} = a_{ij}$, and $A$ is symmetric.
  If $A = -A^T$ then $a_{ji} = -a_{ij}$, and $A$ is antisymmetric.
Examples:
\[ \begin{pmatrix} 1 & 4 & 2 \\ 4 & 3 & 6 \\ 2 & 6 & 7 \end{pmatrix} \]
is symmetric (we have reflection symmetry in the diagonal), whereas
\[ \begin{pmatrix} 0 & 3 & 5 \\ -3 & 0 & 2 \\ -5 & -2 & 0 \end{pmatrix} \]
is antisymmetric (we have reflection antisymmetry in the diagonal, and zero along the diagonal).

Product of matrices
We can very easily implement the usual ‘row into column’ matrix multiplication rule in index notation.
If $A$ (with elements $a_{ij}$) is an $M\times N$ matrix and $B$ (with elements $b_{ij}$) is an $N\times P$ matrix then $C = AB$ is an $M\times P$ matrix with elements $c_{ij} = a_{ik} b_{kj}$. Since we’re using the summation convention, there is an implicit sum $k = 1, \ldots, N$ in this expression.
For example
\[ \begin{pmatrix} 3 & 7 & 1 \\ 6 & 2 & 4 \end{pmatrix} \begin{pmatrix} 8 & 6 \\ 1 & 2 \\ 4 & 0 \end{pmatrix} = \begin{pmatrix} 35 & 32 \\ 66 & 40 \end{pmatrix} \]
Matrix multiplication is associative and distributive
\[ A(BC) = (AB)C \equiv ABC \]
\[ A(B + C) = AB + AC \]
respectively, but it’s not commutative: $AB \neq BA$ in general.
An important result is
\[ (AB)^T = B^T A^T \]
which follows because
\[ (AB)^T_{ij} = (AB)_{ji} = a_{jk} b_{ki} = (B^T)_{ik} (A^T)_{kj} = (B^T A^T)_{ij} \]

1.2.2 Determinants

The determinant $\det A$ (or $|A|$ or $||A||$) of a $3\times3$ matrix $A$ may be defined by
\[ \det A = \epsilon_{lmn}\, a_{1l}\, a_{2m}\, a_{3n} \]
This is equivalent to the ‘usual’ recursive definition,
\[ \det A = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \]
where the first index labels rows and the second index columns. Expanding the determinant gives
\[
\begin{aligned}
\det A &= a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \\
&= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \\
&= (\epsilon_{123}\, a_{11}a_{22}a_{33} + \epsilon_{132}\, a_{11}a_{23}a_{32}) + \ldots \\
&= \epsilon_{1mn}\, a_{11} a_{2m} a_{3n} + \ldots \\
&= \epsilon_{lmn}\, a_{1l} a_{2m} a_{3n}
\end{aligned}
\]
Thus the two forms are equivalent. The $\epsilon$ form is convenient for derivation of various properties of determinants.
Note that only one term from each row and column appears in the determinant sum, which is why the determinant can be expressed in terms of the $\epsilon$ symbol.
The determinant is only defined for a square matrix, but the definition can be generalised to $N\times N$ matrices,
\[ \det A = \epsilon_{i_1 \ldots i_N}\, a_{1 i_1} \ldots a_{N i_N}, \qquad (1.3) \]
where the epsilon symbol with $N$ indices is defined by
\[ \epsilon_{i_1 \ldots i_N} = \begin{cases} +1 & \text{if } i_1, \ldots, i_N \text{ is an even permutation of } 1, \ldots, N \\ -1 & \text{if } i_1, \ldots, i_N \text{ is an odd permutation of } 1, \ldots, N \\ \phantom{+}0 & \text{otherwise} \end{cases} \]
We shall usually consider $N = 3$, but most results generalise to arbitrary $N$.
We may use these results to derive several alternative (and equivalent) expressions for the determinant. First define the quantity
\[ X_{ijk} = \epsilon_{lmn}\, a_{il}\, a_{jm}\, a_{kn} \]
It follows that
\[
\begin{aligned}
X_{jik} &= \epsilon_{lmn}\, a_{jl}\, a_{im}\, a_{kn} \\
&= \epsilon_{mln}\, a_{jm}\, a_{il}\, a_{kn} \qquad \text{(where we relabelled } l \leftrightarrow m\text{)} \\
&= -\epsilon_{lmn}\, a_{il}\, a_{jm}\, a_{kn} = -X_{ijk}
\end{aligned}
\]
Thus the symmetry of $X_{ijk}$ is dictated by the symmetry of $\epsilon_{lmn}$, and we must have
\[ X_{ijk} = c\, \epsilon_{ijk} \]
where $c$ is some constant. To determine $c$, set $i = 1$, $j = 2$, $k = 3$, which gives $\epsilon_{123}\, c = X_{123}$, so $c = \epsilon_{lmn}\, a_{1l} a_{2m} a_{3n} = \det A$ and hence
\[ \boxed{\; \epsilon_{ijk} \det A = \epsilon_{lmn}\, a_{il}\, a_{jm}\, a_{kn} \;} \]
Multiplying by $\epsilon_{ijk}$ and using $\epsilon_{ijk}\epsilon_{ijk} = 6$ gives the symmetrical form for $\det A$:
\[ \det A = \frac{1}{3!}\, \epsilon_{ijk}\, \epsilon_{lmn}\, a_{il}\, a_{jm}\, a_{kn} \]
This elegant expression isn’t of practical use because the number of terms in the sum increases from $3^3$ to $3^6$ (overcounting).
We can obtain a result similar to the boxed expression above by defining
\[ Y_{lmn} = \epsilon_{ijk}\, a_{il}\, a_{jm}\, a_{kn} \]
Using the same argument as before [tutorial] gives
\[ Y_{lmn} = \epsilon_{lmn} \left[ \epsilon_{ijk}\, a_{i1}\, a_{j2}\, a_{k3} \right] \]
Since $\det A = \frac{1}{3!}\, \epsilon_{lmn} Y_{lmn}$ this means that
\[ \det A = \epsilon_{ijk}\, a_{i1}\, a_{j2}\, a_{k3} \quad\text{and}\quad \epsilon_{lmn} \det A = \epsilon_{ijk}\, a_{il}\, a_{jm}\, a_{kn} \qquad \text{[tutorial]} \]

Properties of Determinants
We can easily derive familiar properties of determinants from the definitions above:
• Adding a multiple of one row to another does not change the value of the determinant.
  Example: Adding a multiple of the second row to the first row
  \[ \epsilon_{lmn}\, a_{1l} a_{2m} a_{3n} \;\to\; \epsilon_{lmn}\, (a_{1l} + \lambda a_{2l})\, a_{2m} a_{3n} = \epsilon_{lmn}\, a_{1l} a_{2m} a_{3n} + 0 \]
  and $\det A$ is unaltered. The last term is zero because $\epsilon_{lmn}\, a_{2l} a_{2m} = 0$.
• Adding a multiple of one column to another does not change the value of the determinant (use the other form for the determinant, $\det A = \epsilon_{ijk}\, a_{i1} a_{j2} a_{k3}$).
• Interchanging any two adjacent rows of a matrix changes the sign of the determinant.
  Example: Interchanging the first and second rows gives
  \[ \epsilon_{lmn}\, a_{2l} a_{1m} a_{3n} = \epsilon_{mln}\, a_{2m} a_{1l} a_{3n} = -\epsilon_{lmn}\, a_{1l} a_{2m} a_{3n} = -\det A \]
  In the first step we simply relabelled $l \leftrightarrow m$.
  When $A$ has two identical rows, $\det A = 0$.
• Interchanging any two adjacent columns of a matrix also changes the sign of the determinant (use the other form for the determinant, $\det A = \epsilon_{ijk}\, a_{i1} a_{j2} a_{k3}$).
Finally,
\[ \epsilon_{ijk}\, \epsilon_{lmn} \det A = \begin{vmatrix} a_{il} & a_{im} & a_{in} \\ a_{jl} & a_{jm} & a_{jn} \\ a_{kl} & a_{km} & a_{kn} \end{vmatrix} \qquad (1.4) \]
To derive this, start with the original definition of $\det A$ as $|\cdots|$ and permute rows and columns. This produces $\pm$ signs equivalent to $\epsilon$ permutations.

1.2.3 Linear equations

A standard use of matrices & determinants is to solve (for $x$) the matrix-vector equation
\[ A x = y \]
where $A$ is a square $3\times3$ matrix. Representing $x$ and $y$ by column matrices, and writing out the components, this becomes
\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + a_{13} x_3 &= y_1 \\
a_{21} x_1 + a_{22} x_2 + a_{23} x_3 &= y_2 \\
a_{31} x_1 + a_{32} x_2 + a_{33} x_3 &= y_3
\end{aligned}
\]
In index notation
\[ a_{ij} x_j = y_i \]
With a suitable definition of the inverse $A^{-1}$ of the matrix $A$, we can write the solution as
\[ x = A^{-1} y \quad\text{or}\quad x_i = A^{-1}_{ij}\, y_j \]
where $A^{-1}_{ij} \equiv (A^{-1})_{ij}$ is the $ij$th element of the inverse of $A$
\[ A^{-1}_{ij} = \frac{1}{2!}\, \frac{1}{\det A}\, \epsilon_{imn}\, \epsilon_{jpq}\, a_{pm}\, a_{qn} \]
By explicit multiplication [tutorial] we can show $A A^{-1} = I = A^{-1} A$ as required.
Alternatively [tutorial],
\[ A^{-1} = \frac{C^T}{\det A} \]
where $C = \{c_{ij}\}$ is the co-factor matrix of $A$, and $c_{ij} = (-1)^{i+j} \times$ the determinant formed by omitting the row and column containing $a_{ij}$.
Note that a solution exists if and only if $\det A \neq 0$.
These results generalise to $N\times N$ matrices.
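The contraction $\epsilon_{ijk}\epsilon_{ijk} = 6$, used above to obtain the symmetrical form of $\det A$, is quoted without proof. A brief sketch of one way to see it (an added worked equation, assuming only the relation $\epsilon_{ijk}\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}$ of section 1.1.6, the cyclic symmetry of $\epsilon$, and $\delta_{jj} = 3$):
\[
\begin{aligned}
\epsilon_{ijk}\,\epsilon_{ljk} &= \epsilon_{ijk}\,\epsilon_{klj} = \delta_{il}\delta_{jj} - \delta_{ij}\delta_{jl} = 3\,\delta_{il} - \delta_{il} = 2\,\delta_{il} \\
\epsilon_{ijk}\,\epsilon_{ijk} &= 2\,\delta_{ii} = 6
\end{aligned}
\]
The value 6 is simply the number of non-zero components of $\epsilon_{ijk}$, each of which contributes $(\pm1)^2 = 1$ to the sum.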
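Determinant of the transpose
\[
\begin{aligned}
\det A^T &= \epsilon_{lmn}\, A^T_{l1}\, A^T_{m2}\, A^T_{n3} \\
&= \epsilon_{lmn}\, a_{1l}\, a_{2m}\, a_{3n}
\end{aligned}
\]
and hence
\[ \det A^T = \det A \]

Product of determinants
If $C = AB$ so that $c_{ij} = a_{ik} b_{kj}$ then
\[
\begin{aligned}
\det C &= \epsilon_{ijk}\, c_{1i}\, c_{2j}\, c_{3k} \\
&= \epsilon_{ijk}\, a_{1l} b_{li}\; a_{2m} b_{mj}\; a_{3n} b_{nk} \\
&= \left[ \epsilon_{ijk}\, b_{li}\, b_{mj}\, b_{nk} \right] a_{1l}\, a_{2m}\, a_{3n} \\
&= \epsilon_{lmn} \det B\; a_{1l}\, a_{2m}\, a_{3n}
\end{aligned}
\]
and hence
\[ \det AB = \det A\, \det B \]

Product of two epsilons
The product of two epsilon symbols with no identical indices may be written as
\[ \epsilon_{ijk}\, \epsilon_{lmn} = \begin{vmatrix} \delta_{il} & \delta_{im} & \delta_{in} \\ \delta_{jl} & \delta_{jm} & \delta_{jn} \\ \delta_{kl} & \delta_{km} & \delta_{kn} \end{vmatrix} \]
This equation has 6 free indices, so it represents $3^6 = 729$ identities: 18 say ‘$1 = 1$’, 18 say ‘$-1 = -1$’, so 693 say ‘$0 = 0$’.
The proof follows almost trivially by setting $A = I$ in equation (1.4)
\[ \epsilon_{ijk}\, \epsilon_{lmn} \det A = \begin{vmatrix} a_{il} & a_{im} & a_{in} \\ a_{jl} & a_{jm} & a_{jn} \\ a_{kl} & a_{km} & a_{kn} \end{vmatrix}, \]
whence $\det I = 1$ and $a_{ij} = \delta_{ij}$. Unfortunately, this result isn’t particularly useful...
If we replace the indices $(l, m, n)$ by $(k, l, m)$ and sum over $k$
\[
\begin{aligned}
\epsilon_{ijk}\, \epsilon_{klm} &= \begin{vmatrix} \delta_{ik} & \delta_{il} & \delta_{im} \\ \delta_{jk} & \delta_{jl} & \delta_{jm} \\ \delta_{kk} & \delta_{kl} & \delta_{km} \end{vmatrix} \\
&= \delta_{ik} (\delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}) - \delta_{il} (\delta_{jk}\delta_{km} - 3\delta_{jm}) + \delta_{im} (\delta_{jk}\delta_{kl} - 3\delta_{jl}) \\
&= \delta_{im}\delta_{jl} - \delta_{il}\delta_{jm} + 2\,\delta_{il}\delta_{jm} - 2\,\delta_{im}\delta_{jl}
\end{aligned}
\]
we obtain an old friend, which we now note can be written as the determinant of a $2\times2$ matrix,
\[ \epsilon_{ijk}\, \epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl} = \begin{vmatrix} \delta_{il} & \delta_{im} \\ \delta_{jl} & \delta_{jm} \end{vmatrix} \]

1.2.4 Orthogonal matrices

Let $A$ be a $3\times3$ matrix with elements $a_{ij}$, and define the row vectors
\[ \mathbf{a}^{(1)} = (a_{11}, a_{12}, a_{13}), \quad \mathbf{a}^{(2)} = (a_{21}, a_{22}, a_{23}), \quad \mathbf{a}^{(3)} = (a_{31}, a_{32}, a_{33}), \]
so that $\left( \mathbf{a}^{(i)} \right)_j = a_{ij}$. If we choose the vectors $\mathbf{a}^{(i)}$ to be orthonormal:
\[ \mathbf{a}^{(1)}\cdot\mathbf{a}^{(2)} = \mathbf{a}^{(2)}\cdot\mathbf{a}^{(3)} = \mathbf{a}^{(3)}\cdot\mathbf{a}^{(1)} = 0, \]
and
\[ \left| \mathbf{a}^{(1)} \right|^2 = \left| \mathbf{a}^{(2)} \right|^2 = \left| \mathbf{a}^{(3)} \right|^2 = 1, \]
i.e.
\[ \mathbf{a}^{(i)}\cdot\mathbf{a}^{(j)} = \delta_{ij}, \]
then $A$ is an orthogonal matrix. The rows of $A$ form an orthonormal triad.

Properties of orthogonal matrices
• Consider
\[ A A^T = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]
  Therefore
  \[ A A^T = I \]
• Taking the determinant of both sides of $A A^T = I$ gives $\det A\, \det A^T = 1$. Since $\det A^T = \det A$, then $(\det A)^2 = 1$, and therefore
  \[ \det A = \pm 1 \]
• Since $\det A \neq 0$, the inverse matrix $A^{-1}$ always exists, and therefore
  \[ A^{-1} = A^T \]
  Multiplying this equation on the right by $A$ gives
  \[ A^T A = I \]
  so the columns of $A$ also form an orthonormal triad.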
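To make the orthogonality conditions concrete, here is a small worked check (an added illustration; the particular matrix and the angle $\alpha$ are chosen for the example rather than taken from this section). Take
\[ A = \begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
Its rows satisfy
\[
\begin{aligned}
\mathbf{a}^{(1)}\cdot\mathbf{a}^{(1)} &= \cos^2\alpha + \sin^2\alpha = 1, \qquad \mathbf{a}^{(2)}\cdot\mathbf{a}^{(2)} = \sin^2\alpha + \cos^2\alpha = 1, \qquad \mathbf{a}^{(3)}\cdot\mathbf{a}^{(3)} = 1, \\
\mathbf{a}^{(1)}\cdot\mathbf{a}^{(2)} &= -\cos\alpha\sin\alpha + \sin\alpha\cos\alpha = 0, \qquad \mathbf{a}^{(2)}\cdot\mathbf{a}^{(3)} = \mathbf{a}^{(3)}\cdot\mathbf{a}^{(1)} = 0,
\end{aligned}
\]
so $\mathbf{a}^{(i)}\cdot\mathbf{a}^{(j)} = \delta_{ij}$ and $A$ is orthogonal, with $\det A = \cos^2\alpha + \sin^2\alpha = +1$. By contrast, the diagonal matrix with entries $(1, 1, -1)$ is also orthogonal but has $\det A = -1$, so both signs in $\det A = \pm1$ do occur.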
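Chapter 2
Rotations, reflections & inversions

2.1
Rotation of basis (or axes)
Consider two right handed (RH) orthonormal bases^1 with a common origin. Denote basis vectors in S by $\{\mathbf{e}_i\}$, and in S′ by $\{\mathbf{e}'_i\}$.
[Figure: the basis vectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ of S and $\mathbf{e}'_1, \mathbf{e}'_2, \mathbf{e}'_3$ of S′]
The vector a has components $a_i$ in S, and $a'_i$ in S′:
\[ \mathbf{a} = a_1 \mathbf{e}_1 + a_2 \mathbf{e}_2 + a_3 \mathbf{e}_3 = a_i \mathbf{e}_i \]
\[ \phantom{\mathbf{a}} = a'_1 \mathbf{e}'_1 + a'_2 \mathbf{e}'_2 + a'_3 \mathbf{e}'_3 = a'_i \mathbf{e}'_i \]
We can relate the components in the two bases:
\[ a'_i = \mathbf{a}\cdot\mathbf{e}'_i = (\mathbf{e}'_i\cdot\mathbf{e}_j)\, a_j \]
which we write as
\[ a'_i = \ell_{ij}\, a_j \]
where the transformation matrix L has components $\ell_{ij}$
\[ L = \{\ell_{ij}\} = \{\mathbf{e}'_i\cdot\mathbf{e}_j\} = \begin{pmatrix} \mathbf{e}'_1\cdot\mathbf{e}_1 & \mathbf{e}'_1\cdot\mathbf{e}_2 & \mathbf{e}'_1\cdot\mathbf{e}_3 \\ \mathbf{e}'_2\cdot\mathbf{e}_1 & \mathbf{e}'_2\cdot\mathbf{e}_2 & \mathbf{e}'_2\cdot\mathbf{e}_3 \\ \mathbf{e}'_3\cdot\mathbf{e}_1 & \mathbf{e}'_3\cdot\mathbf{e}_2 & \mathbf{e}'_3\cdot\mathbf{e}_3 \end{pmatrix} \]
• $\ell_{ij}$ is the cosine of the angle between the ith axis of S′ and the jth axis of S.
• L describes a rotation of the basis vectors about some axis, in which case it is often called the rotation matrix.
Note that the vector a is not rotated (it remains fixed in space); only the basis vectors are rotated. More on this point later.
Footnote 1: sometimes known as frames or frames of reference in physics applications.

For example, consider a rotation through angle α in an anticlockwise direction about the z-axis. (This is anticlockwise when looking at the tip of the arrow of the z-axis.)
[Figure: the x, y axes rotated through α about the common z, z′ axis to give x′, y′]
In this case
\[ \mathbf{e}'_1\cdot\mathbf{e}_1 = \mathbf{e}'_2\cdot\mathbf{e}_2 = \cos\alpha \]
\[ \mathbf{e}'_1\cdot\mathbf{e}_2 = \cos(\pi/2 - \alpha) = \sin\alpha \]
\[ \mathbf{e}'_2\cdot\mathbf{e}_1 = \cos(\pi/2 + \alpha) = -\sin\alpha , \ \text{etc} \]
The rotation matrix is
\[ L_z(\alpha) = \begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
Since $L_z(-\alpha)\, L_z(\alpha) = I$ and $L_z^T(\alpha) = L_z(-\alpha)$ (by inspection), we deduce that $L_z^T(\alpha)\, L_z(\alpha) = I$, i.e. $L_z$ is an orthogonal matrix.
Similarly for rotations about the x and y axes, either by direct evaluation or cyclic rotations of axes in our previous result,
\[ L_x(\beta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\beta & \sin\beta \\ 0 & -\sin\beta & \cos\beta \end{pmatrix} \qquad L_y(\gamma) = \begin{pmatrix} \cos\gamma & 0 & -\sin\gamma \\ 0 & 1 & 0 \\ \sin\gamma & 0 & \cos\gamma \end{pmatrix} \]
This is a general result, which holds because the length of the vector a is unchanged by the rotation
\[ a^2 = \mathbf{a}\cdot\mathbf{a} = \begin{cases} a_i a_i = \delta_{ij}\, a_i a_j \\ a'_k a'_k = (\ell_{ki} a_i)(\ell_{kj} a_j) \end{cases} \]
Since this is true for all $a_i$ then
\[ \ell_{ki}\,\ell_{kj} = \delta_{ij} \]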
In matrix notation, this is $L^T L = I$. So L is an orthogonal matrix, which has determinant ±1 and hence its inverse always exists. Thus we have
\[ L^T L = I \quad\text{or}\quad L^T = L^{-1} \quad\text{or}\quad L L^T = I \]
We can write the new basis vectors in terms of the old ones using $\ell_{ij}$.
Start with $a'_i \mathbf{e}'_i = a_i \mathbf{e}_i$, and use $a'_i = \ell_{ij} a_j$ to write $\ell_{ij}\, a_j\, \mathbf{e}'_i = a_j \mathbf{e}_j$. This holds for all $a_j$ so $\ell_{ij}\, \mathbf{e}'_i = \mathbf{e}_j$.
Multiplying this last expression on the left by $\ell_{kj}$ gives $\ell_{kj}\ell_{ij}\, \mathbf{e}'_i = \ell_{kj}\, \mathbf{e}_j$. But $\ell_{kj}\ell_{ij} = \delta_{ki}$, which gives $\mathbf{e}'_k = \ell_{kj}\, \mathbf{e}_j$. Relabelling k → i gives
\[ \mathbf{e}'_i = \ell_{ij}\, \mathbf{e}_j \]
NB $\det L$ is always +1 for a rotation. This is because we must have L → I continuously as the angle of rotation α → 0. Rotations are called proper transformations.
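A minimal worked check, assuming the rotation matrix $L_z(\alpha)$ for a rotation of the basis about the z-axis quoted in this section, and taking the illustrative choice $\mathbf{a} = \mathbf{e}_1$ (so $a_j = \delta_{j1}$). The components in S′ are
\[ a'_i = \ell_{ij}\, a_j = \ell_{i1} \quad\Rightarrow\quad (a'_1, a'_2, a'_3) = (\cos\alpha,\ -\sin\alpha,\ 0) , \]
and the length is unchanged, $a'_i a'_i = \cos^2\alpha + \sin^2\alpha = 1$. Using $\mathbf{e}'_i = \ell_{ij}\mathbf{e}_j$, the vector itself is recovered:
\[ a'_i \mathbf{e}'_i = \cos\alpha\,(\cos\alpha\, \mathbf{e}_1 + \sin\alpha\, \mathbf{e}_2) - \sin\alpha\,(-\sin\alpha\, \mathbf{e}_1 + \cos\alpha\, \mathbf{e}_2) = \mathbf{e}_1 . \]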
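Improper transformations: If the original basis $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ is right handed (RH), but the new basis $\{\mathbf{e}'_1, \mathbf{e}'_2, \mathbf{e}'_3\}$ is left handed (LH) then $\det L = -1$.
A simple example is an inversion of axes through the origin $\mathbf{e}'_i = -\mathbf{e}_i$, in which case
\[ L = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \]
so $\det L = -1$.
[Figure: the axes x, y, z of S and the inverted axes x′, y′, z′ of S′]
Basis transformations with $\det L = -1$ are called improper transformations.

2.1.1
Composition of two rotations
Consider a rotation of the axes $\{\mathbf{e}_i\}$ described by matrix $L_1$ followed by a rotation about the new axes $\{\mathbf{e}'_i\}$ described by $L_2$.
Then $\mathbf{e}'_i = (L_1)_{ij}\, \mathbf{e}_j$ and $\mathbf{e}''_i = (L_2)_{ij}\, \mathbf{e}'_j$, which gives
\[ \mathbf{e}''_i = (L_2 L_1)_{ij}\, \mathbf{e}_j \]
Note the “reverse” ordering of $L_1$ and $L_2$ in this expression: $L_1$ acts first.
Example:
\[ L_1 = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad L_2 = \begin{pmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix} \]
$L_1$ represents a rotation about Oz through π/2, and $L_2$ a rotation about Oy of π/2.
For $L_1$ followed by $L_2$:
[Figure: the frames S, S′ and S″ for $L_1$ followed by $L_2$]
\[ L_2 L_1 = \begin{pmatrix} 0 & 0 & -1 \\ -1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \quad\Rightarrow\quad \mathbf{e}''_1 = -\mathbf{e}_3 , \quad \mathbf{e}''_2 = -\mathbf{e}_1 , \quad \mathbf{e}''_3 = \mathbf{e}_2 \]
Alternatively, $L_2$ followed by $L_1$ gives:
[Figure: the frames S, S′ and S″ for $L_2$ followed by $L_1$]
\[ L_1 L_2 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \quad\Rightarrow\quad \mathbf{e}''_1 = \mathbf{e}_2 , \quad \mathbf{e}''_2 = \mathbf{e}_3 , \quad \mathbf{e}''_3 = \mathbf{e}_1 \]
So $L_2 L_1 \neq L_1 L_2$, i.e. rotations are non-commutative.

Euler angles: Making three rotations, the first about the 3-axis (through angle α), the second about the 2′-axis (β), and the third about the 3″-axis (γ) gives the most general rotation:
\[ L(\alpha, \beta, \gamma) = L_{z''}(\gamma)\, L_{y'}(\beta)\, L_z(\alpha) . \]
Multiplying out gives:
\[ L = \begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{pmatrix} \begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
\[ = \begin{pmatrix} \cos\beta\cos\alpha\cos\gamma - \sin\alpha\sin\gamma & \cos\beta\sin\alpha\cos\gamma + \cos\alpha\sin\gamma & -\sin\beta\cos\gamma \\ -\cos\beta\cos\alpha\sin\gamma - \sin\alpha\cos\gamma & -\cos\beta\sin\alpha\sin\gamma + \cos\alpha\cos\gamma & \sin\beta\sin\gamma \\ \sin\beta\cos\alpha & \sin\beta\sin\alpha & \cos\beta \end{pmatrix} \]
which is a rather complicated result! [There’s a nice picture in the Mathematical Methods book by Mathews and Walker.]
Beware, conventions for choosing Euler angles vary very widely.
Euler angles are used in rigid body dynamics; see the Lagrangian Dynamics course.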
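A quick sanity check on the Euler-angle product (a sketch, using only the matrices $L_z$ and $L_y$ of this section with the ordering $L(\alpha,\beta,\gamma) = L_{z''}(\gamma)\, L_{y'}(\beta)\, L_z(\alpha)$): setting β = 0 makes the middle factor the identity, so the two z-rotations should simply add. Indeed, with $\cos\beta = 1$ and $\sin\beta = 0$,
\[ L(\alpha, 0, \gamma) = \begin{pmatrix} \cos\alpha\cos\gamma - \sin\alpha\sin\gamma & \sin\alpha\cos\gamma + \cos\alpha\sin\gamma & 0 \\ -(\cos\alpha\sin\gamma + \sin\alpha\cos\gamma) & \cos\alpha\cos\gamma - \sin\alpha\sin\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} \cos(\alpha+\gamma) & \sin(\alpha+\gamma) & 0 \\ -\sin(\alpha+\gamma) & \cos(\alpha+\gamma) & 0 \\ 0 & 0 & 1 \end{pmatrix} = L_z(\alpha + \gamma) . \]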
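2.2
Rotation of vectors
The transformations of the previous section in which we rotate (or reflect or invert) the basis vectors keeping the vector fixed are called passive transformations.
Alternatively we can keep the basis fixed and rotate (or reflect or invert) the vector. These are called active transformations.
Consider a rotation of a rigid body, through angle θ, about an axis which points in the direction of the unit vector n. The axis passes through a fixed origin O.
The point P is rotated to Q. The position vector x is rotated to y.
[Figure: first diagram, the axis n through O, the points S, P, Q and the vectors x and y; second diagram, the plane through S, P, T, Q seen along n, with the angle θ]
In the first diagram, $\vec{OS}$ is the projection of x onto the n direction, i.e. $(\mathbf{x}\cdot\mathbf{n})\,\mathbf{n}$.
In the second diagram, $\mathbf{n}\times\mathbf{x}$ is parallel to $\vec{TQ}$.
\[ \mathbf{y} = \vec{OS} + \vec{ST} + \vec{TQ} \]
\[ = (\mathbf{x}\cdot\mathbf{n})\,\mathbf{n} + \underbrace{SQ\cos\theta}_{|\vec{ST}|}\ \frac{\mathbf{x} - (\mathbf{x}\cdot\mathbf{n})\,\mathbf{n}}{SP} + \underbrace{SQ\sin\theta}_{|\vec{TQ}|}\ \frac{\mathbf{n}\times\mathbf{x}}{|\mathbf{n}\times\mathbf{x}|} \]
\[ = (\mathbf{x}\cdot\mathbf{n})\,\mathbf{n} + \cos\theta\, \big(\mathbf{x} - (\mathbf{x}\cdot\mathbf{n})\,\mathbf{n}\big) + \sin\theta\ \mathbf{n}\times\mathbf{x} \]
(as $SP = SQ$ and $|\mathbf{n}\times\mathbf{x}| = SP = SQ$).
This gives the important result
\[ \mathbf{y} = \mathbf{x}\cos\theta + (1 - \cos\theta)\, (\mathbf{n}\cdot\mathbf{x})\, \mathbf{n} + (\mathbf{n}\times\mathbf{x}) \sin\theta \]
In index notation, this is
\[ y_i = x_i \cos\theta + (1 - \cos\theta)\, n_j x_j\, n_i + \epsilon_{ikj}\, n_k x_j \sin\theta \]
or
\[ y_i = R_{ij}(\theta, \mathbf{n})\, x_j \]
where the rotation matrix $R(\theta, \mathbf{n})$ has components
\[ R_{ij}(\theta, \mathbf{n}) = \delta_{ij}\cos\theta + (1 - \cos\theta)\, n_i n_j - \epsilon_{ijk}\, n_k \sin\theta \]

2.2.1
Some properties of the rotation matrix
(i) R is orthogonal, with $\det R = 1$. Proofs [tutorial]
(ii) It is straightforward to show that
\[ \mathrm{Tr}\, R = 1 + 2\cos\theta \]
\[ -\tfrac{1}{2}\, \epsilon_{kij}\, R_{ij} = n_k \sin\theta \]
If R is known, then the angle θ and the axis of rotation n can be determined.
Note: R has 1 + (3 − 1) = 3 independent parameters; c.f. the 3 Euler angles.
(iii) The product of two rotations x → y → z is given by $\mathbf{z} = SR\, \mathbf{x}$.
(iv) Consider a small (infinitesimal) rotation δθ, for which $\cos\delta\theta = 1 + O(\delta\theta^2)$ and $\sin\delta\theta = \delta\theta + O(\delta\theta^3)$, then
\[ R_{ij} = \delta_{ij} - \epsilon_{ijk}\, n_k\, \delta\theta \]
A quicker (and sufficient) graphical proof follows directly from the diagram on the right, which gives
\[ \mathbf{y} - \mathbf{x} = \mathbf{n}\times\mathbf{x}\ \delta\theta \]
from which the result above follows directly.
[Figure: x rotated to y through the small angle δθ about n; the displacement has length $|\mathbf{n}\times\mathbf{x}|\,\delta\theta$]
(v) For θ ≠ 0, π, R has only one real eigenvalue +1, with one real eigenvector n. [tutorial]

2.3
Reflections and inversions
Consider reflection of a vector x → y in a plane with unit normal n.
[Figure: the vector x, its mirror image y, and the unit normal n]
From the figure $\mathbf{y} = \mathbf{x} - 2\, (\mathbf{x}\cdot\mathbf{n})\, \mathbf{n}$
In index notation, this becomes
\[ y_i = \sigma_{ij}\, x_j \quad\text{where}\quad \sigma_{ij} = \delta_{ij} - 2\, n_i n_j \]
Inversion of a vector in the origin is given by $\mathbf{y} = -\mathbf{x}$. This defines the parity operator P:
\[ y_i = P_{ij}\, x_j \quad\text{where}\quad P_{ij} = -\delta_{ij} \]
For reflections and inversions, $\det\sigma = \det P = -1$.
Note that for reflections and inversions, performing the operation twice yields the original vector, i.e. $\sigma^2 = I$, $P^2 = I$.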
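The statement $\sigma^2 = I$ can be checked in index notation (a short sketch, using only $\sigma_{ij} = \delta_{ij} - 2 n_i n_j$ and the fact that n is a unit vector, $n_j n_j = 1$):
\[ \sigma_{ij}\,\sigma_{jk} = (\delta_{ij} - 2 n_i n_j)(\delta_{jk} - 2 n_j n_k) = \delta_{ik} - 2 n_i n_k - 2 n_i n_k + 4\, n_i\, (n_j n_j)\, n_k = \delta_{ik} . \]
The same one-line calculation for the parity operator gives $P_{ij} P_{jk} = (-\delta_{ij})(-\delta_{jk}) = \delta_{ik}$.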
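2.4
Projection operators
P is a parallel projection operator onto a vector u if
\[ P\mathbf{u} = \mathbf{u} \quad\text{and}\quad P\mathbf{v} = 0 \]
where v is any vector orthogonal to u, i.e. $\mathbf{v}\cdot\mathbf{u} = 0$. Similarly Q is an orthogonal projection to u if
\[ Q\mathbf{u} = 0 \quad\text{and}\quad Q\mathbf{v} = \mathbf{v} \]
so that $Q = I - P$. Suitable operators are (exercise: check this!)
\[ P_{ij} = \frac{u_i u_j}{u^2} \quad\text{and}\quad Q_{ij} = \delta_{ij} - \frac{u_i u_j}{u^2} \]
These have the properties
\[ P^2 = P , \qquad Q^2 = Q , \qquad PQ = QP = 0 \]
They’re also unique. For example, if there exists another operator T with the same properties as P, i.e. $T\mathbf{u} = \mathbf{u}$ and $T\mathbf{v} = 0$, then for any vector $\mathbf{w} \equiv \mu\, \mathbf{u} + \nu\, \mathbf{v} + \lambda\, \mathbf{u}\times\mathbf{v}$ we have
\[ (P - T)\, \mathbf{w} = \mu\, \mathbf{u} + 0 + 0 - \mu\, \mathbf{u} + 0 + 0 = 0 \]
because
\[ \big[ P\, (\mathbf{u}\times\mathbf{v}) \big]_i = P_{ij}\, (\mathbf{u}\times\mathbf{v})_j = \frac{u_i u_j}{u^2}\ \epsilon_{jkl}\, u_k v_l = 0 \]
This holds for all vectors w, so $T = P$.

2.5
Active and passive transformations
• Rotation of a vector in a fixed basis is called an active transformation
\[ \mathbf{x} \to \mathbf{y} \quad\text{with}\quad y_i = R_{ij}\, x_j \ \text{ in the } \{\mathbf{e}_i\} \text{ basis} \]
• Rotation of the basis whilst keeping the vector fixed is called a passive transformation
\[ \{\mathbf{e}_i\} \to \{\mathbf{e}'_i\} \quad\text{and}\quad x_i \to x'_i = \ell_{ij}\, x_j \]
If we set $R_{ij} = \ell_{ij}$, then numerically $y_i = x'_i$.
Consider a simple example of both types of rotation:

Rotation of a vector about the z-axis
\[ R_{ij}(\theta, \mathbf{e}_3) = \delta_{ij}\cos\theta + (1 - \cos\theta)\, \delta_{i3}\delta_{j3} - \epsilon_{ijk}\, \delta_{k3} \sin\theta = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
where we used $n_i = \delta_{i3}$.
[Figure: the vector x rotated through θ to y in the fixed basis $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$]
This is an active rotation through angle θ.

Rotation of the basis about the z-axis
\[ \mathbf{e}'_i = \ell_{ij}\, \mathbf{e}_j \equiv R_{ij}\, \mathbf{e}_j \]
In components
\[ \mathbf{e}'_1 = \cos\theta\, \mathbf{e}_1 - \sin\theta\, \mathbf{e}_2 \]
\[ \mathbf{e}'_2 = \sin\theta\, \mathbf{e}_1 + \cos\theta\, \mathbf{e}_2 \]
\[ \mathbf{e}'_3 = \mathbf{e}_3 \]
[Figure: the fixed vector x and the bases $\mathbf{e}_1, \mathbf{e}_2$ and $\mathbf{e}'_1, \mathbf{e}'_2$, with common axis $\mathbf{e}_3, \mathbf{e}'_3$]
This is a passive rotation through angle −θ.
We conclude that
An active rotation of the vector x through angle θ is equivalent to a passive rotation of the basis vectors by an equal and opposite amount.
Colloquially, rotating a vector in one direction is equivalent to rotating the basis in the opposite direction.
The general case can be built from three rotations (Euler angles).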
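A concrete instance of the active/passive comparison (an illustrative sketch with the arbitrary choices θ = π/2 and $\mathbf{x} = \mathbf{e}_1$, neither of which is in the notes):
\[ R(\pi/2, \mathbf{e}_3) = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} , \qquad \mathbf{y} = R\, \mathbf{x} = (0, 1, 0) , \]
so the active rotation takes $\mathbf{e}_1$ to $\mathbf{e}_2$. For the passive transformation with $\ell_{ij} = R_{ij}$, the new basis vectors are $\mathbf{e}'_1 = -\mathbf{e}_2$, $\mathbf{e}'_2 = \mathbf{e}_1$, $\mathbf{e}'_3 = \mathbf{e}_3$: the basis has turned through −π/2. The fixed vector $\mathbf{x} = \mathbf{e}_1 = \mathbf{e}'_2$ then has components $x'_i = (0, 1, 0)$, numerically equal to $y_i$, as claimed.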
Chapter 3
Cartesian tensors

3.1
Definition and transformation properties
Consider a rotation of the $\{\mathbf{e}_i\}$ basis (frame S) to the $\{\mathbf{e}'_i\}$ basis (frame S′). This is a passive rotation.
The rotation matrix L, with components $\ell_{ij}$, satisfies $LL^T = I = L^T L$, and it has unit determinant $\det L = +1$.
The components of two arbitrary vectors a and b in the two frames are related by
\[ a'_i = \ell_{ij}\, a_j \qquad b'_i = \ell_{ij}\, b_j \]
We now define a vector a as an entity whose 3 components $a_i$ in S are related to its 3 components $a'_i$ in S′ by $a'_i = \ell_{ij}\, a_j$.
Now consider the 9 quantities $a_i b_j$. Under the change of basis, these transform to
\[ a'_i b'_j = \ell_{ir}\, a_r\ \ell_{js}\, b_s = (\ell_{ir}\ell_{js})\, (a_r b_s) \]
Clearly, these 9 quantities obey a particular transformation law under the change of frame S → S′. This motivates our definition of a tensor.

3.1.1
Definition of a tensor
Following on from our new definition of a vector, we define a tensor of rank 2, T, as an entity whose $3^2 = 9$ components $T_{ij}$ in S are related to its 9 components $T'_{ij}$ in S′ by
\[ T'_{ij} = \ell_{ip}\ell_{jq}\, T_{pq} \]
where L is the rotation matrix with components $\ell_{ij}$ which takes S → S′. Since there are 2 free indices in the above expression, it represents 9 equations.
Similarly a tensor of rank n, T, is defined to be an entity whose $3^n$ components $T_{ijk\cdots no}$ (n indices) in S are related to its $3^n$ components $T'_{ijk\cdots no}$ (n indices) in S′ by
\[ T'_{ijk\cdots no} = \ell_{ir}\ell_{js}\ell_{kt} \cdots \ell_{nv}\ell_{ow}\, T_{rst\cdots vw} \]
In this new language
• A scalar is a tensor of rank 0 (i.e. $a' = a$).
• A vector is a tensor of rank 1.
We shall often be sloppy and say $T_{ijk\cdots rs}$ is a tensor, when what we really mean is that T is a tensor with components $T_{ijk\cdots rs}$ in a particular frame S.
The expressions tensor of rank 2 and second-rank tensor are used interchangeably. Similarly for tensor of rank n and nth-rank tensor.
Note that a rank-n tensor is more general than the ‘product’ of n vectors, i.e., not every tensor has components that can be written as $a_i b_j c_k \ldots p_r q_s$. For example $a_i b_j + a_j b_i$ is a rank-2 tensor. Another explicit counterexample for n = 2 will be given in section 3.1.5.

3.1.2
Fields
A scalar or vector or tensor quantity is called a field when it is a function of position:
• Temperature $T(\mathbf{r})$ is a scalar field
• The electric field $E_i(\mathbf{r})$ is a vector field
• The stress field $P_{ij}(\mathbf{r})$ is a tensor field (see later)
In the latter case the transformation law is
\[ P'_{ij}(\mathbf{r}) = \ell_{ip}\ell_{jq}\, P_{pq}(\mathbf{r}) \]
or
\[ P'_{ij}(x'_k) = \ell_{ip}\ell_{jq}\, P_{pq}(x_k) \quad\text{with}\quad x'_k = \ell_{kp}\, x_p \]
These two expressions mean the same thing, but the latter form is perhaps better.

3.1.3
In some (mostly older) books you will see dyadic notation. This is rather clumsy for tensors, but it works well for vectors of course!

dyadic notation            index notation
a                          $a_i$
a · b                      $a_i b_i$
A                          $A_{ij}$ or $a_{ij}$
a A b or a · A · b         $a_i A_{ij} b_j$
···                        ···

We will not use dyadic notation for tensors.
3.1.4
Internal consistency in the definition of a tensor
Let $T_{ij}$, $T'_{ij}$, $T''_{ij}$ be the components of a tensor in frames S, S′, S″ respectively, and let
L be the rotation matrix for S → S′, $L = \{\ell_{ij}\}$
M be the rotation matrix for S′ → S″, $M = \{m_{ij}\}$
Then
\[ T''_{ij} = m_{ip} m_{jq}\, T'_{pq} \]
\[ = m_{ip} m_{jq}\, (\ell_{pk}\ell_{ql}\, T_{kl}) \]
\[ = (ML)_{ik}\, (ML)_{jl}\, T_{kl} \]
\[ = n_{ik} n_{jl}\, T_{kl} , \]
where $N = ML$ is the rotation matrix for S → S″, so the definition of a tensor is self-consistent.

3.1.5
Properties of Cartesian tensors
• If $T_{ij\cdots p}$ and $U_{ij\cdots p}$ are two tensors with the same rank n, i.e. both have n indices, then
\[ V_{ij\cdots p} = T_{ij\cdots p} + U_{ij\cdots p} \]
is also a tensor of rank n. The proof is straightforward.
• If $T_{ij\cdots s}$ and $U_{lm\cdots r}$ are the components of tensors T and U of rank n and m respectively then
\[ V_{ij\cdots s\,lm\cdots r} = T_{ij\cdots s}\, U_{lm\cdots r} \]
are the components of a tensor TU of rank n + m, which has $3^n \times 3^m = 3^{m+n}$ components. This is because
\[ \ell_{i\alpha} \cdots \ell_{r\rho}\, V_{\alpha\cdots\rho} = \underbrace{\ell_{i\alpha} \cdots \ell_{s\delta}\, T_{\alpha\cdots\delta}}_{T'_{i\cdots s}}\ \underbrace{\ell_{l\epsilon} \cdots \ell_{r\rho}\, U_{\epsilon\cdots\rho}}_{U'_{l\cdots r}} = V'_{i\cdots r} \]
We occasionally use Greek letters α, β etc for dummy indices when we have a large number of them!
Example: If U = λ is a scalar, then λT is a tensor of rank n + 0 = n.
• If $T_{ijk\cdots s}$ is a tensor of rank n, then $T_{iik\cdots s}$ (i.e. n − 2 free indices) is a tensor of rank n − 2. The process of setting indices to be equal and summing is called contraction.
Example: If $T_{ij} = a_i b_j$ is a tensor of rank 2, then $T_{ii} = a_i b_i$ is a tensor of rank 0 (scalar).
This process of multiplying two tensors and contracting over a pair (or pairs) of indices on different tensors is (sometimes) called taking the scalar product.
• If $T_{ij}$ is a tensor then so is $T_{ji}$ [the components of the transpose $T^T$ of T].
• If $T_{ij} = T_{ji}$ in S, then $T'_{ij} = T'_{ji}$ in S′:
\[ T'_{ij} = \ell_{ip}\ell_{jq}\, T_{pq} \overset{p \leftrightarrow q}{=} \ell_{iq}\ell_{jp}\, T_{qp} = \ell_{jp}\ell_{iq}\, T_{pq} = T'_{ji} \]
$T_{ij}$ is a symmetric tensor; the symmetry is preserved under a change of basis. [The notation p ↔ q refers to relabelling indices.]
Similarly if $T_{ij} = -T_{ji}$, then $T'_{ij} = -T'_{ji}$. $T_{ij}$ is an anti-symmetric tensor.
Given any second rank tensor T, we can always decompose it into symmetric and anti-symmetric parts
\[ T_{ij} = \tfrac{1}{2} (T_{ij} + T_{ji}) + \tfrac{1}{2} (T_{ij} - T_{ji}) \]
• We can re-write the tensor transformation law for rank 2 tensors (only) using matrix notation:
\[ T'_{ij} = \ell_{ip}\ell_{jq}\, T_{pq} = (L\, T\, L^T)_{ij} \]
so
\[ T' = L\, T\, L^T \equiv L\, T\, L^{-1} \qquad \text{(for rank 2 only)} \]
• Kronecker delta, $\delta_{ij}$, is a second rank tensor
Since we defined
\[ \delta_{ij} = \begin{cases} +1 & i = j \\ 0 & i \neq j \end{cases} \]
in all frames, so that $\mathbf{a}\cdot\mathbf{b} = \delta_{ij}\, a_i b_j = \delta_{ij}\, a'_i b'_j$, we may write
\[ \underbrace{\delta'_{ij} = \delta_{ij}}_{\text{given}} = \ell_{ip}\ell_{jq}\, \delta_{pq} \]
which is the definition of a second rank tensor. A tensor which has the same components in all frames is called an invariant or isotropic tensor [more later].
$\delta_{ij}$ is an example of a tensor that cannot be written in the form $T_{ij} = a_i b_j$. [The diagonal terms require all the $a_i$ and $b_i$ to be non-zero, while the off-diagonal terms require $a_i b_j = 0$ for all i ≠ j.]

3.1.6
The quotient theorem
Let T be an entity with 9 components in any frame, say $T_{ij}$ in S, and $T'_{ij}$ in S′.
Let a be an arbitrary vector and let $b_i = T_{ij}\, a_j$. If b always transforms as a vector, then T is a second rank tensor.
To prove this, we determine the transformation properties of T. In S′ we have
\[ b'_i = T'_{ij}\, a'_j = T'_{ij}\, \ell_{jk}\, a_k \]
\[ \phantom{b'_i} \equiv \ell_{ij}\, b_j = \ell_{ij}\, T_{jk}\, a_k \]
Equate the last expression on each line and rearrange
\[ \big( T'_{ij}\, \ell_{jk} - \ell_{ij}\, T_{jk} \big)\, a_k = 0 \]
This expression holds for all vectors a, for example a = (1, 0, 0) etc, therefore
\[ T'_{ij}\, \ell_{jk} = \ell_{ij}\, T_{jk} \]
\[ \Rightarrow\quad T'_{ij}\, \ell_{jk}\ell_{mk} = \ell_{ij}\ell_{mk}\, T_{jk} \]
\[ \Rightarrow\quad T'_{im} = \ell_{ij}\ell_{mk}\, T_{jk} \]
where we used $\ell_{jk}\ell_{mk} = \delta_{jm}$ in the last step. Thus T transforms as a tensor.
Example: If there is a linear relationship between two vectors a and b, so that $a_i = T_{ij}\, b_j$, it follows from the quotient theorem that T is a tensor. This is an alternative (mathematical) definition of a second-rank tensor, and it’s also the way that tensors arise in nature.
Quotient theorem: Generalising, if $R_{ij\cdots s}$ is an arbitrary tensor of rank-m, and $T_{ij\cdots r}$ is a set of $3^n$ numbers (with n > m) such that $T_{ij\cdots r}\, R_{ij\cdots s}$ is a tensor of rank n − m, then $T_{ij\cdots r}$ is a tensor of rank-n.
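As a short worked check of the contraction property listed among the properties of Cartesian tensors (a sketch for the rank-2 case only, using the orthogonality relation $\ell_{ip}\ell_{iq} = \delta_{pq}$): contracting the two indices of the transformation law gives
\[ T'_{ii} = \ell_{ip}\ell_{iq}\, T_{pq} = \delta_{pq}\, T_{pq} = T_{pp} , \]
so the trace takes the same value in every frame, which is exactly the statement that it is a tensor of rank 0.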
Proof is similar to the rank-2 case, but it isn’t illuminating.

3.2
Pseudotensors, pseudovectors & pseudoscalars
Suppose that we now allow reflection and inversion (as well as rotation) of the basis vectors, and represent them all by a transformation matrix L with
\[ \det L = +1 \quad \text{for rotations} \]
\[ \det L = -1 \quad \text{for reflections and inversions} \]
In the second case, the handedness of the basis is changed: if S is right-handed (RH), then S′ is left-handed (LH), and vice versa.
Before introducing pseudotensors, we note that the basis vectors always transform as
\[ \mathbf{e}'_i = \ell_{ij}\, \mathbf{e}_j \]
A second-rank tensor or true tensor T obeys the transformation law
\[ T'_{ij} = \ell_{ip}\ell_{jq}\, T_{pq} \]
for all transformations, i.e. rotations, reflections and inversions.
A second-rank pseudotensor T obeys the transformation law
\[ T'_{ij} = (\det L)\, \ell_{ip}\ell_{jq}\, T_{pq} \]
There is a change of sign for components of a pseudotensor, relative to a true tensor, when we make a reflection or inversion of the reference axes.
We can similarly define pseudovectors, pseudoscalars and rank-n pseudotensors.
Note: $\det L = -1$ for a basis transformation that consists of any combination of an odd number of reflections or inversions, plus any number of rotations.

3.2.1
Vectors
Inversion of the basis vectors is defined by
\[ (\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3) \to (\mathbf{e}'_1, \mathbf{e}'_2, \mathbf{e}'_3) = (-\mathbf{e}_1, -\mathbf{e}_2, -\mathbf{e}_3) \]
so that
\[ L = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \]
which has components $\ell_{ij} = -\delta_{ij}$.
Let a be a vector (also called a tensor of rank 1 or a polar vector or a true vector). We showed previously that the components $a_i$ transform as $a'_i = \ell_{ij}\, a_j$, so we have
\[ a'_1 = -a_1 \qquad a'_2 = -a_2 \qquad a'_3 = -a_3 \]
The vector itself therefore transforms as
\[ \mathbf{a}' \equiv a'_i \mathbf{e}'_i = (-a_i)\, (-\mathbf{e}_i) = a_i \mathbf{e}_i = \mathbf{a} \]
and thus a (true/polar) vector remains unchanged by inversion of the basis.
[Figure: the vector a drawn in the basis S and in the inverted basis S′; the vector itself does not move]

3.2.2
Pseudovectors
If a and b are (true) vectors then $\mathbf{c} = \mathbf{a}\times\mathbf{b}$ is a pseudovector (also known as a pseudotensor of rank 1 or an axial vector).
Let’s illustrate this by considering an inversion.
\[ c'_1 = a'_2 b'_3 - a'_3 b'_2 = (-a_2)\, (-b_3) - (-a_3)\, (-b_2) = +c_1 \quad \text{etc} \]
Therefore
\[ \mathbf{c}' \equiv c'_i \mathbf{e}'_i = (+c_i)\, (-\mathbf{e}_i) = -c_i \mathbf{e}_i = -\mathbf{c} \]
Thus the direction of $\mathbf{c} = \mathbf{a}\times\mathbf{b}$ is reversed in this LH basis.
[Figure: the vectors a, b and c in the basis S, and in the inverted basis S′ where c is reversed]
The vectors a, b and $\mathbf{a}\times\mathbf{b}$ form a LH triad in the S′ basis, i.e. they always have the same ‘handedness’ as the underlying basis.
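The same behaviour can be seen for a reflection rather than an inversion. As an illustrative sketch (the particular reflection is chosen here and is not taken from the notes), let the basis be reflected in the xy-plane, $\mathbf{e}'_3 = -\mathbf{e}_3$ with $\mathbf{e}'_1 = \mathbf{e}_1$, $\mathbf{e}'_2 = \mathbf{e}_2$, so that $L = \mathrm{diag}(1, 1, -1)$ and $\det L = -1$. The true vectors have $a'_i = (a_1, a_2, -a_3)$ and $b'_i = (b_1, b_2, -b_3)$, and computing the cross product from the components in S′ gives
\[ c'_1 = a'_2 b'_3 - a'_3 b'_2 = -(a_2 b_3 - a_3 b_2) = -c_1 , \qquad c'_2 = a'_3 b'_1 - a'_1 b'_3 = -c_2 , \qquad c'_3 = a'_1 b'_2 - a'_2 b'_1 = +c_3 . \]
Each component carries an extra factor of −1 relative to the true-vector rule, which would give $(c_1, c_2, -c_3)$.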
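Now consider the general case. We define the epsilon symbol in all bases, regardless of their handedness, in the same way:
\[ \epsilon_{ijk} = \begin{cases} +1 & \text{if } ijk \text{ is an even permutation of } 123 \\ -1 & \text{if } ijk \text{ is an odd permutation of } 123 \\ 0 & \text{otherwise} \end{cases} \]
The components of $\mathbf{c} = \mathbf{a}\times\mathbf{b}$ are $c_i = \epsilon_{ijk}\, a_j b_k$ in S and $c'_i = \epsilon_{ijk}\, a'_j b'_k$ in S′.
We can now determine the components of $\mathbf{c} = \mathbf{a}\times\mathbf{b}$ in S′
\[ c'_i = \epsilon_{ijk}\, a'_j b'_k \]
\[ = \epsilon_{ijk}\, \ell_{jp} a_p\, \ell_{kq} b_q \]
\[ = \epsilon_{mjk}\, \ell_{ir}\ell_{mr}\, \ell_{jp}\ell_{kq}\, a_p b_q \]
\[ = \det L\ \ell_{ir}\, \epsilon_{rpq}\, a_p b_q \]
\[ = \det L\ \ell_{ir}\, (\mathbf{a}\times\mathbf{b})_r \]
where we used $\epsilon_{mjk}\, \ell_{ir}\ell_{mr} = \epsilon_{mjk}\, \delta_{im} = \epsilon_{ijk}$ to get to the third line. So, finally
\[ c'_i = \det L\ \ell_{ir}\, c_r \]
This is our definition of a pseudovector. Equivalently, since $\mathbf{e}'_i = \ell_{ij}\, \mathbf{e}_j$ then
\[ \mathbf{c}' \equiv c'_i \mathbf{e}'_i = (\det L\ \ell_{ir}\, c_r)\, \ell_{ij}\, \mathbf{e}_j = \det L\ c_i \mathbf{e}_i = \det L\ \mathbf{c} \]
Therefore a pseudovector changes sign under any improper transformation such as inversion of the basis, or reflection.
Pseudotensors: $\epsilon_{ijk}$ is a pseudotensor of rank 3. The proof is simple and uses the fact that $\epsilon_{ijk}$ is the same in all bases:
\[ \epsilon'_{ijk} \equiv \epsilon_{ijk} = \det L\ \det L\ \epsilon_{ijk} = \det L\ \ell_{ip}\ell_{jq}\ell_{kr}\, \epsilon_{pqr} \]
where we used $(\det L)^2 = 1$ and, in the last step, the definition of the determinant.
Furthermore since $\epsilon_{ijk}$ is the same in all bases, it’s a rank-3 isotropic or invariant pseudotensor. (More on this later.)
We can build pseudotensors of higher rank using a combination of vectors, tensors, δ and ε. For example:
• If $a_i$ and $b_i$ are tensors of rank 1, then the $3^5$ quantities $\epsilon_{ijk}\, a_l b_m$ are the components of a pseudotensor of rank 5.
• $\epsilon_{ijk}\, \delta_{pq}$ is a pseudotensor of rank 5.
In general:
• The product of two tensors is a tensor
• The product of two pseudotensors is a tensor
• The product of a tensor and a pseudotensor is a pseudotensor

3.3
Some examples
• Velocity: $\mathbf{v} = d\mathbf{r}(t)/dt$ (vector)
• Acceleration: $\mathbf{a} = \dot{\mathbf{v}}$ (vector)
• Force: $\mathbf{F} = m\mathbf{a}$ (vector)
• Electric field: $\mathbf{E} = \mathbf{F}/q$ (vector), where F is the force on a test charge q
• Torque: $\mathbf{G} = \mathbf{r}\times\mathbf{F}$ (pseudovector)
• Angular velocity (ω): $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$ (pseudovector)
• Angular momentum: $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ (pseudovector)
• Magnetic field (B): $\mathbf{F} = q\, \mathbf{v}\times\mathbf{B}$ (pseudovector)
\[ \mathbf{B}(\mathbf{r}) = \frac{\mu_0}{4\pi} \oint_C \frac{i\, d\mathbf{r}' \times (\mathbf{r} - \mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|^3} \qquad \text{(Biot–Savart Law, more later)} \]
[Figure: the current loop C carrying current i, with line element dr′ at r′ and the field point P at r, relative to the origin O]
• $\mathbf{E}\cdot\mathbf{B}$ (pseudoscalar)
A pseudoscalar changes sign under an improper transformation: $a' = (\det L)\, a$
In this case, one can easily show that
\[ (\mathbf{E}\cdot\mathbf{B})' = \det L\ \mathbf{E}\cdot\mathbf{B} \]
• Helicity: $h = \mathbf{p}\cdot\mathbf{s}$ (pseudoscalar)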
The pseudovector s is the angular momentum, or spin, of a particle/ball. As the ball spins, a point on it traces out a RH helix. In the figure, p is parallel to s.
[Figure: a spinning ball with momentum p and spin s]

3.4
Invariant/Isotropic Tensors
A tensor T is invariant or isotropic^1 if it has the same components, $T_{ijk\cdots}$ in any Cartesian basis (or frame of reference), so that
\[ T_{ijk\cdots} = \ell_{i\alpha}\ell_{j\beta}\ell_{k\gamma} \cdots T_{\alpha\beta\gamma\cdots} \]
for every (orthogonal) transformation matrix $L = \{\ell_{ij}\}$.
Similarly, T is an invariant pseudotensor if
\[ T_{ijk\cdots} = \det L\ \ell_{i\alpha}\ell_{j\beta}\ell_{k\gamma} \cdots T_{\alpha\beta\gamma\cdots} \]
Footnote 1: Isotropic means “the same in all directions”.
Theorem: If $a_{ij}$ is a second-rank invariant tensor, then
\[ a_{ij} = \lambda\, \delta_{ij} \]
Proof: For a rotation of π/2 about the z-axis
\[ L = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
Since the only non-zero elements are $\ell_{12} = 1$, $\ell_{21} = -1$, $\ell_{33} = 1$, using $a_{ij} = \ell_{i\alpha}\ell_{j\beta}\, a_{\alpha\beta}$, we find
\[ a_{11} = \ell_{1\alpha}\ell_{1\beta}\, a_{\alpha\beta} = \ell_{12}\ell_{12}\, a_{22} = a_{22} \]
\[ a_{13} = \ell_{1\alpha}\ell_{3\beta}\, a_{\alpha\beta} = \ell_{12}\ell_{33}\, a_{23} = a_{23} \]
\[ a_{23} = \ell_{2\alpha}\ell_{3\beta}\, a_{\alpha\beta} = \ell_{21}\ell_{33}\, a_{13} = -a_{13} \]
Therefore
\[ a_{11} = a_{22} , \qquad a_{13} = 0 = a_{23} \]
Similarly, for a rotation of π/2 about the y-axis, we find
\[ a_{11} = a_{33} , \qquad a_{12} = 0 = a_{32} \]
Finally, for a rotation of π/2 about the x-axis
\[ a_{22} = a_{33} , \qquad a_{21} = 0 = a_{31} \]
The only solution to these equations is $a_{ij} = \lambda\, \delta_{ij}$. We’ve already shown that $\delta_{ij}$ is an invariant second-rank tensor, therefore $\lambda\, \delta_{ij}$ is the most general invariant second-rank tensor.
One can use a similar argument to show that the only invariant vector is the zero vector. It’s obvious that no non-zero vector has the same components in all bases!
Theorem: There is no invariant tensor of rank 3. The most general invariant pseudotensor of rank 3 has components
\[ a_{ijk} = \lambda\, \epsilon_{ijk} \]
Proof: This is similar to the rank 2 case [tutorial].
Theorem: The most general 4th rank invariant tensor has components
\[ a_{ijkl} = \lambda\, \delta_{ij}\delta_{kl} + \mu\, \delta_{ik}\delta_{jl} + \nu\, \delta_{il}\delta_{jk} \]
The proof is long and not very illuminating. See, for example, Matthews, Vector Calculus, (Springer) or Jeffreys, Cartesian Tensors, (CUP). However, it’s easy to show that the expression above is indeed an invariant tensor.
The most general rank-5 invariant pseudotensor has components
\[ a_{ijklm} = \lambda\, \delta_{ij}\, \epsilon_{klm} + \ldots \]
The most general rank-6 invariant tensor has components
\[ a_{ijklmn} = \lambda\, \delta_{ij}\delta_{kl}\delta_{mn} + \ldots \]
Note that invariant tensors involving ε can always be rewritten as sums of products of δs using the expression for the product of two epsilons.
The most general invariant tensor of rank 2n is a sum of products of constants times n Kronecker deltas. There is no invariant pseudotensor of rank 2n.
Similarly, the most general invariant pseudotensor of rank 2n + 1 is a sum of products of constants times one ε and n − 1 Kronecker deltas. There is no invariant tensor of rank 2n + 1.
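The claim that the rank-4 expression is an invariant tensor can be verified term by term. A sketch for the first term, using only the orthogonality relation $\ell_{i\alpha}\ell_{j\alpha} = \delta_{ij}$:
\[ \ell_{i\alpha}\ell_{j\beta}\ell_{k\gamma}\ell_{l\delta}\ \delta_{\alpha\beta}\, \delta_{\gamma\delta} = (\ell_{i\alpha}\ell_{j\alpha})\, (\ell_{k\gamma}\ell_{l\gamma}) = \delta_{ij}\, \delta_{kl} . \]
The terms $\delta_{ik}\delta_{jl}$ and $\delta_{il}\delta_{jk}$ follow in the same way by pairing the ℓ factors differently, and no factor of $\det L$ appears, so the result holds for improper transformations as well.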
Chapter 4
Taylor expansions
Taylor expansion is one of the most important and fundamental techniques in the physicist’s toolkit. It allows a differentiable function to be expressed as a power series in its argument(s). This is useful when approximating a function, it often allows the problem to be ‘solved’ in some range of interest, and it’s used in deriving fundamental differential equations. We shall use the expression ‘Taylor’s Theorem’ interchangeably with ‘Taylor expansion’.
We shall assume familiarity with Taylor expansions of functions of one variable, so we won’t cover this in lectures. However, we include some notes here for completeness. You may have seen the multivariate Taylor expansion beyond leading order, but possibly not quite like this. . .

4.1
The one-dimensional case
Let $f(x)$ have a continuous mth-order derivative $f^{(m)}(x)$ in a ≤ x ≤ b, so that
\[ \int_a^{x_1} f^{(m)}(x_0)\, dx_0 = f^{(m-1)}(x_1) - f^{(m-1)}(a) \]
Integrating a total of m times gives
\[ \int_a^{x_m} \cdots \int_a^{x_1} f^{(m)}(x_0)\, dx_0 \cdots dx_{m-1} \]
\[ = \int_a^{x_m} \cdots \int_a^{x_2} \Big[ f^{(m-1)}(x_1) - f^{(m-1)}(a) \Big]\, dx_1 \cdots dx_{m-1} \]
\[ = \int_a^{x_m} \cdots \int_a^{x_3} \Big[ f^{(m-2)}(x_2) - f^{(m-2)}(a) - (x_2 - a)\, f^{(m-1)}(a) \Big]\, dx_2 \cdots dx_{m-1} \]
\[ = f(x_m) - f(a) - (x_m - a)\, f'(a) - \frac{1}{2!} (x_m - a)^2 f''(a) - \cdots - \frac{1}{(m-1)!} (x_m - a)^{m-1} f^{(m-1)}(a) \]
where we used the basic integral
\[ \int_a^x (y - a)^{n-1}\, dy = \frac{1}{n} (x - a)^n \]
Now let $x_m \to x$, which gives
\[ f(x) = f(a) + (x - a)\, f'(a) + \frac{1}{2!} (x - a)^2 f''(a) + \ldots + \frac{1}{(m-1)!} (x - a)^{m-1} f^{(m-1)}(a) + R_m , \]
where $n! = n(n-1) \cdots 1$ is the usual factorial function, with 0! = 1, and the remainder $R_m$ is
\[ R_m = \int_a^{x} \cdots \int_a^{x_1} f^{(m)}(x_0)\, dx_0 \cdots dx_{m-1} \]
But from the mean value theorem applied to $f^{(m)}$, we have
\[ \int_a^x f^{(m)}(x_0)\, dx_0 = (x - a)\, f^{(m)}(\xi) , \qquad a \le \xi \le x \]
which gives the “Lagrange form” for the remainder
\[ R_m(x) = \frac{1}{m!} (x - a)^m f^{(m)}(\xi) , \qquad a \le \xi \le x \]
Notes
• We can repeat the proof above for $\int_x^a \cdots$ where x ∈ [c, a] with c ≤ a ≤ b. Since nothing changes, we can talk about expansion in a region about x = a.
• If $\lim_{m\to\infty} R_m \to 0$ (as usually assumed here), we have an infinite series
\[ f(x) = \sum_{n=0}^{\infty} \frac{1}{n!}\, f^{(n)}(a)\, (x - a)^n \]
This is the Taylor expansion of $f(x)$ about x = a.
The set of x values for which the series converges is called the region of convergence of the Taylor expansion.
• If a = 0, then
\[ f(x) = \sum_{n=0}^{\infty} \frac{1}{n!}\, f^{(n)}(0)\, x^n \]
The Taylor expansion about x = 0 is called the Maclaurin expansion.
Physicist’s “proof”: We can bypass the formal proof above by assuming that a power series expansion of $f(x)$ exists (i.e. the polynomials $x^n$ form a complete basis) so that
\[ f(x) = \sum_{n=0}^{\infty} a_n x^n . \]
Now differentiate p times and set x = 0, for each p, to obtain
\[ f^{(p)}(0) = 0 + \cdots + p!\, a_p + \cdots + 0 \]
which gives $a_p = \frac{1}{p!}\, f^{(p)}(0)$ (as before).
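As a further illustration of the Maclaurin expansion and the Lagrange remainder (a sketch using the exponential function, chosen here because all of its derivatives are equal): for $f(x) = e^x$ we have $f^{(n)}(0) = 1$ for every n, so
\[ e^x = \sum_{n=0}^{m-1} \frac{x^n}{n!} + R_m , \qquad R_m = \frac{x^m}{m!}\, e^{\xi} , \quad \xi \text{ between } 0 \text{ and } x . \]
For fixed x, $|R_m| \le \frac{|x|^m}{m!}\, e^{|x|} \to 0$ as m → ∞, so the series converges to $e^x$ for all x.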
4.1.1
Examples
Example 1: Expand the function
\[ f(x) = \sin x \]
about x = 0. We need
\[ f^{(2n)}(0) = (-1)^n \sin 0 = 0 \]
\[ f^{(2n+1)}(0) = (-1)^n \cos 0 = (-1)^n \]
Now, since $|f^{(m)}(\xi)| \le 1$, then, for fixed x,
\[ |R_m| = \left| \frac{1}{m!}\, x^m f^{(m)}(\xi) \right| \le \frac{1}{m!}\, |x|^m \to 0 \quad \text{as } m \to \infty \]
Therefore
\[ \sin x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} + \ldots \]
This ‘small x’ expansion is shown in the figure.
[Figure: sin x compared with the approximations $x$ and $x - \frac{1}{3!} x^3$]
Example 2: Expand the function
\[ f(x) = (1 + x)^{\alpha} \]
about x = 0. In this case
\[ f^{(n)}(0) = \alpha(\alpha - 1) \cdots (\alpha - n + 1) \equiv \frac{\alpha!}{(\alpha - n)!} \]
giving
\[ f(x) = \sum_{n=0}^{\infty} \frac{\alpha!}{n!\, (\alpha - n)!}\, x^n \equiv \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n \]
The Taylor expansion includes the binomial expansion; α need not be a positive integer.
Example 3: a ‘problem’ case. Consider, for example, the well-behaved function
\[ f(x) = \exp\left( -\frac{1}{x^2} \right) \]
Now $f(0) = 0$, and $f^{(n)}(0) = 0\ \forall n$, so the Maclaurin series would give
\[ f(x) = 0 + 0 + 0 + \ldots = 0 \quad \forall x \]
which is clearly wrong for x ≠ 0.
Beware of essential singularities: not all functions with an infinite number of derivatives can be expressed as a Taylor series. See “Laurent Series” in courses on Complex Variables/Analysis.

4.1.2
A precursor to the three-dimensional case
If we regard $f(x + a) \equiv g(a)$ temporarily as a function of a only, we can write $g(a)$ as a Maclaurin series in powers of a
\[ f(x + a) \equiv g(a) = \sum_{n=0}^{\infty} \frac{1}{n!}\, g^{(n)}(0)\, a^n , \]
which, since $g^{(n)}(0) = f^{(n)}(x)$, we can rewrite as
\[ f(x + a) = \sum_{n=0}^{\infty} \frac{1}{n!}\, f^{(n)}(x)\, a^n \equiv \exp\left( a \frac{d}{dx} \right) f(x) \]
The differential operator exp (a (d/dx)) is defined by its power-series expansion. This is the form that we shall generalise to three dimensions in an elegant way.
It can also be obtained by first defining
\[ F(t) \equiv f(x + at) \]
which we regard as a function of t. We need the expansion in powers of t about t = 0, namely
\[ F(t) = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, F^{(n)}(0) \qquad (4.1) \]
Noting that $F^{(n)}(0) = a^n f^{(n)}(x)$, and setting t = 1, we find
\[ f(x + a) = F(1) = \sum_{n=0}^{\infty} \frac{1}{n!}\, a^n f^{(n)}(x) \]
as before.

4.2
The three-dimensional case
With this trick, we can use the one-dimensional result to find the Taylor expansion of $\phi(\mathbf{r} + \mathbf{a})$ in powers of a about the point r. Let
\[ F(t) \equiv \phi(\mathbf{r} + t\mathbf{a}) \equiv \phi(\mathbf{u}) \qquad \text{(where we defined } \mathbf{u} = \mathbf{r} + t\mathbf{a}\text{)} \]
\[ \phantom{F(t)} = \sum_{n=0}^{\infty} \frac{t^n}{n!}\, F^{(n)}(0) \]
where we used equation (4.1) above. We want $\phi(\mathbf{r} + \mathbf{a})$ which is $F(1)$.
Using the chain rule, the first derivative of $F(t)$ with respect to t is
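\[ F^{(1)}(t) = \frac{\partial\phi(\mathbf{u})}{\partial u_i} \frac{\partial u_i}{\partial t} = a_i \frac{\partial\phi}{\partial u_i} = (\mathbf{a}\cdot\nabla_u)\, \phi(\mathbf{u}) \]
where we used
\[ \frac{\partial u_i}{\partial t} = \frac{\partial}{\partial t} (x_i + t a_i) = a_i \]
and defined $\mathbf{a}\cdot\nabla_u \equiv a_i\, \partial/\partial u_i$.
The nth derivative of $F(t)$ is
\[ F^{(n)}(t) = (\mathbf{a}\cdot\nabla_u)^n\, \phi(\mathbf{u}) \quad\text{and hence}\quad F^{(n)}(0) = (\mathbf{a}\cdot\nabla_r)^n\, \phi(\mathbf{r}) \qquad (4.2) \]
For $F(1)$ we have
\[ \phi(\mathbf{r} + \mathbf{a}) = \sum_{n=0}^{\infty} \frac{1}{n!}\, (\mathbf{a}\cdot\nabla_r)^n\, \phi(\mathbf{r}) \equiv \exp\left( \mathbf{a}\cdot\nabla_r \right) \phi(\mathbf{r}) \]
This is the Taylor expansion of a scalar field in three dimensions, in a rather elegant form.
Generalisation to an arbitrary tensor field is easy. Simply replace $\phi(\mathbf{r})$ by $T_{ij\cdots}(\mathbf{r})$ in the above expression.
Example: Find the Taylor expansion of $\phi(\mathbf{r} + \mathbf{a}) = \dfrac{1}{|\mathbf{r} + \mathbf{a}|}$ for $r \gg a$.
Since $\phi(\mathbf{r}) = 1/|\mathbf{r}| = 1/r$, we have
\[ \frac{1}{|\mathbf{r} + \mathbf{a}|} = \sum_{n=0}^{\infty} \frac{1}{n!}\, (\mathbf{a}\cdot\nabla_r)^n\, \frac{1}{r} \]
\[ = \frac{1}{r} + \frac{1}{1!}\, (a_i \partial_i)\, \frac{1}{r} + \frac{1}{2!}\, (a_i \partial_i)\, (a_j \partial_j)\, \frac{1}{r} + \cdots \]
\[ = \frac{1}{r} - \frac{\mathbf{a}\cdot\mathbf{r}}{r^3} + \frac{3 (\mathbf{a}\cdot\mathbf{r})^2 - a^2 r^2}{2 r^5} + O\left( \frac{1}{r^4} \right) \]
Exercise: check this explicitly.
This result is used in the multipole expansion in electrostatics.

Chapter 5
The moment of inertia tensor

5.1
Angular momentum and kinetic energy
Suppose a rigid body of arbitrary shape rotates with angular velocity $\boldsymbol{\omega} = \omega\, \mathbf{n}$ about a fixed axis, parallel to the unit vector n, which passes through the origin.
Consider a small element of mass dm at the point P, with position vector r relative to O.
If the rigid body has density (mass per unit volume) $\rho(\mathbf{r})$, then $dm = \rho\, dV$.
[Figure: the rotating body, the axis ω through O, and the mass element dm at P with position vector r]
The velocity of the element is
\[ \mathbf{v} = \boldsymbol{\omega}\times\mathbf{r} \]
We can see this geometrically from the figure.
[Figure: r at angle φ to the axis ω, rotated through the small angle δθ]
The distance |δr| moved in time δt, is
\[ |\delta\mathbf{r}| = r \sin\phi\ \delta\theta = \delta\theta\, |\mathbf{n}\times\mathbf{r}| \]
So its velocity is
\[ \mathbf{v} = \frac{\delta\mathbf{r}}{\delta t} = \boldsymbol{\omega}\times\mathbf{r} \quad\text{where}\quad \boldsymbol{\omega} = \frac{\delta\theta}{\delta t}\, \mathbf{n} \]
Alternatively, we can use the rotation matrix $R(\theta, \mathbf{n})$
\[ \delta x_i = R_{ij}(\delta\theta, \mathbf{n})\, x_j - x_i = \Big[ -\epsilon_{ijk}\, n_k\, \delta\theta + O\big((\delta\theta)^2\big) \Big]\, x_j \]
\[ \Rightarrow\quad \frac{\delta x_i}{\delta t} = (\mathbf{n}\times\mathbf{r})_i\, \frac{\delta\theta}{\delta t} \]
which again gives $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$.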
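A simple special case makes the result $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$ concrete (an illustrative sketch, with the rotation axis chosen here to be the z-axis, $\mathbf{n} = \mathbf{e}_3$, and $\mathbf{r} = (x, y, z)$):
\[ \mathbf{v} = \omega\, \mathbf{e}_3 \times (x, y, z) = \omega\, (-y,\ x,\ 0) , \qquad |\mathbf{v}| = \omega \sqrt{x^2 + y^2} . \]
The velocity is perpendicular to both the axis and r, and its magnitude is ω times the perpendicular distance of the element from the axis.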
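5.1.1 Angular momentum
The angular momentum $\vec L$ of a point particle of mass $m$ at position $\vec r$ moving with
velocity $\vec v = \dot{\vec r}$ is $\vec L = \vec r \times \vec p$, where the momentum $\vec p = m \vec v$.
The angular momentum $d\vec L$ of an element of mass $dm = \rho \, dV$ at $\vec r$ is
    $$ d\vec L = \rho(\vec r) \, dV \; \vec r \times \vec v $$
The angular momentum of the whole rotating body is then
    $$ \vec L = \int_{body} \rho \; \vec r \times (\vec\omega \times \vec r) \, dV $$
In components
    $$ L_i = \int_{body} \rho \, \epsilon_{ijk} x_j \, \epsilon_{klm} \omega_l x_m \, dV $$
    $$     = \int_{body} \rho \, (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}) \, x_j \omega_l x_m \, dV $$
    $$     = \int_{body} \rho \, (r^2 \omega_i - x_i x_j \omega_j) \, dV $$
Thus
    $$ L_i = I_{ij} \, \omega_j \qquad \text{with} \qquad I_{ij} = \int_{body} \rho \, (r^2 \delta_{ij} - x_i x_j) \, dV $$
The geometric quantity $I(O)$ (where $O$ refers to the origin) is called the moment of
inertia tensor.[1] It is a tensor because $\vec L$ is a pseudovector, $\vec\omega$ is a pseudovector, and
hence from the quotient theorem $I$ is a tensor.
Note also that $I_{ij}$ is symmetric, and it is independent of the axis of rotation $\vec n$.
[1] We will often somewhat sloppily call it the inertia tensor.

5.1.2 Kinetic energy
The kinetic energy, $dT$, of an element of mass $dm$ is $dT = \frac12 (\rho \, dV) \, (\vec\omega \times \vec r)^2$. The
kinetic energy of the body is then
    $$ T = \frac12 \int_{body} \rho \, \epsilon_{ijk} \omega_j x_k \, \epsilon_{ilm} \omega_l x_m \, dV $$
    $$   = \frac12 \int_{body} \rho \, (\delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}) \, \omega_j x_k \omega_l x_m \, dV $$
    $$   = \frac12 \int_{body} \rho \, \left[ \omega^2 r^2 - (\vec r \cdot \vec\omega)^2 \right] dV $$
    $$   = \frac12 \int_{body} \rho \, (r^2 \delta_{ij} - x_i x_j) \, dV \; \omega_i \omega_j $$
which gives
    $$ T = \frac12 I_{ij} \, \omega_i \omega_j = \frac12 \vec L \cdot \vec\omega $$

Alternative (more familiar) forms
Recalling that the angular velocity may be written as $\vec\omega = \omega \, \vec n$, consider
    $$ \vec n \cdot \vec L = n_i I_{ij} \omega_j = I_{ij} n_i n_j \, \omega \equiv I^{(n)} \omega $$
where $\vec L \cdot \vec n$ is the component of angular momentum parallel to the axis $\vec n$, and
    $$ I^{(n)} = I_{ij} n_i n_j = \int_{body} \rho \, \left[ r^2 - (\vec r \cdot \vec n)^2 \right] dV \equiv \int_{body} \rho \, r_\perp^2 \, dV $$
is the moment of inertia about $\vec n$, with $r_\perp$ the perpendicular distance from the $\vec n$-axis.
Similarly for the kinetic energy, so that
    $$ L^{(n)} = I^{(n)} \omega , \qquad T = \frac12 I^{(n)} \omega^2 \qquad \text{with} \quad L^{(n)} = \vec L \cdot \vec n , \quad I^{(n)} = I_{ij} n_i n_j $$

Example: Consider a cube of side $a$ of constant density $\rho$ and mass $M = \rho a^3$
    $$ I_{ij}(O) = \int_{body} \rho \, (r^2 \delta_{ij} - x_i x_j) \, dV $$
[Figure: the cube occupies $0 \le x, y, z \le a$, with one corner at the origin $O$.]
In this case
    $$ I_{11} = \rho \int_0^a dx \, dy \, dz \, \left( x^2 + y^2 + z^2 - x^2 \right) = \rho \left[ \tfrac13 y^3 x z + \tfrac13 z^3 x y \right]_0^a = \tfrac23 \rho a^5 = \tfrac23 M a^2 $$
    $$ I_{12} = \rho \int_0^a dx \, dy \, dz \, (-xy) = -\rho \left[ \tfrac12 x^2 \, \tfrac12 y^2 \, z \right]_0^a = -\tfrac14 \rho a^5 = -\tfrac14 M a^2 $$
By symmetry
    $$ I(O) = M a^2 \begin{pmatrix} \tfrac23 & -\tfrac14 & -\tfrac14 \\ -\tfrac14 & \tfrac23 & -\tfrac14 \\ -\tfrac14 & -\tfrac14 & \tfrac23 \end{pmatrix} $$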
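A short worked check of $L_i = I_{ij}\omega_j$ for the cube may help here. This is an illustrative sketch that only uses the matrix $I(O)$ quoted above; the two choices of $\vec\omega$ are assumptions made for the illustration.

    % (a) rotation about the x-axis, \vec\omega = \omega (1, 0, 0):
    \vec L = M a^2 \omega \, \left( \tfrac23 , \, -\tfrac14 , \, -\tfrac14 \right)
    % so \vec L is not parallel to \vec\omega.

    % (b) rotation about the body diagonal, \vec\omega = (\omega/\sqrt3) (1, 1, 1):
    L_i = M a^2 \frac{\omega}{\sqrt3} \left( \tfrac23 - \tfrac14 - \tfrac14 \right) = \tfrac16 M a^2 \, \omega_i
    % so \vec L is parallel to \vec\omega, with I^{(n)} = \tfrac16 M a^2 about the diagonal.

Case (b) shows that the body diagonal through $O$ is a direction for which $\vec L$ is parallel to $\vec\omega$, while case (a) shows that this is not true for a general axis.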
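5.1.3 The parallel axes theorem
It's often more useful, and also simpler, to find the
moment of inertia tensor about the centre of mass
$G$, rather than about an arbitrary point $O$. There
is, however, a simple relationship between them.
[Figure: a mass element $dm$ at the point $P$, with position $\vec r$ relative to $O$ and $\vec r'$ relative to $G$; $\vec R$ is the position of $G$ relative to $O$.]
Taking $O$ to be the origin, and $\vec{OG} = \vec R$, we have $\vec r = \vec R + \vec r'$, giving
    $$ I_{ij}(O) = \int \rho(\vec r) \, (r^2 \delta_{ij} - x_i x_j) \, dV $$
    $$           = \int \rho'(\vec r') \left\{ (\vec R + \vec r')^2 \delta_{ij} - (X_i + x'_i)(X_j + x'_j) \right\} dV' $$
In the above, $\rho(\vec r) = \rho(\vec R + \vec r') \equiv \rho'(\vec r')$, and we changed integration variables to $\vec r'$.
Expanding the integrand and using the definition of $G$, namely $\int \rho'(\vec r') \, \vec r' \, dV' = 0$,
we get
    $$ I_{ij}(O) = \int \rho'(\vec r') \left\{ (R^2 + r'^2) \, \delta_{ij} - X_i X_j - x'_i x'_j \right\} dV' $$
Hence
    $$ I_{ij}(O) = I_{ij}(G) + M \, (R^2 \delta_{ij} - X_i X_j) $$
where $M = \int \rho'(\vec r') \, dV'$ is the total mass of the body. This is a general result; given
$I(G)$ we can easily find the moment of inertia tensor about any other point.
The general result above is sometimes called the parallel axes theorem. However,
the parallel axes theorem technically refers to the moment of inertia about an axis through $O$
in the same direction $\vec n$ as the original axis through $G$
    $$ I^{(n)}(O) = I^{(n)}(G) + M R_\perp^2 $$
where $R_\perp$ (with $R_\perp^2 \equiv R^2 - (\vec R \cdot \vec n)^2$) is the perpendicular distance from the $\vec n$ axis.

Example (revisited): In our previous example, the centre of mass $G$ is at the
centre of the cube with position vector $\vec R = (\tfrac12 a, \tfrac12 a, \tfrac12 a)$. Then
    $$ I_{11}(G) = \rho \int_{-a/2}^{a/2} dx \, dy \, dz \, \left( x^2 + y^2 + z^2 - x^2 \right) $$
    $$           = \rho \left\{ \left[ \tfrac13 y^3 \right]_{-a/2}^{a/2} [x]_{-a/2}^{a/2} [z]_{-a/2}^{a/2} + \left[ \tfrac13 z^3 \right]_{-a/2}^{a/2} [x]_{-a/2}^{a/2} [y]_{-a/2}^{a/2} \right\} $$
    $$           = \rho \, \tfrac13 \cdot 2 (a/2)^3 \cdot 2(a/2) \cdot 2(a/2) \cdot 2 = \tfrac16 \rho a^5 = \tfrac16 M a^2 $$
    $$ I_{12}(G) = \rho \int_{-a/2}^{a/2} dx \, dy \, dz \, (-xy) = 0 \qquad \text{because} \quad \left[ \tfrac12 x^2 \right]_{-a/2}^{a/2} = 0 $$
Similarly for the other components.
[Figure: the cube of side $a$ with corner $O$ and centre of mass $G$.]
The inertia tensor about the centre of mass is then
    $$ I_{ij}(G) = M a^2 \begin{pmatrix} \tfrac16 & 0 & 0 \\ 0 & \tfrac16 & 0 \\ 0 & 0 & \tfrac16 \end{pmatrix}_{ij} $$
and
    $$ M \, (R^2 \delta_{ij} - X_i X_j) = M a^2 \begin{pmatrix} \tfrac12 & -\tfrac14 & -\tfrac14 \\ -\tfrac14 & \tfrac12 & -\tfrac14 \\ -\tfrac14 & -\tfrac14 & \tfrac12 \end{pmatrix}_{ij} $$
Since $\tfrac12 + \tfrac16 = \tfrac23$, this reproduces our previous result for $I_{ij}(O)$.

5.1.4 Diagonalisation of rank-two tensors
Question: are there any directions for $\vec\omega$ such that $\vec L$ is parallel to $\vec\omega$?
If so, then $\vec L = \lambda \, \vec\omega$, and hence
    $$ (I_{ij} - \lambda \delta_{ij}) \, \omega_j = 0 $$
For a non-trivial solution of these three simultaneous linear equations, we must have
$\det (I_{ij} - \lambda \delta_{ij}) = 0$. Expanding the determinant, or writing it as
    $$ \det (I_{ij} - \lambda\delta_{ij}) = \tfrac16 \epsilon_{ijk} \epsilon_{lmn} (I_{il} - \lambda\delta_{il}) (I_{jm} - \lambda\delta_{jm}) (I_{kn} - \lambda\delta_{kn}) = 0 $$
and then expanding gives
    $$ P - Q\lambda + R\lambda^2 - \lambda^3 = 0 $$
where
    $$ P = \tfrac16 \epsilon_{ijk} \epsilon_{lmn} I_{il} I_{jm} I_{kn} = \det I $$
    $$ Q = \tfrac16 \epsilon_{ijk} \epsilon_{lmn} (\delta_{il} I_{jm} I_{kn} + I_{il} \delta_{jm} I_{kn} + I_{il} I_{jm} \delta_{kn}) = \tfrac16 (\delta_{jm}\delta_{kn} - \delta_{jn}\delta_{km}) \, I_{jm} I_{kn} \times 3 = \tfrac12 \left\{ (\mathrm{Tr}\, I)^2 - \mathrm{Tr}\, I^2 \right\} $$
    $$ R = \tfrac16 \epsilon_{ijk} \epsilon_{lmn} (\delta_{il} \delta_{jm} I_{kn} + \delta_{il} I_{jm} \delta_{kn} + I_{il} \delta_{jm} \delta_{kn}) = \tfrac16 \, 2 \, \delta_{kn} I_{kn} \times 3 = \mathrm{Tr}\, I $$
Since $\det A$, $\mathrm{Tr}\, A$ are invariant (the same in any basis), then the quantities $P$, $Q$, $R$
are invariants of the tensor $I$ (i.e. their values are also the same in any basis).
The three values of $\lambda$ (i.e. the solutions of the cubic equation) are the eigenvalues
of the rank-two tensor, and the vectors $\vec\omega$ are its eigenvectors.[2] We will generally
call the eigenvectors $\vec e$.

Eigenvectors and eigenvalues: If we take $I_{ij} \omega_j = \lambda \, \omega_i$, and multiply on the left
by $L$, we obtain (in matrix notation)
    $$ (L I L^T L)_{ij} \, \omega_j = \lambda \, (L \omega)_i \quad \text{with} \quad L^T L = 1 \qquad \Rightarrow \qquad I'_{ij} \, (L \omega)_j = \lambda \, (L \omega)_i $$
In the primed basis, we have by definition $I'_{ij} \omega'_j = \lambda' \omega'_i$. Comparing with the second
equation above, we see that eigenvectors $\vec\omega$ are vectors, i.e. they transform as vectors
because $\omega'_i = \ell_{ij} \omega_j$.
Similarly, eigenvalues are scalars, i.e. they transform as scalars: $\lambda' = \lambda$.
Note that only the direction (up to a $\pm$ sign) of the eigenvectors is determined by
the eigenvalue equation; the magnitude is arbitrary.
The answer to our original question is that we must find the eigenvalues $\lambda^{(i)}$, $i = 1, 2, 3$
and the corresponding eigenvectors $\vec\omega^{(i)}$, whence $\vec L^{(i)} = \lambda^{(i)} \vec\omega^{(i)}$ (no sum on $i$).
[2] Yes, the language is indeed the same as for matrices.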
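As an illustration of the invariants $P$, $Q$, $R$, the cubic can be worked through for the cube tensor $I(O) = M a^2 \left( \tfrac{11}{12} \delta_{ij} - \tfrac14 J_{ij} \right)$ found earlier. This is a sketch for orientation only: the all-ones matrix $J_{ij} = 1$ is notation introduced just for this check, and all numbers are in units of $M a^2$.

    % J has eigenvalue 3 with eigenvector (1,1,1), and eigenvalue 0 twice, so
    \lambda^{(1)} = \tfrac{11}{12} - \tfrac34 = \tfrac16 , \qquad \lambda^{(2)} = \lambda^{(3)} = \tfrac{11}{12}
    % invariants
    R = \mathrm{Tr}\, I = 3 \cdot \tfrac23 = 2 = \tfrac16 + \tfrac{11}{12} + \tfrac{11}{12}
    Q = \tfrac12 \{ (\mathrm{Tr}\, I)^2 - \mathrm{Tr}\, I^2 \} = \tfrac12 \left( 4 - \tfrac{246}{144} \right) = \tfrac{55}{48}
    P = \det I = \tfrac16 \left( \tfrac{11}{12} \right)^2 = \tfrac{121}{864}
    % check that \lambda = 1/6 solves the cubic
    P - Q\lambda + R\lambda^2 - \lambda^3 = \tfrac{121}{864} - \tfrac{165}{864} + \tfrac{48}{864} - \tfrac{4}{864} = 0

The non-degenerate eigenvalue $\tfrac16 M a^2$ belongs to the body diagonal $(1,1,1)$, and it equals the value $\tfrac16 M a^2$ found for $I_{11}(G)$, as it should since that diagonal passes through $G$.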
Eigenvalues and eigenvectors of a real symmetric tensor
Theorem
+ The eigenvalues of a real symmetric matrix are real.
+ The eigenvectors corresponding to distinct eigenvalues are orthogonal.
If a subset of the eigenvalues is degenerate (eigenvalues are equal), the corresponding eigenvectors can be chosen to be orthogonal because:
– the eigenvector subspace corresponding to the degenerate eigenvalues is
orthogonal to the other eigenvectors;
– within this subspace, the eigenvectors can be chosen to be orthogonal by
the Gram-Schmidt procedure.
Proofs will not be given here – see books or lecture notes from mathematics courses.

Diagonalisation of a real symmetric tensor
Let $T$ be a real second-rank symmetric tensor with real eigenvalues $\lambda^{(1)}, \lambda^{(2)}, \lambda^{(3)}$
and orthonormal eigenvectors $\vec\ell^{(1)}, \vec\ell^{(2)}, \vec\ell^{(3)}$, so that $T \vec\ell^{(i)} = \lambda^{(i)} \vec\ell^{(i)}$ (no summation)
and $\vec\ell^{(i)} \cdot \vec\ell^{(j)} = \delta_{ij}$. Let the matrix $L$ have elements
    $$ \ell_{ij} = \ell^{(i)}_j \equiv \begin{pmatrix} \ell^{(1)}_1 & \ell^{(1)}_2 & \ell^{(1)}_3 \\ \ell^{(2)}_1 & \ell^{(2)}_2 & \ell^{(2)}_3 \\ \ell^{(3)}_1 & \ell^{(3)}_2 & \ell^{(3)}_3 \end{pmatrix}_{ij} = \vec\ell^{(i)} \cdot \vec e_j $$
I.e. the $i$th row of $L$ is the $i$th eigenvector of $T$. $L$ is an orthogonal matrix
    $$ (L L^T)_{ij} = \ell_{im} \ell_{jm} = \ell^{(i)}_m \ell^{(j)}_m = \delta_{ij} $$
We can always choose the normalised eigenvectors $\vec\ell^{(i)}$ to form a right-handed basis:
    $$ \det L = \epsilon_{ijk} \, \ell_{1i} \ell_{2j} \ell_{3k} = \vec\ell^{(3)} \cdot \left( \vec\ell^{(1)} \times \vec\ell^{(2)} \right) = +1 $$
With this choice, $L$ is a rotation matrix which transforms $S$ to $S'$.
The tensor $T$ transforms as (summing over the indices $p$, $q$ only)
    $$ T'_{ij} = \ell_{ip} \ell_{jq} T_{pq} = \ell^{(i)}_p T_{pq} \ell^{(j)}_q = \ell^{(i)}_p \lambda^{(j)} \ell^{(j)}_p $$
or
    $$ T'_{ij} = \lambda^{(j)} \delta_{ij} = \begin{pmatrix} \lambda^{(1)} & 0 & 0 \\ 0 & \lambda^{(2)} & 0 \\ 0 & 0 & \lambda^{(3)} \end{pmatrix}_{ij} $$
Thus we have found a basis or frame of reference, $S'$, in which the tensor $T$ takes a
diagonal form; the diagonal elements are the eigenvalues of $T$.[3]
[3] Thus tensors may be diagonalised in much the same way as matrices.

Moment of inertia tensor
When studying rigid body dynamics, it's (usually) best to work in a basis in which
the moment of inertia tensor is diagonal. The eigenvectors of $I$ define the principal
axes of the tensor. In this (primed) basis
    $$ I' = \begin{pmatrix} A & 0 & 0 \\ 0 & B & 0 \\ 0 & 0 & C \end{pmatrix} $$
where the (positive) quantities $A$, $B$, $C$ are called the principal moments of inertia.
In this basis, the angular momentum and kinetic energy take the form
    $$ \vec L = A \, \omega'_1 \vec e'_1 + B \, \omega'_2 \vec e'_2 + C \, \omega'_3 \vec e'_3 $$
    $$ T = \tfrac12 \left( A \, \omega'^2_1 + B \, \omega'^2_2 + C \, \omega'^2_3 \right) $$
For a free body (i.e. no external forces), $\vec L$ and $T$ are conserved (time-independent),
but $\vec\omega$ will in general be time dependent.
A geometrical picture is provided by the inertia ellipsoid, which is defined by
    $$ I_{ij} \, \omega_i \omega_j = 1 $$
A factor of $\sqrt{2T}$ is absorbed into $\vec\omega$ by convention.
In the principal axes basis, where $\omega'_i = \ell_{i\alpha} \omega_\alpha$,
we have
    $$ A \, \omega'^2_1 + B \, \omega'^2_2 + C \, \omega'^2_3 = 1 $$
This is called the normal form. It describes
an ellipsoid because $A$, $B$, $C$ are all positive.
(This follows from the definition, for example
$A = \int \rho \, (y^2 + z^2) \, dV$.)
[Figure: the inertia ellipsoid in $\vec\omega$ space, showing the axes $\omega_1, \omega_2, \omega_3$, the principal axes $\omega'_1, \omega'_2, \omega'_3$, and the normal $\vec n$ and displacement $d\vec\omega$ at the point $P$.]
[The angular momentum $\vec L$ is labelled $\vec h$ in the
figure on the right. To be fixed . . . ]
In any basis, a small displacement $\vec\omega \to \vec\omega + d\vec\omega$ on
the ellipsoidal surface at the point $P$, with normal
$\vec n$, obeys
    $$ I_{ij} \, \omega_i \, d\omega_j = L_j \, d\omega_j = 0 $$
for all $d\vec\omega$.
Therefore $\vec L$ is orthogonal to $d\vec\omega$ and parallel to $\vec n$,
i.e. $\vec L$ is always orthogonal to the surface of the
ellipsoid at $P$.
In the principal axes basis
    $$ \vec L = A \, \omega'_1 \vec e'_1 + B \, \omega'_2 \vec e'_2 + C \, \omega'_3 \vec e'_3 $$
The directions for which $\vec L$ is parallel to $\vec\omega$ are obviously
the directions of the principal axes of the ellipsoid. For
example, if $\vec\omega = \omega'_1 \vec e'_1$ then
    $$ \vec L = A \, \omega'_1 \vec e'_1 $$
In this case, the body is rotating about a principal axis
which passes through its center of mass.
This gives a ‘geometrical’ answer to our original question.
Notes:
• If two principal moments are identical $(A, A, C)$, the ellipsoid becomes a
spheroid.
If all three principal moments are identical the ellipsoid becomes a sphere, and
$\vec L$ is always parallel to $\vec\omega$.
• The principal axes basis is used in the Lagrangian Dynamics course to study
the rotational motion of a free rigid body in the Newtonian approach to dynamics, and the motion of a symmetric spinning top with principal moments
$(A, A, C)$ in the Lagrangian approach.
• The principal axes basis/frame is ‘fixed to the body’, i.e. it moves with the
rotating body, and is therefore a non-inertial frame.

Chapter 6
Electrostatics

6.1 The Dirac delta function in three dimensions
Consider the mass of a body with density $\rho(\vec r)$. The mass
of the body is
    $$ M = \int_V \rho(\vec r) \, dV $$
How can we use this general expression for the case of a
single particle? What is the ‘density’ of a single ‘point’
particle with mass $M$ at $\vec r_0$?
[Figure: a volume $V$ containing the point $\vec r_0$, with origin $O$.]
We need a ‘function’ $\rho(\vec r)$ with the properties
    $$ \rho(\vec r) = 0 \quad \forall \, \vec r \neq \vec r_0 , \qquad M = \int_V \rho(\vec r) \, dV \quad (\vec r_0 \in V) $$
which we write, as a matter of notation, as
    $$ \rho(\vec r) = M \, \delta(\vec r - \vec r_0) $$
Generalising slightly, we define the delta function to pick out the value of the function
$f(\vec r_0)$[1] at one point $\vec r_0$ in the range of integration, so that
    $$ \int_V dV \, f(\vec r) \, \delta(\vec r - \vec r_0) = \begin{cases} f(\vec r_0) & \vec r_0 \in V \\ 0 & \text{otherwise} \end{cases} $$
[1] $f(\vec r_0) = 1$ in the example above.
Similarly, the total charge on a body with charge density (charge per unit volume)
$\rho(\vec r)$ is
    $$ Q = \int_V \rho(\vec r) \, dV $$

The one dimensional delta function
The delta function may be defined by a sequence of functions $\delta_\epsilon(x - a)$, each of
‘area’ unity, which have the desired limit when integrated over. We give a number
of examples of how this may be done.
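• Top hat
    $$ \delta_\epsilon(x - a) = \begin{cases} \dfrac{1}{2\epsilon} & a - \epsilon < x < a + \epsilon \\ 0 & \text{otherwise} \end{cases} $$
[Figure: rectangle of height $1/2\epsilon$ between $a - \epsilon$ and $a + \epsilon$.]
• Witch's hat
    $$ \delta_\epsilon(x - a) = \begin{cases} \dfrac{1}{\epsilon^2} \left[ \epsilon - |x - a| \right] & a - \epsilon < x < a + \epsilon \\ 0 & \text{otherwise} \end{cases} $$
[Figure: triangle of height $1/\epsilon$ between $a - \epsilon$ and $a + \epsilon$.]
• Gaussian
    $$ \delta_\epsilon(x - a) = \frac{1}{\epsilon\sqrt\pi} \exp\left( - \frac{(x - a)^2}{\epsilon^2} \right) $$
[Figure: Gaussian of height $1/(\epsilon\sqrt\pi)$ centred on $a$.]
In each case
    $$ \int_{-\infty}^{+\infty} dx \, f(x) \, \delta_\epsilon(x - a) = \int_{-\infty}^{+\infty} dx \, f(x + a) \, \delta_\epsilon(x) $$
    $$ = \int_{-\infty}^{+\infty} dx \, \left\{ f(a) + x f'(a) + \frac{x^2}{2} f''(a) + \ldots \right\} \delta_\epsilon(x) $$
where we shifted the integration variable in the first line, and Taylor-expanded the
integrand in the second. The function $f(x)$ is a ‘good’ test function, i.e. one for
which the integral is convergent for all $\epsilon$.
For the top hat, we need to evaluate
    $$ \int_{-\infty}^{+\infty} dx \, x^n \, \delta_\epsilon(x) = \frac{1}{2\epsilon} \int_{-\epsilon}^{+\epsilon} dx \, x^n = \begin{cases} \dfrac{\epsilon^n}{n+1} & n = 0, 2, 4, \ldots \\ 0 & n = 1, 3, 5, \ldots \end{cases} \quad \xrightarrow{\epsilon \to 0} \quad \begin{cases} 1 & n = 0 \\ 0 & \text{otherwise} \end{cases} $$
Hence
    $$ \int_{-\infty}^{+\infty} dx \, f(x) \, \delta_\epsilon(x - a) \; \xrightarrow{\epsilon \to 0} \; f(a) \qquad \text{i.e.} \quad \delta_\epsilon(x - a) \to \delta(x - a) $$
Similarly for the other representations. The Gaussian representation is the cleanest,
because it's a smooth function.
Notes:
(i) The Dirac delta ‘function’ isn't a function, it's a distribution or generalised
function.
(ii) Colloquially, it's an infinitely-tall infinitely-thin spike of unit area.
(iii) The delta function is the continuous-variable analogue of the Kronecker delta
symbol. If we let $i \to x$
    $$ u_i \, \delta_{ij} = u_j \quad \to \quad \int dx \, x \, \delta(x - x_0) = x_0 $$
(iv) An important identity is
    $$ \int_{-\infty}^{+\infty} dx \, f(x) \, \delta(g(x)) = \sum_i \frac{f(x_i)}{|g'(x_i)|} $$
where $g(x_i) = 0$, i.e. $x_i$ are the simple zeroes of $g(x)$ [tutorial].

The three dimensional delta function
In Cartesian coordinates $(x, y, z)$,
    $$ \delta^{(3)}(\vec r - \vec r_0) \equiv \delta(\vec r - \vec r_0) = \delta(x - x_0) \, \delta(y - y_0) \, \delta(z - z_0) $$
In orthogonal curvilinear co-ordinates $(u_1, u_2, u_3)$,
    $$ \delta(\vec r - \vec a) = \frac{1}{h_1 h_2 h_3} \, \delta(u_1 - a_1) \, \delta(u_2 - a_2) \, \delta(u_3 - a_3) $$
where $h_1$, $h_2$, $h_3$ are the usual scale factors [tutorial].
(In the last equation, we set $\vec r_0 = \vec a$ to avoid double subscripts on the RHS.)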
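A standard illustration of the identity in note (iv) may be useful. The choice $g(x) = x^2 - a^2$ with $a \neq 0$ is an assumption made for this sketch; it is not taken from the notes.

    % simple zeroes of g(x) = x^2 - a^2 are x = +a and x = -a, with g'(x) = 2x
    |g'(\pm a)| = 2|a|
    \int_{-\infty}^{+\infty} dx \, f(x) \, \delta(x^2 - a^2) = \frac{f(a) + f(-a)}{2|a|}
    % equivalently
    \delta(x^2 - a^2) = \frac{1}{2|a|} \left[ \delta(x - a) + \delta(x + a) \right]
    % example: f(x) = x^2 gives (a^2 + a^2) / (2|a|) = |a|

The identity requires simple zeroes, so the case $a = 0$ (a double zero of $g$) is excluded.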
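6.2 Coulomb's law
Experimentally, the force between two point charges
$q$ and $q_1$ at positions $\vec r$ and $\vec r_1$, respectively, is given
by Coulomb's law
    $$ \vec F_1 = \frac{1}{4\pi\epsilon_0} \, \frac{q \, q_1 \, (\vec r - \vec r_1)}{|\vec r - \vec r_1|^3} $$
$\vec F_1$ is the force on the charge $q$ at $\vec r$, produced by the
charge $q_1$ at $\vec r_1$.
[Figure: the charge $q_1$ at $\vec r_1$ and the point $P$ at $\vec r$, relative to the origin $O$.]
Charges can be positive or negative. For $q q_1 > 0$ we have repulsion, and for $q q_1 < 0$
we have attraction: like charges repel and opposite charges attract.
In SI units, charge is measured in Coulombs and $\epsilon_0$ is defined to be $\epsilon_0 = 10^7 / (4\pi c^2)$ C$^2$ N$^{-1}$ m$^{-2}$.
Aside: Similarly for Newton's law of gravitation,
    $$ \vec F_1 = -G \, \frac{m \, m_1 \, (\vec r - \vec r_1)}{|\vec r - \vec r_1|^3} $$
which is always attractive (hence the negative sign, so that $G$, $m$, $m_1$ are all positive).
In SI units: $G = 6.672 \times 10^{-11}$ N m$^2$ kg$^{-2}$.

6.3 The electric field
The electric field $\vec E$ is ‘produced’ by a charge configuration, and is defined in terms
of the force on a small positive test charge,
    $$ \vec E(\vec r) = \lim_{q \to 0} \frac{1}{q} \vec F $$
Clearly, $\vec E$ is a vector field.
Thus for our two charges $q$ and $q_1$ we have
    $$ \vec F_1 = q \, \vec E(\vec r) $$
i.e. particle 1 ‘produces’ an electrostatic field $\vec E(\vec r)$.
The particle at $P$ ‘feels’ the electrostatic field as a force $q \, \vec E(\vec r)$ with
    $$ \vec E(\vec r) = \frac{1}{4\pi\epsilon_0} \, \frac{q_1 \, (\vec r - \vec r_1)}{|\vec r - \vec r_1|^3} \qquad (6.1) $$

6.3.1 Field lines
Field lines are the ‘lines of force’ on the test charge.
Newton's equations imply that the motion of a (test) particle is unique, which implies that the field lines do not cross,
and thus that they are well-defined and can be measured.
The diagram shows the field lines produced by a negative charge.
[Figure: field lines pointing radially inwards towards a negative point charge.]

6.3.2 The principle of superposition
Consider a set of charges $q_i$ situated at $\vec r_i$.
The principle of superposition states that the total
electric field at $\vec r$ is the vector sum of the fields due
to the individual charges at $\vec r_i$
    $$ \vec E(\vec r) = \frac{1}{4\pi\epsilon_0} \sum_i q_i \, \frac{\vec r - \vec r_i}{|\vec r - \vec r_i|^3} $$
In the limit of (infinitely) many charges, we introduce a continuous charge density (charge/volume)
$\rho(\vec r')$, so that the charge in $dV'$ at position $\vec r'$ is
$\rho(\vec r') \, dV'$. The electric field is then
    $$ \vec E(\vec r) = \frac{1}{4\pi\epsilon_0} \int_V dV' \, \rho(\vec r') \, \frac{\vec r - \vec r'}{|\vec r - \vec r'|^3} $$
[Figure: the volume $V$ with source point $\vec r'$, field point $P$ at $\vec r$, and separation $\vec r - \vec r'$.]
To return to our original example of a single charge $q_1$ at position $\vec r_1$, we simply set
the charge density $\rho(\vec r') = q_1 \, \delta(\vec r' - \vec r_1)$, which recovers the result in equation (6.1).

6.4 The electrostatic potential for a point charge
Since
    $$ \frac{\vec r - \vec r_1}{|\vec r - \vec r_1|^3} = -\nabla \left( \frac{1}{|\vec r - \vec r_1|} \right) \qquad (6.2) $$
where $\nabla$ operates on $\vec r$ (not $\vec r_1$), then for a point charge $q_1$ at $\vec r_1$
    $$ \vec E(\vec r) = \frac{q_1}{4\pi\epsilon_0} \, \frac{\vec r - \vec r_1}{|\vec r - \vec r_1|^3} = -\frac{q_1}{4\pi\epsilon_0} \, \nabla \left( \frac{1}{|\vec r - \vec r_1|} \right) $$
i.e. we may write
    $$ \vec E(\vec r) = -\nabla \phi(\vec r) \qquad \text{with} \qquad \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \, \frac{q_1}{|\vec r - \vec r_1|} \qquad (6.3) $$
$\phi(\vec r)$ is the electrostatic potential for the electric field $\vec E(\vec r)$.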
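6.5 The static Maxwell equations

6.5.1 The curl equation
For a continuous charge distribution, we again use equation (6.2) to write the electric field as
    $$ \vec E(\vec r) = -\frac{1}{4\pi\epsilon_0} \int_V dV' \, \rho(\vec r') \, \nabla \left( \frac{1}{|\vec r - \vec r'|} \right) = -\nabla \left( \frac{1}{4\pi\epsilon_0} \int_V dV' \, \frac{\rho(\vec r')}{|\vec r - \vec r'|} \right) \qquad (6.4) $$
So
    $$ \nabla \times \vec E = -\nabla \times \nabla \left( \frac{1}{4\pi\epsilon_0} \int_V dV' \, \frac{\rho(\vec r')}{|\vec r - \vec r'|} \right) $$
But the curl of the gradient of a scalar field is always zero, which implies
    $$ \nabla \times \vec E = 0 $$
for all static electric fields. This is the second (static) Maxwell equation.

6.5.2 Conservative fields and potential theory
A vector field that satisfies $\nabla \times \vec E = 0$ is said to be conservative.
Consider the integral of $\nabla \times \vec E$ over an open surface $S$
bounded by the closed curve $C_1 - C_2$. Using Stokes'
theorem
    $$ 0 = \int_S \nabla \times \vec E \cdot d\vec S = \oint_{C_1 - C_2} \vec E \cdot d\vec r $$
[Figure: two paths $C_1$ and $C_2$ from the point $A$ at $\vec a$ to the point $B$ at $\vec b$, bounding the surface $S$.]
Therefore
    $$ \int_{C_1} \vec E \cdot d\vec r = \int_{C_2} \vec E \cdot d\vec r $$
Since the line integral is independent of the path from $\vec a$ to $\vec b$, it can only depend on
the end points. So, for some scalar field $\phi$, we must have
    $$ -\int_{\vec a}^{\vec b} \vec E \cdot d\vec r = \phi(\vec b) - \phi(\vec a) $$
Now let $\vec a = \vec r$ and $\vec b = \vec r + \delta\vec r$, where $\delta\vec r$ is small, so we can approximate the integral
    $$ -\vec E(\vec r) \cdot \delta\vec r + \ldots = \phi(\vec r + \delta\vec r) - \phi(\vec r) = \nabla\phi \cdot \delta\vec r + \ldots $$
where we used the definition of the gradient in the last step. Therefore
    $$ \vec E(\vec r) = -\nabla \phi(\vec r) $$
$\phi(\vec r)$ is called the potential for the vector field $\vec E(\vec r)$.
An explicit expression for $\phi(\vec r)$ can be obtained from (6.4). We have $\vec E = -\nabla\phi$ with
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \int_V dV' \, \frac{\rho(\vec r')}{|\vec r - \vec r'|} $$
This is linear superposition for potentials.
As in the case of the electric field, if we set
$\rho(\vec r') = q_1 \, \delta(\vec r' - \vec r_1)$, we recover the potential
for a single charge (equation (6.3))
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \, \frac{q_1}{|\vec r - \vec r_1|} $$
Notes:
• For a surface charge distribution, with charge/unit-area $\sigma(\vec r)$, the electric field and potential
produced are
    $$ \vec E(\vec r) = \frac{1}{4\pi\epsilon_0} \int_S dS' \, \sigma(\vec r') \, \frac{\vec r - \vec r'}{|\vec r - \vec r'|^3} \qquad \text{and} \qquad \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \int_S dS' \, \frac{\sigma(\vec r')}{|\vec r - \vec r'|} $$
where $dS'$ is the infinitesimal (scalar) element of area on the surface $S$.
• For a line distribution of charge, with charge/unit-length $\lambda(\vec r)$
    $$ \vec E(\vec r) = \frac{1}{4\pi\epsilon_0} \int_C dl' \, \lambda(\vec r') \, \frac{\vec r - \vec r'}{|\vec r - \vec r'|^3} \qquad \text{and} \qquad \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \int_C dl' \, \frac{\lambda(\vec r')}{|\vec r - \vec r'|} $$
where $dl'$ is the infinitesimal element of length along the line (or curve) $C$.
• In SI units, the potential is measured in Volts V. In terms of other units
V = C/(C$^2$ N$^{-1}$ m$^{-1}$) = N m C$^{-1}$ = J C$^{-1}$.
• Field lines are perpendicular to surfaces of constant potential $\phi$, called equipotentials or equipotential surfaces.
Let $d\vec r$ be a small displacement of the position
vector $\vec r$ of a point in the equipotential surface
$\phi$ = constant.
Therefore
    $$ 0 = d\phi = \nabla\phi \cdot d\vec r $$
so $\vec E = -\nabla\phi$ is perpendicular to $d\vec r$.
Thus electric field lines are everywhere perpendicular to the surfaces $\phi$ = constant.
[Figure: equipotential surfaces $\phi$ = const, with a displacement $d\vec r$ within the surface and the normal $\vec n$.]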
• The potential $\phi$ is only defined up to an overall constant. If we let $\phi \to \phi + c$,
the electric field $\vec E = -\nabla\phi$ (and hence the force) is unchanged. So only
potential differences have physical significance.
In most physical situations, $\phi \to$ constant as $r \to \infty$, and we usually choose
the constant to be zero.
• So far we've defined the potential in purely mathematical terms.
Physically, the potential difference, $V_{AB}$, between two points $A$ and $B$ is defined
as the energy per unit charge required to move a small test charge $q$ from $A$
to $B$:
    $$ V_{AB} \equiv \lim_{q \to 0} \frac{1}{q} W_{AB} = -\frac{1}{q} \int_C \vec F \cdot d\vec r = -\int_C \vec E \cdot d\vec r = \int_C \nabla\phi \cdot d\vec r = \int_A^B d\phi = \phi_B - \phi_A $$
The minus sign is because this is the work done against the force $\vec F$. Since the
field is conservative, the integral is independent of the path – it depends only
on the end points.
• The potential energy of a charge $q$ at position $\vec r$ is given by $q \, \phi(\vec r)$.
We may generalise this to a charge distribution in an external electric field
$\vec E_{ext}(\vec r) = -\nabla\phi_{ext}$. In this case, the (interaction) energy is
    $$ W = \int_V dV \, \rho(\vec r) \, \phi_{ext}(\vec r) $$
Note that this does not include the self-energy of the charge distribution.
To emphasize this we write $\phi_{ext}$. [More on this later]

6.5.3 The divergence equation
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \int_V dV' \, \frac{\rho(\vec r')}{|\vec r - \vec r'|} $$
Since $\vec E = -\nabla\phi$, we have $\nabla \cdot \vec E = -\nabla^2 \phi$, and hence
    $$ \nabla \cdot \vec E(\vec r) = -\frac{1}{4\pi\epsilon_0} \int_V dV' \, \rho(\vec r') \, \nabla^2 \left( \frac{1}{|\vec r - \vec r'|} \right) \qquad (6.5) $$
Note that $\nabla$ acts only on $\vec r$ (not on $\vec r'$), so we can take it inside the integral over $\vec r'$.
Theorem
    $$ \nabla^2 \left( \frac{1}{r} \right) = -4\pi \, \delta(\vec r) \qquad \forall \, \vec r $$
Proof: We first prove it for $r \neq 0$
    $$ \nabla^2 \left( \frac{1}{r} \right) = -\partial_i \left( \frac{x_i}{r^3} \right) = -\left\{ \frac{3}{r^3} - \frac32 \, x_i \, r^{-5} \, 2 x_i \right\} = 0 \qquad (r \neq 0) $$
To prove the result for $r = 0$, we integrate $\nabla^2 (1/r)$ over an arbitrary volume $V$
containing the origin $r = 0$.
    $$ \int_V \nabla^2 \left( \frac{1}{r} \right) dV = \int_{V_\varepsilon} \nabla^2 \left( \frac{1}{r} \right) dV = -\int_{V_\varepsilon} \nabla \cdot \left( \frac{\vec r}{r^3} \right) dV $$
    $$ = -\int_{S_\varepsilon} \frac{\vec r}{r^3} \cdot d\vec S = -\frac{\varepsilon}{\varepsilon^3} \, 4\pi\varepsilon^2 = -4\pi $$
In the first line, we used our previous result that $\nabla^2 (1/r) = 0$ everywhere away
from the origin to write the original integral as an integral over a sphere of radius $\varepsilon$
centred on the origin, with volume $V_\varepsilon$ and area $S_\varepsilon$ respectively.
We then used the divergence theorem to obtain the first result on the second line.
On the surface $S_\varepsilon$, we have $\vec r = \varepsilon \, \vec e_r$ and $d\vec S = \vec e_r \, dS$, where $\vec e_r$ is a unit vector in
the direction of $\vec r$, so the integral over the surface of the sphere is straightforward –
check it!
[Alternatively, we may write the surface integral as an integral over solid angle
$\int_S \vec r \cdot d\vec S / r^3 = \int_S d\Omega = 4\pi$.]
We can now take the limit $\varepsilon \to 0$, which simply shrinks the sphere down to the
origin, leaving the integral unchanged.
Since our result for the integral holds for an arbitrary volume $V$ centred on the
origin, and $\int_V \delta(\vec r) \, dV = 1$, we deduce that $\nabla^2 (1/r) = -4\pi \, \delta(\vec r)$. Similarly
    $$ \nabla^2 \left( \frac{1}{|\vec r - \vec r'|} \right) = -4\pi \, \delta(\vec r - \vec r') $$
Substituting this result into equation (6.5) gives
    $$ \nabla \cdot \vec E(\vec r) = -\frac{1}{4\pi\epsilon_0} \int_V dV' \, \rho(\vec r') \left\{ -4\pi \, \delta(\vec r - \vec r') \right\} $$
Using the delta function to perform the integral on the right hand side, we get
    $$ \nabla \cdot \vec E(\vec r) = \frac{\rho(\vec r)}{\epsilon_0} $$
We now have the two electrostatic Maxwell equations
    $$ \nabla \cdot \vec E = \frac{\rho}{\epsilon_0} \qquad\qquad \nabla \times \vec E = 0 $$
In terms of the potential
    $$ \vec E = -\nabla\phi \qquad\qquad \nabla^2 \phi = -\frac{\rho}{\epsilon_0} $$
The second equation is called Poisson's equation.
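6.6 Electric dipole
Physically, an electric dipole consists of two
nearby equal and opposite (point) charges, with
charge $-q$ situated at $\vec r_0$ and charge $+q$ at $\vec r_0 + \vec d$.
[Figure: charges $-q$ at $\vec r_0$ and $+q$ at $\vec r_0 + \vec d$, separated by $\vec d$, with the field point $P$ at $\vec r$.]
Define the dipole moment $\vec p = q \vec d$.
It will turn out to be useful to consider the dipole
limit, in which
    $$ \vec p = \lim_{q \to \infty, \; d \to 0} q \vec d $$
with $\vec p$ finite (and constant). This is sometimes
called a point dipole or an ideal dipole.

6.6.1 Potential and electric field due to a dipole
Dipole potential: The electrostatic potential $\phi(\vec r)$ produced
by the dipole is
    $$ \phi(\vec r) = \frac{q}{4\pi\epsilon_0} \left[ \frac{1}{|\vec r - \vec r_0 - \vec d|} - \frac{1}{|\vec r - \vec r_0|} \right] $$
    $$ = \frac{q}{4\pi\epsilon_0} \left[ \frac{1}{|\vec r - \vec r_0|} + \frac{\vec d \cdot (\vec r - \vec r_0)}{|\vec r - \vec r_0|^3} + O(d^2) - \frac{1}{|\vec r - \vec r_0|} \right] $$
where we Taylor (or binomial) expanded the first term about $\vec r - \vec r_0$ [tutorial].
In the dipole limit, the terms of $O(q d^2)$ vanish, and the potential is simply
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \, \frac{\vec p \cdot (\vec r - \vec r_0)}{|\vec r - \vec r_0|^3} $$
For a dipole at the origin we have
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \, \frac{\vec p \cdot \vec r}{r^3} $$
Note that $\phi(\vec r)$ falls off as $1/r^2$.
Electric field: The $i$th component of the electric field due to a dipole of moment
$\vec p$ situated at the origin is
    $$ E_i(\vec r) = -\partial_i \phi = -\frac{1}{4\pi\epsilon_0} \, \partial_i \left( \frac{p_j x_j}{r^3} \right) = -\frac{1}{4\pi\epsilon_0} \, p_j \left\{ \frac{\delta_{ij}}{r^3} + x_j \left( -\frac32 \right) r^{-5} \, 2 x_i \right\} $$
Therefore
    $$ \vec E(\vec r) = \frac{1}{4\pi\epsilon_0} \left[ \frac{3 \, \vec p \cdot \vec r}{r^5} \, \vec r - \frac{\vec p}{r^3} \right] $$
which falls off as $1/r^3$.

Spherical polar coordinates
Consider spherical polar coordinates $(r, \theta, \chi)$, with
the $z$-axis chosen parallel to the dipole moment,
i.e. $\vec p = p \, \vec e_z$.
[We use $\chi$ instead of $\phi$ for the azimuthal angle in
order to avoid confusion with the potential $\phi$]
[Figure: unit vectors $\vec e_r$, $\vec e_\theta$ and $\vec e_z$ at polar angle $\theta$.]
Then
    $$ \phi(\vec r) = \frac{p}{4\pi\epsilon_0} \, \frac{\cos\theta}{r^2} \qquad\qquad \vec E(\vec r) = \frac{p}{4\pi\epsilon_0} \, \frac{1}{r^3} \left( 3 \cos\theta \, \vec e_r - \vec e_z \right) $$
We can also obtain this result using the expression for $\nabla\phi$ in polar co-ordinates
    $$ \vec E = -\nabla\phi = -\left( \frac{\partial\phi}{\partial r} \, \vec e_r + \frac{1}{r} \frac{\partial\phi}{\partial\theta} \, \vec e_\theta + \frac{1}{r \sin\theta} \frac{\partial\phi}{\partial\chi} \, \vec e_\chi \right) $$
    $$ = -\frac{p}{4\pi\epsilon_0} \left( -\frac{2}{r^3} \cos\theta \, \vec e_r - \frac{\sin\theta}{r^3} \, \vec e_\theta \right) $$
The second form can be obtained from the first by
substituting $\vec e_z = \vec e_r \cos\theta - \vec e_\theta \sin\theta$ [exercise: show
this] into the first.
The sketch shows the electric field (full lines) and
the potential (dashed lines) for the dipole.
[Figure: dipole field lines $\vec E$ (full) and equipotentials $\phi$ = const (dashed).]
This picture holds in the dipole limit, but it's also
valid when $r \gg d$, the ‘far zone’.

6.6.2 Force, torque and energy
Force on a dipole
The force on a dipole at position $\vec r$ due to an external
field $\vec E_{ext}$ is
    $$ \vec F(\vec r) = -q \, \vec E_{ext}(\vec r) + q \, \vec E_{ext}(\vec r + \vec d) $$
    $$ = -q \, \vec E_{ext}(\vec r) + q \left\{ \vec E_{ext}(\vec r) + (\vec d \cdot \nabla) \vec E_{ext}(\vec r) + \cdots \right\} $$
[Figure: charges $-q$ at $\vec r$ and $+q$ at $\vec r + \vec d$.]
In the point dipole limit
    $$ \vec F(\vec r) = (\vec p \cdot \nabla) \, \vec E_{ext}(\vec r) $$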
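Two special directions give a quick consistency check between the Cartesian and spherical-polar forms of the dipole field in section 6.6.1. This is an illustrative sketch that assumes a point dipole at the origin with $\vec p = p \, \vec e_z$.

    % on the axis, \theta = 0, where \vec e_r = \vec e_z:
    \vec E = \frac{p}{4\pi\epsilon_0 r^3} (3 - 1) \, \vec e_z = \frac{2p}{4\pi\epsilon_0 r^3} \, \vec e_z
    % in the equatorial plane, \theta = \pi/2, where \cos\theta = 0:
    \vec E = -\frac{p}{4\pi\epsilon_0 r^3} \, \vec e_z
    % the Cartesian form gives the same:
    % \vec r parallel to \vec p:       3 (\vec p \cdot \vec r) \vec r / r^5 - \vec p / r^3 = 2 \vec p / r^3
    % \vec r perpendicular to \vec p:  \vec p \cdot \vec r = 0, leaving -\vec p / r^3

At a given distance the field on the axis is therefore twice as strong as in the equatorial plane, and points in the opposite direction.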
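Torque on a dipole
The torque (or couple or moment) on a dipole about the point $\vec r$ where the dipole is
located due to the external electric field is
    $$ \vec G(\vec r) = -q \, \vec 0 \times \vec E_{ext}(\vec r) + q \, (\vec 0 + \vec d) \times \vec E_{ext}(\vec r + \vec d) $$
    $$ = q \, \vec d \times \left\{ \vec E_{ext}(\vec r) + (\vec d \cdot \nabla) \vec E_{ext}(\vec r) + \cdots \right\} $$
Taking the dipole limit (i.e. ignoring terms of order $O(q d^2)$), we find
    $$ \vec G(\vec r) = \vec p \times \vec E_{ext}(\vec r) $$

Energy of a dipole
The energy of a dipole in an external electric field $\vec E_{ext}$ is
    $$ W = -q \, \phi_{ext}(\vec r) + q \, \phi_{ext}(\vec r + \vec d) $$
    $$ = -q \, \phi_{ext}(\vec r) + q \left\{ \phi_{ext}(\vec r) + \vec d \cdot \nabla \phi_{ext}(\vec r) + \cdots \right\} $$
In the dipole limit, using $\vec E_{ext} = -\nabla\phi_{ext}$, we find
    $$ W = -\vec p \cdot \vec E_{ext} $$
Is this consistent with our previous expression for the force on the dipole, i.e. $\vec F(\vec r) = (\vec p \cdot \nabla) \vec E_{ext}(\vec r)$?
Recall the following identity for vector fields $\vec a$ and $\vec b$
    $$ \nabla (\vec a \cdot \vec b) = (\vec a \cdot \nabla) \vec b + (\vec b \cdot \nabla) \vec a + \vec a \times (\nabla \times \vec b) + \vec b \times (\nabla \times \vec a) $$
If we set $\vec a = \vec p$ = constant, and $\vec b = \vec E_{ext}$ then
    $$ \nabla (\vec p \cdot \vec E_{ext}) = (\vec p \cdot \nabla) \vec E_{ext} + \vec p \times (\nabla \times \vec E_{ext}) $$
Now $\nabla \times \vec E_{ext} = 0$, so $\vec p \times (\nabla \times \vec E_{ext}) = 0$, and hence
    $$ \vec F(\vec r) = (\vec p \cdot \nabla) \vec E_{ext} = \nabla (\vec p \cdot \vec E_{ext}) = -\nabla W $$
The force on the dipole is minus the gradient of the potential energy, as expected.

Examples
• For the case of a homogeneous (i.e. constant, independent of $\vec r$) external field,
$\vec E_{ext}(\vec r) = \vec E_0$, we have
    $$ \vec F = 0 \qquad \text{and} \qquad \vec G(\vec r) = \vec p \times \vec E_0 $$
So a stable or equilibrium position (i.e. position of minimum energy) occurs
when $\vec p$ is parallel to $\vec E_0$.
• If the dipole at $\vec r$ has dipole moment $\vec p_1$, and the electric field $\vec E_{ext}(\vec r)$ is due
to a second dipole of moment $\vec p_2$ at the origin, then
    $$ W = -\vec p_1 \cdot \vec E_{ext} \qquad \text{with} \qquad \vec E_{ext} = \frac{1}{4\pi\epsilon_0} \left[ \frac{3 \, \vec p_2 \cdot \vec r}{r^5} \, \vec r - \frac{\vec p_2}{r^3} \right] $$
Therefore
    $$ W = \frac{1}{4\pi\epsilon_0} \left[ \frac{\vec p_1 \cdot \vec p_2}{r^3} - \frac{3 (\vec r \cdot \vec p_1)(\vec r \cdot \vec p_2)}{r^5} \right] $$
The interaction energy is not only dependent on the distance between the dipoles,
but also on their relative orientations.
[Figure: dipole $\vec p_2$ at the origin $O$ and dipole $\vec p_1$ at position $\vec r$.]

6.7 The multipole expansion
Consider the case of a charge distribution, $\rho(\vec r)$, localised in a volume $V$. For convenience we will take the origin inside $V$.
[Figure: the volume $V$ containing the origin $O$ and source point $\vec r'$, with the field point $P$ at $\vec r$.]
The potential at the point $P$ is
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \int_V dV' \, \frac{\rho(\vec r')}{|\vec r - \vec r'|} $$
For $|\vec r|$ much larger than the extent of $V$, i.e. $|\vec r| \gg |\vec r'|$ for all $\vec r'$ such that $\rho(\vec r') \neq 0$,
we can expand the denominator using the binomial theorem
    $$ (1 + x)^n = 1 + n x + \frac{n(n-1)}{2} \, x^2 + O(x^3) $$
    $$ \Rightarrow \quad |\vec r - \vec r'|^{-1} = \left\{ r^2 - 2 \, \vec r \cdot \vec r' + r'^2 \right\}^{-1/2} = r^{-1} \left\{ 1 - 2 \, \frac{\vec r \cdot \vec r'}{r^2} + \frac{r'^2}{r^2} \right\}^{-1/2} $$
    $$ = \frac{1}{r} \left\{ 1 + \frac{\vec r \cdot \vec r'}{r^2} - \frac12 \frac{r'^2}{r^2} + \frac38 \left( \frac{-2 \, \vec r \cdot \vec r'}{r^2} \right)^2 + O\left( \left( \frac{r'}{r} \right)^3 \right) \right\} $$
This can also be obtained by Taylor expansion [exercise]. Then
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \int_V dV' \, \rho(\vec r') \left[ \frac{1}{r} + \frac{\vec r' \cdot \vec r}{r^3} + \frac{3 (\vec r' \cdot \vec r)^2 - r^2 r'^2}{2 r^5} + \ldots \right] $$
This gives the multipole expansion for the potential
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \frac{Q}{r} + \frac{1}{4\pi\epsilon_0} \frac{\vec p \cdot \vec r}{r^3} + \frac{1}{4\pi\epsilon_0} \frac{Q_{ij} \, x_i x_j}{2 r^5} + \ldots $$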
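where
    $$ Q = \int_V dV' \, \rho(\vec r') \qquad \text{is the total charge within } V $$
    $$ \vec p = \int_V dV' \, \vec r' \, \rho(\vec r') \qquad \text{is the dipole moment about the origin} $$
    $$ Q_{ij} = \int_V dV' \, \left( 3 x'_i x'_j - r'^2 \delta_{ij} \right) \rho(\vec r') \qquad \text{is the quadrupole tensor} $$
The multipole expansion is valid in the far zone, i.e. when $r \gg r_0$, with $r_0$ the size
of the charge distribution.
• If $Q \neq 0$, the monopole term dominates
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \frac{Q}{r} $$
and in the far zone, $r \gg r_0$, the $\vec E$ field is that of a point charge at the origin.
• When the total charge $Q = 0$ the dipole term dominates
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \frac{\vec p \cdot \vec r}{r^3} $$
If the charge density is given by two equal but oppositely-charged particles
close together, i.e. $\rho(\vec r') = q \left\{ \delta(\vec r' - \vec d) - \delta(\vec r') \right\}$, then
    $$ \vec p = \int dV' \, \vec r' \, q \left\{ \delta(\vec r' - \vec d) - \delta(\vec r') \right\} = q \vec d $$
which is the dipole moment as defined previously, and hence justifies the name.
• If $Q = 0$ and $\vec p = 0$, the quadrupole term dominates
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \frac{Q_{ij} \, x_i x_j}{2 r^5} $$
The quadrupole tensor $Q_{ij}$ is symmetric, $Q_{ij} = Q_{ji}$, and traceless, $Q_{ii} = 0$.
A simple linear quadrupole is defined by placing two dipoles (so four charges) ‘back to back’
with equal and opposite dipole moments, as
shown.
[Figure: charges $q$, $-2q$, $q$ in a line with spacing $d$, centred on $\vec r'$, with the field point at $\vec r$.]
From the figure
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \left\{ \frac{q}{|\vec r - \vec r' - \vec d|} + \frac{q}{|\vec r - \vec r' + \vec d|} + \frac{-2q}{|\vec r - \vec r'|} \right\} $$
Expanding the denominators in the usual way, and defining $\vec\rho = \vec r - \vec r'$, the leading
$(1/\rho)$ and dipole terms cancel, so for large $\rho$
    $$ \phi(\vec r) = \frac{q}{4\pi\epsilon_0} \, \frac{3 (\vec\rho \cdot \vec d)^2 - d^2 \rho^2}{\rho^5} = \frac{1}{4\pi\epsilon_0} \, \frac{Q_{ij} \, \rho_i \rho_j}{\rho^5} $$
where $Q_{ij} = q \, (3 d_i d_j - \delta_{ij} d^2)$ is the (traceless, symmetric) quadrupole tensor – as
above.
The quadrupole moment is sometimes defined to be $Q = 2 q d^2$, where the ‘2’ is
conventional.

6.7.1 Worked example
The region inside the sphere $r < a$ contains a charge density
    $$ \rho(x, y, z) = f \, z \, (a^2 - r^2) $$
where $f$ is a constant. Show that at large distances from the origin the potential
due to the charge distribution is given approximately by
    $$ \phi(\vec r) = \frac{2 f a^7}{105 \, \epsilon_0} \, \frac{z}{r^3} $$
The multipole expansion gives
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \left[ \frac{Q}{r} + \frac{\vec P \cdot \vec r}{r^3} + O\left( \frac{1}{r^3} \right) \right] $$
In spherical polars $(r, \theta, \chi)$,
    $$ x = r \sin\theta \cos\chi , \qquad y = r \sin\theta \sin\chi , \qquad z = r \cos\theta $$
The total charge $Q$ is (we drop the primes in this calculation for brevity):
    $$ Q = \int_V \rho(\vec r) \, dV = \int_0^{2\pi} \int_0^\pi \int_0^a f \, r \cos\theta \, (a^2 - r^2) \, r^2 \sin\theta \, dr \, d\theta \, d\chi = 0 $$
This integral vanishes because
    $$ \int_0^\pi \cos\theta \sin\theta \, d\theta = \int_0^\pi \tfrac12 \sin(2\theta) \, d\theta = 0 $$
The total dipole moment $\vec P$ about the origin is
    $$ \vec P = \int_V \vec r \, \rho(\vec r) \, dV = \int_V r \, \vec e_r \, \rho(\vec r) \, dV $$
    $$ = \int_0^{2\pi} \int_0^\pi \int_0^a r \, (\sin\theta \cos\chi \, \vec e_1 + \sin\theta \sin\chi \, \vec e_2 + \cos\theta \, \vec e_3) \; f \, r \cos\theta \, (a^2 - r^2) \, r^2 \sin\theta \, dr \, d\theta \, d\chi $$
The $x$ and $y$ components of the $\chi$ integral vanish. The $z$ component factorises:
    $$ P_z = f \int_0^{2\pi} d\chi \int_0^\pi \sin\theta \cos^2\theta \, d\theta \int_0^a r^4 (a^2 - r^2) \, dr = f \, 2\pi \, \frac23 \, \frac{2 a^7}{35} $$
Putting it all together, we obtain
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \, \frac{8\pi a^7 f}{105} \, \frac{\vec e_3 \cdot \vec r}{r^3} = \frac{2 f a^7}{105 \, \epsilon_0} \, \frac{z}{r^3} $$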
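6.7.2 Interaction energy of a charge distribution
Let's consider the interaction energy $W$ of an arbitrary (but bounded) charge distribution in an external electric field $\vec E_{ext} = -\nabla\phi_{ext}$,
    $$ W = \int_V dV \, \rho(\vec r) \, \phi_{ext}(\vec r) $$
Taylor expanding the external potential about the origin gives
    $$ \phi_{ext}(\vec r) = \phi_{ext}(0) + \vec r \cdot \nabla\phi_{ext}(0) + \frac{1}{2!} (\vec r \cdot \nabla)^2 \phi_{ext}(0) + \ldots $$
    $$ = \phi_{ext}(0) - \vec r \cdot \vec E_{ext}(0) - \frac12 x_i x_j \, \partial_j E_{ext \, i}(0) + \ldots $$
The last term may be re-written as $-\tfrac16 (3 x_i x_j - r^2 \delta_{ij}) \, \partial_j E_{ext \, i}(0)$ – because the external field satisfies $\nabla \cdot \vec E_{ext} = 0$. Therefore
    $$ W = Q \, \phi_{ext}(0) - \vec p \cdot \vec E_{ext}(0) - \tfrac16 Q_{ij} \, \partial_j E_{ext \, i}(0) + \ldots $$
The physical picture is that the (total) charge interacts with the external potential
$\phi_{ext}$, the dipole moment with the external field $\vec E_{ext}$, and the quadrupole moment
with the (spatial) derivative of the external field.

6.7.3 A brute-force calculation - the circular disc
The electric field $\vec E$ and potential $\phi$ can be evaluated exactly for a number of interesting symmetric charge distributions. We give one example before moving on to
more powerful techniques.
A circular disc of radius $a$ carries uniform surface
charge density $\sigma$. Find the electric field and the
potential due to the disc on the axis of symmetry.
[Figure: the disc of radius $a$ in the $x$-$y$ plane, with source point $\vec r'$ at radius $\rho$ and angle $\chi$, and the field point $\vec r$ on the $z$ axis.]
    $$ \vec E(\vec r) = \frac{1}{4\pi\epsilon_0} \int_S dS' \, \sigma(\vec r') \, \frac{\vec r - \vec r'}{|\vec r - \vec r'|^3} $$
Choose the $z$ axis parallel to the axis of symmetry
with the origin at the centre of the disc, so that $\vec r'$
lies in the $x$-$y$ plane. In Cartesians
    $$ \vec r - \vec r' = (x - x', \; y - y', \; z - 0) $$
From the diagram: $|\vec r - \vec r'| = (\rho^2 + z^2)^{1/2}$ on the $z$
axis, in cylindrical coordinates $(\rho, \chi, z)$.
By symmetry, the electric field on the $z$ axis will be parallel to the $z$ axis, because the
sum of all the contributions to the components $E_x$ and $E_y$ will cancel (i.e. integrate
to zero). Therefore we need only calculate $E_z$.
Hence
    $$ \vec E(z) = \frac{1}{4\pi\epsilon_0} \int_0^a \int_0^{2\pi} \rho \, d\rho \, d\chi \; \sigma \, \frac{z \, \vec e_z}{(\rho^2 + z^2)^{3/2}} = \frac{\sigma}{4\pi\epsilon_0} \, 2\pi \int_0^a \frac{z \, \rho \, d\rho}{(\rho^2 + z^2)^{3/2}} \, \vec e_z $$
    $$ = \frac{\sigma z}{2\epsilon_0} \left[ \frac{-1}{(\rho^2 + z^2)^{1/2}} \right]_{\rho=0}^{\rho=a} \vec e_z = \frac{\sigma z}{2\epsilon_0} \left[ \frac{1}{|z|} - \frac{1}{(a^2 + z^2)^{1/2}} \right] \vec e_z = \frac{\sigma}{2\epsilon_0} \left[ \frac{z}{|z|} - \frac{z}{(a^2 + z^2)^{1/2}} \right] \vec e_z $$
Consider two limits:
(i) $z \gg a$: Expand the second term in square brackets
    $$ \frac{1}{(a^2 + z^2)^{1/2}} = \frac{1}{|z|} \left( 1 + a^2/z^2 \right)^{-1/2} = \frac{1}{|z|} \left[ 1 - \frac12 \frac{a^2}{z^2} + O\left( \frac{a^4}{z^4} \right) \right] $$
Keeping only the leading term, we have
    $$ \vec E(z) = \mathrm{sgn}(z) \, \frac{\sigma a^2}{4\epsilon_0 z^2} \, \vec e_z = \mathrm{sgn}(z) \, \frac{Q \, \vec e_z}{4\pi\epsilon_0 z^2} $$
where the signum (or sign) function $\mathrm{sgn}(z) \equiv z/|z|$ is $+1$ for $z > 0$, and $-1$
for $z < 0$; and $Q = \sigma \pi a^2$ is the total charge on the disc. In the far zone, we
recover the field for a point charge, as expected.
(ii) $a \gg z$: In this case, the leading behaviour is obtained by dropping the second
term in square brackets:
    $$ \vec E(z) = \mathrm{sgn}(z) \, \frac{\sigma}{2\epsilon_0} \, \vec e_z $$
This is the electric field due to an infinite charged surface – see later.
The potential is obtained from
    $$ \phi(\vec r) = \frac{1}{4\pi\epsilon_0} \int_S dS' \, \frac{\sigma(\vec r')}{|\vec r - \vec r'|} $$
For the disc, we have
    $$ \phi(z) = \frac{\sigma}{4\pi\epsilon_0} \int_0^a \int_0^{2\pi} \frac{\rho \, d\rho \, d\chi}{(\rho^2 + z^2)^{1/2}} = \frac{\sigma}{2\epsilon_0} \left[ (\rho^2 + z^2)^{1/2} \right]_{\rho=0}^{\rho=a} = \frac{\sigma}{2\epsilon_0} \left[ (a^2 + z^2)^{1/2} - |z| \right] $$
Note: It's often very much easier to find $\phi$ than $\vec E$!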
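(i) $z \gg a$: Expanding as before, we find
    $$ \phi(z) = \frac{\sigma a^2}{4\epsilon_0 |z|} = \frac{Q}{4\pi\epsilon_0 |z|} $$
as expected.
(ii) $a \gg z$: In this case we find a linear potential
    $$ \phi(z) = \frac{\sigma}{2\epsilon_0} \left( a - |z| \right) = -\frac{\sigma}{2\epsilon_0} |z| + \text{constant} $$
Exercise: Check that $\vec E(z) = -\dfrac{\partial\phi}{\partial z} \, \vec e_z$ in each case.

6.8 Gauss' law
We showed previously that the electric field, $\vec E(\vec r)$, due to a charge distribution, $\rho(\vec r)$,
satisfies $\nabla \cdot \vec E = \rho/\epsilon_0$. This is the differential form of Maxwell's first equation.
Integrating $\nabla \cdot \vec E = \rho/\epsilon_0$ over a volume $V$, bounded by a closed surface $S$, and using
the divergence theorem
    $$ \int_V \nabla \cdot \vec E \, dV = \int_S \vec E \cdot d\vec S $$
gives Gauss' law
    $$ \int_S \vec E \cdot d\vec S = \frac{1}{\epsilon_0} \int_V \rho \, dV = \frac{Q}{\epsilon_0} $$
where $Q$ is the total charge enclosed by the volume $V$.
Gauss' Law is extremely useful, particularly for problems with a symmetry,[2] and
also for solving problems in potential theory.
[Gauss's law is also known as Gauss' theorem.]
[2] This goes against the general rule that it is easier to compute the potential (a scalar) first
rather than the electric field (a vector)!

Examples
• Consider a sphere of radius $a$, centred on the origin, with uniform charge density $\rho_0 = Q / (\tfrac43 \pi a^3)$.
By symmetry, the electric field will point radially
outwards, so that $\vec E(\vec r) = E_r(r) \, \vec e_r$.
Integrating over a sphere of radius $r$, we find
    $$ \int_S \vec E \cdot d\vec S = E_r(r) \, 4\pi r^2 = \begin{cases} \tfrac43 \pi r^3 \rho_0 / \epsilon_0 & r \le a \\ \tfrac43 \pi a^3 \rho_0 / \epsilon_0 & r \ge a \end{cases} $$
Therefore
    $$ \vec E(\vec r) = \begin{cases} \dfrac{Q}{4\pi\epsilon_0} \, \dfrac{\vec r}{a^3} & r \le a \\[2ex] \dfrac{Q}{4\pi\epsilon_0} \, \dfrac{\vec r}{r^3} & r \ge a \end{cases} $$
Outside the sphere, the electric field appears to come from a point source, and
inside it increases linearly with $r$.
[Figure: $|\vec E|$ against $r$, rising linearly up to $r = a$ and falling as $1/r^2$ beyond.]
We can obtain the electrostatic potential from
    $$ \vec E = E_r \, \vec e_r = -\nabla\phi = -\frac{\partial\phi}{\partial r} \, \vec e_r $$
Integrating with respect to $r$ gives
    $$ \phi(r) = \begin{cases} \dfrac{Q}{4\pi\epsilon_0} \, \dfrac{3a^2 - r^2}{2a^3} & r \le a \\[2ex] \dfrac{Q}{4\pi\epsilon_0} \, \dfrac{1}{r} & r \ge a \end{cases} $$
where for $r > a$ we choose the constant of integration to be zero so that $\phi \to 0$
as $r \to \infty$. The potential outside the sphere is again that of a point charge.
For $r < a$, we choose the constant of integration so that $\phi$ is continuous across
the boundary. Note that $\phi$ and its first derivative are continuous at $r = a$, but the
second derivative of $\phi$ is discontinuous there, so $E_r$ is continuous but has a cusp.
• Consider a long (infinite) straight wire with constant charge/unit length $\lambda$.
Using cylindrical coordinates with the $z$ axis parallel to the wire, we integrate over a cylinder of
length $L$ and radius $\rho$ with its axis along the wire.
By symmetry we must have $\vec E = E_\rho(\rho) \, \vec e_\rho$.
[Figure: cylinder of length $L$ and radius $\rho$ coaxial with the wire.]
Using Gauss' Law, we get
    $$ \int \vec E \cdot d\vec S = E_\rho(\rho) \, 2\pi\rho L + \underbrace{0}_{\text{ends}} = \frac{\lambda L}{\epsilon_0} $$
This gives
    $$ \vec E(\vec r) = \frac{\lambda}{2\pi\epsilon_0} \, \frac{1}{\rho} \, \vec e_\rho $$
The potential can be found by integrating the electric field with respect to $\rho$
    $$ \phi(\vec r) = -\frac{\lambda}{2\pi\epsilon_0} \ln\left( \frac{\rho}{\rho_0} \right) $$
where $\rho_0$ is a constant of integration.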
r
• Infinite flat sheet of charge with constant charge density σ per unit area.
Integrate over a cylindrical ‘Gaussian pill box’ with axis perpendicular to the
sheet. See tutorial, and below.
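A minimal sketch of the pill-box argument for the flat sheet, assuming the sheet lies in the plane $z = 0$ and the pill box has cross-sectional area $A$ with end faces at $\pm z$ (these labels are illustrative choices, not fixed by the notes). By symmetry $\mathbf{E} = E_z(z)\,\mathbf{e}_z$ with $E_z(-z) = -E_z(z)$, so only the two end faces contribute to the flux:
$$\int_S \mathbf{E}\cdot d\mathbf{S} = 2E_z(z)\,A = \frac{\sigma A}{\epsilon_0} \qquad (z > 0)$$
Hence
$$\mathbf{E}(z) = \frac{\sigma}{2\epsilon_0}\,\mathrm{sign}(z)\,\mathbf{e}_z$$
which is independent of the distance from the sheet. Integrating $E_z = -\partial\phi/\partial z$ then gives
$$\phi(z) = -\frac{\sigma}{2\epsilon_0}\,|z| + \text{constant}$$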
6.9 Boundaries
Useful results for the changes in the normal and tangential components of the electric
field across a boundary may be obtained using Gauss' law and Stokes' theorem.
Consider a surface carrying surface charge density $\sigma$. The electric field on one side
of the boundary is $\mathbf{E}_1$, and on the other $\mathbf{E}_2$. The unit normal to the surface is $\mathbf{n}$
(pointing from side 1 to side 2).
6.9.1 Normal component
Applying Gauss' law to the small cylindrical 'Gaussian pillbox' with infinitesimal
height $\delta l$ and cross-sectional area $A$, shown in the figure, gives
$$\int_S \mathbf{E}\cdot d\mathbf{S} = \frac{1}{\epsilon_0}\int_V \rho\,dV$$
[Figure: Gaussian pillbox of area $A$ and height $\delta l$ straddling the surface $S$ with charge density $\sigma$; regions 1 and 2 lie on either side, with unit normal $\mathbf{n}$.]
The curved side of the pillbox contributes nothing as $\delta l \to 0$, so this gives
$$\left(E_{2\perp} - E_{1\perp}\right) A = \frac{\sigma A}{\epsilon_0}$$
where $E_{1\perp}$ and $E_{2\perp}$ are the components
of the electric field perpendicular to the
surface. This gives
$$\mathbf{n}\cdot\left(\mathbf{E}_2 - \mathbf{E}_1\right) = \frac{\sigma}{\epsilon_0}$$
Taking the potential $\phi$ to be continuous across the boundary means that the gradient, and hence the electric field $\mathbf{E}$, can be discontinuous across the boundary when
$\sigma \neq 0$.
In this case the discontinuity is proportional to the surface charge density.
6.9.2 Tangential component
Let us apply Stokes' theorem to the rectangle of length $l$ and infinitesimal width $\delta l$ in the figure
$$\int_S \nabla\times\mathbf{E}\cdot d\mathbf{S} = \oint_C \mathbf{E}\cdot d\mathbf{r}$$
[Figure: rectangular loop $C$ of length $l$ and width $\delta l$ straddling the boundary $S$ between regions 1 and 2, with unit normal $\mathbf{n}$ and fields $\mathbf{E}_1$, $\mathbf{E}_2$.]
Since $\nabla\times\mathbf{E} = 0$ in electrostatics, and the short sides contribute nothing as $\delta l \to 0$, this gives
$$0 = \oint_C \mathbf{E}\cdot d\mathbf{r} = \left(E_{2\parallel} - E_{1\parallel}\right) l$$
where $E_{1\parallel}$ and $E_{2\parallel}$ are the components of the electric field parallel to
the boundary.
This can be written as
$$\mathbf{n}\times\mathbf{E}_1 = \mathbf{n}\times\mathbf{E}_2$$
where we used the fact that the cross product of the electric field $\mathbf{E}$ with $\mathbf{n}$ picks
out the tangential component $\mathbf{E}_\parallel$ of the electric field, since
$$\mathbf{E} = E_\perp\,\mathbf{n} + \mathbf{E}_\parallel$$
Thus the tangential component of $\mathbf{E}$ is continuous across a charged boundary.
6.9.3 Conductors
Physically, a conductor is a material in which 'free' or 'surplus' electrons can move
(or flow) freely when an electric field is applied.
In electrostatics:
• For a conductor in equilibrium, all the charge resides on the surface of the
conductor, i.e. $\rho = 0$ inside a conductor.
This holds because if $\rho \neq 0$ then due to Maxwell's first equation $\nabla\cdot\mathbf{E} = \rho/\epsilon_0$
(or Gauss' law) we must have $\mathbf{E} \neq 0$, and hence the charge would move and we
wouldn't have equilibrium, which is a contradiction. So $\mathbf{E} = 0$ and hence $\phi$ = constant
everywhere inside a conductor.
• The electric field on the surface of a conductor is normal to the surface,
i.e. $\mathbf{E} \parallel \mathbf{n}$, otherwise charge would move along the surface.
Thus if $d\mathbf{r}$ is a displacement on the surface of a conductor, $\mathbf{E}\cdot d\mathbf{r} = -d\phi = 0$,
so $\phi$ = constant on the surface of a conductor, i.e. the surface is an equipotential.
Therefore, on the surface of a conductor, we
have
$$E_t = 0\,, \qquad E_n = \frac{\sigma}{\epsilon_0}$$
[Figure: conductor ($\mathbf{E} = 0$, $\phi$ = const) carrying surface charge $\sigma$, with the field $\mathbf{E}$ in the vacuum outside normal to the surface.]
The external electric field induces a charge on
the surface of the conductor, which in turn
deforms the external field so that it is perpendicular to the conductor surface. In the case
of a conductor, the surface charge is calculated
from the electric field (and not vice-versa as is
the usual case).
For insulators we have the opposite situation: the charges are fixed and we must
calculate the electric field and the potential from the charge density.
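As a quick check of the conductor boundary conditions, consider (as an illustrative assumption, not an example taken from the notes) an isolated conducting sphere of radius $a$ carrying total charge $Q$. In equilibrium the charge spreads uniformly over the surface, so
$$\sigma = \frac{Q}{4\pi a^2}$$
Gauss' law applied to a concentric sphere of radius $r$ gives
$$E_r(r) = 0 \quad (r < a)\,, \qquad E_r(r) = \frac{Q}{4\pi\epsilon_0 r^2} \quad (r > a)$$
so that just outside the surface
$$E_n = \frac{Q}{4\pi\epsilon_0 a^2} = \frac{\sigma}{\epsilon_0}\,, \qquad E_t = 0$$
in agreement with the general result, while $\phi = Q/(4\pi\epsilon_0 a)$ is constant throughout the conductor.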
```
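The uniformly charged sphere example can be cross-checked numerically. The sketch below is not part of the original notes: the function names (`enclosed_charge`, `e_radial`, `potential`, `minus_dphi_dr`) and the sample values of `Q` and `a` are hypothetical choices made for illustration. It evaluates the radial field from Gauss' law and compares it with a finite-difference estimate of $-\partial\phi/\partial r$ computed from the piecewise potential.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (CODATA 2018 value)


def enclosed_charge(r, Q, a):
    """Charge inside radius r for a uniformly charged sphere of radius a."""
    return Q * (r / a) ** 3 if r <= a else Q


def e_radial(r, Q, a):
    """Radial field from Gauss' law: E_r(r) * 4 pi r^2 = Q_enclosed / eps0."""
    return enclosed_charge(r, Q, a) / (4.0 * math.pi * EPS0 * r ** 2)


def potential(r, Q, a):
    """Piecewise potential, with phi -> 0 as r -> infinity."""
    k = Q / (4.0 * math.pi * EPS0)
    if r <= a:
        return k * (3.0 * a ** 2 - r ** 2) / (2.0 * a ** 3)
    return k / r


def minus_dphi_dr(r, Q, a, h=1e-6):
    """Central-difference estimate of -d(phi)/dr (assumes r is not within h of a)."""
    return -(potential(r + h, Q, a) - potential(r - h, Q, a)) / (2.0 * h)


if __name__ == "__main__":
    Q, a = 1e-9, 0.1  # illustrative values: 1 nC on a sphere of radius 10 cm
    for r in (0.02, 0.05, 0.2, 0.5):
        print(f"r = {r:4.2f} m  E_r = {e_radial(r, Q, a):.6e}  "
              f"-dphi/dr = {minus_dphi_dr(r, Q, a):.6e}")
```

The sample radii deliberately avoid `r = a`: the second derivative of the potential jumps there, so a central difference that straddles the boundary is only first-order accurate at that point.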