Complete Solutions to Peskin and Schroeder

The baker got a bit of chocolate on my Peskin and Schroeder, but it was okay.

One of the original unstated goals for this blog (at the time of its inception) was to write up solutions to all the problems in Peskin and Schroeder’s An Introduction to Quantum Field Theory. It was a reasonable idea for a theoretical physics student. Despite having since abandoned theoretical physics, I still occasionally entertain the idea of continuing with the project, an impulse driven perhaps by nostalgia, or some masochistic tendency which drives me towards involved integral calculations. Thankfully, someone has already gone to the trouble, and I can spare myself the hours of pointless TeXing that this particular indulgence may have cost me.


Problem 3.2 from Peskin and Schroeder

This question is actually shockingly easy. It’s one of those questions whose statement is so short (literally just “prove this identity”) that you assume it’s going to take pages of obtuse and demoralising mathematical trickery. Not this time!

The question demands that we prove the Gordon identity, and assures us that this identity will be used in Chapter 6. It goes as follows:

\bar{u}(p^{\prime})\gamma^{\mu}u(p)=\bar{u}(p^{\prime})\left[\frac{p^{\prime\mu}+p^{\mu}}{2m}+\frac{i \sigma^{\mu\nu}q_{\nu}}{2m}\right]u(p)

where q = p^\prime - p

I’m not going to explicitly go through the calculation of this because it’s trivial once you know the steps and have the necessary information about gamma matrices and whatnot.

  • Start by writing \gamma^{\mu}=\frac{1}{2}(\gamma^{\mu}+\gamma^{\mu}). This is a trick that dawned on me exactly halfway through the solution (when I realised I was clearly missing a term), but the factor of a half on the RHS of the identity should give it away immediately, really.
  • What is u(p)? We can write down a plane wave solution to the Dirac equation as \psi(x) = u(p)\exp{(-i p \cdot x)} (we can do this since the field also satisfies the Klein-Gordon equation), so u(p) is a column vector with a constraint obtained from plugging this \psi into the Dirac equation[1],

(i\gamma^{\mu}\partial_{\mu}-m)\psi(x)=0 \rightarrow (\gamma^{\mu}p_{\mu} - m)u(p)=0

  • It’s not terribly important because it doesn’t actually impact the solution, but I feel it relevant to point out that what we have here is a 4×4 matrix (not a scalar) acting on a column vector. Beside the m term is a sneaky invisible \mathbb{I}_{4}, since \gamma^{\mu} p_{\mu} is still a 4×4 matrix, p_{\mu} being just a number (the \mu component of the momentum four-vector).
  • Use the above to write u(p) = \frac{\gamma^{\nu}p_{\nu}}{m} u(p) and substitute this in.
  • Recalling that \bar{u}(p') \equiv u^{\dagger}(p')\gamma^{0}, combine this with the above to write \bar{u}(p') = \bar{u}(p')\frac{\gamma^{\nu}p^{\prime}_{\nu}}{m}. (It is necessary to recall that (\gamma^{\nu})^{\dagger} = \gamma^0 \gamma^{\nu} \gamma^0 and (\gamma^{0})^2 = \mathbb{I}_4 to get here.)
  • After rewriting u(p) and \bar{u}(p') in this way, we end up having some terms like \gamma^{\mu}\gamma^{\nu}. This can be rewritten as g^{\mu\nu} - i \sigma^{\mu \nu} using (hint: adding together) the following properties of gamma matrices:

\{\gamma^{\mu},\gamma^{\nu}\} = 2 g^{\mu \nu}

[\gamma^{\mu},\gamma^{\nu}] = -2 i \sigma^{\mu \nu}

  • Noting that \sigma^{\mu \nu} is antisymmetric, we get… the result, as desired.
  • Just a note: remember that p_{\nu} is just a number and commutes with everything, so it can be neglected during gamma matrix manipulations.
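The steps above can be sanity-checked numerically. Here is a minimal sketch in numpy, assuming the Dirac representation of the gamma matrices and the mostly-minus metric (the identity itself is representation-independent); chi is an arbitrary seed spinor I made up, which works because (\gamma^{\mu}p_{\mu} + m)\chi solves the Dirac equation for any \chi on shell.

```python
import numpy as np

# Gamma matrices in the Dirac representation (a choice of convention;
# the Gordon identity is representation-independent), mostly-minus metric.
I2, Z2 = np.eye(2), np.zeros((2, 2))
paulis = [np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]]),
          np.array([[1, 0], [0, -1]], dtype=complex)]
g0 = np.block([[I2, Z2], [Z2, -I2]])
gamma = [g0] + [np.block([[Z2, s], [-s, Z2]]) for s in paulis]
eta = np.diag([1.0, -1.0, -1.0, -1.0])

def slash(p):
    """gamma^mu p_mu (index lowered with the metric)."""
    p_lo = eta @ p
    return sum(gamma[mu] * p_lo[mu] for mu in range(4))

m = 1.0

def spinor(p3, chi):
    """On-shell momentum and spinor: (pslash + m) chi solves (pslash - m) u = 0."""
    p = np.array([np.sqrt(m**2 + p3 @ p3), *p3])
    return p, (slash(p) + m * np.eye(4)) @ chi

chi = np.array([1.0, 0.5, -0.3 + 0.2j, 0.1])   # arbitrary seed spinor
p, u = spinor(np.array([0.3, -0.2, 0.5]), chi)
pp, up = spinor(np.array([-0.1, 0.4, 0.2]), chi)
q_lo = eta @ (pp - p)            # q_nu = (p' - p)_nu, index down
ubar = up.conj() @ g0            # ubar(p') = u(p')^dagger gamma^0

for mu in range(4):
    # sigma^{mu nu} q_nu, with sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]
    sigma_q = sum(0.5j * (gamma[mu] @ gamma[nu] - gamma[nu] @ gamma[mu]) * q_lo[nu]
                  for nu in range(4))
    lhs = ubar @ gamma[mu] @ u
    rhs = ubar @ ((pp[mu] + p[mu]) / (2 * m) * np.eye(4) + 1j / (2 * m) * sigma_q) @ u
    assert np.isclose(lhs, rhs)
```

Both sides agree for every \mu, for any choice of on-shell momenta and seed spinor.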

I hope I have not just committed that wonderful crime students complain about,

“The result obviously follows…”

But… the result does follow, quite obviously!

1. I’m not going to attempt to use Feynman slash notation because I suspect that figuring out how to work it in WordPress will take longer than just writing “\gamma^{\mu}”. This will do, for now.

Problem 2.1 from Peskin and Schroeder

Chapter one being introductory, this is the first numbered problem in the book. There are of course plenty of potentially problematic points prior to the pressing problem, which I might get back to later, given sufficient time, but for now, let’s do some relaxing classical electrodynamics.

Without sources, we have the action

S = \int \mbox{d}^4 x \left( -\frac{1}{4} F_{\mu \nu} F^{\mu \nu} \right) where F_{\mu \nu} = \partial_{\mu} A_{\nu} - \partial_{\nu} A_{\mu}

(a) They want us to derive Maxwell’s equations as the Euler-Lagrange equations of this action, where we take A_{\mu} as the dynamical variables, then write them in standard form – that is, in terms of the electric and magnetic fields, using E^i = -F^{0i} and \epsilon^{i j k}B^k = -F^{ij}. Here, as is the convention, Greek letters such as \mu run over space and time (so 0,1,2,3) while Roman letters only run over space (1,2,3).

So, the Euler-Lagrange equations are

\partial_{\mu} \left( \frac{\partial \mathcal{L} }{ \partial (\partial_{\mu} A_{\nu})}\right) -\frac{\partial \mathcal{L}}{\partial A_{\nu}} = 0^{\nu}

where \mathcal{L} is of course our Lagrangian (density, but let’s just call it Lagrangian because we’re doing field theory), the term in the brackets in the action.

We can immediately see that the second term vanishes, since F_{\mu \nu} only contains derivatives of A. If there were a source involved, we’d have some term like A_{\rho} J^{\rho} in the Lagrangian, where J^{\rho} is a four-current, and the second term would be non-zero, or have a chance of being non-zero, at least.

So we just care about calculating the first term. This question is actually very straightforward once you can do these sorts of calculations, so I will go through it explicitly. Focusing on the bit in brackets in the first term, using the product rule,

\frac{\partial \mathcal{L}}{\partial (\partial_{\mu} A_{\nu})} = -\frac{1}{4} \frac{\partial (F_{\alpha \beta}F^{\alpha \beta})}{\partial (\partial_{\mu} A_{\nu})} = -\frac{1}{4}\left( F_{\alpha \beta} \frac{\partial F^{\alpha \beta}} { \partial(\partial_{\mu} A_{\nu})} + F^{\alpha \beta} \frac{\partial F_{\alpha \beta}} { \partial(\partial_{\mu} A_{\nu})} \right)

An important point – basically the entire problem – is to mind index grammar. That is to say, in \frac{\partial \mathcal{L} }{ \partial (\partial_{\mu} A_{\nu})}, the term we’re presently looking at, we have free indices \mu and \nu. To avoid ambiguity while including the scalar Lagrangian, we must choose the dummy indices different to the free ones. I have chosen \alpha and \beta because they’re at the start of the alphabet and are certainly not \mu or \nu, but you can pick anything you like. Thus we know we have summation over \alpha and \beta, since they are repeated. Did I mention the boring detail with which I would go through this?

Now we can write F_{\alpha \beta} \frac{\partial F^{\alpha \beta} } { \partial ( \partial_{\mu} A_{\nu} ) } as F^{\alpha \beta} \frac{\partial F_{\alpha \beta} } { \partial ( \partial_{\mu} A_{\nu} ) } by raising and lowering the \alpha and \beta indices. You get this from x_{\gamma}x^{\gamma} = x^{\rho}g_{\rho \gamma} g^{\gamma \sigma}x_{\sigma} = x^{\rho} \delta_{\rho}^{\sigma} x_{\sigma} = x^{\rho}x_{\rho} = x^{\gamma} x_{\gamma} (since \rho is just a dummy index). I feel like this particular step wouldn’t be legal if the metric tensor g had any dependence on \partial_{\mu} A_{\nu}, because then we couldn’t just smoothly extract it from the derivative. I’m not sure why the metric tensor would, but at any rate, it definitely doesn’t here, because we’re in flat Minkowski space. Phew! Just a side note there. Now we have

\frac{\partial \mathcal{L}}{\partial (\partial_{\mu} A_{\nu})} = -\frac{1}{2}\left(F^{\alpha \beta} \frac{\partial F_{\alpha \beta}} { \partial(\partial_{\mu} A_{\nu})} \right)

We must have \frac{\partial (\partial_{\alpha} A_{\beta})}{\partial(\partial_{\mu} A_{\nu})} = \delta_{\alpha}^{\mu} \delta_{\beta}^{\nu} by inspection, so \frac{\partial F_{\alpha \beta}}{\partial(\partial_{\mu} A_{\nu})} = \delta_{\alpha}^{\mu} \delta_{\beta}^{\nu} - \delta_{\beta}^{\mu} \delta_{\alpha}^{\nu}, through the definition of F. Then

\frac{\partial \mathcal{L}}{\partial (\partial_{\mu} A_{\nu})} = -\frac{1}{2} F^{\alpha \beta} \left( \delta_{\alpha}^{\mu} \delta_{\beta}^{\nu} - \delta_{\beta}^{\mu} \delta_{\alpha}^{\nu}\right) = -\frac{1}{2} \left( F^{\mu \nu} - F^{\nu \mu}\right)

F is antisymmetric, so we have to be careful with index order (as we should always be, anyway), and can write -F^{\nu \mu} = F^{\mu \nu}. Then

\frac{\partial \mathcal{L}}{\partial (\partial_{\mu} A_{\nu})} =- F^{\mu \nu}=F^{\nu \mu}

So we now have everything we need for the Euler-Lagrange equations,

\partial_{\mu} \left( \frac{\partial \mathcal{L} }{ \partial (\partial_{\mu} A_{\nu})}\right) -\frac{\partial \mathcal{L}}{\partial A_{\nu}} = 0^{\nu}

\partial_{\mu} \left( \frac{\partial \mathcal{L} }{ \partial (\partial_{\mu} A_{\nu})}\right) - 0^{\nu} = 0^{\nu}

\partial_{\mu} F^{\mu \nu} = 0^{\nu}

Maxwell’s equations! Now let’s write them in a more familiar form. We can see that \nu is a free index, so let’s choose some values for it.

\nu = 0:

\partial_{\mu} F^{\mu 0} = \partial_{0} F^{00} + \partial_{i} F^{i 0} = 0 + \partial_{i} E^i=0 \rightarrow \nabla \cdot \mathbf{E} = 0

This is the sourceless form of Gauss’s law. Makes sense, right? In the absence of sources, the electric field has vanishing divergence. It’s entirely intuitive if your understanding of divergence is as something which vanishes in the absence of sources, at any rate. Moving swiftly along,

\nu = i:

\partial_{\mu} F^{\mu i} = \partial_{0} F^{0i} + \partial_{j} F^{j i} = -\partial_{0} E^i - \partial_{j} F^{ij} = 0^i

\rightarrow \partial_{0} E^{i} + \partial_{j} F^{ij} = \dot{E}^{i} + \partial_{j} ( -\epsilon^{i j k} B^k ) = 0^i

The second term here is minus the index form of the curl, -(\nabla \times \mathbf{B})^{i}, so we get the equation

\dot{\mathbf{E}} = \nabla \times \mathbf{B}, which is also known as Ampère’s circuital law.
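If you want to check the Euler-Lagrange derivation without doing the index gymnastics by hand, here is a sketch using sympy (an assumption about tooling; the A0…A3 function names are placeholders of my own). It builds F from arbitrary A_{\mu}, differentiates the Lagrangian with respect to \partial_{\mu} A_{\nu}, and confirms that the resulting equations are exactly \partial_{\mu} F^{\mu \nu} = 0.

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
X = (t, x, y, z)
eta = sp.diag(1, -1, -1, -1)   # mostly-minus Minkowski metric

# Lower-index potential components A_mu as arbitrary functions of spacetime
A = [sp.Function(f'A{mu}')(*X) for mu in range(4)]

# F_{mu nu} = d_mu A_nu - d_nu A_mu; raise both indices with the metric
F_lo = [[sp.diff(A[n], X[m]) - sp.diff(A[m], X[n]) for n in range(4)] for m in range(4)]
F_up = [[sum(eta[m, a] * eta[n, b] * F_lo[a][b] for a in range(4) for b in range(4))
         for n in range(4)] for m in range(4)]

L = -sp.Rational(1, 4) * sum(F_up[m][n] * F_lo[m][n] for m in range(4) for n in range(4))

for nu in range(4):
    # dL/d(d_mu A_nu) = -F^{mu nu}, so the Euler-Lagrange equation
    # d_mu(dL/d(d_mu A_nu)) = 0 is -d_mu F^{mu nu} = 0
    el = sum(sp.diff(sp.diff(L, sp.Derivative(A[nu], X[mu])), X[mu]) for mu in range(4))
    div_F = sum(sp.diff(F_up[mu][nu], X[mu]) for mu in range(4))
    assert sp.simplify(el + div_F) == 0
```

The sign in the final assertion is worth noticing: the derivative of the Lagrangian is -F^{\mu \nu}, and the overall minus is simply dropped when quoting Maxwell’s equations.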

At this point we are technically missing two of Maxwell’s equations. In the classical field theory course I took, we obtained the last two by noticing the vanishing divergence of the dual (the Hodge dual if I am not mistaken) of the field tensor F, being \tilde{F}^{\mu \nu} = \frac{1}{2} \epsilon^{\mu \nu \rho \sigma} F_{\rho \sigma} (where this is not the same \epsilon we were dealing with above – notice its four indices). I feel like they’re only asking for the first two in this question, however. Nonetheless, the rest are easily obtained by looking at different values for \nu in \partial_{\mu} \tilde{F}^{\mu \nu} = 0^{\nu} as we did above. Getting the electric and magnetic fields out is slightly more involved, but nothing too complicated.

(b) We are asked to construct the energy-momentum tensor for this theory, and then, essentially, to symmetrise it. Let’s start with the non-symmetric form, anyway. So, the definition of the energy-momentum tensor, knowing the Lagrangian and dynamical variables, is (this is in the book for a scalar field, anyway)

T^{\mu}_{\nu} \equiv \frac{\partial \mathcal{L}}{\partial(\partial_{\mu} A_{\alpha})}\partial_{\nu}A_{\alpha} - \mathcal{L} \delta^{\mu}_{\nu}

This is not an arbitrary formula that someone made up. This is the conserved Noether current arising from the symmetry of the theory under spacetime transformations. Invariance under time transformation (homogeneity of time) gives us the conserved quantity we call “energy”, and invariance under space transformation (homogeneity of space) gives conserved linear momentum. Incidentally, invariance under rotations of space (isotropy of space) gives us conservation of angular momentum. Isotropy of time doesn’t make any sense (we need more than 1 time dimension if we’re going to rotate in it) so we’re not getting any quantities there.

We already know what the derivative of the Lagrangian is from earlier, so this gives us

T^{\mu}_{\nu} = F^{\alpha \mu}\partial_{\nu} A_{\alpha} - \mathcal{L} \delta^{\mu}_{\nu}

Raising the \nu index using g^{\lambda \nu} and then relabelling \lambda \rightarrow \nu to bring my notation in line with theirs, we get something which is not symmetric,

T^{\mu \nu} = F^{\alpha \mu} \partial^{\nu} A_{\alpha} - \mathcal{L} g^{\mu \nu}

The question then suggests that we can remedy this by adding to T^{\mu \nu} a term of the form \partial_{\alpha} K^{\alpha \mu \nu}, the K^{\alpha \mu \nu} being antisymmetric in its first two indices. This is fine because the divergence of such an object vanishes (since \partial_{\mu} \partial_{\alpha} = \partial_{\alpha} \partial_{\mu}, but K^{\alpha \mu \nu} = - K^{\mu \alpha \nu}, to labour the point), so the total stress-energy tensor is still conserved (has vanishing divergence).
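The vanishing of this double divergence is easy to confirm symbolically. A small sympy sketch (the f components below are arbitrary placeholder functions of my own, not anything from the problem):

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
X = (t, x, y, z)

# Arbitrary smooth component functions, then antisymmetrised in the
# first two indices to build K^{alpha mu nu} = -K^{mu alpha nu}
f = [[[sp.Function(f'f_{a}{m}{n}')(*X) for n in range(4)]
      for m in range(4)] for a in range(4)]
K = [[[(f[a][m][n] - f[m][a][n]) / 2 for n in range(4)]
      for m in range(4)] for a in range(4)]

for n in range(4):
    # d_mu d_alpha K^{alpha mu nu}: symmetric derivatives contracted
    # against antisymmetric indices, so the sum must vanish
    double_div = sum(sp.diff(K[a][m][n], X[a], X[m])
                     for a in range(4) for m in range(4))
    assert sp.simplify(double_div) == 0
```

The cancellation happens pairwise: each term \partial_{\mu}\partial_{\alpha} K^{\alpha \mu \nu} meets its partner with \alpha and \mu exchanged, which carries the opposite sign.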

The question goes on to tell us to use K^{\alpha \mu \nu} = F^{\mu \alpha} A^{\nu} to get a symmetric stress energy tensor. How nice of them. Let’s check if this new tensor is symmetric.

\hat{T}^{\mu \nu} \equiv T^{\mu \nu} + \partial_{\alpha} (F^{\mu \alpha} A^{\nu}) = F^{\alpha \mu} \partial^{\nu} A_{\alpha} - \mathcal{L} g^{\mu \nu} + \partial_{\alpha} (F^{\mu \alpha} A^{\nu})

The Lagrangian term is clearly symmetric, so we can ignore that. What of the other two terms?

Let’s just expand out the new term. \partial_{\alpha} (F^{\mu \alpha} A^{\nu}) = \partial_{\alpha}( F^{\mu \alpha}) A^{\nu} + F^{\mu \alpha} \partial_{\alpha} A^{\nu}. But Maxwell’s equations say that \partial_{\alpha} F^{\mu \alpha} = 0^{\mu}, so that term goes away. So we just need to check the symmetry of

F^{\alpha \mu} \partial^{\nu} A_{\alpha} + F^{\mu \alpha} \partial_{\alpha} A^{\nu} = F^{\alpha \mu} \partial^{\nu} A_{\alpha} - F^{\alpha \mu} \partial_{\alpha} A^{\nu} = F^{\alpha \mu}(\partial^{\nu} A_{\alpha} - \partial_{\alpha} A^{\nu})

That is, F^{\alpha \mu} F^{\nu}_{\;\;\alpha}. This is symmetric! If we swap the order of the indices in both factors (each swap induces a sign change, since each individual factor is antisymmetric) it becomes more apparent. F^{\mu \alpha} F^{\;\;\nu}_{\alpha} = F^{\mu}_{\;\;\alpha} F^{\alpha \nu}, through raising and lowering the \alpha indices (perfectly legal), so now if \mu \leftrightarrow \nu, we just exchange the locations of the two factors while raising and lowering \alpha. This is fine because F^{\nu \alpha}, for example, is just a scalar field ultimately. But wait, I hear you exclaim (or not), aren’t we talking about a tensor here? This is an important point: yes, F is a tensor, and it’s usually referred to as F^{\mu \nu} to remind us that it’s a second rank tensor (field) in a four-dimensional space(time), but the F^{\mu \nu} terms we have been messing with all this time are actually its components, which are themselves simply scalar fields. So what I just described was legal. I assure you, officer.
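Since the symmetry claim is pure index algebra, a random antisymmetric F suffices to spot-check it numerically. A minimal numpy sketch (the seed and shapes are my own choices):

```python
import numpy as np

# For ANY antisymmetric F^{mu nu}, the combination
# F^{alpha mu} F^nu_alpha - L g^{mu nu} is symmetric in mu, nu.
rng = np.random.default_rng(0)
eta = np.diag([1.0, -1.0, -1.0, -1.0])   # metric, g^{mu nu} = g_{mu nu} here

M = rng.normal(size=(4, 4))
F_up = M - M.T                   # random antisymmetric F^{mu nu}
F_lo = eta @ F_up @ eta          # F_{mu nu}
L = -0.25 * np.sum(F_up * F_lo)  # Lagrangian density -F^2/4

# (F_up @ eta @ F_up)[nu, mu] = F^{nu}_alpha F^{alpha mu} = F^{alpha mu} F^nu_alpha
T_hat = (F_up @ eta @ F_up).T - L * eta
assert np.allclose(T_hat, T_hat.T)
```

Note that no field equations are needed here: the symmetry of F^{\alpha \mu} F^{\nu}_{\;\;\alpha} holds off-shell; Maxwell’s equations were only used earlier to drop the \partial_{\alpha}(F^{\mu \alpha}) A^{\nu} term.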

Finally, the question demands that we obtain the standard formulae for the electromagnetic energy and momentum densities from this energy-momentum tensor. These are just components of the tensor. They call them densities because we’ve been dealing in densities all along (remember when I dropped the descriptor from Lagrangian? Like many things in physics – notably quantities like the speed of light and Planck’s constant – they were secretly there the whole time), so don’t panic at that. It just means we’ll have to integrate over space to get our real quantities.

The energy is \hat{T}^{00} = \mathcal{H} = \mathcal{E} (look again at our first definition of the stress-energy tensor!)

\hat{T}^{00}=F^{0}_{\;\; \alpha}F^{\alpha 0} - \mathcal{L} g^{0 0} = F^{0}_{\;\; i} F^{i 0} - \mathcal{L} = -F^{0 i} F^{i 0} - \mathcal{L} = \mathbf{E}^2 - \mathcal{L}

Now, what is \mathcal{L} in terms of the fields? \mathcal{L} = -\frac{1}{4}F^{\mu \nu}F_{\mu \nu}. Let’s forget about the coefficient and expand these F terms. First expand by \mu and then by \nu

F^{\mu \nu}F_{\mu \nu} = F^{0 \nu}F_{0 \nu} + F^{i \nu}F_{i \nu} = F^{0 i}F_{0 i} + F^{i 0}F_{i 0} + F^{i j}F_{i j}

Plugging the fields in then (each lowered spatial index costs a minus sign, so F_{0 i} = -F^{0 i} = E^{i} while F_{i j} = F^{i j} = -\epsilon^{i j k}B^{k}), we have

F^{\mu \nu}F_{\mu \nu}=-2\mathbf{E}^2 +\epsilon^{i j k}B^{k}\epsilon^{i j l} B^{l}=-2\mathbf{E}^2+2 \delta^{k l}B^{k}B^{l}=2 (\mathbf{B}^2-\mathbf{E}^2)

using the contraction property of Levi-Civita symbols, \epsilon^{i j k}\epsilon^{i j l} = 2 \delta^{k l}, at the end there. So \mathcal{L} = -\frac{1}{4}F^{\mu \nu}F_{\mu \nu} = \frac{1}{2}(\mathbf{E}^2-\mathbf{B}^2).

Therefore, \hat{T}^{00}= \mathbf{E}^2 - \frac{1}{2}(\mathbf{E}^2-\mathbf{B}^2) = \frac{1}{2} (\mathbf{E}^2 +\mathbf{B}^2) = \mathcal{E}
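This algebra is easy to get a sign wrong in, so here is a small numpy spot check: build F^{\mu \nu} from random E and B vectors (my own made-up values) using the conventions E^i = -F^{0i} and \epsilon^{ijk}B^k = -F^{ij}, then compare against 2(\mathbf{B}^2 - \mathbf{E}^2) and \frac{1}{2}(\mathbf{E}^2 + \mathbf{B}^2).

```python
import numpy as np

rng = np.random.default_rng(1)
eta = np.diag([1.0, -1.0, -1.0, -1.0])

# 3D Levi-Civita symbol
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[j, i, k] = 1.0, -1.0

E, B = rng.normal(size=3), rng.normal(size=3)
F_up = np.zeros((4, 4))
F_up[0, 1:] = -E                               # F^{0i} = -E^i
F_up[1:, 0] = E                                # F^{i0} = E^i
F_up[1:, 1:] = -np.einsum('ijk,k->ij', eps, B)  # F^{ij} = -eps^{ijk} B^k

F_lo = eta @ F_up @ eta
F2 = np.sum(F_up * F_lo)                       # F^{mu nu} F_{mu nu}
assert np.isclose(F2, 2 * (B @ B - E @ E))

L = -0.25 * F2
T00 = (F_up @ eta @ F_up)[0, 0] - L * eta[0, 0]  # F^0_alpha F^{alpha 0} - L g^{00}
assert np.isclose(T00, 0.5 * (E @ E + B @ B))
```

Both assertions pass for any E and B, confirming the relative sign between the electric and magnetic contributions to F^2.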

Now, for the momentum density. In this case \mu = 0 and \nu = i.

\hat{T}^{0 i}=\mathcal{S}^i=F^{0}_{\;\;\alpha}F^{\alpha i }-\mathcal{L} g^{0 i}=F^{0}_{\;\;\alpha} F^{\alpha i} = F^{0}_{\;\; j} F^{j i} = F^{0 j} F^{i j}

\mathcal{S}^i= F^{0 j} F^{i j} = -E^{j} (- \epsilon^{i j k} B^k) = \epsilon^{i j k}E^{j}B^{k} = (\mathbf{E} \times \mathbf{B})^i

Which is, up to a constant of proportionality, the Poynting vector.
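The momentum density can be spot-checked the same way: build F^{\mu \nu} from random E and B (my own values) using the conventions E^i = -F^{0i} and \epsilon^{ijk}B^k = -F^{ij}, and compare \hat{T}^{0 i} with (\mathbf{E} \times \mathbf{B})^i. A numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
eta = np.diag([1.0, -1.0, -1.0, -1.0])

# 3D Levi-Civita symbol
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[j, i, k] = 1.0, -1.0

E, B = rng.normal(size=3), rng.normal(size=3)
F_up = np.zeros((4, 4))
F_up[0, 1:] = -E                               # F^{0i} = -E^i
F_up[1:, 0] = E                                # F^{i0} = E^i
F_up[1:, 1:] = -np.einsum('ijk,k->ij', eps, B)  # F^{ij} = -eps^{ijk} B^k

# T-hat^{0 i} = F^0_alpha F^{alpha i} = (F eta F)[0, i]
S = (F_up @ eta @ F_up)[0, 1:]
assert np.allclose(S, np.cross(E, B))
```

The g^{0 i} term drops out automatically since the metric is diagonal, matching the calculation above.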

And now we are done. It’s not a difficult problem, but you do need to keep an eye on the indices.