Efficient production of n = 2 Positronium in S states

We routinely excite positronium (Ps) into its first excited state (n = 2) via single-photon resonant excitation [NJP. 17, 043059]. Most of the time this is an intermediate step for subsequent excitation to Rydberg (high n) states [PRL. 114, 173001], but there is plenty of interesting physics to be explored in n = 2 alone, as we discussed in one of our recent studies [PRL. 115, 183401 and PRA. 93, 012506].

In this study we showed that the polarisation of the excitation laser and the electric field that the atoms are subjected to have a drastic effect on the effective lifetime of the excited states, and hence on when the Ps annihilates.

qexp

Above you can see the data for two laser polarisations, showing the signal parameter S(%) as a function of electric field. This is essentially a measure of how likely Ps is to annihilate compared to ground-state (n = 1) Ps. That is to say, if S(%) is positive then n = 2 Ps in that configuration annihilates with a shorter lifetime than n = 1 Ps (142 ns), whereas if S(%) is negative then n = 2 Ps annihilates with a lifetime longer than 142 ns. These longer lifetimes are present for the parallel polarisation (panel a).

Using this polarisation and applying a large negative or positive electric field (around 3 kV/cm) provides such long lifetimes because the excited state contains a significant amount of triplet S character (2S), a substate of n = 2 with spin = 1 and \ell = 0. If the Ps atoms are then allowed to travel (adiabatically) to a region of zero nominal electric field (our experimental set-up [RSI. 86, 103101] guarantees such transport), then they will be made up almost entirely of this long-lived triplet S character, and will thus annihilate at much later times than the background n = 1 atoms. These delayed annihilations can be easily detected by taking the gamma-ray spectrum recorded by our LYSO detectors [NIMA. 828, 163] when the laser is on resonance (“Signal”), and subtracting from it the spectrum recorded when the laser is off resonance (“Background”).

The figure above shows such spectra taken with the parallel laser polarisation, at a field where there should be minimal 2S production (a), and at a field where triplet S character is maximised (b). In the second case there are clearly far more annihilations at later times, indicated by the positive values of the data at times up to 800 ns. This is clear evidence that we have efficiently produced n = 2 triplet S states of Ps using single-photon excitation. Previous studies of 2S Ps produced such states either by collisional methods [PRL. 34, 1541], which are much less efficient than single-photon excitation, or by two-photon excitation, which is also less efficient, requires much more laser power, and is limited by photo-ionisation [PRL. 52, 1689].

This observation is the initial step before we begin a new set of experiments where we will attempt to measure the n = 2 hyperfine structure of Ps using microwaves!

P.A.M. Dirac

Yesterday marked the 114th anniversary of the birth of Paul Adrien Maurice Dirac, one of the world’s greatest ever theoretical physicists. Born on the 8th of August 1902 in Bristol (UK), Dirac studied for his PhD at St. John’s College, Cambridge University, where he would subsequently discover the equation that now bears his name,

iγ·∂ψ = mψ.

The Dirac equation is a solution to the problem of describing an electron in a way that is consistent with both quantum mechanics and Einstein’s theory of relativity. His solution was unique in its natural inclusion of the electron “spin”, which had to otherwise be invoked to account for fine structure in atomic spectra. His brilliant contemporary, Wolfgang Pauli, described Dirac’s thinking as acrobatic. And several of Dirac’s theories are regarded as among the most beautiful and elegant of modern physics.

An important prediction of the Dirac equation is the existence of the anti-electron (also known as the positron). This particle is equal in mass to the more familiar electron, but has the opposite electric charge. Dirac published his theory of the anti-electron in 1931 – two years before “the positive electron” was discovered by Carl Anderson. Dirac accurately mused that the anti-proton might also exist, and most physicists now believe that all particles possess an antimatter counterpart. But antimatter is apparently – and as yet inexplicably – much scarcer than matter.

In 1933 Dirac shared the Nobel prize in physics with Erwin Schrödinger “for the discovery of new productive forms of atomic theory”. Dirac died aged 82 in 1984. He’s commemorated in Westminster Abbey by an inscription in the Nave, not far from Newton’s monument. Separated in life by more than two centuries, Paul Dirac and Sir Isaac Newton are arguably the fathers of antimatter and gravity.

http://www.westminster-abbey.org/our-history/people/paul-dirac

The Strangest Man by Graham Farmelo is a fascinating account of Dirac’s life and work.

A guide to positronium

Positronium (Ps) is a hybrid of matter and antimatter. Made of just two particles – an electron and a positron – the atomic structure of Ps is similar to hydrogen. The ultimate aim of our experiments at UCL is to observe deflection of a Ps beam due to gravity, as nobody knows if antimatter falls up or down.

In this post, we outline how we recently managed to guide positronium using a quadrupole. Because the Ps atom doesn’t have a heavy nucleus, it’s extremely light and will typically move very, very quickly (~100 km/s). A refinement of the guiding techniques we used can, in principle, be applied to decelerate Ps atoms to speeds that are more suitable for studying gravity.

IMG_20160704_190341-01-01
Point-of-view of a Ps atom entering a quadrupole guide

Before guiding positronium we have to create some. Positrons emitted from a radioisotope of sodium are trapped in a combination of electric and magnetic fields. They are ejected from the trap and implanted into a thin film of mesoporous silica, where they bind to electrons to form Ps atoms; the network of tiny pores provides a way for these to get out and into vacuum.

The entire Ps distribution is emitted from the film in a time-window of just a few billionths of a second.  This is well matched to our pulsed lasers, which we use to optically excite the atoms to Rydberg levels (high principal quantum number, n). If we didn’t excite the Ps then the electron-positron pairs would annihilate into gamma-ray photons in much less than a millionth of a second, and each would be unlikely to travel more than a few cm. However, in the excited states self-annihilation is almost completely suppressed and they can, therefore, travel much further.

Each Rydberg level contains many sublevels that have almost the same internal energy. This means that for a given n its sublevels can all be populated using a narrow range of laser wavelengths. But if an electric field is applied the sublevels are shifted. This so-called “Stark shift” comes from the electric dipole moment, i.e., the distribution of electric charge within the atom. The dipole is different for each sublevel and it can either be aligned or anti-aligned to the electric field. This results in a range of both positive and negative energy shifts, broadening the overall spectral line. Tuning the laser wavelength can now be used to select a particular sublevel. Or rather, to select a Rydberg-Stark state with a particular electric dipole moment. Stark broadening is demonstrated in the plot below. [For higher electric fields the individual Stark states can be resolved.]

linescan_fancy
Stark broadening of n=12 Ps in an electric field.
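To give a feel for the size of these shifts, here is a minimal Python sketch. It is not the analysis behind the plot above; it simply assumes the textbook first-order (hydrogenic) Stark formula, ΔE = (3/2) n k e a_Ps F, with a Ps Bohr radius a_Ps = 2 a_0 and the Stark index k running from -(n - |m| - 1) to +(n - |m| - 1) in steps of 2. The function names and the 500 V/cm field are illustrative choices only.

```python
# First-order Stark shifts of Rydberg positronium (hydrogenic approximation).
# Assumption: Delta_E = (3/2) * n * k * e * a_Ps * F, with a_Ps = 2 * a_0.

E_CHARGE = 1.602176634e-19    # elementary charge (C)
A_BOHR = 5.29177210903e-11    # hydrogen Bohr radius a_0 (m)
A_PS = 2.0 * A_BOHR           # Ps Bohr radius: the reduced mass is m_e / 2


def stark_indices(n, m=0):
    """Allowed values of the Stark index k for a given n and |m|."""
    k_max = n - abs(m) - 1
    return list(range(-k_max, k_max + 1, 2))


def stark_shift_ev(n, k, field_v_per_cm):
    """First-order Stark energy shift in eV for a field given in V/cm."""
    field_v_per_m = field_v_per_cm * 100.0
    shift_joule = 1.5 * n * k * E_CHARGE * A_PS * field_v_per_m
    return shift_joule / E_CHARGE


if __name__ == "__main__":
    n = 12
    field = 500.0  # V/cm, an arbitrary illustrative value
    for k in stark_indices(n):
        shift_mev = stark_shift_ev(n, k, field) * 1e3
        print(f"n={n}, k={k:+d}: shift = {shift_mev:+.3f} meV")
```

In this approximation the states with positive k are shifted up in energy by the field and those with negative k are shifted down, which is the spread of positive and negative shifts described above.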

The Stark effect provides a way to manipulate the motion of neutral atoms using electric fields. As an atom moves between regions of different electric field strength its internal energy will shift according to its electric dipole moment. However, because the total energy must be conserved the kinetic energy will also change. Depending on whether the atom experiences a positive or negative Stark shift, increasing fields will either slow it down or speed it up. The Rydberg-Stark states can, therefore, be broadly grouped as either low-field-seeking (LFS) or high-field-seeking (HFS). The force exerted by the electric field is much smaller than would be experienced by a charged particle. Nevertheless, this effect has been demonstrated as a useful tool for deflecting, guiding, decelerating, and trapping Rydberg atoms and polar molecules.

quadrupole_cartoon
Rydberg positronium source, lasers, gamma-ray detectors, and quadrupole guide.

A quadrupole is a device made from a square array of parallel rods.  Positive voltage is applied to one diagonal pair and negative to the other. This creates an electric field that is zero along the centre but which is very large directly between neighbouring rods. The effect this has on atoms in LFS states is that when they drift away from the middle into the high fields they slow down, and eventually turn around and head back towards the centre, i.e., they are guided. On the other hand, atoms in HFS states are steered away from the low-field region and out to the side of the quadrupole.

Stark
Electric field strength and trajectory calculation for low-field-seeking (blue),  high-field-seeking (red), and unaffected (green) Rydberg-Stark states of positronium in a quadrupole guide.
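The shape of the guiding field can be sketched with an idealised model. The snippet below is not the calculation used for the figure; it assumes a perfect two-dimensional quadrupole potential, V(x, y) = 2 V0 x y / r0^2, for which the field strength grows linearly with distance from the axis. The voltage V0 and the characteristic radius r0 are hypothetical values chosen only for illustration, and real cylindrical rods will deviate from this form away from the centre.

```python
import math


def quadrupole_field(x, y, v0, r0):
    """Return (Ex, Ey) in V/m for the ideal potential V = 2 * v0 * x * y / r0**2.

    x, y and r0 are in metres, v0 in volts. This is an idealised model,
    valid near the axis of the guide.
    """
    scale = 2.0 * v0 / r0**2
    # E = -grad(V): dV/dx = scale * y and dV/dy = scale * x
    return (-scale * y, -scale * x)


def field_magnitude(x, y, v0, r0):
    """Electric field strength |E| in V/m at position (x, y)."""
    ex, ey = quadrupole_field(x, y, v0, r0)
    return math.hypot(ex, ey)


if __name__ == "__main__":
    v0 = 1000.0   # volts (hypothetical)
    r0 = 5.0e-3   # metres (hypothetical)
    for r_mm in (0.0, 1.0, 2.0, 3.0):
        strength = field_magnitude(r_mm * 1e-3, 0.0, v0, r0)
        print(f"r = {r_mm:.1f} mm: |E| = {strength / 100.0:.0f} V/cm")
```

The field is zero on the axis and rises in every direction away from it, so an atom whose energy increases with field (an LFS state) sees a potential-energy minimum along the centre of the guide.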

Using gamma-ray detectors at either end of a 40 cm long quadrupole we measured how many Rydberg Ps atoms entered and how many were transported through it. With the guide switched off some atoms from all states were transmitted. However, with the voltages switched on there was a five-fold increase in the number of low-field-seeking atoms getting through, whereas the high-field-seeking atoms could no longer pass at all.

total_fancy2
The number of Rydberg Ps atoms entering (red) and passing all the way through (blue) the quadrupole guide.

A large part of why we chose to use positronium for our gravity studies is that it’s electrically neutral. As the electromagnetic force is so much stronger than gravity we, therefore, avoid otherwise overwhelming effects from stray electric fields. However, by exciting Ps to Rydberg-Stark states with large electric dipole moments we reintroduce the same problem. Nonetheless, it should be possible to exploit the LFS states to decelerate the atoms to low speeds, and then we can use microwaves to drive them to states with zero dipole moment. This will give us a cold Rydberg Ps distribution that is insensitive to electric fields and which can be used for gravitational deflection measurements.


Our article “Electrostatically guided Rydberg positronium” has been published in Physical Review Letters.

14th International Workshop on Slow Positron Beam Techniques & Applications

Members of the UCL positronium laser spectroscopy group recently attended the 14th International Workshop on Slow Positron Beam Techniques & Applications (SLOPOS14) in Matsue, Japan. The conference took place from the 22nd to the 27th of May 2016.  During this time we heard many great talks from groups working with positrons and positronium (Ps) from all over the world.

We also presented some of our work, including Rydberg-Stark states of Ps (PRL. 114, 173001), laser-enhanced time-of-flight spectroscopy (NJP. 17, 043059), Ps production in cryogenic environments (PRB. 93, 125305), controlling annihilation of excited-state Ps (PRL. 115, 183401 & PRA. 93, 012506), and improved SSPALS measurements with LYSO scintillators (NIM. A, 828, 163). The talk “Controlling Annihilation Dynamics of n = 2 Positronium with Electric Fields”, given by Alberto M. Alonso (PhD student), was awarded a prize for making an outstanding contribution to the conference!

SLOPOS14 was a great opportunity to meet fellow physicists working in our field, to learn of their progress and to share our own.  These meetings are important for discussing new results and new ideas, and for building collaborations for future work. We are extremely grateful to the organisers for their hard work in hosting the event.

slopos14photo

We look forward to the next SLOPOS, which will be held in Romania in 2019.

 

Antimatter annihilation, gamma rays, and Lutetium-yttrium oxyorthosilicate

Doing experiments with antimatter presents a number of challenges. Not least of these is that when a particle meets its antiparticle the two will quickly annihilate. As far as we know we live in a universe that is dominated by matter. We are certainly made of matter and we run experiments in matter-based labs. How then can we confine positrons (anti-electrons) when they disappear on contact with any of our equipment?

Paul Dirac – the theoretical physicist who predicted the existence of antiparticles almost 90 years ago – proposed the solution even before there was evidence that antimatter was any more than a theoretical curiosity. In 1931 Dirac wrote,

“if [positrons] could be produced experimentally in high vacuum they would be quite stable and amenable to observation.”

P. A. M. Dirac (1931) 

Our positron beamline makes use of vacuum chambers and pumps to achieve pressures as low as 12 orders of magnitude less than atmosphere. Inside of our buffer-gas trap, where the vacuum is deliberately not so vacuous, the positrons can still survive for several seconds without meeting an electron. And as positrons are electrically charged they can easily be prevented from touching the chamber walls using a combination of electric and magnetic fields. (For neutral forms of antimatter the task is more difficult.  Nevertheless, the ALPHA experiment was able to trap antihydrogen for 1000 s using a magnetic bottle.)

An antiparticle can be thought of as a mirror image of a particle, with a number of equal but opposite properties, such as electric charge. When the two meet and annihilate, these properties sum to zero and nothing remains. Well, almost nothing. Electrons and positrons have the same mass (m = 9.10938356 × 10^-31 kg), and when the two annihilate this is converted to energy in accordance with Einstein’s well-known formula

 E = m c^2,

where c is the speed of light (299792458 m/s). For this reason antimatter has long fascinated science fiction writers: there is a potentially vast amount of energy available – e.g., for propelling spaceships or destroying the Vatican – when only a small amount of antimatter annihilates with matter. However, the difficulty in accumulating even minuscule amounts means that applications in weaponry and propulsion are a very long way from viable.

When an electron and positron annihilate the energy takes the form of gamma-ray photons. Usually two, each with 511 keV of energy. Although annihilation raises some difficulties, the distinct signature it produces can be very useful for detection purposes. Gamma rays are hundreds of thousands of times more energetic than visible photons. To detect them we use scintillation materials that absorb the gamma ray energy and then emit visible light. Photo-multiplier tubes are then used to convert the visible photons into an electric current, which can then be recorded with an oscilloscope.
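The 511 keV figure follows directly from the formula and the numbers quoted above. As a quick check, here is a short Python sketch; the only value not taken from the text is the conversion factor between joules and electronvolts.

```python
# Rest-mass energy of the electron (or positron) from E = m c^2.

M_ELECTRON_KG = 9.10938356e-31   # electron/positron mass, as quoted in the text
C_M_PER_S = 299792458.0          # speed of light, as quoted in the text
JOULE_PER_EV = 1.602176634e-19   # 1 eV expressed in joules


def rest_energy_kev(mass_kg):
    """Return the rest-mass energy m c^2 in keV."""
    energy_joule = mass_kg * C_M_PER_S**2
    return energy_joule / JOULE_PER_EV / 1e3


if __name__ == "__main__":
    per_particle = rest_energy_kev(M_ELECTRON_KG)
    print(f"Energy per particle: {per_particle:.1f} keV")
    print(f"Energy per annihilating pair: {2 * per_particle:.1f} keV")
```

An electron-positron pair at rest therefore releases roughly 1022 keV in total, which is shared equally when two photons are emitted.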

Many materials are known to scintillate when exposed to gamma rays, although their characteristics differ widely. The properties that are most relevant to our work are the density (which must be high to absorb the gamma rays), the length of time that a scintillation signal takes to decay (this can vary from a few ns to a few μs), and the number of visible photons emitted, i.e., the light output.

NaI

Encased sodium iodide crystal 

Sodium iodide (NaI) is a popular choice for antimatter research because the light output is very high, so individual annihilation events can easily be detected. However, for some applications the decay time is too long (~1 μs).

NaI_gamma-rays

PMT output for individual gamma-ray  detection with NaI

The material we normally use to perform single-shot positron annihilation lifetime spectroscopy (SSPALS) is lead tungstate (PbWO4) – the same type of crystal is used in the CMS electromagnetic calorimeter. This material has a fast decay time of around 10 ns, which allows us to resolve the 142 ns lifetime of ground-state positronium (Ps).  However, the amount of visible light emitted from PbWO4 is relatively low (~ 1% of NaI).

Recently we began experimenting with using Lutetium-yttrium oxyorthosilicate (LYSO) for SSPALS measurements, even though its decay time of ~40 ns is considerably slower than that of PbWO4.  So, why LYSO?  The main reason is that it has a much higher light output (~ 75% of NaI), therefore we can more efficiently detect the gamma rays in a given lifetime spectrum, and this significantly improves the overall statistics of our analysis.

lyso

An array of LYSO crystals

The compromise with using LYSO is that the longer decay time distorts the lifetime spectra and reduces our ability to resolve fast components. However, most of our experiments involve using lasers to alter the lifetime of Ps (reducing it via magnetic quenching or photoionisation; or extending it by exciting the atoms to Rydberg levels), and we generally care more about seeing how much the 142 ns component changes than about what happens on shorter timescales.   The decay time of LYSO is just about fast enough for this, and the improvement in contrast between signal and background measurements – which comes with the improved statistics – outweighs the loss in timing resolution.
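The effect of the scintillator decay time can be illustrated with a deliberately simple model, which is an assumption made for this sketch rather than the analysis in our article: treat both the Ps decay and the scintillator response as single exponentials, so that the recorded signal is their convolution. For two different time constants this convolution has a closed form, used below.

```python
import math


def detected_rate(t_ns, tau_ps_ns, tau_det_ns):
    """Convolution of two normalised exponentials (requires tau_ps != tau_det).

    Models an exponential Ps decay (lifetime tau_ps_ns) seen through a
    scintillator with a single exponential decay time tau_det_ns.
    """
    if t_ns < 0:
        return 0.0
    difference = math.exp(-t_ns / tau_ps_ns) - math.exp(-t_ns / tau_det_ns)
    return difference / (tau_ps_ns - tau_det_ns)


def peak_time(tau_ps_ns, tau_det_ns):
    """Time (ns) at which the convolved signal reaches its maximum."""
    product = tau_ps_ns * tau_det_ns
    return math.log(tau_ps_ns / tau_det_ns) * product / (tau_ps_ns - tau_det_ns)


if __name__ == "__main__":
    tau_ps = 142.0  # ground-state ortho-Ps lifetime (ns)
    for name, tau_det in (("PbWO4", 10.0), ("LYSO", 40.0)):
        print(f"{name}: signal from the 142 ns component peaks at "
              f"{peak_time(tau_ps, tau_det):.1f} ns")
```

In this model the slower crystal delays and smears the early part of the spectrum, while at late times the signal still falls off with the 142 ns lifetime. That is why the long-lived component remains measurable.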

pwo_lyso.png

SSPALS with LYSO and PbWO4


This post is based on our recent article:

Single-shot positron annihilation lifetime spectroscopy with LYSO scintillators. A. M. Alonso, B. S. Cooper, A. Deller, and D. B. Cassidy. Nucl. Instrum. Methods A 828, 163 (2016). DOI:10.1016/j.nima.2016.05.049.

How long does Rydberg positronium live?

Time-of-flight (TOF) is a simple but powerful technique that consists of accurately measuring the time it takes a particle/ atom/ ion/ molecule/ neutrino/ etc. to travel a known distance. This valuable tool has been used to characterise the kinetic energy distributions of an extensive range of sources, including positronium (Ps) [e.g. Howell et al, 1987], and is exploited widely in ion mass spectrometry.

Last year we published an article in which we described TOF measurements of ground-state (n=1) Ps atoms that were produced by implanting a short (5 ns) pulse of positrons into a porous silica film.  Using pulsed lasers to photoionise (tear apart) the atoms at a range of well-defined positions, we were able to estimate the Ps velocity distribution, finding mean speeds on the order of 100 km/s. Extrapolating the measured flight paths back to the film’s surface indicated that the Ps took on average between 1 and 10 ns to escape the pores, depending on the depth to which the positrons were initially implanted.

When in the ground state and isolated in vacuum the electron and positron that make up a positronium atom will tend to annihilate one another in around 140 ns. Even with a speed of 100 km/s this means that Ps is unlikely to travel further than a couple of cm during its brief existence. Consequently, the photoionisation/ TOF measurements mentioned above were made within 6 mm of the silica film. However, instead of ionising the atoms, our lasers can be reconfigured to excite Ps to high-n Rydberg levels, and these typically live for a great deal longer. The increase in lifetime allows us to measure TOF spectra over much longer timescales (~10 µs) and distances (1.2 m).
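The distances quoted here are easy to check. The sketch below assumes a single speed of 100 km/s and a purely exponential decay with the 142 ns ground-state lifetime; real Ps has a broad velocity distribution, so treat the numbers as rough guides.

```python
import math

GROUND_STATE_LIFETIME_NS = 142.0  # ground-state ortho-Ps lifetime


def mean_decay_length_cm(speed_km_per_s, lifetime_ns=GROUND_STATE_LIFETIME_NS):
    """Mean distance (cm) travelled in one lifetime at a fixed speed."""
    speed_m_per_s = speed_km_per_s * 1e3
    lifetime_s = lifetime_ns * 1e-9
    return speed_m_per_s * lifetime_s * 100.0


def surviving_fraction(distance_mm, speed_km_per_s,
                       lifetime_ns=GROUND_STATE_LIFETIME_NS):
    """Fraction of atoms that have not yet annihilated after distance_mm."""
    time_ns = (distance_mm * 1e-3) / (speed_km_per_s * 1e3) * 1e9
    return math.exp(-time_ns / lifetime_ns)


if __name__ == "__main__":
    print(f"Mean decay length: {mean_decay_length_cm(100.0):.2f} cm")
    print(f"Surviving after 6 mm: {surviving_fraction(6.0, 100.0):.2f}")
    print(f"Surviving after 1.2 m: {surviving_fraction(1200.0, 100.0):.1e}")
```

Under these assumptions most ground-state atoms are still present 6 mm from the film, but essentially none would survive a 1.2 m flight, hence the need for long-lived Rydberg states.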

TOF_schem

The image above depicts the layout of our TOF apparatus. Positrons from a Surko trap are guided by magnets to the silica film, wherein they bind to electrons and are re-emitted as Ps. Immediately after, ultraviolet and infra-red pulsed lasers drive the atoms to n=2 and then to Rydberg states. Unlike the positively charged positrons, the neutral Ps atoms are not deflected by the curved magnetic fields and are able to travel straight along the 1.2 m flight tube, eventually crashing into the end of the vacuum chamber. The annihilation gamma rays are detected there using an NaI scintillator and photomultiplier tube (PMT), and the time delay between Ps production and gamma ray detection is digitally recorded.

unknown

 

The plots above show two different views of time-of-flight spectra accumulated with the infra-red laser tuned to address Rydberg levels in the range of n=10 to 20.  The data shows that more Ps are detected at later times for the higher-n states than for lower-n states.  This is easily explained by fluorescence, i.e., the decay of an excited-state atom via spontaneous emission of a photon.  As the fluorescence lifetime increases with n, the lower-n states are more likely to decay to the ground state and then annihilate before reaching the end of the chamber, reducing the number of gamma rays seen by the NaI detector at later times. We estimate from this data that Ps atoms in n=10 fluoresce in about 3 µs, compared to roughly 30 µs for n=20.
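To see why these lifetimes matter for a 1.2 m flight tube, here is a rough Python estimate. It assumes a single speed of 100 km/s (the mean ground-state speed mentioned earlier, used here only as a representative value) and a single exponential fluorescence decay; it is an illustration, not the fitting procedure used for the data.

```python
import math

FLIGHT_PATH_M = 1.2      # length of the flight tube, from the text
SPEED_M_PER_S = 100e3    # assumed representative speed (~100 km/s)


def flight_time_us(distance_m=FLIGHT_PATH_M, speed_m_per_s=SPEED_M_PER_S):
    """Time in microseconds needed to cover distance_m at a fixed speed."""
    return distance_m / speed_m_per_s * 1e6


def fraction_still_excited(tau_fluorescence_us, time_us):
    """Fraction of atoms that have not fluoresced after time_us."""
    return math.exp(-time_us / tau_fluorescence_us)


if __name__ == "__main__":
    t_flight = flight_time_us()
    # Fluorescence lifetimes estimated in the text for n = 10 and n = 20
    for n, tau_us in ((10, 3.0), (20, 30.0)):
        fraction = fraction_still_excited(tau_us, t_flight)
        print(f"n={n}: {fraction:.1%} still excited after {t_flight:.0f} us")
```

With these numbers only a few percent of n=10 atoms would reach the end of the tube in an excited state, compared with about two thirds of n=20 atoms, consistent with the trend in the spectra.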

This work brings us an important step closer to performing a positronium free-fall measurement.  A flight path of at least ten meters will probably be required to observe gravitational deflection, so we still have some way to go.


This post is based on work discussed in our article:

Measurement of Rydberg positronium fluorescence lifetimes. A. Deller, A. M. Alonso, B. S. Cooper, S. D. Hogan, and D. B. Cassidy. Phys. Rev. A 93, 062513 (2016). DOI:10.1103/PhysRevA.93.062513.

UCL positronium spectroscopy beamline (the first two years)

The UCL Ps spectroscopy positron beamline began producing low-energy positrons almost two years ago, and it has since become slightly longer and somewhat more sophisticated. Though it’s not the most complex scientific machine in the world (compared to, e.g., the LHC) we still find regular use for a 3D depiction of it.  Our model is essentially a cartoon. Typically we use it to create (fairly) accurate schematics that help us to convey the configuration of our equipment at conferences or in publications.

UCL_positron_trap

The snapshot above shows the three main components of the beamline, namely the positron source (left), Surko trap (centre, cross-section), and Ps laser-spectroscopy region (right). The 3D model is built from simplified forms of the various vacuum chambers and pumps, magnetic coils, and detectors, and it shows where these all are in relation to one another. The 45° angled line is being used right now for Rydberg Ps time-of-flight measurements. The source and trap are based on the design developed by Rod Greaves and Jeremy Moxom of First Point Scientific Inc. (unfortunately now defunct). You can read about the details of their design in this article.

To allow you to take a closer look we have created a 3D pdf file that you can download here * (licensed under a Creative Commons Attribution 4.0 License). Be aware we use this for illustration/ communication purposes and it is not an accurate technical model. Nonetheless, using this you can pan, zoom, and rotate around our virtual lab to your heart’s content! No need for 3D glasses, though you will need a recent copy of Adobe Reader (the interactive features probably won’t work in your web browser).

*MD5 checksum c6028573596c9511d9ba0450cd2caa05
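If you would like to confirm that your download is intact, the checksum can be compared with one computed locally. Below is a minimal Python sketch using the standard `hashlib` module; the file name in the usage note is hypothetical, so substitute whatever name your browser gave the download.

```python
import hashlib

# Checksum quoted in the post for the 3D pdf file
EXPECTED_MD5 = "c6028573596c9511d9ba0450cd2caa05"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files need not fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

For example, `matches_expected("beamline_3d.pdf")` (hypothetical file name) should return `True` for an uncorrupted copy.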

And here’s how the lab looks in real life,

beamline_2016_3

 

 

 

Photoemission of Ps from single-crystal p-Ge semiconductors

The production of positronium in a low-temperature (cryogenic) environment is in general only possible using materials that operate via non-thermal processes. In previous experiments we showed that porous silica films can be used in this way at temperatures as low as 10 K, but that Ps formation at these temperatures can be inhibited by condensation of residual gas, or by laser irradiation.

It has been known for several years now that some semiconductors can produce Ps via an exciton-like surface state [1, 2]. Si and Ge are the only semiconductors that have been studied so far, but it is likely that others will work in a similar way. The electronic surface state(s) underlying the Ps production can be populated thermally, resulting in temperature-dependent Ps formation that is very similar to what is observed in metals (for which the Ps is actually generated via thermal desorption of positrons in surface states). Since laser irradiation can also populate electronic surface states, and is known to result in Ps emission from Si at room temperature, the possibility exists that this process can be used at cryogenic temperatures.

We have studied this possibility using p-type Ge(100) crystals. Initial sample preparation involves immersion in acid (HCl), and this process leaves the sample with chlorine-terminated dangling bonds, which can be thermally desorbed. We attached the samples to a cold head with a high-temperature interface that can be heated to 700 K and cooled to 12 K. The heating is necessary to remove Cl from the crystal surface, which otherwise inhibits Ps formation. FIG 1 shows the initial heating cycle that prepares the sample for use. The figure shows the delayed annihilation fraction (which is proportional to the amount of positronium) as a function of temperature.

photoweb

FIG. 1:  Delayed fraction as a function of sample temperature after initial installation into the vacuum system. After the surface Cl has been thermally desorbed the amount of Ps emitted at room temperature is substantially increased.

As has been previously observed [2], using visible laser light at 532 nm can increase the Ps yield. This occurs because the electrons necessary for Ps formation can be excited to surface states by the laser. However, these states have a finite lifetime, and as both the laser and positron pulses are typically around 5 ns wide these have to be synchronized in order to optimise the photoemission effect. This is shown in FIG 2. These data indicate that the electronic surface states are fairly short lived, with lifetimes of less than 10 ns or so. Longer-lived surface states were observed in similar measurements using Si.

phototime web

FIG 2: Delayed fraction as a function of the arrival time of the laser relative to the incident positron pulse. These data are recorded at room temperature. The laser fluence was ~ 15 mJ/cm^2.
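For a sense of scale, the quoted fluence can be converted into a number of photons per unit area. This short Python sketch uses only the 532 nm wavelength and ~15 mJ/cm^2 fluence given above, together with standard values of the Planck constant and the speed of light; the function names are illustrative.

```python
PLANCK_J_S = 6.62607015e-34      # Planck constant (J s)
C_M_PER_S = 299792458.0          # speed of light (m/s)
JOULE_PER_EV = 1.602176634e-19   # 1 eV expressed in joules


def photon_energy_ev(wavelength_nm):
    """Energy of a single photon in eV for a wavelength given in nm."""
    energy_joule = PLANCK_J_S * C_M_PER_S / (wavelength_nm * 1e-9)
    return energy_joule / JOULE_PER_EV


def photons_per_cm2(fluence_mj_per_cm2, wavelength_nm):
    """Number of photons per cm^2 delivered by a pulse of the given fluence."""
    energy_per_photon_joule = photon_energy_ev(wavelength_nm) * JOULE_PER_EV
    return fluence_mj_per_cm2 * 1e-3 / energy_per_photon_joule


if __name__ == "__main__":
    print(f"Photon energy at 532 nm: {photon_energy_ev(532.0):.2f} eV")
    print(f"Photons per cm^2 at 15 mJ/cm^2: {photons_per_cm2(15.0, 532.0):.1e}")
```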

When Ge is cooled the Ps fraction drops significantly. This is not related to surface contamination, but is due to the lack of thermally generated surface electrons. However, surface contamination does further reduce the Ps fraction (much more quickly than is the case for silica). This effect is shown in FIG 3. If a photoemission laser is applied to a cold contaminated Ge sample two things happen: (1) the laser desorbs some of the surface material and (2) photoemission occurs. This means that Ge can be used to produce Ps with a high efficiency at any temperature, and we don’t even have to worry about the vacuum conditions (within some limits).

laser_powers

FIG 3: Delayed fraction as a function of the time for which the target was exposed, showing the effect that different laser fluences have on the photoemission process. During irradiation, the positronium fraction is noticeably increased.

There are many possible applications for cryogenic Ps production within the field of antimatter physics, including the formation of antihydrogen via Ps collisions with antiprotons [3], Ps laser cooling and Bose-Einstein condensation [4], as well as precision spectroscopy.

[1] Positronium formation via excitonlike states on Si and Ge surfaces. D. B. Cassidy, T. H. Hisakado, H. W. K. Tom, and A. P. Mills, Jr. Phys. Rev. B, 84, 195312 (2011). DOI:10.1103/PhysRevB.84.195312.

[2] Photoemission of Positronium from Si. D. B. Cassidy, T. H. Hisakado, H. W. K. Tom, and A. P. Mills, Jr. Phys. Rev. Lett. 107, 033401 (2011). DOI:10.1103/PhysRevLett.107.033401.

[3] Antihydrogen Formation via Antiproton Scattering with Excited Positronium. A. S. Kadyrov, C. M. Rawlins, A. T. Stelbovics, I. Bray, and M. Charlton. Phys. Rev. Lett. 114, 183201 (2015). DOI:10.1103/PhysRevLett.114.183201.

[4] Possibilities for Bose condensation of positronium. P. M. Platzman and A. P. Mills, Jr. Phys. Rev. B 49, 454 (1994). DOI:10.1103/PhysRevB.49.454.

Rydberg Positronium Special Report, ICPEAC 2015

One of the conferences that we attended during the summer (ICPEAC 2015) had the necessary set-up to film one of our talks about our recent Rydberg paper. The talk was also summarised in a published IOP abstract.

You can watch our talk along with the rest of the lectures on ICPEAC’s YouTube channel: https://www.youtube.com/watch?v=Cytjc2Er2Co.

ANTIMATTER: who ordered that?

The existence of antimatter became known following Dirac’s formulation of relativistic quantum mechanics, but this incredible development was not anticipated. These days conjuring up a new particle or field (or perhaps even new dimensions) to explain unknown observations is pretty much standard operating procedure, but it was not always so. The famous “who ordered that” statement of I. I. Rabi was made in reference to the discovery of the muon, a heavy electron whose existence seemed a bit unnecessary at the time; in fact it was the harbinger of a subatomic zoo.

The story of Dirac’s relativistic reformulation of the Schrödinger wave equation, and the subsequent prediction of antiparticles, is particularly appealing; the story is nicely explained in a recent biography of Dirac (Farmelo 2009). As with Einstein’s theory of relativity, Dirac’s relativistic quantum mechanics seemed to spring into existence without any experimental imperative. That is to say, nobody ordered it! The reality, of course, is a good deal more complicated and nuanced, but it would not be inaccurate to suggest that Dirac was driven more by mathematical aesthetics than experimental anomalies when he developed his theory.

The motivation for any modification of the Schrödinger equation is that it does not describe the energy of a free particle in a way that is consistent with the special theory of relativity. At first sight it might seem like a trivial matter to simply re-write the equation to include the energy in the necessary form, but things are not so simple. In order to illustrate why this is so it is instructive to briefly consider the Dirac equation, and how it was developed. For explicit mathematical details of the formulation and solution of the Dirac equation see, for example, Griffiths 2008.

The basic form of the Schrödinger wave equation (SWE) is

(-\frac{\hbar^2}{2m}\nabla^2+V)\psi = i\hbar \frac{\partial}{\partial t}\psi.                                                    (1)

The fundamental departure from classical physics embodied in eq (1) is the quantity \psi , which represents not a particle but a wavefunction. That is, the SWE describes how this wavefunction (whatever it may be) will behave. This is not the same thing at all as describing, for example, the trajectory of a particle. Exactly what a wavefunction is remains to this day rather mysterious. For many years it was thought that the wavefunction was simply a handy mathematical tool that could be used to describe atoms and molecules even in the absence of a fully complete theory (e.g., Bohm 1952). This idea, originally suggested by de Broglie in his “pilot wave” description, has been disproved by numerous ingenious experiments (e.g., Aspect et al., 1982). It now seems unavoidable to conclude that wavefunctions represent actual descriptions of reality, and that the “weirdness” of the quantum world is in fact an intrinsic part of that reality, with the concept of “particle” being only an approximation to that reality, only appropriate to a coarse-grained view of the world. Nevertheless, by following the rules that have been developed regarding the application of the SWE, and quantum physics in general, it is possible to describe experimental observations with great accuracy. This is the primary reason why many physicists have, for over 80 years, eschewed the philosophical difficulties associated with wavefunctions and the like, and embraced the sheer predictive power of the theory.

We will not discuss quantum mechanics in any detail here; there are many excellent books on the subject at all levels (e.g., Dirac 1934, Shankar 1994, Schiff 1968). In classical terms the total energy of a particle E can be described simply as the sum of the kinetic energy (KE) and the potential energy (PE) as

KE+PE=\frac{p^2}{2m}+V=E                                                 (2)

where p = mv represents the momentum of a particle of mass m and velocity v. In quantum theory such quantities are described not by simple formulae, but rather by operators that act on the wavefunction. We describe momentum via the operator -i \hbar\nabla and energy by i\hbar \partial / \partial t and so on. The bracketed operator on the left-hand side of eq (1) represents the total energy of the system, and is also known as the Hamiltonian, H. Thus, the SWE may be written as

H\psi=i\hbar\frac{\partial\psi}{\partial t}=E\psi                                                              (3)

The reason why eq (3) is non-relativistic is that the energy-momentum relation in the Hamiltonian is described in the well-known non-relativistic form. As we know from Einstein, however, the total energy of a free particle does not reside only in its kinetic energy; there is also the rest mass energy, embodied in what may be the most famous equation in all of physics:

E=mc^2.                                                                    (4)

This equation tells us that a particle of mass m has an equivalent energy E, with c^2 being a rather large number, illustrating that even a small amount of mass (m) can, in principle, be converted into a very large amount of energy (E). Despite being so famous as to qualify as a cultural icon, the equation E = mc^2 is, at best, incomplete. In fact the total energy of a free particle (i.e., V = 0) as prescribed by the theory of relativity is given by

E^2=m^2c^4 +p^2c^2.                                                        (5)

Clearly this will reduce to E = mc^2 for a particle at rest (i.e., p = 0): or will it? Actually, we shall have E = \pm mc^2, and in some sense one might say that the negative solutions to this energy equation represent antimatter, although, as we shall see, the situation is not so clear cut. In order to make the SWE relativistic then, one need only replace the classical kinetic energy E = p^2/2m with the relativistic energy E = [m^2c^4+p^2c^2]^{1/2}. This sounds simple enough, but the square root sign leads to quite a lot of trouble! This is largely because when we make the “quantum substitution” p \rightarrow -i\hbar\nabla we find we have to deal with the square root of an operator, which, as it turns out, requires some mathematical sophistication. Moreover, in quantum physics we must deal with operators that act upon complex wavefunctions, so that negative square roots may in fact correspond to a physically meaningful aspect of the system, and cannot simply be discarded as might be the case in a classical system.
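As a rough numerical illustration of how the relativistic energy of eq (5) relates to the classical form of eq (2), the sketch below compares the positive root of eq (5) with the rest energy plus p^2/2m for an electron. The function names and the sample momenta (given in units of mc) are illustrative assumptions and do not come from the original text; the constants are CODATA values.

```python
import math

C = 299_792_458.0          # speed of light, m/s (exact SI value)
M_E = 9.1093837015e-31     # electron mass, kg (CODATA 2018)


def relativistic_energy(p, m=M_E):
    """Positive root of eq (5): E = sqrt(m^2 c^4 + p^2 c^2)."""
    return math.sqrt((m * C**2) ** 2 + (p * C) ** 2)


def nonrelativistic_energy(p, m=M_E):
    """Rest energy plus the classical kinetic term of eq (2) with V = 0."""
    return m * C**2 + p**2 / (2 * m)


def relative_difference(p_over_mc, m=M_E):
    """Fractional gap between the two expressions at momentum p = p_over_mc * m * c."""
    p = p_over_mc * m * C
    exact = relativistic_energy(p, m)
    return abs(exact - nonrelativistic_energy(p, m)) / exact


if __name__ == "__main__":
    for x in (0.001, 0.01, 0.1, 0.5):
        print(f"p = {x} mc: relative difference = {relative_difference(x):.2e}")
```

The gap grows roughly as the fourth power of p/mc, which is why the classical expression works well for slow particles and fails for fast ones.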

To avoid these problems we can instead start with eq (5) interpreted via the operators for momentum and energy so that eq (3) becomes

(- \frac{1}{c^2}\frac{\partial^2}{\partial t^2} + \nabla^2)\psi=\frac{m^2 c^2}{\hbar^2}\psi.                                                (6)

This equation is known as the Klein-Gordon equation (KGE), although it was first obtained by Schrödinger in his original development of the SWE. He abandoned it, however, when he found that it did not properly describe the energy levels of the hydrogen atom. It subsequently became clear that when applied to electrons this equation also implied two things that were considered to be unacceptable: negative energy solutions and, even worse, negative probabilities. We now know that the KGE is not appropriate for electrons, but does describe some massive particles with spin zero (such as spin-zero mesons) when interpreted in the framework of quantum field theory (QFT); neither mesons nor QFT were known when the KGE was formulated.
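The appearance of negative energy solutions can be seen by substituting a plane wave \psi = \exp[i(px - Et)/\hbar] into eq (6). The sketch below does this in one spatial dimension and, as a simplifying assumption, in natural units (\hbar = c = 1 by default); the function names are hypothetical and chosen only for illustration.

```python
import math


def kg_residual(energy, p, m, hbar=1.0, c=1.0):
    """Left side minus right side of eq (6), divided by psi, for a plane wave.

    For psi = exp(i (p x - E t) / hbar) the time derivative term gives
    +omega^2 / c^2 and the spatial term gives -k^2, with omega = E / hbar
    and k = p / hbar.
    """
    omega = energy / hbar
    k = p / hbar
    lhs = omega**2 / c**2 - k**2
    rhs = (m * c / hbar) ** 2
    return lhs - rhs


def allowed_energies(p, m, c=1.0):
    """Both roots of eq (5); each one makes the residual above vanish."""
    root = math.sqrt((m * c**2) ** 2 + (p * c) ** 2)
    return (+root, -root)


if __name__ == "__main__":
    p, m = 0.75, 1.0
    for energy in allowed_energies(p, m):
        print(f"E = {energy:+.4f}: residual = {kg_residual(energy, p, m):.1e}")
```

Because only E^2 enters eq (6), a plane wave with energy -E satisfies it just as well as one with energy +E, so the negative root cannot be dropped on mathematical grounds.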

Some of the problems with the KGE arise from the second order time derivative, which is itself a direct result of squaring everything to avoid the intractable mathematical form of the square root of an operator. The fundamental connection between time and space at the heart of relativity leads to a similar connection between energy and momentum, a connection that is overlooked in the KGE. Dirac was thus motivated by the principles of relativity to keep a first order time derivative, which meant that he had to confront the difficulties associated with using the relativistic energy head on. We will not discuss the details of its derivation but will simply consider the form of the resulting Dirac equation:

(c \alpha \cdot \mathrm{P}+\beta mc^2)\psi=i\hbar \frac{\partial\psi}{\partial t}.                                                     (7)

This equation has the general form of the SWE, but with some significant differences. Perhaps the most important of these is that the Hamiltonian now includes both the kinetic energy and the electron rest mass energy, but the coefficients \alpha_i and \beta have to be 4 × 4 matrices to satisfy the equation. That is, the Dirac equation is really a matrix equation, and the wavefunction it describes must be a four-component wavefunction. Although there are no problems with negative probabilities, the negative energy solutions seen in the KGE remain. These initially seemed to be a fatal flaw in Dirac’s work, but were overlooked because in every other aspect the equation was spectacularly successful. It reproduced the hydrogen atomic spectra perfectly (at least, as perfectly as it was known at the time) and even included small relativistic effects, as a proper relativistic wave equation should. For example, when the electromagnetic interaction is included the Dirac equation predicts an electron magnetic moment:

\mu_e = \frac{\hbar e}{2m} = \mu_B                                                                   (8)

where \mu_B is known as the Bohr magneton. This expression is also in agreement with experiment, almost: it was later discovered that the magnetic moment of the electron differs from the value predicted by eq (8) by about 0.1% (Kusch and Foley 1948).  The fact that Dirac’s theory was able to predict these quantities was considered to be a triumph, despite the troublesome negative energy solutions.
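To put numbers on eq (8) and on the size of the discrepancy, the sketch below evaluates the Bohr magneton in SI units and the leading quantum electrodynamic correction \alpha / 2\pi. The correction term is the standard first-order result due to Schwinger and is not given in the text above; the constants are CODATA values and the function names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C (exact SI value)
HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg (CODATA 2018)
ALPHA_FS = 7.2973525693e-3   # fine-structure constant (CODATA 2018)


def bohr_magneton():
    """Eq (8): mu_B = hbar e / (2 m), in joules per tesla."""
    return E_CHARGE * HBAR / (2 * M_E)


def leading_anomaly():
    """Leading fractional correction to eq (8), alpha / (2 pi).

    Assumption: only the first-order term is kept; higher-order
    corrections are ignored in this sketch.
    """
    return ALPHA_FS / (2 * math.pi)


if __name__ == "__main__":
    print(f"mu_B = {bohr_magneton():.4e} J/T")
    print(f"fractional correction = {100 * leading_anomaly():.3f} %")
```

The fractional correction comes out close to the 0.1% deviation mentioned above.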

Another intriguing aspect of the Dirac equation was noticed by Schrödinger in 1930. He realised that interference between positive and negative energy terms would lead to oscillations of the wavepacket of an electron (or positron) about some central point at the speed of light. This fast motion was given the name zitterbewegung (which is German for “trembling motion”). The underlying physical mechanism that gives rise to the zitterbewegung effect may be interpreted in several different ways but one way to look at it is as an interaction of the electron with the zero-point energy of the (quantised) electromagnetic field. Such electronic oscillations have not been directly observed as they occur at a very high frequency (~ 10^{21} Hz), but since zitterbewegung also applies to electrons bound to atoms, this motion can affect atomic energy levels in an observable way. In a hydrogen atom the zitterbewegung acts to “smear out” the electron charge over a larger region, lowering the strength of its interaction with the proton charge. Since S states have a non-zero probability density at the origin, the effect is larger for these than it is for P states. The splitting between the hydrogen 2S_{1/2} and 2P_{1/2} states, which are degenerate in the Dirac theory, is known as the Lamb shift (Lamb, 1947). This shift, which amounts to ~1 GHz, was observed in an experiment by Willis Lamb and his student Robert Retherford (not to be confused with Ernest Rutherford!). The need to explain this shift, which requires a proper description of the electron interacting with the electromagnetic field, gave birth to the theory of quantum electrodynamics, pioneered by Bethe, Tomonaga, Schwinger and Feynman.
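The order of magnitude of the zitterbewegung frequency can be estimated from the energy gap between the positive and negative energy states of an electron at rest, 2mc^2. The sketch below assumes the standard textbook estimates, an angular frequency of 2mc^2/\hbar and a length scale of \hbar/2mc; neither expression is given explicitly in the text above, and the function names are illustrative.

```python
import math

C = 299_792_458.0          # speed of light, m/s (exact SI value)
HBAR = 1.054571817e-34     # reduced Planck constant, J s
M_E = 9.1093837015e-31     # electron mass, kg (CODATA 2018)


def zitterbewegung_angular_frequency(m=M_E):
    """Angular frequency 2 m c^2 / hbar set by the 2 m c^2 energy gap, in rad/s."""
    return 2 * m * C**2 / HBAR


def zitterbewegung_length_scale(m=M_E):
    """Rough oscillation amplitude hbar / (2 m c), in metres."""
    return HBAR / (2 * m * C)


if __name__ == "__main__":
    omega = zitterbewegung_angular_frequency()
    print(f"angular frequency = {omega:.3e} rad/s")
    print(f"frequency         = {omega / (2 * math.pi):.3e} Hz")
    print(f"length scale      = {zitterbewegung_length_scale():.3e} m")
```

The angular frequency is of order 10^{21} per second, and the length scale is several orders of magnitude smaller than the size of a hydrogen atom, which is why the effect shows up only as a small shift in atomic energy levels.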

The solutions to the SWE for free particles (i.e., neglecting the potential V) are of the form

\psi = A \mathrm{exp}(-iEt / \hbar).                                                       (9)

Here A is some function that depends only on the spatial properties of the wavefunction (i.e., not on t). Note that this wavefunction represents two electron states, corresponding to the two separate spin states. The corresponding solutions to the Dirac equation may be represented as

                                                            \psi_1 = A_1 \mathrm{exp}(-iEt / \hbar),

\psi_2 = A_2 \mathrm{exp}(+iEt / \hbar).                                                   (10)

Here \psi_2 represents the negative energy solutions that have caused so much trouble. The existence of these states is central to the theory; they cannot simply be labelled as “unphysical” and discarded. The complete set of solutions is required in quantum mechanics, in which everything is somewhat “unphysical”. More properly, since the wavefunction is essentially a complex probability density function that yields a real result when its absolute value is squared, the negative energy solutions are no less physical than the positive energy solutions; it is in fact simply a matter of convention as to which states are positive and which are negative. However you set things up, you will always have some “wrong” energy states that you can’t get rid of. Thus, Dirac was able to eliminate the negative probabilities and produce a wave equation that was consistent with special relativity, but the negative energy states turned out to be a fundamental part of the theory and could not be eliminated, despite many attempts to get rid of them.
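That the negative energy states are built into eq (7) can be checked directly from the matrices. The sketch below assumes the standard Dirac representation of \alpha_i and \beta in terms of the Pauli matrices (the text above does not specify a representation), works in units where c = 1 by default, and uses plain Python lists so that nothing beyond the standard library is needed. It verifies the anticommutation relations and that the square of the Hamiltonian is (p^2c^2 + m^2c^4) times the identity.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def block(tl, tr, bl, br):
    """Assemble a 4 x 4 matrix from four 2 x 2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
MINUS_ONE2 = [[-1, 0], [0, -1]]
SIGMA = [                      # Pauli matrices sigma_x, sigma_y, sigma_z
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Dirac representation (an assumption of this sketch).
ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]
BETA = block(ONE2, ZERO2, ZERO2, MINUS_ONE2)


def scaled_identity(x):
    return [[x if i == j else 0 for j in range(4)] for i in range(4)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]


def is_close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


def dirac_hamiltonian(p, m, c=1.0):
    """H = c alpha.p + beta m c^2 of eq (7) for a momentum vector p = (px, py, pz)."""
    h = [[BETA[i][j] * m * c**2 for j in range(4)] for i in range(4)]
    for alpha, p_i in zip(ALPHA, p):
        for i in range(4):
            for j in range(4):
                h[i][j] += c * p_i * alpha[i][j]
    return h


if __name__ == "__main__":
    p, m = (0.3, -0.4, 1.2), 1.0
    h = dirac_hamiltonian(p, m)
    e_squared = sum(x * x for x in p) + m**2
    print("H^2 = E^2 * identity:", is_close(matmul(h, h), scaled_identity(e_squared)))
    print("trace of H:", sum(h[i][i] for i in range(4)))
```

Since H^2 is E^2 times the identity, every eigenvalue of H is either +E or -E, and since H has zero trace the two signs must occur equally often: two positive and two negative energy states for each momentum, in line with \psi_1 and \psi_2 of eq (10).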

After his first paper in 1928 (The quantum theory of the electron) Dirac had established that his equation was a viable relativistic wave equation, but the negative energy aspects remained controversial. He worried about this for some time, and tried to develop a “hole” theory to explain their seemingly undeniable existence. A serious problem with negative energy solutions is that one would expect all electrons to decay into the lowest energy state available, which would be the negative energy states. Since this would not be consistent with observations there must, so Dirac reasoned, be some mechanism to prevent it. He suggested that the states were already filled with an infinite “sea” of electrons, and therefore the Pauli Exclusion Principle would prevent such decay, just as it prevents more than two electrons from occupying the lowest energy level in an atom. (Note that this scheme does not work for bosons, which do not obey the exclusion principle.) Such an infinite electron sea would have no observable properties, as long as the underlying vacuum has a positive “bare” charge to cancel out the negative electron charge. Since only changes in the energy density of this sea would be apparent, we would not normally notice its presence. Moreover, Dirac suggested that if a particle were missing from the sea the resulting hole would be indistinguishable from a positively charged particle, which he speculated was a proton, protons being the only positively charged subatomic particles known at the time.

This idea was presented in a paper in 1930 (A Theory of Electrons and Protons, Dirac 1930). The theory was less than successful, however, and the deficiencies served only to undermine confidence in the entire Dirac theory. Attempts to identify holes as protons only made matters worse; it was shown independently by Heisenberg, Oppenheimer and Pauli that the holes must have the electron mass, but of course protons are almost 2000 times heavier. Moreover, the instability between electrons and holes completely ruled out stable atomic states made from these entities (bad news for hydrogen, and all other atoms). Eventually Dirac was forced to conclude that the negative energy solutions must correspond to real particles with the same mass as the electron and a positive charge. He called these anti-electrons (Quantised Singularities in the Electromagnetic Field, Dirac 1931).

This almost reluctant conclusion was not based on a full understanding of what the negative energy states were, but rather the fact that the entire theory, which was so beautiful in other ways that it was hard to resist, depended on them. It turns out that to properly understand the negative energy solutions requires the formalism of quantum field theory (QFT). In this description particles (and antiparticles) can be created or destroyed, so it is no longer necessarily appropriate to consider these particles to be the fundamental elements of the theory. If the total number of particles in a system is not conserved then one might prefer to describe that system in terms of the entities that give rise to the particles rather than the particles themselves. These are the quantum fields, and the standard model of particle physics is at its heart a QFT. By describing particles as oscillations in a quantum field not only do we have an immediate mechanism by which they may be created or destroyed, but the problem of negative energies is also removed, as this simply becomes a different kind of variation in the underlying quantum field. Dirac didn’t explicitly know this at the time, although it would be fair to say that he essentially invented QFT when he produced a quantum theory that included quantised electromagnetic fields (Dirac, 1927, The Quantum Theory of the Emission and Absorption of Radiation). This led, eventually, to what would be known as quantum electrodynamics. Dirac would undoubtedly have been able to make much more use of his creation if he had not been so appalled by the notion of renormalisation. Unfortunately this procedure, which in some ways can be thought of as subtracting infinite quantities from each other to leave a finite quantity, was incompatible with his sense of mathematical aesthetics.

So, despite initially struggling with the interpretation of his theory, there can be no question that Dirac did indeed explicitly predict the existence of the positron before it was experimentally observed. This observation came almost immediately in cloud chamber experiments conducted by Carl Anderson in California (C. D. Anderson: The apparent existence of easily deflectable positives, Science 76 238, 1932).  Curiously, however, Anderson was not aware of the prediction, and the proximity of the observation was apparently coincidental. We will discuss this remarkable observation in a later post.

*This post is adapted from an as-yet unpublished book chapter by D. B. Cassidy and A. P. Mills, Jr.

 

References:

Griffiths, D. (2008). Introduction to Elementary Particles, 2nd edition, Wiley-VCH.

Farmelo, G. (2011). The Strangest Man: The Hidden Life of Paul Dirac, Mystic of the Atom, Basic Books, New York.

Dirac, P. A. M. (1927). The Quantum Theory of the Emission and Absorption of Radiation, Proc. R. Soc. Lond. A 114, 243.

Dirac, P. A. M. (1928). The Quantum Theory of the Electron, Proc. R. Soc. Lond. A 117, 610.

Dirac, P. A. M. (1930). A Theory of Electrons and Protons, Proc. R. Soc. Lond. A 126, 360.

Dirac, P. A. M. (1931). Quantised Singularities in the Electromagnetic Field, Proc. R. Soc. Lond. A 133, 60.

Anderson, C. D. (1932). The Apparent Existence of Easily Deflectable Positives, Science 76, 238.

Aspect, A., Dalibard, J. and Roger, G. (1982). Experimental Test of Bell’s Inequalities Using Time-Varying Analyzers, Phys. Rev. Lett. 49, 1804.

Kusch, P. and Foley, H. M. (1948). The Magnetic Moment of the Electron, Phys. Rev. 74, 250.