The Cepheid Period-Luminosity Relation


Goals of this project

Prior to starting this lab, you must understand the astronomy at the AST101 level. Check the references, or other basic astronomy references you may find in the library. The introductory quiz will be at the AST101 level.

You must also have read the IDL primer. This is a data intensive lab and you will not be able to waste much time the first period reading the notes.

Grading Policy

Grading is based on the quality of the data analysis, the discussion of the uncertainties, and the answers to the questions at the end of this page.

I. Introduction

The state-of-the-art astronomical detector before the late 19th century was the human eyeball. The eyeball is a marvelous detector, with excellent dynamic range and color sensitivity, but it has limited sensitivity and poor recording ability. The refinement of telescopic technology helped with the sensitivity limitations, but it was the invention of the photographic plate that permitted one to amass and study large amounts of data in a coherent fashion.

Figure 1: Large Magellanic Cloud Figure 2:Henrietta Leavitt
Image by David Malin

In 1885 the Harvard College observatory begin surveying the sky. It did so fairly regularly until about 1950 (with some plates taken up to 1983). Over half a million of these plates are stored in the Harvard plate stacks. One of the early targets was the Large Magellanic Cloud (LMC; see above left image), a nearby (65 kilo-parsecs) dwarf irregular galaxy. The LMC is slowly merging with the Milky Way.

Henrietta Leavitt (right image above); also see this site, who held the position of "computer" at the Harvard College Observatory, began classifying the variable stars in the LMC in 1905.

Stars were not well understood a century ago, and much of astronomy was devoted to discovering and characterizing the variables. The astrophysical significance of studying variable stars in the LMC is that all the stars in the LMC are roughly the same distance from us. Distances are one of the most fundamental, but most difficult to measure, quantities in astronomy. While the distance to the LMC was not well known, it was appreciated that all the stars are at the same distance, with a spread in distance (and hence an uncertainty on any individual star) of less than 10%. Therefore, one can compare distance dependent quantities (like luminosities), which one cannot do easily for the brighter stars in our own Milky Way galaxy. Luminosities are related to the star's size, since to a good approximation stars radiate like black bodies, with L= 4πR2σT4, where R and T are the radius and effective temperature of the star, respectively.

Most, if not all, stars are variable. Some, like the novae, are spectacularly variable, while others are barely noticeable even upon close inspection. Among the plethora of types of variables are:

The Cepheid Variables
In the course of her work, Ms Leavitt identified some 2000 variable stars in the LMC. Of these, only about 20 were Cepheids. Cepheids (more precisely, the classical Cepheids, or δ Cephei stars) are named after their prototype, δ Cephei, a 4th magnitude F5 supergiant. δ Cephei has a period of 5.37 days. Its light curve is shown in Figure 6.

Figure 6: The light curve of δ Cephei

Ms Leavitt plotted the apparent magnitude of the LMC Cepheids as a function of the pulsation period, and found a correlation. Since all the stars are in the LMC, and are at the same distance from us, the apparent magnitudes are an accurate measure of the true relative luminosities of the stars. She found a relation similar to that shown in Figure 7. She actually used apparent magnitudes; the conversion to absolute magnitudes (shown in Figure 7) requires an estimate of the distance to the LMC.

Figure 7: The Cepheid period-luminosity relation

The importance of such a relation, once it is calibrated, is that it provides a simple way to determine the distance to a Cepheid variable and, hence, to the cluster or galaxy that contains it. One merely determines the period, and observes the mean magnitude m. One looks up the absolute magnitude M that corresponds to that period. The difference between the apparent and absolute magnitudes, m-M, knows as the distance modulus, is equal to 5 log(D) -5, where D is the distance in parsecs. This relation is easily derived from the inverse-square law, knowing that magnitudes are log2.5 of the brightness, and from the definition that the absolute magnitude equals the apparent magnitude at a distance of 10 parsecs. Hence a measure of the period directly yields the distance.

Theory of the Cepheid Period-Luminosity Relation

A fluid object (including a star) of radius R has a fundamental pulsation period P ≈ 2R/vs
where vs is the sound speed. This is simply the time it takes a sound (or pressure) wave to cross the stellar diameter. vs2=dP/dρ, where P and ρ are the pressure and density, respectively. In an isothermal gas P=ργ, so vs2=γP/ρ. The virial theorem tells us that GM2/R (the potential energy) = 3 M kT/μH (twice the kinetic energy).
Using the ideal gas law, P/ρ = kT/μH, we find that GM2/R = 3 vs2 M/γ.
γ is 5/3. The pulsation period is then P ≈ √(R3/GM).
Plugging in numbers, P ≈ 1600 (R/R0)1.5/(M/M0)½ seconds
where R0 and M0 refer to the radius and mass of the Sun.

An isothermal star of 100 solar radii and 10 solar masses will have a pulse period of about 5.8 days.

One can derive the period-luminosity law as follows: Assume a sample of stars of the same mass (classical Cepheids have masses of 5-10 solar masses), and the same temperature (the instability strip is approximately vertical in the H-R diagram). The luminosity then scales as R2. The pulse period scales as R3/2 (see above), or as the luminosity L3/4. Luminosity is proportional to 2.5-V, where V is the magnitude. Therefore one expects to find that log P ≈ 0.3 V

Calibrating the Cepheid Period-Luminosity Relation

The only directly measurable distances in astronomy are those made by trigonometric parallax. As observed from the Earth, stars trace appear to trace out the motion of the Earth's orbit around the Sun. The semi-major axis of the ellipse has an angular size of 1/D arcsec, where D is the distance in parsecs. Trigonometric parallax is useful to distance of about 50 pc for ground-based optical observations, and a few hundred pc for space-based, or radio VLBI, observations. The accuracy is limited by smallness of the motions. There are no Cepheids within this distance.

Beyond distances of a few tens of parsecs, one must cobble together distances from a variety of techniques. One can use the trigonometric parallaxes to determine the luminosities of stars on the main sequence. Then, the color or spectral type and observed magnitude of a main sequence star is sufficient to determine its true distance (this is called the spectroscopic parallax). Cepheids are not main sequence stars.

However, some galactic Cepheids are found in clusters of stars. Like the stars in the LMC, all the stars in these clusters are at the same distance from us. One can us the spectroscopic parallax of the main sequence stars in the cluster to determine the distance to the cluster, and the Cepheid. The distance and observed magnitude then directly give the luminosity of the Cepheid, and a calibration of the period-luminosity relation.

There are of course many complications, most of which are beyond the scope of this introduction. However, there is one very important caveat. The galactic calibration of the Cepheids is inapplicable to the LMC Cepheids. This is because there is a large difference in mean metallicity between the two galaxies: stars in the LMC are metal-deficient relative to our Galaxy. This affects the opacity, and the periods. Technically, the stars in the LMC are known as W Virginis stars, or population II Cepheids. Population I consists of metal-rich stars, including the Sun. Population II is metal poor, representing a population of stars that formed from a less enriched interstellar medium. At a given period, W Vir stars are less luminous than are classical Cepheids. Inadvertent application of the classical Cepheid P-L relation to W Vir stars leads to a to a large overestimate of the distances.

Cepheids in Modern Astrophysics

The study of the phenomenology of variable stars reached its zenith in the early 20th century. Of what use is the study of Cepheids today? Simply put, because of the elegance and simplicity of the period-luminosity relation, they form the principle rung in the cosmic distance ladder.

The Cepheids are giants and supergiants which lie in the instability strip. Because they are intrinsically luminous, they can be seen to great distances. One of the key projects in the early years of the Hubble Space Telescope was to find and measure the periods of Cepheids in nearby galaxies, out to distances of about 20 Mpc (6 x 1020 km). With an accurate calibration of the period-luminosity relation, these observations pinned down the Hubble constant H0 to be 72 ± 2 km/s/Mpc.

Cepheids have allowed us to determine the scale of the universe.

II. The Tasks at Hand

Your task is to reproduce Henrietta Leavitt's discovery of the Cepheid period-luminosity relation. A brief recipe of the tasks you need to go through is:
  1. Read and display an image.
  2. Use the FITS header to determine the astrometric solution.
  3. Find the Cepheids
  4. Use aperture photometry to determine their relative brightness changes
  5. Use a time-series analysis to determine the pulsation period
  6. Determine the Period-Luminosity Law

II. The Data

In the fall of 2003 we obtained a series of V and I band images of two fields in the Large Magellanic Cloud. The data were taken with the SMARTS 0.9m reflector at the Cerro Tololo Interamerican Observatory by service observers. The detector is a 2048 x 2048 pixel CCD array. We obtained images in both the V and I filters. The exposures were one minute in each filter. Two exposures were taken in each filter, with the second offset from the first by 10 arcsec East and North to reduce the chance that any point on the sky will be lost to a chip defect.

The data have been bias-corrected, trimmed, and flat-fielded. The two images in each filter have been co-aligned and summed.

The CCD plate scale is 0.4 arcsec/pixel; the full field of view is 13.6 arcmin on a side. The fields (Table 1) were chosen to maximize the number of Cepheid variables in the field. Because the pointing of the telescope is not exact, and because there was little to be gained by spending the time to tweak the positions to match perfectly, we coaligned the images after the fact and trimmed off those areas that were not observed on every night. The resulting image fields, and the number of known Cepheids contained therein, are given in Table 1.

We obtained images on 10 nights, from 25 November 2003 through 12 December 2003. There are 14 I band images, and 15 V band images, for each field.

Table 1
LMC Field 15 27 27.3-69 50 3010.5 x 10.522
LMC Field 25 40 31.9-70 21 0010.5 x 10.133

The data are stored on one CD which should be easy to locate in the lab room. The data have also been uploaded to directory /home/student/cepheid/ on the four astronomy lab computers. There are two main subdirectories, field1 and field2. Each of these subdirectories contains 29 fits images, named lmcfx_yymmdd_z#.fits, where x is the fie1d number (1 or 2), z is the band (v or i), and yymmdd is the date of observation. # is the number of the observation for that night (1, 2, or 3), and is absent if there is only one observation.

All the images are stored in FITS format.

You should probably create a subdirectory on your PC, and then copy the contents of the CD (or of the /home/student/cepheid/ directory) into your directory.

A third directory on the CD (and in /home/student/cepheid/) contains the backup versions of the finding charts.

III. The Known Cepheids

The 55 known Cepheids in these images are presented in Tables 2 and 3.
Table 2
Cepheids in Field 1
35307 5 27 40.14 -69 55 45.7 FO 14.90 15.597
44256 5 27 36.82 -69 53 42.8 FU 14.11 14.878
44391 5 27 57.84 -69 53 48.5 FU 15.79 16.548
53226 5 27 34.14 -69 51 22.4 FU 13.94 14.667
53242 5 27 32.77 -69 49 14.8 FO 14.15 14.846
62624 5 27 55.13 -69 48 3.9 FU 14.11 14.826
62742 5 27 49.69 -69 46 17.6 FO 15.84 16.461
162132 5 28 29.78 -69 52 36.7 FU 14.19 14.967
162135 5 28 45.59 -69 52 6.6 FU 14.40 15.188
162180 5 28 23.25 -69 52 3.8 FO 14.81 15.422
170203 5 28 35.24 -69 51 23.4 FU 14.28 14.974
170205 5 28 32.05 -69 50 54.2 FU 14.28 15.135
170223 5 28 23.03 -69 51 36.5 FO 15.26 15.899
170246 5 28 44.56 -69 50 5.0 FO 14.91 15.567
170248 5 28 43.69 -69 50 3.2 FU 14.71 15.546
295932 5 26 48.88 -69 51 33.6 FO 14.33 14.996
305691 5 26 50.13 -69 45 52.9 FU 14.17 14.970
408742 5 27 21.41 -69 52 19.4 FU 14.25 15.029
417848 5 26 59.59 -69 51 10.4 FU 14.37 15.114
417850 5 27 23.10 -69 50 57.9 FU 14.35 15.193
417853 5 27 5.12 -69 50 44.4 FU 14.04 14.821
427313 5 27 0.88 -69 47 46.7 FO 16.88 17.476
Table 3
Cepheids in Field 2
20975 5 40 23.77 -70 25 44.2 FU 15.49 16.261
24980 5 40 31.29 -70 23 55.1 FO 15.04 15.746
24988 5 40 32.66 -70 23 19.1 FU 15.08 15.852
25039 5 40 26.56 -70 21 46.7 FU 15.57 16.290
29186 5 40 31.94 -70 21 0.9 FU 15.17 15.942
29194 5 40 33.18 -70 20 27.6 FU 14.88 15.776
29197 5 40 26.82 -70 19 58.1 FU 14.56 15.414
29199 5 40 15.70 -70 19 53.7 FU 15.30 16.120
29260 5 40 32.47 -70 18 2.4 FU 15.30 16.024
81414 5 41 9.25 -70 24 10.6 FU 14.78 15.589
81430 5 40 43.10 -70 21 54.4 FU 15.20 15.949
81432 5 41 11.87 -70 21 55.0 FU 15.18 16.035
81440 5 41 6.62 -70 24 19.1 FU 15.59 16.458
81454 5 41 7.96 -70 22 55.3 FO 15.48 16.256
81459 5 40 52.09 -70 22 36.8 FU 15.91 16.681
85244 5 41 5.45 -70 19 36.7 FU 14.99 15.744
85260 5 40 43.90 -70 20 53.3 FU 15.64 16.398
85282 5 41 16.54 -70 18 26.1 FO 15.59 16.428
89131 5 40 58.13 -70 17 4.7 FO 15.43 16.573
130469 5 41 50.76 -70 25 5.9 FU 15.06 15.963
133771 5 41 50.57 -70 23 36.4 FU 15.30 16.074
133782 5 41 19.09 -70 22 50.7 FU 15.26 16.133
137407 5 41 26.59 -70 19 34.6 FU 15.69 16.553
137424 5 41 24.44 -70 17 48.0 FU 16.10 17.382
137680 5 41 47.74 -70 17 51.2 FO 17.33 17.980
140993 5 41 23.73 -70 16 58.0 FU 14.98 15.835
141015 5 41 35.32 -70 17 34.3 FU 15.51 16.303
141019 5 41 34.35 -70 17 22.6 FO 15.44 16.286
203767 5 40 11.00 -70 24 30.9 FO 15.45 16.181
207480 5 40 6.46 -70 23 31.7 FU 15.30 16.116
207506 5 40 8.33 -70 21 36.2 FU 15.43 16.347
211310 5 39 59.41 -70 19 46.6 FO 15.51 16.284
214843 5 40 2.02 -70 16 13.3 FO 15.69 17.313

These data are also provided as ASCII text files f1ceph.list and f2ceph.list, in the respective data directories on the CD.

IV. How to Find the Cepheids

IVa. Reading and Displaying the Images

To read in the images using IDL, you would use the READFITS function. For example, d=readfits(filename,header), where filename is the name of the fits file, d is the output data array, and h is the fits header.

Display the image using the TV command. The TV command is primitive: it does not scale the data, or resize the TV window. If the image is larger than the size of the monitor, you should probably rebin it (see this section) to a manageable size. Use the TVPLOT command or, for more control, size the plot window with the XSIZE and YSIZE keywords and do the flux scaling manually (see the "Using the Cursor to Measure Positions" section in the IDL primer).

The fits header is a string array. You can print it (e.g., print,header). There is a lot of information in the header. For the purposes of this lab, the most important information is

You can extract values from the fits header using the SXPAR function. For example, ra=sxpar(header,'CRVAL1') will extract the value of CRVAL1 and place it in the variable ra

IVa. The Astrometric Solution

You measure the locations of things in the images in terms of pixels. If you want to convert the pixels to (RA,Dec), or if you know the pixel coordinates of an object at a given (RA, Dec), you need the astrometric solution. This is in general a pair of multi-order polynomial equation of the form
RA = aiΣXi + biΣYj + cijΣYiYj
Dec = diΣXi + eiΣYj + fijΣXiYj
where then sums are over the indices i,j = 0 to n (n is generally no larger than 3), with the caveat that i+j ≤ n in the cross term.

In the case at hand, for all practical purposes the axes are oriented with +Y heading north and -X heading east. The b and d coefficients are zero. The cross terms cij and fij are zero. The distortion is minimal; the tangent-plane projection is effectively flat, so terms with i or j > 1 are negligble. The solution reduces to

RA = a0 + a1X
Dec = c0 + c1Y
Transform X so that (X,Y) are (0,0) at the center of the image. a0 and c0 are the coordinates at the center of the image (CRVAL1 and CRVAL2, in units of degrees of RA and DEC). a1 and c1 are the platescale (fits keywords XPIXSIZE and YPIXSIZE, in units of seconds of arc per pixel). If you rebin the image, remember to change the plate scale accordingly.

Note that the RA coordinates of the Cepheids in Tables 2 and 3 are given in hours-minutes-seconds RA format, which needs to be converted to degrees of RA. Furthermore, because the density of the right ascension grid on the celestial sphere varies with declination, the angular difference between two objects with different RA coordinates RA1, RA2 (e.g., RA and a0 above) and identical declinations DEC is

theta = (RA1-RA2) * cos(DEC).

IVc. Identifying the Cepheids

If you really want to reproduce the historical observation, you'll need to examine the images carefully enough to identify the stars that are variable. This was done historically by "blinking" two images - displaying then sequentially in rapid succession so that the eye picks out the variable objects. This is what Henrietta Leavitt did to identify the Cepheids in the LMC. It is the same technique used later by Clyde Tombaugh to discover Pluto.

If you'd like to try this technique, be wary of the following:

Remember that there are lots of kinds of variable stars. In fact, all stars vary at some level. Henrietta Leavitt found 20 Cepheids in the LMC - and 1750 other variable stars!

Using the Astrometry
The recommended method of finding the Cepheids is to use the coordinates in the Tables above, together with the astrometric solution from the fits header, to predict the X and Y coordinates. Then look at those coordinates in the image and see if there is a star there, or near there. Invariably, the answer will be yes, because the LMC has a very high stellar surface density.

To confirm the identity, check the finding chart (linked from Tables 2 and 3). The Cepheid is identified by the cross-marks.

Pattern Recognition
An inelegant, and not recommended solution, is to start with the finding charts and find the patterns in the larger star field.
Do I have to Use All the Cepheids?
No. But if you do the analysis correctly you'll write some generally-applicable software that will make the marginal cost (in time) of analyzing one more star small.

That said, there are some stars that are not worth doing. Some are very close to the edge of the field. If your extraction aperture goes off the image, your results will suffer. Some stars are in clusters or small asterisms (groups of stars). If you cannot place an aperture cleanly around the Cepheid, it may be worth skipping it. However, a faint star in the extraction aperture is no big deal. All you are interested in is the period, and a faint star will just add a small bit of noise.

V. Aperture Photometry

These images were taken under a variety of sky condition. Atmospheric transparency can vary significantly even on clear nights. The fluxes, or count rates, cannot be directly compared. Therefore, we employ a technique called relative photometry. The premise is that the true mean brightness of a large (N>>1) number of randomly-selected stars will not vary significantly with time. The effects of sky transparency (differing airmass, thin clouds, etc.) will affect all the stars in the image equally. Therefore, the ratio of the brightness (of the difference in magnitudes) between a target star and this mean of a large number of stars will represent the true brightness variation of the target star. Differential photometry entails the following steps:

Repeat this process for all the Cepheids in each image, and for each night. You may use the same set of reference stars for all the Cepheids in a given image. You MUST use exactly the same reference stars when comparing between different nights.

Use either the V or I band images; you need not do both. They should give the same result.

VI. Time-Series Analysis

Given a table of times and corresponding magnitudes, how do you find the period? Basically, you need to either fold the period on a number of trial periods, or do a Fourier decomposition of the data. You can find some illustrated examples on the AST 443/PHY 517 Timing Analysis page.

VIa. Period-Folding Techniques

Data folding is a brute-force technique used to find periods. The signal is assumed to be periodic, i.e., I = A + B * f(t-t0), where f is some periodic function. One defines a trial period P. The phase corresponding to this period P is the fractional part of the ratio t/P. At the correct period, a plot of I vs. frac(t/P) will give a smoothly varying function. For our purposes, A (the DC offset), B (the amplitude), and t0 (the time of zero-phase) are unimportant parameters. In period folding, one plots I vs. frac(t/P) for a range of trial parameters. One can eyeball the results (this is fine for getting close), but it can be very time-consuming, since in principle there are an infinite number of periods you could test. Computers are good for repetitious tasks like folding data, but they won't know when they've found it unless you define a statistic that will be minimized at the correct period. Two of these are:
Note that neither of these techniques makes any assumption about the shape of the light curve - it need not be sinusoidal.

VIb. Fourier Decomposition Techniques

The Discrete Fourier Transform (DFT) is a deconvolution of a data set into a sum of terms Ansin(L/n), where L is the length of the set and n is an integer. An is the amplitude of the sinusoidal term at frequency n. The Fast Fourier Transform (FFT) is an algorithm that can be used to generate these coefficients An. An FFT function is built into IDL.

FFT techniques are best for continuous data sets, and are most sensitive to sinusoidal variations. For non-sinusoidal variations, some power will be diverted into sidelobes. For unevenly sampled data, power is diverted into aliases of the true period with that of the sampling function. To find out the power of the sampling function, take an FFT of a uniform data set (all points with the same amplitude) at the actual observation times. This is called the windowing function. You should probably ignore any periods that stand out strongly in the windowing function. For example, if you observe every night at exactly the same time, the windowing data (and the FFT of your data) will have a strong peak at a period of 24 hours.

  • The Lomb-Scargle Periodogram:
    Periodograms have the advantage that the data are weighted by points, not by time intervals, so they are more appropriate than are FFTs for discretely sampled data.

    The power I as a function of frequency ω is given by

    I(ω) = 1/2 [ (∑ X(ti) cos ω(ti-τ))2/∑ cos2 ω(ti-τ)) + (∑ X(ti) sin ω(ti-τ))2/∑ sin2 ω(ti-τ))]
    where τ is defined by
    tan(2ωτ) = (∑ sin 2ωti)/(∑ cos 2ωti)
    and the sum ∑ is over the N points X(ti) at times ti.

    References: Scargle, 1982, ApJ, 263, 835 ; Horne & Baliunas, 1986, MNRAS, 302, 757

  • Caveats:

    You cannot find a period longer than your sampling range. Ideally, you will detect at least 2 complete periods (though 1.5 can be convincing).

    You cannot find a period shorter than your shortest sampling interval. In other words, the frequency cannot exceed the Nyquist, or sampling, frequency. There are an infinite number of shorter periods that can fit your data exactly!

    Aliasing is often a problem. Observations taken from one site on the Earth often show a 1 day period, because the data are always taken at night. (The period will be one sidereal day if the object is always observed when overhead). This is why we obtained 2 or 3 observations on some nights.

    In all cases, some of the power appears in the harmonics of the true frequency ωN, where ω is the frequency and N is an integer. If there is aliasing, there will be beating between the true period and the alias, giving power at 1/(1/P ± 1/A), where P is the true period and A is the alias period.

    VII. Pre-lab

    Having acquainted yourself with the above information, please complete the pre-lab for the experiment. Please, hand this to your astronomy instructor before you begin work on the project.

    VIII. Preparing Your Report

    Write up your analysis in the Physical Review Letters or the Astrophysical Journal preprint style.

    You report should address the following


    The data were obtained by SMARTS service observers, including Alberto Miranda.

    The data reductions were performed by Rustum Nyquist.


    Return to top

    Return to PHY 445/515 Astronomy page