Math: Fourier, Heisenberg, and the eye of the raptor

by alefnot

Sunday, Sep. 09, 2012 at 1:04:58pm PDT

At the zoo the other day I was looking at a pair of magnificent Steller's Sea Eagles (hoping they were rescue birds, for their cage was pretty small for such masters of the air). Among the most amazing things about these amazing creatures are their eyes.

We humans have marvelled at the vision of eagles, hawks, and other birds of prey for millennia. Raptor vision is something of a metaphor for acuity of vision: to be "eagle eyed" is to have very sharp vision, there is a commercial ball-tracking technology called Hawk-Eye, etc etc. And certainly raptors have very large eyes for their head size, with huge irises and large black pupils. In fact, in typical daylight conditions, an eagle might have a pupil 9mm in diameter, a great horned owl 15mm, whereas ours would only be 2.5mm. Here are a couple of bird portraits from Wiki Commons:

Hmm, large pupils in full sun -- what's that about? These are hunting birds, they have to be able to distinguish small prey from hundreds of feet up against a welter of background noise. I remember kayaking over sand flats in Florida once, in about a meter of water. Not far away an osprey dropped like a rock from at least 50 feet up and came up with a flounder, a well-camouflaged fish I could barely see from my vantage point of 1 meter above the water. So these creatures obviously have excellent visual acuity -- but what does a large pupil have to do with that?

First, a disclaimer. I know squat about raptor biology, raptors in general, and not much more about biology. Apparently their eyes are densely packed with photoreceptors (cones, for daylight hunters), which is one half of the reason they have such sharp eyesight. But this post isn't about biology, it's about physics, which is the other half.

Fourier transforms

Let's back up a bit. Let's say you have a pure tone, the purest you can imagine, even purer than that. Let's say the tone is A above middle C, at 440 Hz, with no overtones. Finally, let's say that this tone has existed unchanged since before time itself, and will continue unchanged forever and ever amen.

Question 1: how would you describe this tone as a function of time?
Answer 1: it's a constant, duh. It never changes so by definition it is constant.

Question 2: how would you describe this tone as a function of frequency?
Answer 2: it's a single valued spike, duh. It only has one value.

If that seems obvious to you, congratulations, you have an intuitive understanding of the Fourier transform. The Fourier transform of a constant is the spike function, and vice versa. The spike function (technically it's a distribution or generalized function, but we don't care about such distinctions here) in fact has a special name, the Dirac delta function.

Evidently a Fourier transform (FT, hereinafter, as they say) relates one way of looking at something to another: both FT pairs are equivalent descriptions of a phenomenon, although in real life, sometimes it is easier to use one over the other. In our example here, the tone is a constant in time space, a delta function in frequency space (in fact we'll need a little modification -- see below -- but you get the idea).

You can safely ignore the blockquote bits if you like
Mathematically you may or may not remember that a periodic function can be written as a sum of sines and cosines -- this is known as a Fourier series. In fact we can write an arbitrary function (so long as it is in some sense bounded -- FTs are integrals and for physical reasons cannot be allowed to blow up) as such a sum. This simplifies tremendously using Euler's relation: e^{ix} = cos x + i sin x.

So if we write the Fourier transform of some function f(t) as F(ν), where t is time and ν is frequency, then

F(ν) = ∫f(t) e^{-2π i ν t} dt

where the integral runs from -∞ to +∞ (spans the entire space). Remember that an integral is just a continuous sum. Frequently we simplify by letting ω = 2πν because obviously 2π = 1 for sufficiently large values of 1, so usually you'll see the integrand as f(t) e^{-i ω t} dt, but if we actually have to calculate an observable then we do have to keep track of those 2πs. In any case, we can also write the inverse transform

f(t) = ∫F(ν) e^{2π i ν t} dν

only now we're integrating over all of frequency space (the dν gives it away), whereas before we were integrating over all of time space. F(ν) and f(t) are Fourier transforms of each other.
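If you'd rather see the integral as the "continuous sum" it is, here is a minimal sketch in plain Python that approximates F(ν) with a midpoint sum. The function names, the integration limits, and the choice of test function are my assumptions, not anything from a library: the Gaussian e^{-π t²} is used because, under the 2π-in-the-exponent convention above, it is its own Fourier transform, which makes the result easy to check.

```python
import cmath
import math

def fourier_transform(f, nu, t_min=-10.0, t_max=10.0, n=4000):
    """Approximate F(nu) = integral of f(t) * exp(-2*pi*i*nu*t) dt with a midpoint sum.

    The finite limits stand in for -infinity..+infinity, so f must be
    negligible outside [t_min, t_max].
    """
    dt = (t_max - t_min) / n
    total = 0j
    for j in range(n):
        t = t_min + (j + 0.5) * dt
        total += f(t) * cmath.exp(-2j * math.pi * nu * t)
    return total * dt

def gaussian(t):
    # exp(-pi t^2) is its own transform under this convention
    return math.exp(-math.pi * t * t)

for nu in (0.0, 0.5, 1.0):
    numeric = fourier_transform(gaussian, nu)
    exact = math.exp(-math.pi * nu * nu)
    print(f"nu = {nu}: numeric = {numeric.real:.6f}, exact = {exact:.6f}")
```

The imaginary part of the numeric result comes out essentially zero because the Gaussian is symmetric about t = 0.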

BTW, I hope at least some of you are crying "Foul!" on the single tone example. If it's a single tone at 440 Hz, surely in time the amplitude should be oscillating 440 times/second. So it is, and as a function of time it should not be a constant but an oscillation, say cos(2π f t), where t is time and f is the frequency, here 440 Hz. What is the FT of the cosine function? Well, cosine is symmetric about zero, so that cos(x) = cos(-x), so intuitively we should expect spikes at ±f (the frequency). And indeed FT(cos(2π f t)) = [δ(ν - f) + δ(ν + f)]/2, the normalized sum of delta functions at ±f, where ν is the frequency variable and f the specific frequency of oscillation, 440 Hz in our example.
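A computer can't integrate over all of time, so the following is only a sketch under an assumption: the tone cos(2π f t), with f in Hz, is observed through a finite window (WINDOW_S, a hypothetical 0.1 seconds, chosen to hold a whole number of cycles). The true delta functions then show up as tall, narrow peaks at ±440 Hz, with essentially nothing at the other frequencies probed here.

```python
import cmath
import math

TONE_HZ = 440.0   # the A above middle C from the text
WINDOW_S = 0.1    # hypothetical observation window: 44 full cycles

def windowed_tone_transform(nu, n=20000):
    """Midpoint-sum FT of cos(2*pi*TONE_HZ*t) over a window centered on t = 0."""
    dt = WINDOW_S / n
    total = 0j
    for j in range(n):
        t = -WINDOW_S / 2 + (j + 0.5) * dt
        tone = math.cos(2 * math.pi * TONE_HZ * t)
        total += tone * cmath.exp(-2j * math.pi * nu * t)
    return total * dt

for nu in (-440.0, 0.0, 300.0, 435.0, 440.0):
    magnitude = abs(windowed_tone_transform(nu))
    print(f"nu = {nu:7.1f} Hz   |F| = {magnitude:.5f}")
```

The peak height at ±440 Hz is half the window length, so lengthening the window makes the peaks taller and narrower, heading toward the ideal spikes.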

Here's a list of some common FT pairs.

Back to our constant function, and let's make it a true constant this time (no oscillations). What happens if that constant changes? Perhaps it describes a physical phenomenon, which has a beginning and an end. This is then a rectangular function, and to simplify things let's make it symmetric about zero. This no longer has one frequency component in it (it's not a constant), so intuitively we feel that its FT should broaden out to include those components. That's exactly what happens; if you want to be more quantitative about it, it goes as the sinc function, sin(x)/x. If you think in pictures:

Don't worry about the negative values: the FT is a (continuous) sum and you need the negative amplitudes as well as the positive to get the interference terms to add up correctly and reproduce your original function.

There's a trend here: to describe a narrow distribution in one variable (say time) you need a wide distribution in the Fourier conjugate variable (here, frequency). If you're describing a narrow distribution by a Fourier sum you need a LOT of terms in the sum to describe it. You need fewer terms to describe a broad distribution. Skinny in one space means fat in the other. If you can remember that, you've got the gist of it. In fact, the skinnier you get in one space, the fatter you get in the other, until you reach the extremes of the delta function and the constant (it goes both ways, BTW: just as a constant in time is a spike in frequency space, a spike in time is a constant in frequency); conversely, not so skinny in one space is not so fat in the other.
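To put numbers on "skinny in one space means fat in the other", here is a rough sketch that transforms unit-height rectangles of three widths and hunts for the first zero of each transform. The helper names and the scan step are my own choices. For a rectangle of width w the transform is sin(π w ν)/(π ν), so the first zero should land at ν = 1/w: halve the width and the central lobe doubles.

```python
import math

def rect_transform(width, nu, n=400):
    """Midpoint-sum FT of a unit-height rectangle centered on zero.

    Only the cosine part is summed: the sine part cancels by symmetry.
    """
    dt = width / n
    total = 0.0
    for j in range(n):
        t = -width / 2 + (j + 0.5) * dt
        total += math.cos(2 * math.pi * nu * t)
    return total * dt

def first_zero(width, step=0.01):
    """Scan upward in frequency until the transform stops being positive."""
    k = 1
    while rect_transform(width, k * step) > 0:
        k += 1
    return k * step

for width in (0.5, 1.0, 2.0):
    print(f"width = {width}: F(0) = {rect_transform(width, 0.0):.3f}, "
          f"first zero near nu = {first_zero(width):.2f} (expected {1 / width:.2f})")
```

The scan is only good to one step (0.01 here), which is plenty to see the inverse relationship.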

Fourier conjugate spaces
In the definition of a Fourier transform is an exponential, e^{2 π i variable1 variable2}. Now e^anything is just a number, so the ^anything (the "argument of the exponential") cannot have units. 2, π, and i are just numbers; it follows then that variable1 and variable2 must have inverse units or both be unitless. In our case above, our variables were time, which has units of seconds, and frequency, which has units of 1/seconds: the product (time)(frequency) is therefore unitless. In principle, any pair of variables with inverse units can be Fourier conjugate variables. Distance and spatial frequency (how many regularly spaced things you can jam into a unit of distance) are also Fourier conjugates.

The Heisenberg Uncertainty Principle
Let's say you are trying to make a measurement, perhaps where something is. The measurement precision is never infinite, so there will be an uncertainty involved. That is, your measurement is not a single number but a distribution about that number. If your measurement is a good one, that distribution will be narrow. You can also describe that measurement in the Fourier conjugate space, which, if your measurement is a good one, will require a lot of spatial frequency components. The location of something is defined by its position in space, and a narrow distribution (high precision) in position space means a broad distribution in spatial frequency space. Skinny in one space is fat in the other. Spatial frequency space, BTW, is sometimes known as "wavevector" or "k-vector" space, and is labeled "k".

It follows then that you cannot simultaneously measure with infinite precision the values of Fourier conjugate variables. The better the precision in one space, the worse in the other. How much worse? Well, the best you can do is given by the Fourier transform of the distribution in the first space -- in that case your measurements are said to be "transform limited". (You can always do worse however.) This is the Heisenberg Uncertainty Principle, which has its roots in the nature of Fourier conjugate variables. There are a number of ways to describe the Uncertainty Principle quantitatively, depending on your measure of uncertainty: if we use the standard deviation σ of the distributions as our measure of uncertainty it turns out to be

σ₁ σ₂ ≥ 1/(4π)

where the equality holds only when you are transform limited (strictly, with σ as the measure of uncertainty, only for a Gaussian distribution). At the transform limit, you can improve σ₁ but then σ₂ gets worse; decrease σ₂ and σ₁ increases. Make one uncertainty infinitely sharp (delta function) and the conjugate uncertainty becomes infinitely broad (a constant). Skinnier in one space means fatter in the other. All of this applies only to Fourier conjugate variables: if the two quantities of interest are not Fourier conjugates, there is no uncertainty relationship between them and Heisenberg does NOT apply.
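Here is a small numerical check of that bound, as a sketch under stated assumptions. I take a Gaussian f(t) = e^{-π (t/a)²}, whose transform under the 2π convention is F(ν) = a e^{-π (aν)²}, and measure σ on the squared magnitudes |f|² and |F|² (the usual quantum-mechanical choice). The width parameter a and the function names are mine. Whatever a you pick, the product should sit right at 1/(4π), about 0.0796.

```python
import math

def std_dev(density, lo, hi, n=4000):
    """Standard deviation of an (unnormalized) density, via midpoint sums."""
    dx = (hi - lo) / n
    xs = [lo + (j + 0.5) * dx for j in range(n)]
    weights = [density(x) for x in xs]
    norm = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / norm
    variance = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / norm
    return math.sqrt(variance)

def sigma_product(a):
    """sigma_t * sigma_nu for the Gaussian pair with width parameter a."""
    power_t = lambda t: math.exp(-2 * math.pi * (t / a) ** 2)              # |f(t)|^2
    power_nu = lambda nu: (a * math.exp(-math.pi * (a * nu) ** 2)) ** 2    # |F(nu)|^2
    sigma_t = std_dev(power_t, -8 * a, 8 * a)
    sigma_nu = std_dev(power_nu, -8 / a, 8 / a)
    return sigma_t * sigma_nu

print(f"1/(4 pi) = {1 / (4 * math.pi):.6f}")
for a in (0.5, 1.0, 3.0):
    print(f"a = {a}: sigma_t * sigma_nu = {sigma_product(a):.6f}")
```

Squeezing a makes σ in time smaller and σ in frequency larger by exactly the same factor, which is why the product doesn't budge.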

We hear about time-energy uncertainty, and position-momentum uncertainty, but what we've seen are time-frequency and position-spatial frequency. How do we reconcile them?

We know that energy E = hν, where ν is frequency and h is Planck's constant = 6.626 x 10^-34 J sec. The units follow because E has units of Joules (1 J = 1 kg m²/sec²) and ν is in 1/sec so h must be in J sec (NB: if your units do not follow, you're so not right you're not even wrong. Give it up and start over.) So uncertainty in time-energy and time-frequency are equivalent statements.

Position has units of meters, m. Momentum is usually given the symbol p and is given by the product (mass) times (velocity) which has units of (kg)(m/sec). k-vector space has by definition units of 1/m. Energy is proportional to frequency, so maybe momentum is proportional to k-vector: p = Ck. Working out the units of our proportionality constant C:

kg m/sec = C k = C (1/m), or C has units of kg m²/sec = (kg m²/sec²)(sec) = J sec.

Well that's interesting, C has the same units as Planck's constant, h -- J sec -- and in fact has the same numerical value too: C is h! Momentum p = hk. The position-k-vector and position-momentum uncertainty statements are also equivalent. (BTW, the unit J sec is known as action and is a key quantity in physics. But that's another post.)
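For a feel for the sizes involved, here is a quick worked example, with some assumptions flagged: I take green light at a wavelength of 550 nm (my choice), read the spatial frequency as k = 1/wavelength, and use the standard value for the speed of light, which the text doesn't quote. The function names are made up for illustration.

```python
PLANCK_H = 6.626e-34     # J sec, the value quoted in the text
LIGHT_SPEED = 2.998e8    # m/sec, standard value (assumption: not given in the text)

def photon_momentum(wavelength_m):
    """p = h k, taking the spatial frequency k as 1/wavelength (units of 1/m)."""
    k = 1.0 / wavelength_m
    return PLANCK_H * k

def photon_energy(wavelength_m):
    """E = h nu, with the temporal frequency nu = c / wavelength (units of 1/sec)."""
    nu = LIGHT_SPEED / wavelength_m
    return PLANCK_H * nu

green = 550e-9  # m, hypothetical example wavelength
print(f"p = {photon_momentum(green):.4e} kg m/sec")
print(f"E = {photon_energy(green):.4e} J")
print(f"E / c = {photon_energy(green) / LIGHT_SPEED:.4e} kg m/sec")
```

The last line is a units sanity check: for light, energy divided by the speed of light gives back the same momentum.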

Are there any other such pairs? In the modern description of physics, everything can be described by at least one of four fundamental properties: mass, space, time, electric charge. Their units are kg, m, sec, and Coulombs, respectively. So force = F = ma = (kg)(m/sec²). Pressure is F/Area = kg/(m sec²). We've seen that Energy = kg m²/sec². Etc etc. We've accounted for space and time; in principle mass and 1/mass (or h/mass) would be conjugates but I know of no such thing. Same for charge.

A real world example
OK, let's make that a little less abstract. Consider the following image:

It's just the back of Walter Piston's Orchestration, set at a slant. The pictures on the right are crops of the center of the image on the left, the only difference between them being the aperture of the lens used, as expressed by f-number (the shutter speed also changed to account for the difference in aperture).

(The f-number is simply the ratio of the focal length of the lens to the aperture -- technically the entrance pupil -- of the lens. If the focal length doesn't change, as is the case here, an increase in f-number means a decrease in aperture diameter: f/4 means that the diameter of the aperture is 1/4 of the focal length; for f/36 it is 1/36th. Big f-numbers = small apertures, and vice versa. Here's a picture:

If you're wondering why the lens goes to f/36, actually the setting is f/32. But in the book image the magnification was high enough that the actual f-number was 36.)
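The arithmetic is simple enough to sketch. Two things below are assumptions: the 100 mm focal length is purely hypothetical (the text never says what lens was used), and the close-up correction uses the common rule of thumb that the effective f-number is the marked f-number times (1 + magnification), which holds for a simple symmetric lens.

```python
def aperture_diameter_mm(focal_length_mm, f_number):
    """Aperture (entrance pupil) diameter = focal length / f-number."""
    return focal_length_mm / f_number

def effective_f_number(marked_f_number, magnification):
    """Rule-of-thumb close-up correction (assumes a simple symmetric lens)."""
    return marked_f_number * (1 + magnification)

def magnification_for(marked_f_number, effective):
    """Invert the rule of thumb: what magnification turns 'marked' into 'effective'?"""
    return effective / marked_f_number - 1

FOCAL_LENGTH_MM = 100.0  # hypothetical; not stated in the text

for n in (4, 36):
    print(f"f/{n}: aperture about {aperture_diameter_mm(FOCAL_LENGTH_MM, n):.2f} mm")

m = magnification_for(32, 36)
print(f"f/32 marked, f/36 effective -> magnification about {m:.3f}")
print(f"check: {effective_f_number(32, m):.1f}")
```

Under that rule of thumb, going from a marked f/32 to an actual f/36 corresponds to a magnification of about one eighth life size.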

Anyway, the top right image crop of the book, taken at f/4, is much sharper at the focal plane (about where the words "Art" (of Counterpoint) and "Guide" (to Musical Styles) are located) than the bottom right, taken at f/36. Photographers among you know that the cause is diffraction, which progressively softens the image as the f-number increases and aperture size decreases. But what causes diffraction?

Diffraction and Fourier Optics
There are any number of ways to answer that question, but the one I think is most intuitive is by way of Fourier optics.

Let's say you're trying to image a point. A point has zero dimension, and if your imaging system (lens) were perfect the image would also be a point. Lenses aren't perfect in real life, and will show imperfections -- here's a list of some common ones. That is, instead of a point, your image will be smeared out somewhat, in a manner described by the point spread function. Yet even if you could eliminate all aberrations you would still be left with an image that's smeared out a bit, but smeared out in a very specific pattern known as an Airy disc (assuming a circular cross section for your lens), a bright central spot surrounded by progressively fainter rings. The diameter of this disc -- the amount the image is smeared out by -- is inversely proportional to the diameter of the lens system. A lens for which all aberrations are smaller than the diameter of the Airy disc is said to be diffraction-limited. There is no way to eliminate diffraction -- a lens showing a small and clean Airy disc diffraction pattern is as good as it gets.
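To put rough numbers on the Airy disc, photographers usually quote its diameter (out to the first dark ring) as about 2.44 x wavelength x f-number. A minimal sketch, assuming green light at 0.55 micrometres (my choice) and borrowing the two f-numbers from the book photo:

```python
def airy_diameter_um(f_number, wavelength_um=0.55):
    """Diameter of the Airy disc to its first dark ring, in micrometres.

    Uses the standard estimate 2.44 * wavelength * f-number; the default
    wavelength (green light) is an assumption.
    """
    return 2.44 * wavelength_um * f_number

for n in (4, 36):
    print(f"f/{n}: Airy disc about {airy_diameter_um(n):.1f} micrometres across")

ratio = airy_diameter_um(36) / airy_diameter_um(4)
print(f"the f/36 spot is {ratio:.0f} times wider than the f/4 spot")
```

For scale, pixels on many digital camera sensors are a few micrometres across, so the f/36 spot smears a point over many pixels while the f/4 spot covers only one or two.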

Here's a plot of the Airy disc for three different lens dimensions r.

The inset is a 2-D representation of an Airy disc (courtesy of wikipedia): this is how an image of a point would look at the image plane. Note the bright central spot and the faint rings. The plot is a cross section of the irradiance. θ is the angular deviation from a straight line from the center of the lens to the image, a measure of the "smearing out" you get from diffraction. The smaller this angle, the narrower the distribution at the image plane and the less smearing you have. Note that the distribution gets narrower with larger r (the radius in arbitrary units): the image is more point-like -- it looks sharper and can resolve more detail -- with the larger aperture. The plots aren't normalized for irradiance (area under the curve), BTW: if they were a narrower plot would have a higher peak value than a broader one. That is, a sharp image looks brighter; a fuzzier image looks more diffuse.

In practice most lenses are sharper at less than full aperture because the larger the aperture (actually, the smaller the f-number) the harder it is to correct for other lens aberrations. That is, they're not diffraction-limited at the wider apertures.

What's happening is that light illuminating the object (here, our point object) reflects off the object, and some of the reflected light is captured by our lens and imaged. But not all of it: the physical dimension of the lens truncates the wavefront coming off the object. This wavefront is carrying key information, namely the spatial frequency information necessary to reconstruct the image of your object. Remember, you're adding Fourier components (here, the spatial frequencies) to construct the image -- the Fourier transform is an integral, which is just a really really fine-grained summation. The smaller the physical dimension of the aperture, the more information you lose and the less faithfully you can reconstruct the image (the bigger the Airy disc, or "spot size"). Specifically you are losing the high spatial frequency components of the original wavefront.

Mathematically, the diffraction pattern at the image is given by the Fourier transform of the field distribution at the aperture of your lens system. (In fact, the FT of the aperture function gives you the diffraction pattern of the electric field at the image plane. What you actually see is the irradiance or intensity, which goes as the square of the field. That's why the FT(top hat) has negative amplitudes but the Airy function does not.) A perfect aperture is described by a "top-hat" function: perfect transmission through the aperture, zero transmission outside it. Go back to the plot of the Fourier transforms above and note that our truncated constants are top-hats, and that the Fourier transforms of the top-hats get wider as the top hats get narrower. Now, fat in one space is skinny in the other, so a wide aperture means a small spot size (in the absence of other aberrations), a small aperture means a large spot size. The dimension of the Airy disc determines how well the lens can resolve detail, so a wide aperture means greater resolution.

That's why the photo above taken at f/4 (wider aperture) is sharper than the one taken at f/36 (narrower aperture). You're losing less information through the wider aperture and are therefore better able to reconstruct the image of the object. And that's why you cannot eliminate diffraction: the presence of any obstruction to the wavefront, say by having a finite lens dimension, means that you've lost information and therefore cannot perfectly image the object.
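The field-versus-irradiance distinction can be checked numerically. This is a sketch, not a full optics simulation: it uses the textbook far-field result for a circular aperture, where the field goes as 2 J1(x)/x with x = π D sin(θ)/wavelength and J1 is a Bessel function, computed here from its integral form so that nothing beyond the standard library is needed. The function names, the 550 nm wavelength, and the two example diameters are my assumptions.

```python
import math

def bessel_j1(x, n=200):
    """J1(x) from the integral (1/pi) * int_0^pi cos(tau - x sin(tau)) d tau."""
    d = math.pi / n
    total = 0.0
    for j in range(n):
        tau = (j + 0.5) * d
        total += math.cos(tau - x * math.sin(tau))
    return total * d / math.pi

def airy_field(theta, diameter, wavelength):
    """Far-field amplitude of a circular aperture, normalized to 1 on axis."""
    x = math.pi * diameter * math.sin(theta) / wavelength
    return 1.0 if x == 0 else 2 * bessel_j1(x) / x

def airy_irradiance(theta, diameter, wavelength):
    """What you actually see: the square of the field, never negative."""
    return airy_field(theta, diameter, wavelength) ** 2

def first_dark_ring(diameter, wavelength):
    """Angle of the first zero of the field, found by bisection on x."""
    lo, hi = 3.0, 4.5  # J1 changes sign once between these values
    for _ in range(50):
        mid = (lo + hi) / 2
        if bessel_j1(mid) > 0:
            lo = mid
        else:
            hi = mid
    return math.asin(lo * wavelength / (math.pi * diameter))

WAVELENGTH = 550e-9  # m, assumed green light
for diameter in (2.5e-3, 9e-3):  # m, example aperture diameters
    theta = first_dark_ring(diameter, WAVELENGTH)
    print(f"D = {diameter * 1e3:.1f} mm: first dark ring at {theta:.3e} rad "
          f"(1.22 * wavelength / D = {1.22 * WAVELENGTH / diameter:.3e})")
```

Just past the first dark ring the field is negative while the irradiance stays positive, and the ring's angle shrinks in proportion as the aperture grows.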

Alternatively you can argue from a position-momentum perspective. The finite lens aperture puts a restriction on the location of a photon as it passes through. There is a corresponding uncertainty in momentum, which as we've seen is proportional to the velocity. The speed of light is the same in any given medium, so the uncertainty is found in the directional component of the velocity -- recall that velocity is a vector, with magnitude (speed) and direction -- and the smaller the aperture the greater the uncertainty in direction: the image smears out.

It should be clear now why raptors have such large eyes -- they need them to be able to resolve prey from altitude, to distinguish prey from background noise. Raptor eyes also have a very high density of photoreceptors (cones, in daylight hunters) in the retina -- but that's only useful if the lens system has sufficient resolution to take advantage of that density. It would be a waste of resources to have higher photoreceptor density than your lens can resolve. This is also why the pupils of raptor eyes are wide open even in bright sunlight -- that is necessary to maintain resolution. If the pupils constricted in bright light, as human pupils do, they would lose the ability to resolve detail at such high levels. How raptors aren't then blinded in full sun, as humans would be if they kept their pupils wide open, is a different question. But you'll have to ask a biologist about that, because I've no clue.
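As a back-of-the-envelope check, here is a sketch using the Rayleigh criterion, which puts the smallest resolvable angle for a circular aperture at about 1.22 x wavelength / diameter. The pupil sizes are the daylight figures quoted near the top of the post; the 550 nm wavelength and the 100 m viewing distance are my assumptions. This is the diffraction limit only: it ignores aberrations, the optics of the eye's interior, and photoreceptor spacing.

```python
def rayleigh_limit_rad(pupil_m, wavelength_m=550e-9):
    """Smallest resolvable angle (radians) for a circular pupil, diffraction only."""
    return 1.22 * wavelength_m / pupil_m

def smallest_detail_mm(pupil_m, distance_m):
    """Smallest resolvable feature at a given distance, small-angle approximation."""
    return rayleigh_limit_rad(pupil_m) * distance_m * 1000

DISTANCE_M = 100.0  # hypothetical hunting altitude
pupils_m = {"human": 2.5e-3, "eagle": 9e-3, "great horned owl": 15e-3}

for name, pupil in pupils_m.items():
    print(f"{name:17s} pupil {pupil * 1e3:4.1f} mm -> "
          f"about {smallest_detail_mm(pupil, DISTANCE_M):.1f} mm at {DISTANCE_M:.0f} m")
```

On these numbers the eagle's daylight pupil buys it a diffraction limit 3.6 times finer than ours, simply from the ratio of pupil diameters.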

It's not just raptors, either: the best eyesight in the world just might belong to giant and colossal squid, which have the largest eyes on the planet with lenses up to 90mm in diameter (the pupil would be somewhat smaller). No doubt they require such huge eyes for light gathering in the deep dark of the abyss. Perhaps there's an advantage to high resolution down there as well -- faint light sources that are well resolved are easier to see than faint smudges. But you'll have to ask a biologist about that, too.