A few summers ago I had the chance to work at Mote Marine Laboratory, in the Phytoplankton Ecology Group led by Dr. Gary Kirkpatrick. This was the first time that I seriously considered doing research outside of traditional physics (in high school I had always imagined that my research career would consist of aligning lasers and soldering circuit boards), and so it was an eye-opening experience that showed me how elegant the intersections between biology, computer science, and physics can be.
The image at the head of this post is a scalogram. It’s the result of performing a mathematical operation called a wavelet transform on a one-dimensional data set. This particular data set was an absorption spectrum from a sample of live phytoplankton (spectrum shown below), which shows the amount of light that the sample absorbs as a function of the wavelength of that light. This is essentially a very precise way to determine the color of the plankton, a useful metric because different species of phytoplankton tend to absorb different colors. Part of the reason is that different species of plankton inhabit different depths in the ocean, and the color of sunlight (and thus the wavelength at which maximal photosynthesis occurs) changes with depth because seawater absorbs and scatters some wavelengths more strongly than others.
The scalogram is the result of taking short sections of the spectrum and comparing them to a short reference function called a wavelet, here represented by a normal distribution that’s been truncated outside of some interval (more generally, many short, localized smooth functions can serve as a wavelet; a bell-shaped curve just happens to be a convenient first choice in many problems). A large similarity between the wavelet and the section of the spectrum results in a dark color on the plot, which indicates that this portion of the spectrum is shaped like the wavelet. As one moves along the vertical axis of the plot, the wavelet is gradually widened and the transform is repeated, resulting in a plot with two explanatory variables (the width of the wavelet and the location in the spectrum) and one response variable (how closely that part of the spectrum matches the shape of a wavelet of that size).
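To make the procedure concrete, here is a minimal sketch in Python with NumPy. It is not the code used at Mote: the function names `truncated_gaussian` and `scalogram`, the cutoff at four standard deviations, the unit-energy normalization, and the synthetic one-peak "spectrum" are all assumptions made for illustration. It also follows the description above in using a plain bell curve, whereas the wavelets in most standard transforms are constructed to have zero mean, so treat the output as a scale-dependent smoothing of the data rather than a textbook wavelet transform.

```python
import numpy as np

def truncated_gaussian(width, cutoff=4.0):
    """Bell curve with standard deviation `width`, cut off beyond cutoff * width."""
    half = int(np.ceil(cutoff * width))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / width) ** 2)
    # Unit-energy normalization (an assumption) so that responses computed
    # with different widths can be compared on the same plot.
    return kernel / np.sqrt(np.sum(kernel ** 2))

def scalogram(spectrum, widths):
    """Rows are wavelet widths, columns are positions along the spectrum.

    Assumes the spectrum has more samples than the widest wavelet.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    # Sliding the kernel along the spectrum compares it with every short section.
    rows = [np.convolve(spectrum, truncated_gaussian(w), mode="same") for w in widths]
    return np.vstack(rows)

# Synthetic stand-in for an absorption spectrum: one peak at sample 120
# with a width of 6 samples, on a zero baseline.
position = np.arange(200)
spectrum = np.exp(-0.5 * ((position - 120) / 6.0) ** 2)

widths = np.arange(1, 16)
result = scalogram(spectrum, widths)

# The strongest response marks where, and at what width, the data best match.
row, col = np.unravel_index(np.argmax(result), result.shape)
print("strongest match at position", col, "with width", widths[row])
```

Passing `result` to a plotting call such as matplotlib’s `imshow` would give a picture in the same spirit as the one at the top of the post, with position on the horizontal axis and wavelet width on the vertical axis. A real spectrum sits on a nonzero baseline, which a bell-shaped kernel also responds to, so some baseline subtraction would probably be needed first.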
Why are scalograms interesting? The plot shows where the data have peaks (corresponding to pigments absorbing particular colors), but it also shows the characteristic scale of those peaks. Most plankton spectra have the large feature on the right-hand side of this plot, which corresponds to the absorption of light by Chlorophyll A (the primary photosynthetic pigment found in most species of phytoplankton). The vertical axis allows one to see that Chlorophyll A absorption peaks also have a characteristic width of 6 (arbitrary units), identified by where the tree-trunk-like shape is darkest. The advantage of a plot that separates the width and location of absorption peaks is that one can also identify more subtle features (like the faint, wide patterns at the top of the plot) that would be almost impossible to spot by looking at the absorption spectrum alone. Scalograms thus increase the information that one can extract from a spectrum, by allowing features in the data to be identified by both their characteristic absorption color (the location of the peak) and their characteristic width. For a sample of water containing many different species of plankton, the added information from a wavelet transform can be used to better separate the many species that may be contributing to the overall observed spectrum.
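Reading features off a scalogram by eye can also be automated. The sketch below is one hypothetical way to do it, not the method used in the original analysis: it treats the scalogram as a 2D array (rows are widths, columns are positions) and reports every cell that is strictly larger than its eight neighbors. The function name `find_features`, the `min_strength` threshold, and the toy array are assumptions for illustration only.

```python
import numpy as np

def find_features(scalogram, widths, min_strength=0.0):
    """Return (position, width, strength) for each local maximum, strongest first."""
    s = np.asarray(scalogram, dtype=float)
    n_rows, n_cols = s.shape
    # Pad with -inf so that cells on the border can still count as maxima.
    padded = np.pad(s, 1, mode="constant", constant_values=-np.inf)
    is_peak = s > min_strength
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbor = padded[1 + dr : 1 + dr + n_rows, 1 + dc : 1 + dc + n_cols]
            is_peak &= s > neighbor
    found = [(int(c), widths[r], float(s[r, c])) for r, c in zip(*np.nonzero(is_peak))]
    return sorted(found, key=lambda feature: feature[2], reverse=True)

# Toy scalogram: a strong, narrow feature and a fainter, wider one.
toy_widths = [2, 4, 6, 8, 10]
toy = np.zeros((5, 10))
toy[2, 4] = 3.0   # strong feature at position 4, width 6
toy[4, 8] = 1.0   # faint feature at position 8, width 10

features = find_features(toy, toy_widths, min_strength=0.5)
for pos, width, strength in features:
    print("position", pos, "width", width, "strength", strength)
```

Each returned tuple pairs a location (the absorption color) with a characteristic width, which is the two-part signature described above. On real data the threshold would need tuning, since noise produces many small local maxima.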