Genetic testing kit

Much to my surprise and suspicion, for Christmas this year I received a DNA testing kit. The Geno 2.0 kit from National Geographic is one of a handful of “test-at-home” kits currently on the market that allow you to send a sample of your DNA to a private lab to be sequenced and analyzed. While some kits specialize in finding unknown relatives or examining your medical risks, the Geno kit promises an “unprecedented view of your ancestral journey”—it focuses on comparing your DNA to stock samples from reference populations throughout the world, and using this information to determine the migration path your distant ancestors took. It also claims to determine your relationship to various hominids, like Neanderthals and Denisovans, whose DNA samples have recently become available.

The kit contains two cheek swabs and two vials of preservative. I rubbed the swabs inside my cheek, deposited the cotton ends inside the vials, and sent them to Texas for six weeks of processing. The Geno kit appears to be processed at a different type of facility than other kits (such as 23andMe) because it looks for a different set of genetic markers. Due to its emphasis on determining distant lineages and ancestral relationships, the Geno kit focuses on haplogroups, which are defined by genetic material that remains largely unchanged over many generations. These portions of our DNA are less useful for diagnostic purposes because they vary less from person to person (i.e., they cannot easily be used for genetic fingerprinting), but they are useful for determining the relationships among large populations who share similar traits.

The two sources of haplogroups of particular interest for the kit are the male Y chromosome and maternal mitochondrial DNA. The idea is that Y chromosomes don’t change much from generation to generation: my father’s Y is the same as his father’s Y, which is the same as that of every male ancestor in his paternal line. Compare this to X and other chromosomes, which get jumbled and reshuffled in every generation by meiosis, in which chromosomes from the father and mother that serve similar functions (such as a chromosome carrying a gene for eye color) get mixed and matched to build a final offspring genome consisting of traits that are a mixture of those of the father and mother. Since males have only one Y chromosome (there is no similar chromosome from the mother’s genome), this shuffling doesn’t happen during reproduction, and so the Y chromosome can remain largely identical over thousands of generations, save for the occasional random mutation. A similar idea applies to mitochondrial DNA, which is passed down on the mother’s side only. Mitochondria, the organelles within cells that provide energy, contain their own genome, distinct from that of the rest of the cell, which may be a leftover from a time when mitochondria lived independently outside of cells. Since the first cell of every human arises from a mother’s egg, our mitochondria are copies of our mother’s mitochondria, and so the mitochondrial DNA of a single person traces his or her maternal ancestry.
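As a rough illustration of the difference between these two modes of inheritance, here is a toy Python sketch. Everything in it is an assumption made for illustration: chromosomes are short lists of labels, meiosis is reduced to a single crossover, mutations are ignored, and the names `recombine` and `follow_paternal_line` are hypothetical. It follows one father-to-son line and compares how much of the founder's material survives on the Y chromosome versus on an ordinary, reshuffled chromosome.

```python
import random

def recombine(copy_a, copy_b, rng):
    """Toy meiosis: splice two matching chromosome copies at one random crossover point."""
    if rng.random() < 0.5:
        copy_a, copy_b = copy_b, copy_a  # either copy may supply the first segment
    cut = rng.randrange(1, len(copy_a))  # both copies always contribute something
    return copy_a[:cut] + copy_b[cut:]

def follow_paternal_line(generations, length=20, seed=0):
    """Return (y, autosome) carried by the male descendant after `generations` father-to-son steps."""
    rng = random.Random(seed)
    y = ["founder"] * length                               # the founder's Y chromosome
    pair = (["founder"] * length, ["founder"] * length)    # his two copies of one ordinary chromosome
    for g in range(1, generations + 1):
        # The son's paternal copy is a reshuffled mix of his father's two copies...
        from_father = recombine(pair[0], pair[1], rng)
        # ...and his other copy comes from an unrelated mother (simplified to one label).
        from_mother = [f"mother_{g}"] * length
        pair = (from_father, from_mother)
        # The Y chromosome has no partner to recombine with, so it is handed down as-is.
    return y, pair[0]

y, autosome = follow_paternal_line(generations=10)
print("Founder markers left on Y:       ", y.count("founder"), "of", len(y))
print("Founder markers left on autosome:", autosome.count("founder"), "of", len(autosome))
```

In this simplified model the Y chromosome always arrives intact, while the founder's share of the ordinary chromosome can only shrink once unrelated mothers enter the line.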

This basically means that, if you’re male, the Geno 2.0 kit can isolate DNA unique to your father and his male ancestors, as well as your mother and her female ancestors, which greatly simplifies the analysis necessary to determine your ancestry. Without haplogroups, it would be very difficult to determine whether a given mutation or pattern in the DNA occurred recently or thousands of generations ago, or which parts of the genome came from which side of the family. This means that it can generate custom plots like the ones below, showing the migration route of known early humans who carried my maternal haplogroup (first image) and known groups who bore my paternal haplogroup (second image). My mother happens to have ancestors from more parts of the world than my father, and so it is unsurprising that her genes are better-traveled according to the plot: the original humans who carried her mitochondria traveled widely, and so they have descendants in parts of the world ranging from Paraguay to Mongolia. My father has a more direct ancestry, which is reflected in the comparatively small range in which his Y chromosome is found.

maternal heatmap

A heatmap showing the density of my maternal haplogroup among the current world population. My maternal ancestors traveled widely, as people with my maternal traits are found in high concentrations throughout the world.

paternal heatmap

A heatmap showing the density of my paternal haplogroup among the current population of Eurasia. People with my paternal lineage tend to be concentrated in Northern Europe.

However, isolating the DNA itself would be useless without other sets of genomes to which to compare it, a task aided by the Genographic Project, a research effort which has accumulated a database of thousands of haplogroup DNA samples for very specific subpopulations of current humans. Representatives of the Genographic Project have gone into certain parts of the world that have relatively homogeneous populations, such as Mongolia or New Zealand, and collected hundreds of DNA samples from various individuals. Assuming that the individuals tested are descended from people who also lived in the area for many generations (i.e., no recent immigrants), many of them will have similarities in their haplogroups that suggest common ancestors on their male and female sides. From this pool of results, researchers can then construct a sort of “average” genome that is taken as representative of members of that population. For example, all people who are descended directly from the original Oceanic settlers in New Zealand might have a certain genetic pattern (let’s call it “Pattern A” to be pedantic) that has been passed down to all of their descendants in modern New Zealand. However, along the way, Dutch or British settlers likely intermarried with the descendants of the original Oceanians, which would introduce another pattern, Pattern B, that would also be present in modern-day New Zealanders. A researcher looking at the sequences of many modern-day New Zealanders might notice that all have Pattern A, but only some have Pattern B; this would allow her to infer that Pattern B likely comes from a more recent ancestor (like the later settlers), while Pattern A comes from the original Oceanians. This means that the “average genome” taken as representative of the entire population would likely include Pattern A but not Pattern B, since Pattern A is a more common trait that represents earlier ancestors.
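The Pattern A / Pattern B reasoning can be sketched in a few lines of Python. This is only an illustration of the idea, not the Genographic Project's actual method: the function name `consensus_patterns`, the 90% threshold, and the five made-up samples are all assumptions.

```python
from collections import Counter

def consensus_patterns(samples, threshold=0.9):
    """Return the patterns carried by at least `threshold` of the sampled individuals."""
    counts = Counter(pattern for sample in samples for pattern in set(sample))
    total = len(samples)
    return {pattern for pattern, count in counts.items() if count / total >= threshold}

# Hypothetical sample: everyone carries Pattern A, only some carry Pattern B.
samples = [{"A"}, {"A", "B"}, {"A"}, {"A", "B"}, {"A"}]

print(consensus_patterns(samples))  # {'A'}
```

Pattern B appears in only two of the five samples, so it falls below the threshold and is left out of the "average" genome, while Pattern A is kept.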

Naturally, this process gets incredibly noisy when one has all sorts of different waves of settlers from different countries, as well as millions of possible patterns that may or may not represent ancestral traits. While occasionally scientists will get lucky and find actual DNA from an early human to which they can compare modern DNA, most of the data available to construct ancestry trees comes from present populations, which already have thousands of generations of genetic intermingling behind them. The Genographic Project thus relies on sophisticated statistical analysis and data processing to automate searching for common patterns in the genomes it samples: instead of looking for just single patterns, it also looks for relationships among groups of patterns. The only upside to this guessing game is the sheer number of people available for testing today, which allows reasonably strong statistical confidence to be established in the most dominant trends. My results, shown below, correctly place my ancestry as dominated by a single group, Southern Indian, which agrees with my mother’s family history. At the same time, no genetic group is represented in more than 50% of my genome, in part because my parents are from two very different parts of the world and thus have very different reference populations.

A breakdown of the percentage that my haplogroups share in common with reference populations being studied for the Genographic Project. The dominant component, Southwest Asian, matches my mother’s family’s origin in Southern India.

In general, I am impressed by the unique analysis and processing that the Geno 2.0 kit offers, and I am glad to know that my test results will help improve the Genographic Project’s massive database of reference populations. However, my main concern with the test results is the sparseness of the final information given the supposed rigor of the analysis. While I am hoping that more analysis of my results will arrive as better data processing technology becomes available, I would appreciate it very much if projects like these made raw sequence data available; it would be easy to do, and it would allow either after-market companies or savvy individuals to further process and analyze the information that the Genographic team has gathered.

Two dipoles radiating out of phase

I thought I’d write about one of my favorite problems from freshman year. It doesn’t require any math to understand, but it points out many of the risks and subtleties that can arise when physics problems make too many “ideal” assumptions:

Suppose that you have two simple antennae, each consisting of a single, straight length of copper wire through which a single frequency of alternating current is passing. The two antennae are positioned some fixed distance apart, and they are oriented in parallel. If a remote physicist operating the two antennae introduces an appropriate delay between their driving signals, causing the AC waveforms in the two antennae to be 180 degrees out of phase (but still at the same frequency), then the electric field in the region midway between the two antennae will vanish due to destructive interference. Yet the two antennae are still emitting radiation; they are still each drawing current, and presumably the power they consume to create this current must be transferred into the resulting fields they emit. So when the two antennae destructively interfere, where does the energy go?
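The cancellation in this setup can be checked with a small Python sketch of two superposed waves. It rests on simplifying assumptions that are not part of the problem statement: the observer is far away, sits in the plane perpendicular to the wires (where each wire radiates equally in all directions), and the two antennae carry equal current amplitudes. The function name `relative_intensity` and the half-wavelength spacing are hypothetical choices. Note that complete cancellation on the plane midway between the antennae requires a half-cycle (180 degree) delay; a quarter-cycle (90 degree) delay only partially cancels there.

```python
import math

def relative_intensity(theta, spacing_in_wavelengths, phase_delay):
    """Far-field intensity of two equal sources, relative to one source alone.

    theta is measured from the line joining the two antennae (radians);
    phase_delay is the delay of the second source relative to the first (radians).
    """
    kd = 2.0 * math.pi * spacing_in_wavelengths
    half_phase = 0.5 * (kd * math.cos(theta) + phase_delay)
    return 4.0 * math.cos(half_phase) ** 2

SPACING = 0.5  # antennae half a wavelength apart (assumed for illustration)

for label, delay in [("90 degrees", math.pi / 2), ("180 degrees", math.pi)]:
    midplane = relative_intensity(math.pi / 2, SPACING, delay)  # equidistant from both antennae
    along_axis = relative_intensity(0.0, SPACING, delay)        # along the line joining them
    print(f"delay {label}: midplane {midplane:.3f}, along axis {along_axis:.3f}")
```

With a 180 degree delay, the intensity on the midplane drops to zero (up to rounding), while along the axis it rises to four times that of a single antenna; with a 90 degree delay, both directions receive twice the single-antenna intensity.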

The conventional response to this question (and the one my freshman lab TA insisted upon) is that the field cancels out in some regions, such as between the two antennae, but it increases by a compensatory amount in other regions where the waves constructively interfere, resulting in the net energy stored in the fields (throughout all of space) remaining constant. While this is certainly a satisfactory answer for most textbook treatments of dipole radiation, it remains troubling because one can easily envision a case in which there are no other regions in which the waves can constructively interfere, for example if mirrors were used carefully. If, instead of antennae, one pictures two out-of-phase lasers pointed towards each other, then it becomes much less clear where the compensating region between the two lasers would be. However, there’s another way of looking at the problem that sheds light on this inconsistency:

Conventional electrodynamics tells us why the two copper antennae will generate radiation: the moving charges in each antenna beget changing magnetic fields, which in turn create electric fields via Faraday’s law, which then create new magnetic fields as they collapse. This cycle of electric and magnetic fields taking turns forming and collapsing gives rise to self-propagating electromagnetic waves: a collapsing electric field changes quickly, thus inducing a magnetic field which eventually collapses to produce a new electric field, and so on. The power transmitted by the wave is thus determined by the amplitude of the initial magnetic field generated by the antenna, which in turn is proportional to the current through the wire. This current is, in turn, determined by the resistance of the wire comprising the antenna. If the wire were an impossibly perfect conductor, then even the most minor voltage difference between the two ends of the antenna would generate an impossibly infinite current via Ohm’s law. Thus the power put into a single antenna is determined by the resistance of its wire, and this power exits the antenna as electromagnetic radiation. So far, energy has neither been created nor destroyed.

The subtlety of the problem arises because of an additional effect that occurs when there are two antennae near each other. The electrons moving back and forth inside one antenna aren’t just limited in motion by the resistance of the copper wire, but also by the electric field due to the other antenna. If the other antenna is in phase (no delay), then the electrons will keep experiencing a Lorentz force in the opposite direction to the way that the antenna’s power source wants them to move, and so the power source will need to provide more power in order to generate waves of the same amplitude: the power drawn by the antenna increases. In the case when the sources are out of phase, or the waves are destructively interfering, the opposite effect occurs: the field from the other antenna actually helps the electrons along, allowing a given electron to oscillate at a certain amplitude without requiring as much energy from the power source. In other words, placing the antennae out of phase reduces the effective resistance, or impedance, of the two antennae, and thus reduces their power consumption by an amount equivalent to the drop in the energy of the electromagnetic field due to their destructive interference.
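One standard way to put numbers on this coupling is the mutual-impedance picture from antenna theory, in which the voltage at one antenna's terminals depends on both currents: V1 = Z_self * I1 + Z_mutual * I2. The Python sketch below is a minimal illustration under stated assumptions, not a model of any particular pair of antennae: both antennae are driven at a fixed current amplitude, `Z_SELF` is the commonly quoted textbook value for a thin half-wave dipole, and `Z_MUTUAL` is a made-up value chosen only to represent closely spaced antennae (its real sign and size depend on the spacing). The helper name `input_power` is hypothetical.

```python
def input_power(i_self, i_other, z_self, z_mutual):
    """Time-averaged power delivered to one antenna, using peak-amplitude phasors."""
    v_self = z_self * i_self + z_mutual * i_other  # terminal voltage includes the neighbor's effect
    return 0.5 * (v_self * i_self.conjugate()).real

Z_SELF = complex(73.0, 42.5)     # ohms; textbook self-impedance of a thin half-wave dipole
Z_MUTUAL = complex(60.0, -10.0)  # ohms; HYPOTHETICAL mutual impedance for close spacing
I_DRIVE = complex(1.0, 0.0)      # amperes (peak); assumed fixed by the power source

p_alone = input_power(I_DRIVE, 0j, Z_SELF, Z_MUTUAL)
p_in_phase = input_power(I_DRIVE, I_DRIVE, Z_SELF, Z_MUTUAL)
p_out_of_phase = input_power(I_DRIVE, -I_DRIVE, Z_SELF, Z_MUTUAL)

print(f"alone:        {p_alone:.1f} W")         # 36.5 W
print(f"in phase:     {p_in_phase:.1f} W")      # 66.5 W
print(f"out of phase: {p_out_of_phase:.1f} W")  # 6.5 W
```

Under these assumptions, the out-of-phase neighbor lowers the effective resistance seen by the source from 73 ohms to 13 ohms, so the same current costs far less power, which is the bookkeeping that the fixed-power-supply picture leaves out.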

In the laser formulation of this problem, this explanation would amount to the light from one laser damping excitations in the lasing medium of the other laser, resulting in less power drawn from the source.

What I like about this scenario is the manner in which a very common assumption used in physics problems (that power supplies are monoliths, steadily providing a fixed voltage and current to each component of a system) turns out to be the source of the ambiguity. An electrical engineer who places an ammeter in series with one of the antennae would immediately notice the drop in input power when the antennae are placed out of phase. But in the way that the problem is often presented, the power consumption of the antennae seems like a fixed quantity, giving rise to the supposed paradox.