The standard framing of ancient metagenomics treats temporal separation as genomic separation: ancient sequences are damaged, short, and diverged from modern counterparts. This framing breaks in a specific and important case. Some microbial populations show evolutionary stasis across thousands of years—the same lineage, the same functional repertoire, the same niche, with population-level genomic variation that spans the ancient-modern boundary. In this regime, an ancient MAG and a modern MAG from the same population can share greater than 99% average nucleotide identity. They are not different organisms. They are temporal snapshots of the same organism. ANI cannot separate them.
This is not a corner case. Sediment metagenomics routinely recovers reads from both the ancient layer and modern surface contamination. If the resident microbial population has been continuously present, both sources map to the same reference. The standard pipeline—assemble, bin, annotate—will produce a single MAG from two temporally distinct pools of DNA. The ancient material contributes deaminated reads; the modern material contributes clean reads; the assembled bin looks like a single genome with slightly elevated C→T rates at fragment termini. Nothing in the assembly or binning step flags the problem, because the problem is invisible to any method that treats the genome as the unit of analysis.
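The dilution described above can be made concrete with a small calculation. This is a minimal sketch, assuming the terminal C→T rate of a pooled bin is a simple read-weighted average of the two source rates. The function name and all numbers are illustrative, not measurements from any dataset.

```python
def pooled_terminal_ct_rate(frac_ancient, rate_ancient, rate_modern=0.0):
    """Expected terminal C->T rate of a bin assembled from a mixture of reads.

    frac_ancient: fraction of reads that come from the ancient pool (assumed).
    rate_ancient / rate_modern: terminal C->T rate within each pool (assumed).
    """
    if not 0.0 <= frac_ancient <= 1.0:
        raise ValueError("frac_ancient must lie in [0, 1]")
    return frac_ancient * rate_ancient + (1.0 - frac_ancient) * rate_modern


# Illustrative numbers only: 30% ancient reads with a 20% terminal C->T rate,
# 70% modern reads with a 0.5% error-driven C->T rate.
rate = pooled_terminal_ct_rate(0.30, 0.20, 0.005)
print(f"pooled terminal C->T rate: {rate:.4f}")  # about 0.0635
```

Under these assumed inputs the bin shows roughly a 6% terminal rate. That is elevated enough to look like a moderately damaged genome, and nothing in the single number reveals that it is an average over two pools.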
What the damage signal carries
DNA damage is the only temporal marker that survives into the binning workflow. Cytosine deamination, observed as C→T substitutions concentrated at fragment termini, is modeled by the Briggs parameters δss and δds, the deamination rates in single-stranded overhangs and double-stranded regions respectively. The rate of accumulation depends on burial conditions, but within a single depositional context the signal is directional: higher damage indicates older DNA. For a population-variable gene, meaning a gene whose allele frequencies differ between the ancient and modern time points, reads carrying the predominantly ancient allele will on average show more damage than reads carrying the predominantly modern allele. This is the signal. A damage-aware expectation-maximization model can use it to estimate the proportion of ancient versus modern reads contributing to each genomic region, and thereby reconstruct the population structure at each time point separately.
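The first half of that estimate, the ancient fraction of reads in a region, can be sketched as a two-component mixture fitted by EM. This is a toy illustration under strong assumptions, not AMBER's implementation. Each read is reduced to a count of C→T mismatches among a fixed number of terminal reference-C positions. The two per-position rates are treated as known inputs (hypothetically taken from an upstream damage fit). Only the mixing proportion is estimated. The names `binom_pmf` and `em_ancient_fraction` are invented for this example.

```python
from math import comb


def binom_pmf(m, k, p):
    """Probability of m C->T mismatches among k terminal C positions."""
    return comb(k, m) * p**m * (1.0 - p) ** (k - m)


def em_ancient_fraction(reads, rate_ancient, rate_modern, n_iter=200, tol=1e-10):
    """Estimate the fraction of ancient reads in a region by EM.

    reads: list of (mismatches, opportunities), one tuple per read.
    rate_ancient, rate_modern: assumed per-position C->T probabilities
        for the two pools (hypothetical, fixed inputs).
    Returns (pi, resp): the estimated ancient fraction and the per-read
    posterior probabilities of being ancient from the last E-step.
    """
    if not reads:
        raise ValueError("need at least one read")
    lik_a = [binom_pmf(m, k, rate_ancient) for m, k in reads]
    lik_m = [binom_pmf(m, k, rate_modern) for m, k in reads]
    pi = 0.5  # uninformative starting value
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each read is ancient.
        resp = [pi * a / (pi * a + (1.0 - pi) * b) for a, b in zip(lik_a, lik_m)]
        # M-step: the updated mixing proportion is the mean responsibility.
        new_pi = sum(resp) / len(resp)
        converged = abs(new_pi - pi) < tol
        pi = new_pi
        if converged:
            break
    return pi, resp


# Toy input: six reads, each with four terminal C positions.
reads = [(0, 4), (2, 4), (0, 4), (1, 4), (0, 4), (3, 4)]
pi_hat, resp = em_ancient_fraction(reads, rate_ancient=0.25, rate_modern=0.01)
print(f"estimated ancient fraction: {pi_hat:.3f}")
```

A fuller model would also fit the damage rates, let them vary with distance from the fragment end, and account for library type. The sketch keeps only the step that matters for the argument here: the proportion is estimated from all reads jointly rather than by classifying each read first.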
The soul connected this to AMBER’s TemporalProfile architecture. The EM model is not a convenience to improve coverage estimates—it is the only method with the theoretical apparatus to separate two temporally distinct populations that share a genomic background. In the stasis regime, a tool that ignores damage is not producing noisy output. It is producing the wrong answer with no indication that anything is wrong.
The unoccupied gap
No published tool does damage-aware recovery of population-variable genes from ancient metagenomes. The existing toolchain covers adjacent territory: mapDamage and metaDMG estimate damage parameters per taxon; PyDamage scores contigs as ancient or modern; PMDtools filters reads by per-read PMD score. None of these feed the damage probability into the gene-level or bin-level analysis in a way that can recover temporally stratified allele frequencies. The most temporally informative genes—those where allele frequencies have shifted between ancient and modern populations—are precisely the genes where read-level damage filtering performs worst, because each individual read carries too little signal to be classified confidently. The signal is in the aggregate pattern across reads, not in any single read. Recovering it requires exactly the kind of EM deconvolution that standard pipelines do not implement.
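To illustrate what "signal in the aggregate" means, the following toy sketch extends a damage mixture to a single biallelic site and estimates a separate allele frequency for the ancient and modern pools. It is a hypothetical illustration of the kind of deconvolution described above, not the method of any published tool. It assumes fixed, known per-position damage rates, a fixed number of terminal C positions per read, and reads that each cover the variable site. The function names and the input layout are invented for the example.

```python
from math import comb


def binom_pmf(m, k, p):
    """Probability of m C->T mismatches among k terminal C positions."""
    return comb(k, m) * p**m * (1.0 - p) ** (k - m)


def em_stratified_allele_freqs(reads, rate_ancient, rate_modern,
                               n_iter=300, tol=1e-9):
    """Jointly estimate ancient fraction and per-pool allele frequencies.

    reads: list of (allele, mismatches, opportunities); allele is 0 or 1.
    rate_ancient, rate_modern: assumed per-position C->T probabilities.
    Returns (pi, f_a, f_m): ancient read fraction, and the frequency of
    allele 1 in the ancient and modern pools respectively.
    """
    if not reads:
        raise ValueError("need at least one read")
    n = len(reads)
    is_alt = [allele == 1 for allele, _, _ in reads]
    lik_a = [binom_pmf(m, k, rate_ancient) for _, m, k in reads]
    lik_m = [binom_pmf(m, k, rate_modern) for _, m, k in reads]
    pi, f_a, f_m = 0.5, 0.5, 0.5
    for _ in range(n_iter):
        # E-step: posterior that each read is ancient, given allele and damage.
        resp = []
        for alt, a, b in zip(is_alt, lik_a, lik_m):
            pa = pi * (f_a if alt else 1.0 - f_a) * a
            pm = (1.0 - pi) * (f_m if alt else 1.0 - f_m) * b
            resp.append(pa / (pa + pm))
        # M-step: responsibility-weighted proportions.
        w_a = sum(resp)
        new_pi = w_a / n
        new_f_a = sum(r for r, alt in zip(resp, is_alt) if alt) / w_a
        new_f_m = sum(1.0 - r for r, alt in zip(resp, is_alt) if alt) / (n - w_a)
        delta = max(abs(new_pi - pi), abs(new_f_a - f_a), abs(new_f_m - f_m))
        pi, f_a, f_m = new_pi, new_f_a, new_f_m
        if delta < tol:
            break
    return pi, f_a, f_m


# Toy input: (allele, terminal C->T mismatches, terminal C positions).
reads = [
    (1, 2, 4), (1, 1, 4), (1, 0, 4), (0, 0, 4),
    (0, 0, 4), (0, 0, 4), (1, 0, 4), (0, 1, 4),
]
pi_hat, f_anc, f_mod = em_stratified_allele_freqs(reads, 0.25, 0.01)
print(f"ancient fraction {pi_hat:.2f}, allele-1 freq: "
      f"ancient {f_anc:.2f}, modern {f_mod:.2f}")
```

In this setup an ancient read that happens to show no terminal mismatch is individually indistinguishable from a modern read, so a per-read filter has to either discard it or misassign it. The joint likelihood still uses it, because its allele and its lack of damage both contribute to the pooled estimate.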
The soul also flagged a compounding problem from Dream #52 on sediment authentication: in ancient sediment contexts, contaminating DNA may itself be ancient, carried from adjacent horizons by bioturbation. A deaminated contaminant from a different time point passes all standard authentication filters. The AuthentiCT paper notes that if contaminating DNA is aged, contamination will be systematically underestimated. For population-variable analyses this is not a secondary concern—it is the regime where temporal deconvolution is most needed and least reliable.