A Genomic Portrait of India: Tracing Deep Ancestry, Health, and Disease Risk Using DNA
Whole-genome sequencing of 2,762 individuals in India has unravelled layers of ancestry and revealed ancient intermixing with Neanderthals and Denisovans, allowing researchers to reconstruct roughly half of the Neanderthal genome and one-fifth of the Denisovan genome using only Indian genomes. Beyond revealing our history, the study offers insights into the high degree of endogamy practiced in India, which helps explain why some rare, harmful mutations have become more common in certain Indian groups.
A groundbreaking genetic study published recently in the journal Cell provides the most comprehensive view of the genetic origins, evolutionary history, and disease-associated variation across the Indian subcontinent. The study analysed whole-genome sequences from 2,762 individuals — capturing the vast diversity of India’s major linguistic, ethnic, and regional communities — from the Longitudinal Aging Study in India-Diagnostic Assessment of Dementia (LASI-DAD). Led by an international team of researchers from the All India Institute of Medical Sciences (AIIMS) Delhi, the University of California Berkeley, the University of Michigan, and the University of Southern California, this study marks a major step toward filling the longstanding gap in global genomic datasets and has far-reaching implications for both science and public health in India.
Layers of ancestry, woven over millennia
Our DNA carries the story of where we come from — holding clues not just about our parents and grandparents, but also about ancestors who lived thousands of years ago. In this study, scientists analysed the genomes of more than 2,700 individuals from across India and compared them to those of people from around the world, both living and ancient. Their goal was to answer important questions about India’s history: When did humans first arrive in India from Africa? How much DNA do Indians today carry from now-extinct archaic groups like Neanderthals and Denisovans? And how did major technological and cultural changes — like the invention of farming and the spread of languages — shape genetic differences and patterns of disease across India?
We know that the earliest inhabitants of India were ancient hunter-gatherers, often referred to as Ancient Ancestral South Indians (AASI), who share deep genetic links with present-day people from the Andaman Islands in the Bay of Bengal. While some archaeological and genetic evidence suggests that humans may have arrived in India over 70,000 years ago, possibly before the Toba volcanic eruption, this study finds that the majority of genetic variation in present-day Indians traces back to a major migration out of Africa approximately 50,000 years ago. This same migration also populated Europe and East Asia, indicating that modern Indian populations share a common origin with other non-Africans.
On their journey out of Africa, our ancestors encountered Neanderthals and Denisovans, who were already living across Eurasia. As a result, most Indians today carry around 1-2% Neanderthal ancestry, similar to Europeans and Americans, and about 0.2% Denisovan ancestry, similar to East Asians. One striking finding from this study is that Indian genomes contain the most diverse set of Neanderthal DNA segments observed in any modern population outside Africa. This diversity allowed researchers to reconstruct roughly half of the Neanderthal genome and one-fifth of the Denisovan genome using only Indian genomes, a remarkable achievement. In particular, one region of the genome in Indians carries a high number of Denisovan-derived variants, a pattern not seen in any population outside India.

Over the past 10,000 years, migrations and cultural shifts, especially the development of agriculture, shaped the genetic landscape of India. Archaeological evidence suggests that agriculture in South Asia began around 8,000-9,000 years ago, likely introduced by Neolithic farmers from West Asia. By comparing Indian genomes with ancient samples spanning the Neolithic to the Iron Age from Central Asia and the Middle East, this study traced the source of Neolithic farmer ancestry to 4th-millennium BCE farmers from Tajikistan (Sarazm_EN, about 3,600-3,500 BCE). This genetic link aligns with the archaeological discovery at the Sarazm site of shell bangles originating in Neolithic India, highlighting early trade connections between the Indus Valley Civilization and Central Asian cultures.
The genome analysis also showed that Indians have ancestry from Eurasian Steppe pastoralists, who arrived more recently and whose influence is more pronounced in northern and northwestern populations. Along with Neolithic farmers and indigenous hunter-gatherers, these groups contributed most of the genetic diversity seen across India today. After these ancient mixtures, India experienced a shift towards endogamy, with people increasingly marrying within their own communities and caste groups. The extent of endogamy practiced in India is reflected in the high level of relatedness seen among Indians: among the 2,762 individuals studied, each had at least one 4th-degree cousin or closer relative in the sample. The level of homozygosity seen among Indians is two to nine times higher than in East Asians and Europeans. Together, these layers of ancient migration, mixing, and social structure have given rise to one of the most intricate and regionally varied genetic landscapes in the world.
What our DNA reveals about our health
Beyond revealing our history, DNA also offers clues into why we get sick and how we adapt to our environment. This study shows that ancient ancestry can affect modern health — for example, some Neanderthal variants increase the risk of severe COVID-19, while Denisovan genes contribute to immune system diversity. It also found that within-group marriages can increase the chance of inheriting identical segments of DNA from both parents, elevating genomic homozygosity and the risk of recessive diseases. As a result, rare, harmful mutations have become more common in some Indian groups.
These include a mutation causing butyrylcholinesterase deficiency, found at elevated frequency in the Vysya community of Andhra Pradesh and Telangana. Butyrylcholinesterase deficiency can lead to prolonged neuromuscular blockade and muscle paralysis when certain muscle relaxants are used during anesthesia. An earlier study had identified several population groups, such as the Vysya community in South India, Parsis, and the Baniya community in North India, that carry an elevated load of recessive variants and may therefore bear a high burden of genetic disease.
Other identified variants are linked to blood disorders, hearing loss, cystic fibrosis (a genetic disorder that primarily affects the lungs and digestive system due to thick, sticky mucus buildup), and phenylketonuria (a rare genetic disorder where the body is unable to break down phenylalanine, an amino acid found in protein) — mutations rarely seen outside India but found more frequently in specific regions in India. These findings highlight the power of genetic studies to reveal hidden patterns of disease risk and lay the foundation for disease mapping and development of population-specific approaches to healthcare.
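The link between endogamy and recessive disease described above can be made concrete with a small calculation. The sketch below uses the standard population-genetics relationship in which the chance of inheriting two copies of a recessive variant is q² + F·q·(1 − q), where q is the variant's frequency and F is the inbreeding coefficient (the probability that both copies of a gene descend from the same ancestor). The function name and the numbers used are illustrative assumptions, not values or methods taken from the study.

```python
def recessive_disease_risk(q: float, f: float = 0.0) -> float:
    """Probability that a person carries two copies of a recessive variant.

    q: frequency of the variant in the population (0 to 1).
    f: inbreeding coefficient (0 = random mating, 1 = fully inbred).
    Uses the textbook formula q^2 + f*q*(1 - q); this is an illustration,
    not the method used in the study.
    """
    if not (0.0 <= q <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("q and f must both lie between 0 and 1")
    return q * q + f * q * (1.0 - q)


if __name__ == "__main__":
    q = 0.01  # hypothetical variant frequency (1 in 100 chromosomes)
    baseline = recessive_disease_risk(q)
    # Hypothetical inbreeding coefficients; 0.0625 (1/16) is the textbook
    # value for a child of first cousins.
    for f in (0.0, 0.01, 0.0625):
        risk = recessive_disease_risk(q, f)
        print(f"F={f:<7} risk={risk:.6f}  ({risk / baseline:.1f}x baseline)")
```

With these illustrative numbers, a variant carried on 1 in 100 chromosomes is about seven times more likely to appear in two copies when F is 1/16 than under random mating, and the relative increase is larger still for rarer variants.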
The road ahead for Indian genomics
This study opens a new chapter in our understanding of India’s population history, bridging genomics, archaeology, and medical science. It reminds us that our past is not a single thread but a complex tapestry — woven from population separations, migrations, and mixtures — and this deep history continues to shape patterns of health and disease today. For those of us working on Indian population history, it is an exciting moment—and one that promises to transform both historical research and public health in India. While I see this study as a major step forward, it is also a call to action.
One promising step forward is the Genome India Project, a national initiative led by the Department of Biotechnology that has sequenced the genomes of over 10,000 people from 83 populations. Alongside genomic data, the project has collected blood biochemistry and anthropometric measures to better understand disease susceptibility and drug response. Yet this is only a beginning. In contrast, the UK Biobank and ‘All of Us’ projects include genomic and detailed phenotypic data from hundreds of thousands to over a million participants.
India, with its vast population and complex genetic diversity, needs similarly ambitious efforts that integrate genomic, clinical, and environmental data at scale. Equally important is expanding ancient DNA research in India to understand the origins of functional variants and the selective pressures acting on them. Combining ancient and modern genomes will offer a more complete picture of how past migrations and cultural changes shaped the genetic diversity of India today. Achieving this will require sustained investment and strong international collaborations that bring together technical expertise, shared data resources, and diverse scientific perspectives.

