	<rdf:RDF xmlns:admin="http://webns.net/mvcb/" xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:prism="http://purl.org/rss/1.0/modules/prism/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/">
	<channel rdf:about="https://biorxiv.org">
	<admin:errorReportsTo rdf:resource="mailto:biorxiv@cshlpress.edu"/>
	<title>bioRxiv Channel: Somatic Mosaicism across the Human Tissues Network (SMaHT)</title>
	<link>https://biorxiv.org</link>
	<description>
	This feed contains articles for bioRxiv Channel "Somatic Mosaicism across the Human Tissues Network (SMaHT)"
	</description>

		<items>
	<rdf:Seq>
		</rdf:Seq>
	</items>
	<prism:eIssn/>
	<prism:publicationName>bioRxiv</prism:publicationName>
	<prism:issn/>

	<image rdf:resource=""/>
	</channel>
	<image rdf:about="">
	<title>bioRxiv</title>
	<url/>
	<link>https://biorxiv.org</link>
	</image>
	<item rdf:about="https://biorxiv.org/cgi/content/short/2024.12.18.629274v1?rss=1">
<title>
<![CDATA[
A personalized multi-platform assessment of somatic mosaicism in the human frontal cortex 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.12.18.629274v1?rss=1
</link>
<description><![CDATA[
Somatic mutations in individual cells create genomic mosaicism, influencing genetic disorders and cancers. While clonal mutations in cancers are well-studied, rarer somatic variants in normal tissues remain poorly characterized. This study systematically evaluates detection methods using a personalized donor-specific assembly (DSA) from a neurotypical individual's dorsolateral prefrontal cortex, assessed with Oxford Nanopore, NovaSeq, linked-read sequencing, Cas9-targeted long-read sequencing (TEnCATS), and single-neuron MALBAC amplification. The haplotype-resolved DSA improved cross-platform analysis, dramatically increasing phasing rates. Germline SNVs, structural variations (SVs), and transposable elements (TEs) were recalled with 99.4%-99.7% accuracy in bulk tissue, and phased haplotype analysis reduced false positives by 15.4%-75.1% for putative somatic candidates. Long-read single-neuron sequencing detected nine somatic SV candidates, demonstrating enhanced sensitivity for rare variants, while TEnCATS identified eight low-frequency somatic TE candidates. These findings highlight advanced methodologies for precise somatic variant detection, critical for understanding mosaicism's role in health and disease.
]]></description>
<dc:creator>Zhou, W.</dc:creator>
<dc:creator>Mumm, C.</dc:creator>
<dc:creator>Gan, Y.</dc:creator>
<dc:creator>Switzenberg, J. A.</dc:creator>
<dc:creator>Wang, J.</dc:creator>
<dc:creator>De Oliveira, P.</dc:creator>
<dc:creator>Kathuria, K.</dc:creator>
<dc:creator>Losh, S. J.</dc:creator>
<dc:creator>McDonald, T. L.</dc:creator>
<dc:creator>Bessell, B.</dc:creator>
<dc:creator>Van Deynze, K.</dc:creator>
<dc:creator>McConnell, M. J.</dc:creator>
<dc:creator>Boyle, A. P.</dc:creator>
<dc:creator>Mills, R. E.</dc:creator>
<dc:date>2024-12-21</dc:date>
<dc:identifier>doi:10.1101/2024.12.18.629274</dc:identifier>
<dc:title><![CDATA[A personalized multi-platform assessment of somatic mosaicism in the human frontal cortex]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-12-21</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.11.07.619809v1?rss=1">
<title>
<![CDATA[
Image-based DNA Sequencing Encoding for Detecting Low-Mosaicism Somatic Mobile Element Insertions 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.11.07.619809v1?rss=1
</link>
<description><![CDATA[
Active LINE-1 (L1), Alu, and SVA mobile elements in the human genome are capable of retrotransposition, resulting in novel mobile element insertions (MEIs) in both germline and somatic tissues. Detecting MEIs through DNA sequencing relies on supporting reads overlapping MEI junctions; however, artifacts from DNA amplification, sequencing, and alignment errors produce numerous false positives. Systematic detection of somatic MEIs, particularly those with low mosaicism, remains a significant challenge. Previous methods required either a high number of supporting reads, which limits detection sensitivity, or human inspection, which is susceptible to bias. Here, we developed RetroNet, an algorithm that encodes MEI-supporting sequencing reads into images and employs a deep neural network to identify somatic MEIs with as few as two reads. Trained on extensive and diverse datasets and benchmarked across various conditions, RetroNet surpasses previous methods and eliminates the need for extensive manual examination. Applied to 161x or 195x Illumina sequencing of a cancer cell line, RetroNet achieved an average precision of 0.885 and recall of 0.579 for detecting somatic L1 insertions that are present in as few as 1.79% of the cells. Additionally, we demonstrated that RetroNet is effective for analyzing highly degraded DNA, such as circulating tumor DNA. RetroNet is applicable to the rapidly generated short-read sequencing data and has the potential to provide further insights into the functional and pathological implications of somatic retrotransposition.
]]></description>
<dc:creator>Tan, M.</dc:creator>
<dc:creator>Lin, Z.</dc:creator>
<dc:creator>Chen, Z.</dc:creator>
<dc:creator>Park, J.</dc:creator>
<dc:creator>He, Z.</dc:creator>
<dc:creator>Zhou, H.</dc:creator>
<dc:creator>Lee, E. A.</dc:creator>
<dc:creator>Gao, Z.</dc:creator>
<dc:creator>Zhu, X.</dc:creator>
<dc:date>2024-11-08</dc:date>
<dc:identifier>doi:10.1101/2024.11.07.619809</dc:identifier>
<dc:title><![CDATA[Image-based DNA Sequencing Encoding for Detecting Low-Mosaicism Somatic Mobile Element Insertions]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-11-08</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.11.06.622310v1?rss=1">
<title>
<![CDATA[
Deaminase-assisted single-molecule and single-cell chromatin fiber sequencing 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.11.06.622310v1?rss=1
</link>
<description><![CDATA[
Gene regulation is mediated by the co-occupancy of numerous proteins along individual chromatin fibers. However, our tools for deeply profiling how proteins co-occupy individual fibers, especially at the single-cell level, remain limited. We present Deaminase-Assisted single-molecule chromatin Fiber sequencing (DAF-seq), which leverages a non-specific double-stranded DNA deaminase toxin A (SsDddA) to efficiently stencil protein occupancy along DNA molecules via selective deamination of accessible cytidines, which are preserved via C-to-T transitions upon DNA amplification. We demonstrate that DAF-seq enables ~200,000-fold enrichment of target loci for single-molecule footprinting at near single-nucleotide resolution, enabling the precise delineation of the regulatory logic guiding neighboring proteins to cooperatively occupy chromatin fibers. Furthermore, DAF-seq enables the synchronous identification of single-molecule chromatin and genetic architectures - resolving the functional impact of rare somatic variants, as well as transitional chromatin states guiding haplotype-selective promoter actuation. Finally, we demonstrate that single-cell DAF-seq enables the accurate reconstruction of the diploid genome and epigenome from a single cell, revealing that a cell's accessible regulatory landscape can diverge by as much as 63% while still retaining the cell's identity. Overall, DAF-seq enables the comprehensive characterization of protein occupancy and chromatin accessibility across entire chromosomes with single-nucleotide, single-molecule, single-haplotype, and single-cell precision.
]]></description>
<dc:creator>Swanson, E. G.</dc:creator>
<dc:creator>Mao, Y.</dc:creator>
<dc:creator>Mallory, B. J.</dc:creator>
<dc:creator>Vollger, M. R.</dc:creator>
<dc:creator>Ranchalis, J.</dc:creator>
<dc:creator>Bohaczuk, S. C.</dc:creator>
<dc:creator>Parmalee, N. L.</dc:creator>
<dc:creator>Bennett, J. T.</dc:creator>
<dc:creator>Stergachis, A. B.</dc:creator>
<dc:date>2024-11-06</dc:date>
<dc:identifier>doi:10.1101/2024.11.06.622310</dc:identifier>
<dc:title><![CDATA[Deaminase-assisted single-molecule and single-cell chromatin fiber sequencing]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-11-06</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2023.10.09.561356v1?rss=1">
<title>
<![CDATA[
Characterization of Cancer Evolution Landscape Based on Accurate Detection of Somatic Mutations in Single Tumor Cells 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2023.10.09.561356v1?rss=1
</link>
<description><![CDATA[
Accurate detection of somatic mutations in single tumor cells is greatly desired as it allows us to quantify the single-cell mutation burden and construct the mutation-based phylogenetic tree. Here we developed scNanoSeq chemistry and profiled 842 single cells from 21 human breast cancer samples. The majority of the mutation-based phylogenetic trees comprise a characteristic stem evolution followed by the clonal sweep. We observed the subtype-dependent lengths in the stem evolution. To explain this phenomenon, we propose that the differences are related to different reprogramming required for different subtypes of breast cancer. Furthermore, we reason that the time that the tumor-initiating cell took to acquire the critical clonal-sweep-initiating mutation by random chance set the time limit for the reprogramming process. We refer to this model as a reprogramming and critical mutation co-timing (RCMC) subtype model. Next, in the sweeping clone, we observed that tumor cells undergo a branched evolution with rapidly decreasing selection. In the most recent clades, effectively neutral evolution has been reached, resulting in a substantially large number of mutational heterogeneities. Integrative analysis with 522-713X ultra-deep bulk whole genome sequencing (WGS) further validated this evolution mode. Mutation-based phylogenetic trees also allow us to identify the early branched cells in a few samples, whose phylogenetic trees support the gradual evolution of copy number variations (CNVs). Overall, the development of scNanoSeq allows us to unveil novel insights into breast cancer evolution.
]]></description>
<dc:creator>Niu, M.</dc:creator>
<dc:creator>Zhang, Y.</dc:creator>
<dc:creator>Luo, J.</dc:creator>
<dc:creator>Sinson, J. C.</dc:creator>
<dc:creator>Thompson, A. M.</dc:creator>
<dc:creator>Zong, C.</dc:creator>
<dc:date>2023-10-09</dc:date>
<dc:identifier>doi:10.1101/2023.10.09.561356</dc:identifier>
<dc:title><![CDATA[Characterization of Cancer Evolution Landscape Based on Accurate Detection of Somatic Mutations in Single Tumor Cells]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2023-10-09</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.10.07.680917v1?rss=1">
<title>
<![CDATA[
Multi-platform framework for mapping somatic retrotransposition in human tissues 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.10.07.680917v1?rss=1
</link>
<description><![CDATA[
Mobile element insertions (MEIs) shape the human genome in both germline and somatic tissues. While inherited MEIs are well characterized, mapping somatic MEIs (sMEIs) in non-cancer tissues remains challenging due to their low allelic fraction and repetitive nature. We established an integrative framework for sMEI analysis leveraging modern sequencing technologies and analytical innovations. We first benchmarked sMEI detection and demonstrated advantages of long-read and MEI-targeted sequencing for ultra-low-frequency events using a mixture of well-established cell lines. We then showed that haplotype phasing and donor-specific assemblies refine sMEI detection, effectively distinguishing sMEIs from germline and false signals in in-silico tumor-normal mixtures. We further developed a source-tracing strategy based on internal sequence variation, expanding the catalogue of active source elements beyond traditional transduction-based methods. Applying this framework to donor tissues, we identified 18 rare somatic L1 insertions, revealing structural and source diversity. Our work provides a foundational framework and biological insight into sMEIs.
]]></description>
<dc:creator>Wang, S.</dc:creator>
<dc:creator>Bae, M.</dc:creator>
<dc:creator>Wang, J.</dc:creator>
<dc:creator>Zhao, B.</dc:creator>
<dc:creator>Nguyen, K.</dc:creator>
<dc:creator>Mallett, S.</dc:creator>
<dc:creator>Switzenberg, J. A.</dc:creator>
<dc:creator>Losh, S. J.</dc:creator>
<dc:creator>Sexton, C. E.</dc:creator>
<dc:creator>Miao, B.</dc:creator>
<dc:creator>Dong, S.</dc:creator>
<dc:creator>Zeng, X.</dc:creator>
<dc:creator>Wang, Z.</dc:creator>
<dc:creator>McDonald, T. L.</dc:creator>
<dc:creator>Mumm, C.</dc:creator>
<dc:creator>Gadde, R. K.</dc:creator>
<dc:creator>Tariq, A. M.</dc:creator>
<dc:creator>Chen, Z.</dc:creator>
<dc:creator>Feng, W. C.</dc:creator>
<dc:creator>Burn, A.</dc:creator>
<dc:creator>Park, J.</dc:creator>
<dc:creator>Chu, C.</dc:creator>
<dc:creator>Shen, H.</dc:creator>
<dc:creator>Wang, T.</dc:creator>
<dc:creator>Urban, A. E.</dc:creator>
<dc:creator>Zhu, X.</dc:creator>
<dc:creator>Li, H.</dc:creator>
<dc:creator>Burns, K. H.</dc:creator>
<dc:creator>Chun, H.-J. E.</dc:creator>
<dc:creator>Park, P. J.</dc:creator>
<dc:creator>SMaHT MEI Working Group</dc:creator>
<dc:creator>Boyle, A. P.</dc:creator>
<dc:creator>Mills, R. E.</dc:creator>
<dc:creator>Zhou, W.</dc:creator>
<dc:creator>Lee, E. A.</dc:creator>
<dc:date>2025-10-07</dc:date>
<dc:identifier>doi:10.1101/2025.10.07.680917</dc:identifier>
<dc:title><![CDATA[Multi-platform framework for mapping somatic retrotransposition in human tissues]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-10-07</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.10.09.678885v1?rss=1">
<title>
<![CDATA[
Comprehensive benchmarking of somatic mutation detection by the SMaHT Network 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.10.09.678885v1?rss=1
</link>
<description><![CDATA[
Somatic mosaicism is increasingly recognized as a fundamental feature of human biology, yet the detection of somatic mutations remains challenging. The SMaHT Network conducted four large-scale benchmarking experiments to evaluate sequencing technologies, experimental approaches, and computational methods for detecting diverse somatic mutations. Cumulative sequencing coverage exceeded 1,000x with short reads and 100-400x with long reads for each of nine analyzed samples. We defined optimal strategies for integrating bulk short- and long-read sequencing for mutation detection and demonstrated that using donor-specific assemblies and human pangenome improved variant calling and extended mutation catalogs to challenging genomic regions. We benchmarked six duplex-seq technologies and showed that single-cell sequencing resolves cell type-specific mutational patterns and heterogeneity. Our results indicate that bulk, single-cell, and duplex analyses are complementary - and leveraging all three provides comprehensive characterization of mosaicism within a tissue. Together, these findings provide a roadmap for accurate, genome-wide somatic mutation discovery and analysis.
]]></description>
<dc:creator>The Somatic Mosaicism across Human Tissues Network (SMaHT)</dc:creator>
<dc:creator>Abyzov, A.</dc:creator>
<dc:date>2025-10-10</dc:date>
<dc:identifier>doi:10.1101/2025.10.09.678885</dc:identifier>
<dc:title><![CDATA[Comprehensive benchmarking of somatic mutation detection by the SMaHT Network]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-10-10</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.10.10.681725v1?rss=1">
<title>
<![CDATA[
A telomere-to-telomere map of somatic mutation burden and functional impact in cancer 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.10.10.681725v1?rss=1
</link>
<description><![CDATA[
Oncogenesis involves widespread genetic and epigenetic alterations, yet the full spectrum of somatic variation genome-wide remains unresolved. We generated a near-telomere-to-telomere (T2T) diploid assembly of a donor paired with deep short- and long-read sequencing of their melanoma. This revealed that 16% of somatic variants occur in sequences absent from GRCh38, with satellite repeats acting as hotspots for UV-induced damage due to sequence-intrinsic mutability and inefficient repair. Centromere kinetochore domains emerged as focal sites of structural, genetic, and epigenetic variation, leading to remodeling of centromere kinetochore binding domains during tumor evolution. Single-molecule telomere reconstructions uncovered cycles of attrition, deletion, and telomerase-mediated extension that shape cancer telomeres. Finally, diploid chromatin maps exposed that copy number alterations and epimutations, rather than point mutations, predominate in rewiring cancer regulatory programs. These findings define the full landscape of a cancer's somatic variation and its functional impact, establishing a blueprint for T2T studies of mosaicism.
]]></description>
<dc:creator>Sohn, M.-H.</dc:creator>
<dc:creator>Dubocanin, D.</dc:creator>
<dc:creator>Vollger, M. R.</dc:creator>
<dc:creator>Kwon, Y.</dc:creator>
<dc:creator>Minkina, A.</dc:creator>
<dc:creator>Munson, K. M.</dc:creator>
<dc:creator>Hart, S. F.</dc:creator>
<dc:creator>Ranchalis, J. E.</dc:creator>
<dc:creator>Parmalee, N. L.</dc:creator>
<dc:creator>Sedeno-Cortes, A. E.</dc:creator>
<dc:creator>Ou, J.</dc:creator>
<dc:creator>Au, N. Y.</dc:creator>
<dc:creator>Bohaczuk, S.</dc:creator>
<dc:creator>Carroll, B.</dc:creator>
<dc:creator>Frazar, C. D.</dc:creator>
<dc:creator>Harvey, W. T.</dc:creator>
<dc:creator>Hoekzema, K.</dc:creator>
<dc:creator>Huang, M.-F.</dc:creator>
<dc:creator>Jacques, C. N.</dc:creator>
<dc:creator>Jensen, D. M.</dc:creator>
<dc:creator>Kolar, J. T.</dc:creator>
<dc:creator>Lee, R.</dc:creator>
<dc:creator>Lin, J.</dc:creator>
<dc:creator>Loy, K.</dc:creator>
<dc:creator>Mack, T.</dc:creator>
<dc:creator>Mao, Y.</dc:creator>
<dc:creator>Pham, M. M.</dc:creator>
<dc:creator>Ryke, E.</dc:creator>
<dc:creator>Smith, J. D.</dc:creator>
<dc:creator>Sutherlin, L.</dc:creator>
<dc:creator>Swanson, E. G.</dc:creator>
<dc:creator>Weiss, J. M.</dc:creator>
<dc:creator>SMaHT Assembly Working Group</dc:creator>
<dc:creator>Carvalho, C.</dc:creator>
<dc:creator>Coorens, T. H.</dc:creator>
<dc:creator>Harris, K.</dc:creator>
<dc:creator>Wei, C.-L.</dc:creator>
<dc:creator>Eichler, E. E.</dc:creator>
<dc:creator>Altemose, N.</dc:creator>
<dc:creator>Bennett, J. T.</dc:creator>
<dc:creator>Stergachis, A. B.</dc:creator>
<dc:date>2025-10-13</dc:date>
<dc:identifier>doi:10.1101/2025.10.10.681725</dc:identifier>
<dc:title><![CDATA[A telomere-to-telomere map of somatic mutation burden and functional impact in cancer]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-10-13</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.10.31.685648v1?rss=1">
<title>
<![CDATA[
A comprehensive view of somatic mosaicism by single-cell DNA analysis 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.10.31.685648v1?rss=1
</link>
<description><![CDATA[
Single-cell DNA sequencing offers a powerful means of studying somatic mosaicism but requires careful analysis to mitigate DNA amplification-related artifacts. We performed primary template-directed amplification (PTA) and sequencing of 102 nuclei from postmortem lung and colon tissues of a 74-year-old male. Single-cell mutation burdens and spectra were validated by duplex sequencing and revealed heterogeneity across organs and cells, including signatures of APOBEC activity and tobacco exposure. Cells from both tissues exhibited chromosomal aneuploidies, loss of chromosome Y, and chromosomal rearrangements including rearrangements of the T-cell receptor loci indicative of T-cells. Shared embryonic mutations between cells enabled reconstruction of cellular ancestries from the zygote, which were validated by bulk sequencing. Collectively, we demonstrate a comprehensive approach for single-cell genomics that yields an expansive view of diverse somatic mutation types from development through aging across diverse tissues--insights that are obscured in bulk sequencing and only partially captured by other single-cell methods.
]]></description>
<dc:creator>Luquette, L. J.</dc:creator>
<dc:creator>Coorens, T. H. H.</dc:creator>
<dc:creator>Natu, A.</dc:creator>
<dc:creator>Suvakov, M.</dc:creator>
<dc:creator>Caplin, A.</dc:creator>
<dc:creator>Jun, M. S.</dc:creator>
<dc:creator>Mo, A.</dc:creator>
<dc:creator>Pelt, J.</dc:creator>
<dc:creator>Anderson, L.</dc:creator>
<dc:creator>Berselli, M.</dc:creator>
<dc:creator>Bhamidipati, S.</dc:creator>
<dc:creator>Blanchard, T.</dc:creator>
<dc:creator>Brew, J.</dc:creator>
<dc:creator>Chun, H.-J. E.</dc:creator>
<dc:creator>Chun, H.</dc:creator>
<dc:creator>Dehankar, M. K.</dc:creator>
<dc:creator>Feng, W. C.</dc:creator>
<dc:creator>Furatero, R.</dc:creator>
<dc:creator>Grochowski, C. M.</dc:creator>
<dc:creator>Ho, E.</dc:creator>
<dc:creator>Jang, Y.</dc:creator>
<dc:creator>Kottapalli, K.</dc:creator>
<dc:creator>Leonard, M. K.</dc:creator>
<dc:creator>Lim, N. S.</dc:creator>
<dc:creator>Lindsay, T.</dc:creator>
<dc:creator>Nicholson, S.</dc:creator>
<dc:creator>Raimondi, I.</dc:creator>
<dc:creator>Runnels, A.</dc:creator>
<dc:creator>Scharlee, C.</dc:creator>
<dc:creator>Shin, J.</dc:creator>
<dc:creator>Veit, A. D.</dc:creator>
<dc:creator>VonDran, M.</dc:creator>
<dc:creator>Wang, Y.</dc:creator>
<dc:creator>Yuan, D. J.</dc:creator>
<dc:creator>Zhao, Y.</dc:creator>
<dc:creator>Bell, T. J.</dc:creator>
<dc:creator>Ardlie, K.</dc:creator>
<dc:creator>Doddapaneni, H.</dc:creator>
<dc:creator>Fulton, R.</dc:creator>
<dc:creator>Germer, S.</dc:creator>
<dc:creator>Landau, D.</dc:creator>
<dc:creator>Oh, J. W.</dc:creator>
<dc:creator>Park, P. J.</dc:creator>
<dc:creator>Vaccarino, F. M.</dc:creator>
<dc:creator>Walsh, C. A.</dc:creator>
<dc:creator>Abyzov</dc:creator>
<dc:date>2025-11-03</dc:date>
<dc:identifier>doi:10.1101/2025.10.31.685648</dc:identifier>
<dc:title><![CDATA[A comprehensive view of somatic mosaicism by single-cell DNA analysis]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-11-03</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.10.13.681545v1?rss=1">
<title>
<![CDATA[
Comprehensive benchmarking of somatic single-nucleotide variant and indel detection at ultra-low allele fractions using short- and long-read data 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.10.13.681545v1?rss=1
</link>
<description><![CDATA[
Mosaic mutations in normal tissues occur at low variant allele fractions (VAFs), complicating detection. To benchmark strategies, the SMaHT Network created a cell-line mixture (1:49) and produced ultra-deep whole-genome sequencing using short and long reads (five centers, 180-500x each). We assembled a reference of 44,008 mosaic SNVs and 2,059 indels, cross-validating between platforms to expose limits of short-read analysis. We also partitioned the genome by mappability to examine the impact of genomic context, added a negative reference set, and accounted for culture-derived mutations. When seven institutions applied eleven algorithms to mixture data, call sets were largely discordant across tools and replicates, partly reflecting stochastic presence of low-VAF mutations in biological replicates. For >2% VAF SNVs, sensitivity and precision approached ~80% at ≥300x, with little gain from additional sequencing. This work provides a comprehensive framework for reliable detection of low-VAF mutations in non-cancer tissues and a valuable resource for the community.
]]></description>
<dc:creator>Ha, Y.-J. J.</dc:creator>
<dc:creator>Maziec, D.</dc:creator>
<dc:creator>Markowski, J.</dc:creator>
<dc:creator>Georges, S. J.</dc:creator>
<dc:creator>Parmalee, N. L.</dc:creator>
<dc:creator>Berselli, M.</dc:creator>
<dc:creator>Coorens, T. H.</dc:creator>
<dc:creator>Dong, S.</dc:creator>
<dc:creator>Gardiner, S.</dc:creator>
<dc:creator>Kalra, D.</dc:creator>
<dc:creator>Li, D.</dc:creator>
<dc:creator>Miao, B.</dc:creator>
<dc:creator>Musunuri, R.</dc:creator>
<dc:creator>Xue, L.</dc:creator>
<dc:creator>Yu, Z.</dc:creator>
<dc:creator>Walker, K.</dc:creator>
<dc:creator>Anderson, L.</dc:creator>
<dc:creator>Au, N. Y.</dc:creator>
<dc:creator>Cibulskis, C.</dc:creator>
<dc:creator>Doddapaneni, H.</dc:creator>
<dc:creator>Grochowski, C. M.</dc:creator>
<dc:creator>Jensen, D. M.</dc:creator>
<dc:creator>Lindsay, T.</dc:creator>
<dc:creator>Loy, K.</dc:creator>
<dc:creator>Narayan, A.</dc:creator>
<dc:creator>Narzisi, G.</dc:creator>
<dc:creator>Ou, J.</dc:creator>
<dc:creator>Pham, M. M.</dc:creator>
<dc:creator>Runnels, A. M.</dc:creator>
<dc:creator>Stergachis, A. B.</dc:creator>
<dc:creator>Sutherlin, L. M.</dc:creator>
<dc:creator>Wang, T.</dc:creator>
<dc:creator>Jin, H.</dc:creator>
<dc:creator>Feng, W. C.</dc:creator>
<dc:creator>Zhang, Y.</dc:creator>
<dc:creator>Veit, A. D.</dc:creator>
<dc:creator>Kim, C. T.</dc:creator>
<dc:creator>Chun, H.-J. E.</dc:creator>
<dc:creator>Ardlie, K.</dc:creator>
<dc:creator>Fulton, R. S.</dc:creator>
<dc:creator>Germer, S.</dc:creator>
<dc:creator>Gibbs, R. A.</dc:creator>
<dc:creator>Marth, G. T.</dc:creator>
<dc:creator>Bennett, J. T.</dc:creator>
<dc:creator>Park, P. J.</dc:creator>
<dc:date>2025-10-14</dc:date>
<dc:identifier>doi:10.1101/2025.10.13.681545</dc:identifier>
<dc:title><![CDATA[Comprehensive benchmarking of somatic single-nucleotide variant and indel detection at ultra-low allele fractions using short- and long-read data]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-10-14</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.09.18.677206v1?rss=1">
<title>
<![CDATA[
Comprehensive benchmarking of somatic structural variant detection at ultra-low allele fractions 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.09.18.677206v1?rss=1
</link>
<description><![CDATA[
Postzygotic mosaicism gives rise to somatic structural variants (SVs) at ultra-low variant allele fractions (VAFs), which pose challenges for detection due to the high-coverage sequencing required and noise introduced by sequencing artifacts. Although somatic SV detection has been extensively studied in cancer, these studies are not directly applicable to the study of tissue mosaicism, as they rely on matched normals, target higher VAF ranges, and are enriched for different types of SVs.

We present comprehensive benchmark data and best practices for non-cancer somatic SV detection. We created a synthetic mosaic sample by combining six HapMap individuals at varying proportions, generating allele fractions as low as 0.25%. This sample was sequenced to ~2,300x total coverage using Illumina, PacBio, and Nanopore technologies across multiple sequencing centers. A high-confidence benchmark SV set containing over 21,000 pseudo-somatic insertions and deletions ≥50 bp was derived from haplotype-resolved assemblies.

We evaluated 12 SV discovery pipelines and identified caller-specific strengths and sequencing platform-specific shortcomings. We find that short read-based approaches show reduced recall for insertions and repeat-associated SVs, whereas long-read sequencing achieves high accuracy throughout the genome, increasing linearly with coverage. The best algorithm's sensitivity exceeded 80% for VAFs ≥4% and 15% for VAFs of 0.5-1% with 60x coverage.

The publicly available benchmarking data and comparative analysis of current methods provide a foundation for robust discovery of SV mosaicism in non-cancer tissues.
]]></description>
<dc:creator>Zhang, Y.</dc:creator>
<dc:creator>English, A. C.</dc:creator>
<dc:creator>Paulin, L. F.</dc:creator>
<dc:creator>Grochowski, C. M.</dc:creator>
<dc:creator>Maheshwari, S.</dc:creator>
<dc:creator>Mack, T.</dc:creator>
<dc:creator>Berselli, M.</dc:creator>
<dc:creator>Veit, A. D.</dc:creator>
<dc:creator>Fu, Y.</dc:creator>
<dc:creator>SMAHT SV working group</dc:creator>
<dc:creator>Park, P. J.</dc:creator>
<dc:creator>Sedlazeck, F. J.</dc:creator>
<dc:date>2025-09-20</dc:date>
<dc:identifier>doi:10.1101/2025.09.18.677206</dc:identifier>
<dc:title><![CDATA[Comprehensive benchmarking of somatic structural variant detection at ultra-low allele fractions]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-09-20</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.09.29.679336v1?rss=1">
<title>
<![CDATA[
A Pangenomic Method for Establishing a Somatic Variant Detection Resource in HapMap Mixtures 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.09.29.679336v1?rss=1
</link>
<description><![CDATA[
Somatic mosaicism is essential in human biology and disease, yet robust benchmarks are scarce. The SMaHT Consortium mixed six HapMap cell lines to create artificial somatic variants spanning 0.25% to 16.5% variant allele fractions. We developed a technology-agnostic method that builds pangenome graphs from individual assemblies to create unified benchmarking sets: > 6M single-nucleotide variants, 1.8M small insertions/deletions, 49K structural variations, and 10K mobile element insertions across autosomes, X, and mitochondrial chromosomes. We validated the variants using ultra-deep simulated reads and developed a binomial-based model to estimate coverage requirements for variant detection. Evaluating multiple callers showed CHM13 alignment improves structural variant detection and offers advantages in difficult-to-map regions compared to GRCh38. Systematic characterization showed regions with low detection rate are enriched in centromeres, satellite sequences, tandem repeats, and falsely duplicated genes. This accurate, versatile resource enables systematic evaluation of somatic variant detection technologies.
]]></description>
<dc:creator>Kong, N.</dc:creator>
<dc:creator>Tang, Z.</dc:creator>
<dc:creator>Ruttenberg, A.</dc:creator>
<dc:creator>Macias-Velasco, J. F.</dc:creator>
<dc:creator>Li, Z.</dc:creator>
<dc:creator>Zhang, W.</dc:creator>
<dc:creator>Miao, B.</dc:creator>
<dc:creator>Xin, Z.</dc:creator>
<dc:creator>Fu, Q.</dc:creator>
<dc:creator>Park, H.</dc:creator>
<dc:creator>Zhuo, X.</dc:creator>
<dc:creator>Mehinovic, E.</dc:creator>
<dc:creator>Belter, E.</dc:creator>
<dc:creator>Garza, J. E.</dc:creator>
<dc:creator>Dong, S.</dc:creator>
<dc:creator>Casey, E.</dc:creator>
<dc:creator>Johnson, B. K.</dc:creator>
<dc:creator>Majewski, M. F.</dc:creator>
<dc:creator>Palmer, T.</dc:creator>
<dc:creator>Cheng, Y.</dc:creator>
<dc:creator>Lindsay, T.</dc:creator>
<dc:creator>Schedl, T.</dc:creator>
<dc:creator>Li, D.</dc:creator>
<dc:creator>Shen, H.</dc:creator>
<dc:creator>SMaHT Network Assembly/Pangenome Working Group</dc:creator>
<dc:creator>Fulton, R.</dc:creator>
<dc:creator>Wang, T.</dc:creator>
<dc:creator>Jin, S. C.</dc:creator>
<dc:date>2025-10-01</dc:date>
<dc:identifier>doi:10.1101/2025.09.29.679336</dc:identifier>
<dc:title><![CDATA[A Pangenomic Method for Establishing a Somatic Variant Detection Resource in HapMap Mixtures]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-10-01</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.12.05.692678v1?rss=1">
<title>
<![CDATA[
Expanding the Genome in a Bottle Truth Set: Detection and Validation of Novel Low-frequency Variants Using High-accuracy NanoSeq 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.12.05.692678v1?rss=1
</link>
<description><![CDATA[
Highlights:
NanoSeq-MBN achieves near genome-wide, Poisson-like coverage with minimal trinucleotide bias.
Expands the GIAB truth set by up to 160k de novo variants.
Adds a somatic layer to GIAB, enabling benchmarking and calibration of rare variants.
High-CADD exonic and splice variants highlight value for surveillance and clinical triage.

Somatic mutations record tissue molecular history and inform risk, prognosis, and therapy, yet their variant allele fractions often fall below the reliable detection limit of conventional short-read sequencing. In contrast, duplex sequencing, as implemented in NanoSeq, applies the principle of single-molecule detection and thereby overcomes this limitation. However, the original NanoSeq protocol relies on restriction enzyme-based genome fragmentation, which constrains its genome coverage to 30-40%. To enable whole-genome discovery with duplex-level fidelity, we pursued two complementary approaches to optimize the NanoSeq protocol: (i) a restriction-enzyme strategy that densifies accessible sites using orthogonal 4-bp cutters; and (ii) a workflow (NanoSeq-MBN) that uses sonication followed by mung bean nuclease with T4 polynucleotide kinase, Klenow fragment, and a dATP/ddBTP mixture to blunt, repair, and A-tail DNA while minimizing repair artifacts. We systematically benchmarked their performance using Genome in a Bottle (GIAB) gold-standard sample mixtures. NanoSeq-MBN achieved near genome-wide, Poisson-like coverage with minimal trinucleotide-context bias and ultra-high accuracy. Beyond variants already present in the GIAB truth set, NanoSeq-MBN identified approximately 120,000-160,000 de novo mutations per sample that are missing from the truth set. Notably, over 98% had orthogonal support in reanalyzed GIAB bulk Illumina HiSeq libraries. These novel variants extend GIAB from germline benchmarking to rare-variant discovery and calibration of subclonal detection. Functional annotation revealed enrichment of mutations with high Combined Annotation Dependent Depletion (CADD) scores in exonic and splice-related regions. Variants intersecting ClinVar entries and OMIM genes highlighted potential for surveillance and clinical triage.
Collectively, these results add a somatic layer to GIAB, enabling calibration of burdens and mutational signatures in lymphoblastoid lines and providing reference material for rare-variant assays. The NanoSeq-MBN workflow offers a path to whole-genome, high-fidelity discovery of ultra-rare somatic variation with relevance to clinical assay validation.
]]></description>
<dc:creator>Zhang, Y.</dc:creator>
<dc:creator>Chao, H.</dc:creator>
<dc:creator>Niu, M.</dc:creator>
<dc:creator>Grochowski, C. M.</dc:creator>
<dc:creator>Kottapalli, K.</dc:creator>
<dc:creator>Bhamidipati, S. V.</dc:creator>
<dc:creator>Muzny, D. M.</dc:creator>
<dc:creator>Gibbs, R. A.</dc:creator>
<dc:creator>Zong, C.</dc:creator>
<dc:creator>Doddapaneni, H. V.</dc:creator>
<dc:date>2025-12-06</dc:date>
<dc:identifier>doi:10.64898/2025.12.05.692678</dc:identifier>
<dc:title><![CDATA[Expanding the Genome in a Bottle Truth Set: Detection and Validation of Novel Low-frequency Variants Using High-accuracy NanoSeq]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-12-06</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.12.03.692191v1?rss=1">
<title>
<![CDATA[
MosaicSim: A Novel Mosaic Variant Simulator Reveals Diminishing Returns of Ultra-High Coverage for Mosaic Variant Detection 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.12.03.692191v1?rss=1
</link>
<description><![CDATA[
Genetic mutations within select cells of a tissue, termed mosaic variants (MVs), are increasingly recognized for their role in human disease. This growing interest underscores the need for specialized tools to detect and analyze MVs. However, such detection methods still lack thorough evaluation, largely due to missing benchmarking datasets that are large, reliable, and reflective of the complexity of biological samples. To address this gap, we developed MosaicSim, a tool for simulating variants in realistic sequencing data. The TweakVar workflow is at the tool's core and represents a unique simulation pipeline that layers simulated MVs onto empirical whole genome sequencing data, generating a large, realistic ground truth dataset that combines the strengths of both simulation and biological data. To demonstrate the functionality of the workflow, we simulated 1,000 mosaic single nucleotide polymorphisms using TweakVar within whole genome sequencing files of different coverages. MVs were called with Illumina's DRAGEN and compared to the ground truth. Our results show that 150x-445x coverage performed comparably, with a true-positive rate between 50.4% (300x) and 54.9% (150x) and no false positives detected. Across all samples, increasing variant allele frequency had a significant positive effect on call success. Additionally, we observed that call rates for variants in lower-complexity regions improved with increasing read depth. We did not find significant effects attributable to specific mutation patterns or mean read mapping quality. MosaicSim fills a critical unmet need by providing representative, customizable ground truth datasets for MV benchmarking, enabling systematic evaluation and optimization of variant calling methods.
]]></description>
<dc:creator>Stricker, E.</dc:creator>
<dc:creator>Jaryani, F.</dc:creator>
<dc:creator>Izydorczyk, M.</dc:creator>
<dc:creator>Poon, C.-L.</dc:creator>
<dc:creator>Sanio, P.</dc:creator>
<dc:creator>Alexander, A.</dc:creator>
<dc:creator>Deb, S.</dc:creator>
<dc:creator>Sedlazeck, F.</dc:creator>
<dc:creator>Rogers, J.</dc:creator>
<dc:creator>Atkinson, E. G.</dc:creator>
<dc:date>2025-12-07</dc:date>
<dc:identifier>doi:10.64898/2025.12.03.692191</dc:identifier>
<dc:title><![CDATA[MosaicSim: A Novel Mosaic Variant Simulator Reveals Diminishing Returns of Ultra-High Coverage for Mosaic Variant Detection]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-12-07</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.10.28.685157v1?rss=1">
<title>
<![CDATA[
Single cell whole genome and transcriptome sequencing links somatic mutations to cell identity and ancestry 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.10.28.685157v1?rss=1
</link>
<description><![CDATA[
The role of somatic mutations in human development and disease is obscured by difficulties in characterizing mutations at the single-cell level and identifying the cell types carrying them. Here we analysed somatic genomes of clonal iPSC lines and of single cells after whole-genome amplification (scWGA) by PTA and ResolveOme from skin fibroblasts, blood and urine of a live donor. Mutation burden and spectra converged across approaches, revealing heterogeneous mutational footprints across cells driven by environmental exposures (UV damage and chemotherapy) and lymphocyte differentiation. Aneuploidies in single cells were detected by all the approaches and were orthogonally validated by Strand-seq. Uniquely, ResolveOme enabled cell-type identification using single-cell transcriptomes. Using a newly developed method accounting for noise and allele drop-out in scWGA, we reconstructed de novo the cell phylogenetic tree for this donor. Together, scWGA establishes a powerful foundation for comprehensive, cell type-aware, lineage-aware profiling of somatic mutations at the single-cell level.
]]></description>
<dc:creator>Natu, A.</dc:creator>
<dc:creator>Dehankar, M. K.</dc:creator>
<dc:creator>Pattni, R.</dc:creator>
<dc:creator>Suvakov, M.</dc:creator>
<dc:creator>Olisov, D.</dc:creator>
<dc:creator>Tomasini, L.</dc:creator>
<dc:creator>Jang, Y.</dc:creator>
<dc:creator>Huang, Y.</dc:creator>
<dc:creator>Benito-Garragori, E.</dc:creator>
<dc:creator>Hasenfeld, P.</dc:creator>
<dc:creator>Korbel, J.</dc:creator>
<dc:creator>Urban, A. E.</dc:creator>
<dc:creator>Abyzov, A.</dc:creator>
<dc:creator>Vaccarino, F. M.</dc:creator>
<dc:date>2025-10-29</dc:date>
<dc:identifier>doi:10.1101/2025.10.28.685157</dc:identifier>
<dc:title><![CDATA[Single cell whole genome and transcriptome sequencing links somatic mutations to cell identity and ancestry]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-10-29</prism:publicationDate>
<prism:section></prism:section>
</item>
</rdf:RDF>
