	<rdf:RDF xmlns:admin="http://webns.net/mvcb/" xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:prism="http://purl.org/rss/1.0/modules/prism/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/">
	<channel rdf:about="https://biorxiv.org">
	<admin:errorReportsTo rdf:resource="mailto:biorxiv@cshlpress.edu"/>
	<title>bioRxiv Channel: Arc Institute</title>
	<link>https://biorxiv.org</link>
	<description>
	This feed contains articles for bioRxiv Channel "Arc Institute"
	</description>

		<items>
	<rdf:Seq>
		</rdf:Seq>
	</items>
	<prism:eIssn/>
	<prism:publicationName>bioRxiv</prism:publicationName>
	<prism:issn/>

	<image rdf:resource=""/>
	</channel>
	<image rdf:about="">
	<title>bioRxiv</title>
	<url/>
	<link>https://biorxiv.org</link>
	</image>
	<item rdf:about="https://biorxiv.org/cgi/content/short/2024.07.11.603067v1?rss=1">
<title>
<![CDATA[
Accurate isoform quantification by joint short- and long-read RNA-sequencing 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.07.11.603067v1?rss=1
</link>
<description><![CDATA[
Accurate quantification of transcript isoforms is crucial for understanding gene regulation, functional diversity, and cellular behavior. Existing RNA sequencing methods have significant limitations: short-read (SR) sequencing provides high depth but struggles with isoform deconvolution, whereas long-read (LR) sequencing offers isoform resolution at the cost of lower depth, higher noise, and technical biases. Addressing this gap, we introduce Multi-Platform Aggregation and Quantification of Transcripts (MPAQT), a generative model that combines the complementary strengths of different sequencing platforms to achieve state-of-the-art isoform-resolved transcript quantification, as demonstrated by extensive simulations and experimental benchmarks. By applying MPAQT to an in vitro model of human embryonic stem cell differentiation into cortical neurons, followed by machine learning-based modeling of transcript abundances, we show that untranslated regions (UTRs) are major determinants of isoform proportion and exon usage; this effect is mediated through isoform-specific sequence features embedded in UTRs, which likely interact with RNA-binding proteins that modulate mRNA stability. These findings highlight MPAQT's potential to enhance our understanding of transcriptomic complexity and underline the role of splicing-independent post-transcriptional mechanisms in shaping the isoform and exon usage landscape of the cell.
]]></description>
<dc:creator>Apostolides, M.</dc:creator>
<dc:creator>Choi, B.</dc:creator>
<dc:creator>Navickas, A.</dc:creator>
<dc:creator>Saberi, A.</dc:creator>
<dc:creator>Soto, L. M.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:creator>Najafabadi, H. S.</dc:creator>
<dc:date>2024-07-13</dc:date>
<dc:identifier>doi:10.1101/2024.07.11.603067</dc:identifier>
<dc:title><![CDATA[Accurate isoform quantification by joint short- and long-read RNA-sequencing]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-07-13</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2023.08.11.553045v1?rss=1">
<title>
<![CDATA[
Blocking oligomerization is the most viable strategy to inhibit STING 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2023.08.11.553045v1?rss=1
</link>
<description><![CDATA[
The anti-viral and anti-cancer STING innate immune pathway can exacerbate autoimmune and neurodegenerative diseases when aberrantly activated, emphasizing a key unmet need for STING pathway antagonists. However, no such inhibitors have advanced to the clinic because it remains unclear which mechanistic step(s) of human STING activation are crucial for potent and context-independent inhibition of downstream signaling. Here, we report that C91 palmitoylation, the mechanistic target of a potent tool compound, is not universally necessary for human STING signaling, making it a poor target for drug development. Instead, we discover that evolutionarily conserved C64 is basally palmitoylated and is crucial for preventing unproductive STING oligomerization in the absence of cGAMP stimulation. The effects of palmitoylation at C64 and C91 converge on the control of intra-dimer disulfide bond formation at C148. Importantly, we show for the first time that signaling-competent STING oligomers are composed of a mixture of two species: disulfide-linked STING dimers that stabilize the oligomer, and reduced STING dimers that are phosphorylated to actuate interferon signaling. Given this complex landscape and cell type specificity of palmitoylation modifications, we conclude that robust STING inhibitors must directly inhibit the oligomerization process. Taking inspiration from STING's natural autoinhibitory mechanism, we identified an eight amino acid peptide that binds a defined pocket at the inter-dimer oligomerization interface as a proof-of-concept human STING inhibitor, setting the stage for future therapeutic development.

Summary: We report that functional STING oligomers require palmitoylation at cysteine 64 and some proportion of reduced dimers, and define the site of autoinhibition that can be targeted to disrupt STING oligomerization and activity.
]]></description>
<dc:creator>Chan, R. J. M.</dc:creator>
<dc:creator>Cao, X.</dc:creator>
<dc:creator>Ergun, S. L.</dc:creator>
<dc:creator>Njomen, E.</dc:creator>
<dc:creator>Lynch, S. R.</dc:creator>
<dc:creator>Cravatt, B. F.</dc:creator>
<dc:creator>Li, L.</dc:creator>
<dc:date>2023-08-12</dc:date>
<dc:identifier>doi:10.1101/2023.08.11.553045</dc:identifier>
<dc:title><![CDATA[Blocking oligomerization is the most viable strategy to inhibit STING]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2023-08-12</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.01.29.577779v1?rss=1">
<title>
<![CDATA[
Rewriting endogenous human transcripts with trans-splicing 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.01.29.577779v1?rss=1
</link>
<description><![CDATA[
Splicing bridges the gap between static DNA sequence and the diverse and dynamic set of protein products that execute a gene's biological functions. While exon skipping technologies enable influence over splice site selection, many desired perturbations to the transcriptome require replacement or addition of exogenous exons to target mRNAs: for example, to replace disease-causing exons, repair truncated proteins, or engineer protein fusions. Here, we report the development of RNA-guided trans-splicing with Cas editor (RESPLICE), inspired by the rare, natural process of trans-splicing that joins exons from two distinct primary transcripts. RESPLICE uses two orthogonal RNA-targeting CRISPR effectors to co-localize a trans-splicing pre-mRNA and to inhibit the cis-splicing reaction, respectively. We demonstrate efficient, specific, and programmable trans-splicing of multi-kilobase RNA cargo into nine endogenous transcripts across two human cell types, achieving up to 45% trans-splicing efficiency in bulk, or 90% when sorting for high effector expression. Our results present RESPLICE as a new mode of RNA editing for fine-tuned and transient control of cellular programs without permanent alterations to the genetic code.
]]></description>
<dc:creator>Chandrasekaran, S. S.</dc:creator>
<dc:creator>Tau, C.</dc:creator>
<dc:creator>Nemeth, M.</dc:creator>
<dc:creator>Pawluk, A.</dc:creator>
<dc:creator>Konermann, S.</dc:creator>
<dc:creator>Hsu, P. D.</dc:creator>
<dc:date>2024-01-30</dc:date>
<dc:identifier>doi:10.1101/2024.01.29.577779</dc:identifier>
<dc:title><![CDATA[Rewriting endogenous human transcripts with trans-splicing]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-01-30</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.01.25.634869v1?rss=1">
<title>
<![CDATA[
Decoy-seq unlocks scalable genetic screening for regulatory small noncoding RNAs 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.01.25.634869v1?rss=1
</link>
<description><![CDATA[
Small noncoding RNAs (smRNAs) play critical roles in regulating various cellular processes, including development, stress response, and disease pathogenesis. However, functional characterization of smRNAs remains limited by the scale and simplicity of phenotypic readouts. Recently, single-cell perturbation screening methods, which link CRISPR-mediated genetic perturbations to rich transcriptomic profiling, have emerged as foundational and scalable approaches for understanding gene functions, mapping regulatory networks, and revealing genetic interactions. However, a comparable approach for probing the regulatory consequences of smRNA perturbations is lacking. Here, we present Decoy-seq as an extension of this approach for high-content, single-cell perturbation screening of smRNAs. This method leverages U6-driven tough decoys (TuD), which form stable duplexes with their target smRNAs, for inhibition in the cell. Lentiviral-encoded TuDs are compatible with conventional single-cell RNA-sequencing (scRNA-seq) technologies, allowing joint identification of the smRNA perturbation in each cell and its associated transcriptomic profile. We applied Decoy-seq to 336 microRNAs (miRNAs) and 196 tRNA-derived fragments (tRFs) in a human breast cancer cell line, demonstrating its ability to uncover complex regulatory pathways and novel functions of these smRNAs. Notably, we show that tRFs influence mRNA polyadenylation and regulate key cancer-associated processes, such as cell cycle progression and proliferation. Therefore, Decoy-seq provides a powerful framework for exploring the functional roles of smRNAs in normal physiology and disease, and holds promise for accelerating future discoveries.
]]></description>
<dc:creator>Choi, B.</dc:creator>
<dc:creator>Sobti, S.</dc:creator>
<dc:creator>Soto, L. M.</dc:creator>
<dc:creator>Charbonneau, T.</dc:creator>
<dc:creator>Sababi, A.</dc:creator>
<dc:creator>Navickas, A.</dc:creator>
<dc:creator>Najafabadi, H. S.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:date>2025-01-27</dc:date>
<dc:identifier>doi:10.1101/2025.01.25.634869</dc:identifier>
<dc:title><![CDATA[Decoy-seq unlocks scalable genetic screening for regulatory small noncoding RNAs]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-01-27</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.11.01.621560v1?rss=1">
<title>
<![CDATA[
Site-specific DNA insertion into the human genome with engineered recombinases 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.11.01.621560v1?rss=1
</link>
<description><![CDATA[
Technologies for precisely inserting large DNA sequences into the genome are critical for diverse research and therapeutic applications. Large serine recombinases (LSRs) can mediate direct, site-specific genomic integration of multi-kilobase DNA sequences without a pre-installed landing pad, but current approaches suffer from low insertion rates and high off-target activity. Here, we present a comprehensive engineering roadmap for the joint optimization of DNA recombination efficiency and specificity. We combined directed evolution, structural analysis, and computational models to rapidly identify additive mutational combinations. We further enhanced performance through donor DNA optimization and dCas9 fusions, enabling simultaneous target and donor recruitment. Top engineered LSR variants achieved up to 53% integration efficiency and 97% genome-wide specificity at an endogenous human locus, and effectively integrated large DNA cargoes (up to 12 kb tested) for stable expression in challenging cell types, including non-dividing cells, human embryonic stem cells, and primary human T cells. This blueprint for rational engineering of DNA recombinases enables precise genome engineering without the generation of double-stranded breaks.
]]></description>
<dc:creator>Fanton, A.</dc:creator>
<dc:creator>Bartie, L. J.</dc:creator>
<dc:creator>Martins, J. Q.</dc:creator>
<dc:creator>Tran, V. Q.</dc:creator>
<dc:creator>Goudy, L.</dc:creator>
<dc:creator>Durrant, M. G.</dc:creator>
<dc:creator>Wei, J.</dc:creator>
<dc:creator>Pawluk, A.</dc:creator>
<dc:creator>Konermann, S.</dc:creator>
<dc:creator>Marson, A.</dc:creator>
<dc:creator>Gilbert, L. A.</dc:creator>
<dc:creator>Hsu, P. D.</dc:creator>
<dc:date>2024-11-03</dc:date>
<dc:identifier>doi:10.1101/2024.11.01.621560</dc:identifier>
<dc:title><![CDATA[Site-specific DNA insertion into the human genome with engineered recombinases]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-11-03</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2023.08.09.552346v1?rss=1">
<title>
<![CDATA[
Environmental challenge rewires functional connections among human genes 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2023.08.09.552346v1?rss=1
</link>
<description><![CDATA[
A fundamental question in biology is how a limited number of genes combinatorially govern cellular responses to environmental changes. While the prevailing hypothesis is that relationships between genes, processes, and ontologies could be plastic to achieve this adaptability, quantitatively comparing human gene functional connections between specific environmental conditions at scale is very challenging. Therefore, it remains unclear whether and how human genetic interaction networks are rewired in response to changing environmental conditions. Here, we developed a framework for mapping context-specific genetic interactions, enabling us to measure the plasticity of human genetic architecture upon environmental challenge for ~250,000 interactions, using cell cycle interruption, genotoxic perturbation, and nutrient deprivation as archetypes. We discover large-scale rewiring of human gene relationships across conditions, highlighted by dramatic shifts in the functional connections of epigenetic regulators (TIP60), cell cycle regulators (PP2A), and glycolysis metabolism. Our study demonstrates that upon environmental perturbation, intra-complex genetic rewiring is rare while inter-complex rewiring is common, suggesting a modular and flexible evolutionary genetic strategy that allows a limited number of human genes to enable adaptation to a large number of environmental conditions.

One Sentence Summary: Five human genetic interaction maps reveal how the landscape of genes' functional relationships is rewired as cells experience environmental stress to DNA integrity, cell cycle regulation, and metabolism.
]]></description>
<dc:creator>Herken, B. W.</dc:creator>
<dc:creator>Wong, G.</dc:creator>
<dc:creator>Norman, T.</dc:creator>
<dc:creator>Gilbert, L.</dc:creator>
<dc:date>2023-08-09</dc:date>
<dc:identifier>doi:10.1101/2023.08.09.552346</dc:identifier>
<dc:title><![CDATA[Environmental challenge rewires functional connections among human genes]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2023-08-09</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.12.31.630783v1?rss=1">
<title>
<![CDATA[
A generative framework for enhanced cell-type specificity in rationally designed mRNAs 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.12.31.630783v1?rss=1
</link>
<description><![CDATA[
mRNA delivery offers new opportunities for disease treatment by directing cells to produce therapeutic proteins. However, designing highly stable mRNAs with programmable cell type-specificity remains a challenge. To address this, we measured the regulatory activity of 60,000 5' and 3' untranslated regions (UTRs) across six cell types and developed PARADE (Prediction And RAtional DEsign of mRNA UTRs), a generative AI framework to engineer untranslated RNA regions with tailored cell type-specific activity. We validated PARADE by testing 15,800 de novo-designed sequences across these cell lines and identified many sequences that demonstrated superior specificity and activity compared to existing RNA therapeutics. mRNAs with PARADE-engineered UTRs also exhibited robust tissue-specific activity in animal models, achieving selective expression in the liver and spleen. We also leveraged PARADE to enhance mRNA stability, significantly increasing protein output and therapeutic durability in vivo. These advancements translated to notable increases in therapeutic efficacy, as PARADE-designed UTRs in oncosuppressor mRNAs, namely PTEN and P16, effectively reduced tumor growth in patient-derived neuroglioma xenograft models and orthotopic mouse models. Collectively, these findings establish PARADE as a versatile platform for designing safer, more precise, and highly stable mRNA therapies.
]]></description>
<dc:creator>Khoroshkin, M. S.</dc:creator>
<dc:creator>Zinkevich, A.</dc:creator>
<dc:creator>Aristova, E.</dc:creator>
<dc:creator>Yousefi, H.</dc:creator>
<dc:creator>Lee, S. B.</dc:creator>
<dc:creator>Mittmann, T.</dc:creator>
<dc:creator>Manegold, K.</dc:creator>
<dc:creator>Penzar, D.</dc:creator>
<dc:creator>Raleigh, D. R.</dc:creator>
<dc:creator>Kulakovskiy, I. V.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:date>2024-12-31</dc:date>
<dc:identifier>doi:10.1101/2024.12.31.630783</dc:identifier>
<dc:title><![CDATA[A generative framework for enhanced cell-type specificity in rationally designed mRNAs]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-12-31</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.12.17.628962v1?rss=1">
<title>
<![CDATA[
Semantic mining of functional de novo genes from a genomic language model 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.12.17.628962v1?rss=1
</link>
<description><![CDATA[
Generative genomics models can design increasingly complex biological systems. However, effectively controlling these models to generate novel sequences with desired functions remains a major challenge. Here, we show that Evo, a 7-billion parameter genomic language model, can perform function-guided design that generalizes beyond natural sequences. By learning semantic relationships across multiple genes, Evo enables a genomic "autocomplete" in which a DNA prompt encoding a desired function instructs the model to generate novel DNA sequences that can be mined for similar functions. We term this process "semantic mining," which, unlike traditional genome mining, can access a sequence landscape unconstrained by discovered evolutionary innovation. We validate this approach by experimentally testing the activity of generated anti-CRISPR proteins and toxin-antitoxin systems, including de novo genes with no significant homology to any natural protein. Strikingly, in-context protein design with Evo achieves potent activity and high experimental success rates even in the absence of structural hypotheses, known evolutionary conservation, or task-specific fine-tuning. We then use Evo to autocomplete millions of prompts to produce SynGenome, a first-of-its-kind database containing over 120 billion base pairs of AI-generated genomic sequences that enables semantic mining across many possible functions. The semantic mining paradigm enables functional exploration that ventures beyond the observed evolutionary universe.
]]></description>
<dc:creator>Merchant, A. T.</dc:creator>
<dc:creator>King, S. H.</dc:creator>
<dc:creator>Nguyen, E.</dc:creator>
<dc:creator>Hie, B. L.</dc:creator>
<dc:date>2024-12-18</dc:date>
<dc:identifier>doi:10.1101/2024.12.17.628962</dc:identifier>
<dc:title><![CDATA[Semantic mining of functional de novo genes from a genomic language model]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-12-18</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.10.28.620683v1?rss=1">
<title>
<![CDATA[
A combinatorial domain screening platform reveals epigenetic effector interactions for transcriptional perturbation 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.10.28.620683v1?rss=1
</link>
<description><![CDATA[
Epigenetic regulation involves the coordinated interplay of diverse proteins. To systematically explore these combinations, we present COMBINE (combinatorial interaction exploration), a high-throughput platform that tests over 50,000 pairs of epigenetic effector domains up to 2,094 amino acids in length for their ability to modulate endogenous human gene transcription. COMBINE revealed diverse synergistic and antagonistic interactions between epigenetic protein domains, including a potent KRAB-L3MBTL3 fusion that enhanced gene silencing up to 34-fold in dose-limited conditions and enabled robust bidirectional CRISPR perturbation. Inducible screening showed DNA methylation modifiers are essential for epigenetic memory, with distinct combinations driving long-term silencing, repression, or activation. Notably, we identified TET1-based combinations that induce hit-and-run upregulation for over 50 days, demonstrating long-term transcriptional activation. This systematic analysis of pairwise domain interactions provides a rich resource for understanding epigenetic crosstalk and developing next-generation epigenome editing tools. More broadly, COMBINE offers a generalizable platform to functionally characterize combinatorial biological processes at scale.
]]></description>
<dc:creator>Moon, H. C.</dc:creator>
<dc:creator>Herschl, M. H.</dc:creator>
<dc:creator>Pawluk, A.</dc:creator>
<dc:creator>Konermann, S.</dc:creator>
<dc:creator>Hsu, P. D.</dc:creator>
<dc:date>2024-10-30</dc:date>
<dc:identifier>doi:10.1101/2024.10.28.620683</dc:identifier>
<dc:title><![CDATA[A combinatorial domain screening platform reveals epigenetic effector interactions for transcriptional perturbation]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-10-30</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.10.10.617568v1?rss=1">
<title>
<![CDATA[
A Suite of Foundation Models Captures the Contextual Interplay Between Codons 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.10.10.617568v1?rss=1
</link>
<description><![CDATA[
In the canonical genetic code, many amino acids are assigned more than one codon. Work by us and others has shown that the choice of these synonymous codons is not random, and carries regulatory and functional consequences. Existing protein foundation models ignore this context-dependent role of coding sequence in shaping the protein landscape of the cell. To address this gap, we introduce cdsFM, a suite of codon-resolution large language models, including both EnCodon and DeCodon models, with up to 1B parameters. Pre-trained on 60 million protein-coding sequences from more than 5,000 species, our models effectively learn the relationship between codons and amino acids, recapitulating the overall structure of the genetic code. In addition to outperforming state-of-the-art genomic foundation models in a variety of zero-shot and few-shot learning tasks, the larger pre-trained models were superior in predicting the choice of synonymous codons. To systematically assess the impact of synonymous codon choices on protein expression and our models' ability to capture these effects, we generated a large dataset measuring overall and surface expression levels of three proteins as a function of changes in their synonymous codons. We showed that our EnCodon models could be readily fine-tuned to predict the contextual consequences of synonymous codon choices. Armed with this knowledge, we applied EnCodon to existing clinical datasets of synonymous variants, and we identified a large number of synonymous codons that are likely pathogenic, several of which we experimentally confirmed in a cell-based model. Together, our findings establish the cdsFM suite as a powerful tool for decoding the complex functional grammar underlying the choice of synonymous codons.
]]></description>
<dc:creator>Naghipourfar, M.</dc:creator>
<dc:creator>Chen, S.</dc:creator>
<dc:creator>Howard, M.</dc:creator>
<dc:creator>Macdonald, C.</dc:creator>
<dc:creator>Saberi, A.</dc:creator>
<dc:creator>Hagen, T.</dc:creator>
<dc:creator>Mofrad, M.</dc:creator>
<dc:creator>Coyote-Maestas, W.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:date>2024-10-13</dc:date>
<dc:identifier>doi:10.1101/2024.10.10.617568</dc:identifier>
<dc:title><![CDATA[A Suite of Foundation Models Captures the Contextual Interplay Between Codons]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-10-13</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.08.26.609813v1?rss=1">
<title>
<![CDATA[
A long context RNA foundation model for predicting transcriptome architecture 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.08.26.609813v1?rss=1
</link>
<description><![CDATA[
Linking DNA sequence to genomic function remains one of the grand challenges in genetics and genomics. Here, we combine large-scale single-molecule transcriptome sequencing of diverse cancer cell lines with cutting-edge machine learning to build LoRNASH, an RNA foundation model that learns how the nucleotide sequence of unspliced pre-mRNA dictates transcriptome architecture--the relative abundances and molecular structures of mRNA isoforms. Owing to its use of the StripedHyena architecture, LoRNASH handles extremely long sequence inputs at base-pair resolution (~65 kilobase pairs), allowing for quantitative, zero-shot prediction of all aspects of transcriptome architecture, including isoform abundance, isoform structure, and the impact of DNA sequence variants on transcript structure and abundance. We anticipate that our public data release and the accompanying frontier model will accelerate many aspects of RNA biotechnology. More broadly, we envision the use of LoRNASH as a foundation for fine-tuning of any transcriptome-related downstream prediction task, including cell-type specific gene expression, splicing, and general RNA processing.
]]></description>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:creator>Najafabadi, H. S.</dc:creator>
<dc:creator>Ramani, V.</dc:creator>
<dc:creator>Emad, A.</dc:creator>
<dc:creator>Namini, A.</dc:creator>
<dc:creator>Naghipourfar, M.</dc:creator>
<dc:creator>Wang, S.</dc:creator>
<dc:creator>Choi, B.</dc:creator>
<dc:creator>Saberi, A.</dc:creator>
<dc:date>2024-08-27</dc:date>
<dc:identifier>doi:10.1101/2024.08.26.609813</dc:identifier>
<dc:title><![CDATA[A long context RNA foundation model for predicting transcriptome architecture]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-08-27</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.01.17.633622v1?rss=1">
<title>
<![CDATA[
Pervasive and programmed nucleosome distortion patterns on single mammalian chromatin fibers 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.01.17.633622v1?rss=1
</link>
<description><![CDATA[
We present a genome-scale method to map the single-molecule co-occupancy of structurally distinct nucleosomes, subnucleosomes, and other protein-DNA interactions via long-read high-resolution adenine methyltransferase footprinting. Iteratively Defined Lengths of Inaccessibility (IDLI) classifies nucleosomes on the basis of shared patterns of intranucleosomal accessibility, into: i.) minimally-accessible chromatosomes; ii.) octasomes with stereotyped DNA accessibility from superhelical locations (SHLs) ±1 through ±7; iii.) highly-accessible unwrapped nucleosomes; and iv.) subnucleosomal species, such as hexasomes, tetrasomes, and other short DNA protections. Applying IDLI to mouse embryonic stem cell (mESC) chromatin, we discover widespread nucleosomal distortion on individual mammalian chromatin fibers, with >85% of nucleosomes surveyed displaying degrees of intranucleosomally accessible DNA. We observe epigenomic-domain-specific patterns of distorted nucleosome co-occupancy and positioning, including at enhancers, promoters, and mouse satellite repeat sequences. Nucleosome distortion is programmed by the presence of bound transcription factors (TFs) at cognate motifs; occupied TF binding sites are differentially decorated by distorted nucleosomes compared to unbound sites, and degradation experiments establish direct roles for TFs in structuring binding-site proximal nucleosomes. Finally, we apply IDLI in the context of primary mouse hepatocytes, observing evidence for pervasive nucleosomal distortion in vivo. Further genetic experiments reveal a role for the hepatocyte master regulator FOXA2 in directly impacting nucleosome distortion at hepatocyte-specific regulatory elements in vivo. Our work suggests extreme--but regulated--plasticity in nucleosomal DNA accessibility at the single-molecule level.
Further, our study offers an essential new framework to model transcription factor binding, nucleosome remodeling, and cell-type specific gene regulation across biological contexts.
]]></description>
<dc:creator>Yang, M. G.</dc:creator>
<dc:creator>Richter, H. J.</dc:creator>
<dc:creator>Wang, S.</dc:creator>
<dc:creator>McNally, C. P.</dc:creator>
<dc:creator>Harris, N.</dc:creator>
<dc:creator>Dhillon, S.</dc:creator>
<dc:creator>Maresca, M.</dc:creator>
<dc:creator>de Wit, E.</dc:creator>
<dc:creator>Willenbring, H.</dc:creator>
<dc:creator>Maher, J.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:creator>Ramani, V.</dc:creator>
<dc:date>2025-01-22</dc:date>
<dc:identifier>doi:10.1101/2025.01.17.633622</dc:identifier>
<dc:title><![CDATA[Pervasive and programmed nucleosome distortion patterns on single mammalian chromatin fibers]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-01-22</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.12.13.628239v1?rss=1">
<title>
<![CDATA[
Multiplexed mosaic tumor models reveal natural phenotypic variations in drug response within and between populations 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.12.13.628239v1?rss=1
</link>
<description><![CDATA[
Many agents that show promise in preclinical cancer models lack efficacy in patients due to patient heterogeneity that is not captured in traditional assays. To address this problem, we have developed GENEVA, a platform that measures the molecular and phenotypic consequences of drug perturbations within diverse populations of cancer cells at single-cell resolution, both in vitro and in vivo. Here, we apply GENEVA to study the KRAS G12C inhibitors, recapitulating known properties of these drugs and uncovering a previously unknown role for mitochondrial activation in cell death induced by KRAS inhibition. We demonstrate that this finding can be leveraged for the development of combination therapies with greater efficacy. Finally, we show that the application of GENEVA with in vivo mouse models revealed epithelial to mesenchymal transition (EMT) as a key mechanism for resistance to KRAS G12C inhibition.
]]></description>
<dc:creator>Yu, J. X.</dc:creator>
<dc:creator>Suh, J. M.</dc:creator>
<dc:creator>Popova, K. D.</dc:creator>
<dc:creator>Garcia, K.</dc:creator>
<dc:creator>Joshi, T.</dc:creator>
<dc:creator>Culbertson, B.</dc:creator>
<dc:creator>Spinelli, J. B.</dc:creator>
<dc:creator>Subramanyam, V.</dc:creator>
<dc:creator>Lou, K.</dc:creator>
<dc:creator>Shokat, K. M.</dc:creator>
<dc:creator>Weissman, J.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:date>2024-12-16</dc:date>
<dc:identifier>doi:10.1101/2024.12.13.628239</dc:identifier>
<dc:title><![CDATA[Multiplexed mosaic tumor models reveal natural phenotypic variations in drug response within and between populations]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-12-16</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.02.10.637436v1?rss=1">
<title>
<![CDATA[
Dietary manipulation of intestinal microbes prolongs survival in a mouse model of Hirschsprung disease 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.02.10.637436v1?rss=1
</link>
<description><![CDATA[
Enterocolitis is a common and potentially deadly manifestation of Hirschsprung disease (HSCR) but disease mechanisms remain poorly defined. Unexpectedly, we discovered that diet can dramatically affect the lifespan of a HSCR mouse model (Piebald lethal, sl/sl) where affected animals die from HAEC complications. In the sl/sl model, diet alters gut microbes and metabolites, leading to changes in colon epithelial gene expression and epithelial oxygen levels known to influence colitis severity. Our findings demonstrate unrecognized similarity between HAEC and other types of colitis and suggest dietary manipulation could be a valuable therapeutic strategy for people with HSCR.

Abstract: Hirschsprung disease (HSCR) is a birth defect where enteric nervous system (ENS) is absent from distal bowel. Bowel lacking ENS fails to relax, causing partial obstruction. Affected children often have "Hirschsprung disease associated enterocolitis" (HAEC), which predisposes to sepsis. We discovered survival of Piebald lethal (sl/sl) mice, a well-established HSCR model with HAEC, is markedly altered by two distinct standard chow diets. A "Protective" diet increased fecal butyrate/isobutyrate and enhanced production of gut epithelial antimicrobial peptides in proximal colon. In contrast, "Detrimental" diet-fed sl/sl had abnormal appearing distal colon epithelium mitochondria, reduced epithelial mRNA involved in oxidative phosphorylation, and elevated epithelial oxygen that fostered growth of inflammation-associated Enterobacteriaceae. Accordingly, selective depletion of Enterobacteriaceae with sodium tungstate prolonged sl/sl survival. Our results provide the first strong evidence that diet modifies survival in a HSCR mouse model, without altering length of distal colon lacking ENS.

Highlights
- Two different standard mouse diets alter survival in the Piebald lethal (sl/sl) mouse model of Hirschsprung disease, without impacting extent of distal colon aganglionosis (the region lacking ENS).
- Piebald lethal mice fed the "Detrimental" diet had many changes in colon epithelial transcriptome including decreased mRNA for antimicrobial peptides and genes involved in oxidative phosphorylation. Detrimental diet fed sl/sl also had aberrant-appearing mitochondria in distal colon epithelium, with elevated epithelial oxygen that drives lethal Enterobacteriaceae overgrowth via aerobic respiration.
- Elimination of Enterobacteriaceae with antibiotics or sodium tungstate improves survival of Piebald lethal mice fed the "Detrimental" diet.

]]></description>
<dc:creator>Tjaden, N. E. B.</dc:creator>
<dc:creator>Liou, M. J.</dc:creator>
<dc:creator>Sax, S.</dc:creator>
<dc:creator>Lassoued, N.</dc:creator>
<dc:creator>Lou, M.</dc:creator>
<dc:creator>Schneider, S.</dc:creator>
<dc:creator>Beigel, K.</dc:creator>
<dc:creator>Eisenberg, J. D.</dc:creator>
<dc:creator>Loeffler, E.</dc:creator>
<dc:creator>Anderson, S. E.</dc:creator>
<dc:creator>Yan, G.</dc:creator>
<dc:creator>Litichevskiy, L.</dc:creator>
<dc:creator>Dohnalova, L.</dc:creator>
<dc:creator>Zhu, Y.</dc:creator>
<dc:creator>Jin, D. M. J. C.</dc:creator>
<dc:creator>Raab, J.</dc:creator>
<dc:creator>Furth, E. E.</dc:creator>
<dc:creator>Thompson, Z.</dc:creator>
<dc:creator>Rubenstein, R. C.</dc:creator>
<dc:creator>Pilon, N.</dc:creator>
<dc:creator>Thaiss, C. A.</dc:creator>
<dc:creator>Heuckeroth, R.</dc:creator>
<dc:date>2025-02-11</dc:date>
<dc:identifier>doi:10.1101/2025.02.10.637436</dc:identifier>
<dc:title><![CDATA[Dietary manipulation of intestinal microbes prolongs survival in a mouse model of Hirschsprung disease]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-02-11</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.02.05.636714v1?rss=1">
<title>
<![CDATA[
scGPT-spatial: Continual Pretraining of Single-Cell Foundation Model for Spatial Transcriptomics 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.02.05.636714v1?rss=1
</link>
<description><![CDATA[
Spatial transcriptomics has emerged as a pivotal technology for profiling gene expression of cells within their spatial context. The rapid growth of publicly available spatial data presents an opportunity to further our understanding of microenvironments that drive cell fate decisions and disease progression. However, existing foundation models, largely pretrained on single-cell RNA sequencing (scRNA-seq) data, fail to resolve the spatial relationships among samples or capture the unique distributions from various sequencing protocols. We introduce scGPT-spatial, a specialized foundation model for spatial transcriptomics continually pretrained on our previously published scGPT scRNA-seq foundation model. We also curate SpatialHuman30M, a comprehensive spatial transcriptomics dataset comprising 30 million spatial transcriptomic profiles, encompassing both imaging- and sequencing-based protocols. To facilitate integration, scGPT-spatial introduces a novel MoE (Mixture of Experts) decoder that adaptively routes samples for protocol-aware decoding of gene expression profiles. Moreover, scGPT-spatial employs a spatially-aware sampling strategy and a novel neighborhood-based training objective to better capture spatial co-localization patterns among cell states within tissue. Empirical evaluations demonstrate that scGPT-spatial robustly integrates spatial data in multi-slide and multi-modal settings, and effectively supports cell-type deconvolution and contextualized missing gene expression imputation, outperforming many existing methods. The scGPT-spatial codebase is publicly available at https://github.com/bowang-lab/scGPT-spatial.
]]></description>
<dc:creator>Wang, C. X.</dc:creator>
<dc:creator>Cui, H.</dc:creator>
<dc:creator>Zhang, A. H.</dc:creator>
<dc:creator>Xie, R.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:creator>Wang, B.</dc:creator>
<dc:date>2025-02-08</dc:date>
<dc:identifier>doi:10.1101/2025.02.05.636714</dc:identifier>
<dc:title><![CDATA[scGPT-spatial: Continual Pretraining of Single-Cell Foundation Model for Spatial Transcriptomics]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-02-08</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.02.01.636038v1?rss=1">
<title>
<![CDATA[
Learning single-cell spatial context through integrated spatial multiomics with CORAL 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.02.01.636038v1?rss=1
</link>
<description><![CDATA[
Cellular organization is central to tissue function and homeostasis, influencing development, disease progression, and therapeutic outcomes. The emergence of spatial omics technologies, including spatial transcriptomics and proteomics, has enabled the integration of molecular and histological features within tissues. Analyzing these multimodal data presents unique challenges, including variable resolutions, imperfect tissue alignment, and limited or variable spatial coverage. To address these issues, we introduce CORAL, a probabilistic deep generative model that leverages graph attention mechanisms to learn expressive, integrated representations of multimodal spatial omics data. CORAL deconvolves low-resolution spatial data into high-resolution single-cell profiles and detects functional spatial domains. It also characterizes cell-cell interactions and elucidates disease-relevant spatial features. Validated on synthetic data and experimental datasets, including Stereo-CITE-seq data from mouse thymus, and paired CODEX and Visium data from hepatocellular carcinoma, CORAL demonstrates robustness and versatility. In hepatocellular carcinoma, CORAL uncovered key immune cell subsets that drive the failure of response to immunotherapy, highlighting its potential to advance spatial single-cell analyses and accelerate translational research.
]]></description>
<dc:creator>He, S.</dc:creator>
<dc:creator>Bieniosek, M.</dc:creator>
<dc:creator>Song, D.</dc:creator>
<dc:creator>Zhou, J.</dc:creator>
<dc:creator>Chidester, B.</dc:creator>
<dc:creator>Wu, Z.</dc:creator>
<dc:creator>Boen, J.</dc:creator>
<dc:creator>Sharma, P.</dc:creator>
<dc:creator>Trevino, A. E.</dc:creator>
<dc:creator>Zou, J.</dc:creator>
<dc:date>2025-02-06</dc:date>
<dc:identifier>doi:10.1101/2025.02.01.636038</dc:identifier>
<dc:title><![CDATA[Learning single-cell spatial context through integrated spatial multiomics with CORAL]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-02-06</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.03.19.585748v1?rss=1">
<title>
<![CDATA[
Systematic annotation of orphan RNAs reveals blood-accessible molecular barcodes of cancer identity and cancer-emergent oncogenic drivers 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.03.19.585748v1?rss=1
</link>
<description><![CDATA[
From extrachromosomal DNA to neo-peptides, the broad reprogramming of the cancer genome leads to the emergence of molecules that are specific to the cancer state. We recently described orphan non-coding RNAs (oncRNAs) as a class of cancer-specific small RNAs with the potential to play functional roles in breast cancer progression [1]. Here, we report a systematic and comprehensive search to identify, annotate, and characterize cancer-emergent oncRNAs across 32 tumor types. We also leverage large-scale in vivo genetic screens in xenografted mice to functionally identify driver oncRNAs in multiple tumor types. We have not only discovered a large repertoire of oncRNAs, but also found that their presence and absence represent a digital molecular barcode that faithfully captures the types and subtypes of cancer. Importantly, we discovered that this molecular barcode is partially accessible from the cell-free space as some oncRNAs are secreted by cancer cells. In a large retrospective study across 192 breast cancer patients, we showed that oncRNAs can be reliably detected in the blood and that changes in the cell-free oncRNA burden capture both short-term and long-term clinical outcomes upon completion of a neoadjuvant chemotherapy regimen. Together, our findings establish oncRNAs as an emergent class of cancer-specific non-coding RNAs with potential roles in tumor progression and clinical utility in liquid biopsies and disease monitoring.
]]></description>
<dc:creator>Wang, J.</dc:creator>
<dc:creator>Suh, J. M.</dc:creator>
<dc:creator>Woo, B. J.</dc:creator>
<dc:creator>Navickas, A.</dc:creator>
<dc:creator>Garcia, K.</dc:creator>
<dc:creator>Yin, K.</dc:creator>
<dc:creator>Fish, L.</dc:creator>
<dc:creator>Cavazos, T.</dc:creator>
<dc:creator>Hanisch, B.</dc:creator>
<dc:creator>Markett, D.</dc:creator>
<dc:creator>Yu, S.</dc:creator>
<dc:creator>Hirst, G.</dc:creator>
<dc:creator>Brown-Swigart, L.</dc:creator>
<dc:creator>Esserman, L. J.</dc:creator>
<dc:creator>van 't Veer, L. J.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:date>2024-03-21</dc:date>
<dc:identifier>doi:10.1101/2024.03.19.585748</dc:identifier>
<dc:title><![CDATA[Systematic annotation of orphan RNAs reveals blood-accessible molecular barcodes of cancer identity and cancer-emergent oncogenic drivers]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-03-21</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.02.18.638918v1?rss=1">
<title>
<![CDATA[
Genome modeling and design across all domains of life with Evo 2 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.02.18.638918v1?rss=1
</link>
<description><![CDATA[
All of life encodes information with DNA. While tools for sequencing, synthesis, and editing of genomic code have transformed biological research, intelligently composing new biological systems would also require a deep understanding of the immense complexity encoded by genomes. We introduce Evo 2, a biological foundation model trained on 9.3 trillion DNA base pairs from a highly curated genomic atlas spanning all domains of life. We train Evo 2 with 7B and 40B parameters to have an unprecedented 1 million token context window with single-nucleotide resolution. Evo 2 learns from DNA sequence alone to accurately predict the functional impacts of genetic variation--from noncoding pathogenic mutations to clinically significant BRCA1 variants--without task-specific finetuning. Applying mechanistic interpretability analyses, we reveal that Evo 2 autonomously learns a breadth of biological features, including exon-intron boundaries, transcription factor binding sites, protein structural elements, and prophage genomic regions. Beyond its predictive capabilities, Evo 2 generates mitochondrial, prokaryotic, and eukaryotic sequences at genome scale with greater naturalness and coherence than previous methods. Guiding Evo 2 via inference-time search enables controllable generation of epigenomic structure, for which we demonstrate the first inference-time scaling results in biology. We make Evo 2 fully open, including model parameters, training code, inference code, and the OpenGenome2 dataset, to accelerate the exploration and design of biological complexity.
]]></description>
<dc:creator>Brixi, G.</dc:creator>
<dc:creator>Durrant, M. G.</dc:creator>
<dc:creator>Ku, J.</dc:creator>
<dc:creator>Poli, M.</dc:creator>
<dc:creator>Brockman, G.</dc:creator>
<dc:creator>Chang, D.</dc:creator>
<dc:creator>Gonzalez, G. A.</dc:creator>
<dc:creator>King, S. H.</dc:creator>
<dc:creator>Li, D. B.</dc:creator>
<dc:creator>Merchant, A. T.</dc:creator>
<dc:creator>Naghipourfar, M.</dc:creator>
<dc:creator>Nguyen, E.</dc:creator>
<dc:creator>Ricci-Tam, C.</dc:creator>
<dc:creator>Romero, D. W.</dc:creator>
<dc:creator>Sun, G.</dc:creator>
<dc:creator>Taghibakshi, A.</dc:creator>
<dc:creator>Vorontsov, A.</dc:creator>
<dc:creator>Yang, B.</dc:creator>
<dc:creator>Deng, M.</dc:creator>
<dc:creator>Gorton, L.</dc:creator>
<dc:creator>Nguyen, N.</dc:creator>
<dc:creator>Wang, N. K.</dc:creator>
<dc:creator>Adams, E.</dc:creator>
<dc:creator>Baccus, S. A.</dc:creator>
<dc:creator>Dillmann, S.</dc:creator>
<dc:creator>Ermon, S.</dc:creator>
<dc:creator>Guo, D.</dc:creator>
<dc:creator>Ilango, R.</dc:creator>
<dc:creator>Janik, K.</dc:creator>
<dc:creator>Lu, A. X.</dc:creator>
<dc:creator>Mehta, R.</dc:creator>
<dc:creator>Mofrad, M. R. K.</dc:creator>
<dc:creator>Ng, M. Y.</dc:creator>
<dc:creator>Pannu, J.</dc:creator>
<dc:creator>Re, C.</dc:creator>
<dc:creator>Schmok, J. C.</dc:creator>
<dc:creator>St. John, J.</dc:creator>
<dc:creator>Sullivan, J.</dc:creator>
<dc:creator>Zhu, K.</dc:creator>
<dc:creator>Zynda, G.</dc:creator>
<dc:creator>Balsam, D.</dc:creator>
<dc:creator>Collison, P.</dc:creator>
<dc:creator>Costa, A. B.</dc:creator>
<dc:creator>Hernandez-Boussard, T.</dc:creator>
<dc:creator>Ho, E.</dc:creator>
<dc:creator>Liu, M.-Y.</dc:creator>
<dc:creator>McGrath, T.</dc:creator>
<dc:creator>P</dc:creator>
<dc:date>2025-02-21</dc:date>
<dc:identifier>doi:10.1101/2025.02.18.638918</dc:identifier>
<dc:title><![CDATA[Genome modeling and design across all domains of life with Evo 2]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-02-21</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.09.21.614248v1?rss=1">
<title>
<![CDATA[
Toxicity of extracellular cGAMP and its analogs to T cells is due to SLC7A1-mediated import 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.09.21.614248v1?rss=1
</link>
<description><![CDATA[
STING agonists are promising innate immune therapies and can synergize with adaptive immune checkpoint blockade therapies for cancer treatment, but their effectiveness is limited by their toxicity to activated T cells. An important class of STING agonists are analogs of the endogenous STING agonist, cGAMP, and while transporters for these small molecules are known in some cell types, how they enter and kill T cells remains unknown. Here, we identify the cationic amino acid transporter SLC7A1 as the dominant transporter of cGAMP and its analogs in activated primary mouse and human T cells. T cells upregulate this transporter upon activation and rapid proliferation to meet their high metabolic demand, but this comes at the cost of enabling increased transport and toxicity of cGAMP. To circumvent the essentiality of SLC7A1 to proliferating T cells, we found that the residues responsible for cGAMP transport are separate from the arginine binding pocket, allowing us to perturb cGAMP transport and STING activation-mediated killing without impacting arginine transport. These results suggest that SLC7A1 is a potential target for alleviating T cell toxicity associated with cGAMP and its analogs.
]]></description>
<dc:creator>Sudaryo, V.</dc:creator>
<dc:creator>R. Carvalho, D.</dc:creator>
<dc:creator>Lee, J. M.</dc:creator>
<dc:creator>Carozza, J. A.</dc:creator>
<dc:creator>Cao, X.</dc:creator>
<dc:creator>Cordova, A. F.</dc:creator>
<dc:creator>Li, L.</dc:creator>
<dc:date>2024-09-24</dc:date>
<dc:identifier>doi:10.1101/2024.09.21.614248</dc:identifier>
<dc:title><![CDATA[Toxicity of extracellular cGAMP and its analogs to T cells is due to SLC7A1-mediated import]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-09-24</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.02.20.639398v1?rss=1">
<title>
<![CDATA[
Tahoe-100M: A Giga-Scale Single-Cell Perturbation Atlas for Context-Dependent Gene Function and Cellular Modeling 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.02.20.639398v1?rss=1
</link>
<description><![CDATA[
Building predictive models of the cell requires systematically mapping how perturbations reshape each cell's state, function, and behavior. Here, we present Tahoe-100M, a giga-scale single-cell atlas of 100 million transcriptomic profiles measuring how each of 1,100 small-molecule perturbations impacts cells across 50 cancer cell lines. Our high-throughput Mosaic platform, composed of a highly diverse and optimally balanced "cell village", reduces batch effects and enables parallel profiling of thousands of conditions at single-cell resolution at an unprecedented scale.

As the largest single-cell dataset to date, Tahoe-100M enables artificial-intelligence (AI)-driven models to learn context-dependent functions, capturing fundamental principles of gene regulation and network dynamics. Although we leverage cancer models and pharmacological compounds to create this resource, Tahoe-100M is fundamentally designed as a broadly applicable perturbation atlas and supports deeper insights into cell biology across multiple tissues and contexts. By publicly releasing this atlas, we aim to accelerate the creation and development of robust AI frameworks for systems biology, ultimately improving our ability to predict and manipulate cellular behaviors across a wide range of applications.
]]></description>
<dc:creator>Zhang, J.</dc:creator>
<dc:creator>Ubas, A. A.</dc:creator>
<dc:creator>de Borja, R.</dc:creator>
<dc:creator>Svensson, V.</dc:creator>
<dc:creator>Thomas, N.</dc:creator>
<dc:creator>Thakar, N.</dc:creator>
<dc:creator>Lai, I.</dc:creator>
<dc:creator>Winters, A.</dc:creator>
<dc:creator>Khan, U.</dc:creator>
<dc:creator>Jones, M. G.</dc:creator>
<dc:creator>Tran, V.</dc:creator>
<dc:creator>Pangallo, J.</dc:creator>
<dc:creator>Papalexi, E.</dc:creator>
<dc:creator>Sapre, A.</dc:creator>
<dc:creator>Nguyen, H.</dc:creator>
<dc:creator>Sanderson, O.</dc:creator>
<dc:creator>Nigos, M.</dc:creator>
<dc:creator>Kaplan, O.</dc:creator>
<dc:creator>Schroeder, S.</dc:creator>
<dc:creator>Hariadi, B.</dc:creator>
<dc:creator>Marrujo, S.</dc:creator>
<dc:creator>Salvino, C. C. A.</dc:creator>
<dc:creator>Gallareta Olivares, G.</dc:creator>
<dc:creator>Koehler, R.</dc:creator>
<dc:creator>Geiss, G.</dc:creator>
<dc:creator>Rosenberg, A.</dc:creator>
<dc:creator>Roco, C.</dc:creator>
<dc:creator>Merico, D.</dc:creator>
<dc:creator>Alidoust, N.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:creator>Yu, J.</dc:creator>
<dc:date>2025-02-24</dc:date>
<dc:identifier>doi:10.1101/2025.02.20.639398</dc:identifier>
<dc:title><![CDATA[Tahoe-100M: A Giga-Scale Single-Cell Perturbation Atlas for Context-Dependent Gene Function and Cellular Modeling]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-02-24</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2024.10.28.620517v1?rss=1">
<title>
<![CDATA[
Combinatorial effector targeting (COMET) for transcriptional modulation and locus-specific biochemistry 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2024.10.28.620517v1?rss=1
</link>
<description><![CDATA[
Understanding how human gene expression is coordinately regulated by functional units of proteins across the genome remains a major biological goal. Here, we present COMET, a high-throughput screening platform for combinatorial effector targeting for the identification of transcriptional modulators. We generate libraries of combinatorial dCas9-based fusion proteins, containing two to six effector domains, allowing us to systematically investigate more than 110,000 combinations of effector proteins at endogenous human loci for their influence on transcription. Importantly, we keep full proteins or domains intact, maintaining catalytic cores and surfaces for protein-protein interactions. We observe more than 5800 significant hits that modulate transcription, we demonstrate cell-type-specific transcriptional modulation, and we further investigate epistatic relationships between our effector combinations. We validate unexpected combinations as synergistic or buffering, emphasizing COMET both as a method for transcriptional effector discovery and as a functional genomics tool for identifying novel domain interactions and directing locus-specific biochemistry.
]]></description>
<dc:creator>Wilson, C. M.</dc:creator>
<dc:creator>Pommier, G. C.</dc:creator>
<dc:creator>Richman, D. D.</dc:creator>
<dc:creator>Sambold, N.</dc:creator>
<dc:creator>Hussmann, J. A.</dc:creator>
<dc:creator>Weissman, J. S.</dc:creator>
<dc:creator>Gilbert, L. A.</dc:creator>
<dc:date>2024-10-28</dc:date>
<dc:identifier>doi:10.1101/2024.10.28.620517</dc:identifier>
<dc:title><![CDATA[Combinatorial effector targeting (COMET) for transcriptional modulation and locus-specific biochemistry]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2024-10-28</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.02.27.640494v1?rss=1">
<title>
<![CDATA[
scBaseCamp: An AI agent-curated, uniformly processed, and continually expanding single cell data repository 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.02.27.640494v1?rss=1
</link>
<description><![CDATA[
Single-cell RNA sequencing has transformed cell biology by enabling precise transcriptomic measurements of individual cells. The Sequence Read Archive (SRA) is the largest public repository of sequencing reads, yet much of it remains underutilized due to unstandardized metadata and the cost of processing reads. Here, we introduce scBaseCount, a single-cell RNA sequencing database that leverages an AI agent to automate discovery and metadata extraction, and standardize data processing. Built by directly mining all 10x Genomics datasets from SRA, scBaseCount is the largest freely accessible public repository of single-cell gene expression data, comprising over 502 million cells across 27 organisms and 75 tissues, offering an unbiased view of the composition of data within SRA. Uniform processing enables measurement of both intronic and exonic reads and of non-coding gene expression, and improves alignment across experiments as well as the performance of AI models trained on this phenotypically diverse data. Moreover, scBaseCount provides a blueprint for how AI can be leveraged to curate and autonomously update large biological data repositories.
]]></description>
<dc:creator>Youngblut, N. D.</dc:creator>
<dc:creator>Carpenter, C.</dc:creator>
<dc:creator>Prashar, J.</dc:creator>
<dc:creator>Ricci-Tam, C.</dc:creator>
<dc:creator>Ilango, R.</dc:creator>
<dc:creator>Teyssier, N.</dc:creator>
<dc:creator>Konermann, S.</dc:creator>
<dc:creator>Hsu, P.</dc:creator>
<dc:creator>Dobin, A.</dc:creator>
<dc:creator>Burke, D. P.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:creator>Roohani, Y. H.</dc:creator>
<dc:date>2025-03-04</dc:date>
<dc:identifier>doi:10.1101/2025.02.27.640494</dc:identifier>
<dc:title><![CDATA[scBaseCamp: An AI agent-curated, uniformly processed, and continually expanding single cell data repository]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-03-04</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.05.14.653916v1?rss=1">
<title>
<![CDATA[
Megabase-scale human genome rearrangement with programmable bridge recombinases 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.05.14.653916v1?rss=1
</link>
<description><![CDATA[
Bridge recombinases are a class of naturally occurring RNA-guided DNA recombinases. We previously demonstrated they can programmably insert, excise, and invert DNA in vitro and in bacteria. Here, we report the discovery and engineering of IS622, a simple two-component system capable of universal DNA rearrangements of the human genome. We define strategies for the optimal application of bridge systems, leveraging mechanistic insights to improve their targeting specificity. Through rational engineering of the IS622 bridge RNA and deep mutational scanning of its recombinase, we achieve up to 20% insertion efficiency into the human genome and genome-wide specificity as high as 82%. We further demonstrate intra-chromosomal inversion and excision, mobilizing up to 0.93 megabases of DNA. Finally, we provide proof-of-concept for excision of a gene regulatory region or expanded repeats relevant for the treatment of genetic diseases.
]]></description>
<dc:creator>Perry, N. T.</dc:creator>
<dc:creator>Bartie, L. J.</dc:creator>
<dc:creator>Katrekar, D.</dc:creator>
<dc:creator>Gonzalez, G. A.</dc:creator>
<dc:creator>Durrant, M. G.</dc:creator>
<dc:creator>Pai, J. J.</dc:creator>
<dc:creator>Fanton, A.</dc:creator>
<dc:creator>Hiraizumi, M.</dc:creator>
<dc:creator>Ricci-Tam, C.</dc:creator>
<dc:creator>Nishimasu, H.</dc:creator>
<dc:creator>Konermann, S.</dc:creator>
<dc:creator>Hsu, P. D.</dc:creator>
<dc:date>2025-05-14</dc:date>
<dc:identifier>doi:10.1101/2025.05.14.653916</dc:identifier>
<dc:title><![CDATA[Megabase-scale human genome rearrangement with programmable bridge recombinases]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-05-14</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.04.08.647863v1?rss=1">
<title>
<![CDATA[
BINSEQ: A Family of High-Performance Binary Formats for Nucleotide Sequences 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.04.08.647863v1?rss=1
</link>
<description><![CDATA[
Abstract: Modern genomics produces billions of sequencing records per run, which are typically stored as gzip-compressed FASTQ files. While this format is widely used, it is not optimal for high-throughput processing due to its reliance on single-threaded decompression and sequential parsing of irregularly sized records. This limitation is particularly problematic for applications that would benefit from parallel processing, such as read mapping, variant calling, and de novo assembly. Here, we present BINSEQ, a family of simple binary formats that enable high-throughput parallel processing of sequencing data. The BINSEQ family consists of two complementary implementations: BQ, optimized for fixed-length reads using a two-bit or four-bit encoding scheme with true random record access capability, and VBQ, designed for variable-length sequences with optional quality scores and block-based compression. We demonstrate that BINSEQ files are up to 90x faster than compressed FASTQ for parallel processing and can reduce analysis time from hours to minutes for large-scale genome and transcriptome analyses, particularly for resource-intensive applications like alignment, mapping, and de novo assembly. To facilitate adoption we provide high-performance libraries for reading and writing BINSEQ formats, native parallelization strategies with convenient APIs, and a command-line tool for conversion to and from traditional formats.

Author Summary: Modern sequencing technologies routinely generate billions of reads per experiment, yet the methods for storing and accessing this data have not kept pace. Sequencing reads remain predominantly stored in FASTQ, a text-based format designed for far smaller datasets. FASTQ's sequential parsing requirements and practical need for compression create a fundamental mismatch with modern multi-core architectures, where data access rather than computation has become the primary bottleneck. We address this problem with BINSEQ, a family of binary formats engineered for random access and native parallelization. Systematic benchmarking across applications of varying computational complexity demonstrates that BINSEQ achieves 90-fold improvements in data access and maintains substantial advantages in compute-intensive tasks such as genome alignment, reducing runtimes from hours to minutes. We present two complementary implementations: BQ, optimized for simplicity and maximal throughput, and VBQ, designed for flexibility while maintaining high performance. By reconsidering the relationship between storage architecture and parallel processing capabilities, BINSEQ provides a practical solution to a critical infrastructure challenge in high-throughput genomics.
]]></description>
<dc:creator>Teyssier, N.</dc:creator>
<dc:creator>Dobin, A.</dc:creator>
<dc:date>2025-04-15</dc:date>
<dc:identifier>doi:10.1101/2025.04.08.647863</dc:identifier>
<dc:title><![CDATA[BINSEQ: A Family of High-Performance Binary Formats for Nucleotide Sequences]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-04-15</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.04.26.650725v1?rss=1">
<title>
<![CDATA[
Discovery of a tRNA-regulatory transcription factor that suppresses breast cancer metastasis 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.04.26.650725v1?rss=1
</link>
<description><![CDATA[
Transfer RNAs (tRNAs), once viewed as static adaptors in translation, are now recognized as dynamic regulators of gene expression. While recent studies have illuminated roles for tRNA stability, the upstream mechanisms governing tRNA transcription remain poorly understood. To address this gap, we generated the EXpression atlas of tRNA (EXTRNA), a high-resolution tRNA expression dataset spanning 24 cell lines across 9 human tissues. EXTRNA revealed both tissue-type-specific expression programs ("tRNAomes") and unexpected intra-tissue heterogeneity across breast cancer samples. Integrating EXTRNA with computational network analysis and data from other publicly available datasets, we identified Zinc Finger ZZ-Type And EF-Hand Domain Containing 1 (ZZEF1) as the first sequence-specific transcription factor of a particular tRNA. ZZEF1 promoted tRNA-LysUUU transcription by partnering with the ATP-dependent chromatin remodeler Chromodomain Helicase DNA Binding Protein 6 (CHD6), enhancing chromatin accessibility at tRNA-Lys-TTT-3 loci. ZZEF1 deficiency reduced tRNA-LysUUU abundance, decreased the translational efficiency of AAR codon-enriched mRNAs, including the tumor suppressor Serine/Threonine Kinase 3 (STK3), and promoted metastatic progression in breast cancer in vivo. Together, our findings establish a previously unrecognized mechanism for RNA polymerase III-mediated tRNA transcription and define a regulatory circuit linking chromatin remodeling, codon-specific translation, and tumor suppression. More broadly, this work introduces a framework for dissecting the regulatory logic of the tRNAome and highlights tRNA expression control as a promising avenue for therapeutic intervention.
]]></description>
<dc:creator>Chen, S.</dc:creator>
<dc:creator>Markett, D.</dc:creator>
<dc:creator>Karimzadeh, M.</dc:creator>
<dc:creator>Luo, Y.</dc:creator>
<dc:creator>Khoroshkin, M. S.</dc:creator>
<dc:creator>Boyraz, B.</dc:creator>
<dc:creator>Lee, S.</dc:creator>
<dc:creator>Carpenter, C.</dc:creator>
<dc:creator>Nguyen, P.</dc:creator>
<dc:creator>Garcia, K.</dc:creator>
<dc:creator>Joshi, T.</dc:creator>
<dc:creator>Martin, C.</dc:creator>
<dc:creator>Hanisch, B.</dc:creator>
<dc:creator>Molina, H.</dc:creator>
<dc:creator>Tavazoie, S.</dc:creator>
<dc:creator>Ramani, V.</dc:creator>
<dc:creator>Navickas, A.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:date>2025-04-29</dc:date>
<dc:identifier>doi:10.1101/2025.04.26.650725</dc:identifier>
<dc:title><![CDATA[Discovery of a tRNA-regulatory transcription factor that suppresses breast cancer metastasis]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-04-29</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.04.24.650365v1?rss=1">
<title>
<![CDATA[
Red Blood Cells Serve as a Primary Glucose Sink to Improve Glucose Tolerance at Altitude 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.04.24.650365v1?rss=1
</link>
<description><![CDATA[
High altitude conditions result in improved glucose tolerance and lower diabetes risk across species, yet the underlying physiological mechanism remains unclear. Using mouse models, we found that hypoxia alone robustly improved glucose tolerance, independent of insulin sensitivity. This effect persisted for weeks after mice returned to normoxia. PET-CT imaging revealed that internal organs explained only a small fraction of increased glucose uptake in hypoxia, suggesting the presence of an unknown glucose sink. We hypothesized that increased glucose tolerance might be linked to the hypoxia-induced increase in red blood cells (RBCs), whose metabolism relies entirely on glucose. Experimental manipulation of RBC numbers through phlebotomy or transfusion directly altered blood glucose levels, demonstrating the necessity and sufficiency of RBCs as primary glucose sinks in hypoxia. Moreover, RBCs produced during systemic hypoxia exhibited a sustained ~3-fold increase in glucose uptake, rapidly synthesizing the hemoglobin allosteric regulator 2,3-DPG that allows for increased oxygen release in hypoxia. Therapeutically, we demonstrated that both chronic hypoxia and our recently developed pharmacological hypoxia mimetic, HypoxyStat, effectively rescued hyperglycemia in mouse models of type 1 and type 2 diabetes. Our findings identify RBCs as critical regulators of systemic glucose metabolism under hypoxic conditions, illuminating a conserved physiological adaptation and suggesting novel therapeutic avenues for hyperglycemic disorders.

[Graphical abstract: Figure 1]
]]></description>
<dc:creator>Marti-Mateos, Y.</dc:creator>
<dc:creator>Midha, A. D.</dc:creator>
<dc:creator>Huynh, H.</dc:creator>
<dc:creator>Flanigan, W. R.</dc:creator>
<dc:creator>Blume, S. Y.</dc:creator>
<dc:creator>Jain, I. H.</dc:creator>
<dc:date>2025-04-26</dc:date>
<dc:identifier>doi:10.1101/2025.04.24.650365</dc:identifier>
<dc:title><![CDATA[Red Blood Cells Serve as a Primary Glucose Sink to Improve Glucose Tolerance at Altitude]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-04-26</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.06.26.661135v1?rss=1">
<title>
<![CDATA[
Predicting cellular responses to perturbation across diverse contexts with STATE 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.06.26.661135v1?rss=1
</link>
<description><![CDATA[
Cellular responses to perturbations are a cornerstone for understanding biological mechanisms and selecting drug targets. While machine learning models offer tremendous potential for predicting perturbation effects, they currently struggle to generalize to unobserved cellular contexts. Here, we introduce STATE, a transformer model that predicts perturbation effects while accounting for cellular heterogeneity within and across experiments. STATE predicts perturbation effects across sets of cells and is trained using gene expression data from over 100 million perturbed cells. STATE improved discrimination of effects on large datasets by more than 30% and identified differentially expressed genes across genetic, signaling and chemical perturbations with significantly improved accuracy. Using its cell embedding trained on observational data from 167 million cells, STATE identified strong perturbations in novel cellular contexts where no perturbations were observed during training. We further introduce Cell-Eval, a comprehensive evaluation framework that highlights STATE's ability to detect cell type-specific perturbation responses, such as cell survival. Overall, the performance and flexibility of STATE set the stage for scaling the development of virtual cell models.
]]></description>
<dc:creator>Adduri, A.</dc:creator>
<dc:creator>Gautam, D.</dc:creator>
<dc:creator>Bevilacqua, B.</dc:creator>
<dc:creator>Imran, A.</dc:creator>
<dc:creator>Shah, R.</dc:creator>
<dc:creator>Naghipourfar, M.</dc:creator>
<dc:creator>Teyssier, N.</dc:creator>
<dc:creator>Ilango, R.</dc:creator>
<dc:creator>Nagaraj, S.</dc:creator>
<dc:creator>Ricci-Tam, C.</dc:creator>
<dc:creator>Carpenter, C.</dc:creator>
<dc:creator>Subramanyam, V.</dc:creator>
<dc:creator>Winters, A.</dc:creator>
<dc:creator>Dong, M.</dc:creator>
<dc:creator>Tirukkovalur, S.</dc:creator>
<dc:creator>Sullivan, J.</dc:creator>
<dc:creator>Plosky, B.</dc:creator>
<dc:creator>Eraslan, B.</dc:creator>
<dc:creator>Youngblut, N. D.</dc:creator>
<dc:creator>Leskovec, J.</dc:creator>
<dc:creator>Gilbert, L. A.</dc:creator>
<dc:creator>Konermann, S.</dc:creator>
<dc:creator>Hsu, P. D.</dc:creator>
<dc:creator>Dobin, A.</dc:creator>
<dc:creator>Burke, D. P.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:creator>Roohani, Y. H.</dc:creator>
<dc:date>2025-06-27</dc:date>
<dc:identifier>doi:10.1101/2025.06.26.661135</dc:identifier>
<dc:title><![CDATA[Predicting cellular responses to perturbation across diverse contexts with STATE]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-06-27</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.09.12.675911v1?rss=1">
<title>
<![CDATA[
Generative design of novel bacteriophages with genome language models 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.09.12.675911v1?rss=1
</link>
<description><![CDATA[
Many important biological functions arise not from single genes, but from complex interactions encoded by entire genomes. Genome language models have emerged as a promising strategy for designing biological systems, but their ability to generate functional sequences at the scale of whole genomes has remained untested. Here, we report the first generative design of viable bacteriophage genomes. We leveraged frontier genome language models, Evo 1 and Evo 2, to generate whole-genome sequences with realistic genetic architectures and desirable host tropism, using the lytic phage ΦX174 as our design template. Experimental testing of AI-generated genomes yielded 16 viable phages with substantial evolutionary novelty. Cryo-electron microscopy revealed that one of the generated phages utilizes an evolutionarily distant DNA packaging protein within its capsid. Multiple phages demonstrate higher fitness than ΦX174 in growth competitions and in their lysis kinetics. A cocktail of the generated phages rapidly overcomes ΦX174-resistance in three E. coli strains, demonstrating the potential utility of our approach for designing phage therapies against rapidly evolving bacterial pathogens. This work provides a blueprint for the design of diverse synthetic bacteriophages and, more broadly, lays a foundation for the generative design of useful living systems at the genome scale.
]]></description>
<dc:creator>King, S. H.</dc:creator>
<dc:creator>Driscoll, C. L.</dc:creator>
<dc:creator>Li, D. B.</dc:creator>
<dc:creator>Guo, D.</dc:creator>
<dc:creator>Merchant, A. T.</dc:creator>
<dc:creator>Brixi, G.</dc:creator>
<dc:creator>Wilkinson, M. E.</dc:creator>
<dc:creator>Hie, B. L.</dc:creator>
<dc:date>2025-09-17</dc:date>
<dc:identifier>doi:10.1101/2025.09.12.675911</dc:identifier>
<dc:title><![CDATA[Generative design of novel bacteriophages with genome language models]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-09-17</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.09.19.677421v1?rss=1">
<title>
<![CDATA[
Efficient generation of epitope-targeted de novo antibodies with Germinal 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.09.19.677421v1?rss=1
</link>
<description><![CDATA[
Obtaining novel antibodies against specific protein targets is a widely important yet experimentally laborious process. Meanwhile, computational methods for antibody design have been limited by low success rates that currently require resource-intensive screening. Here, we introduce Germinal, a broadly enabling generative pipeline that designs antibodies against specific epitopes with nanomolar binding affinities while requiring only low-n experimental testing. Our method co-optimizes antibody structure and sequence by integrating a structure predictor with an antibody-specific protein language model to perform de novo design of functional complementarity-determining regions (CDRs) onto a user-specified structural framework. When tested against four diverse protein targets, Germinal successfully designed functional antibodies across all targets and binder formats, testing only 43-101 designs for each antigen. Validated designs also exhibited robust expression in mammalian cells and high sequence and structural novelty. We provide open-source code and full computational and experimental protocols to facilitate wide adoption. Germinal represents a milestone in efficient, epitope-targeted de novo antibody design, with notable implications for the development of molecular tools and therapeutics.
]]></description>
<dc:creator>Mille-Fragoso, L. S.</dc:creator>
<dc:creator>Wang, J. N.</dc:creator>
<dc:creator>Driscoll, C. L.</dc:creator>
<dc:creator>Dai, H.</dc:creator>
<dc:creator>Widatalla, T. M.</dc:creator>
<dc:creator>Zhang, X.</dc:creator>
<dc:creator>Hie, B. L.</dc:creator>
<dc:creator>Gao, X. J.</dc:creator>
<dc:date>2025-09-24</dc:date>
<dc:identifier>doi:10.1101/2025.09.19.677421</dc:identifier>
<dc:title><![CDATA[Efficient generation of epitope-targeted de novo antibodies with Germinal]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-09-24</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2025.09.21.676939v1?rss=1">
<title>
<![CDATA[
An image-based CRISPR screen reveals splicing-mediated control of HP1α condensates 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2025.09.21.676939v1?rss=1
</link>
<description><![CDATA[
Heterochromatin Protein 1 (HP1) is a fundamental component of constitutive heterochromatin, forming subnuclear condensates whose regulation and function remain poorly understood. Here, we present an image-based CRISPR screen targeting nuclear factors that identifies splicing as a pivotal pathway regulating HP1 condensates. We discovered that unspliced intronic RNA modulates HP1 condensates by interacting co-transcriptionally with HP1. By modulating the intron content, RNA processing restricts HP1-RNA interactions at chromatin, thus enabling heterochromatin organization. Disruption of HP1 condensates due to enhanced interactions with unspliced RNA leads to loss of heterochromatin and the activation of stress response protective genes. We propose that RNA is a central component of heterochromatin that modulates HP1 condensates, and that RNA processing enzymes act as a surveillance mechanism for condensates by dynamically regulating the network of multi-valent interactions between RNA and chromatin factors. This model underscores the crosstalk between chromatin organization, transcription, and RNA processing, potentially governing broader nuclear functions.
]]></description>
<dc:creator>Wong, M. M.-K.</dc:creator>
<dc:creator>Zhou, S.</dc:creator>
<dc:creator>Carpenter, C.</dc:creator>
<dc:creator>Valbuena, R.</dc:creator>
<dc:creator>Priyadarshini, M.</dc:creator>
<dc:creator>Arya, A.</dc:creator>
<dc:creator>Rizvi, A.</dc:creator>
<dc:creator>Carswell-Crumpton, C.</dc:creator>
<dc:creator>Wileveau, A.</dc:creator>
<dc:creator>Lopez-Lopez, G.</dc:creator>
<dc:creator>Tycko, J.</dc:creator>
<dc:creator>Yao, D.</dc:creator>
<dc:creator>Spees, K.</dc:creator>
<dc:creator>Maynard, J.</dc:creator>
<dc:creator>Bassik, M. C.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:creator>Sanulli, S.</dc:creator>
<dc:date>2025-09-21</dc:date>
<dc:identifier>doi:10.1101/2025.09.21.676939</dc:identifier>
<dc:title><![CDATA[An image-based CRISPR screen reveals splicing-mediated control of HP1α condensates]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2025-09-21</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2026.01.09.698608v1?rss=1">
<title>
<![CDATA[
Stack: In-Context Learning of Single-Cell Biology 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2026.01.09.698608v1?rss=1
</link>
<description><![CDATA[
Single-cell transcriptomics offers the promise of measuring the diversity of cellular phenotypes across species, diseases, and other biological conditions. Recently, foundation models have emerged to identify this variation, yet most methods represent each cell independently, despite technical limitations that reduce measurement precision at the single-cell level. Here, we present Stack, a foundation model trained on 149 million uniformly preprocessed human single cells that leverages tabular attention to generate representations for each cell informed by the cells in its context. Stack offers substantial improvements for downstream tasks in the zero-shot setting compared to baselines, whether they are zero-shot, fine-tuned, or trained from scratch on the target dataset. Stack can perform in-context learning from unlabeled cells representing arbitrary conditions, such as a chemical perturbation or a different donor, and predict the effect of those conditions on a target cell population without requiring data-specific fine-tuning. We apply Stack to generate Perturb Sapiens, the first human whole-organism atlas of perturbed cells, spanning 28 tissues, 40 cell classes, and 201 perturbations. We validated subsets of Perturb Sapiens using in vitro stimulation profiles. Overall, Stack presents a new modeling framework where cells themselves act as guiding examples at inference time, unlocking general-purpose in-context learning capabilities for single-cell biology.
]]></description>
<dc:creator>Dong, M.</dc:creator>
<dc:creator>Adduri, A.</dc:creator>
<dc:creator>Gautam, D.</dc:creator>
<dc:creator>Carpenter, C.</dc:creator>
<dc:creator>Shah, R.</dc:creator>
<dc:creator>Ricci-Tam, C.</dc:creator>
<dc:creator>Kluger, Y.</dc:creator>
<dc:creator>Burke, D. P.</dc:creator>
<dc:creator>Roohani, Y. H.</dc:creator>
<dc:date>2026-01-09</dc:date>
<dc:identifier>doi:10.64898/2026.01.09.698608</dc:identifier>
<dc:title><![CDATA[Stack: In-Context Learning of Single-Cell Biology]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2026-01-09</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2026.01.21.700936v1?rss=1">
<title>
<![CDATA[
cyto: ultra high-throughput processing of 10x-flex single cell sequencing 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2026.01.21.700936v1?rss=1
</link>
<description><![CDATA[
Single-cell genomics is rapidly scaling toward billion-cell atlases, but computational analysis has become a critical bottleneck. Processing multiplexed datasets with existing tools requires substantial computational resources and runtime that become prohibitive at scale. Here we present cyto, an ultra high-throughput processor for 10x Genomics Flex single-cell sequencing optimized for production-scale analysis. cyto exploits the fixed sequence geometry of Flex libraries through direct k-mer lookup rather than alignment-based mapping, and introduces IBU (Indexed-Barcode-UMI), a compact binary format for efficient read processing. cyto further leverages BINSEQ, a binary sequencing format that enables highly parallel parsing and overcomes the single-threaded limitations of gzip compression. On a benchmark 320,000-cell multiplexed dataset, cyto completes processing in 13 minutes compared to CellRanger's 3.7 hours, a 16.5-fold speedup, while requiring 2.4-fold less memory and performing 5.6-fold less disk I/O. The 31.7-fold reduction in CPU-hours represents true algorithmic efficiency rather than parallelization alone. Critically, cyto maintains 99.85% concordance with CellRanger outputs, with identical cell type clustering in dimensionality reduction analyses. These performance improvements enable cost-effective processing on smaller cloud instances and make previously prohibitive experiments computationally feasible. cyto is production-ready, open-source software that provides the computational foundation for atlas-scale single-cell projects and genome-wide perturbation screens.
]]></description>
<dc:creator>Teyssier, N.</dc:creator>
<dc:creator>Dobin, A.</dc:creator>
<dc:date>2026-01-22</dc:date>
<dc:identifier>doi:10.64898/2026.01.21.700936</dc:identifier>
<dc:title><![CDATA[cyto: ultra high-throughput processing of 10x-flex single cell sequencing]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2026-01-22</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2026.01.26.701810v1?rss=1">
<title>
<![CDATA[
Genome-wide CRISPRi screen identifies basigin loss as protective in cardiac hypoxia 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2026.01.26.701810v1?rss=1
</link>
<description><![CDATA[
Cardiac function depends on continuous oxidative metabolism, rendering cardiomyocytes highly vulnerable to oxygen deprivation. Here, we performed a genome-wide CRISPR interference (CRISPRi) screen in human iPSC-derived cardiomyocytes to identify genes that modulate survival during chronic hypoxia. This screen revealed that knockdown of basigin (BSG), a chaperone for the monocarboxylate transporters MCT1 and MCT4, confers robust protection. Canonically, hypoxic cells suppress pyruvate dehydrogenase (PDH) activity to reduce the oxidation of major fuel sources, thereby limiting TCA cycle flux, lowering oxygen consumption, and minimizing reactive oxygen species generated by an overly reduced electron transport chain (ETC). In contrast, we found that BSG inhibition reverses this response, prioritizing ATP maintenance during hypoxia and enhancing cardiomyocyte survival. Mechanistically, BSG loss restricts lactate efflux, leading to decreased PDH phosphorylation and increased glucose uptake for oxidation. Consistent with this, ETC subunits are more essential under hypoxia, highlighting cardiomyocytes' unusual reliance on aerobic ATP production even when oxygen is limited. These findings challenge prevailing models of hypoxic adaptation by revealing cardiomyocyte-specific bioenergetic requirements and motivating future therapeutic efforts.
]]></description>
<dc:creator>Flanigan, W. R.</dc:creator>
<dc:creator>Midha, A. D.</dc:creator>
<dc:creator>Blume, S. Y.</dc:creator>
<dc:creator>Marti-Mateos, Y.</dc:creator>
<dc:creator>Costa, M. W.</dc:creator>
<dc:creator>Huang, Y.</dc:creator>
<dc:creator>Baik, A. H.</dc:creator>
<dc:creator>Huynh, H.</dc:creator>
<dc:creator>Susarla, G.</dc:creator>
<dc:creator>Bennett, N. K.</dc:creator>
<dc:creator>Nowak, R. A.</dc:creator>
<dc:creator>Srivastava, D.</dc:creator>
<dc:creator>Nakamura, K.</dc:creator>
<dc:creator>Jain, I. H.</dc:creator>
<dc:date>2026-01-27</dc:date>
<dc:identifier>doi:10.64898/2026.01.26.701810</dc:identifier>
<dc:title><![CDATA[Genome-wide CRISPRi screen identifies basigin loss as protective in cardiac hypoxia]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2026-01-27</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2026.02.01.703132v1?rss=1">
<title>
<![CDATA[
Structure-guided design of a targeted autoantibody degrader for neurologic disease 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2026.02.01.703132v1?rss=1
</link>
<description><![CDATA[
Despite rapid progress in the diagnosis of autoantibody-mediated neurologic diseases, standard-of-care therapeutic options remain limited to nonspecific immunosuppression. Here, we report an alternative therapeutic strategy using targeted protein degradation to eliminate pathogenic autoantibodies while leaving the rest of the immune system intact. We previously discovered autoimmune vitamin B12 central deficiency (ABCD), a neurologic condition in which autoantibodies targeting the transcobalamin receptor (CD320) impair the transport of cobalamin (B12) from the blood into the central nervous system (CNS). Combining scanning alanine mutagenesis by phage display, cryo-electron microscopy, and computational modeling, we elucidated a highly conserved anti-CD320 epitope and defined the structural determinants of antigen-autoantibody binding. Next, we synthesized a lysosome-targeting chimera (LYTAC) comprising the lysosome targeting glycan, triGalNAc, fused to the antigenic epitope of CD320 as autoantibody bait. In vitro, this LYTAC promoted the specific lysosomal internalization and extracellular clearance of anti-CD320, restoring homeostatic cellular uptake of B12. In a passive transfer mouse model of ABCD, LYTAC treatment rapidly cleared anti-CD320 from circulation and prevented penetration of anti-CD320 into the CNS. These findings uncover the mechanism of autoantibody-antigen binding in ABCD and demonstrate targeted autoantibody degradation as a therapeutic strategy that may be generalizable to other autoimmune neurologic diseases.
]]></description>
<dc:creator>Zimanyi, M.</dc:creator>
<dc:creator>Dayao, M.</dc:creator>
<dc:creator>Asencor, A. I.</dc:creator>
<dc:creator>Kondapavulur, S.</dc:creator>
<dc:creator>Asaki, J.</dc:creator>
<dc:creator>McCutcheon, K.</dc:creator>
<dc:creator>Dabaco, C.</dc:creator>
<dc:creator>Bodansky, A.</dc:creator>
<dc:creator>Craik, C.</dc:creator>
<dc:creator>Pleasure, S.</dc:creator>
<dc:creator>DeRisi, J. L.</dc:creator>
<dc:creator>Cheng, Y.</dc:creator>
<dc:creator>Wilson, M.</dc:creator>
<dc:creator>Pluvinage, J. V.</dc:creator>
<dc:date>2026-02-03</dc:date>
<dc:identifier>doi:10.64898/2026.02.01.703132</dc:identifier>
<dc:title><![CDATA[Structure-guided design of a targeted autoantibody degrader for neurologic disease]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2026-02-03</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2026.02.09.704975v1?rss=1">
<title>
<![CDATA[
Systemic hypoxia suppresses solid tumor growth 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2026.02.09.704975v1?rss=1
</link>
<description><![CDATA[
Local hypoxia is a hallmark of solid tumors and a negative prognostic factor in the progression and treatment of cancer. Here, we showed that systemic hypoxia, in contrast to localized tumor hypoxia, decreases tumor growth in vivo across multiple cancer types and preclinical models. The reduced tumor growth in systemic hypoxia was not explained by hypoglycemia, hypoinsulinemia, or HIF activation. Instead, metabolite profiling in tumors and tumor interstitial fluid revealed extensive perturbations in purine-related metabolites. Stable isotope tracing demonstrated that systemic hypoxia caused tumors to suppress de novo purine synthesis. Furthermore, tumors did not develop resistance to systemic hypoxia therapy, and when used in combination with chemotherapy or immunotherapy, systemic hypoxia dramatically suppressed tumor growth. Finally, we showed that systemic hypoxia can be achieved pharmacologically with the small molecule HypoxyStat. These findings challenge the long-held paradigm of hypoxia as a negative prognostic factor in cancer progression, and they suggest a potential therapeutic role for systemic hypoxia in suppressing solid tumor growth.
]]></description>
<dc:creator>Midha, A. D.</dc:creator>
<dc:creator>Chew, B. T. L.</dc:creator>
<dc:creator>Choi, B. M. H.</dc:creator>
<dc:creator>Suh, J. M.</dc:creator>
<dc:creator>Carpenter, C.</dc:creator>
<dc:creator>Baik, A. H.</dc:creator>
<dc:creator>Joshi, T.</dc:creator>
<dc:creator>Blume, S. Y.</dc:creator>
<dc:creator>Haribowo, A. G.</dc:creator>
<dc:creator>Ruivo, P.</dc:creator>
<dc:creator>Flanigan, W. R.</dc:creator>
<dc:creator>Garg, A.</dc:creator>
<dc:creator>Zhang, D. D.</dc:creator>
<dc:creator>Subramanyam, V.</dc:creator>
<dc:creator>Shuere, R.</dc:creator>
<dc:creator>Seo, Y.</dc:creator>
<dc:creator>VanBrocklin, H.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:creator>Jain, I. H.</dc:creator>
<dc:date>2026-02-10</dc:date>
<dc:identifier>doi:10.64898/2026.02.09.704975</dc:identifier>
<dc:title><![CDATA[Systemic hypoxia suppresses solid tumor growth]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2026-02-10</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2026.03.19.712954v1?rss=1">
<title>
<![CDATA[
BioReason-Pro: Advancing Protein Function Prediction with Multimodal Biological Reasoning 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2026.03.19.712954v1?rss=1
</link>
<description><![CDATA[
Protein function annotation is fundamental to understanding biological mechanisms, designing therapeutics, and advancing biomedical research. Current computational methods either rely on shallow sequence similarity or treat function prediction as isolated classification tasks, failing to capture the integrative reasoning across sequence, structure, domains, and interactions that expert biologists perform to infer function. We introduce BioReason-Pro, the first multimodal reasoning large language model (LLM) for protein function prediction that integrates protein embeddings with biological context to generate structured reasoning traces. A key input into BioReason-Pro is the set of GO term predictions made by GO-GPT, our autoregressive transformer that captures hierarchical and cross-aspect dependencies of GO terms. BioReason-Pro is trained via supervised fine-tuning on synthetic reasoning traces generated by GPT-5 for over 130K proteins and further optimized through reinforcement learning. It achieves 73.6% Fmax on GO term prediction and an LLM judge score of 8/10 on functional summaries, substantially outperforming previous methods. Evaluations with human protein experts show that BioReason-Pro annotations are preferred over ground truth UniProt annotations in 79% of cases. Remarkably, BioReason-Pro de novo predicted experimentally confirmed binding partners with per-residue attention localizing to the exact contact residues resolved in cryo-EM structures of those complexes. Together, GO-GPT and BioReason-Pro establish a framework for protein function prediction that combines precise ontology modeling with interpretable biological reasoning.
]]></description>
<dc:creator>Fallahpour, A.</dc:creator>
<dc:creator>Seyed-Ahmadi, A.</dc:creator>
<dc:creator>Idehpour, P.</dc:creator>
<dc:creator>Ibrahim, O.</dc:creator>
<dc:creator>Gupta, P.</dc:creator>
<dc:creator>Naimer, J.</dc:creator>
<dc:creator>Zhu, K.</dc:creator>
<dc:creator>Shah, A.</dc:creator>
<dc:creator>Ma, S.</dc:creator>
<dc:creator>Adduri, A.</dc:creator>
<dc:creator>Güloglu, T.</dc:creator>
<dc:creator>Liu, N.</dc:creator>
<dc:creator>Cui, H.</dc:creator>
<dc:creator>Jain, A.</dc:creator>
<dc:creator>de Castro, M.</dc:creator>
<dc:creator>Fallahpour, A.</dc:creator>
<dc:creator>Cembellin-Prieto, A.</dc:creator>
<dc:creator>Stiles, J. S.</dc:creator>
<dc:creator>Nemcko, F.</dc:creator>
<dc:creator>Nevue, A. A.</dc:creator>
<dc:creator>Moon, H. C.</dc:creator>
<dc:creator>Sosnick, L.</dc:creator>
<dc:creator>Markham, O.</dc:creator>
<dc:creator>Duan, H.</dc:creator>
<dc:creator>Lee, M. Y. Y.</dc:creator>
<dc:creator>Salvador, A. F. M.</dc:creator>
<dc:creator>Maddison, C. J.</dc:creator>
<dc:creator>Thaiss, C. A.</dc:creator>
<dc:creator>Ricci-Tam, C.</dc:creator>
<dc:creator>Plosky, B. S.</dc:creator>
<dc:creator>Burke, D. P.</dc:creator>
<dc:creator>Hsu, P. D.</dc:creator>
<dc:creator>Goodarzi, H.</dc:creator>
<dc:creator>Wang, B.</dc:creator>
<dc:date>2026-03-20</dc:date>
<dc:identifier>doi:10.64898/2026.03.19.712954</dc:identifier>
<dc:title><![CDATA[BioReason-Pro: Advancing Protein Function Prediction with Multimodal Biological Reasoning]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2026-03-20</prism:publicationDate>
<prism:section></prism:section>
</item>
<item rdf:about="https://biorxiv.org/cgi/content/short/2026.04.12.718013v1?rss=1">
<title>
<![CDATA[
ENPP1 buffers extracellular cGAMP in brown adipose tissue to limit insulin resistance 
]]>
</title>
<link>
https://biorxiv.org/cgi/content/short/2026.04.12.718013v1?rss=1
</link>
<description><![CDATA[
The ectonucleotide pyrophosphatase/phosphodiesterase 1 (ENPP1) has long been linked with metabolic diseases, with the common ENPP1 K173Q (historically K121Q) variant conferring increased risk for type 2 diabetes (T2D). However, the mechanistic basis of this association has remained unclear. Here, we demonstrate that the K173Q variant has decreased cGAMP hydrolysis activity, suggesting that this loss of enzymatic function could contribute to its pathogenesis. Using a cGAMP-hydrolysis-deficient knock-in mouse (Enpp1H362A), we show that selective loss of this activity leads to a primary defect in energy expenditure and exacerbates high-fat diet (HFD)-induced weight gain and insulin resistance. An unbiased in vivo glucose-uptake screen reveals brown adipose tissue (BAT) as a focal site of metabolic impairment, characterized by profound extracellular cGAMP accumulation and a selective failure of insulin-stimulated glucose uptake. Mechanistically, we demonstrate that nutrient excess drives mitochondrial DNA leakage in brown adipocytes, triggering cGAMP production and export. Excess cGAMP directly propagates STING-dependent suppression of glucose uptake and lipogenesis in brown adipocytes. Additionally, when ENPP1-mediated clearance is compromised, extracellular cGAMP acts as a paracrine immunotransmitter that remodels the BAT microenvironment by recruiting and polarizing macrophages toward an M1-like phenotype. Together, our findings nominate the impaired ENPP1-dependent buffering of extracellular cGAMP as one mechanism by which ENPP1 variants influence metabolic homeostasis.
]]></description>
<dc:creator>Wang, S.</dc:creator>
<dc:creator>Guo, Y.</dc:creator>
<dc:creator>An, W.</dc:creator>
<dc:creator>Lee, M.</dc:creator>
<dc:creator>Li, Y.</dc:creator>
<dc:creator>Sudaryo, V.</dc:creator>
<dc:creator>Grenot, G.</dc:creator>
<dc:creator>Skariah, G.</dc:creator>
<dc:creator>Reghupaty, S. C.</dc:creator>
<dc:creator>Young, S.</dc:creator>
<dc:creator>Bai, X.</dc:creator>
<dc:creator>Svensson, K. J.</dc:creator>
<dc:creator>Li, L.</dc:creator>
<dc:date>2026-04-14</dc:date>
<dc:identifier>doi:10.64898/2026.04.12.718013</dc:identifier>
<dc:title><![CDATA[ENPP1 buffers extracellular cGAMP in brown adipose tissue to limit insulin resistance]]></dc:title>
<dc:publisher>Cold Spring Harbor Laboratory Press</dc:publisher>
<prism:publicationDate>2026-04-14</prism:publicationDate>
<prism:section></prism:section>
</item>
</rdf:RDF>
