- by Song, J., Li, Q. Spatial transcriptomics (ST) measures gene expression while preserving spatial context within tissues, enabling detailed characterization of tissue organization. As ST technologies advance, aligning datasets across tissue sections, individuals, platforms, and developmental stages has become increasingly important but remains challenging due to sparse expression, biological heterogeneity, and geometric distortions between slices. We introduce OT-knn, a method for ST alignment that integrates local neighborhood information within an optimal transport framework. Rather than relying solely on single-spot expression, OT-knn reconstructs each spot using […]
- by Lapp, Z., Leitner, T. Motivation: Understanding how virus sequences are shaped by selection can inform vaccine design and transmission inference. Modeling within-host evolution to interrogate these questions requires a detailed mechanistic framework that accurately captures sequence diversification. The CD8+ cytotoxic T-lymphocyte (CTL) response plays an important role in immune-mediated selection and can leave strong signatures in virus sequences; however, existing sequence-based within-host virus modeling frameworks do not explicitly include an HLA-aware CTL response. Results: We extended our previously published within-host sequence evolution simulator, wavess, […]
- by Herault, L., Gabriel, A. A., Duc, B., Dolfi, B., Shah, A., Joyce, J. A., Gfeller, D. Multimodal single-cell atlases comprising hundreds of thousands of cells provide unique resources for exploring complex biological tissues and generating testable hypotheses. To streamline the analysis of such large datasets, we introduce SuperCell2.0, a robust workflow to build (semi-)supervised multimodal metacells. We demonstrate that multimodal metacells outperform metacells built with a single modality, improve inter-modality consistency, and facilitate integration of multiomic single-cell datasets. SuperCell2.0 can further leverage full or partial cell type annotations to improve metacell quality. This workflow enables us […]
- by Tang, Q., Mchaourab, H., Wu, T., Soubasis, B. The AlphaFold3 architecture represented an important leap relative to AlphaFold2 by enabling the inclusion of protein ligands in the prediction network. Ligand-dependent structural rearrangements are inherently difficult to predict computationally as they imply transitions between states separated by large energy differences. Here we apply AlphaFold3 to predict nucleotide-dependent changes in the conformational cycle of representative ABC transporters that have been extensively investigated by experimental structural biology techniques. We show that under similar conditions, AlphaFold3 predictions sample experimentally observed conformations. Moreover, the […]
- by Min, J., Vishnyakova, O., Brooks-Wilson, A., Elliott, L. T. Identifying physiological sweet spots (optimal ranges for homeostasis) is essential for precision medicine. However, traditional statistical methods often rely on globally linear or locally jagged models that struggle to capture the smooth, non-linear nature of biological regulation in high-dimensional data. We present the Quantile Feature Selection Network (Q-FSNet), a neural network-based framework that integrates quantile regression, feature selection, and uncertainty estimation to identify biomarkers with sweet spots. Unlike traditional methods, Q-FSNet learns continuous response curves without requiring a pre-specified number of […]
- by Adasme, M. F., Ochoa, D., Lopez, I., Do, H.-M.-A., McDonagh, E. M., O'Boyle, N. M., Leach, A. R., Zdrazil, B. Chemical probes are indispensable tools for validating therapeutic hypotheses, yet their broader impact on early-stage drug discovery remains unquantified. To our knowledge, this study represents the first systematic, large-scale investigation of the chemical probe literature. By screening over 18 million articles using a high-quality dictionary of 561 chemical probes, we identified 20,000 articles mentioning a chemical probe, which resulted in 5,558 unique target-disease (T-D) associations. Our analysis yields four principal findings that redefine the utility of these chemicals: First, we […]
- by Jiang, C., Zheng, R., Ji, Y., Cao, S., Fang, Y., Wang, Z., Wang, R., Liang, S., Tao, S. Single-cell RNA sequencing enables high-resolution characterization of cellular heterogeneity, yet integrating datasets from diverse sources remains challenging due to batch effects. Current methods rely on implicit feature disentanglement and lack geometric constraints, and often result in under-correction, over-correction, or compromised biological fidelity. Here, we present iDLC, an interpretable deep learning framework that performs dual-level correction through explicit feature disentanglement and optimal transport-regularized adversarial alignment. iDLC separates biological and technical components within a structured latent space, then leverages high-confidence mutual […]
- by Sefa, S. M., Sarkar, J., Robin, A. H. K., Uddin, M. Protein function depends on interactions between structural domains and regulatory motifs. Yet current tools analyze these elements separately, hindering investigation of disease mutations affecting evolutionarily conserved, structurally constrained motifs. We present ProteoMapper, a computational framework integrating HMMER-based domain annotation with user-defined motif detection to quantify motif-domain spatial relationships in protein families. ProteoMapper introduces two discovery metrics: (1) positional conservation scoring, identifying motifs at identical alignment coordinates in ≥ N% of sequences (default 60%), indicating purifying selection; (2) Motif-Domain Coverage Score […]
- by Xia, T., Zhao, X., Islam, S. S. M., Mohammed, K. K., Xie, Z., Zhi, D. Magnetic resonance imaging (MRI)-derived phenotypes (IDPs) have enabled the discovery of numerous genomic loci associated with brain structure and function. However, most existing IDPs and learned representations are derived from a single imaging modality, missing complementary information across modalities and potentially limiting the scope of genetic discovery. Here, we introduce a multimodal contrastive learning framework to derive heritable representations from paired T1- and T2-weighted MRIs. Unlike single-modality reconstruction-based models, we designed a momentum-based contrastive learning framework. As a result, our […]
- by Ma, Z., Liu, M., Wang, S., Wang, S., Zang, C. Spatial organization of the genome plays a vital role in defining cell identity and regulating gene expression. The three-dimensional (3D) genome structure can be measured by sequencing-based techniques such as Hi-C, usually at the cell-population level, or by imaging-based techniques such as chromatin tracing at the single-cell level. Chromatin tracing is a multiplexed DNA fluorescence in situ hybridization (FISH)-based method that can directly map the 3D positions of genomic loci along individual chromosomes at single-molecule resolution. However, few computational […]
- by Poehls, J., Landerer, C., Daniels, K. G., Toth-Petroczy, A. Protein translation is an error-prone process resulting in a random population of altered protein sequences in every cell. Here, we analyzed thousands of publicly available mass spectrometry datasets to detect amino acid misincorporations and quantify error rates in 14 model organisms. We find that overall error rates and the patterns of codon to amino acid error rates correlate across species. We estimate that on average 1-2% of protein molecules in a cell harbor a misincorporation, whereas this proportion can reach […]
- by Jacques, M.-A., Gottgens, B., Marioni, J. C. Single-cell transcriptomics has revolutionised developmental biology by providing an unprecedented, fine-grained view of cellular lineages. However, our ability to compare species and distinguish universal from species-specific developmental principles remains limited by biological and technical variability. To address this, we introduce RIMA (RIgorous Matching of Atlases), a method for quantitatively comparing transcriptomic atlases across species at near-single-cell resolution. RIMA uses a novel computational approach to identify matching cell states across atlases and leverages this to enable quantitative comparative analyses. Applied to […]
- by Hoyle, A., Midwood, K. S. Tissues dynamically remodel extracellular matrix to maintain homeostasis, alterations in which are an early pathogenic hallmark of disease. Protein degradation, essential for tissue remodelling, is often dismissed as indiscriminate damage, despite evidence of its specificity. A major determinant of protein tissue levels and activity, matrix proteolysis also creates circulating degradation products that are emerging biomarkers, with specific collagen fragments capable of tracking disease severity. Understanding intentional matrix destruction therefore is key to understanding tissue biology. Unbiased, holistic analysis, extending our […]
- by Li, M., Shi, M., Zhang, C., Turkez, H., Uhlen, M., Mardinoglu, A. Human tissues exhibit specialized metabolic functions that are essential for maintaining whole-body metabolic homeostasis. To systematically characterize organ- and cell-type-specific metabolic heterogeneity, we constructed 32 tissue-specific and 81 cell-type-specific enzyme-constrained genome-scale metabolic models (ecGEMs) by integrating the global human metabolic network with the tissue- and single-cell transcriptomic data from the Human Protein Atlas (HPA). Our analysis revealed pronounced differences in metabolic network architecture and activity across the human body, identifying key cell types that drive tissue metabolic functions. To demonstrate […]
- by Gohl, P., Fornes, O., Bota, P. M., Messeguer, A., Bonet, J., Molina-Fernandez, R., Planas-Iglesias, J., Hernandez, A. C., Gallego, O., Fernandez-Fuentes, N., Oliva, B. The ModCRElib package provides various tools for the analysis and modelling of transcription factor (TF)-DNA and regulatory complex inter-protein interactions. It takes structural information on these interactions to predict TF binding motifs, generate binding profiles along DNA sequences that score the binding affinity, predict TF binding sites, and model the structure of higher-order regulatory complexes. It is capable of working with a variety of input data formats and sources. The user may follow the analysis pipeline as outlined in the […]
- by Zondi, S., Mtambo, S., Buthelezi, N., Shunmugam, L., Magwenyane, A., Kumalo, H. M. Chikungunya virus (CHIKV) infection is one of the major public health concerns in several countries around the world. CHIKV non-structural protein 2 (nsP2) is a promising drug design target due to the enzyme's multifunctional properties that facilitate viral replication and propagation. To date, there is an evident lack of preventative and therapeutic developments that can be used against CHIKV. Drug repurposing is a time-saving and cost-effective method used for the development of new drugs. In this study, drug repurposing was […]
- by Goncalves, D. M., Patricio, A., Costa, R. S., Henriques, R. The growing availability and complexity of omics data have driven the development of specialized algorithms for modeling molecular systems. Although graph-based learning methods effectively represent biological interactions, they often neglect the statistical information embedded in node and edge annotations. To address this limitation, we propose a novel graph-based framework that integrates structured statistical distributions into nodes and edges, capturing probabilistic characteristics of molecular relationships. We evaluate the proposed approach on omics datasets from five cancer types across multiple clinical outcomes, […]
- by Gonzalez-Bermejo, M., Serrano-Ron, L., Garcia-Martin, S., Lapuente-Santana, O., Sanz-Portillo, I., Gonzalez-Martinez, P., Gomez-Lopez, G., Al-Shahrour, F. Intratumoral heterogeneity (ITH) is a major determinant of therapeutic failure, yet its impact on drug response across cancers remains incompletely understood. Here, we present the Therapeutic Cancer Cell Atlas (TCCA), a pan-cancer single-cell resource integrating ~1.8 million transcriptomes from 537 patients and 183 cancer cell lines spanning 34 tumor types. By combining single-cell transcriptomics with copy-number alteration inference and computational drug-response prediction, we systematically map therapeutic heterogeneity at subclonal resolution across cancers. Using this framework, we identify ten recurrent therapeutic […]
- by Munoz-Gacitua, D., Blamey, J. The LRLLR cell-penetrating motif can be transferred to confer membrane translocation activity, but only to compatible recipient peptides. Using umbrella sampling molecular dynamics simulations, we demonstrate that C-terminal LRLLR addition to the pro-apoptotic smacN peptide eliminates its translocation barrier entirely, transforming a +65 kJ/mol barrier into a -50 kJ/mol energy well. In contrast, N-terminal LRLLR addition to the neuroprotective NR2B9c peptide increases the translocation barrier from +85 to +100 kJ/mol, demonstrating that motif transfer can prove counterproductive for incompatible sequences. […]
- by Liu, H., Zhang, P., Wei, Y., Tian, Q., Zhai, Y., Zou, Q., Niu, M. Partial order alignment (POA) has emerged as a fundamental component in long-read error correction, assembly and pangenomics. However, conventional POA algorithms are limited by high time and memory requirements, making them inefficient for large-scale datasets. Here, we present minipoa, a fast and memory-efficient POA tool that incorporates seed-chain-align heuristics, adaptive or static banding strategies, and single-instruction multiple-data optimizations. Minipoa achieves up to a 5-fold speedup over abPOA, reduces memory usage by up to 16-fold, and improves correction accuracy, while maintaining […]
