CRIS

Permanent URI for this community: https://scripta.up.edu.mx/handle/20.500.12552/1

Search Results

Now showing 1 - 6 of 6

Leveraging Different Distance Functions to Predict Antiviral Peptides with Geometric Deep Learning from ESMFold-Predicted Tertiary Structures
(MDPI AG, 2026)
Cordoves-Delgado, Greneter; García-Jacas, César R.; Marrero Ponce, Yovani; Aguila, Sergio A.; Lizama-Uc, Gabriel
Background: Machine learning models have been shown to be a time-saving and cost-effective tool for peptide-based drug discovery. In this regard, different graph learning-driven frameworks have been introduced to exploit graph representations derived from predicted peptide structures. Such graphs are always derived by applying a Euclidean distance threshold between amino acid pairs, despite the fact that there is no evidence other than intuitive reasoning that supports the Euclidean distance as the most suitable. Objective: In this work, we examined the use of different distance functions to derive graph representations from predicted peptide structures to train deep graph learning-based models to predict antiviral peptides. Methods: To this end, we first analyzed how differently the closeness of the amino acids is characterized by different distance functions. Then, we studied the similarity between the graphs derived with several distance functions, as well as between them and random graphs. Finally, we trained several models with the best graph representations and analyzed how different they are regarding their predictions. Comparisons with state-of-the-art models were also performed. Results and Conclusion: We demonstrated that using only Euclidean distance thresholds is not a sufficient criterion to build graphs representing structural features of predicted peptide structures, since other distance functions enabled building dissimilar graphs codifying different chemical spaces, which were useful in the construction of better discriminative models. ©The authors ©MDPI.
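The graph-construction idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the distance functions or thresholds used, so the Manhattan and Chebyshev metrics, the 5.5 threshold, the toy coordinates, and the helper names `pairwise_distances`, `contact_graph`, and `edge_jaccard` are all assumptions made for illustration.

```python
import numpy as np

def pairwise_distances(coords, metric="euclidean"):
    """All-pairs distances between residue coordinates (n_residues x 3)."""
    diff = coords[:, None, :] - coords[None, :, :]
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=-1))
    if metric == "manhattan":   # assumed alternative, not named in the abstract
        return np.abs(diff).sum(axis=-1)
    if metric == "chebyshev":   # assumed alternative, not named in the abstract
        return np.abs(diff).max(axis=-1)
    raise ValueError(f"unknown metric: {metric}")

def contact_graph(coords, threshold, metric="euclidean"):
    """Boolean adjacency matrix: residues are linked when distance <= threshold."""
    adj = pairwise_distances(coords, metric) <= threshold
    np.fill_diagonal(adj, False)  # no self-loops
    return adj

def edge_jaccard(adj_a, adj_b):
    """Jaccard similarity between the edge sets of two graphs."""
    union = np.logical_or(adj_a, adj_b).sum()
    if union == 0:
        return 1.0
    return np.logical_and(adj_a, adj_b).sum() / union

# Toy coordinates (hypothetical), standing in for one atom per residue.
coords = np.array([[0.0, 0.0, 0.0], [3.0, 3.0, 0.0], [5.0, 0.0, 0.0]])
g_euclidean = contact_graph(coords, threshold=5.5, metric="euclidean")
g_manhattan = contact_graph(coords, threshold=5.5, metric="manhattan")
similarity = edge_jaccard(g_euclidean, g_manhattan)
```

With the same threshold, the two metrics yield different edge sets for these toy points, which is the kind of dissimilarity between graphs that the abstract reports studying.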
StarPepWeb: an integrative, graph-based resource for bioactive peptides
(Oxford University Press, 2024)
López, Christian; Cárdenas, Roberto; Aguilera-Mendoza, Longendri; Agüero-Chapin, Guillermin; Martínez Ríos, Félix Orlando
Motivation: The rapid growth of bioactive peptide sequences presents challenges for organization and analysis. Existing repositories often specialize in functions, taxonomic origins, or structural classes, but most remain isolated, use heterogeneous metadata, and lack uniform descriptors or structural models. Few integrative web services exist, offering only partial coverage or depth. As a result, reproducible and comprehensive exploration of the bioactive peptide landscape remains limited, underscoring the need for a unified, source-tracked, extensible platform. Results: We present StarPepWeb, a freely accessible web application that democratizes access to StarPepDB, one of the largest curated repositories of bioactive peptides. The platform integrates 45 120 non-redundant sequences from 40 public databases into a source-tracked graph enriched with metadata, physicochemical features, and predicted 3D structures from ESMFold. Each peptide is represented with ESM-2 embeddings and iFeature descriptors, while the interface supports metadata-aware filtering, alignment-based similarity searches with single and multiple queries, and interactive visualization. A microservice-oriented architecture ensures scalability, maintainability, and reproducible versioned downloads, including Neo4j exports. StarPepWeb thus overcomes deployment and expertise barriers of the standalone database, providing an extensible, cloud-hosted framework for integrative bioactive peptide analysis. ©The authors ©Oxford University Press.
Optimal Descriptor Subset Search via Chemical Information and Target Activity-Guided Algorithm for Antimicrobial Peptide Prediction
(American Chemical Society (ACS), 2025)
García-González, Luis A.; Marrero Ponce, Yovani; García-Jacas, César R.; Aguila Puentes, Sergio A.
Antimicrobial peptides (AMPs) have emerged as a promising alternative to conventional drugs due to their potential applications in combating multidrug-resistant pathogens. Various computational approaches have been developed for AMP prediction, ranging from shallow learning methods to advanced deep learning techniques. Additionally, the performance of shallow learning models based on self-learning features derived from protein language models has recently been studied. However, the performance of AMP models based on shallow learning strongly depends on the quality of descriptors derived via manual feature engineering, which may miss crucial information by assuming that the initial descriptor set fully captures relevant information. The AExOp-DCS algorithm was introduced as an automatic feature domain optimization method that identifies the “optimal” descriptor set driven by the chemical structure and biological activity of the compounds under study. QSAR models built on AExOp-DCS optimized descriptors outperform those using nonoptimized sets. In this study, we explore the use of AExOp-DCS to identify optimal descriptor subsets for AMP modeling. Experimental results show that the descriptors returned by AExOp-DCS contain information comparable to those used in top-performing models while exhibiting higher discriminative capacity. The generated models based on the descriptors returned by AExOp-DCS achieved performance metric values comparable to state-of-the-art approaches while utilizing fewer descriptors, suggesting a more efficient modeling process. By reducing dimensionality without sacrificing accuracy, this approach contributes to the development of more efficient computational pipelines for AMP discovery. Finally, Java software called AExOp-DCS-SEQ is freely available, enabling researchers to leverage its capabilities for peptide descriptor search and AMP classification tasks.
©The authors ©American Chemical Society (ACS) © Journal of Chemical Information and Modeling.
StarPep Toolbox: an open-source software to assist chemical space analysis of bioactive peptides and their functions using complex networks
(Oxford University Press, 2023)
Aguilera-Mendoza, Longendri; Ayala-Ruano, Sebastián; Chávez, Edgar; García-Jacas, César R.; Brizuela, Carlos A.
Motivation: Antimicrobial peptides (AMPs) are promising molecules to treat infectious diseases caused by multidrug-resistant pathogens, some types of cancer, and other conditions. Computer-aided strategies are efficient tools for the high-throughput screening of AMPs. Copyright © 2024 Oxford University Press
Enhancing Acute Oral Toxicity Predictions by using Consensus Modeling and Algebraic Form-Based 0D-to-2D Molecular Encodes
(2019)
García-Jacas, César R.; Marrero Ponce, Yovani; Cortés-Guzmán, Fernando; Suárez-Lezcano, José; Martínez Ríos, Félix Orlando
Quantitative structure–activity relationships (QSAR) are introduced to predict acute oral toxicity (AOT), by using the QuBiLS-MAS (acronym for quadratic, bilinear and N-Linear maps based on graph-theoretic electronic-density matrices and atomic weightings) framework for the molecular encoding. Three training sets were employed to build the models: EPA training set (5931 compounds), EPA-full training set (7413 compounds), and Zhu training set (10 152 compounds). Additionally, the EPA test set (1482 compounds) was used for the validation of the QSAR models built on the EPA training set, while the ProTox (425 compounds) and T3DB (284 compounds) external sets were employed for the assessment of all the models. The k-nearest neighbor, multilayer perceptron, random forest, and support vector machine procedures were employed to build several base (individual) models. The base models with R_EPA–training ≥ 0.75 (R = correlation coefficient) and MAE_EPA–training ≤ 0.5 (MAE = mean absolute error) were retained to build consensus models. As a result, two consensus models based on the minimum operator and denoted as M19 and M22, as well as a consensus model based on the weighted average operator and denoted as M24, were selected as the best ones for each training set considered. According to the applicability domain (AD) analysis performed, model M19 (built on the EPA training set) has MAE_test–AD = 0.4044, MAE_ProTox–AD = 0.4067 and MAE_T3DB–AD = 0.2586 on the EPA test set, ProTox external set, and T3DB external set, respectively; whereas model M22 (built on the EPA-full set) and model M24 (built on the Zhu set) present MAE_ProTox–AD = 0.3992 and MAE_T3DB–AD = 0.2286, and MAE_ProTox–AD = 0.3773 and MAE_T3DB–AD = 0.2471 on the two external sets accounted for, respectively. These outcomes were compared and statistically validated with respect to 14 QSAR methods (e.g., admetSAR, ProTox-II) from the literature. As a result, model M22 presents the best overall performance.
In addition, a retrospective study on 261 drugs withdrawn due to their toxic/side effects was performed, to assess the usefulness of prospectively using the proposed QSAR models in the labeling of chemicals. A comparison with the methods from the literature was also made. As a result, model M22 has the best ability to label a compound as toxic according to the globally harmonized system of classification and labeling of chemicals. Therefore, it can be concluded that the models proposed, especially model M22, constitute prominent tools for studying AOT, as they provide the best results among all the methods examined. Freely available software was also developed to be used in virtual screening tasks (http://tomocomd.com/apps/ptoxra).
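The consensus step described in this abstract can be sketched as follows. This is an illustrative reading, not the published software: the abstract gives the retention cutoffs (R ≥ 0.75, MAE ≤ 0.5) and names the minimum and weighted average operators, but the dictionary layout, the weights, the toy predictions, and the function names are assumptions.

```python
import numpy as np

def select_base_models(models, r_min=0.75, mae_max=0.5):
    """Keep base models meeting the training-set cutoffs quoted in the abstract."""
    return [m for m in models if m["r"] >= r_min and m["mae"] <= mae_max]

def consensus_min(predictions):
    """Minimum operator: per-compound minimum over base-model predictions."""
    return np.min(predictions, axis=0)

def consensus_weighted_average(predictions, weights):
    """Weighted average operator; how the weights are chosen is an assumption."""
    return np.average(predictions, axis=0, weights=np.asarray(weights, dtype=float))

# Hypothetical base models with training statistics and toy predictions.
models = [
    {"name": "knn", "r": 0.80, "mae": 0.45, "pred": [2.0, 3.0]},
    {"name": "rf",  "r": 0.78, "mae": 0.40, "pred": [1.0, 4.0]},
    {"name": "svm", "r": 0.70, "mae": 0.55, "pred": [5.0, 5.0]},  # fails both cutoffs
]
kept = select_base_models(models)
preds = np.array([m["pred"] for m in kept])  # shape: (n_models, n_compounds)

min_consensus = consensus_min(preds)
avg_consensus = consensus_weighted_average(preds, weights=[1.0, 3.0])
```

In this toy setup the rows of `preds` are base models and the columns are compounds, so both operators reduce along axis 0 to give one consensus value per compound.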
Handcrafted versus non-handcrafted (self-supervised) features for the classification of antimicrobial peptides: complementary or redundant?
(2022)
García-Jacas, César R.; García-González, Luis A.; Martínez Ríos, Félix Orlando; Tapia-Contreras, Issac P.; Brizuela, Carlos A.
Antimicrobial peptides (AMPs) have received a great deal of attention given their potential to become a plausible option to fight multidrug-resistant bacteria as well as other pathogens. Quantitative sequence-activity models (QSAMs) have been helpful in discovering new AMPs because they make it possible to explore a large universe of peptide sequences and help reduce the number of wet lab experiments. A main aspect in the building of QSAMs based on shallow learning is to determine an optimal set of protein descriptors (features) required to discriminate between sequences with different antimicrobial activities. These features are generally handcrafted from peptide sequence datasets that are labeled with specific antimicrobial activities. However, recent developments have shown that unsupervised approaches can be used to determine features that outperform human-engineered (handcrafted) features. Thus, knowing which of these two approaches contributes to a better classification of AMPs is a fundamental question for designing more accurate models. Here, we present a systematic and rigorous study to compare both types of features. Experimental outcomes show that non-handcrafted features lead to better performance than handcrafted features. However, the experiments also show that performance improves when both types of features are merged. A relevance analysis reveals that non-handcrafted features have higher information content than handcrafted features, while an interaction-based importance analysis reveals that handcrafted features are more important. These findings suggest that there is complementarity between both types of features. Comparisons with state-of-the-art deep models show that shallow models yield better performance both when fed with non-handcrafted features alone and when fed with non-handcrafted and handcrafted features together.
