Student Abstracts

Wednesday, November 10, 2021

11h30 - 11h40

Deep Learning on Genetic Data with Diet Network

Rochefort-Boulanger, C*(1)(2)(3), Scicluna, M(1)(3), Choinière, L(1)(3), Grenier, JC(1), Carrier, PL(2), Bengio, Y(2)(3) and Hussin, J(1)(3)
(1) Montreal Heart Institute, (2) Mila and (3) Université de Montréal

Presenter: ROCHEFORT-BOULANGER, Camille

Status: PhD student

Supervisors: Prof. Julie Hussin (Montreal Heart Institute, Université de Montréal) and Prof. Yoshua Bengio (Mila, Université de Montréal)

Contact: camille.rochefortboulanger@gmail.com

Objective: The Diet Network (DN) is a deep learning approach proposed to accommodate the large number of genetic variants used as features in prediction problems in genomics. This architecture, developed on the 1000 Genomes Project (1000G) dataset, has proven effective at determining individuals’ population from their single nucleotide polymorphisms (300K SNPs) without overfitting. However, given the heterogeneity of genomic data collection protocols and the large amount of missing data in genomic datasets, it is important to assess whether a model trained on one dataset can generalize its predictions to independent datasets. Furthermore, the DN has not yet been tested for the prediction of complex phenotypes. Methods: The generalization capability of the DN was tested on the independent CARTaGENE (CAG) dataset using a model trained on 1000G with dropout on the input layer to randomly remove SNPs at training time. The DN was also trained to predict obesity using 408K SNPs from 197K White British individuals in the UK Biobank, divided into normal (body mass index between 18.5 and 24.99) and obese (body mass index over 30) categories. Results: Our results show that missing values greatly reduce the accuracy of the model's predictions. Using dropout on the input layer makes the model more robust to large numbers of missing values and increases generalization performance on CAG. For obesity prediction, the DN achieves a classification accuracy of 74.5% on the training set and 61.4% on the test set. This vanilla DN therefore overfits the training data but still reaches a test-set accuracy comparable to what is achieved with polygenic risk scores. Conclusion: Our work on the population stratification task shows that the DN can generalize its predictions to independent datasets, which is critical for its application across genomics datasets. The results obtained for obesity prediction are promising for predicting complex phenotypes. The DN will need further fine-tuning, and several changes are planned, to achieve the best possible results.
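The two preprocessing steps described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 0/1/2 genotype encoding, the use of 0.0 for a removed SNP, and the rescaling of retained SNPs (standard "inverted" dropout) are all assumptions that the abstract does not specify. Only the BMI thresholds come from the abstract.

```python
import random


def input_dropout(genotypes, rate, rng):
    """Randomly remove SNPs from one individual's input vector.

    Assumptions (not stated in the abstract): genotypes are allele counts
    (0, 1 or 2), a removed SNP is set to 0.0, and retained SNPs are rescaled
    by 1 / (1 - rate), as in standard inverted dropout.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else g * keep_scale for g in genotypes]


def bmi_category(bmi):
    """Label an individual using the BMI ranges given in the abstract.

    Returns "normal" for 18.5 <= BMI <= 24.99, "obese" for BMI > 30,
    and None for individuals who fall in neither category.
    """
    if 18.5 <= bmi <= 24.99:
        return "normal"
    if bmi > 30:
        return "obese"
    return None


# Hypothetical usage on a toy genotype vector.
rng = random.Random(0)
print(input_dropout([0, 1, 2, 1, 0, 2], rate=0.5, rng=rng))
print([bmi_category(b) for b in (22.0, 27.3, 31.5)])
```

Dropout of this kind is applied only at training time; at test time the full input vector, including its genuinely missing values, is passed to the model unchanged.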

Genetics, deep learning

11h40 - 11h50

Single nuclei sequencing of human adipose tissue coupled with deep learning reveals transcriptional dynamics and in vivo dysmetabolism associated with adipocyte hypertrophy

Ye, RZ*(1), Montastier, E(1), Frisch, F(1), Noll, C(1), Allard-Chamard, H(1), Gévry, N(2), Tchernof, A(3), Carpentier, A(1)
(1) Division of Endocrinology, Department of Medicine, Centre de recherche du Centre hospitalier universitaire de Sherbrooke, Université de Sherbrooke, Sherbrooke, Québec, Canada (2) Department of Biology, Université de Sherbrooke, Sherbrooke, Québec, Canada (3) Québec Heart and Lung Research Institute, Laval University, Québec, Québec, Canada

Presenter: YE, Run Zhou

Status: PhD student

Supervisor: Dr. André Carpentier (UdeS, CRCHUS)

Contact: run.zhou.ye@usherbrooke.ca

Objective: In this study, we aimed to explore potential mechanisms driving obesity-independent, hypertrophy-associated dysmetabolism at single-cell resolution and to assess the applicability of single-nuclei RNA sequencing to flash-frozen human subcutaneous adipose tissue (AT). We also sought to assess whether ex vivo cellular heterogeneity is consistent with variations in whole-body AT distribution and in vivo metabolic endpoints. Methods: Forty subjects were included in postprandial studies. Subcutaneous AT biopsies were obtained by needle aspiration, and adipocyte size was measured by histology. We performed single-nuclei RNA sequencing on biopsies from two individuals with adipocyte hypertrophy and hyperplasia matched for sex, ethnicity, glucose tolerance, BMI, body fat mass and percentage, and waist circumference. Dietary fatty acid distribution was measured by positron emission tomography (PET) with oral administration of [18F]-fluoro-6-thia-heptadecanoic acid. Fatty acid uptake, oxidation and esterification rates were measured using intravenous [11C]-palmitate with dynamic PET acquisition and multi-compartment modeling. To determine AT volume from whole-body computed tomography (CT), we performed transfer learning with our DeepImageTranslator software, further training a deep convolutional neural network that had previously been trained for semantic segmentation of CT images. Results: Adipocyte hypertrophy is related to the differentiation of fibro-adipogenic cells into mature insulin-resistant adipocytes and to alterations in RNA velocity along the adipogenic trajectory. Hypertrophy is also associated with insulin resistance in AT, distribution of fatty acids to visceral AT, hepatic fat, and the rate of fatty acid uptake by the liver and heart. Conclusion: Changes in cell composition and differentiation trajectories assessed by single-nuclei sequencing are consistent with whole-body dysmetabolism measured by in vivo metabolic testing and PET/CT imaging.

Lipidology, imaging

Thursday, November 11, 2021

11h30 - 11h40

Using machine learning algorithms to predict a nutrition-related behaviour

Côté, M*(1), Osseni, MA(2), Brassard, D(1), Carbonneau, E(1)(3), Robitaille, J(1), Vohl, MC(1), Lemieux, S(1), Laviolette, F(2) and Lamarche, B(1)
(1) Centre de recherche Nutrition, santé et société (NUTRISS), INAF, Université Laval (Québec, Canada) (2) Centre de recherche en données massives (CRDM), Université Laval (Québec, Canada) (3) School of Nutrition Sciences, University of Ottawa (Ontario, Canada)

Presenter: CÔTÉ, Mélina

Status: Master's student

Supervisor: Dr. Benoît Lamarche (Centre de recherche Nutrition, santé et société (NUTRISS), INAF, Université Laval)

Contact: melina.cote.2@ulaval.ca

Objective: Machine learning (ML) could help to better identify and understand the complex interactions among the many factors that influence food choices. The aim of this study was to test the hypothesis that ML-based prediction models perform better than traditional statistical models at predicting adequate vegetable and fruit (VF) consumption. Methods: A large number of individual and environmental variables (525) related to dietary habits, from a sample of 1147 men and women, were used. Adequate VF consumption, defined as 5 or more servings per day, was measured using data from validated web-based 24-hour recalls. Nine ML algorithms were compared with two traditional statistical models (logistic regression and lasso) on their ability to accurately predict adequate VF consumption. A series of sensitivity analyses was also conducted in an attempt to improve the predictive performance of the different algorithms. Results: Logistic regression and lasso both predicted adequate VF consumption with an accuracy of 64% (95% confidence interval [95% CI]: 58%-68%). The ML models that best predicted adequate VF consumption were support vector machines (SVM) with a radial basis function kernel or a sigmoid kernel, both with an accuracy of 65% (95% CI: 59%-71%). The worst-performing ML model was the SVM with a linear kernel, with an accuracy of 55% (95% CI: 49%-61%). The sensitivity analyses did not substantially improve the performance of either the traditional or the ML algorithms. Conclusion: Overall, ML models do not appear to perform better than traditional statistical models at predicting adequate VF consumption. These results suggest that more studies are needed to explore the true potential of ML for predicting complex behaviours associated with healthy eating.
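The accuracies above are reported with 95% confidence intervals, and the overlap of those intervals is what supports the conclusion that the models perform similarly. The abstract does not say how the intervals were computed. As one possible illustration, the sketch below uses the Wilson score interval for a proportion; the choice of method, the function name, and the test-set counts in the usage line are assumptions, not values from the study.

```python
import math


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a classification accuracy.

    `correct` is the number of correctly classified participants, `total` the
    number of participants evaluated, and z = 1.96 corresponds to a 95% interval.
    The study may have used a different method (for example, bootstrapping).
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts, chosen only to illustrate the call.
low, high = wilson_interval(correct=147, total=230)
print(f"accuracy = {147 / 230:.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

Two models whose intervals overlap as much as those reported for logistic regression and the best SVMs cannot be distinguished on accuracy alone.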

Nutrition, behaviour, artificial intelligence

11h40 - 11h50

Assessment of the usefulness of machine learning-based prediction of COVID-19 related outcomes: A study of Quebec’s Second Wave

Sugiarta, W*(1)(3), Bosson-Rieutort, D(1), Benigeri, M(1), Ghachem, A(1)(2)

(1) National Institute of Excellence in Health and Social Services (INESSS) (Montreal, Canada) (2) Faculty of Physical Activity Sciences, University of Sherbrooke (Sherbrooke, Canada) (3) Department of Computer Science, McGill University (Montreal, Canada)

Presenter: SUGIARTA, Wisang

Supervisors: Dr. Ahmed Ghachem (UdeS, INESSS) and Dr. Emma Frejinger (UdeM, MILA)

Status: Master's student

Contact: wisang.sugiarta@umontreal.ca

Objective: Using a medical-administrative database collected routinely for the province of Quebec, we developed machine learning (ML) models and assessed their usefulness for identifying COVID-19 patients who will be hospitalized, admitted to the intensive care unit (ICU), or die. Methods: Data from laboratories and epidemiological surveillance systems in Quebec, Canada, were linked to the health-administrative databases to build a cohort of confirmed COVID-19 cases. Data from the second wave were analyzed. Three ML models were tested: Adaptive Boosting Classifier (AdaBoost), Support Vector Machines (SVM), and logistic regression. Several metrics were used to assess the performance of the ML models. Results: Overall, all models showed similar performance. Logistic regression seemed to perform better at predicting hospitalizations (AUC=0.82; sensitivity=83%; specificity=81%) and deaths (AUC=0.91; sensitivity=94%; specificity=67%), whereas AdaBoost best predicted ICU admissions (AUC=0.76; sensitivity=72%; specificity=78%). All models showed low precision, F1 scores and positive likelihood ratios (LR+). Conclusion: Overall, all models showed acceptable to excellent classification ability, but low precision, in identifying patients who were hospitalized, admitted to the ICU, or who died. In emergency contexts such as a pandemic, our models could help mitigate pressure on healthcare systems by identifying individuals at high risk of developing COVID-19-related complications, for whom vaccination and preventive care would be recommended.
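The combination of good sensitivity and specificity with low precision is typical when the predicted outcome is rare. The sketch below computes the metrics named in the abstract from a confusion matrix. The counts are invented for illustration: they reproduce the reported sensitivity (83%) and specificity (81%) for hospitalization, but the outcome prevalence (about 9%) is an assumption, so the resulting precision, F1 score and LR+ are not the study's values.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (each class is present and at least
    one case is predicted positive) and that specificity is below 1.
    """
    sensitivity = tp / (tp + fn)          # recall: share of true cases detected
    specificity = tn / (tn + fp)          # share of non-cases correctly ruled out
    precision = tp / (tp + fp)            # share of predicted cases that are true cases
    f1 = 2 * tp / (2 * tp + fp + fn)      # harmonic mean of precision and recall
    lr_plus = sensitivity / (1.0 - specificity)  # positive likelihood ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "lr_plus": lr_plus,
    }


# Hypothetical cohort: 100 hospitalized patients among 1100 confirmed cases.
metrics = classification_metrics(tp=83, fp=190, fn=17, tn=810)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these hypothetical counts, most positive predictions are false positives even though the model detects most true cases, which is why precision and F1 should be read alongside sensitivity, specificity and AUC.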

Risk factors, COVID-19, machine learning