Title: On Using Inductive Biases for Designing Deep Learning Architectures
Date: Wednesday, December 9th, 2020
Time: 13:00 to 15:00 (EST)
Machine Learning PhD Student
School of Computational Science and Engineering
Georgia Institute of Technology
- Dr. Srinivas Aluru (Advisor) - School of Computational Science and Engineering, Georgia Institute of Technology
- Dr. Le Song - School of Computational Science and Engineering, Georgia Institute of Technology
- Dr. Xiuwei Zhang - School of Computational Science and Engineering, Georgia Institute of Technology
- Dr. B. Aditya Prakash - School of Computational Science and Engineering, Georgia Institute of Technology
- Dr. Ashok Goel - School of Interactive Computing, Georgia Institute of Technology
I will present two novel, generic approaches for designing deep learning architectures that incorporate domain knowledge about the problem under consideration. 'Cooperative Neural Networks' take their inductive biases from an underlying probabilistic graphical model, while problem-dependent 'Unrolled Algorithms' use, as an architectural template, the structure obtained by unrolling an optimization algorithm on an objective function of interest. The neural network architectures obtained from these approaches typically have very few learnable parameters and offer considerable run-time improvements over other deep learning methods. We have applied these techniques to NLP tasks and to problems in finance, healthcare, and computational biology.
My thesis consists of three components:
First, I will present the Cooperative Neural Network (CoNN-sLDA) approach, which we developed for document classification. We use the popular Latent Dirichlet Allocation graphical model as the inductive bias for the CoNN-sLDA model and demonstrate a 23% reduction in error on the challenging MultiSent data set compared to the state of the art.
Second, I will explain the idea of using 'Unrolled Algorithms' for the sparse graph recovery task. We propose a deep learning architecture, GLAD, which uses an alternating minimization algorithm as its inductive bias and learns the model parameters via supervised learning. We show that GLAD yields a very compact and effective model for recovering sparse graphs from data.
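To give a flavor of the unrolling idea (this is a generic sketch, not GLAD itself, which unrolls an alternating minimization for precision-matrix estimation): each iteration of an iterative optimizer becomes one "layer" of the network, and per-iteration quantities become per-layer parameters. The sketch below unrolls ISTA for the simpler sparse regression objective min_x 0.5||Ax - y||^2 + lam||x||_1; the per-layer thresholds `thetas` are kept fixed here, but in a GLAD-style model they are exactly what supervised training would learn.

```python
import numpy as np

def soft_threshold(x, t):
    # proximal operator of the L1 norm: elementwise shrinkage toward zero
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def unrolled_ista(A, y, thetas, step):
    """Run len(thetas) ISTA iterations. Each unrolled 'layer' k has its own
    threshold thetas[k]; in a GLAD-style architecture these per-layer
    parameters are learned from (data, ground-truth graph) pairs."""
    x = np.zeros(A.shape[1])
    for theta in thetas:
        grad = A.T @ (A @ x - y)                 # gradient of the smooth part
        x = soft_threshold(x - step * grad, theta)
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20)) / np.sqrt(50)  # toy sensing matrix
x_true = np.zeros(20)
x_true[[2, 7, 11]] = [1.5, -2.0, 1.0]            # sparse ground truth
y = A @ x_true

step = 1.0 / np.linalg.norm(A, 2) ** 2           # safe step size (1/L)
x_hat = unrolled_ista(A, y, thetas=[1e-3] * 100, step=step)
support = {int(i) for i in np.flatnonzero(np.abs(x_hat) > 0.5)}
print(sorted(support))
```

Because the unrolled network has only one scalar parameter per layer, the resulting model is far more compact than a generic deep network for the same task.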
Finally, I will walk through our approach to problems arising in single-cell RNA sequencing data. Specifically, we design a novel gene regulatory network reconstruction framework called 'GRNUlar'. Our method leverages the expressive power of neural networks in a multi-task learning framework combined with our 'Unrolled Algorithms' technique. To the best of our knowledge, this work is the first to successfully use expression data simulators for supervised learning of gene regulatory networks from single-cell RNA-seq data.
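As a loose, hypothetical illustration of just the multi-task ingredient (not the actual GRNUlar pipeline, and with all names and toy data invented here): one shared network can regress every target gene on transcription-factor expression at once, so all genes share the same learned TF representation, and a crude TF-to-gene influence score can then be read off the weights.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented toy setup: 2 transcription factors drive 4 target genes.
n_cells, n_tfs, n_genes, n_hidden = 200, 2, 4, 8
tf_expr = rng.standard_normal((n_cells, n_tfs))
true_edges = np.array([[1.0, 0.0],   # gene 0 regulated by TF 0
                       [1.0, 0.0],   # gene 1 regulated by TF 0
                       [0.0, 1.0],   # gene 2 regulated by TF 1
                       [0.0, 1.0]])  # gene 3 regulated by TF 1
gene_expr = (np.tanh(tf_expr) @ true_edges.T
             + 0.05 * rng.standard_normal((n_cells, n_genes)))

# One shared hidden layer predicts all genes jointly (the multi-task part).
W1 = 0.1 * rng.standard_normal((n_tfs, n_hidden))
W2 = 0.1 * rng.standard_normal((n_hidden, n_genes))

losses, lr = [], 0.1
for _ in range(1000):
    h = np.tanh(tf_expr @ W1)
    err = h @ W2 - gene_expr                    # (n_cells, n_genes)
    losses.append(float((err ** 2).mean()))
    gW2 = h.T @ err / n_cells                   # backprop through the net
    gW1 = tf_expr.T @ ((err @ W2.T) * (1 - h ** 2)) / n_cells
    W1 -= lr * gW1
    W2 -= lr * gW2

# Crude proxy for regulatory strength: aggregate |weight| paths TF -> gene.
influence = np.abs(W1) @ np.abs(W2)             # shape (n_tfs, n_genes)
```

In the actual framework, sparsity of the recovered regulatory network comes from the unrolled optimization layers rather than from a weight-magnitude heuristic like the one above.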