% !TEX root = thesis.tex
\documentclass[]{article}
%opening
\title{Research Statement}
\author{Yani Ioannou}
\begin{document}
\maketitle
%\begin{abstract}
%\end{abstract}
\section{Research Statement}
Deep learning has in recent years come to dominate the previously separate research fields of machine learning, computer vision, natural language understanding and speech recognition. Despite breakthroughs in training deep networks, many relatively simple questions remain unanswered about the representations learned within these networks. In addition, the optimization of these networks remains poorly understood.
This lack of understanding of both the optimization and the structure of deep networks has meant that contemporary deep network architectures for image classification have high computational and memory complexity. This is a direct result of our inability to identify the optimal architecture for a given dataset. Instead, the approach advocated by many researchers in the field has been to train monolithic networks with excess complexity and strong regularization - an approach that has found success in practice for accuracy, but leaves much to be desired in efficiency.
In our work, we instead propose that carefully designing networks in light of our prior knowledge of the task can improve the memory and compute efficiency of state-of-the-art networks, and even increase accuracy through structurally induced regularization. We have described several novel methods for creating computationally efficient and compact deep neural networks (\glspl{cnn}\index{CNN}):
\begin{itemize}
\item Using conditional computation for faster inference, by understanding the connections between two state-of-the-art classifiers: neural networks and decision forests.
\item Exploiting our knowledge of the low-rank nature of most filters learned for image recognition by structuring a deep network to learn a low-rank basis for the spatial extents of filters (sketched below).
\item A novel sparse connection structure for deep networks that allows a significant reduction in computational cost while increasing accuracy (sketched below).
\end{itemize}
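As a rough sketch of the latter two ideas (the notation here is purely illustrative, and not the exact formulation used in our published work), a full-rank $k \times k$ spatial filter $w$ can be approximated by a small basis of separable, rank-1 filters,
\begin{equation}
w \approx \sum_{r=1}^{R} v_r h_r^{\top},
\end{equation}
where $v_r$ and $h_r$ are length-$k$ column vectors. Convolution with $w$ then reduces to $R$ vertical and $R$ horizontal convolutions, lowering the spatial cost from $k^2$ to roughly $2Rk$ multiply-accumulates per output position when $R$ is small. The sparse connection structure can be viewed in similarly simple terms: restricting each output channel of a layer to a subset of its input channels, with $c$ input and $c$ output channels split into $g$ groups, reduces the channel-mixing cost from $k^2 c^2$ to $k^2 c^2 / g$ multiply-accumulates per output position.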
\pagebreak
\section{Research Proposal}
\textbf{Learning with Less: Efficient Deep Neural Networks}
\Glspl{dnn} are the driving force behind emerging technology such as self-driving cars, automatic image tagging, AI assistants such as Siri, and Go-playing AI capable of beating human champions. Recent breakthroughs in training \glspl{dnn}\index{DNN} have been responsible for vastly improved generalisation across a range of tasks, including image class recognition, segmentation and detection. However, serious issues remain in training \glspl{dnn}\index{DNN}, including a lack of understanding of the optimal structure or training methodology for a given problem or data set. Training also requires massive amounts of labelled data, which for many problems is difficult and/or expensive to obtain.
Due to the increasing number of parameters in state-of-the-art deep networks (more than 100 million), the optimization of \glspl{dnn}\index{DNN} is increasingly challenging. Our previous work has shown that re-structuring existing state-of-the-art networks with fewer, more relevant parameters can improve generalisation - the ability to make accurate predictions outside the training set - while also improving computational efficiency. We plan to take this concept further towards answering one of the most pressing questions on the limitations of training \glspl{dnn}\index{DNN}: Can \glspl{dnn}\index{DNN} be designed to be trained with smaller data sets just as effectively as with large data sets?
While our previous work has shown that existing networks are greatly over-parameterised and that generalisation can be improved, it would be of even more practical value if generalisation could instead be maintained while reducing data set sizes. Humans can learn from surprisingly few labelled samples - a child does not need to see many images of a dog in order to recognise dogs accurately. Pushing down the size of the data sets required would vastly increase the possible applications of \glspl{dnn}\index{DNN}.
\end{document}