<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFCC99">
<p><a class="moz-txt-link-freetext" href="https://arxiv.org/abs/1808.03578">https://arxiv.org/abs/1808.03578</a></p>
<h1 class="title mathjax" style="margin: 0.5em 0px 0.5em 20px;
font-size: x-large; font-weight: bold; line-height: 28.8px; color:
rgb(0, 0, 0); font-family: "Lucida Grande", helvetica,
arial, verdana, sans-serif; font-style: normal;
font-variant-ligatures: normal; font-variant-caps: normal;
letter-spacing: normal; orphans: 2; text-align: start;
text-indent: 0px; text-transform: none; white-space: normal;
widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px;
background-color: rgb(255, 255, 255); text-decoration-style:
initial; text-decoration-color: initial;">Dropout is a special
case of the stochastic delta rule: faster and more accurate deep
learning</h1>
<div class="authors" style="margin: 0.5em 0px 0.5em 20px; font-size:
medium; line-height: 24px; color: rgb(0, 0, 0); font-family:
"Lucida Grande", helvetica, arial, verdana, sans-serif;
font-style: normal; font-variant-ligatures: normal;
font-variant-caps: normal; font-weight: 400; letter-spacing:
normal; orphans: 2; text-align: start; text-indent: 0px;
text-transform: none; white-space: normal; widows: 2;
word-spacing: 0px; -webkit-text-stroke-width: 0px;
background-color: rgb(255, 255, 255); text-decoration-style:
initial; text-decoration-color: initial;"><a
href="https://arxiv.org/search/cs?searchtype=author&query=Frazier-Logue%2C+N"
style="text-decoration: none; font-size: medium;">Noah
Frazier-Logue</a>,<span> </span><a
href="https://arxiv.org/search/cs?searchtype=author&query=Hanson%2C+S+J"
style="text-decoration: none; font-size: medium;">Stephen José
Hanson</a></div>
<div class="dateline" style="margin: 0.5em 0px 0.5em 20px;
font-style: italic; font-size: small; color: rgb(0, 0, 0);
font-family: "Lucida Grande", helvetica, arial, verdana,
sans-serif; font-variant-ligatures: normal; font-variant-caps:
normal; font-weight: 400; letter-spacing: normal; orphans: 2;
text-align: start; text-indent: 0px; text-transform: none;
white-space: normal; widows: 2; word-spacing: 0px;
-webkit-text-stroke-width: 0px; background-color: rgb(255, 255,
255); text-decoration-style: initial; text-decoration-color:
initial;">(Submitted on 10 Aug 2018)</div>
<blockquote class="abstract mathjax" style="line-height: 20.16px;
margin-bottom: 1.5em; color: rgb(0, 0, 0); font-family:
"Lucida Grande", helvetica, arial, verdana, sans-serif;
font-size: 14.4px; font-style: normal; font-variant-ligatures:
normal; font-variant-caps: normal; font-weight: 400;
letter-spacing: normal; orphans: 2; text-align: start;
text-indent: 0px; text-transform: none; white-space: normal;
widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px;
background-color: rgb(255, 255, 255); text-decoration-style:
initial; text-decoration-color: initial;">Multi-layer neural
networks have led to remarkable performance on many kinds of
benchmark tasks in text, speech and image processing. Nonlinear
parameter estimation in hierarchical models is known to be subject
to overfitting. One approach to this overfitting and related
problems (local minima, collinearity, feature discovery, etc.) is
called dropout (Srivastava et al., 2014; Baldi et al., 2016). This
method removes hidden units with a Bernoulli random variable with
probability<span> </span><span class="MathJax"
id="MathJax-Element-1-Frame" tabindex="0" style="display:
inline; font-style: normal; font-weight: normal; line-height:
normal; font-size: 14.4px; text-indent: 0px; text-align: left;
text-transform: none; letter-spacing: normal; word-spacing:
normal; word-wrap: normal; white-space: nowrap; float: none;
direction: ltr; max-width: none; max-height: none; min-width:
0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px;"><nobr
style="transition: none 0s ease 0s; border: 0px; padding: 0px;
margin: 0px; max-width: none; max-height: none; min-width:
0px; min-height: 0px; vertical-align: 0px; line-height:
normal; text-decoration: none; white-space: nowrap
!important;"><span class="math" id="MathJax-Span-1"
style="transition: none 0s ease 0s; display: inline-block;
position: static; border: 0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal; text-decoration:
none; width: 0.629em;"><span style="transition: none 0s ease
0s; display: inline-block; position: relative; border:
0px; padding: 0px; margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none; width:
0.515em; height: 0px; font-size: 17.568px;"><span
style="transition: none 0s ease 0s; display: inline;
position: absolute; border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height: normal;
text-decoration: none; clip: rect(1.312em, 1000.51em,
2.28em, -999.997em); top: -1.932em; left: 0em;"><span
class="mrow" id="MathJax-Span-2" style="transition:
none 0s ease 0s; display: inline; position: static;
border: 0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal;
text-decoration: none;"><span class="mi"
id="MathJax-Span-3" style="transition: none 0s ease
0s; display: inline; position: static; border: 0px;
padding: 0px; margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none;
font-family: MathJax_Math-italic;">p</span></span><span
style="transition: none 0s ease 0s; display:
inline-block; position: static; border: 0px; padding:
0px; margin: 0px; vertical-align: 0px; line-height:
normal; text-decoration: none; width: 0px; height:
1.938em;"></span></span></span><span
style="transition: none 0s ease 0s; display: inline-block;
position: static; border-width: 0px; border-top-style:
initial; border-right-style: initial; border-bottom-style:
initial; border-left-style: solid; border-color: initial;
border-image: initial; padding: 0px; margin: 0px;
vertical-align: -0.274em; line-height: normal;
text-decoration: none; overflow: hidden; width: 0px;
height: 0.906em;"></span></span></nobr></span><span> </span>over
updates. In this paper we show that dropout is a special case
of a more general model, originally published in 1990, called the
stochastic delta rule (SDR; Hanson, 1990). SDR parameterizes each
weight in the network as a random variable with mean<span> </span><span
class="MathJax" id="MathJax-Element-2-Frame" tabindex="0"
style="display: inline; font-style: normal; font-weight: normal;
line-height: normal; font-size: 14.4px; text-indent: 0px;
text-align: left; text-transform: none; letter-spacing: normal;
word-spacing: normal; word-wrap: normal; white-space: nowrap;
float: none; direction: ltr; max-width: none; max-height: none;
min-width: 0px; min-height: 0px; border: 0px; padding: 0px;
margin: 0px;"><nobr style="transition: none 0s ease 0s; border:
0px; padding: 0px; margin: 0px; max-width: none; max-height:
none; min-width: 0px; min-height: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none; white-space:
nowrap !important;"><span class="math" id="MathJax-Span-4"
style="transition: none 0s ease 0s; display: inline-block;
position: static; border: 0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal; text-decoration:
none; width: 1.938em;"><span style="transition: none 0s ease
0s; display: inline-block; position: relative; border:
0px; padding: 0px; margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none; width:
1.597em; height: 0px; font-size: 17.568px;"><span
style="transition: none 0s ease 0s; display: inline;
position: absolute; border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height: normal;
text-decoration: none; clip: rect(0.401em, 1001.6em,
1.54em, -999.997em); top: -1.022em; left: 0em;"><span
class="mrow" id="MathJax-Span-5" style="transition:
none 0s ease 0s; display: inline; position: static;
border: 0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal;
text-decoration: none;"><span class="msubsup"
id="MathJax-Span-6" style="transition: none 0s ease
0s; display: inline; position: static; border: 0px;
padding: 0px; margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none;"><span
style="transition: none 0s ease 0s; display:
inline-block; position: relative; border: 0px;
padding: 0px; margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none; width:
1.597em; height: 0px;"><span style="transition:
none 0s ease 0s; display: inline; position:
absolute; border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height: normal;
text-decoration: none; clip: rect(3.361em,
1000.57em, 4.386em, -999.997em); top: -3.982em;
left: 0em;"><span class="mi" id="MathJax-Span-7"
style="transition: none 0s ease 0s; display:
inline; position: static; border: 0px;
padding: 0px; margin: 0px; vertical-align:
0px; line-height: normal; text-decoration:
none; font-family: MathJax_Math-italic;">μ</span><span
style="transition: none 0s ease 0s; display:
inline-block; position: static; border: 0px;
padding: 0px; margin: 0px; vertical-align:
0px; line-height: normal; text-decoration:
none; width: 0px; height: 3.987em;"></span></span><span
style="transition: none 0s ease 0s; display:
inline; position: absolute; border: 0px;
padding: 0px; margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none; top:
-3.811em; left: 0.629em;"><span class="texatom"
id="MathJax-Span-8" style="transition: none 0s
ease 0s; display: inline; position: static;
border: 0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal;
text-decoration: none;"><span class="mrow"
id="MathJax-Span-9" style="transition: none
0s ease 0s; display: inline; position:
static; border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height:
normal; text-decoration: none;"><span
class="msubsup" id="MathJax-Span-10"
style="transition: none 0s ease 0s;
display: inline; position: static; border:
0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal;
text-decoration: none;"><span
style="transition: none 0s ease 0s;
display: inline-block; position:
relative; border: 0px; padding: 0px;
margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration:
none; width: 0.914em; height: 0px;"><span
style="transition: none 0s ease 0s;
display: inline; position: absolute;
border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height:
normal; text-decoration: none; clip:
rect(3.475em, 1000.51em, 4.158em,
-999.997em); top: -3.982em; left:
0em;"><span class="mi"
id="MathJax-Span-11"
style="transition: none 0s ease 0s;
display: inline; position: static;
border: 0px; padding: 0px; margin:
0px; vertical-align: 0px;
line-height: normal;
text-decoration: none; font-size:
12.4206px; font-family:
MathJax_Math-italic;">w</span><span
style="transition: none 0s ease 0s;
display: inline-block; position:
static; border: 0px; padding: 0px;
margin: 0px; vertical-align: 0px;
line-height: normal;
text-decoration: none; width: 0px;
height: 3.987em;"></span></span><span
style="transition: none 0s ease 0s;
display: inline; position: absolute;
border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height:
normal; text-decoration: none; top:
-3.868em; left: 0.515em;"><span
class="texatom" id="MathJax-Span-12"
style="transition: none 0s ease 0s;
display: inline; position: static;
border: 0px; padding: 0px; margin:
0px; vertical-align: 0px;
line-height: normal;
text-decoration: none;"><span
class="mrow" id="MathJax-Span-13"
style="transition: none 0s ease
0s; display: inline; position:
static; border: 0px; padding: 0px;
margin: 0px; vertical-align: 0px;
line-height: normal;
text-decoration: none;"><span
class="mi" id="MathJax-Span-14"
style="transition: none 0s ease
0s; display: inline; position:
static; border: 0px; padding:
0px; margin: 0px;
vertical-align: 0px;
line-height: normal;
text-decoration: none;
font-size: 8.784px; font-family:
MathJax_Math-italic;">i</span><span
class="mi" id="MathJax-Span-15"
style="transition: none 0s ease
0s; display: inline; position:
static; border: 0px; padding:
0px; margin: 0px;
vertical-align: 0px;
line-height: normal;
text-decoration: none;
font-size: 8.784px; font-family:
MathJax_Math-italic;">j</span></span></span><span
style="transition: none 0s ease 0s;
display: inline-block; position:
static; border: 0px; padding: 0px;
margin: 0px; vertical-align: 0px;
line-height: normal;
text-decoration: none; width: 0px;
height: 3.987em;"></span></span></span></span></span></span><span
style="transition: none 0s ease 0s; display:
inline-block; position: static; border: 0px;
padding: 0px; margin: 0px; vertical-align:
0px; line-height: normal; text-decoration:
none; width: 0px; height: 3.987em;"></span></span></span></span></span><span
style="transition: none 0s ease 0s; display:
inline-block; position: static; border: 0px; padding:
0px; margin: 0px; vertical-align: 0px; line-height:
normal; text-decoration: none; width: 0px; height:
1.027em;"></span></span></span><span
style="transition: none 0s ease 0s; display: inline-block;
position: static; border-width: 0px; border-top-style:
initial; border-right-style: initial; border-bottom-style:
initial; border-left-style: solid; border-color: initial;
border-image: initial; padding: 0px; margin: 0px;
vertical-align: -0.483em; line-height: normal;
text-decoration: none; overflow: hidden; width: 0px;
height: 1.115em;"></span></span></nobr></span><span> </span>and
standard deviation<span> </span><span class="MathJax"
id="MathJax-Element-3-Frame" tabindex="0" style="display:
inline; font-style: normal; font-weight: normal; line-height:
normal; font-size: 14.4px; text-indent: 0px; text-align: left;
text-transform: none; letter-spacing: normal; word-spacing:
normal; word-wrap: normal; white-space: nowrap; float: none;
direction: ltr; max-width: none; max-height: none; min-width:
0px; min-height: 0px; border: 0px; padding: 0px; margin: 0px;"><nobr
style="transition: none 0s ease 0s; border: 0px; padding: 0px;
margin: 0px; max-width: none; max-height: none; min-width:
0px; min-height: 0px; vertical-align: 0px; line-height:
normal; text-decoration: none; white-space: nowrap
!important;"><span class="math" id="MathJax-Span-16"
style="transition: none 0s ease 0s; display: inline-block;
position: static; border: 0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal; text-decoration:
none; width: 1.938em;"><span style="transition: none 0s ease
0s; display: inline-block; position: relative; border:
0px; padding: 0px; margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none; width:
1.597em; height: 0px; font-size: 17.568px;"><span
style="transition: none 0s ease 0s; display: inline;
position: absolute; border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height: normal;
text-decoration: none; clip: rect(0.401em, 1001.6em,
1.54em, -999.997em); top: -1.022em; left: 0em;"><span
class="mrow" id="MathJax-Span-17" style="transition:
none 0s ease 0s; display: inline; position: static;
border: 0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal;
text-decoration: none;"><span class="msubsup"
id="MathJax-Span-18" style="transition: none 0s ease
0s; display: inline; position: static; border: 0px;
padding: 0px; margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none;"><span
style="transition: none 0s ease 0s; display:
inline-block; position: relative; border: 0px;
padding: 0px; margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none; width:
1.597em; height: 0px;"><span style="transition:
none 0s ease 0s; display: inline; position:
absolute; border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height: normal;
text-decoration: none; clip: rect(3.361em,
1000.57em, 4.158em, -999.997em); top: -3.982em;
left: 0em;"><span class="mi"
id="MathJax-Span-19" style="transition: none
0s ease 0s; display: inline; position: static;
border: 0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal;
text-decoration: none; font-family:
MathJax_Math-italic;">σ<span
style="transition: none 0s ease 0s; display:
inline-block; position: static; border: 0px;
padding: 0px; margin: 0px; vertical-align:
0px; line-height: normal; text-decoration:
none; overflow: hidden; height: 1px; width:
0.003em;"></span></span><span
style="transition: none 0s ease 0s; display:
inline-block; position: static; border: 0px;
padding: 0px; margin: 0px; vertical-align:
0px; line-height: normal; text-decoration:
none; width: 0px; height: 3.987em;"></span></span><span
style="transition: none 0s ease 0s; display:
inline; position: absolute; border: 0px;
padding: 0px; margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration: none; top:
-3.811em; left: 0.572em;"><span class="texatom"
id="MathJax-Span-20" style="transition: none
0s ease 0s; display: inline; position: static;
border: 0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal;
text-decoration: none;"><span class="mrow"
id="MathJax-Span-21" style="transition: none
0s ease 0s; display: inline; position:
static; border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height:
normal; text-decoration: none;"><span
class="msubsup" id="MathJax-Span-22"
style="transition: none 0s ease 0s;
display: inline; position: static; border:
0px; padding: 0px; margin: 0px;
vertical-align: 0px; line-height: normal;
text-decoration: none;"><span
style="transition: none 0s ease 0s;
display: inline-block; position:
relative; border: 0px; padding: 0px;
margin: 0px; vertical-align: 0px;
line-height: normal; text-decoration:
none; width: 0.914em; height: 0px;"><span
style="transition: none 0s ease 0s;
display: inline; position: absolute;
border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height:
normal; text-decoration: none; clip:
rect(3.475em, 1000.51em, 4.158em,
-999.997em); top: -3.982em; left:
0em;"><span class="mi"
id="MathJax-Span-23"
style="transition: none 0s ease 0s;
display: inline; position: static;
border: 0px; padding: 0px; margin:
0px; vertical-align: 0px;
line-height: normal;
text-decoration: none; font-size:
12.4206px; font-family:
MathJax_Math-italic;">w</span><span
style="transition: none 0s ease 0s;
display: inline-block; position:
static; border: 0px; padding: 0px;
margin: 0px; vertical-align: 0px;
line-height: normal;
text-decoration: none; width: 0px;
height: 3.987em;"></span></span><span
style="transition: none 0s ease 0s;
display: inline; position: absolute;
border: 0px; padding: 0px; margin:
0px; vertical-align: 0px; line-height:
normal; text-decoration: none; top:
-3.868em; left: 0.515em;"><span
class="texatom" id="MathJax-Span-24"
style="transition: none 0s ease 0s;
display: inline; position: static;
border: 0px; padding: 0px; margin:
0px; vertical-align: 0px;
line-height: normal;
text-decoration: none;"><span
class="mrow" id="MathJax-Span-25"
style="transition: none 0s ease
0s; display: inline; position:
static; border: 0px; padding: 0px;
margin: 0px; vertical-align: 0px;
line-height: normal;
text-decoration: none;"><span
class="mi" id="MathJax-Span-26"
style="transition: none 0s ease
0s; display: inline; position:
static; border: 0px; padding:
0px; margin: 0px;
vertical-align: 0px;
line-height: normal;
text-decoration: none;
font-size: 8.784px; font-family:
MathJax_Math-italic;">i</span><span
class="mi" id="MathJax-Span-27"
style="transition: none 0s ease
0s; display: inline; position:
static; border: 0px; padding:
0px; margin: 0px;
vertical-align: 0px;
line-height: normal;
text-decoration: none;
font-size: 8.784px; font-family:
MathJax_Math-italic;">j</span></span></span><span
style="transition: none 0s ease 0s;
display: inline-block; position:
static; border: 0px; padding: 0px;
margin: 0px; vertical-align: 0px;
line-height: normal;
text-decoration: none; width: 0px;
height: 3.987em;"></span></span></span></span></span></span><span
style="transition: none 0s ease 0s; display:
inline-block; position: static; border: 0px;
padding: 0px; margin: 0px; vertical-align:
0px; line-height: normal; text-decoration:
none; width: 0px; height: 3.987em;"></span></span></span></span></span><span
style="transition: none 0s ease 0s; display:
inline-block; position: static; border: 0px; padding:
0px; margin: 0px; vertical-align: 0px; line-height:
normal; text-decoration: none; width: 0px; height:
1.027em;"></span></span></span><span
style="transition: none 0s ease 0s; display: inline-block;
position: static; border-width: 0px; border-top-style:
initial; border-right-style: initial; border-bottom-style:
initial; border-left-style: solid; border-color: initial;
border-image: initial; padding: 0px; margin: 0px;
vertical-align: -0.483em; line-height: normal;
text-decoration: none; overflow: hidden; width: 0px;
height: 1.115em;"></span></span></nobr></span>. These
random variables are sampled on each forward activation,
consequently creating an exponential number of potential networks
with shared weights. Both parameters are updated according to
prediction error, thus implementing weight noise injections that
reflect a local history of prediction error and efficient model
averaging. SDR therefore implements a local, gradient-dependent
simulated annealing per weight, converging to a Bayes-optimal
network. Tests on standard benchmarks (CIFAR) using a modified
version of DenseNet show that SDR outperforms standard dropout by
over 50% in both error and loss. Furthermore, the SDR
implementation converges on a solution much faster, reaching a
training error of 5 in just 15 epochs with DenseNet-40, compared to
standard DenseNet-40's 94 epochs.</blockquote>
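<p>The two mechanisms described in the abstract can be sketched in a few
lines of NumPy. This is a minimal illustration under stated assumptions,
not the authors' implementation. The function names are hypothetical. The
abstract says only that both parameters are updated according to
prediction error; the specific update used here (a gradient step on the
mean, a standard deviation that grows with the gradient magnitude and is
then decayed) and the coefficients <code>alpha</code>, <code>beta</code>
and <code>zeta</code> are assumptions made for illustration.</p>
<pre>
```python
import numpy as np


def dropout_forward(h, W, p, rng):
    """Remove each hidden unit with probability p, then apply the weights."""
    # Bernoulli mask: 1 keeps a unit, 0 removes it.
    mask = rng.binomial(1, 1.0 - p, size=h.shape[-1])
    return (h * mask) @ W, mask


def dropout_as_weight_noise(h, W, mask):
    """Same computation with the mask moved onto the rows of W.

    Removing hidden unit i is the same as zeroing every weight leaving it,
    so dropout can be read as a particular kind of noise on the weights.
    """
    return h @ (mask[:, None] * W)


def sdr_sample(mu, sigma, rng):
    """Draw one weight matrix with mean mu and standard deviation sigma.

    A Gaussian is assumed here; a fresh draw would be made on each
    forward pass.
    """
    return mu + sigma * rng.standard_normal(mu.shape)


def sdr_update(mu, sigma, grad, alpha=0.1, beta=0.05, zeta=0.9):
    """One assumed update of both parameters from the error gradient.

    alpha, beta and zeta are hypothetical coefficients: step size for the
    mean, growth rate for the standard deviation, and a decay factor that
    shrinks the noise over time (the annealing part).
    """
    new_mu = mu - alpha * grad
    new_sigma = zeta * (sigma + beta * np.abs(grad))
    return new_mu, new_sigma
```
</pre>
<p>With <code>sigma</code> set to zero, <code>sdr_sample</code> returns
the mean weights unchanged, so the network becomes deterministic. Under
the assumed update, weights that keep receiving large gradients stay
noisy, while the decay factor lets the noise on settled weights fade.</p>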
<p><br>
</p>
<pre class="moz-signature" cols="72">--
Stephen José Hanson
Full Professor
Director RUBIC (University-Wide)
Department of Psychology (NK)
Cognitive Science Center (NB)</pre>
</body>
</html>