New post from 2022
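
The post added in this commit proves three monotonicity claims for the minimizers of $f_\lambda(w) = l(w) + \lambda R(w)$. Below is a minimal numerical spot-check of those claims, not part of the commit itself. The loss `loss`, the regularizer `reg`, the grid, and the list of $\lambda$ values are all hypothetical choices made for illustration. The proof only needs $w_1$ and $w_2$ to be minimizers over the same domain, so minimizing over a finite grid is a valid instance of the setting.

```python
import math


def loss(w):
    # Hypothetical non-convex loss l(w), chosen only for illustration.
    return (w - 2.0) ** 2 + math.sin(3.0 * w)


def reg(w):
    # Hypothetical regularizer R(w); it satisfies R(w) >= 0 for all w.
    return abs(w)


def minimize_on_grid(lam, grid):
    # Exact argmin of f_lambda(w) = l(w) + lam * R(w) over the finite grid.
    return min(grid, key=lambda w: loss(w) + lam * reg(w))


def check_monotonicity(lambdas, grid, tol=1e-9):
    # For increasing lambda, record (lambda, f, l, R) at the grid minimizer
    # and assert the three inequalities from the post for each adjacent pair.
    rows = []
    for lam in sorted(lambdas):
        w = minimize_on_grid(lam, grid)
        rows.append((lam, loss(w) + lam * reg(w), loss(w), reg(w)))
    for (_, f1, l1, r1), (_, f2, l2, r2) in zip(rows, rows[1:]):
        assert f1 <= f2 + tol  # minimum objective value does not decrease
        assert l1 <= l2 + tol  # loss at the minimizer does not decrease
        assert r1 + tol >= r2  # regularizer at the minimizer does not increase
    return rows


# Assumed search domain: 10001 evenly spaced points in [-5, 5].
GRID = [i / 1000.0 for i in range(-5000, 5001)]
ROWS = check_monotonicity([0.0, 0.1, 0.5, 1.0, 2.0, 5.0, 10.0], GRID)

for lam, f_val, l_val, r_val in ROWS:
    print(f"lambda={lam:5.1f}  f={f_val:8.4f}  l={l_val:8.4f}  R={r_val:7.4f}")
```

The loss here is deliberately non-convex, since the argument in the post uses only the definition of a minimizer and never convexity or differentiability. The small tolerance `tol` only absorbs floating-point rounding.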

brianzhang01 · brianzhang01 · commit c61ccc38e50b · 2023-08-20T15:06:14.000-04:00
diff --git a/content/post/2022-09-19-a-regularization-proof.Rmd b/content/post/2022-09-19-a-regularization-proof.Rmd
@@ -0,0 +1,78 @@
+---
+title: A Regularization Proof
+author: Brian Zhang
+date: '2022-09-19'
+slug: a-regularization-proof
+categories: []
+tags: []
+description: 'Investigating behavior of a function minimum as we add regularization.'
+---
+
+Say we have a loss function $l(w)$. With no regularization, we might obtain the minimum at $w = w_0$. Now consider the setting with regularization:
+$$
+f_\lambda(w) = l(w) + \lambda R(w),
+$$
+where $R(w) \geq 0$ is some regularization function and $\lambda \geq 0$. What can we say if we consider the minimizing inputs $w_1$ for $f_{\lambda_1}(w)$ and $w_2$ for $f_{\lambda_2}(w)$, with $0 \leq \lambda_1 < \lambda_2$?
+$$
+w_1 = \operatorname{argmin}_w \left[ l(w) + \lambda_1 R(w) \right],\\
+w_2 = \operatorname{argmin}_w \left[ l(w) + \lambda_2 R(w) \right].
+$$
+
+Intuitively, as we increase $\lambda$ from $\lambda_1$ to $\lambda_2$, the function $f_\lambda(w)$ places more importance on the regularization term $R(w)$. We should expect $l(w)$ evaluated at the optimum $w$ to increase, and the regularization term $R(w)$ evaluated at the optimum $w$ to decrease.
+
+By the properties of the optimum, we have
+\begin{gather}
+l(w_1) + \lambda_1 R(w_1) \leq l(w_2) + \lambda_1 R(w_2), \quad (1)\\
+l(w_2) + \lambda_2 R(w_2) \leq l(w_1) + \lambda_2 R(w_1). \quad (2)
+\end{gather}
+The only other information we have relating these terms is that $R(w) \geq 0$ (for all $w$) and $0 \leq \lambda_1 < \lambda_2$. So we work with what we have. First, leveraging $(1)$,
+\begin{align*}
+f_{\lambda_1}(w_1) &= l(w_1) + \lambda_1 R(w_1)\\
+&\leq l(w_2) + \lambda_1 R(w_2)\\
+&\leq l(w_2) + \lambda_2 R(w_2)\\
+&= f_{\lambda_2}(w_2),
+\end{align*}
+so the minimum value of the objective increases (or stays the same) as we increase $\lambda$. This also follows directly from the fact that $f_{\lambda_2}(w) \geq f_{\lambda_1}(w)$ for all $w$, since taking the minimum over $w$ on both sides preserves the inequality.
+
+The other inequalities are trickier. Observe (starting with $(2)$):
+\begin{align*}
+l(w_1) + \lambda_2 R(w_1) &\geq l(w_2) + \lambda_2 R(w_2)\\
+&= l(w_2) + (\lambda_1 + \lambda_2 - \lambda_1) R(w_2)\\
+&= \left[l(w_2) + \lambda_1 R(w_2)\right] + (\lambda_2 - \lambda_1) R(w_2)\\
+&\geq \left[l(w_1) + \lambda_1 R(w_1)\right] + (\lambda_2 - \lambda_1) R(w_2).
+\end{align*}
+Subtracting $(l(w_1) + \lambda_1 R(w_1))$ from both sides, we have
+$$
+(\lambda_2 - \lambda_1) R(w_1) \geq (\lambda_2 - \lambda_1) R(w_2).
+$$
+Since $\lambda_2 - \lambda_1 > 0$, we can divide both sides by it to obtain
+$$
+R(w_1) \geq R(w_2).
+$$
+In words, the regularization term evaluated at the minimizer (not including the factor of $\lambda$) decreases (or stays the same) as we increase $\lambda$.^[An alternate proof, by adding $(1)$ and $(2)$:
+$$
+l(w_1) + l(w_2) + \lambda_1 R(w_1) + \lambda_2 R(w_2) \leq l(w_1) + l(w_2) + \lambda_1 R(w_2) + \lambda_2 R(w_1),\\
+\lambda_1 R(w_1) + \lambda_2 R(w_2) \leq \lambda_1 R(w_2) + \lambda_2 R(w_1),\\
+(\lambda_2 - \lambda_1) R(w_2) \leq (\lambda_2 - \lambda_1) R(w_1),\\
+R(w_2) \leq R(w_1).
+$$
+]
+
+Starting with $(1)$ and leveraging this fact, we additionally have
+\begin{align*}
+l(w_1) + \lambda_1 R(w_1) &\leq l(w_2) + \lambda_1 R(w_2)\\
+&\leq l(w_2) + \lambda_1 R(w_1).
+\end{align*}
+Subtracting $\lambda_1 R(w_1)$ from both sides, we obtain
+$$
+l(w_1) \leq l(w_2).
+$$
+In words, the loss evaluated at the minimizer increases (or stays the same) as we increase $\lambda$.^[An alternate proof, valid when $\lambda_1 > 0$, by adding $1/\lambda_1$ times $(1)$ to $1/\lambda_2$ times $(2)$:
+$$
+\frac{l(w_1)}{\lambda_1} + \frac{l(w_2)}{\lambda_2} + R(w_1) + R(w_2) \leq \frac{l(w_2)}{\lambda_1} + \frac{l(w_1)}{\lambda_2} + R(w_2) + R(w_1),\\
+\frac{l(w_1)}{\lambda_1} + \frac{l(w_2)}{\lambda_2} \leq \frac{l(w_2)}{\lambda_1} + \frac{l(w_1)}{\lambda_2},\\
+\left(\frac{1}{\lambda_1} - \frac{1}{\lambda_2}\right) l(w_1) \leq \left(\frac{1}{\lambda_1} - \frac{1}{\lambda_2}\right) l(w_2) ,\\
+l(w_1) \leq l(w_2).
+$$
+]
+
diff --git a/content/post/2022-09-19-a-regularization-proof.html b/content/post/2022-09-19-a-regularization-proof.html
@@ -0,0 +1,80 @@
+---
+title: A Regularization Proof
+author: Brian Zhang
+date: '2022-09-19'
+slug: a-regularization-proof
+categories: []
+tags: []
+description: 'Investigating behavior of a function minimum as we add regularization.'
+---
+
+
+
+<p>Say we have a loss function <span class="math inline">\(l(w)\)</span>. With no regularization, we might obtain the minimum at <span class="math inline">\(w = w_0\)</span>. Now consider the setting with regularization:
+<span class="math display">\[
+f_\lambda(w) = l(w) + \lambda R(w),
+\]</span>
+where <span class="math inline">\(R(w) \geq 0\)</span> is some regularization function and <span class="math inline">\(\lambda \geq 0\)</span>. What can we say if we consider the minimizing inputs <span class="math inline">\(w_1\)</span> for <span class="math inline">\(f_{\lambda_1}(w)\)</span> and <span class="math inline">\(w_2\)</span> for <span class="math inline">\(f_{\lambda_2}(w)\)</span>, with <span class="math inline">\(0 \leq \lambda_1 &lt; \lambda_2\)</span>?
+<span class="math display">\[
+w_1 = \operatorname{argmin}_w \left[ l(w) + \lambda_1 R(w) \right],\\
+w_2 = \operatorname{argmin}_w \left[ l(w) + \lambda_2 R(w) \right].
+\]</span></p>
+<p>Intuitively, as we increase <span class="math inline">\(\lambda\)</span> from <span class="math inline">\(\lambda_1\)</span> to <span class="math inline">\(\lambda_2\)</span>, the function <span class="math inline">\(f_\lambda(w)\)</span> places more importance on the regularization term <span class="math inline">\(R(w)\)</span>. We should expect <span class="math inline">\(l(w)\)</span> evaluated at the optimum <span class="math inline">\(w\)</span> to increase, and the regularization term <span class="math inline">\(R(w)\)</span> evaluated at the optimum <span class="math inline">\(w\)</span> to decrease.</p>
+<p>By the properties of the optimum, we have
+<span class="math display">\[\begin{gather}
+l(w_1) + \lambda_1 R(w_1) \leq l(w_2) + \lambda_1 R(w_2), \quad (1)\\
+l(w_2) + \lambda_2 R(w_2) \leq l(w_1) + \lambda_2 R(w_1). \quad (2)
+\end{gather}\]</span>
+The only other information we have relating these terms is that <span class="math inline">\(R(w) \geq 0\)</span> (for all <span class="math inline">\(w\)</span>) and <span class="math inline">\(0 \leq \lambda_1 &lt; \lambda_2\)</span>. So we work with what we have. First, leveraging <span class="math inline">\((1)\)</span>,
+<span class="math display">\[\begin{align*}
+f_{\lambda_1}(w_1) &amp;= l(w_1) + \lambda_1 R(w_1)\\
+&amp;\leq l(w_2) + \lambda_1 R(w_2)\\
+&amp;\leq l(w_2) + \lambda_2 R(w_2)\\
+&amp;= f_{\lambda_2}(w_2),
+\end{align*}\]</span>
+so the minimum value of the objective increases (or stays the same) as we increase <span class="math inline">\(\lambda\)</span>. This also follows directly from the fact that <span class="math inline">\(f_{\lambda_2}(w) \geq f_{\lambda_1}(w)\)</span> for all <span class="math inline">\(w\)</span>, since taking the minimum over <span class="math inline">\(w\)</span> on both sides preserves the inequality.</p>
+<p>The other inequalities are trickier. Observe (starting with <span class="math inline">\((2)\)</span>):
+<span class="math display">\[\begin{align*}
+l(w_1) + \lambda_2 R(w_1) &amp;\geq l(w_2) + \lambda_2 R(w_2)\\
+&amp;= l(w_2) + (\lambda_1 + \lambda_2 - \lambda_1) R(w_2)\\
+&amp;= \left[l(w_2) + \lambda_1 R(w_2)\right] + (\lambda_2 - \lambda_1) R(w_2)\\
+&amp;\geq \left[l(w_1) + \lambda_1 R(w_1)\right] + (\lambda_2 - \lambda_1) R(w_2).
+\end{align*}\]</span>
+Subtracting <span class="math inline">\((l(w_1) + \lambda_1 R(w_1))\)</span> from both sides, we have
+<span class="math display">\[
+(\lambda_2 - \lambda_1) R(w_1) \geq (\lambda_2 - \lambda_1) R(w_2).
+\]</span>
+Since <span class="math inline">\(\lambda_2 - \lambda_1 &gt; 0\)</span>, we can divide both sides by it to obtain
+<span class="math display">\[
+R(w_1) \geq R(w_2).
+\]</span>
+In words, the regularization term evaluated at the minimizer (not including the factor of <span class="math inline">\(\lambda\)</span>) decreases (or stays the same) as we increase <span class="math inline">\(\lambda\)</span>.<a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a></p>
+<p>Starting with <span class="math inline">\((1)\)</span> and leveraging this fact, we additionally have
+<span class="math display">\[\begin{align*}
+l(w_1) + \lambda_1 R(w_1) &amp;\leq l(w_2) + \lambda_1 R(w_2)\\
+&amp;\leq l(w_2) + \lambda_1 R(w_1).
+\end{align*}\]</span>
+Subtracting <span class="math inline">\(\lambda_1 R(w_1)\)</span> from both sides, we obtain
+<span class="math display">\[
+l(w_1) \leq l(w_2).
+\]</span>
+In words, the loss evaluated at the minimizer increases (or stays the same) as we increase <span class="math inline">\(\lambda\)</span>.<a href="#fn2" class="footnote-ref" id="fnref2"><sup>2</sup></a></p>
+<div class="footnotes">
+<hr />
+<ol>
+<li id="fn1"><p>An alternate proof, by adding <span class="math inline">\((1)\)</span> and <span class="math inline">\((2)\)</span>:
+<span class="math display">\[
+l(w_1) + l(w_2) + \lambda_1 R(w_1) + \lambda_2 R(w_2) \leq l(w_1) + l(w_2) + \lambda_1 R(w_2) + \lambda_2 R(w_1),\\
+\lambda_1 R(w_1) + \lambda_2 R(w_2) \leq \lambda_1 R(w_2) + \lambda_2 R(w_1),\\
+(\lambda_2 - \lambda_1) R(w_2) \leq (\lambda_2 - \lambda_1) R(w_1),\\
+R(w_2) \leq R(w_1).
+\]</span><a href="#fnref1" class="footnote-back">↩︎</a></p></li>
+<li id="fn2"><p>An alternate proof, valid when <span class="math inline">\(\lambda_1 &gt; 0\)</span>, by adding <span class="math inline">\(1/\lambda_1\)</span> times <span class="math inline">\((1)\)</span> to <span class="math inline">\(1/\lambda_2\)</span> times <span class="math inline">\((2)\)</span>:
+<span class="math display">\[
+\frac{l(w_1)}{\lambda_1} + \frac{l(w_2)}{\lambda_2} + R(w_1) + R(w_2) \leq \frac{l(w_2)}{\lambda_1} + \frac{l(w_1)}{\lambda_2} + R(w_2) + R(w_1),\\
+\frac{l(w_1)}{\lambda_1} + \frac{l(w_2)}{\lambda_2} \leq \frac{l(w_2)}{\lambda_1} + \frac{l(w_1)}{\lambda_2},\\
+\left(\frac{1}{\lambda_1} - \frac{1}{\lambda_2}\right) l(w_1) \leq \left(\frac{1}{\lambda_1} - \frac{1}{\lambda_2}\right) l(w_2) ,\\
+l(w_1) \leq l(w_2).
+\]</span><a href="#fnref2" class="footnote-back">↩︎</a></p></li>
+</ol>
+</div>
diff --git a/packages.txt b/packages.txt
@@ -1,180 +1,44 @@
-[1] "Today's date is 2020-02-04"
-        Package   Version
-      animation       2.5
-  AnnotationDbi    1.28.2
-            ape       5.3
-     assertthat     0.2.0
-      backports     1.1.2
-      base64enc     0.1-3
-             BB 2014.10-1
-       beeswarm     0.2.3
-             BH  1.65.0-1
-          bindr       0.1
-       bindrcpp       0.2
-        Biobase    2.26.0
-   BiocGenerics    0.12.1
-  BiocInstaller    1.16.2
-            bit    1.1-12
-          bit64     0.9-7
-         bitops     1.0-6
-           blob     1.1.0
-       blogdown       0.4
-       bookdown       0.5
-           brew     1.0-6
-          broom     0.4.3
-      calibrate     1.7.2
-          callr     1.0.0
-        caTools    1.17.1
-     cellranger     1.1.0
-            cli     1.0.0
-          clipr     0.4.0
-           coda    0.19-1
-     colorspace     1.3-2
-     commonmark       1.4
-         crayon     1.3.4
-           curl       3.1
-     data.table    1.11.6
-            DBI       0.7
-         dbplyr     1.2.0
-          debug     1.3.1
-         deldir    0.1-14
-           desc     1.1.1
-       devtools    1.13.4
-        dfoptim 2017.12-1
-      dichromat     2.0-0
-         digest    0.6.14
-          dplyr     0.7.4
-          edgeR     3.8.5
-        ellipse     0.4.1
-       evaluate    0.10.1
-         farver     1.1.0
-        FLtools     0.0.2
-        forcats     0.2.0
-        foreach     1.4.4
-          gdata    2.18.0
-   GenomeInfoDb     1.2.4
-         getopt    1.20.2
-      gganimate     1.0.0
-        ggplot2     3.2.1
-        ggrepel     0.8.1
-          git2r    0.21.0
-         glmnet    2.0-13
-           glue     1.2.0
-         gplots     3.0.1
-         gtable     0.2.0
-         gtools     3.5.0
-          haven     1.1.0
-          highr       0.6
-            hms     0.4.0
-      htmltools     0.3.6
-    htmlwidgets       0.9
-         httpuv     1.5.2
-           httr     1.3.1
-         igraph     1.2.1
-         inline    0.3.14
-        IRanges     2.0.1
-      iterators     1.0.9
-      itertools     0.1-3
-       jsonlite       1.5
-          knitr      1.20
-       labeling       0.3
-          later     0.8.0
-      latex2exp     0.4.0
-       lazyeval     0.2.1
-          limma    3.22.4
-       lineprof  0.1.9001
-           lme4    1.1-17
-      lubridate     1.7.1
-       magrittr       1.5
-     manipulate     1.0.1
-        mapproj     1.2-5
-           maps     3.2.0
-       maptools     0.9-2
-       markdown       0.8
-    matrixStats    0.52.2
-        memoise     1.1.0
- microbenchmark     1.4-3
-           mime       0.5
-         miniUI     0.1.1
-          minqa     1.2.4
-         mnormt     1.5-5
-         modelr     0.1.1
-        munsell     0.5.0
-       mvbutils   2.7.4.1
-         nloptr     1.0.4
-       numDeriv  2016.8-1
-        openssl     0.9.9
-      optextras  2016-8.8
-         optimx  2013.8.7
-       optparse     1.6.0
-   org.Hs.eg.db     3.0.0
-        packrat   0.4.9-3
-         pillar     1.1.0
-      pkgconfig     2.0.1
-          plogr     0.1-1
-           plyr     1.8.4
-         praise     1.0.0
-    prettyunits     1.0.2
-        profvis     0.3.4
-       progress     1.2.0
-       promises     1.0.1
-          proto     1.0.0
-           pryr     0.1.3
-          psych     1.7.8
-          purrr     0.2.4
-          qqman     0.1.4
-       quadprog     1.5-5
-             R6     2.2.2
-         Rcgmin 2013-2.21
-   RColorBrewer     1.1-2
-           Rcpp     1.0.0
-      RcppEigen 0.3.3.3.1
-          readr     1.1.1
-         readxl     1.0.0
-        rematch     1.0.1
-         reprex     0.1.1
-       reshape2     1.4.3
-     reticulate       1.6
-          rgdal    1.2-16
-          rgeos    0.3-26
-          rlang     0.3.1
-      rmarkdown       1.9
-       roxygen2     6.0.1
-      rprojroot     1.3-2
-        RSQLite       2.0
-        rstudio  0.98.994
-     rstudioapi       0.7
-         rtweet     0.6.0
-          rvest     0.3.2
-         Rvmmin 2017-7.18
-      S4Vectors     0.4.0
-         scales     1.0.0
-        selectr     0.3-1
-          servr       0.8
-         setRNG  2013.9-1
-            sgt       2.0
-     shapefiles       0.7
-          shiny     1.3.2
-    sourcetools     0.1.6
-             sp     1.2-6
-        stringi     1.1.6
-        stringr     1.2.0
-         svUnit    0.7-12
-         testit       0.8
-       testthat     2.0.0
-         tibble     1.4.1
-          tidyr     0.7.2
-     tidyselect     0.2.3
-      tidyverse     1.2.1
-        tkrplot    0.0-23
-         tweenr     1.0.1
-         ucminf     1.1-4
-           utf8     1.1.3
-    viridisLite     0.2.0
-        whisker     0.3-2
-          withr     2.1.1
-       xaringan     0.4.4
-       XKCDdata     0.1.0
-           xml2     1.1.1
-         xtable     1.8-2
-           yaml    2.1.16
+[1] "Today's date is 2022-09-20"
+       Package    Version
+     base64enc      0.1-3
+      blogdown       1.12
+      bookdown       0.29
+         bslib      0.4.0
+        cachem      1.0.6
+        digest     0.6.29
+        eulerr      6.1.0
+      evaluate       0.16
+       fastmap      1.1.0
+            fs      1.5.2
+         GenSA      1.1.7
+          glue      1.6.2
+     gridExtra        2.3
+        gtable      0.3.0
+         highr        0.9
+     htmltools      0.5.3
+        httpuv      1.6.6
+     jquerylib      0.1.4
+      jsonlite      1.8.0
+         knitr       1.40
+         later      1.3.0
+      magrittr      2.0.3
+      markdown        1.1
+       memoise      2.0.1
+          mime       0.12
+      polyclip     1.10-0
+    polylabelr      0.2.0
+      promises    1.2.0.1
+            R6      2.5.1
+      rappdirs      0.3.3
+          Rcpp      1.0.7
+ RcppArmadillo 0.10.6.0.0
+         rlang      1.0.5
+     rmarkdown       2.16
+     rprojroot      2.0.3
+          sass      0.4.2
+         servr       0.24
+       stringi      1.7.8
+       stringr      1.4.1
+       tinytex       0.41
+          xfun       0.33
+          yaml      2.3.5
diff --git a/public b/public
@@ -1 +1 @@
-Subproject commit f0ff095e4d49b8182a9de9cb51dc6dea49bd19a0
+Subproject commit 161eee918158a87de8832ad7176a0f00d55cc755