LeCun has a new course on deep learning this spring. I found two things he mentioned that are worth jotting down. First, natural data lives on a low-dimensional manifold. I probably should have come across that before, but it didn’t register earlier. Come to think of it, this is a very important fact. Second, as it is…
Framing and prospect theory
The Asian disease problem illustrates that framing can alter one’s decision depending on whether we emphasize gain or loss. Prospect theory is just a fancy name for a conjecture about what happens when the utility function is indeed what economists believe.
One pager for proposal
A good slide from a workshop.
Free energy
When we model the probability of a variable $latex x$ by $latex p(x) \propto e^{-F(x)}$, $latex F(x)$ is often referred to as the free energy. The name comes from historical reasons. The Gibbs-Boltzmann distribution for a configuration with energy $latex E$ is proportional to $latex e^{-E/(kT)}$. The closest explanation I found is from here, and it involves the Helmholtz free energy. Unfortunately, I doubt the above explanation…
Convex conjugate
I wasn’t aware that the convex conjugate is just the Legendre transformation. That is, $latex f^{*}(p)=\sup _{\tilde {x}}\{\langle p,{\tilde {x}}\rangle -f({\tilde {x}})\}\geq \langle p,x\rangle -f(x)$. Note that it is also known as the Legendre-Fenchel transformation. Btw, we have the Fenchel inequality $latex \langle p,x\rangle \le f(x)+f^*(p)$ directly from the definition, since $latex f^{*}(p)=\sup _{\tilde {x}}\{\langle p,{\tilde…
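To make the definition concrete, here is a minimal numerical sketch. It assumes the one-dimensional example $latex f(x)=x^2/2$ (my choice, not from the post), for which the conjugate is $latex f^*(p)=p^2/2$, and it replaces the supremum with a maximum over a finite grid. The names `conjugate`, `fenchel_gap`, and `grid` are just for illustration.

```python
def f(x):
    # Example convex function (an assumption for this sketch): f(x) = x^2 / 2.
    return 0.5 * x * x


def conjugate(func, p, grid):
    """Approximate f*(p) = sup_x { p*x - f(x) } by a maximum over a finite grid."""
    return max(p * x - func(x) for x in grid)


# Grid over [-5, 5] with step 0.001; only valid for |p| <= 5,
# because the maximizer of p*x - x^2/2 is x = p.
grid = [i / 1000.0 for i in range(-5000, 5001)]

p_values = [-2.0, -0.5, 0.0, 1.0, 3.0]
conjugates = {p: conjugate(f, p, grid) for p in p_values}


def fenchel_gap(x, p):
    """f(x) + f*(p) - <p, x>, which the Fenchel inequality says is non-negative."""
    return f(x) + conjugate(f, p, grid) - p * x


for p in p_values:
    print(p, conjugates[p], 0.5 * p * p)
```

The gap `fenchel_gap(x, p)` is zero exactly when `x` attains the supremum, which for this example means `x == p`.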
Self-supervised learning
A very good lecture by Ishan Misra summarizes many self-supervised learning methods. The idea of self-supervised learning is very simple. We try to design some pretext tasks where the labels can be obtained for free. And instead of training a model on the real task (the downstream task), we will pretrain the model with these pretext…
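As a small illustration of "labels for free", here is a sketch of one common pretext task, rotation prediction. This particular task is my choice for the example and is not necessarily how the lecture presents it. Images are plain nested lists here, and `rotate90` and `make_rotation_dataset` are hypothetical helper names.

```python
def rotate90(image):
    """Rotate a 2D list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def make_rotation_dataset(images):
    """Turn unlabeled images into (rotated_image, label) pairs.

    The label k in {0, 1, 2, 3} is the number of 90-degree clockwise turns,
    so it comes for free from how the sample was constructed.
    """
    dataset = []
    for image in images:
        rotated = [list(row) for row in image]  # k = 0: an unrotated copy
        for k in range(4):
            dataset.append((rotated, k))
            rotated = rotate90(rotated)
    return dataset


unlabeled = [
    [[1, 2], [3, 4]],
    [[0, 1], [1, 0]],
]
pretext_data = make_rotation_dataset(unlabeled)
for sample, label in pretext_data[:4]:
    print(label, sample)
```

A model pretrained to predict `label` from `sample` never sees a human annotation; the downstream task would then reuse its learned features.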
Visualizing CNN
Let’s summarize a couple of techniques for visualization. Here all networks are trained to do classification. The simplest one is to randomly black out part of the image and evaluate the classification result. For example, the dog score should drop when we black out the dog face. This approach is readily applicable to any classifier regardless…
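The black-out idea can be sketched in a few lines. Note that this version slides the patch over a regular grid instead of picking locations at random, and it uses a toy scoring function in place of a real CNN. `occlusion_map`, `toy_dog_score`, `patch`, and `fill` are all hypothetical names for this sketch.

```python
def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Return the score drop caused by blacking out each patch x patch block.

    image is a 2D list of pixel values; score_fn maps an image to a class score.
    """
    height, width = len(image), len(image[0])
    base_score = score_fn(image)
    drops = []
    for top in range(0, height, patch):
        row = []
        for left in range(0, width, patch):
            occluded = [list(r) for r in image]  # copy so the input is untouched
            for i in range(top, min(top + patch, height)):
                for j in range(left, min(left + patch, width)):
                    occluded[i][j] = fill
            row.append(base_score - score_fn(occluded))
        drops.append(row)
    return drops


def toy_dog_score(image):
    # Stand-in for a classifier: the "dog score" only depends on the top-left
    # 2x2 block, which plays the role of the dog face.
    return sum(image[i][j] for i in range(2) for j in range(2))


image = [[1.0] * 4 for _ in range(4)]
heatmap = occlusion_map(image, toy_dog_score)
for row in heatmap:
    print(row)
```

Only the block covering the "face" produces a drop, which is the kind of heatmap one hopes to see with a real classifier. Since the method only calls `score_fn`, it does not need access to the network internals.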
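A proposal that Chairman Xi Jinping resign from all party, government, and military posts: an open letter to CPPCC Chairman Wang Yang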
Central limit theorem and moment generating function
For a random variable $latex X$, we can define the moment generating function as $latex MG(t) \triangleq E[e^{t X}]$. Then, $latex MG(t)^{(n)}|_{t=0} = E[X^{n}e^{tX}]|_{t=0}=E[X^n]$ is simply the $latex n$-th moment of $latex X$. It is easy to verify that $latex \mathcal{N}(0,\sigma^2)$ has the moment generating function $latex e^{\frac{t^2\sigma^2}{2}}$, since $latex E[e^{tX}]=\frac{1}{\sqrt{2\pi\sigma^2}}\int e^{-\frac{x^2}{2\sigma^2}}e^{tx}dx=\frac{1}{\sqrt{2\pi\sigma^2}}\int e^{-\frac{1}{2\sigma^2}[(x-t\sigma^2)^2-t^2\sigma^4]}dx=e^{\frac{t^2\sigma^2}{2}}$. Central limit theorem…
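The Gaussian moment generating function above is easy to sanity-check numerically. The sketch below, under assumed values $latex \sigma=1.5$ and $latex t=0.7$, compares a trapezoid-rule estimate of $latex E[e^{tX}]$ with the closed form, and recovers the second moment from a finite-difference second derivative at $latex t=0$. All function names and the integration window are choices made for this example.

```python
import math


def normal_mgf_numeric(t, sigma, half_width=20.0, step=1e-3):
    """Trapezoid-rule estimate of E[exp(t X)] for X ~ N(0, sigma^2).

    half_width is assumed to span many standard deviations around the
    integrand's peak at x = t * sigma^2.
    """
    n = int(round(2 * half_width / step))
    norm = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(t * x - x * x / (2.0 * sigma ** 2))
    return norm * total * step


def normal_mgf_closed(t, sigma):
    """Closed form exp(t^2 sigma^2 / 2) from the completed-square derivation."""
    return math.exp(t * t * sigma * sigma / 2.0)


def second_moment_from_mgf(sigma, h=1e-3):
    """Central finite difference for MG''(0), which should equal E[X^2]."""
    return (
        normal_mgf_closed(h, sigma)
        - 2.0 * normal_mgf_closed(0.0, sigma)
        + normal_mgf_closed(-h, sigma)
    ) / (h * h)


sigma, t = 1.5, 0.7
numeric = normal_mgf_numeric(t, sigma)
closed = normal_mgf_closed(t, sigma)
second_moment = second_moment_from_mgf(sigma)
print(numeric, closed, second_moment, sigma ** 2)
```

The two MGF values should agree to many digits, and the finite-difference second moment should be close to $latex \sigma^2$.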