Your solutions to the following problems should be submitted as a single PDF that does not contain
any personal information (student ID or name). The only layout rule for your submission is that the answer to each
problem must be placed on exactly one separate page. You are welcome to use the \LaTeX{} file underlying this PDF,
available at \url{https://version.aalto.fi/gitlab/junga1/MLBP2017Public}, and fill in your solutions there.
\newpage
\section{The Principal Component}
Consider $\samplesize=20$ snapshots, available at \url{https://version.aalto.fi/gitlab/junga1/MLBP2017Public/tree/master/Clustering/images},
which are named according to the season in which they were taken, i.e., either ``winter??.jpeg'' or ``summer??.jpeg''.
We represent the $\sampleidx$th snapshot, with $\sampleidx=1,\ldots,\samplesize$, by the feature vector $\vx^{(\sampleidx)}\in\mathbb{R}^{d}$ whose entries
are the grayscale values of the image pixels in the lower-left square of size $40\times40$ pixels (this results in a feature vector of length $d=40^2=1600$).
To speed up subsequent computations, we transform the original feature vector $\vx^{(\sampleidx)}$ into a single number $z^{(\sampleidx)}=\mathbf{w}^{T}\vx^{(\sampleidx)}$
using a normalized vector $\mathbf{w}\in\mathcal{S}^{d}$, with the unit sphere $\mathcal{S}^{d}=\{\mathbf{u}\in\mathbb{R}^{d}: \|\mathbf{u}\|^{2}_{2}=1\}$ (which is the
set of all unit-norm vectors). % $\| \mathbf{w} \|_{2}^{2} = \mathbf{w}^{T} \mathbf{w} = 1$.
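As a small illustration of this compression step (a toy example with $d=2$ and hypothetical numbers that are not taken from the image data), consider the vector $\mathbf{w}=(0.6,0.8)^{T}$, which is normalized since $\|\mathbf{w}\|^{2}_{2}=0.36+0.64=1$.
The two feature vectors $(3,4)^{T}$ and $(2,1)^{T}$ are then mapped to the numbers
\[
\mathbf{w}^{T}(3,4)^{T}=0.6\cdot 3+0.8\cdot 4=5
\quad\mbox{and}\quad
\mathbf{w}^{T}(2,1)^{T}=0.6\cdot 2+0.8\cdot 1=2.
\]
Note that adding any multiple of $(0.8,-0.6)^{T}$, which is orthogonal to $\mathbf{w}$, to a feature vector does not change the resulting number, so this part of the feature vector cannot be recovered from the number alone.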
The vector $\mathbf{w}$ should be chosen such that we can accurately reconstruct the original feature vector using $\mathbf{v} z^{(\sampleidx)}$ with some normalized vector $\mathbf{v}\in\mathcal{S}^{d}$
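(which might be different from $\mathbf{w}$). Let us measure the reconstruction error, when reconstructing $\vx^{(\sampleidx)}$ from $z^{(\sampleidx)}$, as
\begin{equation}
\emperror(\mathbf{v},\mathbf{w}| \dataset) = (1/\samplesize)\sum_{\sampleidx=1}^{\samplesize}\big\|\vx^{(\sampleidx)}-\mathbf{v}z^{(\sampleidx)}\big\|^{2}_{2}.
\end{equation}
We are interested in normalized vectors $\hat{\mathbf{v}},\hat{\mathbf{w}}\in\mathcal{S}^{d}$ which achieve the minimum reconstruction error, i.e.,
\begin{equation}
\label{equ_optimality}
\emperror(\hat{\mathbf{v}},\hat{\mathbf{w}}| \dataset) = \min_{\mathbf{v},\mathbf{w}\in\mathcal{S}^{d}}\emperror(\mathbf{v},\mathbf{w}| \dataset).
\end{equation}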
What is the relation of the vectors $\hat{\mathbf{v}},\hat{\mathbf{w}}\in\mathcal{S}^{d}$, which satisfy \eqref{equ_optimality}, to the eigenvectors of
the matrix ${\bm\Sigma}=(1/\samplesize)\mathbf{X}^{T}\mathbf{X}$ with $\mathbf{X}=(\vx^{(1)},\ldots,\vx^{(\samplesize)})^{T}\in\mathbb{R}^{\samplesize\times d}$?
Using this relation, compute the optimal vectors $\hat{\mathbf{v}},\hat{\mathbf{w}}\in\mathcal{S}^{d}$ and the associated minimum reconstruction error $\emperror(\hat{\mathbf{v}},\hat{\mathbf{w}}| \dataset)$ for
the given dataset $\dataset=\{\vx^{(\sampleidx)}\}_{\sampleidx=1}^{\samplesize}$.
Illustrate the vectors $\hat{\mathbf{v}},\hat{\mathbf{w}}\in\mathcal{S}^{d}$ using grayscale plots \\(cf.\ \url{https://se.mathworks.com/help/images/ref/mat2gray.html}).
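As a sanity check of the notation, the following toy computation may be helpful. It uses hypothetical numbers with $d=2$ and $\samplesize=2$ (not the image data) and assumes that the reconstruction error is the average squared Euclidean distance between $\vx^{(\sampleidx)}$ and its reconstruction $\mathbf{v}z^{(\sampleidx)}$.
Take $\vx^{(1)}=(3,4)^{T}$, $\vx^{(2)}=(2,1)^{T}$ and $\mathbf{v}=\mathbf{w}=(0.6,0.8)^{T}$, so that $z^{(1)}=5$ and $z^{(2)}=2$. Then
\[
\big\|\vx^{(1)}-\mathbf{v}z^{(1)}\big\|^{2}_{2}=\big\|(3,4)^{T}-(3,4)^{T}\big\|^{2}_{2}=0
\]
and
\[
\big\|\vx^{(2)}-\mathbf{v}z^{(2)}\big\|^{2}_{2}=\big\|(2,1)^{T}-(1.2,1.6)^{T}\big\|^{2}_{2}=0.8^{2}+0.6^{2}=1,
\]
which gives the average error $(0+1)/2=0.5$ for this particular choice of $\mathbf{v}$ and $\mathbf{w}$.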