Curse of dimensionality: when learning structureless data in high dimension \(d\), the test error decays as \(\varepsilon\sim P^{-\beta}\) with \(\beta\sim 1/d\), so the number of training points \(P\) needed for a given accuracy grows exponentially with \(d\). In practice, deep networks on real data exhibit much larger exponents \(\beta\gg 1/d\).
\(\Rightarrow\) Data must be structured, and Machine Learning should capture such structure.
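To make the scaling concrete, here is a minimal sketch (a generic nearest-neighbor regressor on a smooth target, both illustrative assumptions rather than the setup of the thesis) that fits the learning-curve exponent \(\beta\) and shows it shrinking as \(d\) grows:

```python
# Minimal sketch: estimate the exponent beta in epsilon ~ P^{-beta} for
# 1-nearest-neighbor regression on structureless data, and watch beta
# shrink as the dimension d grows (the curse of dimensionality).
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

def test_error(d, P, P_test=2000):
    """Mean-squared test error of 1-NN regression on a smooth target in dimension d."""
    f = lambda x: np.sin(x.sum(axis=1))            # arbitrary smooth target function
    X = rng.uniform(-1, 1, size=(P, d))
    X_test = rng.uniform(-1, 1, size=(P_test, d))
    model = KNeighborsRegressor(n_neighbors=1).fit(X, f(X))
    return np.mean((model.predict(X_test) - f(X_test)) ** 2)

for d in (2, 8, 32):
    Ps = np.array([250, 500, 1000, 2000, 4000])
    errs = np.array([test_error(d, P) for P in Ps])
    beta = -np.polyfit(np.log(Ps), np.log(errs), 1)[0]  # slope of the log-log fit
    print(f"d={d:2d}: fitted beta = {beta:.2f}")        # beta decays with d
```

With structureless inputs the fitted exponent collapses toward zero as \(d\) grows, which is the exponential-in-\(d\) sample requirement stated above.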
Key questions motivating this thesis:
Reducing complexity with depth
Deep networks build increasingly abstract representations with depth [Zeiler and Fergus 14, Yosinski 15, Olah 17, Doimo 20], as the visual cortex does in the brain [Van Essen 83, Grill-Spector 04].
Intuition: abstraction reduces the complexity of the task, ultimately beating the curse of dimensionality.
Two ways of losing information [Shwartz-Ziv and Tishby 17, Ansuini 19, Recanatesi 19], by learning invariances (see the sketch below):
Discrete invariances, e.g. to exchanging synonymous features.
Continuous invariances, e.g. to smooth deformations of the input [Bruna and Mallat 13, Mallat 16, Petrini 21].
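A toy sketch of the invariance-by-information-loss idea (my illustration, not code from the thesis): global average pooling discards positional information, so the pooled representations of a signal and of its shifted copy coincide, and the shift can no longer be recovered downstream.

```python
# Pooling discards positional information, trading it for invariance:
# a 1-D signal and its discretely translated copy pool to the same value.
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(16)          # a 1-D "image"
x_shifted = np.roll(x, 3)            # discretely translated (circular shift)

def global_avg_pool(v):
    """Collapse spatial positions: information is lost, invariance is gained."""
    return v.mean()

print(global_avg_pool(x))            # identical outputs: the pooled
print(global_avg_pool(x_shifted))    # representation is shift-invariant
```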
Hierarchical structure
How many training points?
Quantitative predictions in a model of data
[Figure: "sofa" as an example of hierarchically structured data]
Hierarchical, compositional structure is a long-standing hypothesis for language [Chomsky 1965] and for images [Grenander 1996].
Deep networks learn such hierarchical data with a number of training points that is only polynomial in the dimension \(d\):
\(P^*\sim n_c\,m^L\),
where \(n_c\) is the number of classes, \(m\) the number of synonymous representations of each feature, and \(L\) the depth of the hierarchy. Assuming each feature is composed of \(s\) lower-level features, the input dimension is \(d=s^L\), so \(m^L=d^{\ln m/\ln s}\) is indeed polynomial in \(d\).
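A minimal sketch of such a hierarchical generative process (symbols, sizes, and rules here are illustrative assumptions in the spirit of a random hierarchy model, not the exact definition used in the thesis):

```python
# Sample one datum from a toy random hierarchical grammar: a class label is
# rewritten L times, each symbol expanding, via one of m synonymous
# productions, into a tuple of s lower-level symbols.
import random

n_c, m, s, L, v = 4, 3, 2, 3, 8   # classes, synonyms, branching, depth, vocabulary
random.seed(0)                    # class labels are a subset of the vocabulary (n_c <= v)

# One rule set per level: every symbol gets m random productions (tuples of s symbols).
rules = [
    {sym: [tuple(random.randrange(v) for _ in range(s)) for _ in range(m)]
     for sym in range(v)}
    for _ in range(L)
]

def sample(label):
    """Expand a class label into a leaf sequence of length s**L."""
    seq = [label]
    for level in range(L):
        seq = [sym for parent in seq
                   for sym in random.choice(rules[level][parent])]
    return seq

x = sample(label=random.randrange(n_c))
print(f"input dimension d = s**L = {s ** L}; datum = {x}")
print(f"predicted sample complexity P* ~ n_c * m**L = {n_c * m ** L}")
```

Since \(m^L\) multiplies by only \(m\) per level while \(d=s^L\) multiplies by \(s\), the predicted \(P^*\) stays polynomial in \(d\), matching the claim above.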
Generative technique:
The problem (NL-CSP):
Our approach [U.S. patent]:
Focus on the LLM part: