Random Projections for Dimensionality Reduction

If you have a question about this talk, please contact Hector Basevi.

Host: Dr Ata Kaban

Abstract: Linear dimensionality reduction is a key tool in the data scientist’s toolbox, used variously to make models simpler and more interpretable, to deal with cases where n < p (e.g. to enable model identifiability), or to reduce compute time or memory requirements for large-scale (high-dimensional, large p) problems. In recent years random projection (‘RP’), that is, projecting a dataset onto a k-dimensional subspace (‘k-flat’) chosen uniformly at random from all such k-flats, has become a workhorse approach in the machine learning and data-mining fields, but it is still relatively unknown in other circles. In this talk I will review an elementary proof of the Johnson-Lindenstrauss lemma which, perhaps rather surprisingly, shows that (with high probability) RP approximately preserves the Euclidean geometry of the projected data. This result has provided some theoretical grounds for using RP in a range of applications. I will also give a simple but novel extension which shows that, for data satisfying a mild regularity condition, simply sampling the features does nearly as well as RP at geometry preservation while also bringing a substantial speed-up in execution. Finally, I will briefly discuss some refinements of this latter approach and present some preliminary experimental findings combining it with a pre-trained “deep” neural network on ImageNet data.
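
As a rough illustration of the two ideas in the abstract, the following minimal Python sketch compares pairwise squared distances before and after reduction. It is not the speaker's method: it assumes a scaled Gaussian random matrix as a common stand-in for a uniformly random k-flat, it uses synthetic Gaussian data (chosen here because no single feature dominates), and the function names, the sizes n, p, k and the sqrt(p/k) rescaling for feature sampling are all illustrative choices. It uses only the standard library to stay self-contained; a numerical library would normally be used for data of realistic size.

```python
import math
import random


def gaussian_random_projection(X, k, rng):
    """Map each row of X (n rows, p features) to k dimensions.

    Assumption: a p x k matrix of N(0, 1) entries scaled by 1/sqrt(k),
    used here in place of a uniformly random k-flat.
    """
    p = len(X[0])
    R = [[rng.gauss(0.0, 1.0) for _ in range(k)] for _ in range(p)]
    scale = 1.0 / math.sqrt(k)
    return [
        [scale * sum(x[i] * R[i][j] for i in range(p)) for j in range(k)]
        for x in X
    ]


def feature_sampling(X, k, rng):
    """Keep k features chosen uniformly without replacement.

    The sqrt(p / k) factor makes squared norms correct in expectation.
    """
    p = len(X[0])
    idx = rng.sample(range(p), k)
    scale = math.sqrt(p / k)
    return [[scale * x[i] for i in idx] for x in X]


def squared_distance_ratios(X, Y):
    """Reduced / original squared distance for every pair of rows."""
    ratios = []
    for a in range(len(X)):
        for b in range(a + 1, len(X)):
            orig = sum((u - v) ** 2 for u, v in zip(X[a], X[b]))
            proj = sum((u - v) ** 2 for u, v in zip(Y[a], Y[b]))
            ratios.append(proj / orig)
    return ratios


rng = random.Random(0)   # fixed seed so runs are repeatable
n, p, k = 10, 1000, 200  # illustrative sizes only

# Synthetic data: every feature is i.i.d. N(0, 1).
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]

rp_ratios = squared_distance_ratios(X, gaussian_random_projection(X, k, rng))
fs_ratios = squared_distance_ratios(X, feature_sampling(X, k, rng))

print(f"RP:       min {min(rp_ratios):.3f}, max {max(rp_ratios):.3f}")
print(f"sampling: min {min(fs_ratios):.3f}, max {max(fs_ratios):.3f}")
```

Ratios close to 1 indicate that the pairwise Euclidean geometry is approximately preserved. Feature sampling avoids the matrix multiplication altogether, which is where the speed-up mentioned in the abstract would come from; on data where a few features carry most of the norm, the sampled ratios would be expected to spread much more widely.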

Biography: Bob has a BSc (Hons) in Mathematical Sciences from the Open University, UK, and an MSc in Natural Computation and a PhD in Computer Science, both from the University of Birmingham, UK.

His doctoral research focused mainly on theory quantifying the cost of random projection (RP) on classification performance, and on a generic and interpretable classification algorithm for n < p problems which employs RP and comes with data-dependent performance guarantees. He is especially interested in the n < p problem and in when one can give performance guarantees, with high confidence, in these settings. In particular, he asks which properties of data allow typical-case guarantees in the n < p regime and, more broadly, what the fundamental limits are (in terms of required sample size, given structured data) for statistical learning or inference.

He reviews widely for machine learning conferences and for machine learning and statistical journals, and his work on theory and applications of RP has garnered three conference ‘best paper’ awards.

This talk is part of the Artificial Intelligence and Natural Computation seminars series.
