Monday, December 22, 2014

Video: Exact Recovery via Convex Relaxations

There is only a video for this presentation from the Spectral Algorithms: From Theory to Practice workshop organized at the Simons Institute at Berkeley.


Exact Recovery via Convex Relaxations, Moses Charikar
When do convex relaxations return integer solutions? In this talk, I will discuss several instances of discrete optimization problems where, for a suitable distribution on inputs, LP and SDP relaxations produce integer solutions with high probability. This phenomenon has been studied in the context of LP decoding, sparse recovery, stochastic block models and so on. I will also mention some recent results for clustering problems: when points are drawn from a distribution over k sufficiently separated clusters, the well known k-median relaxation and a (not so well known) SDP relaxation for k-means exactly recover the clusters.
Download Video [2.2GB .mp4]
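For the curious reader, here is a quick, hedged sketch of the kind of SDP relaxation for k-means mentioned in the abstract, written in the Peng-Wei form as I understand it (not code from the talk). When the clusters are well separated, the solver returns, up to numerical error, the normalized cluster indicator matrix, i.e. an "integral" solution:

```python
# Minimal sketch of a k-means SDP relaxation (Peng-Wei form, as I read it):
#   minimize <D, Z>  s.t.  Z >= 0 (entrywise), Z PSD, Z 1 = 1, trace(Z) = k,
# where D is the matrix of squared pairwise distances. Under sufficient
# separation the optimal Z is the normalized cluster indicator matrix.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
k, n_per, sep = 2, 15, 8.0
centers = np.array([[0.0, 0.0], [sep, 0.0]])
X = np.vstack([c + rng.normal(size=(n_per, 2)) for c in centers])
n = X.shape[0]

D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared pairwise distances

Z = cp.Variable((n, n), PSD=True)
constraints = [Z >= 0, cp.sum(Z, axis=1) == 1, cp.trace(Z) == k]
prob = cp.Problem(cp.Minimize(cp.trace(D @ Z)), constraints)
prob.solve()

# Exact recovery check: entries ~ 1/n_per inside each true cluster, ~0 across.
print(np.round(Z.value[:3, :3], 3))              # within-cluster block
print(np.round(Z.value[:3, n_per:n_per + 3], 3))  # cross-cluster block
```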
 

Saturday, December 20, 2014

Sunday Morning Insight: The Regularization Architecture


Or Not ?


Today's insights go in different directions. The first one comes in large part from a remark by John Platt on TechNet's Machine Learning blog, who puts it into words much better than I ever could:
...Given the successes of deep learning, researchers are trying to understand how they work. Ba and Caruana had a NIPS paper which showed that, once a deep network is trained, a shallow network can learn the same function from the outputs of the deep network. The shallow network can’t learn the same function directly from the data. This indicates that deep learning could be an optimization/learning trick...
 
My emphasis on the last sentence. So quite clearly, the model reduction of Jimmy Ba and Rich Caruana featured a year ago (Do Deep Nets Really Need to be Deep?, NIPS version) and the more recently featured shallow model (Kernel Methods Match Deep Neural Networks on TIMIT) do in fact point to a potentially better regularization scheme that we have not yet found.
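To make the quoted remark a little more concrete, here is a minimal sketch (my own toy in PyTorch, not Ba and Caruana's code) of the recipe: a wide, shallow "student" is regressed onto the real-valued logits of an already-trained deep "teacher", rather than onto the labels themselves.

```python
# Toy model-compression sketch: the student learns the teacher's logits.
import torch
import torch.nn as nn

teacher = nn.Sequential(                      # stands in for an already-trained deep net
    nn.Linear(100, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 10),
)
student = nn.Sequential(nn.Linear(100, 1024), nn.ReLU(), nn.Linear(1024, 10))

opt = torch.optim.SGD(student.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()                        # regress on logits, not on hard labels

x = torch.randn(512, 100)                     # (unlabeled or training) inputs
with torch.no_grad():
    target_logits = teacher(x)                # the teacher's real-valued outputs

for _ in range(100):
    opt.zero_grad()
    loss = loss_fn(student(x), target_logits)
    loss.backward()
    opt.step()
```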
If, in fact, one wants to continue to think in terms of several layers, then one could think of these stacked networks as iterations of a reconstruction solver, as Christian Schuler, Michael Hirsch, Stefan Harmeling, and Bernhard Scholkopf do in Learning to Deblur.
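And here is what that "stacked layers as solver iterations" view can look like in code: a LISTA-style unrolling in which every layer performs one soft-thresholded gradient step with learned parameters. This is a generic sketch of the idea, not the architecture of the Learning to Deblur paper.

```python
# Unrolled ISTA-like iterations as network layers (generic sketch).
import torch
import torch.nn as nn

class UnrolledISTA(nn.Module):
    def __init__(self, m, n, n_layers=5):
        super().__init__()
        # one learned linear operator per "iteration"/layer
        self.steps = nn.ModuleList([nn.Linear(m, n, bias=False) for _ in range(n_layers)])
        self.thresholds = nn.Parameter(torch.full((n_layers,), 0.1))

    def forward(self, y, W):
        # y: (batch, m) measurements, W: (m, n) forward operator
        x = torch.zeros(y.shape[0], W.shape[1], device=y.device)
        for step, t in zip(self.steps, self.thresholds):
            residual = y - x @ W.T                        # data-fit residual
            x = x + step(residual)                        # learned (approximate) gradient step
            x = torch.sign(x) * torch.relu(x.abs() - t)   # soft-thresholding
        return x

# Usage sketch: 5 "iterations" mapped to 5 layers.
net = UnrolledISTA(m=60, n=100)
x_hat = net(torch.randn(8, 60), torch.randn(60, 100))
```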
 
Finally, in yesterday's videos (Saturday Morning Videos : Semidefinite Optimization, Approximation and Applications (Simons Institute @ Berkeley)), one could watch Sanjeev Arora talk about Adventures in Linear Algebra++ and Unsupervised Learning (slides). It so happens that what he describes as Linear Algebra++ is none other than the Advanced Matrix Factorization Jungle. But more importantly, he mentions that randomly wired deep nets (Provable Bounds for Learning Some Deep Representations [3])


were an acknowledged inspiration for a recent, deeper network architecture [1]. From [1]:

In general, one can view the Inception model as a logical culmination of [12] while taking inspiration and guidance from the theoretical work by Arora et al [2]. The benefits of the architecture are experimentally verified on the ILSVRC 2014 classification and detection challenges, on which it significantly outperforms the current state of the art.
and in the conclusion of that same paper:

Although it is expected that similar quality of result can be achieved by much more expensive networks of similar depth and width, our approach yields solid evidence that moving to sparser architectures is feasible and useful idea in general. This suggest promising future work towards creating sparser and more refined structures in automated ways on the basis of [2].
Deeper constructs, yes, but sparser ones.
References:


[1] Going Deeper with Convolutions by Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich

We propose a deep convolutional neural network architecture codenamed "Inception", which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. This was achieved by a carefully crafted design that allows for increasing the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC 2014 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

[2] Learning to Deblur by Christian J. Schuler, Michael Hirsch, Stefan Harmeling, and Bernhard Scholkopf

We show that a neural network can be trained to blindly deblur images. To accomplish that, we apply a deep layered architecture, parts of which are borrowed from recent work on neural network learning, and parts of which incorporate computations that are specific to image deconvolution. The system is trained end-to-end on a set of artificially generated training examples, enabling competitive performance in blind deconvolution, both with respect to quality and runtime.
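
[3] Provable Bounds for Learning Some Deep Representations by Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma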


 
 
 
 

Saturday Morning Videos : Spectral Algorithms: From Theory to Practice (Simons Institute @ Berkeley)

 
 
I will also come back to some of the videos and slides of this workshop on Spectral Algorithms: From Theory to Practice organized at the Simons Institute at Berkeley. But in the meantime, here are the links to pages that generally feature both the slides and the videos (and sometimes neither).
 
 
 

Saturday Morning Videos : Semidefinite Optimization, Approximation and Applications (Simons Institute @ Berkeley)

I will come back to some of the videos and slides of this workshop on Semidefinite Optimization, Approximation and Applications organized at the Simons Institute at Berkeley (is Simons in the audience?). But in the meantime, here are the links to pages that generally feature both the slides and the videos (and sometimes neither).






Friday, December 19, 2014

Video and Slides: Analysis and Design of Optimization Algorithms via Integral Quadratic Constraints

From the Simons Institute workshop on Semidefinite Optimization, Approximation and Applications, here is Ben Recht's talk:



I will present a new method to analyze and design iterative optimization algorithms built on the framework of Integral Quadratic Constraints (IQC) from robust control theory. IQCs provide sufficient conditions for the stability of complicated interconnected systems, and these conditions can be checked by semidefinite programming. I will discuss how to adapt IQC theory to study optimization algorithms, proving new inequalities about convex functions. Using these inequalities, I will derive upper bounds on convergence rates for the gradient method, the heavy-ball method, Nesterov's accelerated method, and related variants by solving small, simple semidefinite programming problems. I will close with a discussion of how these techniques can be used to search for algorithms with desired performance characteristics, establishing a new methodology for algorithm design.

Joint work with Laurent Lessard and Andrew Packard.
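
As a small numerical companion to the abstract (my own toy, nothing from the talk), here are the three iterative schemes whose worst-case rates the IQC machinery certifies, run on a strongly convex quadratic with the classical step-size tunings for that case. At the same condition number, the two accelerated methods converge dramatically faster than plain gradient descent.

```python
# Gradient descent, heavy-ball and Nesterov's method on f(x) = 0.5 x^T A x.
import numpy as np

rng = np.random.default_rng(0)
m_, L = 1.0, 100.0                        # strong convexity and smoothness constants
A = np.diag(np.linspace(m_, L, 50))       # quadratic with condition number L/m = 100
x0 = rng.normal(size=50)
grad = lambda x: A @ x
alpha = 1.0 / L                           # standard 1/L step size

def run(update, iters=300):
    x, x_prev = x0.copy(), x0.copy()
    for _ in range(iters):
        x, x_prev = update(x, x_prev), x
    return np.linalg.norm(x)              # distance to the minimizer (the origin)

def gradient_step(x, xp):
    return x - alpha * grad(x)

# classical heavy-ball tuning for quadratics
hb_alpha = 4.0 / (np.sqrt(L) + np.sqrt(m_)) ** 2
hb_beta = ((np.sqrt(L) - np.sqrt(m_)) / (np.sqrt(L) + np.sqrt(m_))) ** 2
def heavy_ball_step(x, xp):
    return x - hb_alpha * grad(x) + hb_beta * (x - xp)

nes_beta = (np.sqrt(L) - np.sqrt(m_)) / (np.sqrt(L) + np.sqrt(m_))
def nesterov_step(x, xp):
    y = x + nes_beta * (x - xp)           # momentum extrapolation
    return y - alpha * grad(y)

print(run(gradient_step), run(heavy_ball_step), run(nesterov_step))
```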
 
 

Job: Research positions (PostDocs and PhD studentship) at MSSL UCL

Jason just sent me the following:


Hi Igor,


I just wanted to let you know about some vacancies for postdoctoral researchers and PhD students at UCL.


Would you consider posting on Nuit Blanche if you think it suitable?


Details follow...


Two vacancies are available for postdoctoral researchers to join MSSL UCL.


These positions are focused on the development of novel sparse and statistical informatics techniques, and corresponding fast algorithms for high-performance computing, for application to data from SKA, Planck and Euclid (see descriptions of posts below).


I would be grateful if you could bring these positions to the attention of potential candidates. The deadline for applications is 1 Feb 2015. Further information is available at: http://www.jasonmcewen.org/opportunities.html


In addition, applications for PhD positions at MSSL to begin in September 2015 are also open. Further information regarding potential PhD projects is available at: http://www.ucl.ac.uk/mssl/astro/phd


Thanks in advance for your help.


Research Associate in Inverse Problems
We are seeking an excellent postdoctoral researcher in physics, mathematics, electrical engineering or computer science to develop novel approaches to solve inverse problems in the context of big-data. In particular, these approaches will be used to image the data acquired by radio interferometric telescopes, such as the forthcoming Square Kilometre Array (SKA). The SKA promises exquisite radio observations of unprecedented resolution and sensitivity, supporting a diverse range of science, from the search for extra-terrestrial life to testing Einstein's theory of general relativity, to uncovering the mysteries of the dawn of the first galaxies in the Universe. However, the SKA poses tremendous big-data challenges that must first be overcome. Inverse problems will be solved in the context of the recent revolutionary theory of compressed sensing, using sparse regularisation techniques and leveraging advanced convex optimisation algorithms, while exploiting high-performance computing architectures.

For further information and to apply see: http://www.ucl.ac.uk/mssl/jobs/mssl-jobs/bdsc-pdra


Research Associate in Information Engineering

We are seeking an excellent postdoctoral researcher in physics, mathematics, electrical engineering or computer science to develop novel information theoretic techniques (e.g., sparse and statistical signal processing, applied computational mathematics), motivated by their use for extracting scientific information from observational data. Signals defined on the sphere are prevalent in a diverse range of fields, including cosmology, geophysics, acoustics, and computer graphics, for example. In cosmology, observations made by the ESA Planck and Euclid satellites live on the celestial sphere, leading to very large and precise spherical data-sets, the robust analysis of which can reveal a great deal about the nature of our Universe. The project will initially focus on the analysis of signals defined on the sphere, such as those obtained by Planck and Euclid, but can be extended to incorporate the interests and expertise of the successful applicant.
For further information and to apply see: http://www.ucl.ac.uk/mssl/jobs/mssl-jobs/saos-pdra


Best wishes for the holiday period!


Best,
Jason
--
www.jasonmcewen.org 
 
 
No problem Jason !
 
 

Video: A Statistical Model for Tensor Principal Component Analysis


 



A Statistical Model for Tensor Principal Component Analysis
I will discuss the Principal Component Analysis problem for large tensors of arbitrary order k under a single-spike (or rank-one plus noise) model. On the one hand, using information theory and recent results in probability theory, I will provide necessary and sufficient conditions under which the principal component can be estimated using unbounded computational resources. It turns out that this is possible as soon as the signal-to-noise ratio \beta becomes larger than C\sqrt{k\log k} (and in particular \beta can remain bounded as the problem dimensions increase). On the other hand, I will analyze several polynomial-time estimation algorithms, based on various approaches:
  1. Tensor matricization and spectral analysis, 
  2. Semidefinite relaxations,
  3. Power iteration.
I will show that, unless the signal-to-noise ratio diverges in the system dimensions, none of these approaches succeeds. This is possibly related to a fundamental limitation of polynomial estimators for this problem. While complexity theory suggests that intractability holds from a worst-case point of view, no analogous result has been proved under statistical models of the data.
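
For readers who want to play with the model, here is a small numpy sketch (mine, not the speaker's code) of the single-spike tensor and of approaches 1 and 3 above: spectral initialization from the tensor matricization, refined by tensor power iteration. The value of \beta below is an assumption, chosen to sit above the algorithmic threshold for this toy normalization (iid N(0,1) noise entries).

```python
# Single-spike tensor model T = beta * v0 (x) v0 (x) v0 + Z, with matricization +
# power-iteration recovery (toy sketch).
import numpy as np

rng = np.random.default_rng(0)
n = 100
beta = 3.0 * n ** 0.75                  # comfortably above the ~ n^(3/4) matricization threshold

v0 = rng.normal(size=n)
v0 /= np.linalg.norm(v0)
T = beta * np.einsum('i,j,k->ijk', v0, v0, v0) + rng.normal(size=(n, n, n))

# 1. Tensor matricization + spectral: top left singular vector of the n x n^2 unfolding.
U, s, _ = np.linalg.svd(T.reshape(n, n * n), full_matrices=False)
v = U[:, 0]

# 3. Tensor power iteration: v <- T(., v, v), normalized.
for _ in range(20):
    v = np.einsum('ijk,j,k->i', T, v, v)
    v /= np.linalg.norm(v)

print(abs(v @ v0))   # correlation with the planted spike; close to 1 when recovery succeeds
```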


 
 

Thursday, December 18, 2014

Video and Slides: Random Embeddings, Matrix-valued Kernels and Deep Learning


 

Random Embeddings, Matrix-valued Kernels and Deep Learning by Vikas Sindhwani

The recent dramatic success of Deep Neural Networks (DNNs) in many applications highlights the statistical benefits of marrying near-nonparametric models with large datasets, using efficient optimization algorithms running in distributed computing environments. In the 1990's, Kernel methods became the toolset of choice for a wide variety of machine learning problems due to their theoretical appeal and algorithmic roots in convex optimization, replacing neural nets in many settings. So what changed between then and the modern deep learning revolution? Perhaps the advent of "big data" or perhaps the notion of "depth" or perhaps better DNN training algorithms, or all of the above. Or perhaps also that the development of kernel methods has somewhat lagged behind in terms of scalable training techniques, effective mechanisms for kernel learning and parallel implementations.
I will describe new efforts to resolve scalability challenges of kernel methods, for both scalar and multivariate prediction settings, allowing them to be trained on big data using a combination of randomized data embeddings, Quasi-Monte Carlo (QMC) acceleration, distributed convex optimization and input-output kernel learning. I will report that on classic speech recognition and computer vision datasets, randomized kernel methods and deep neural networks turn out to have essentially identical performance. Curiously, though, randomized kernel methods begin to look a bit like neural networks, but with a clear mathematical basis for their architecture. Conversely, invariant kernel learning and matrix-valued kernels may offer a new way to construct deeper architectures. This talk will describe recent research results and personal perspectives on points of synergy between these fields.
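
As a reminder of what the "randomized data embeddings" look like in their simplest form, here is a minimal sketch of standard random Fourier features (Rahimi-Recht style; my own toy, not the speaker's code): the Gaussian kernel is approximated by an inner product of explicit random cosine features, after which a linear model can be trained at scale.

```python
# Random Fourier features approximating the Gaussian kernel k(x, z) = exp(-||x-z||^2 / 2).
import numpy as np

rng = np.random.default_rng(0)
d, D = 10, 2000                          # input dimension, number of random features

W = rng.normal(size=(D, d))              # frequencies ~ N(0, I) for a unit-bandwidth Gaussian kernel
b = rng.uniform(0, 2 * np.pi, size=D)

def features(X):
    # explicit feature map: phi(x) = sqrt(2/D) * cos(W x + b)
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

x = rng.normal(size=d)
z = x + 0.3 * rng.normal(size=d)         # a nearby point, so the kernel value is not tiny
exact = np.exp(-np.linalg.norm(x - z) ** 2 / 2.0)
approx = features(x[None])[0] @ features(z[None])[0]
print(exact, approx)                     # the two should agree to within a few percent
```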






 
 

Wednesday, December 17, 2014

Learning with Fredholm Kernels - implementation -

Here is a way to do semi-supervised learning with kernels (and even with Random Features at the very end):

Learning with Fredholm Kernels by Qichao Que, Mikhail Belkin and Yusu Wang
In this paper we propose a framework for supervised and semi-supervised learning based on reformulating the learning problem as a regularized Fredholm integral equation. Our approach fits naturally into the kernel framework and can be interpreted as constructing new data-dependent kernels, which we call Fredholm kernels. We proceed to discuss the “noise assumption” for semi-supervised learning and provide both theoretical and experimental evidence that Fredholm kernels can effectively utilize unlabeled data under the noise assumption. We demonstrate that methods based on Fredholm learning show very competitive performance in the standard semi-supervised learning setting.
The implementation is located at: https://github.com/queqichao/FredholmLearning
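
Here is a hedged sketch of how I read the data-dependent kernel construction (see the repository above for the authors' exact formulation): an outer kernel evaluated against the unlabeled sample is composed with an inner kernel on that sample, so that the unlabeled points shape the resulting Gram matrix.

```python
# Sketch of a composed, data-dependent ("Fredholm-style") kernel:
#   K_F(x, z) = sum_{i,j} k_out(x, u_i) k_in(u_i, u_j) k_out(u_j, z) / m^2
# over the unlabeled sample u_1..u_m. This is my reading of the construction,
# not necessarily the exact formula used in the paper.
import numpy as np

def gaussian_gram(A, B, gamma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X_lab = rng.normal(size=(20, 2))        # labeled points
U = rng.normal(size=(200, 2))           # unlabeled points carrying the "noise assumption"
m = U.shape[0]

K_out = gaussian_gram(X_lab, U, gamma=1.0)   # k_out(x_i, u_j)
K_in = gaussian_gram(U, U, gamma=0.5)        # k_in(u_i, u_j)

K_F = K_out @ K_in @ K_out.T / m ** 2        # Gram matrix of the data-dependent kernel
# K_F can now be dropped into any standard kernel method (SVM, kernel ridge, ...).
print(K_F.shape)
```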
 

More Oil and Compressive Sensing Connections



Felix Herrmann followed up on yesterday's "Another Donoho-Tao Moment ?", which itself was a follow-up to Sunday Morning Insight: The Stuff of Discovery and Hamming's time: Scientific Discovery Enabled by Compressive Sensing and related fields. Here is what he has to say:

Hi Igor,
I agree with your assessment that the application of CS to seismic data acquisition may not qualify as a major scientific discovery but it will almost certainly lead to major scientific breakthroughs in crustal and mantle seismology as soon as my colleagues in those areas get wind of this innovation. Having said this it may be worth mentioning that randomized subsampling techniques are also having a major impact on carrying out large scale inversions in exploration seismology. For instance, a major seismic contractor company has been able to render full-waveform inversion (the seismic term for inverse problems involving PDE solves) into a commercially viable service by virtue of the fact that we were able to reduce the computational costs of these methods 5—7 fold using batching techniques. These references
were instrumental in motivating the oil & gas industry into adapting this batching technology into their business.
An area that is closer in spirit to Compressive Sensing is seismic imaging. Contrary to full-waveform inversion, seismic imaging entails the inversion of extremely large-scale tall systems of equations that can be made computationally viable using randomized sampling techniques in combination with sparsity promotion. While heuristic in nature, this technique
is able to approximately invert tall systems at the cost of roughly one pass through the data (read only one application of the full adjoint). It is very interesting to see that this heuristic technique has, at least at the conceptual level, connections with approximate message passing (see section 6.5 of “Graphical Models Concepts in Compressed Sensing”) and recent work on Kaczmarz: “A sparse Kaczmarz solver and a linearized Bregman method for online compressed sensing”.
While these developments may by themselves not qualify as major “scientific breakthroughs”, it is clear that these developments are having a major impact on the field of exploration seismology where inversions as opposed to applications of “single adjoints” are now possible that were until very recently computationally unfeasible. For further information on the application of Compressive Sensing to seismic data acquisition and wave-equation based inversion, I would like to point your readers to our website, mind map, and a recent overview article
Thanks again for your efforts promoting Compressive Sensing to the wider research community.
Kind regards,
Felix J. Herrmann
UBC—Seismic Laboratory for Imaging and Modelling
Thanks Felix !
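
For readers unfamiliar with row-action methods, here is a generic sketch (mine, and much simpler than the sparse Kaczmarz solver Felix cites) of randomized Kaczmarz on a tall, consistent system: each iteration touches a single row of the matrix instead of the whole operator, which is the basic mechanism behind the "one pass through the data" flavor of these approaches.

```python
# Randomized Kaczmarz (Strohmer-Vershynin style) for a tall consistent system Ax = b.
import numpy as np

rng = np.random.default_rng(0)
m, n = 5000, 50                          # tall system, as in the imaging setting described above
A = rng.normal(size=(m, n))
x_true = rng.normal(size=n)
b = A @ x_true

row_norms = (A ** 2).sum(axis=1)
p = row_norms / row_norms.sum()          # sample rows with probability proportional to their norm

x = np.zeros(n)
for i in rng.choice(m, size=20000, p=p):
    # project the current iterate onto the hyperplane defined by row i
    x += (b[i] - A[i] @ x) / row_norms[i] * A[i]

print(np.linalg.norm(x - x_true))        # tiny for a consistent, well-conditioned system
```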
 
