
A General Framework for Dimensionality Reduction of Multidimensional Data

Hongcheng Wang and Narendra Ahuja

INTRODUCTION

Dimensionality reduction has recently received broad attention in areas such as computer vision, information retrieval, and machine learning. Traditional methods for reducing the dimensionality of multidimensional data, such as Principal Component Analysis (PCA), usually reshape each datum (e.g. a 2D image or a 3D volume) into a vector in order to apply classical second-order array processing methods. Given the curse of dimensionality, i.e., the degradation of classification accuracy as dimensionality increases, dimensionality reduction is an important step.

We developed a general framework for multidimensional data representation, the Datum-as-Is representation, which preserves the order of the original data. For example, each datum in a 4D fMRI sequence (the fourth dimension is temporal) is represented as a third-order subtensor (first-order data are vectors, second-order data are matrices, and third- and higher-order data are tensors). Dimensionality reduction is based on efficient approximations of higher-order tensors [1, 2]. The resulting framework is more efficient both in computation and in capturing the redundancies within multidimensional data, is more general in the sense that it subsumes almost all existing linear (image-as-vector) and bilinear (image-as-matrix) methods, and can solve the small-sample-size problem in object/face recognition.
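To make the difference between the image-as-vector and Datum-as-Is views concrete, the following is a minimal NumPy sketch. The ensemble size (20 frames of 32x24 pixels), the random data, and the helper name `unfold` are assumptions chosen for illustration; they do not come from the papers listed below.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

# Hypothetical ensemble: rows x columns x frames.
rng = np.random.default_rng(0)
video = rng.standard_normal((32, 24, 20))

# Image-as-vector: every frame is flattened into one column of length 32*24,
# so the row/column structure of each image is no longer explicit.
as_vectors = video.reshape(32 * 24, 20)

# Datum-as-Is: the data stay a third-order tensor, and each mode
# (rows, columns, frames) can be analysed through its own unfolding.
unfoldings = [unfold(video, mode) for mode in range(3)]
shapes = [u.shape for u in unfoldings]
```

Under these assumptions, the frame-mode unfolding is simply the transpose of the image-as-vector matrix, while the row and column unfoldings expose spatial structure that a single vectorized matrix does not treat separately.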

PUBLICATIONS

1. Hongcheng Wang and Narendra Ahuja,

Rank-R Approximation of Tensors Using Image-as-Matrix Representation, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2005

Abstract: We present a novel multilinear algebra based approach for reduced dimensionality representation of image ensembles. We treat an image as a matrix, instead of a vector as in traditional dimensionality reduction techniques like PCA, and higher-dimensional data as a tensor. This helps exploit spatio-temporal redundancies with less information loss than image-as-vector methods. The challenges lie in the computational and memory requirements for large ensembles. Currently, there exists a rank-R approximation algorithm which, although applicable to any number of dimensions, is efficient only for low-rank approximations. For larger dimensionality reductions, the memory and time costs of this algorithm become prohibitive. We propose a novel algorithm for rank-R approximations of third-order tensors, which is efficient for arbitrary R, but only for the important special case of 2D image ensembles, e.g. video. Both of these algorithms reduce redundancies present in all dimensions. Rank-R tensor approximation yields the most compact data representation among all known image-as-matrix methods. We evaluated the performance of our algorithm against other approaches on a number of datasets, with the following two main results. First, for a fixed compression ratio, the proposed algorithm yields the best representation of image ensembles visually as well as in the least squares sense. Second, the proposed representation gives the best performance for object classification.
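As a rough illustration of what a low-rank approximation of a third-order tensor looks like, here is a sketch based on the truncated higher-order SVD. This is a generic baseline, not the algorithm proposed in the paper, and it assumes that "rank-R" can be read as a multilinear rank (R1, R2, R3). The function names, tensor sizes, and synthetic data are all hypothetical.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` (new_dim x old_dim)."""
    moved = np.moveaxis(tensor, mode, -1)   # old_dim becomes the last axis
    return np.moveaxis(moved @ matrix.T, -1, mode)

def truncated_hosvd(tensor, ranks):
    """Keep the leading left singular vectors of every mode unfolding."""
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)  # project onto each subspace
    return core, factors

def reconstruct(core, factors):
    """Expand the core back to the original size."""
    approx = core
    for mode, u in enumerate(factors):
        approx = mode_product(approx, u, mode)
    return approx

# Synthetic 16x12x10 ensemble with exact multilinear rank (3, 2, 4).
rng = np.random.default_rng(0)
true_core = rng.standard_normal((3, 2, 4))
true_factors = [rng.standard_normal((n, r))
                for n, r in zip((16, 12, 10), (3, 2, 4))]
ensemble = reconstruct(true_core, true_factors)

core, factors = truncated_hosvd(ensemble, ranks=(3, 2, 4))
approx = reconstruct(core, factors)
rel_error = np.linalg.norm(ensemble - approx) / np.linalg.norm(ensemble)
stored = core.size + sum(u.size for u in factors)  # numbers kept after reduction
```

In this synthetic setting the compressed form keeps `stored` values (core plus three factor matrices) instead of the `ensemble.size` original entries, which is one way to define a compression ratio for comparing representations.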

Full Text: PDF (525KB)

BibTex:

@inproceedings{1069106,
  author = {Hongcheng Wang and Narendra Ahuja},
  title = {Rank-R Approximation of Tensors: Using Image-as-Matrix Representation},
  booktitle = {CVPR '05: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Volume 2},
  year = {2005},
  isbn = {0-7695-2372-2},
  pages = {346--353},
  doi = {http://dx.doi.org/10.1109/CVPR.2005.290},
  publisher = {IEEE Computer Society},
  address = {Washington, DC, USA},
}

2. Hongcheng Wang and Narendra Ahuja,

Compact Representation of Multidimensional Data Using Tensor Rank-One Decomposition, IEEE International Conference on Pattern Recognition (ICPR), 2004

Abstract: This paper presents a new approach for representing multidimensional data by a compact number of bases. We consider the multidimensional data as tensors instead of matrices or vectors, and propose a Tensor Rank-One Decomposition (TROD) algorithm that decomposes Nth-order data into a collection of rank-1 tensors based on multilinear algebra. By applying this algorithm to image sequence compression, we obtain much higher quality images with the same compression ratio as Principal Component Analysis (PCA). Experiments with gray-level and color video sequences are used to illustrate the validity of this approach.
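The idea of writing a tensor as a sum of rank-1 terms can be sketched with a simple greedy scheme: fit one rank-1 term to the current residual by alternating power iterations, subtract it, and repeat. This is a generic illustration for third-order data only and should not be read as the TROD algorithm itself; the function names, iteration count, and random test tensor are assumptions.

```python
import numpy as np

def rank_one_fit(residual, iters=50, seed=0):
    """Fit weight * (a outer b outer c) to a third-order tensor by alternating updates."""
    rng = np.random.default_rng(seed)
    b = rng.standard_normal(residual.shape[1])
    b /= np.linalg.norm(b)
    c = rng.standard_normal(residual.shape[2])
    c /= np.linalg.norm(c)
    for _ in range(iters):
        # Update each unit vector while holding the other two fixed.
        a = np.einsum("ijk,j,k->i", residual, b, c)
        a /= np.linalg.norm(a)
        b = np.einsum("ijk,i,k->j", residual, a, c)
        b /= np.linalg.norm(b)
        c = np.einsum("ijk,i,j->k", residual, a, b)
        c /= np.linalg.norm(c)
    # The best weight for fixed unit vectors is the inner product with the tensor.
    weight = np.einsum("ijk,i,j,k->", residual, a, b, c)
    return weight, a, b, c

def greedy_rank_one(tensor, num_terms):
    """Peel off rank-1 terms one at a time; return the terms and residual norms."""
    residual = tensor.copy()
    terms, history = [], []
    for t in range(num_terms):
        weight, a, b, c = rank_one_fit(residual, seed=t)
        terms.append((weight, a, b, c))
        residual = residual - weight * np.einsum("i,j,k->ijk", a, b, c)
        history.append(np.linalg.norm(residual))
    return terms, history

# Hypothetical 8x6x5 data tensor approximated by five rank-1 terms.
rng = np.random.default_rng(1)
tensor = rng.standard_normal((8, 6, 5))
terms, history = greedy_rank_one(tensor, num_terms=5)
```

Because each term uses unit vectors and the optimal weight, the residual norm recorded in `history` cannot increase from one term to the next. Each term stores only one scalar and three vectors, which is where the compactness of a rank-1 representation comes from.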

Full Text: PDF (525KB)

BibTex:

@inproceedings{1020435,
  author = {Hongcheng Wang and Narendra Ahuja},
  title = {Compact Representation of Multidimensional Data Using Tensor Rank-One Decomposition},
  booktitle = {ICPR '04: Proceedings of the Pattern Recognition, 17th International Conference on (ICPR'04) Volume 1},
  year = {2004},
  isbn = {0-7695-2128-2},
  pages = {44--47},
  doi = {http://dx.doi.org/10.1109/ICPR.2004.251},
  publisher = {IEEE Computer Society},
  address = {Washington, DC, USA},
}

3. Hongcheng Wang and Narendra Ahuja,

Facial Expression Decomposition, IEEE International Conference on Computer Vision (ICCV), 2003

Abstract: In this paper, we propose a novel approach for facial expression decomposition based on Higher-Order Singular Value Decomposition (HOSVD), a natural generalization of matrix SVD. We learn the expression subspace and person subspace from a corpus of images showing seven basic facial expressions, rather than resorting to expert-coded facial expression parameters. We propose a simultaneous face and facial expression recognition algorithm, which can classify a given image into one of the seven basic facial expression categories; other facial expressions of the new person can then be synthesized using the learned expression subspace model. The contributions of this work lie mainly in two aspects. First, we propose a new multilinear model (HOSVD) based approach to model the mapping between persons and expressions, used for face transfer and facial expression synthesis for a new person. Second, we realize simultaneous face and facial expression recognition as a result of facial expression decomposition. Experimental results are presented that illustrate the capability of the person subspace and expression subspace in both synthesis and recognition tasks. As a quantitative measure of the quality of synthesis, we propose using Gradient Minimum Square Error (GMSE), which measures the gradient difference between the original and synthesized images.
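A person-by-expression decomposition of this kind can be sketched with a plain HOSVD of a data tensor whose modes are person, expression, and pixel. The sketch below uses random numbers in place of face images, and the sizes (6 persons, 7 expressions, 64 pixels) and all names are assumptions for illustration; it shows only the decomposition step, not the recognition or synthesis procedure of the paper.

```python
import numpy as np

# Hypothetical corpus: one 64-pixel image per (person, expression) pair.
rng = np.random.default_rng(0)
n_person, n_expr, n_pixel = 6, 7, 64
data = rng.standard_normal((n_person, n_expr, n_pixel))

def mode_basis(tensor, mode):
    """Orthonormal basis for one mode: left singular vectors of its unfolding."""
    unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u

u_person = mode_basis(data, 0)   # person subspace
u_expr = mode_basis(data, 1)     # expression subspace
u_pixel = mode_basis(data, 2)    # pixel (image) subspace

# Core tensor: the data expressed in the three mode bases.
core = np.einsum("pex,pa,eb,xc->abc", data, u_person, u_expr, u_pixel)

# Multiplying the core back by the three bases recovers the data.
recon = np.einsum("abc,pa,eb,xc->pex", core, u_person, u_expr, u_pixel)
```

In this arrangement, row `p` of `u_person` is a coordinate vector for person `p` and row `e` of `u_expr` is a coordinate vector for expression `e`, so the two factors are separated while the core tensor captures how they interact.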

Full Text: PDF (178KB)

BibTex:

@inproceedings{946680,
  author = {Hongcheng Wang and Narendra Ahuja},
  title = {Facial Expression Decomposition},
  booktitle = {ICCV '03: Proceedings of the Ninth IEEE International Conference on Computer Vision},
  year = {2003},
  isbn = {0-7695-1950-4},
  pages = {958},
  publisher = {IEEE Computer Society},
  address = {Washington, DC, USA},
}

Updated: Jan. 1, 2006