1. CLUSTERED MATRIX APPROXIMATION
SIAM [Society for Industrial and Applied Mathematics] 2016
- develop a clustered matrix approximation framework
- Divide the matrix into different groups
- particularly well suited for problems with large-scale sparse matrices
Abstract
benefits
- preserve the important structure presented in the original matrix;
- contain both global-scale and local-scale information
- efficient both in computational speed and memory usage
- resulting approximations more accurate with less memory usage than truncated SVD approximations, which are optimal with respect to rank.
- flexible: modified in various ways
[what ways?]
method
- derive a probabilistic approach that uses randomness to compute a clustered matrix approximation within
the developed framework?
. - We further prove deterministic and probabilistic bounds of the resulting approximation error.
Introduction
- approximate with low-rank , has been studied extensively in the literature
- bottleneck: with large amounts of data is often memory consumption and
not the rank of the approximation
- In this setting, our method produces a considerably more accurate matrix approximation
with memory usage
that isthe same as or less than that of the truncated SVD approximation
, which is optimal with respect to rank.
- In this setting, our method produces a considerably more accurate matrix approximation
steps
- a clustering or coclustering step so that the rows and columns of a given matrix are partitioned into a number of groups;
- reordering the rows and columns according to cluster a�affiliation and extracting sufficiently dense blocks;
- computing low-rank approximations of these dense blocks;
- combining the blockwise approximations into an approximation of the entire
matrix.
explanation
Computing truncated SVD approximations is relatively expensive, one may wish to trade off� computation time against some slight loss inaccuracy. Probabilistic algorithms for matrix approximation [21]
give the means for this kind of trade-off�.
- develop and detail the method
Preliminaries
rely on a number of concepts:
- efficient graph clustering;
- bipartite graph coclustering;
- low-rank matrix approximations;
- stochastic methods for low rank matrix approximations.
Bipartite graph: is a graph whose vertices can be divided into two disjoint and independent sets and such that every edge connects a vertex in to one in . and are usually called the parts of the graph
-
assume that we can partition the graph and obtain disjoint sets
, with . Without loss of generality, we can assume that vertices in are sorted in strictly increasing order. Then matrix has form
- given and rank
- standard Gausion matrix , is a small oversampling parameter (typically set to 5 ~ 10)
- : orthonormal basis
- the corresponding low rank approximation is given by
Clustered low rank matrix approximations
However , it used existing clusters method... as it is more about computing
End.
2. Improved Nystrom Low-Rank Approximation and Error Analysis
This paper presents detailed studies on the Nystrom sampling scheme
and in particular, an error analysis
that directly relates the Nystrom approximation quality with the encoding powers of the landmark points
.
Introduction
Kernel methods play a central role in machine learning and have demonstrated huge success in modeling real-world data with highly complex, nonlinear structures.
网友评论