次へ: 序論 上へ: 柔らかな情報処理のための統計的手法の応用に関する研究 A STUDY ON 戻る: 柔らかな情報処理のための統計的手法の応用に関する研究 A STUDY ON

Synopsis

Human beings can flexibly cope with various problems encountered in various situations by summarizing various information obtained through experiences and utilizing it as knowledge. Essentially information processing technology tries to realize such flexible and intelligent human information processing by machine.

It is regarded that there are two aspects in human information processing. One is the logical information processing and the other is the intuitive information processing. Today's remarkable development of information processing technology has been achieved mainly by realization of the logical aspect of human information processing. For example, logical computation has been utilized as a basic tool in the research field of artificial intelligence including theorem proving, problem solving, and natural language understanding.

To extend the application areas of computers to the fields of more advanced and human-like problem solving, the intuitive information processing will be more important. For example, pattern recognition, multivariate data analysis, and neural computing are regarded as attempts to realize the intuitive information processing. In such problems, statistical methods play an important role. This paper discusses theory and applications of statistical methods, especially multivariate data analysis methods, for such flexible information processing.

In the section 2, theory of multivariate data analysis methods is presented. In multivariate data analysis, most of the methods are formulated in linear forms. Such linear methods are extended to nonlinear forms by assuming the whole knowledge of underlying probability distributions. These general nonlinear extensions provide us with deeper understanding of each method and clear relationships of the methods. Then each of the linear methods are reinterpreted as a linear approximation of the nonlinear methods.

In the section 3, clustering methods are discussed as basic tools for flexible information processing. At first, threshold selection methods from histogram are discussed as an example of the most simple clustering. Maximum likelihood thresholding methods are proposed on the basis of population mixture models. It is shown that the standard thresholding proposed by Otsu, which is based on a discriminant criterion and also minimizes the mean square errors between the original image and the resultant binary image, is equivalent to the maximization of the likelihood of the conditional distribution in the population mixture model under the assumption of normal distributions with a common variance. It is also shown that Kittler's thresholding, which minimizes a criterion related to the average classification error rare assuming normal distribution with different variances, is equivalent to the maximization of the likelihood of the joint distribution in the population mixture model. Next, an efficient algorithm for agglomerative clustering is proposed as an example of clustering methods of general multidimensional data. The algorithm uses a heap in which distances of all pairs of clusters are stored. Then the nearest pair of clusters is obtained as the element of the root node of the binary tree corresponding to the heap. The computation time of the algorithm is at most $O(N^2\log(N))$ when objects are going to be classified. Finally, an incremental clustering algorithm is presented for the case where the data is given one by one.

In the section 4, problems related with learning of layered neural networks is discussed from the statistical viewpoint. At first, learning algorithms of layered neural networks are discussed. From the standpoint of maximum likelihood estimation Fisher information is explicitly calculated for the layered network with one hidden layer. Fisher information is given as a weighted covariance matrix of inputs and outputs of hidden units. Then an iterative weighted least squares algorithm is proposed by utilizing block diagonal elements of Fisher information matrix. In the algorithm each neuron repeatedly estimates its own weights by a weighted least squares method. Next, methods to design neural network with good generalization ability are discussed. Generalization is one of the most important problems on neural network learning. Even if the learned network is adapted to a given set of training samples well, it often does not fit unknown samples. Thus it is necessary to evaluate the generalization ability of the learned network. This paper proposes methods which utilize information criteria such as AIC (Akaike' information criterion) or MLD (minimum description length).

From the section 5 through 9 applications of statistical methods are discussed.

In the section 5, multivariate data analysis methods are applied to obtain near optimum solution of a combinatorial problem. A problem to make a schedule of meetings is considered as an example of combinatorial problems. By using principal component analysis or Hayashi's quantification method, rough information is extracted from the original data and the problem is approximately reduced to a simple problem. For the same problem, agglomerative clustering algorithm discussed in the section 4 is also applied to get near optimal solution.

In the section 6, application to image compression of multivariate data analysis method is discussed and a method of block truncation coding (BTC) for color image compression is proposed. BTC for monochrome image is a noninformation preserving method which attempts to retain important visual features while discarding details which are not likely to be noticeable to a human observer. A one-bit adaptive scalar quantizer over small blocks is utilized for this purpose. To extend the idea of BTC to color image a one-bit adaptive vector quantizer (VQ) is used instead of a scalar quantizer. The VQ is designed by using the principal scores obtained by principal component analysis of R, G, B tristimulus values. Then the scores are classified into two classes by using a thresholding method discussed in the section 4.

In the section 7, problems on shape recognition are discussed from the standpoint of multivariate data analysis. Shape recognition is one of the basic problems in the field of pattern recognition. Many shape descriptors have been already developed. Fourier descriptors, smoothed curvature functions, moments, and chord length distribution are especially popular as descriptors intrinsically describing a boundary of a pattern. A method to describe planer shapes are presented on the basis of a complex autoregressive (CAR) model. The method can be invariant to the transformation of a boundary such as rotation, scale, translation. A fast algorithm to calculate complex autoregressive coefficients and complex partial correlation (CPARCOR) coefficients is also shown. Several distance measures (Euclidean distances between CAR coefficients or CPARCOR coefficients, log-likelihood distance, complex power cepstrum distance, complex power mel-cepstrum distance) are also proposed on the basis of the CAR model. The measures are also invariant to the transformation of a boundary.

In section 8, applications to computer vision are discussed. At first, general purpose image recognition and measurement method is proposed as an example of 2D computer vision. It is characterized by structural simplicity, trainability and high speed. The method consists of two stages of feature extractions. Higher order local autocorrelation features are employed as the primitive features at the first stage of feature extraction. Then those features are linearly combined by using multivariate data analysis methods. Next, a method to compute Mean and Gaussian curvatures from range images is proposed. Range image (depth map) processing has become one of the most important topics in 3D computer vision. Mean and Gaussian curvatures (surface curvatures) are invariant under rigid transformation. Smooth surfaces are locally characterized by them and classified into one of eight surface types using a combination of their signs. Thus, if we can compute local surface curvatures accurately, one can segment range images into several surface regions with similar surface types. To estimate the local surface curvatures, the first and second partial derivatives are estimated by the weighted least squares fit of a biquadratic polynomial within a local moving window. The weights are determined on the basis of surface distances and normal angular distances from the center points of the window.

In the section 9, applications to retrieval of image database are discussed. Since visual impression may differ with each person, use-friendly interfaces for image database systems require special retrieval methods which can adapt to the visual impression of each user. Algorithms for learning personal visual impression are proposed for image database systems. The algorithms are also based on multivariate data analysis methods. These algorithms provide a model on visual perception process of each user from a small set of training examples and the model is referred to as a personal index to retrieve desired images for the user. These algorithms are implemented in a graphical symbol database system and a full color image database system.

Takio Kurita 平成14年7月3日