Scientific Journals and Yearbooks Published at SAS

Article List

Computing and Informatics


Volume 30, 2011, No. 2

Content:


  libNMF -- A Library for Nonnegative Matrix Factorization
A. Janecek, S. Schulze Grotthoff, W.N. Gansterer

Nonnegative matrix factorization, low-rank approximation, evaluation, NMF library, NMF software

We present libNMF -- a computationally efficient high performance library for computing nonnegative matrix factorizations (NMF) written in C. Various algorithms and algorithmic variants for computing NMF are supported. libNMF is based on external routines from BLAS (Basic Linear Algebra Subprograms), LAPack (Linear Algebra package) and ARPack, which provide efficient building blocks for performing central vector and matrix operations. Since modern BLAS implementations support multi-threading, libNMF can exploit the potential of multi-core architectures. In this paper, the basic NMF algorithms contained in libNMF and existing implementations found in the literature are briefly reviewed. Then, libNMF is evaluated in terms of computational efficiency and numerical accuracy and compared with the best existing codes available. libNMF is publicly available at http://rlcta.univie.ac.at/software.

Computing and Informatics. Volume 30, 2011, No. 2: 205-224.

 
  Multilevel Aggregation Methods for Small-World Graphs with Application to Random-Walk Ranking
H. de Sterck, V. Henson, G. Sanders

Markov chain, aggregation, multilevel, random graphs

We describe multilevel aggregation in the specific context of using Markov chains to rank the nodes of graphs. More generally, aggregation is a graph coarsening technique that has a wide range of possible uses regarding information retrieval applications. Aggregation successfully generates efficient multilevel methods for solving nonsingular linear systems and various eigenproblems from discretized partial differential equations, which tend to involve mesh-like graphs. Our primary goal is to extend the applicability of aggregation to similar problems on small-world graphs, with a secondary goal of developing these methods for eventual applicability towards many other tasks such as using the information in the hierarchies for node clustering or pattern recognition. The nature of small-world graphs makes it difficult for many coarsening approaches to obtain useful hierarchies that have complexity on the order of the number of edges in the original graph while retaining the relevant properties of the original graph. Here, for a set of synthetic graphs with the small-world property, we show how multilevel hierarchies formed with non-overlapping strength-based aggregation have optimal or near optimal complexity. We also provide an example of how these hierarchies are employed to accelerate convergence of methods that calculate the stationary probability vector of large, sparse, irreducible, slowly-mixing Markov chains on such small-world graphs. The stationary probability vector of a Markov chain allows one to rank the nodes in a graph based on the likelihood that a long random walk visits each node. These ranking approaches have a wide range of applications including information retrieval and web ranking, performance modeling of computer and communication systems, analysis of social networks, dependability and security analysis, and analysis of biological systems [19].

Computing and Informatics. Volume 30, 2011, No. 2: 225-246.

 
  Parallel Retrieval of Dense Vectors in the Vector Space Model
T. Berka, M. Vajtersic

Vector space model, symmetric multiprocessing, dense vector computations, message passing interface

Modern information retrieval systems use distributed and parallel algorithms to meet their operational requirements, and commonly operate on sparse vectors; but dimensionality-reducing techniques produce dense and relatively short feature vectors. Motivated by this relevance of dense vectors, we have parallelized the vector space model for dense matrices and vectors. Our algorithm uses a hybrid partitioning splitting documents and features and operates on a mesh of hosts holding a block partitioned corpus matrix. We show that the theoretic speed-up is optimal. The empirical evaluation of an MPI-based implementation reveals that we obtain a super-linear speed-up on a cluster using Nehalem Xeon CPUs.

Computing and Informatics. Volume 30, 2011, No. 2: 247-265.

 
  New Inheritance Complexity Metrics for Object-Oriented Software Systems: An Evaluation with Weyuker's Properties
D. Mishra

Weyuker's properties, software metrics, object-oriented systems, inheritance, complexity

Two inheritance complexity metrics, one at class level CCI (Class Complexity due to Inheritance) and another at program level ACI (Average Complexity of a program due to Inheritance), have been proposed for object-oriented software systems. These proposed metrics are evaluated with Weyuker's properties and compared with other well known object-oriented inheritance metrics. It has been found that the proposed metrics better represent the complexity, due to inheritance, of a class and a program. Weyuker's property 7 (Significance of Permutation) has received a negative response regarding its applicability to object-oriented software metrics. It has been observed that this property is not satisfied by any of the object-oriented inheritance metrics proposed so far. Contrary to past beliefs, the relevance of this property to object-oriented systems has been brought out in this paper. Examples with C++ code are also presented to support the applicability of this property.

Computing and Informatics. Volume 30, 2011, No. 2: 267-293.

 
  Finger Vein Recognition by Combining Global and Local Features based on SVM
K.R. Park

Biometrics, finger vein recognition, LBP, Wavelet transform, SVM

Recently, biometrics such as fingerprints, faces and irises recognition have been widely used in many applications including door access control, personal authentication for computers, internet banking, automatic teller machines and border-crossing controls. Finger vein recognition uses the unique patterns of finger veins to identify individuals at a high level of accuracy. This paper proposes new algorithms for finger vein recognition. This research presents the following three advantages and contributions compared to previous works. First, we extracted local information of the finger veins based on a LBP (Local Binary Pattern) without segmenting accurate finger vein regions. Second, the global information of the finger veins based on Wavelet transform was extracted. Third, two score values by the LBP and Wavelet transform were combined by the SVM (Support Vector Machine). As experimental results, the EER (Equal Error Rate) was 0.011 % and the total processing time was 98.2ms.

Computing and Informatics. Volume 30, 2011, No. 2: 295-309.

 
  The Variant of Latent Dirichlet Allocation for Natural Scene Classification
T. Yingjun

LDA, CCLDA, topic, visual visterm, scene classification

The paper proposes a novel model based on classic LDA (latent Dirichlet allocation), which is used to learn and recognize natural scene category. Unlike previous work, the model performs variational Bayesian inference (VB) two times in order to get more precise prior Dirichlet parameters for each scene category. Although the scenes is represented in common topic simplex, the model has retained the diversities of each scene category based on the same topic simplex. Furthermore, two discriminations have been done to get good performance. We investigated the classification performance with classic 13 scenes image database and the experiments had demonstrated that our method can get better performance with less training time.

Computing and Informatics. Volume 30, 2011, No. 2: 311-319.

 
  Modular Echo State Neural Networks in Time Series Prediction
S. Babinec, J. Pospichal

Echo State neural networks, recurrent neural networks, time series prediction, gating of artificial neural networks

Echo State neural networks (ESN), which are a special case of recurrent neural networks, are studied from the viewpoint of their learning ability, with a goal to achieve their greater predictive ability. In this paper we study the influence of the memory length on predictive abilities of Echo State neural networks. The conclusion is that Echo State neural networks with fixed memory length can have troubles with adaptation of its intrinsic dynamics to dynamics of the prediction task. Therefore, we have tried to create complex prediction system as a combination of the local expert Echo State neural networks with different memory length and one special gating Echo State neural network. This approach was tested in laser fluctuations and turbojet gas temperature prediction. The prediction error achieved by this approach was substantially smaller in comparison with prediction error achieved by standard Echo State neural networks.

Computing and Informatics. Volume 30, 2011, No. 2: 321-334.

 
  A Clustering Scheme in Application Layer Multicast
X. Li, X. Zhang, W. Luo, B. Yan

Multicast, application layer multicast, WHOIS searching, network topology, IP address

In application layer multicast (ALM), member hosts lack direct knowledge of underlying network topology, which brings some performance penalty. This paper investigates an effective way to rapidly obtain some related topology knowledge, i.e. getting topology hints from existing IP registered resources -- WHOIS database. We further propose a clustering scheme, which can be integrated into the existing ALM solutions. Our proposed scheme can cluster some nearby member hosts no matter when these hosts join the group. Therefore the scheme also solves the join sequences problem in some degree. We also present an application framework of the clustering scheme, and give an application example named HMTP-E that integrates the scheme into HMTP protocol. The experiment results show that the clustering scheme plays a positive role on improving the performance of existing ALM solutions.

Computing and Informatics. Volume 30, 2011, No. 2: 335-355.

 
  Clustering of Steel Strip Sectional Profiles Based on Robust Adaptive Fuzzy Clustering Algorithm
Ch. Tang, Sh. Wang, Y. Chen

Fuzzy clustering, robust, adaptive operators, outlier mining, steel strip profile

In this paper, the intelligent techniques are applied to enhance the quality control precision in the steel strip cold rolling production. Firstly a new control scheme is proposed, establishing the classifier of the steel strip cross-sectional profiles is the core of the system. The fuzzy clustering algorithm is used to establish the classifier. Secondly, a novel fuzzy clustering algorithm is proposed and used in the real application. The results, under the comparisons with the results obtained by the conventional fuzzy clustering algorithm, show the new algorithm is robust and efficient and it can not only get better clustering prototypes, which are used as the classifier, but also easily and effectively detect the outliers; it does great help in improving the performances of the new system. Finally, it is pointed out that the new algorithm's efficiency is mainly due to the introduction of a set of adaptive operators which allow for treating the different influences of data objects on the clustering operations; and in nature, the new fuzzy algorithm is the generalized version of the existing fuzzy clustering algorithm.

Computing and Informatics. Volume 30, 2011, No. 2: 357-380.

 
  Exploring Complex Networks with Graph Investigator Research Application
W. Czech, W. Dzwinel, S. Goryczka, T. Arodz, A.Z. Dudek

Complex networks, graph descriptor, graph matching, biological network, graph pattern vector

This paper describes Graph Investigator, the application intended for analysis of complex networks. A rich set of application functions is briefly described including graph feature generation, comparison, visualization and edition. The program enables to analyze global and local structural properties of networks with the use of various descriptors derived from graph theory. Furthermore, it allows to quantify inter-graph similarity by embedding graph patterns into low-dimensional space or distance measurement based on feature vectors. The set of available graph descriptors includes over eighty statistical and algebraic measures. We present two examples of real-world networks analysis performed with Graph Investigator: comparison of brain vasculature with structurally similar artificial networks and analysis of vertices importance in a macaque cortical connectivity network. The third example describes tracking parameters of artificial vascular network evolving in the process of angiogenesis, modelled with the use of cellular automata.

Computing and Informatics. Volume 30, 2011, No. 2: 381-410.