Maximal quasi-bicliques with balanced noise tolerance: Concepts and co-clustering applications
- Publication Type:
- Conference Proceeding
- Society for Industrial and Applied Mathematics - 8th SIAM International Conference on Data Mining 2008, Proceedings in Applied Mathematics 130, 2008, 1 pp. 72 - 83
- Issue Date:
The rigid all-versus-all adjacency required by a maximal biclique for its two vertex sets is extremely vulnerable to missing data. In the past, several types of quasi-bicliques have been proposed to tackle this problem, however their noise tolerance is usually unbalanced and can be very skewed. In this paper, we improve the noise tolerance of maximal quasi-bicliques by allowing every vertex to tolerate up to the same number, or the same percentage, of missing edges. This idea leads to a more natural interaction between the two vertex sets-a balanced most-versus-most adjacency. This generalization is also non-trivial, as many large-size maximal quasi-biclique subgraphs do not contain any maximal bicliques. This observation implies that direct expansion from maximal bicliques may not guarantee a complete enumeration of all maximal quasi-bicliques. We present important properties of maximal quasi-bicliques such as a bounded closure property and a fixed point property to design efficient algorithms. Maximal quasi-bicliques are closely related to co-clustering problems such as documents and words co-clustering, images and features coclustering, stocks and financial ratios co-clustering, etc. Here, we demonstrate the usefulness of our concepts using a new application - a bioinformatics example - where prediction of true protein interactions is investigated. Copyright © by SIAM.
Please use this identifier to cite or link to this item: