Given a weighted graph, graph transduction aims to assign unlabeled examples explicit class labels rather than build a general decision function based on the available labeled examples. Practically, a dataset usually contains many noisy data, such as the “bridge points” located across different classes, and the “outliers” that incur abnormal distances from the normal examples of their classes. The labels of these examples are usually ambiguous and also difficult to decide. Labeling them incorrectly may further bring about erroneous classifications on the remaining unlabeled examples. Therefore, their accurate classifications are critical to obtaining satisfactory final performance.
Unfortunately, current graph transduction algorithms usually fall short of tackling the noisy but critical examples, so they may become fragile and produce imperfect results sometimes. Therefore, in this thesis we aim to develop a series of robust graph transduction methodologies via iterative or non-iterative way, so that they can perfectly handle the difficult noisy data points. Our works are summarized as follows:
In Chapter 2, we propose a robust non-iterative algorithm named “Label Prediction via Deformed Graph Laplacian” (LPDGL). Different from the existing methods that usually employ a traditional graph Laplacian to achieve label smoothness among pairs of examples, in LPDGL we introduce a deformed graph Laplacian, which not only induces the existing pairwise smoothness term, but also leads to a novel local smoothness term. This local smoothness term detects the ambiguity of each example by exploring the associated degree, and assigns confident labels to the examples with large degree, as well as allocates “weak labels” to the uncertain examples with small degree. As a result, the negative effects of outliers and bridge points are suppressed, leading to more robust transduction performance than some existing representative algorithms. Although LPDGL is designed for transduction purpose, we show that it can be easily extended to inductive settings.
In Chapter 3, we develop an iterative label propagation approach, called “Fick’s Law Assisted Propagation” (FLAP), for robust graph transduction. To be specific, we regard label propagation on the graph as the practical fluid diffusion on a plane, and develop a novel label propagation algorithm by utilizing a well-known physical theory called Fick’s Law of Diffusion. Different from existing machine learning models that are based on some heuristic principles, FLAP conducts label propagation in a “natural” way, namely when and how much label information is received or transferred by an example, or where these labels should be propagated to, are naturally governed. As a consequence, FLAP not only yields more robust propagation results, but also requires less computational time than the existing iterative methods.
In Chapter 4, we propose a propagation framework called “Teaching-to-Learn and Learning-to-Teach” (TLLT), in which a “teacher” (i.e. a teaching algorithm) is introduced to guide the label propagation. Different from existing methods that equally treat all the unlabeled examples, in TLLT we assume that different examples have different classification difficulties, and their propagations should follow a simple-to-difficult sequence. As such, the previously “learned” simple examples can ease the learning for the subsequent more difficult examples, and thus these difficult examples can be correctly classified. In each iteration of propagation, the teacher will designate the simplest examples to the “learner” (i.e. a propagation algorithm). After “learning” these simplest examples, the learner will deliver a learning feedback to the teacher to assist it in choosing the next simplest examples. Due to the collaborative teaching and learning process, all the unlabeled examples are propagated in a well-organized sequence, which contributes to the improved performance over existing methods.
In Chapter 5, we apply the TLLT framework proposed in Chapter 4 to accomplish saliency detection, so that the saliency values of all the superpixels are decided from simple superpixels to more difficult ones. The difficulty of a superpixel is judged by its informativity, individuality, inhomogeneity, and connectivity. As a result, our saliency detector generates manifest saliency maps, and outperforms baseline methods on the typical public datasets.