Pattern Recognition and Artificial Intelligence
2017 Vol. 30 Issue 1, Published 2017-01-31

Papers and Reports
1 Clustering Assumption Based Classification Algorithm for Stream Data
LI Nan
Labeling every instance is impractical in a real streaming environment because labeled data are costly to acquire, yet labeling only part of the instances makes the model unstable. To address these problems, a clustering assumption based classification algorithm for stream data (CASD) is proposed. Instances grouped into the same cluster are assumed to come from the same class. Based on this clustering assumption, the clustering result is used to fit the distribution of each class. Instances that are difficult to classify or that come from a class undergoing concept drift are selected to update the current model. Another innovation of the proposed algorithm is that it maintains several base learners for each class and updates them dynamically. When instances from a specific class disappear or reappear, the corresponding base learners are frozen or activated rather than relearning prior knowledge. Experimental results show that, with only a few labeled instances, the accuracy of CASD is comparable to that of state-of-the-art algorithms and the model adapts to concept drift rapidly.
2017 Vol. 30 (1): 1-10
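The clustering assumption in the abstract above can be illustrated with a minimal sketch. It is not the CASD algorithm itself: it only shows how a few labeled instances could lend their majority label to every instance in the same cluster. The function name `label_by_cluster` and the toy data are assumptions made for illustration.

```python
from collections import Counter


def label_by_cluster(cluster_ids, known_labels):
    """Give every instance the majority label of the labeled instances in its cluster.

    cluster_ids: list of cluster indices, one per instance.
    known_labels: dict {instance index: class label} for the few labeled instances.
    Instances in clusters without any labeled instance get None.
    """
    votes = {}
    for idx, label in known_labels.items():
        votes.setdefault(cluster_ids[idx], Counter())[label] += 1
    majority = {c: counts.most_common(1)[0][0] for c, counts in votes.items()}
    return [majority.get(c) for c in cluster_ids]


# Toy example (hypothetical data): six instances, three of them labeled.
clusters = [0, 0, 0, 1, 1, 2]
labels = {0: "a", 1: "a", 3: "b"}
predicted = label_by_cluster(clusters, labels)
```

Cluster 2 contains no labeled instance, so its member stays unlabeled; in a streaming setting such instances would be natural candidates for a labeling request.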
11 Model of Multi-granulation Neighborhood Rough Intuitionistic Fuzzy Sets
XUE Zhan'ao, SI Xiaomeng, YUAN Yilin, XIN Xianwei
The combination of the multi-granulation neighborhood rough set and the intuitionistic fuzzy set is studied further in this paper. Firstly, intuitionistic fuzzy covering-based rough membership and non-membership are defined to deal with heterogeneous data that include both categorical and numerical attributes. Secondly, a multi-granulation neighborhood rough intuitionistic fuzzy set model is established based on different attribute set sequences and different neighborhood radii. Then, the properties of the multi-granulation neighborhood rough intuitionistic fuzzy set are discussed. Next, the approximation sets of the optimistic and pessimistic multi-granulation neighborhood rough intuitionistic fuzzy sets are constructed and their properties are discussed. Finally, the models are illustrated with examples. The example analysis shows that the models handle heterogeneous data with categorical and numerical attributes more accurately.
2017 Vol. 30 (1): 11-20
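As background for the optimistic and pessimistic models named above, the following sketch computes lower approximations for the ordinary (crisp, numerical-only) multi-granulation neighborhood rough set. It does not implement the intuitionistic fuzzy extension or the handling of categorical attributes described in the abstract. The Chebyshev distance, the function names and the toy data are assumptions.

```python
def neighborhood(data, i, attrs, radius):
    """Indices of samples within `radius` of sample i (Chebyshev distance on `attrs`)."""
    return {
        j for j, row in enumerate(data)
        if max(abs(row[a] - data[i][a]) for a in attrs) <= radius
    }


def lower_approximations(data, target, granulations):
    """Optimistic and pessimistic lower approximations of the index set `target`.

    granulations: list of (attribute indices, radius) pairs, one per granulation.
    Optimistic: the neighborhood lies inside `target` under at least one granulation.
    Pessimistic: the neighborhood lies inside `target` under every granulation.
    """
    optimistic, pessimistic = set(), set()
    for i in range(len(data)):
        contained = [
            neighborhood(data, i, attrs, radius) <= target
            for attrs, radius in granulations
        ]
        if any(contained):
            optimistic.add(i)
        if all(contained):
            pessimistic.add(i)
    return optimistic, pessimistic


# Toy example (hypothetical data): two granulations, one per attribute.
data = [(0.0, 0.0), (0.1, 0.9), (0.9, 0.1), (1.0, 1.0)]
target = {0, 1}
granulations = [((0,), 0.2), ((1,), 0.2)]
optimistic, pessimistic = lower_approximations(data, target, granulations)
```

In this example the first attribute separates the target set cleanly while the second does not, so the optimistic lower approximation is `{0, 1}` and the pessimistic one is empty.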
21 Online Associative Memory Model Based on Self-organizing Decision Tree
XIE Zhenping, SUN Tao
To model associative relationships among multi-source data in an online manner, an online associative memory model based on a self-organizing decision tree is proposed, designed for efficient computation and good noise robustness. In the proposed model, real multi-source data are first reduced to a finite set of representatives for information enhancement. Then, the data representatives are divided into sub-domains by a decision tree algorithm. Finally, the associative relations among the multi-source data are trained on the different sub-domains. The learning stability of the proposed model is analyzed theoretically. Experimental results demonstrate that the proposed model performs well on online classification learning and on hetero-associative modeling of noisy data.
2017 Vol. 30 (1): 21-31
32 Regional Image Segmentation Based on Energy Functions for Double Random Fields
ZHAO Quanhua, ZHAO Xuemei, LI Yu
To improve the noise robustness of image segmentation and to take full advantage of the energy functions of the feature and label fields, a regional image segmentation algorithm based on energy functions for double random fields is proposed. Firstly, the image domain is partitioned into a set of sub-regions by a geometric tessellation technique. Based on the tessellation, the negative logarithm of a multi-Gaussian probability distribution is employed to define the regionalized feature field energy function, which describes the homogeneity of the statistical distribution of pixel colors within a homogeneous region. The improved Potts model, which extends the traditional model defined on the labels of a pixel and its neighboring pixels, is used to define the regionalized label field energy function, which characterizes the correlation between the labels of sub-regions. Combining the feature field and the label field, Kullback-Leibler divergence is used to define the heterogeneous energy function, which describes the heterogeneity of color distributions among different homogeneous regions. The unconditional Gibbs function is adopted to transform the defined energy functions into probability functions for image segmentation. Finally, following a probability maximization scheme, a Metropolis-Hastings sampler is designed to obtain the optimal segmentation. Synthetic, remote sensing and natural texture images are segmented by several algorithms, and the results show that the proposed algorithm segments images accurately and efficiently.
2017 Vol. 30 (1): 32-42
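The building blocks named in the abstract above can be sketched in a few lines. This is a simplified illustration, not the paper's formulation: it uses the traditional Potts model rather than the improved one, univariate rather than multi-Gaussian distributions, and an unnormalized Gibbs weight. All function names and the toy labeling are assumptions.

```python
import math


def potts_energy(labels, neighbors, beta=1.0):
    """Traditional Potts energy: a penalty of beta for each neighboring pair with different labels.

    labels: list of region labels; neighbors: list of (i, j) index pairs.
    """
    return beta * sum(1 for i, j in neighbors if labels[i] != labels[j])


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) for univariate Gaussians p = N(mu_p, sigma_p^2) and q = N(mu_q, sigma_q^2)."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
        - 0.5
    )


def gibbs_weight(energy):
    """Unnormalized Gibbs probability exp(-energy); lower energy means higher probability."""
    return math.exp(-energy)


# Toy example (hypothetical): four regions in a chain, one label boundary.
labels = [0, 0, 1, 1]
neighbors = [(0, 1), (1, 2), (2, 3)]
energy = potts_energy(labels, neighbors, beta=0.5)
```

A Metropolis-Hastings sampler would compare `gibbs_weight` values of the current and proposed labelings to decide whether to accept a proposal.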
Researches and Applications
43 Incremental Deep Web Crawling with Top-k Query Constraint
JIANG Junyan, PENG Zhiyong, WU Xiaoying
Crawling all deep web data is difficult for third-party applications because deep web data sources are dynamic, autonomous and large. To tackle deep web crawling under a query type restriction (only top-k queries are allowed) and limited query resources, an approach for incremental deep web crawling with a top-k query constraint is proposed. Historical data and domain knowledge are combined to maximize the total data quality of the repository. Firstly, valid queries are generated using a query tree, and the changes and corresponding cost of each query are estimated from historical data and domain knowledge. Next, based on the estimated query cost and data quality, a subset of queries is selected approximately to maximize the total data quality globally under limited query resources. Experimental results on real datasets show that the proposed approach improves the efficiency of crawling dynamic web databases.
2017 Vol. 30 (1): 43-53
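The abstract above does not say how the query subset is approximated. One common heuristic for budgeted selection is a greedy pass by estimated gain per unit cost, sketched below as an assumption rather than as the authors' method. The function name `select_queries` and the candidate values are hypothetical.

```python
def select_queries(candidates, budget):
    """Greedy approximation: take queries by estimated gain per unit cost while the budget allows.

    candidates: list of (query, estimated_gain, estimated_cost) tuples with cost > 0.
    budget: total query resources available.
    """
    chosen, spent = [], 0.0
    ranked = sorted(candidates, key=lambda c: c[1] / c[2], reverse=True)
    for query, gain, cost in ranked:
        if spent + cost <= budget:
            chosen.append(query)
            spent += cost
    return chosen


# Toy example (hypothetical estimates): three candidate queries, budget of 7.
candidates = [("q1", 10.0, 5.0), ("q2", 6.0, 2.0), ("q3", 4.0, 4.0)]
selected = select_queries(candidates, budget=7.0)
```

Here `q2` has the best gain-to-cost ratio and is taken first, `q1` still fits the remaining budget, and `q3` is skipped.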
54 Relation Extraction Method Combining Clause Level Distant Supervision and Semi-supervised Ensemble Learning
YU Xiaokang, CHEN Ling, GUO Jing, CAI Yaya, WU Yong, WANG Jingchang
To address noise in the training data and the insufficient use of negative instances in traditional distant supervision relation extraction methods, a relation extraction method combining clause-level distant supervision and semi-supervised ensemble learning is proposed. Firstly, the relation instance set is generated by distant supervision. Secondly, based on clause identification, a denoising algorithm is used to reduce the wrongly labeled data in the relation instance set. Thirdly, lexical features are extracted from the relation instances and transformed into distributed vectors to build the feature dataset. Finally, all positive data and part of the negative data in the feature dataset form the labeled dataset, and the remaining negative data form the unlabeled dataset. A relation classifier is then trained with an improved semi-supervised ensemble learning algorithm. Experiments show that the proposed method achieves higher accuracy and recall than the baseline methods.
2017 Vol. 30 (1): 54-63
64 Diversity-Aware KNN Query Processing Approaches for Temporal Spatial Textual Content
LI Chen, SHEN Derong, KOU Yue, NIE Tiezheng, YU Ge
Finding textual content that satisfies a user's demand among the large amount of web-generated textual content carrying location and time tags is very important. Firstly, the location and time variables of data objects are normalized, and a three-dimensional R-tree index combining location and time variables is designed. Then, a DST-KNN query algorithm and an improved diversity-aware KNN query algorithm, IDST-KNN, are proposed. Finally, experiments on massive datasets show that the query processing approaches are efficient and accurate.
2017 Vol. 30 (1): 64-72
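The normalization step described above can be sketched as follows. The example scales location and time to a common range and then answers a KNN query by linear scan. It is a baseline only: it uses neither the three-dimensional R-tree index nor the diversity-aware ranking of DST-KNN and IDST-KNN, and the min-max scaling, function names and data are assumptions.

```python
def min_max_normalize(points):
    """Scale each coordinate of (x, y, t) tuples to [0, 1]; constant coordinates map to 0."""
    lows = [min(p[d] for p in points) for d in range(3)]
    highs = [max(p[d] for p in points) for d in range(3)]
    spans = [(high - low) or 1.0 for low, high in zip(lows, highs)]
    return [tuple((p[d] - lows[d]) / spans[d] for d in range(3)) for p in points]


def knn(points, query, k):
    """Indices of the k points closest to `query` by squared Euclidean distance (linear scan)."""
    def squared_distance(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))

    return sorted(range(len(points)), key=lambda i: squared_distance(points[i]))[:k]


# Toy example (hypothetical data): (x, y, t) with time on a much larger scale than location.
raw = [(0.0, 0.0, 0.0), (10.0, 0.0, 100.0), (5.0, 5.0, 50.0), (10.0, 10.0, 0.0)]
normalized = min_max_normalize(raw)
nearest = knn(normalized, (0.0, 0.0, 0.0), k=2)
```

Without normalization the time coordinate would dominate the distance. The query point must be expressed in the same normalized space as the indexed points.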
73 Convolutional Neural Network and User Information Based Model for Microblog Topic Tracking
FU Peng, LIN Zheng, YUAN Fengcheng, LIN Hailun, WANG Weiping, MENG Dan
To address feature sparseness and feature extraction in microblog text, a topic tracking model for Chinese microblogs based on a convolutional neural network (CNN-TTM) is proposed. Furthermore, user profiles and attributes are incorporated into CNN-TTM to construct a model called CNN-UserTTM, in which the user information of microblogs is used to improve the accuracy of topic tracking. Experimental results demonstrate that both CNN-TTM and CNN-UserTTM achieve high accuracy on a Sina microblog dataset.
2017 Vol. 30 (1): 73-80
81 Incomplete Data Imputation Clustering Based on Difference of Convex Functions Programming
HE Dan, CHEN Songcan
To improve clustering performance, an incomplete data imputation clustering algorithm based on difference of convex functions programming (DCP) is proposed. DCP is applied to optimize the kernel-based fuzzy C-means objective function, and an alternating optimization procedure for DCP clustering and missing-value imputation is given. The convergence of the alternating optimization is proved theoretically. Experiments show the superiority of the proposed algorithm in both missing-value imputation and clustering performance.
2017 Vol. 30 (1): 81-88
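For context, the sketch below shows the membership update of ordinary kernel-based fuzzy C-means with a Gaussian kernel, where the membership of a point in cluster i is proportional to (1 - K(x, v_i)) raised to the power -1/(m - 1). This is the conventional update, not the DCP-based optimization proposed in the paper, and it assumes complete data. The kernel width `sigma`, the small constant `eps` and the function names are assumptions.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / sigma^2)."""
    squared = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-squared / sigma ** 2)


def kfcm_memberships(x, centers, m=2.0, sigma=1.0, eps=1e-12):
    """Fuzzy memberships of point x in each cluster for kernel fuzzy C-means.

    m > 1 is the fuzzifier; eps avoids division by zero when x coincides with a center.
    """
    weights = [
        (1.0 - gaussian_kernel(x, v, sigma) + eps) ** (-1.0 / (m - 1.0))
        for v in centers
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Toy example (hypothetical): a point midway between two centers.
u = kfcm_memberships((0.0,), [(-1.0,), (1.0,)])
```

A point equidistant from both centers receives equal memberships, and memberships always sum to one.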
89 A Semi-supervised Feature Selection Method Based on Local Discriminant Constraint
YAN Fei, WANG Xiaodong
In feature selection, the most representative features are selected to reduce the dimensionality of the feature space. A semi-supervised feature selection method based on a local discriminant constraint is presented in this paper. Both labeled and unlabeled training samples are used to construct the feature selection model, and the local discriminant information between adjacent data points is adopted to improve model accuracy. An l2,1 constraint is then added to improve the distinguishability of the features and to avoid noise interference. Finally, the proposed algorithm is compared with several state-of-the-art feature selection methods. Experimental results demonstrate its effectiveness.
2017 Vol. 30 (1): 89-95
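The l2,1 norm mentioned above is the sum of the Euclidean norms of a matrix's rows. Penalizing it drives whole rows toward zero, which is why it suits feature selection when each row holds the weights of one feature. The sketch below computes the norm and ranks features by row norm; the matrix `W` and the function names are hypothetical, and the paper's full objective is not reproduced.

```python
import math


def row_norms(W):
    """Euclidean norm of each row of W (one row per feature)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """l2,1 norm: the sum of the Euclidean norms of the rows of W."""
    return sum(row_norms(W))


def rank_features(W):
    """Feature indices sorted by decreasing row norm; near-zero rows mark discarded features."""
    norms = row_norms(W)
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)


# Toy example (hypothetical weights): three features, two projection dimensions.
W = [[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]]
norm = l21_norm(W)
ranking = rank_features(W)
```

The second feature has an all-zero row, so it contributes nothing to the norm and ranks last.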

Supervised by
China Association for Science and Technology
Sponsored by
Chinese Association of Automation
National Research Center for Intelligent Computing System
Institute of Intelligent Machines, Chinese Academy of Sciences
Published by
Science Press
 
Copyright © 2010 Editorial Office of Pattern Recognition and Artificial Intelligence
Address: No.350 Shushanhu Road, Hefei, Anhui Province, P.R. China Tel: 0551-65591176 Fax:0551-65591176 Email: bjb@iim.ac.cn