Hsinchun Chen

Professor, Management Information Systems
Regents Professor
Member of the Graduate Faculty
Professor, BIO5 Institute
Contact
(520) 621-4153

Research Interest

Dr. Chen's areas of expertise include:

Security informatics, security big data; smart and connected health, health analytics; data, text, web mining.

Digital library, intelligent information retrieval, automatic categorization and classification, machine learning for IR, large-scale information analysis and visualization.

Internet resource discovery, digital libraries, IR for large-scale scientific and business databases, customized IR, multilingual IR.

Knowledge-based systems design, knowledge discovery in databases, hypertext systems, machine learning, neural networks computing, genetic algorithms, simulated annealing.

Cognitive modeling, human-computer interactions, IR behaviors, human problem-solving process.

Publications

Chen, H. (2000). Introduction to the special topic issue: Part 2. Journal of the American Society for Information Science and Technology, 51(4), 311-312.
Abbasi, A., France, S., Zhang, Z., & Chen, H. (2011). Selecting attributes for sentiment classification using feature relation networks. IEEE Transactions on Knowledge and Data Engineering, 23(3), 447-462.

Abstract:

A major concern when incorporating large sets of diverse n-gram features for sentiment classification is the presence of noisy, irrelevant, and redundant attributes. These concerns can often make it difficult to harness the augmented discriminatory potential of extended feature sets. We propose a rule-based multivariate text feature selection method called Feature Relation Network (FRN) that considers semantic information and also leverages the syntactic relationships between n-gram features. FRN is intended to efficiently enable the inclusion of extended sets of heterogeneous n-gram features for enhanced sentiment classification. Experiments were conducted on three online review testbeds in comparison with methods used in prior sentiment classification research. FRN outperformed the comparison univariate, multivariate, and hybrid feature selection methods; it was able to select attributes resulting in significantly better classification accuracy irrespective of the feature subset sizes. Furthermore, by incorporating syntactic information about n-gram relations, FRN is able to select features in a more computationally efficient manner than many multivariate and hybrid techniques. © 2006 IEEE.

Chen, H., Lally, A. M., Zhu, B., & Chau, M. (2003). HelpfulMed: Intelligent searching for medical information over the Internet. Journal of the American Society for Information Science and Technology, 54(7), 683-694.

Abstract:

Medical professionals and researchers need information from reputable sources to accomplish their work. Unfortunately, the Web has a large number of documents that are irrelevant to their work, even those documents that purport to be "medically-related." This paper describes an architecture designed to integrate advanced searching and indexing algorithms, an automatic thesaurus, or "concept space," and Kohonen-based Self-Organizing Map (SOM) technologies to provide searchers with fine-grained results. Initial results indicate that these systems provide complementary retrieval functionalities. HelpfulMed not only allows users to search Web pages and other online databases, but also allows them to build searches through the use of an automatic thesaurus and browse a graphical display of medical-related topics. Evaluation results for each of the different components are included. Our spidering algorithm outperformed both breadth-first search and PageRank spiders on a test collection of 100,000 Web pages. The automatically generated thesaurus performed as well as both MeSH and UMLS, systems which require human mediation for currency. Lastly, a variant of the Kohonen SOM was comparable to MeSH terms in perceived cluster precision and significantly better at perceived cluster recall.

Woo, J., & Chen, H. (2012). An event-driven SIR model for topic diffusion in web forums. ISI 2012 - 2012 IEEE International Conference on Intelligence and Security Informatics: Cyberspace, Border, and Immigration Securities, 108-113.

Abstract:

Social media is being increasingly used as a communication channel. Among social media, web forums, where people in online communities disseminate and receive information by interaction, provide a good environment to examine information diffusion. In this research, we aim to understand the mechanisms and properties of the information diffusion in the web forum. For that, we model topic-level information diffusion in web forums using the baseline epidemic model, the SIR (Susceptible, Infective, and Recovered) model, frequently used in previous research to analyze disease outbreaks and knowledge diffusion. In addition, we propose an event-driven SIR model that reflects the event effect on information diffusion in the web forum. The proposed model incorporates the effect of news postings on the web forum. We evaluate the two models using a large longitudinal dataset from the web forum of a major company. The event-SIR model outperforms the SIR model in fitting on major spiky topics that have peaks of author participation. © 2012 IEEE.
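
For readers unfamiliar with the baseline SIR model named in this abstract, the following is a minimal illustrative sketch of the classic epidemic equations, integrated with a simple forward-Euler step. It is not the authors' implementation: the function name `simulate_sir`, the parameter values, and the reading of the compartments as forum users (not yet posting on a topic, actively posting, no longer posting) are assumptions made for illustration, and the paper's event-driven extension is not reproduced because the abstract does not specify it.

```python
def simulate_sir(s0, i0, r0, beta, gamma, steps, dt=1.0):
    """Forward-Euler integration of the classic SIR equations.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I

    beta  : contact (infection) rate, hypothetical value chosen by the caller
    gamma : recovery rate, hypothetical value chosen by the caller
    """
    n = s0 + i0 + r0  # total population stays constant in the basic model
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Illustrative values only: 990 susceptible users, 10 active authors.
    trajectory = simulate_sir(990, 10, 0, beta=0.3, gamma=0.1, steps=100)
    peak_step = max(range(len(trajectory)), key=lambda t: trajectory[t][1])
    print("active authors peak at step", peak_step)
```

In this reading, a topic with a single burst of author participation corresponds to the rise and fall of the I compartment; fitting `beta` and `gamma` to observed participation counts would be a separate step that this sketch does not cover.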

Chen, H., Li, X., Chau, M., Ho, Y., & Tseng, C. (2009). Using open web APIs in teaching web mining. IEEE Transactions on Education, 52(4), 482-490.

Abstract:

With the advent of the World Wide Web, many business applications that utilize data mining and text mining techniques to extract useful business information on the Web have evolved from Web searching to Web mining. It is important for students to acquire knowledge and hands-on experience in Web mining during their education in information systems curricula. This paper reports on an experience using open Web Application Programming Interfaces (APIs) that have been made available by major Internet companies (e.g., Google, Amazon, and eBay) in a class project to teach Web mining applications. The instructor's observations of the students' performance and a survey of the students' opinions show that the class project achieved its objectives and students acquired valuable experience in leveraging the APIs to build interesting Web mining applications. © 2006 IEEE.