Hsinchun Chen

Professor, Management Information Systems
Regents Professor
Member of the Graduate Faculty
Professor, BIO5 Institute
Primary Department
Contact
(520) 621-4153

Research Interest

Dr. Chen's areas of expertise include: Security informatics, security big data; smart and connected health, health analytics; data, text, web mining. Digital library, intelligent information retrieval, automatic categorization and classification, machine learning for IR, large-scale information analysis and visualization. Internet resource discovery, digital libraries, IR for large-scale scientific and business databases, customized IR, multilingual IR. Knowledge-based systems design, knowledge discovery in databases, hypertext systems, machine learning, neural networks computing, genetic algorithms, simulated annealing. Cognitive modeling, human-computer interactions, IR behaviors, human problem-solving process.

Publications

Huang, C., Fu, T., & Chen, H. (2010). Text-based video content classification for online video-sharing sites. Journal of the American Society for Information Science and Technology, 61(5), 891-906.

Abstract:

With the emergence of Web 2.0, sharing personal content, communicating ideas, and interacting with other online users in Web 2.0 communities have become daily routines for online users. User-generated data from Web 2.0 sites provide rich personal information (e.g., personal preferences and interests) and can be utilized to obtain insight about cyber communities and their social networks. Many studies have focused on leveraging user-generated information to analyze blogs and forums, but few studies have applied this approach to video-sharing Web sites. In this study, we propose a text-based framework for video content classification of online video-sharing Web sites. Different types of user-generated data (e.g., titles, descriptions, and comments) were used as proxies for online videos, and three types of text features (lexical, syntactic, and content-specific features) were extracted. Three feature-based classification techniques (C4.5, Naïve Bayes, and Support Vector Machine) were used to classify videos. To evaluate the proposed framework, user-generated data from candidate videos, which were identified by searching user-given keywords on YouTube, were first collected. Then, a subset of the collected data was randomly selected and manually tagged by users as our experiment data. The experimental results showed that the proposed approach was able to classify online videos based on users' interests with accuracy rates up to 87.2%, and all three types of text features contributed to discriminating videos. Support Vector Machine outperformed C4.5 and Naïve Bayes techniques in our experiments. In addition, our case study further demonstrated that accurate video-classification results are very useful for identifying implicit cyber communities on video-sharing Web sites. © 2010 ASIS&T.

Li, X., Chen, H., Huang, Z., & Roco, M. C. (2007). Patent citation network in nanotechnology (1976-2004). Journal of Nanoparticle Research, 9(3), 337-352.

Abstract:

The patent citation networks are described using critical node, core network, and network topological analysis. The main objective is understanding of the knowledge transfer processes between technical fields, institutions and countries. This includes identifying key influential players and subfields, the knowledge transfer patterns among them, and the overall knowledge transfer efficiency. The proposed framework is applied to the field of nanoscale science and engineering (NSE), including the citation networks of patent documents, submitting institutions, technology fields, and countries. The NSE patents were identified by keywords "full-text" searching of patents at the United States Patent and Trademark Office (USPTO). The analysis shows that the United States is the most important citation center in NSE research. The institution citation network illustrates a more efficient knowledge transfer between institutions than a random network. The country citation network displays a knowledge transfer capability as efficient as a random network. The technology field citation network and the patent document citation network exhibit a less efficient knowledge diffusion capability than a random network. All four citation networks show a tendency to form local citation clusters. © 2007 Springer Science+Business Media, Inc.

Li, X., Chen, H., Li, J., & Zhang, Z. (2010). Gene function prediction with gene interaction networks: A context graph kernel approach. IEEE Transactions on Information Technology in Biomedicine, 14(1), 119-128.

PMID: 19789115

Abstract:

Predicting gene functions is a challenge for biologists in the postgenomic era. Interactions among genes and their products compose networks that can be used to infer gene functions. Most previous studies adopt a linkage assumption, i.e., they assume that gene interactions indicate functional similarities between connected genes. In this study, we propose to use a gene's context graph, i.e., the gene interaction network associated with the focal gene, to infer its functions. In a kernel-based machine-learning framework, we design a context graph kernel to capture the information in context graphs. Our experimental study on a testbed of p53-related genes demonstrates the advantage of using indirect gene interactions and shows the empirical superiority of the proposed approach over linkage-assumption-based methods, such as the algorithm to minimize inconsistent connected genes and diffusion kernels. © 2009 IEEE.

Chang, W., Chung, W., Chen, H., & Chou, S. (2003). An international perspective on fighting cybercrime. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2665, 379-384.

Abstract:

Cybercrime is becoming ever more serious. Findings from the 2002 Computer Crime and Security Survey show an upward trend that demonstrates a need for a timely review of existing approaches to fighting this new phenomenon in the information age. In this paper, we provide an overview of cybercrime and present an international perspective on fighting cybercrime. We review the current status of fighting cybercrime in different countries, which rely on legal, organizational, and technological approaches, and recommend four directions for governments, lawmakers, intelligence and law enforcement agencies, and researchers to combat cybercrime. © Springer-Verlag Berlin Heidelberg 2003.

Zhu, B., & Chen, H. (2000). Validating a geographical image retrieval system. Journal of the American Society for Information Science and Technology, 51(7), 625-634.

Abstract:

This paper summarizes a prototype geographical image retrieval system that demonstrates how to integrate image processing and information analysis techniques to support large-scale content-based image retrieval. By using an image as its interface, the prototype system addresses a troublesome aspect of traditional retrieval models, which require users to have complete knowledge of the low-level features of an image. In addition, we describe an experiment to validate the performance of this image retrieval system against that of human subjects in an effort to address the scarcity of research evaluating the performance of an algorithm against that of human beings. The results of the experiment indicate that the system could do as well as human subjects in accomplishing the tasks of similarity analysis and image categorization. We also found that under some circumstances texture features of an image are insufficient to represent a geographic image. We believe, however, that our image retrieval system provides a promising approach to integrating image processing techniques and information retrieval algorithms. © 2000 John Wiley & Sons, Inc.