Hsinchun Chen

Professor, Management Information Systems
Regents Professor
Member of the Graduate Faculty
Professor, BIO5 Institute
Contact
(520) 621-4153

Research Interests

Dr. Chen's areas of expertise include:
Security informatics, security big data; smart and connected health, health analytics; data, text, and web mining.
Digital libraries, intelligent information retrieval, automatic categorization and classification, machine learning for IR, large-scale information analysis and visualization.
Internet resource discovery, digital libraries, IR for large-scale scientific and business databases, customized IR, multilingual IR.
Knowledge-based systems design, knowledge discovery in databases, hypertext systems, machine learning, neural network computing, genetic algorithms, simulated annealing.
Cognitive modeling, human-computer interaction, IR behaviors, human problem-solving processes.

Publications

Yang, C. C., Ng, T. D., Wang, J., Wei, C., & Chen, H. (2007). Analyzing and visualizing gray Web forum structure. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 4430 LNCS, 21-33.

Abstract:

The Web is a platform where users search for information to fulfill their information needs, but it is also an ideal platform for expressing personal opinions and comments. A virtual community is formed when a number of members participate in this kind of communication. Nowadays, teenagers spend an extensive amount of time communicating with strangers in these virtual communities. At the same time, criminals and terrorists are taking advantage of these virtual communities to recruit members and identify victims. Many Web forum users may not be aware that their participation in these virtual communities has violated the laws of their countries, for example by downloading pirated software or multimedia content. Police officers cannot combat this kind of criminal activity using traditional approaches. We must rely on computing technologies to analyze and visualize the activities within these virtual communities to identify suspects and extract active groups. In this work, we introduce social network analysis and information visualization techniques for the Gray Web Forum, a forum that may threaten public safety. © Springer-Verlag Berlin Heidelberg 2007.

Ku, Y., Chiu, C., Zhang, Y., Chen, H., & Su, H. (2014). Text Mining Self-Disclosing Health Information for Public Health Service. Journal of the Association for Information Science and Technology, 65(5), 928-947.

Abstract:

Understanding specific patterns or knowledge of self-disclosing health information could support public health surveillance and healthcare. This study aimed to develop an analytical framework to identify self-disclosing health information with unusual messages on web forums by leveraging advanced text-mining techniques. To demonstrate the performance of the proposed analytical framework, we conducted an experimental study on 2 major human immunodeficiency virus (HIV)/acquired immune deficiency syndrome (AIDS) forums in Taiwan. The experimental results show that the classification accuracy increased significantly (up to 83.83%) when using features selected by the information gain technique. The results also show the importance of adopting domain-specific features in analyzing unusual messages on web forums. This study has practical implications for the prevention and support of HIV/AIDS healthcare. For example, public health agencies can re-allocate resources and deliver services to people who need help via social media sites. In addition, individuals can also join a social media site to get better suggestions and support from each other.
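
The abstract above credits feature selection by information gain for the accuracy improvement. As a rough illustration only, the sketch below computes the information gain of binary term-presence features on a toy data set. The data, the feature names (`term_a`, `term_b`), and the class labels are hypothetical and are not taken from the study; the study's actual features and pipeline are not described here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: 1 = term present in a message, 0 = absent.
labels = ["unusual", "unusual", "normal", "normal"]
features = {
    "term_a": [1, 1, 0, 0],  # separates the two classes perfectly
    "term_b": [1, 0, 1, 0],  # carries no information about the class
}

# Rank features from most to least informative.
ranked = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)
```

In a selection step of this kind, only the top-ranked features would be passed on to the classifier; how many to keep is a tuning choice.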

Wang, G., Chen, H., & Atabakhsh, H. (2004). Automatically detecting deceptive criminal identities. Communications of the ACM, 47(3), 70-76.

Abstract:

This article discusses patterns of criminal identity deception uncovered from actual criminal records and an algorithmic approach to revealing deceptive identities. The test results show no false positive errors, in which unrelated suspects are recognized as being related, which demonstrates the effectiveness of the algorithm. The errors occur in the false negative category. The threshold value is set to capture the maximum possible number of truly similar records. Future research will require an adaptive threshold to make the process automated.

Romano Jr., N. C., Roussinov, D., Nunamaker Jr., J. F., & Chen, H. (1999). Collaborative information retrieval environment: Integration of information retrieval with group support systems. Proceedings of the Hawaii International Conference on System Sciences, 33-.

Abstract:

This paper describes how user experiences with information retrieval (IR) systems and group support systems (GSS) have shed light on a promising new era of collaborative research and led to the development of a prototype that merges the two paradigms into a collaborative information retrieval environment (CIRE). It also discusses the theory developed from initial user experiences with the prototype and the plans to empirically test the efficacy of this new paradigm through controlled experimentation.

Hu, D., Chen, H., Huang, Z., & Roco, M. C. (2007). Longitudinal study on patent citations to academic research articles in nanotechnology (1976-2004). Journal of Nanoparticle Research, 9(4), 529-542.

Abstract:

Academic nanoscale science and engineering (NSE) research provides a foundation for nanotechnology innovation reflected in patents. About 60% or about 50,000 of the NSE-related patents identified by "full-text" keyword searching between 1976 and 2004 at the United States Patent and Trademark Office (USPTO) have an average of approximately 18 academic citations. The most cited academic journals, individual researchers, and research articles have been evaluated as sources of technology innovation in the NSE area over the 28-year period. Each of the most influential articles was cited about 90 times on average, while the most influential author was cited more than 700 times by the NSE-related patents. Thirteen mainstream journals accounted for about 20% of all citations. Science, Nature and Proceedings of the National Academy of Sciences (PNAS) have consistently been the top three most cited journals, with each article being cited three times on average. There is another kind of influential journal, represented by Biosystems and Origin of Life, which has very few articles cited but with exceptionally high frequencies. The number of academic citations per year from the ten most cited journals has increased by over 17 times in the interval (1990-1999) as compared to (1976-1989), and again by over 3 times in the interval (2000-2004) as compared to (1990-1999). This is an indication of increased use of academic knowledge creation in the NSE-related patents. © 2007 Springer Science+Business Media, Inc.