Hsinchun Chen

Professor, Management Information Systems
Regents Professor
Member of the Graduate Faculty
Professor, BIO5 Institute
Primary Department
Contact
(520) 621-4153

Research Interest

Dr. Chen's areas of expertise include:

Security informatics, security big data; smart and connected health, health analytics; data, text, web mining.
Digital library, intelligent information retrieval, automatic categorization and classification, machine learning for IR, large-scale information analysis and visualization.
Internet resource discovery, digital libraries, IR for large-scale scientific and business databases, customized IR, multilingual IR.
Knowledge-based systems design, knowledge discovery in databases, hypertext systems, machine learning, neural networks computing, genetic algorithms, simulated annealing.
Cognitive modeling, human-computer interactions, IR behaviors, human problem-solving process.

Publications

Chen, H., Schuffels, C., & Orwig, R. (1996). Internet Categorization and Search: A Self-Organizing Approach. Journal of Visual Communication and Image Representation, 7(1), 88-102.

Abstract:

The problems of information overload and vocabulary differences have become more pressing with the emergence of increasingly popular Internet services. The main information retrieval mechanisms provided by the prevailing Internet WWW software are based on either keyword search (e.g., the Lycos server at CMU, the Yahoo server at Stanford) or hypertext browsing (e.g., Mosaic and Netscape). This research aims to provide an alternative concept-based categorization and search capability for WWW servers based on selected machine learning algorithms. Our proposed approach, which is grounded on automatic textual analysis of Internet documents (homepages), attempts to address the Internet search problem by first categorizing the content of Internet documents. We report results of our recent testing of a multilayered neural network clustering algorithm employing the Kohonen self-organizing feature map to categorize (classify) Internet homepages according to their content. The category hierarchies created could serve to partition the vast Internet services into subject-specific categories and databases and improve Internet keyword searching and/or browsing. © 1996 Academic Press, Inc.

Chen, H. (2008). Nuclear threat detection via the nuclear web and dark web: Framework and preliminary study. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5376 LNCS, 85-96.

Abstract:

We believe the science of Intelligence and Security Informatics (ISI) can help with nuclear forensics and attribution. ISI research can help advance the intelligence collection, analytical techniques and instrumentation used in determining the origin, capability, intent, and transit route of nuclear materials by selected hostile countries and (terrorist) groups. We propose a research framework that aims to investigate the Capability, Accessibility, and Intent of critical high-risk countries, institutions, researchers, and extremist or terrorist groups. We propose to develop a knowledge base of the Nuclear Web that will collect, analyze, and pinpoint significant actors in the high-risk international nuclear physics and weapon community. We also identify potential extremist or terrorist groups from our Dark Web testbed who might pose WMD threats to the US and the international community. Selected knowledge mapping and focused web crawling techniques and findings from a preliminary study are presented in this paper. © 2008 Springer Berlin Heidelberg.

Reid, E., & Chen, H. (2005). Mapping the contemporary terrorism research domain: Researchers, publications, and institutions analysis. Lecture Notes in Computer Science, 3495, 322-339.

Abstract:

The ability to map the contemporary terrorism research domain involves mining, analyzing, charting, and visualizing a research area according to experts, institutions, topics, publications, and social networks. As the increasing flood of new, diverse, and disorganized digital terrorism studies continues, the application of domain visualization techniques is increasingly critical for understanding the growth of scientific research, tracking the dynamics of the field, discovering potential new areas of research, and creating a big picture of the field's intellectual structure as well as challenges. In this paper, we present an overview of contemporary terrorism research by applying domain visualization techniques to the literature and author citation data from the years 1965 to 2003. The data were gathered from ten databases such as the ISI Web of Science and then analyzed using an integrated knowledge mapping framework that includes selected techniques such as self-organizing map (SOM), content map analysis, and co-citation analysis. The analysis revealed (1) 42 key terrorism researchers and their institutional affiliations; (2) their influential publications; (3) a shift from focusing on terrorism as a low-intensity conflict to an emphasis on it as a strategic threat to world powers with increased focus on Osama Bin Laden; and (4) clusters of terrorism researchers who work in similar research areas as identified by co-citation and block-modeling maps. © Springer-Verlag Berlin Heidelberg 2005.

Chen, H., Atabakhsh, H., Tseng, C., Marshall, B., Kaza, S., Eggers, S., Gowda, H., Shah, A., Petersen, T., & Violette, C. (2005). Visualization in law enforcement. Conference on Human Factors in Computing Systems - Proceedings, 1268-1271.

Abstract:

Visualization techniques have proven to be critical in helping crime analysis. By interviewing and observing Criminal Intelligence Officers (CIO) and civilian crime analysts at the Tucson Police Department (TPD), we found that two types of tasks are important for crime analysis: crime pattern recognition and criminal association discovery. We developed two separate systems that provide automatic visual assistance on these tasks. To help identify crime patterns, a Spatial Temporal Visualization (STV) system was designed to integrate a synchronized view of three types of visualization techniques: a GIS view, a timeline view and a periodic pattern view. The Criminal Activities Network (CAN) system extracts, visualizes and analyzes criminal relationships using spring-embedded and blockmodeling algorithms. This paper discusses the design and functionality of these two systems and the lessons learned from the development process and interaction with law enforcement officers.

Chau, M., Shiu, B., Chan, I., & Chen, H. (2007). Redips: Backlink search and analysis on the web for business intelligence analysis. Journal of the American Society for Information Science and Technology, 58(3), 351-365.

Abstract:

The World Wide Web presents significant opportunities for business intelligence analysis as it can provide information about a company's external environment and its stakeholders. Traditional business intelligence analysis on the Web has focused on simple keyword searching. Recently, it has been suggested that the incoming links, or backlinks, of a company's Web site (i.e., other Web pages that have a hyperlink pointing to the company of interest) can provide important insights about the company's "online communities." Although analysis of these communities can provide useful signals for a company and information about its stakeholder groups, the manual analysis process can be very time-consuming for business analysts and consultants. In this article, we present a tool called Redips that automatically integrates backlink meta-searching and text-mining techniques to facilitate users in performing such business intelligence analysis on the Web. The architectural design and implementation of the tool are presented in the article. To evaluate the effectiveness, efficiency, and user satisfaction of Redips, an experiment was conducted to compare the tool with two popular business intelligence analysis methods - using backlink search engines and manual browsing. The experiment results showed that Redips was statistically more effective than both benchmark methods (in terms of Recall and F-measure) but required more time in search tasks. In terms of user satisfaction, Redips scored statistically higher than backlink search engines in all five measures used, and also statistically higher than manual browsing in three measures. © 2006 Wiley Periodicals, Inc.