Hsinchun Chen

Professor, Management Information Systems
Regents Professor
Member of the Graduate Faculty
Professor, BIO5 Institute
Primary Department
Contact
(520) 621-4153

Research Interest

Dr. Chen's areas of expertise include: Security informatics, security big data; smart and connected health, health analytics; data, text, web mining. Digital library, intelligent information retrieval, automatic categorization and classification, machine learning for IR, large-scale information analysis and visualization. Internet resource discovery, digital libraries, IR for large-scale scientific and business databases, customized IR, multilingual IR. Knowledge-based systems design, knowledge discovery in databases, hypertext systems, machine learning, neural networks computing, genetic algorithms, simulated annealing. Cognitive modeling, human-computer interactions, IR behaviors, human problem-solving process.

Publications

McDonald, D. M., & Chen, H. (2006). Summary in context: Searching versus browsing. ACM Transactions on Information Systems, 24(1), 111-141.

Abstract:

The use of text summaries in information-seeking research has focused on query-based summaries. Extracting content that resembles the query alone, however, ignores the greater context of the document. Such context may be central to the purpose and meaning of the document. We developed a generic, a query-based, and a hybrid summarizer, each with differing amounts of document context. The generic summarizer used a blend of discourse information and information obtained through traditional surface-level analysis. The query-based summarizer used only query-term information, and the hybrid summarizer used some discourse information along with query-term information. The validity of the generic summarizer was shown through an intrinsic evaluation using a well-established corpus of human-generated summaries. All three summarizers were then compared in an information-seeking experiment involving 297 subjects. Results from the information-seeking experiment showed that the generic summaries outperformed all others in the browse tasks, while the query-based and hybrid summaries outperformed the generic summary in the search tasks. Thus, the document context of generic summaries helped users browse, while such context was not helpful in search tasks. Such results are interesting given that generic summaries have not been studied in search tasks and that the majority of Internet search engines rely solely on query-based summaries. © 2006 ACM.

Zhang, Y., Dang, Y., & Chen, H. (2013). Research note: Examining gender emotional differences in Web forum communication. Decision Support Systems, 55(3), 851-860.

Abstract:

Web 2.0 has enabled and encouraged Internet users to share and discuss their opinions and ideas online. Thus, a large amount of opinion-rich content has been generated. With more and more women starting to participate in online communications, questions regarding gender emotional differences in Web 2.0 communication platforms have been raised. However, few studies have systematically examined such differences. Motivated to address this gap, we have developed an advanced and generic framework to automatically analyze gender emotional differences in social media. Algorithms are developed and embedded in the framework to conduct analyses at different granularity levels, including sentence level, phrase level, and word level. To demonstrate the proposed research framework, an empirical experiment is conducted on a large Web forum. The analysis results indicate that women are more likely to express their opinions subjectively than men (based on sentence-level analysis), and they are more likely to express both positive and negative emotions (based on phrase-level and word-level analyses). © 2013 Elsevier B.V.

Marshall, B., Kaza, S., Xu, J., Atabakhsh, H., Petersen, T., Violette, C., & Chen, H. (2004). Cross-jurisdictional Criminal Activity Networks to support border and transportation security. IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC, 100-105.

Abstract:

Border and transportation security is a critical part of the Department of Homeland Security's (DHS) national strategy. DHS strategy calls for the creation of "smart borders" where information from local, state, federal, and international sources can be combined to support risk-based management tools for border-management agencies. This paper proposes a framework for effectively integrating such data to create cross-jurisdictional Criminal Activity Networks (CANs). Using the approach outlined in the framework, we created a CAN system as part of the DHS-funded BorderSafe project. This paper describes the system, reports on feedback received from investigating officers, and highlights key issues and challenges.

Chen, H. (2009). AI and global science and technology assessment. IEEE Intelligent Systems, 24(4), 68-71.

Abstract:

Five essays on global science and technology (S&T) assessment from distinguished experts in knowledge mapping, scientometrics, information visualization, digital libraries, and multilingual knowledge management are discussed. The first essay, 'China S&T Assessment', proposes three fundamental S&T assessment metrics and shows the Chinese emphasis on the physical and engineering sciences and its significant research productivity gains. Another essay, 'Open Data and Open Code for S&T Assessment', introduces science maps to help humans mentally organize, access, and manage complex digital library collections. The essay 'Global S&T Assessment by Analysis of Large ETD Collections' introduces the highly successful Networked Digital Library of Theses and Dissertations (NDLTD) project. The final essay, 'Managing Multilingual S&T Knowledge', describes a research framework for cross-lingual and polylingual text categorization and category integration.

Chen, H., & Zimbra, D. (2010). AI and opinion mining. IEEE Intelligent Systems, 25(3), 74-76.

Abstract:

Opinion mining, a subdiscipline within data mining and computational linguistics, refers to the computational techniques for extracting, classifying, understanding, and assessing the opinions expressed in various online news sources, social media comments, and other user-generated content. Frameworks and methods for integrating the sentiments and opinions expressed with other computational representations, such as interesting topics or product features extracted from user-generated text, participant reply networks, and spikes and outbreaks of ideas or events, are also critically needed. Disagreement and subjectivity also held significant relationships with volatility, where less disagreement and high levels of subjectivity predicted periods of high stock volatility. Positive sentiment reduces trading volume, perhaps because satisfied shareholders hold their stock, while negative sentiment induces trading activity as shareholders defect.