Dr Di Cai
firstname.lastname@example.org | 01484 472340
After completing a first (Hons) degree in applied mathematics, Dr Di Cai was appointed lecturer/senior lecturer in the Department of Mathematics at Tianjin University of Science and Technology, P.R. China. She taught a wide range of subjects in applied mathematics, engineering mathematics and advanced engineering mathematics. She was a course leader for engineering mathematics and a member of the University/Department Teaching and Learning Committee. She was involved in the organization of national conferences and workshops on the teaching and learning of engineering mathematics.
Di was awarded her PhD in Information Retrieval (IR) in 2004, under the supervision of Professor Keith van Rijsbergen, leader of a world-leading IR research group in the Department of Computing Science (DCS) at the University of Glasgow (UG). Since then Di has worked as a research fellow on several projects in the areas of IR, Text Mining, Document Classification and Sentiment Analysis. She commenced work on an EPSRC-funded project, “XML technologies for the acceleration of cancer drug target discovery”, in DCS in 2004. Soon after, she applied successfully for funding from Microsoft Research, Cambridge, for her own research project, “A discrimination information model for automatic query reformulation”, in the same Department. In late 2006, she worked on a BBSRC-funded project, “A taxonomically intelligent phylogenetic database”, in the Institute of Biological & Life Sciences at UG. She moved to the School of Computing and IT at the University of Wolverhampton in 2009, working on a large EU-funded project, “Collective emotions in cyberspace”. Di joined the University of Huddersfield in 2011.
Di is currently a member of IEEE (The Institute of Electrical and Electronics Engineers), ACM (The Association for Computing Machinery) and BCS-IRSG (The BCS Information Retrieval Specialist Group).
Research and Scholarship
Dr Di Cai’s research interests centre on any type of information processing that can be represented mathematically, including formal modelling, quantitative method development, problem solving, algorithm design and large-scale data analysis.
Di’s current research focuses on fundamental issues relevant to many areas of science, including statistical semantic analysis of features (concepts, terms, phrases, words, etc.), representation of objects (documents, abstracts, sentences, queries, etc.) and detection of unreliable samples (obtained from web users). The work draws on a variety of theories (probability theory, information theory, theory of evidence and rough set theory) as well as machine learning methods. Some specific topics are:
- measurement of discrimination information of features
- measurement of semantic relatedness/association between features
- identification of informative terms and sentiment-bearing terms
- extraction of key terms and taxonomic names
- thesaurus simplification and normalization
- key term modelling and term classification
- representation and modelling of objects
- measurement of similarity between objects, between features, and between objects and features
- query formulation and reformulation (automatic/semi-automatic/interactive)
- algorithm development for system design and implementation
- data analysis and corpus processing
- detection of outlying ratings and identification of unreliable samples (when data is gathered from social websites)
Publications and Other Research Outputs
Cai, D. and Wade, S. (2012) ‘A Rule-Based Method for Outlying Rating Detection’ International Journal of Computer and Communication Engineering, 1 (4), pp. 466-471. ISSN 2010-3743
Cai, D. and McCluskey, T. (2012) ‘A Simple Method for Estimating Term Mutual Information’ Journal of Computing, 4 (6), pp. 1-6. ISSN 2151-9617
Kowalska, K., Cai, D. and Wade, S. (2012) ‘Sentiment Analysis of Polish Texts’ International Journal of Computer and Communication Engineering, 1 (1), pp. 39-42. ISSN 2010-3743
Thelwall, M., Buckley, K., Paltoglou, G., Cai, D. and Kappas, A. (2010) ‘Sentiment Strength Detection in Short Informal Text’ Journal of the American Society for Information Science and Technology, 61 (12), pp. 2544-2558. ISSN 0002-8231
Cai, D. (2010) ‘An Information-Theoretic Foundation for the Measurement of Discrimination Information’ IEEE Transactions on Knowledge and Data Engineering, 22 (9), pp. 1262-1273. ISSN 1041-4347
Cai, D. (2009) ‘Determining Semantic Relatedness through the Measurement of Discrimination Information Using Jensen Difference’ International Journal of Intelligent Systems, 24 (5), pp. 477-503. ISSN 0884-8173
Cai, D. and van Rijsbergen, C. (2009) ‘Learning semantic relatedness from term discrimination information’ Expert Systems With Applications, 36 (2), pp. 1860-1875. ISSN 0957-4174
Cai, D. and van Rijsbergen, C. (2008) ‘An Algorithm for Modelling Key Terms’ International Journal of Intelligent Systems, 23 (1), pp. 50-81. ISSN 0884-8173
Cai, D. and van Rijsbergen, C. (2005) ‘Semantic Relations and Information Discovery’. In: Intelligent Data Mining: Techniques and Applications. London, UK: Springer. pp. 79-102. ISBN 9783540262565
Cai, D. (2001) ‘Data mining based on evidence theory’. In: Soft Computing for Risk Evaluation Management: Applications in Technology, Environment & Finance. Springer. pp. 97-120. ISBN 9783790814064
Cai, D. (2001) ‘Extension and applications of evidence theory’. In: Soft Computing for Risk Evaluation Management: Applications in Technology, Environment & Finance. Springer. pp. 73-93. ISBN 9783790814064
Research Degree Supervision
Dr Di Cai’s research interests are in Data Mining and Artificial Intelligence in general, and in the following areas in particular:
- Information Retrieval and Extraction
- Text Mining and Analytics
- Document Classification and Summarization
- Sentiment Analysis and Opinion Mining
Please contact this member of staff to discuss possible opportunities.