Explainable Text Classification: Producing performant solutions using hybrid lexicon-based approaches


Automated text classification is an important area of natural language processing, with applications including sentiment analysis and emotion or stance detection. It can allow users to analyse massive amounts of data that would be too time-consuming to process manually. Current machine-learning approaches to automated text classification tend to rely on complex black-box algorithms that deliver high-accuracy predictions, but at the cost of being unable to explain the rationale behind their decisions. One solution is to use lexicon-based classifiers, which are logic-based and therefore explainable, but this normally comes at the cost of classification accuracy.

In this project, we want to design an experiment to compare selected classes of text classifiers from different families. This will involve identifying different types of classifiers to implement and compare. The core of the project will involve implementing one or more hybrid classifiers that reconcile lexicon-based and standard supervised approaches, such as in [Clos, Wiratunga and Massie, 2017]. The hybrid classifier could also be extended to handle multiple modifier terms, or applied to categorise web search results in the manner of [Chen and Dumais, 2000].
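To make the idea of an explainable lexicon-based classifier with modifier terms concrete, the following is a minimal sketch only. It is not the method of [Clos, Wiratunga and Massie, 2017]: the `LEXICON` and `MODIFIERS` tables, their values and the `classify` function are hypothetical and hand-written for illustration. In a hybrid approach, such weights would presumably be learned from labelled data rather than fixed by hand.

```python
import re

# Hypothetical, hand-written tables for illustration only.
# A hybrid classifier would learn these weights from labelled data.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
MODIFIERS = {"not": -1.0, "very": 1.5}


def classify(text):
    """Return (label, explanation); explanation lists each scored term."""
    explanation = []
    weight = 1.0  # product of the modifiers directly preceding the next term
    for token in re.findall(r"[a-z']+", text.lower()):
        if token in MODIFIERS:
            # Chain consecutive modifiers, e.g. "not very".
            weight *= MODIFIERS[token]
        elif token in LEXICON:
            explanation.append((token, LEXICON[token] * weight))
            weight = 1.0
        else:
            # Simplifying assumption: a modifier only affects a lexicon
            # term that immediately follows it.
            weight = 1.0
    total = sum(score for _, score in explanation)
    if total > 0:
        label = "positive"
    elif total < 0:
        label = "negative"
    else:
        label = "neutral"
    return label, explanation


print(classify("not very good"))        # ('negative', [('good', -1.5)])
print(classify("The film was great."))  # ('positive', [('great', 2.0)])
```

The returned list of (term, score) pairs is what makes the decision inspectable: each prediction can be traced back to the lexicon entries and modifiers that produced it.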


[Clos, Wiratunga and Massie, 2017] Jeremie Clos, Nirmalie Wiratunga and Stewart Massie. Towards Explainable Text Classification by Jointly Learning Lexicon and Modifier Terms. Proceedings of the IJCAI-17 Workshop on Explainable AI, 2017.

[Chen and Dumais, 2000] Hao Chen and Susan Dumais. Bringing Order to the Web: Automatically Categorising Search Results. Pages 145-152, 2000.



  • preferably has completed the Document Analysis course
  • self-motivated and able to work independently with provided guidance


natural language processing, machine learning, logic, artificial intelligence

Updated:  10 August 2021/Responsible Officer:  Dean, CECS/Page Contact:  CECS Marketing