Improving search engine efficiency with emerging memory and storage

People

Researcher

Description

Search engines are ubiquitous. Although information retrieval via web search engines is mature, a new and more critical venue for text search is emerging. In particular, social media platforms such as Facebook and Twitter are driving demand for real-time search. In this project, I am actively looking for an honors or master's student to investigate the use of emerging memory and storage devices, such as Intel Optane memory and NVMe-based solid state drives, to improve the performance and efficiency of full-text search.

 

Goals

The student needs to have excellent system-building skills and should be comfortable and proficient with Linux. We will tease apart and significantly modify an industrial-strength search library, namely Lucene. Lucene is written in Java, so proficiency in Java is essential. The main task for this project is to replace the backend of Lucene (a filesystem-backed database) with an in-memory key-value store. If all this seems exciting to you, and you are up for a challenging but highly rewarding project, please contact me at Shoaib.Akram@anu.edu.au.
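To give a rough feel for the main task, the sketch below keeps an inverted index in an in-memory key-value store, where each key is a term and each value is a list of document IDs. It is only an illustration under simplifying assumptions: the class name `InMemoryInvertedIndex` and its methods are hypothetical, it uses a plain `HashMap` rather than a real key-value store, and it does not use Lucene's API or reflect Lucene's actual index format.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; this is not Lucene's API or its index format.
class InMemoryInvertedIndex {
    // Key-value store: term -> posting list of document IDs, in insertion order.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // Split the text on non-alphanumeric characters and record docId under each term.
    // Assumes each document is added in a single call.
    void addDocument(int docId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            List<Integer> list = postings.computeIfAbsent(token, k -> new ArrayList<>());
            // Skip the append when this term was already seen in the same document.
            if (list.isEmpty() || list.get(list.size() - 1) != docId) {
                list.add(docId);
            }
        }
    }

    // Return the IDs of documents containing the term (case-insensitive lookup).
    List<Integer> search(String term) {
        List<Integer> list = postings.get(term.toLowerCase(Locale.ROOT));
        return list == null
                ? Collections.<Integer>emptyList()
                : Collections.unmodifiableList(list);
    }
}
```

A lookup here is a single hash-table probe with no filesystem access, which is the property the project aims to exploit. A real replacement would also have to deal with persistence, concurrency, and the limited capacity of memory, all of which this sketch ignores.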

Requirements

Interest and some background in Linux, Java, and algorithms

Background Literature

https://lucene.apache.org

Earlybird: Real-Time Search at Twitter

Gain

Good system building and programming experience

High impact 

Keywords

Search engines

Memory management

Storage devices

Intel Optane Memory

NVMe Solid State Drives

Inverted Index

Hash Tables

Key Value Storage

Updated: 10 August 2021 / Responsible Officer: Dean, CECS / Page Contact: CECS Marketing