Summary
Objectives: The Web is a vast source of information, including on medical and health-related
issues. In particular, the content of medical social media can be diverse owing to
the author's background, the source, or the topic. Diversity in this context
means that a document covers different aspects of a topic or that a topic is described
in different ways. In this paper, we introduce an approach that takes the diverse
aspects of a search query into account when presenting retrieval results to a user.
Methods: We introduce a system architecture for a diversity-aware search engine for
retrieving medical information from the Web. The diversity of retrieval results is
assessed by calculating diversity measures that rely upon semantic information derived
from a mapping to concepts of a medical terminology. Considering these measures, the
result set is diversified by ranking more diverse texts higher.
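As an illustration only, the ranking step described above can be sketched as follows. The sketch assumes that each retrieved text has already been mapped to a set of concept identifiers from a medical terminology, and it uses the number of distinct concepts as a stand-in diversity measure. The function names, the concept identifiers, and the measure itself are hypothetical and are not taken from the system described here.

```python
from typing import Dict, List, Set


def concept_diversity(concepts: Set[str]) -> int:
    """Hypothetical measure: count of distinct terminology concepts in a text."""
    return len(concepts)


def diversify(results: Dict[str, Set[str]]) -> List[str]:
    """Return document IDs ordered so that more diverse texts rank higher.

    Ties are broken by document ID to keep the ordering deterministic.
    """
    return sorted(
        results,
        key=lambda doc_id: (-concept_diversity(results[doc_id]), doc_id),
    )


if __name__ == "__main__":
    # Placeholder concept IDs; a real system would obtain these from the
    # terminology mapping step.
    mapped = {
        "post_a": {"C001", "C002"},
        "post_b": {"C001", "C002", "C003", "C004"},
        "post_c": {"C002"},
    }
    print(diversify(mapped))
```

In practice a diversity measure would likely weigh more than a raw concept count, for example the type of information content, so the scoring function is kept separate from the ordering logic and can be replaced.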
Results: The methods and system architecture are implemented in a retrieval engine for medical
web content. The diversity measures reflect the diversity of aspects considered in
a text and its type of information content. They are used for result presentation,
filtering and ranking. In a user evaluation, we assess user satisfaction with an
ordering of retrieval results that takes the diversity measures into account.
Conclusions: The evaluation shows that diversity-aware retrieval that incorporates diversity
measures into ranking could increase user satisfaction with retrieval results.
Keywords
Information retrieval - diversity - medical social media - Web search