
Applying Topic Segmentation to Document-Level Information Retrieval

October 13, 12:30
Room III


In this paper we discuss how text segmentation can be applied to information retrieval. We hypothesize that topic segmentation captures text structure, and therefore language itself, more accurately, which in turn improves the quality of text representations. We test this hypothesis by running several baseline models on the arXiv dataset and comparing their retrieval quality on whole texts and on segmented texts. The experiments show that segmentation generally yields a slight improvement in retrieval quality.
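
To make the whole-text versus segmented-text comparison concrete, here is a minimal sketch. It is not the authors' pipeline: the abstract does not name its baseline models or its segmenter, so the sketch assumes a plain TF-IDF cosine ranker and treats blank-line paragraph breaks as a stand-in for topic segments. The names `rank`, `tfidf_vectors`, and `cosine` are hypothetical. In segmented mode a document is scored by its best-matching segment.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real system would normalize further.
    return text.lower().split()


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict) per text, plus the IDF table."""
    token_lists = [tokenize(t) for t in texts]
    doc_freq = Counter(tok for toks in token_lists for tok in set(toks))
    n = len(texts)
    # Smoothed IDF so that no term gets a zero or negative weight.
    idf = {tok: math.log((1 + n) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}
    vectors = [
        {tok: count * idf[tok] for tok, count in Counter(toks).items()}
        for toks in token_lists
    ]
    return vectors, idf


def cosine(a, b):
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs, segmented):
    """Return document indices ordered from best to worst match.

    segmented=False scores each document as a single unit.
    segmented=True splits each document on blank lines (an assumed proxy
    for topic segments) and keeps the best segment score per document.
    """
    units = []  # (document index, unit text)
    for i, doc in enumerate(docs):
        if segmented:
            parts = [p for p in doc.split("\n\n") if p.strip()]
        else:
            parts = [doc]
        units.extend((i, part) for part in parts)

    vectors, idf = tfidf_vectors([text for _, text in units])
    query_vec = {
        tok: count * idf.get(tok, 0.0)
        for tok, count in Counter(tokenize(query)).items()
    }

    scores = [0.0] * len(docs)
    for (i, _), vec in zip(units, vectors):
        scores[i] = max(scores[i], cosine(query_vec, vec))
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Calling `rank(query, docs, segmented=False)` and `rank(query, docs, segmented=True)` on the same collection and comparing the two rankings against relevance judgments mirrors the kind of comparison the abstract describes. Taking the maximum over segments is only one possible aggregation; averaging or summing segment scores would be equally reasonable choices to try.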

Polina Kazakova

Data Scientist, Integrated Systems

Nikita Nikitinsky

CTO, Integrated Systems

Gennady Shtekh

Lead Data Scientist, Integrated Systems
