Podcast Segment Retrieval Spotify

Supervisor: Judith Bütepage

Collaborators: Fredrik Segerhammar & Mariya Lazarova (Statistical Method), Tianzong Wang (Deep Learning Method)

Locating the best-matching paragraph in a document for a given search query is a well-studied problem. For podcast data, however, the problem is newer and has received little research attention. We attempt to retrieve the best jump-in point for relevant segments of podcast episodes given arbitrary user search queries, using the dataset provided by the TREC 2020 Podcasts Track. We propose two methods targeting the track's first task, Ad-hoc Segment Retrieval: one based on traditional statistical methods using TF-IDF and Okapi BM25, and another based on deep learning embeddings from Sentence-Transformers. A detailed project report and presentation will be released later or are available upon request.
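
To illustrate the statistical approach, here is a minimal sketch of Okapi BM25 ranking over transcript segments. It is not the project's implementation: the whitespace tokenizer, the `k1` and `b` defaults, the always-positive IDF variant, the function names `bm25_scores` and `best_jump_in`, and the toy segments with their start times are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system would
    # likely add stemming and stop-word removal.
    return text.lower().split()


def bm25_scores(query, texts, k1=1.5, b=0.75):
    """Score each text against the query with Okapi BM25."""
    docs = [tokenize(t) for t in texts]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n

    # Document frequency: number of segments containing each term.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # IDF variant that stays positive even for very common terms.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def best_jump_in(query, segments):
    """Return (start_seconds, score) of the top-scoring segment.

    `segments` is a list of (start_seconds, transcript_text) pairs.
    """
    scores = bm25_scores(query, [text for _, text in segments])
    best = max(range(len(segments)), key=scores.__getitem__)
    return segments[best][0], scores[best]


# Hypothetical transcript segments with made-up start times.
segments = [
    (0.0, "welcome to the show today we talk about coffee"),
    (60.0, "the best way to brew coffee at home is a pour over"),
    (120.0, "next week we cover marathon training plans"),
]
start, score = best_jump_in("brew coffee at home", segments)
```

In this toy example the second segment matches every query term, so its start time is returned as the jump-in point. The embedding-based method would replace the BM25 score with a similarity between query and segment vectors while keeping the same ranking step.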
