Ben Fields, Lead Data Scientist, BBC

In this talk I’ll discuss the BBC’s use of content-based text similarity in both external applications, such as ‘more like this’ recommenders, and internal systems, such as our commissioning support tool. As a key part of this I’ll talk through a practical approach to evaluating similarity spaces of news articles, guided by human perception. I’ll put this approach in context with a brief background on human similarity measurement and perception, alongside a discussion of computational methods for measuring similarity between news articles. I’ll bring everything together by showing how these techniques are used in practice, in both internal and external applications. By the end of this talk you should have a good understanding of how to tune computed text similarity to better match subjective opinions, and of some applications where these techniques are effectively applied.