Metadata-Based Contextual Summarizer for Technical Conversations in Public Forums

In recent years, the task of sequence-to-sequence neural abstractive summarization has gained considerable attention. Many novel strategies have been used to improve the saliency, human readability, and consistency of these models, resulting in high-quality summaries. However, because the majority of these pretrained models were trained on news datasets, they contain an inherent bias. One such bias is that most of the generated summary content originates from the start or end of the source text, much as a news story would be summarized. Another issue we encountered when applying these summarizers to our technical discussion forum use case was token recurrence, which lowered ROUGE-precision scores. To overcome these issues, we present a unique approach that includes: a) an additional term in the loss function based on the ROUGE-precision score, optimized alongside the categorical cross-entropy loss; b) an adaptive loss based on the token repetition rate, optimized together with the final loss, so that the model produces contextual summaries with less token repetition and learns effectively with minimal training samples; and c) extra metadata indicator tokens that contextualize the summarizer for technical forum discussion platforms by helping the model learn latent features and dependencies in text segments with relevant metadata information. To guard against overfitting caused by data scarcity, we test and verify all models on a hold-out dataset that was not part of the training or validation data. This paper discusses the strategies we used and compares the performance of the fine-tuned models against baseline summarizers on the test dataset. By training our models end to end with these losses, we obtain substantially better ROUGE scores while producing the most legible and relevant summaries on the technical forum dataset.
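
The abstract does not give the exact formulation of the auxiliary losses, so the sketch below is only an illustration of how such a composite objective might be assembled: a categorical cross-entropy term combined with a penalty derived from the ROUGE-1 precision of the greedily decoded summary and a penalty derived from its token repetition rate. The weights alpha and beta, the greedy-decode step, and the choice of ROUGE-1 (rather than another ROUGE variant) are all assumptions made for this example, not details taken from the paper.

    from collections import Counter

    import torch
    import torch.nn.functional as F


    def rouge1_precision(pred_ids, ref_ids):
        # Unigram precision: overlapping predicted tokens / total predicted tokens.
        pred_counts, ref_counts = Counter(pred_ids), Counter(ref_ids)
        overlap = sum(min(c, ref_counts[tok]) for tok, c in pred_counts.items())
        return overlap / max(sum(pred_counts.values()), 1)


    def repetition_rate(pred_ids):
        # Fraction of predicted tokens that repeat an earlier token in the sequence.
        seen, repeats = set(), 0
        for tok in pred_ids:
            repeats += tok in seen
            seen.add(tok)
        return repeats / max(len(pred_ids), 1)


    def combined_loss(logits, target_ids, alpha=0.5, beta=0.5):
        # logits: (seq_len, vocab_size) decoder outputs; target_ids: (seq_len,) reference tokens.
        # The sequence-level penalties are computed on the greedy decode and added as scalars;
        # a real implementation would need a differentiable surrogate (e.g. self-critical or
        # policy-gradient training) for these terms. alpha and beta are hypothetical weights.
        ce = F.cross_entropy(logits, target_ids)
        pred_ids = logits.argmax(dim=-1).tolist()
        ref_ids = target_ids.tolist()
        rouge_penalty = 1.0 - rouge1_precision(pred_ids, ref_ids)  # low precision -> high penalty
        rep_penalty = repetition_rate(pred_ids)                    # frequent repeats -> high penalty
        return ce + alpha * rouge_penalty + beta * rep_penalty

Under the same reading, the metadata indicator tokens described in (c) would amount to prepending segment-level markers (for example, tokens marking a question, an accepted answer, or a code block) to the input before tokenization, although the specific tokens used are not stated in the abstract.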

Speakers: