Achieved 80% accuracy in NLP classification

Who are you?

GlaxoSmithKline (GSK) is a global healthcare company based in London, UK, and one of the largest pharmaceutical companies in the world. Developing new products involves rigorous testing through drug trials. GSK tasked Peak with improving its methods for classifying the open-text field data received as part of those trials.

What was the challenge?

When GSK runs drug trials, it works with Medical Science Liaisons (MSLs) to gather information from healthcare professionals. The MSLs interview these professionals, enter their responses as raw, unstructured text, and send the data to GSK. Problems arise when GSK has to label and classify this information manually: the process is incredibly time-consuming, and different GSK employees may classify the same response in different ways.

What did Peak do?

Peak has developed a unique Natural Language Processing (NLP) classification model for GSK, powered by the Peak AI System. This model is used to label a controlled set of sentences. Ambiguity is high in open-text fields, so to handle this, Peak’s model can assign up to three labels, where appropriate, to any given comment.

Assigning more than one label, rather than only one, increases the value and accuracy of the classification for a particular comment. The model can also run over all historical responses, as well as those entered in the future, to standardize labels and remove subjectivity.
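
The multi-label step described above might look something like the sketch below. It is an illustration only, not Peak's implementation: the function name `select_labels`, the score threshold, and the example label names are all assumptions made for this example. It keeps the best-scoring label for every comment and adds a second or third label only when its score clears a cut-off.

```python
from typing import Dict, List


def select_labels(
    scores: Dict[str, float],
    threshold: float = 0.2,  # assumed cut-off, not taken from the case study
    max_labels: int = 3,     # "up to three labels" per comment
) -> List[str]:
    """Return the top label plus any runners-up that clear the threshold."""
    if not scores:
        return []
    # Rank labels from highest to lowest score.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Always keep the best label so every comment receives at least one.
    chosen = [ranked[0][0]]
    # Consider only the next (max_labels - 1) candidates for extra labels.
    for label, score in ranked[1:max_labels]:
        if score >= threshold:
            chosen.append(label)
    return chosen


# Hypothetical scores for one comment; the label names are invented.
example_scores = {"safety": 0.55, "efficacy": 0.30, "dosing": 0.10, "access": 0.05}
print(select_labels(example_scores))
```

With these made-up scores, the comment would receive two labels: the top label and one runner-up that clears the assumed threshold.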

NATURAL LANGUAGE PROCESSING: Natural language processing, or NLP, is the application of computational techniques to the analysis and synthesis of natural language and speech.

What’s the upshot?

The model gives GSK a more efficient way of labeling sentences than its previous manual methods. In GSK’s strict testing on a controlled data set, Peak’s model achieved 80% accuracy. Another main benefit of the model was its ability to assign more than one label to a sentence: 70% of all sentences were assigned a second label, and 30% were assigned a third, allowing GSK to perform a more in-depth analysis of this essential language data.
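
Figures like these could be computed along the lines of the sketch below. The case study does not say how accuracy was scored, so counting a prediction as correct when its first label matches a single reference label is an assumption, as are the function names and the tiny invented data set.

```python
from typing import List, Sequence


def top_label_accuracy(predicted: Sequence[List[str]], reference: Sequence[str]) -> float:
    """Share of sentences whose first predicted label equals the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must be the same length")
    if not reference:
        return 0.0
    hits = sum(
        1 for labels, truth in zip(predicted, reference)
        if labels and labels[0] == truth
    )
    return hits / len(reference)


def share_with_at_least(predicted: Sequence[List[str]], n: int) -> float:
    """Share of sentences that were assigned at least n labels."""
    if not predicted:
        return 0.0
    return sum(1 for labels in predicted if len(labels) >= n) / len(predicted)


# Invented example: five sentences with one to three labels each.
predicted = [["a", "b", "c"], ["b", "a"], ["c"], ["a", "c"], ["b"]]
reference = ["a", "b", "c", "c", "a"]

print(top_label_accuracy(predicted, reference))  # accuracy on the top label
print(share_with_at_least(predicted, 2))         # share with a second label
print(share_with_at_least(predicted, 3))         # share with a third label
```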
