We have extended this work by implementing the FastText classifier. FastText is a C++ program; we compiled it for our Linux platform and ran it from a Java program, using our expert’s Action Types data set. The results are displayed in the AI section for comparison with our Naive Bayes approach. We have not implemented multi-label classification, although FastText supports it; this omission was the primary source of error. We used n-grams of size 2, a learning rate of 1, and 30 epochs. The results are very encouraging. If you wish to try it out, you can use this data set, where we have added a “__label__” tag to work with FastText.
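As a rough illustration of the data preparation and settings described above, the sketch below formats examples with the “__label__” prefix and assembles a FastText command line. The example labels and sentences, the file names, and the choice of `-wordNgrams` for the n-gram setting are assumptions made for illustration; they are not taken from our data set or our Java wrapper.

```python
def to_fasttext_line(label, text):
    """Format one example as a FastText supervised-training line."""
    # FastText labels cannot contain whitespace, so spaces become underscores
    # (an assumption about how multi-word labels might be handled).
    tag = "__label__" + label.strip().replace(" ", "_")
    # Collapse runs of whitespace so each example stays on a single line.
    return tag + " " + " ".join(text.split())


def training_command(input_path, model_prefix):
    """Build the fasttext CLI call with the settings quoted in the text."""
    return [
        "fasttext", "supervised",
        "-input", input_path,
        "-output", model_prefix,
        "-lr", "1.0",        # learning rate of 1
        "-epoch", "30",      # 30 epochs
        "-wordNgrams", "2",  # assumes the n-grams are word n-grams
    ]


# Hypothetical examples, not rows from the Action Types data set.
examples = [
    ("Order Test", "Obtain a chest radiograph for suspected pneumonia"),
    ("Prescribe", "Start antibiotic therapy within four hours"),
]
lines = [to_fasttext_line(label, text) for label, text in examples]
command = training_command("action_types.train", "action_types_model")
```

The command list could be passed to `subprocess.run` on a machine where the `fasttext` binary is installed; the sketch only builds it.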

Acronyms are a persistent stumbling block for NLP. Capturing them in a reference database table should help reduce these errors.
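One possible way to apply such a reference table is to expand known acronyms before the text reaches the classifier. The sketch below is only an assumption about how the lookup might work; the table rows are hypothetical and an in-memory dictionary stands in for the database table.

```python
import re

# Hypothetical rows standing in for the acronym reference table.
ACRONYMS = {
    "EHR": "electronic health record",
    "BP": "blood pressure",
}


def expand_acronyms(text, table):
    """Replace known all-caps acronyms with their expansions."""
    def replace(match):
        token = match.group(0)
        # Unknown acronyms are left unchanged.
        return table.get(token, token)

    # Match whole tokens made of two or more uppercase letters.
    return re.sub(r"\b[A-Z]{2,}\b", replace, text)


expanded = expand_acronyms("Check BP and record it in the EHR", ACRONYMS)
```

Matching only all-caps tokens is a deliberate simplification; mixed-case or ambiguous acronyms would need context to resolve.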

Our next activity will be to examine ways to use n-grams to identify recommendations that are applicable in a clinical setting. We will convert the n-gram frequency results to probabilities and conditional probabilities. We can then assign a likelihood that a given recommendation will be relevant to a clinical context source, such as an order set or an EHR clinical note. We will also explore how to build on our FastText results.
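The conversion from frequencies to probabilities can be sketched as follows for bigrams. This is a minimal illustration under stated assumptions: whitespace tokenization, maximum-likelihood estimates with no smoothing, and made-up example phrases rather than our recommendation text.

```python
from collections import Counter


def bigram_probabilities(sentences):
    """Return joint and conditional bigram probabilities from raw counts."""
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        bigram_counts.update(zip(tokens, tokens[1:]))

    total = sum(bigram_counts.values())
    # Joint probability: share of all observed bigrams.
    joint = {bg: count / total for bg, count in bigram_counts.items()}

    # Count how often each word appears as the first word of a bigram.
    first_counts = Counter()
    for (first, _), count in bigram_counts.items():
        first_counts[first] += count

    # Conditional probability P(second | first).
    conditional = {
        (first, second): count / first_counts[first]
        for (first, second), count in bigram_counts.items()
    }
    return joint, conditional


# Hypothetical phrases, used only to exercise the function.
phrases = ["order chest x-ray", "order blood test", "order chest ct"]
joint, conditional = bigram_probabilities(phrases)
```

With real data, some form of smoothing would likely be needed so that unseen n-grams do not receive zero probability.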