Process Flow

Guideline documents are edited with GEM Cutter to produce GEM documents. These are then uploaded to a repository by a program that extracts their essential elements. Next, Apache cTAKES, a UIMA-based NLP processor for clinical documents, is run to annotate the guideline text. These annotations, which include UMLS codes, are stored in the repository for later retrieval. The final step is to create SVM classifiers from training sets prepared by clinical experts. Examples are being designed and developed and will be made available as soon as they are ready.

An important initiative that we are pursuing is automatically matching UMLS concept unique identifiers (CUIs) to recommendation text. The challenge is to optimize the set of codes related to the guideline text. We are employing cTAKES (Document Concepts) and experimenting with cosine similarity measures (Filtered Actions), n-grams, and Sørensen-Dice coefficients (UMLS Filtered Codes) to reduce the matches generated from UMLS search results.
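
The similarity measures named above can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the 0.3 threshold, the choice to keep a candidate when either score passes, and the placeholder identifiers (CUI-A, CUI-B) are all assumptions made for illustration. Cosine similarity is computed here over word counts, and the Sørensen-Dice coefficient over sets of character bigrams.

```python
import math
from collections import Counter


def tokens(text):
    """Lowercase whitespace tokenization (a deliberately simple assumption)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between the word-count vectors of two strings."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def char_ngrams(text, n=2):
    """Set of character n-grams in the lowercased string."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def dice_coefficient(a, b, n=2):
    """Sorensen-Dice coefficient over character n-gram sets: 2|X & Y| / (|X| + |Y|)."""
    x, y = char_ngrams(a, n), char_ngrams(b, n)
    if not x and not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def filter_candidates(text, candidates, threshold=0.3):
    """Keep (code, name) candidates whose best similarity to the text meets
    the threshold, sorted by descending score. The threshold and the use of
    max() to combine the two measures are illustrative assumptions."""
    kept = []
    for code, name in candidates:
        score = max(cosine_similarity(text, name), dice_coefficient(text, name))
        if score >= threshold:
            kept.append((code, name, score))
    return sorted(kept, key=lambda row: row[2], reverse=True)


# Hypothetical search results; the identifiers are placeholders, not real CUIs.
recommendation = "inhaled corticosteroids for persistent asthma"
search_results = [("CUI-A", "asthma"), ("CUI-B", "fracture of femur")]
for code, name, score in filter_candidates(recommendation, search_results):
    print(code, name, round(score, 3))
```

In this sketch the unrelated candidate is dropped because it shares no words and almost no bigrams with the recommendation text. A real pipeline would likely tune the threshold against expert-labelled data and might weight the measures rather than take the maximum.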