Natural language processing is a multidisciplinary field concerned with the interactions between computers and human (natural) languages. Clinical care produces a vast amount of data that administrators, researchers, and clinicians could use to improve the quality of care and advance research. However, the vast majority of this data is contained within the raw text of physician narratives, requiring either manual abstraction or automated extraction to transform it into usable, aggregatable, and relatable data. The complexity of clinical data and medical language poses unique challenges in text processing and information extraction.
Work alongside natural language processing engineers, researchers, clinicians, data managers, and system developers to help architect, develop, and implement computational methods and tools that assist time- and resource-intensive manual processes. These tools enable researchers, clinicians, and physicians to retrieve and use clinical information more efficiently, improving healthcare operations and advancing cancer research.
Information extraction from pathology reports
Extraction of important data elements such as diagnosis, tumor size, location, genetic markers, and treatment history from breast, prostate, and sarcoma pathology reports.
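To give a sense of what extracting one such data element might involve, here is a minimal sketch that pulls size measurements out of report text with a regular expression. The pattern, the function name `extract_tumor_sizes`, and the sample sentence are assumptions made for illustration; they do not describe the methods actually used on these projects.

```python
import re

# Hypothetical pattern: matches sizes such as "2.3 cm", "15 mm",
# or "2.3 x 1.5 x 0.8 cm" (up to three dimensions).
SIZE_PATTERN = re.compile(
    r"(\d+(?:\.\d+)?(?:\s*x\s*\d+(?:\.\d+)?){0,2})\s*(cm|mm)\b",
    re.IGNORECASE,
)


def extract_tumor_sizes(report_text):
    """Return a list of (dimensions_in_cm, matched_text) tuples."""
    results = []
    for match in SIZE_PATTERN.finditer(report_text):
        # Split "2.3 x 1.5" into its individual dimensions.
        dims = [
            float(d)
            for d in re.split(r"\s*x\s*", match.group(1), flags=re.IGNORECASE)
        ]
        # Normalize millimeters to centimeters.
        if match.group(2).lower() == "mm":
            dims = [d / 10 for d in dims]
        results.append((dims, match.group(0)))
    return results


if __name__ == "__main__":
    sample = "Invasive ductal carcinoma, 2.3 x 1.5 cm, margins negative."
    print(extract_tumor_sizes(sample))
```

A pattern like this captures any measurement in the text, including margins and distances, so a real system would also need context to decide which measurement is the tumor size.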
Extraction of Pancreatic Cancer Diagnosis and Staging
Over the course of five months, our graduate-level computational linguistics intern piloted a project to automatically extract diagnosis and staging from clinical notes and pathology reports, building a resource of discrete, queryable data for the pancreatic working group at SCCA.
A Spectrum of Certainty in Epistemic Terms and Hedging Language
So-called "hedging language" is ubiquitous in clinical narratives (e.g. "imaging is worrisome for ___" or "pathology is consistent with ____"). In a subjective grading, how do these various hedging statements relate to the authors' confidence in the given evidence, and how do the use of these statements and their level of certainty vary among specialties? Also, how does the use of these epistemic/evidential/hedging phrases actually relate to changes in treatment?
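As a rough illustration of how hedging phrases could be placed on a certainty scale, the sketch below looks up phrases in a small lexicon. The scores are invented placeholders, "cannot rule out" is an added example phrase, and `find_hedges` is a hypothetical helper; an actual grading would come from annotators or clinicians, not from these numbers.

```python
# Hypothetical certainty scores (0 = weak suspicion, 1 = near certain).
# These values are placeholders, not results of any grading study.
HEDGE_LEXICON = {
    "consistent with": 0.8,
    "worrisome for": 0.6,
    "cannot rule out": 0.3,
}


def find_hedges(sentence):
    """Return (phrase, score) pairs in order of first appearance."""
    lowered = sentence.lower()
    hits = [
        (phrase, score)
        for phrase, score in HEDGE_LEXICON.items()
        if phrase in lowered
    ]
    # Order the hits by where each phrase first occurs in the sentence.
    return sorted(hits, key=lambda hit: lowered.index(hit[0]))


if __name__ == "__main__":
    text = ("Imaging is worrisome for metastasis; "
            "pathology is consistent with adenocarcinoma.")
    print(find_hedges(text))
```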
Simple Temporal Designations for Clinical Timelines
Recreating the clinical timeline of events is a crucial step in outcomes research. For example, answering a question like "For a given patient cohort, what were the outcomes of treatment X versus treatment Y?" means ascertaining medical and social history, diagnosis, complications, progressions, and treatments, and when they all happened. Determining fine-grained temporal relations (for example with Allen's interval algebra) is an extremely difficult task, even for people with extensive linguistic and/or medical knowledge. However, creating broad temporal bins for clinical events (e.g. remote past, recent past, present, and future) could potentially be not only an easier task, but also one that captures the significant relations necessary to relate clinical events temporally and create a complete timeline of a patient’s clinical story.
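The broad bins described above can be sketched in a few lines, assuming each event has already been resolved to a calendar date relative to the date of the note. The 90-day cutoff between "recent past" and "remote past" and the name `temporal_bin` are assumptions chosen for illustration; where the boundaries should fall is part of the open question.

```python
from datetime import date


def temporal_bin(event_date, note_date, recent_window_days=90):
    """Assign an event to a coarse temporal bin relative to the note date.

    recent_window_days is a hypothetical cutoff separating
    "recent past" from "remote past".
    """
    delta_days = (event_date - note_date).days
    if delta_days > 0:
        return "future"
    if delta_days == 0:
        return "present"
    if delta_days >= -recent_window_days:
        return "recent past"
    return "remote past"


if __name__ == "__main__":
    note = date(2015, 7, 20)
    for event in (date(2010, 1, 1), date(2015, 6, 1), note, date(2015, 8, 1)):
        print(event, temporal_bin(event, note))
```

In clinical text the event date is often implied by language ("history of", "scheduled for") instead of stated outright, so assigning the bin directly from wording may be more practical than computing it from dates.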
This is a paid, non-benefits-eligible internship for a ten-week period starting Monday, July 20th.
Attendance for the duration of the program is required.
Please be sure to include the following:
Late or incomplete applications will not be processed. Notification will occur by May 1.
Please read this page completely and carefully before contacting us.
For questions about the program or application process, please contact:
Thank you for your interest!