A lot of data to swallow
Dysphagia is a broad term for any difficulty a person may have swallowing. This can be anything from taking a bit longer to swallow larger or more viscous material, to a complete inability to swallow at all. The causes of dysphagia range from the fairly mechanical, such as muscle failure or blockage, to the neurological, such as multiple sclerosis, muscular dystrophy and Parkinson's disease. Some cancers and their treatments can also damage the swallowing reflex. One particularly unpleasant form of the problem occurs when the throat fails to prevent food from entering the windpipe. In healthy people, this would result in coughing, but another side effect of some of the neurological conditions is a failure of the cough reflex, so food is allowed to enter the lungs. With large pieces of food this can cause choking, but more usually the pieces of food are small and get into the lungs without any immediately obvious effect. However, the food is likely to eventually cause pneumonia, which is potentially lethal for patients with the types of conditions listed above. It is therefore very important to know whether food is entering the airways, which is called "aspiration", since it can then be mitigated using special swallowing techniques and control of the diet.
The current gold standard for detecting aspiration is fluoroscopy, which involves taking a live x-ray video of someone as they swallow food laced with barium, which shows up well on the x-ray, to try to directly observe whether food is entering the airways. Its disadvantages are that it involves a significant radiation dose for the patient, and that it only represents their swallowing on a particular day, in the rather artificial conditions of a fluoroscopy clinic. It also requires the patient to be able to respond to directions from the clinicians, which can make aspiration effectively invisible for patients with some neurological conditions. We would like a method of detecting aspiration that does not require x-rays, and that preferably could be used outside a hospital and with patients who have a more limited ability to respond to questioning.
Our aim is to use microphones placed on the throat to detect whether food is entering the lungs. We have been working with a team from the Sheffield Royal Hallamshire Hospital who run a clinic detecting aspiration. When a patient comes in for a fluoroscopy, we place microphones at the cricothyroid ligament (the front/middle of the throat) and the suprasternal notch (the notch in the centre of the upper chest between the two collar bones). We record all of the patient's swallowing, coughing and breathing sounds while the x-ray video is being taken. The x-ray video can then be used to label each audio file with whether or not aspiration is happening, as well as other relevant information such as the position of the head and the timing of any swallows. This labelled data allows us to use a basic form of machine learning.
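To make the labelling step concrete, here is a minimal Python sketch of how timings read off the x-ray video might be used to cut a recording into labelled sections. It is an illustration only, not the code used in the study: the function name labelled_segments, the (start, end, label) annotation format and the label strings are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Hypothetical annotation format: (start in seconds, end in seconds, label).
Annotation = Tuple[float, float, str]


def labelled_segments(
    samples: Sequence[float],
    sample_rate: int,
    annotations: Sequence[Annotation],
) -> List[Tuple[str, Sequence[float]]]:
    """Cut a recording into labelled sections using annotated timings.

    Anything outside an annotated interval is simply dropped, which is
    one crude way of removing irrelevant parts of the recording.
    """
    segments = []
    n = len(samples)
    for start_s, end_s, label in annotations:
        # Convert times to sample indices and clamp them to the recording.
        start = max(0, int(round(start_s * sample_rate)))
        end = min(n, int(round(end_s * sample_rate)))
        if end > start:
            segments.append((label, samples[start:end]))
    return segments
```

In practice the audio and the x-ray video would first need to be synchronised, which this sketch takes for granted.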
We are currently about halfway through a study involving around 20 patients. From this, we aim to develop a device that can then be tested on thousands of patients, which would be required for it to be used more widely on the NHS. With the data collected so far we have been able to predict aspiration from the sound files with a sensitivity and specificity of approximately 75 percent. This is not as good as we hope to get, but is already better than methods that do not use fluoroscopy. We are also refining the machine learning method quite significantly as the remaining data is collected. We have found that, as is often the case, the most important part of the process is the segmentation of the sound file, in which it is divided up into labelled sections, and the removal of any irrelevant sections.
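For readers unfamiliar with the terms, sensitivity is the fraction of genuine aspiration events that are correctly flagged, and specificity is the fraction of safe swallows that are correctly passed. The short Python sketch below shows how the two numbers are computed from a list of true labels and predictions. The function name is an assumption for this example, and any labels fed to it here are made up for illustration rather than taken from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity); True means aspiration."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # aspiration, flagged
    fn = sum(1 for t, p in pairs if t and not p)      # aspiration, missed
    tn = sum(1 for t, p in pairs if not t and not p)  # safe, passed
    fp = sum(1 for t, p in pairs if not t and p)      # safe, wrongly flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")

    return tp / (tp + fn), tn / (tn + fp)
```

Reporting both numbers matters because a detector that flags every swallow would have perfect sensitivity but no specificity at all.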
As already mentioned, our future aims include developing a device to test on a much larger population of patients. This will allow much more sophisticated machine learning methods. With the small data set collected so far we are limited to simple methods such as k-nearest neighbour and support vector machines, but more data will allow more modern neural network-based methods to be used. We are also interested in using the detailed video and sound data that we have collected to get a better understanding of how swallowing sounds are produced. Bubbles are oscillating, cavities are resonating, and flexible structures are vibrating, and the relative importance of all of these is not yet understood. We believe that a better understanding of this process will also help us to detect aspiration and refine the machine learning processes we use.
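As a rough illustration of why k-nearest neighbour suits a small data set, the Python sketch below classifies a new sound segment by a majority vote among the k most similar labelled segments. It is a toy version under stated assumptions, not the method used in the study: the function name knn_predict is hypothetical, and each segment is assumed to have already been reduced to a short list of numerical features.

```python
import math
from collections import Counter
from typing import Sequence


def knn_predict(
    train_features: Sequence[Sequence[float]],
    train_labels: Sequence[str],
    query: Sequence[float],
    k: int = 3,
) -> str:
    """Label a query point by majority vote among its k nearest neighbours."""
    if not train_features or len(train_features) != len(train_labels):
        raise ValueError("need matching, non-empty features and labels")

    # Euclidean distance from the query to every labelled example.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )

    # Count the labels of the k closest examples and return the most common.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

There is no training stage here: the labelled examples themselves are the model. That makes the method workable with a few dozen recordings, whereas a neural network would typically need far more data to fit its parameters.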
By Dr Alastair Gregory (2010)