The use of AI in both research and clinical prediction models is becoming ubiquitous, but several challenges remain. How do we build context around data? How do we focus on the true question that the AI exists to answer? How do we build trust in these processes?
At the Festival of Genomics and Biodata 2022, we hosted an AI workshop with experts from the space. Here we share some key insights from the session.
One of the first challenges raised in the workshop was the shortage of people working in AI in the UK. Boston and Silicon Valley were noted as areas where large communities of individuals working in AI can easily reach out and collaborate on issues. It was postulated that a similar community should be built (metaphorically if not physically) within the UK to propel thinking forward.
In a similar vein to community, the workshop ended with discussion on talent. How do you draw in the right talent for your project? It was unanimously agreed that the answer is almost never “an expert in AI”. Rather, it was suggested that competent individuals who have expertise in the domain that the AI is being applied in are likely best for the job. Interestingly, a poll in the workshop showed that 53% of attendees said that the AI space needed more people with domain expertise, rather than more data, more AI experts, or more computational ability.
Collaborating with people who have domain expertise helps to build context around a particular dataset. AI is typically built around structured data: quantitative measurements such as heart rate, blood pressure or T cell count. However, in a case study raised during the workshop, a model based solely on structured data reached only 65% accuracy in predicting patient mortality and was also found not to be scalable. When unsupervised natural language processing was applied to the unstructured data that accompanied the structured data, mortality prediction accuracy rose to 90%. This was only possible in collaboration with clinicians.
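To make the idea of pairing structured and unstructured data concrete, here is a minimal sketch in Python. It is not the method used in the case study: it swaps the unsupervised natural language processing for a much simpler keyword count, and the records, field names and vocabulary are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical patient records: field names and values are invented for
# illustration and are not taken from the workshop case study.
records = [
    {"heart_rate": 72, "systolic_bp": 118,
     "note": "Patient comfortable, eating well, no chest pain."},
    {"heart_rate": 121, "systolic_bp": 85,
     "note": "Confused overnight. Sepsis suspected, started antibiotics."},
]

# Hypothetical vocabulary. In practice, clinicians would help choose the
# terms, or they would be learned from the notes themselves.
VOCABULARY = ["sepsis", "confused", "pain", "antibiotics"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_features(record, vocabulary=VOCABULARY):
    """Return the structured values followed by one count per vocabulary term."""
    counts = Counter(tokenize(record["note"]))
    structured = [float(record["heart_rate"]), float(record["systolic_bp"])]
    text_features = [float(counts[term]) for term in vocabulary]
    return structured + text_features


# One row per patient: [heart_rate, systolic_bp, sepsis, confused, pain, antibiotics]
feature_matrix = [build_features(record) for record in records]
```

Note that the first record says "no chest pain", yet the naive count still registers "pain". Handling negation and clinical shorthand of this kind is one place where domain expertise becomes essential.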
The overall view in the workshop was that dealing with unstructured data, like patient history, is non-negotiable. However, these data are hard to handle. Automated machine learning (AutoML), the process of automating the time-consuming, iterative tasks of machine learning model development, was raised as an interesting option. Coupled with natural language processing, it could help deal with the sea of hard-to-handle unstructured data.
Another issue with processing unstructured data is ensuring that a large, diverse, organised dataset is produced despite the restrictions of data governance. Protecting privacy while still knowing how diverse a dataset is can be a difficult balance to strike.
Another key insight mentioned in the workshop was that of trust. Patients have a right to know why AI has decided something for them. However, with such complex AI systems, how realistic is this? Improving AI interpretability is key here. The group provided some amusing examples that showed how important it is to be able to interpret how an AI has arrived at a particular conclusion. These included:
- An AI made to distinguish huskies from wolves was instead using the presence of snow in the image to guide its prediction.
- An AI made to identify melanoma was instead checking for a histological ruler in the image, a common sight in reference images of melanoma.
- A highly accurate AI was able to predict how sick COVID patients receiving MRIs were. It turned out that the sickest patients had their MRI lying down, while healthier individuals were standing up. The amazingly accurate AI was actually detecting whether someone was standing or lying down.
These examples, while funny, point to a really important problem. It’s possible to produce an AI that appears to be absolutely fantastic at doing what you want. But if the route to that decision is not understood, it may be incorrectly applied to a situation that it is completely unable to process. When AI is used in the context of healthcare, the magnitude of such mistakes is amplified.
This makes real-world validation extremely important. If an AI is informing important decisions, it is crucial that the route to each decision is well understood and that its output is not taken at face value.
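The husky and wolf example can be reproduced in miniature. The sketch below is an illustration under stated assumptions, not anything presented at the workshop: the toy dataset, the feature names and the one-rule "model" are all hypothetical. It uses permutation importance (shuffle one feature and measure how much accuracy drops) to reveal what the model relies on, then checks the model on data where the shortcut no longer holds.

```python
import random

# Hypothetical binary features for each image: (snow, pointed_snout).
FEATURES = ["snow", "pointed_snout"]

# Toy training set, label 1 = wolf, 0 = husky. Every wolf photo has snow
# and no husky photo does, so snow is a perfect but spurious shortcut.
train = [
    ((1, 1), 1), ((1, 1), 1), ((1, 1), 1), ((1, 0), 1),
    ((0, 0), 0), ((0, 0), 0), ((0, 0), 0), ((0, 1), 0),
]

# "Real-world" set where the shortcut breaks: huskies in snow, wolves on grass.
shifted = [((1, 0), 0), ((1, 0), 0), ((0, 1), 1), ((0, 1), 1)]


def accuracy(predict, rows):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def fit_one_rule(rows):
    """Pick the single feature whose value best matches the label."""
    scores = [accuracy(lambda x, i=i: x[i], rows) for i in range(len(FEATURES))]
    best = scores.index(max(scores))
    return best, (lambda x: x[best])


def permutation_importance(predict, rows, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows)
    importances = {}
    for i, name in enumerate(FEATURES):
        drops = []
        for _repeat in range(n_repeats):
            column = [x[i] for x, _label in rows]
            rng.shuffle(column)
            shuffled_rows = [
                (x[:i] + (value,) + x[i + 1:], y)
                for (x, y), value in zip(rows, column)
            ]
            drops.append(baseline - accuracy(predict, shuffled_rows))
        importances[name] = sum(drops) / n_repeats
    return importances


best_index, model = fit_one_rule(train)
importances = permutation_importance(model, train)
train_accuracy = accuracy(model, train)
shifted_accuracy = accuracy(model, shifted)

print("Feature the model relies on:", FEATURES[best_index])
print("Permutation importances:", importances)
print("Accuracy on training data:", train_accuracy)
print("Accuracy on shifted data:", shifted_accuracy)
```

On the training data the model looks perfect, but the importance scores show it depends entirely on snow, and its accuracy collapses on the shifted set. Real models and datasets are far messier, but the same two checks (what does the model rely on, and does it hold up on data collected differently?) apply.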