Exploring Potential AI Bias in Healthcare
We have all heard the quote attributed to Benjamin Disraeli: “Lies, damned lies, and statistics.” Lately, there has been increasing concern that AI outputs may be either intrinsically or intentionally biased. Coupled with the reality that AI’s power and capability are progressing at an exponential rate, that possibility deserves serious attention. There has been a great deal of media hype about internal bias in AI, and we must separate the wheat from the chaff as we examine potential examples in the healthcare realm.
Most people believe there is some bias in AI, if for no other reason than that the internet itself, the source of much of GPT’s training data, may be biased in its content. Robert Thompson of News Corp stated: “The danger is rubbish in, rubbish out, rubbish all about. Bots like ChatGPT will regurgitate the claptrap as fact.” He also observed that one can see the effects of the bias of the “input-er.”
Earlier this year, researcher and associate professor David Rozado put ChatGPT through three online political quizzes: the Political Compass, the Pew Political Typology Quiz, and the World’s Smallest Political Quiz. In all three, he claims, ChatGPT scored well toward the left-libertarian quadrant, seeming to provide evidence that an intrinsic AI bias exists, at least on politically or ethically charged tasks. In further analyses he found substantially less political bias, but his research serves to illustrate the potential bias we face.
How Does Bias Emerge? Examples of Bias in AI
To pursue this further, we can conceptualize the AI process as follows:
AI input → Algorithms + Datasets → Intelligent AI output
Each of those steps offers an opportunity to introduce bias, and the resulting bias could be either unintentional or deliberate. Some potential examples of bias in each of those AI components include:
AI Input:
- The initial question posed to GPT could be ambiguous or misleading, and the model, faithfully following its algorithms and training, could reach a biased result simply because of the way the request was phrased. Users have also noticed GPT giving different responses to the same question asked on different occasions. Though not truly a bias, the input can lead to an unintended output solely on the basis of how it was presented; the sketch below illustrates one reason for this variability.
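One well-understood contributor to that run-to-run variability is sampling temperature. The following is a minimal, illustrative sketch in Python; the three-token “vocabulary” and its scores are invented purely for illustration and are not any vendor’s actual decoding code. It shows how a model that samples from its output distribution, rather than always choosing the single most likely token, can answer the same prompt differently on each run.

```python
import math
import random

# Toy next-token scores a model might assign after a fixed prompt.
# The vocabulary and numbers are invented purely for illustration.
token_scores = {"benign": 2.0, "uncertain": 1.5, "malignant": 0.5}

def sample_next_token(scores, temperature=1.0):
    """Sample one token from a softmax over the scores.

    At temperature 0 (greedy decoding) the same prompt always yields the
    same token; at higher temperatures repeated calls can differ.
    """
    if temperature == 0:
        return max(scores, key=scores.get)
    weights = [math.exp(s / temperature) for s in scores.values()]
    return random.choices(list(scores), weights=weights, k=1)[0]

# The "same question" asked three times can produce three different answers.
for _ in range(3):
    print(sample_next_token(token_scores, temperature=1.0))
```

This variability is a design choice rather than a flaw, but it is worth keeping in mind before attributing every inconsistent answer to bias.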
Algorithms:
- There could be an innate bias in the AI programmers (the “input-ers”). Although there is little objective data to support this, one could argue that a lack of diversity among programmers contributes to a systemic bias in the very algorithms applied to the datasets. The programmers may, even unknowingly, bias the algorithms with their own experiences or preferences, and the result may not be objective.
- The algorithms themselves could be insufficient to fend off bias. As the programming becomes more sophisticated and mature, this should become less of an issue; however, we simply do not know at this point whether there are gaps in the logic that could sway the output in a nonobjective fashion.
- It is possible that the algorithms could be intentionally designed to be hostile or to lead in a particular direction. Could a nefarious programmer “poison” the algorithms to further a particular cause?
Datasets:
- The intrinsic bias of the internet. No one would argue that the information on the internet is uniformly objective, and there is little regulation of what winds up there. This is especially concerning when patients turn to the internet: there is excellent data and there is nonsensical data, and how is the user to know the difference? Likewise, how would GPT be able to distinguish which sources are credible and which are not?
- The training data for GPT could be intrinsically biased or insufficient. By and large, training data is proprietary, although we do know that much of it comes from the internet. If the training data is not appropriately vetted and verified, GPT could “learn” to reach a solution in a non-objective or even false manner; the sketch after this list illustrates how skewed data can skew the output.
- It is also possible that the training data is intentionally false. Once again, the designer of the training data could introduce a “poison” that would affect how the program operates.
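To make the dataset concern concrete, here is a small, hypothetical Python sketch. The patient groups, counts, and the deliberately naive “model” are all invented; the point is only that when one group is scarce in the training data, that skew flows straight through the algorithm into the output.

```python
# Hypothetical training records: (patient_group, has_condition).
# Group "A" dominates the dataset; group "B" is barely represented,
# even though the condition is far more common in group B.
training_data = (
    [("A", False)] * 850 + [("A", True)] * 100 +
    [("B", False)] * 20 + [("B", True)] * 30
)

# A deliberately naive "model": flag the condition only if it appears in
# the majority of ALL training records, ignoring group-specific patterns.
overall_rate = sum(y for _, y in training_data) / len(training_data)
predict_positive = overall_rate > 0.5  # -> False for this data

print(f"Overall positive rate in training data: {overall_rate:.0%}")
for group in ("A", "B"):
    outcomes = [y for g, y in training_data if g == group]
    true_rate = sum(outcomes) / len(outcomes)
    verdict = "positive" if predict_positive else "negative"
    print(f"Group {group}: true rate {true_rate:.0%}, model predicts {verdict} for everyone")
```

A real model is far more sophisticated than a single base-rate rule, but the failure mode is the same: the system looks reasonably accurate for the well-represented group while quietly failing the underrepresented one, and nothing in the output flags the problem.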
AI Output:
- Emergent properties are unexpected and unpredictable behaviors or outcomes that arise out of the collective functioning of a system. Put another way, over time AI can develop a “mind of its own” that is not controlled by the input-ers. There is some controversy as to whether emergent properties actually exist: they are reported mainly in larger, more sophisticated models, and some researchers believe the apparent emergence is an artifact of the metrics used to evaluate the AI rather than a real phenomenon.
- Hallucinations or confabulations. Users of GPT have no doubt that, on occasion, it will give a nonsensical response that is not justified by its training data or source algorithms. Oren Etzioni, founding CEO of the Allen Institute for AI, said earlier this year that “[AI can] give you a very impressive-sounding answer that’s just dead wrong.” As with emergent properties, the etiology is unclear. One could argue that a hallucination is not a true bias, but it causes the AI to give a random, non-objective response to the user.
Understanding the Potential for Bias in AI is Key
So, there are numerous opportunities to introduce bias throughout the artificial intelligence process. Some of the possibilities above are theoretical and have not been observed, but they are certainly plausible.
As end users of this technology, we must consider these possible examples of bias in our AI solutions, particularly in the healthcare field. The more we do to understand the process and the potential etiologies of bias, the more dependable the end result will be.

As always, I welcome your thoughts and feedback. If you’re interested in bringing my presentation on AI in healthcare to your organization, please reach out to me.