Saturday, 27 April, 2024
Race versus skin tone debate in resolving pulse oximeters' false readings

Physicians and government regulators are increasingly aware that pulse oximeters measure oxygen levels less accurately in patients with darker skin. But the issue with the devices is not one of race – rather, it’s one of skin tone, says an expert who has spent more than a decade studying colourism.

An associate professor of sociology at Harvard, Ellis Monk, says much of the work and research to understand the devices’ shortcomings and devise solutions is focused on race. However, he says, it all revolves around skin tone: the light used in the devices to detect oxygenated blood can actually be blocked by melanin in the skin.

Indeed, it was largely race and not skin pigment that was discussed when a US Food and Drug Administration panel met last month to advise the agency on how best to improve the devices, reports STAT News.

And race has long been a proxy for skin tone in research studies because it’s something that’s recorded in both medical and census records, while skin tone is not. Medical studies showing that the devices missed dangerously low oxygen levels in patients with darker skin used race as well.

But the two are very different. Black people can have a huge variety of skin tones, ranging from very dark to very light. And some people who are Asian, Hispanic, or indigenous have darker skin than people who are black.

Yet there hasn’t been a good way to characterise these differences in skin tone in medical research, especially for those whose skin is of darker shades. It’s something Monk wants to fix. “To do the foundational research … we need to … get pulse oximetry that works for everyone, think very deeply about skin tone,” he said.

Monk has spent more than 10 years researching colourism, a form of discrimination based on skin colour that tends to favour lighter-skinned people over darker-skinned people. He’s among social scientists who have published a stream of studies showing that skin colour, not just race, is a major factor in health and other disparities. People with darker skin are more likely to receive the death penalty, earn less, and have poorer health.

But as with pulse oximeters, Monk says a limitation in this research has been the lack of a reliable way to measure skin tone. So, working with Google, he developed the Monk Skin Tone Scale, which includes a fuller range of darker skin tones than the cruder tools currently used. Such a scale, he said, is essential to ensure pulse oximeters – and many other technologies and medical devices – work equally well for all people.

Monk’s scale, with its broader range of standardised colours, has advantages over other scales, said Michael Lipnick, an associate professor of anaesthesia at the University of California San Francisco, and an investigator at UCSF’s Hypoxia Lab.

The lab has started the Open Oximetry Project, which is working to help the FDA and other stakeholders to assess the limitations of pulse oximeters. Other scales, Lipnick said, “leave too much room for subjectivity and may not adequately account for colour at the site of pulse oximeter measurement”.

The scale most commonly used in labs is the Fitzpatrick scale, which by its very reason for existence is skewed to assess lighter skin. The widely used scale was developed by Harvard dermatologist Thomas B Fitzpatrick in 1975 to assess both sunburn risk and the risk of skin damage during medical treatments with UV light for conditions like psoriasis or eczema.

Because lighter skin has less melanin to filter out harmful UV rays, it is considered more susceptible to damage. The original scale had just four shades, all light. It wasn’t until 10 years later that two more shades were added, one for brown skin tones and one for black skin tones – woefully inadequate to represent the almost infinite shades of skin tone in the real world.

The scale is increasingly seen as inadequate for dermatology as well because it does not contain enough dark tones, implies that darker skin doesn’t burn, and is often used by physicians to conflate skin colour and race or ethnicity.

But because it was there, the Fitzpatrick scale became the de facto standard for engineers and researchers who needed to measure skin tone, Monk said. It’s also been the basis for the six skin colours used in emojis and the standard used in developing machine learning algorithms for a range of technologies.

The paucity of skin tones used in machine learning has become abundantly clear in work from such scholars as MIT’s Joy Buolamwini and Princeton’s Ruha Benjamin, who have pointed out racist algorithms that lead automatic light switches to stay off when people with darker skin walk into a room, taps to stay dry when darker-skinned hands are placed beneath them, and self-driving cars to not detect and stop for people with darker skin.

“That’s computer vision using light to sense whether there’s a hand there. With dark skin, not enough light came back to the sensor. That means they didn’t test whether their sensor worked with enough skin colours,” Monk said.

His scale has 10 shades compared with Fitzpatrick’s six. Both scales contain four swatches of light skin shades, but Monk’s has six to represent medium and darker shades. The scale, he believes, is a sweet spot between too few and too many shades, and was developed based on his work on colourism in two countries with highly racially mixed populations, the US and Brazil.

(Another scale, the Massey-Martin scale developed for use in immigrant surveys in 2003, has 10 shades, but did not take hold widely in research labs and has been criticised by some because the darker shades are too similar.)

Ten shades may not seem enough: some racially aware cosmetics companies offer hundreds of shades to customers choosing foundation, and Crayola now offers 24 skin-tone crayons. But in medical research, an exact match is not as important as practicality.

“You can’t have more than 10 or 12, or at a certain point trying to pick out differences gets really hard to do,” Monk said. The scale, he said, involved “making some hard choices because no scale, even one with 150 points, can represent every skin tone out there.”

For those developing improved pulse oximeters, a more diverse scale can help determine how well the devices work on people with a range of skin colours by allowing more precise ratings of the skin colour of test subjects. It could also facilitate the creation of guidelines that require manufacturers to test their devices on a range of skin tones, including those that are very dark.

Current FDA guidelines for pulse oximeter approval state merely that two “darkly pigmented” subjects must be included in testing.

“Two darkly pigmented people? You can interpret that however you want,” said Grace Wickerson, a policy entrepreneurship fellow at the Federation of American Scientists who has been pushing for stronger regulation of medical devices, more diversity in populations that are tested, and more objective measures of skin tone such as Monk’s scale.

“This is a scale that’s about skin pigmentation. It’s not a scale that’s about UV exposure,” Wickerson said.

Monk is teaming with Robert Wilson, a physicist and optics expert at the University of California Irvine, to develop a better device, and was recently awarded a $2.5m NIH Director’s New Innovator Award to assess and try to fix biases in the algorithms used in pulse oximeters. That grant will also fund a longitudinal survey to examine how skin tone, colourism, and social stress affect mental and physical health among black Americans.

Many device manufacturers have said that their pulse oximeters work better on darker-skinned test subjects than the recent medical studies conducted on hospitalised patients suggest. This could be because the devices worked better in idealised lab conditions than in the real world, but could also be, Monk said, because researchers using colour scales with few choices found it easy to rate subjects as having darker skin than they actually do.

Another way to measure skin colour would be to use highly precise devices such as spectrophotometers. But these machines never took hold in dermatology offices because they are expensive and inconvenient, and may, ironically, be less accurate than simple paper or digital colour scales because they’re influenced by features such as vascularity and erythema that can darken a patient’s skin tone.

“In being so precise in measuring the skin, some of these objective measures actually end up bringing in confounders,” Monk said.

Monk didn’t set out to fix pulse oximeters. His project got off the ground when Google contacted him nearly three years ago in an attempt to solve problems with its smartphone cameras, which did not work as well on darker skin; with its Google Photos app, which now includes filters to enhance images of darker skin; and with search algorithms that often spit out image collections that only include lighter-skinned people.

In a series of product updates, the company said it was using Monk’s skin tone scale to “better understand representation in imagery, as well as evaluate whether a product or feature works well across a range of skin tones”, something critically important for computer vision work.

Monk has been working for years to understand the impact of colourism on health.

Many people first learned that pulse oximeters were less accurate for people with darker skin during the Covid-19 pandemic, when the devices became indispensable for determining who might need hospitalisation or supplemental oxygen. But the fact that the devices he’s now trying to help fix didn’t always work well on people with darker skin was no surprise to Monk.

“My mother had a lung condition,” he said. “So I knew they were problematic.”

Stat News article – Before making unbiased pulse oximeters, researchers need a better way to measure skin tone (Open access)

See more from MedicalBrief archives:

Pulse oximeters deliver unreliable readings across ethnic groups

UK investigation into racial and gender bias in medical devices

Pulse oximetry accuracy varies between race groups – US cohort study

A ‘glaring’ lack of darker skin in textbooks and journals
