Authors: Matthew Lungren1, Johanna Kim1, Stephanie Bogdan1, William Lane2, Josh Risley2, Katy Haynes2, Ziad Obermeyer2,3
1 Stanford Center for Artificial Intelligence in Medicine and Imaging
2 Nightingale Open Science
3 University of California, Berkeley
Lead Nightingale analyst: William Lane
Lungren M., Kim J., Bogdan S., Lane W., Risley J., Haynes K., and Obermeyer Z. (2021). Predicting fractures and pain using chest x-rays. Nightingale Open Science Dataset. doi:10.48815/N5RP44
For many older patients, and some younger ones, a fracture marks the beginning of the end. The fracture itself is seldom fatal, but it sets off a downward spiral of pain, decreased mobility, physical deconditioning, debility, and ultimately death. This is why screening for osteoporosis, recommended today for women starting at age 65, is so critical: the appearance of bones on a special type of x-ray (called a DEXA scan) shows us who is at high risk of fractures, and lets us start treatments to prevent them before they happen.
Given the massive costs of fractures to both patients and the health care system (a recent report put the cost at nearly $60 billion for fractures in US Medicare patients alone), it’s clear that our current screening strategies are not adequate. For one thing, despite established guidelines calling for universal screening over age 65, the vast majority of women don’t get it. In addition, many fractures occur in men and younger people, for whom guidelines don’t recommend screening. So it would be very useful to find another way to predict fractures at scale, using routinely available data.
The chest x-ray is, by far, the most commonly performed radiological study in the world, done when patients see their doctor for a cough or for chest or back pain, before surgery, in the ER, on admission to the hospital, and in a variety of other settings. An interesting fact about the ‘chest’ x-ray is that it also captures a very clear view of the spine, from the neck to the upper lumbar area. And the spine is an excellent place to assess the quality and quantity of bone, which may hold signal for predicting future fractures.
This dataset starts with the Stanford Artificial Intelligence in Medical Imaging CheXpert dataset, which contains x-rays from across the Stanford Medicine system: in outpatient clinics, in the ER, or in the hospital. As part of that initial dataset, the chest x-rays were linked to the radiologist’s interpretation of the image, which we also provide here.
But this dataset goes further, adding labels on both health outcomes and patient experiences. First, we link each x-ray to the occurrence of past and future fractures, not just in the spine but all over the body, and to diagnoses of osteopenia and osteoporosis, so that researchers can compare algorithmic predictions to what doctors already know about patient risk. (Because the CheXpert labels are included, researchers can also observe whether the radiologist noted a fracture in the chest x-ray itself.) Second, we link the x-rays to diagnoses of musculoskeletal problems (joints, tendons, pain, etc.), again past and future and all over the body, to test the hypothesis that subtle features of the chest x-ray might yield insights into a range of musculoskeletal issues (as other recent articles have suggested). Finally, we add other relevant data elements describing the patients, including height, weight, and selected vital signs.
A few notes to keep in mind. All labels (on fractures, pain, etc.) will only be present if the patient received care for that condition in some part of the Stanford Medicine system. This creates bias in who is labeled, since some patients who have fractures will not seek care, or will go elsewhere. Note also that many, but not all, x-ray studies contain two orthogonal images: the PA [postero-anterior] view taken from back to front, and the lateral view from the side. (Some patients, particularly those who are too frail or sick to stand up, receive only the AP [antero-posterior] view from front to back, while lying down in bed.) Finally, note that there can be multiple chest x-ray studies per patient, on different days.
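Because one patient can contribute several studies, anyone training a model on this data may want to keep all of a patient's studies on the same side of a train/test split. The following is a minimal sketch of one way to do that, not part of the dataset's own tooling. The identifiers `patient_a`, `study_1`, and so on are hypothetical, and the hash-based assignment and 20% test fraction are assumptions chosen for illustration.

```python
import hashlib


def assign_split(patient_id, test_fraction=0.2):
    """Deterministically map a patient ID to 'train' or 'test'.

    Hashing the patient ID (rather than the study ID) guarantees that
    every study from the same patient receives the same split.
    """
    digest = hashlib.sha256(str(patient_id).encode("utf-8")).hexdigest()
    bucket = (int(digest, 16) % 1000) / 1000  # value in [0, 1)
    return "test" if bucket < test_fraction else "train"


# Hypothetical (patient_id, study_id) pairs: patient_a has two studies.
studies = [
    ("patient_a", "study_1"),
    ("patient_a", "study_2"),
    ("patient_b", "study_3"),
]
splits = {study_id: assign_split(patient_id) for patient_id, study_id in studies}
print(splits)
```

The key design choice is to split on the patient rather than on the study, so that images of the same person never appear in both training and evaluation data.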
The Stanford Artificial Intelligence in Medical Imaging (AIMI) Center supports the development, evaluation and dissemination of new artificial intelligence methods applied across the medical imaging life cycle, in order to solve clinically important problems in medicine using AI. Their mission is to develop and support transformative medical AI applications and the latest in applied computational and biomedical imaging research to advance patient health. Building on their trailblazing work to release imaging datasets like CheXpert, this Nightingale dataset holds the promise to predict future fractures and frailty in patients, which could lead to the creation of tools for triage and diagnosis, and optimize over-burdened hospitals. This dataset was conceived of and created by Dr. Matthew Lungren and Johanna Kim, Co-Directors of the Stanford AIMI Center, as well as Stephanie Bogdan, Project Manager for the Stanford AIMI Center. We are deeply grateful for their help, as well as their inspirational work to make data available as a public good.
v1: Each observation in the dataset corresponds to one of 224,316 chest x-ray studies, from 65,240 unique patients, performed between October 2002 and July 2017. The x-rays were linked to electronic health record data from the Stanford Medicine system using the patient MRN. We queried ICD diagnosis tables to obtain codes on fractures and pain over the year before and after the date of the x-ray, and patient flowsheet data to obtain height, weight, and body temperature.
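The one-year look-back and look-ahead windows can be illustrated with a short sketch. This is not the code used to build the dataset. The function name `label_study`, the `(date, code)` record layout, the 365-day approximation of a year, and the choice to count a same-day diagnosis as "future" are all assumptions made for the example.

```python
from datetime import date, timedelta

# Assumption: "one year" is approximated as 365 days.
WINDOW = timedelta(days=365)


def label_study(xray_date, diagnoses, window=WINDOW):
    """Sort one patient's diagnosis codes into past/future windows around an x-ray.

    diagnoses: iterable of (diagnosis_date, icd9_code) tuples for the same patient.
    Codes dated outside the window on either side are ignored.
    """
    past, future = [], []
    for dx_date, code in diagnoses:
        delta = dx_date - xray_date
        if -window <= delta < timedelta(0):
            past.append(code)
        elif timedelta(0) <= delta <= window:
            # Assumption: a diagnosis on the x-ray date itself counts as future.
            future.append(code)
    return {"past": past, "future": future}


example = label_study(
    date(2010, 6, 1),
    [
        (date(2010, 1, 15), "805.2"),   # within the year before the x-ray
        (date(2010, 9, 30), "820.8"),   # within the year after the x-ray
        (date(2012, 3, 1), "733.00"),   # outside the window, dropped
    ],
)
print(example)  # {'past': ['805.2'], 'future': ['820.8']}
```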
v2 (target release date: March 2022): We will add diagnosis and procedure codes that capture pulmonary deterioration in the short-term after the x-ray was done, as well as the setting of the x-ray (e.g., the ER, inpatient, clinic). This will allow researchers to predict this important outcome, and align this dataset with other Nightingale Open Science datasets that also involve prediction of pulmonary deterioration with chest x-rays.
Dataset construction and key outcome variables are shown in the schematic below. A note on color choices: the burnt sienna (orange) indicates the node that corresponds to the observations (rows) in the dataset, and the grape (purple) indicates key patient outcomes.
We obtained data on ICD-9 codes 800–829 (fractures) over the year before and after the x-ray. In the summary table below, diagnoses were grouped by body region, but individual ICD-9 codes are available in the dataset.
| skull and face | 800, 801, 802, 803, 804 | 1303 | 77.44% | 65.54% | 83.73% |
| spine and ribs | 805, 806, 807, 809 | 4779 | 75.27% | 68.07% | 83.16% |
| pelvis and hip | 808, 820 | 2075 | 66.36% | 57.64% | 76.14% |
| scapula and clavicle | 810, 811 | 748 | 78.34% | 72.33% | 85.29% |
Fracture is also one of the CheXpert labels that the radiologist can comment on in the chest x-ray interpretation, so you will be able to tell whether a particular fracture that shows up as an ICD-9 code (in the electronic health/billing record) was visible and commented on in the x-ray itself.
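The grouping of individual fracture codes into body regions can be sketched as a simple lookup. The mapping below covers only the four regions listed in the table above. The function name `fracture_region`, the assumption that codes are stored as strings such as "805.2", and the catch-all label "other fracture" for remaining codes in the 800–829 range are all illustrative choices, not part of the dataset.

```python
# Three-digit ICD-9 categories for the regions listed in the summary table.
FRACTURE_REGIONS = {
    "skull and face": {800, 801, 802, 803, 804},
    "spine and ribs": {805, 806, 807, 809},
    "pelvis and hip": {808, 820},
    "scapula and clavicle": {810, 811},
}


def fracture_region(icd9_code):
    """Return the body region for an ICD-9 code string, or None if not a fracture.

    Assumption: codes look like '805.2' or '820'; only the part before the
    decimal point (the three-digit category) is used for grouping.
    """
    try:
        category = int(str(icd9_code).split(".")[0])
    except ValueError:
        return None  # non-numeric codes such as E-codes or V-codes
    if not 800 <= category <= 829:
        return None
    for region, categories in FRACTURE_REGIONS.items():
        if category in categories:
            return region
    # Hypothetical catch-all for fracture categories not listed above.
    return "other fracture"


print(fracture_region("805.2"))   # spine and ribs
print(fracture_region("820"))     # pelvis and hip
print(fracture_region("733.00"))  # None
```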
We obtained data on ICD-9 codes 733.00–733.03 for osteoporosis, and 733.09 or 733.90 for osteopenia, based on prior research. Additionally, for osteopenia, we required the text flag accompanying codes 733.09 or 733.90 to mention osteopenia (e.g., in this dataset, some patients had code 733.90 accompanied by a text flag for osteodynia, which would not be included under our definition).
We obtained data on ICD-9 codes 710–739 (musculoskeletal diagnoses), many of which involve pain, over the year before and after the x-ray. Again, in the summary table below we group these by clinical category, but individual ICD-9 codes are available in the dataset.
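The osteoporosis and osteopenia definitions above amount to a small rule, sketched here for illustration. The function name `bone_density_diagnosis`, the string format of the codes, and the case-insensitive substring match on the text flag are assumptions. The actual matching logic used to build the dataset is not described in this document.

```python
OSTEOPOROSIS_CODES = {"733.00", "733.01", "733.02", "733.03"}
OSTEOPENIA_CODES = {"733.09", "733.90"}


def bone_density_diagnosis(icd9_code, text_flag=""):
    """Classify a diagnosis row as 'osteoporosis', 'osteopenia', or None.

    Osteopenia requires both a qualifying code and a text flag that
    mentions osteopenia (assumed here to be a case-insensitive match).
    """
    code = icd9_code.strip()
    if code in OSTEOPOROSIS_CODES:
        return "osteoporosis"
    if code in OSTEOPENIA_CODES and "osteopenia" in text_flag.lower():
        return "osteopenia"
    return None


print(bone_density_diagnosis("733.01"))                # osteoporosis
print(bone_density_diagnosis("733.90", "Osteopenia"))  # osteopenia
print(bone_density_diagnosis("733.90", "Osteodynia"))  # None
```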
| Connective tissue disease | 710 | 988 | 67.91% | 52.02% | 78.04% |
| Other joint problem | 716, 713 | 2973 | 44.37% | 39.86% | 59.87% |
| Intervertebral disc problem | 722 | 4700 | 56.15% | 49.98% | 73.04% |
| Tendon and bursa problem | 726, 727 | 6378 | 35.15% | 35.97% | 52.37% |
| Other disorders of soft tissues | 729 | 15664 | 60.60% | 60.25% | 76.60% |
| Bone and cartilage problem | 733, 732, 731 | 10170 | 58.48% | 52.87% | 73.80% |
| Nonallopathic lesions not elsewhere classified | 739 | 207 | 54.59% | 52.66% | 77.29% |
Finally, we obtained temperature, height, and weight from the flowsheet data collected in the course of medical encounters. We will be adding more of these in