News
Our experts share the vision of IDSO and the advanced, data-driven approaches to cancer care and research.
Featured Article
What does a data scientist do?
Theoretically, anyone who analyzes data to do science could call themselves a data scientist. But to me, that term also implies the use of computers. So, it’s data plus computers that makes someone a data scientist in my mind.
I also tend to apply a slightly narrower definition: I think of a data scientist as someone who’s concerned with deriving hidden knowledge from data trends and then making predictions based upon them.
To accomplish those goals, you need two things. The first is the ability to handle, organize, standardize, label, test, move and make data amenable to analysis. The second is the ability to make predictions based on that data, and to develop the artificial intelligence (AI) tools needed to analyze it, learn from it and evolve as an organization.
What makes a good data scientist?
You don’t have to be an oncologist to be a good data scientist. In fact, very few data scientists at MD Anderson come from an oncology background. My team consists of everything from astrophysicists to shopping website analysts. But many of their skillsets are completely transferrable, so I am thrilled to have their talents here.
I didn’t even study oncology myself — just pure computer science and molecular biology. I applied data science to drug discovery when I started out in the biotech industry. It wasn’t until I became junior faculty in academia that I started developing the oncology knowledge I use today.
It’s easy to fall into the trap of thinking you’re poised for success in the field of data science, just because you’ve gotten some training on the most recent technologies. But the tools being used today are very different from the tools that will be used two years from now. So, if all you know how to do is push buttons on the latest fad, you’re going to get lost straightaway.
To be a good data scientist, you’ve got to have a good grasp of the fundamentals of math and computer science — and a really solid understanding of the underlying methodologies to identify trends and make predictions. You’ve also got to understand the limitations of any tools you’re using and how to design questions to make sure your experiment is both unbiased and testing the actual hypothesis.
“Bilingual” people who can “speak” both oncology and data science — and take complex biological problems and translate them into computational questions — are what I call “translational” data scientists. That’s what I consider myself. And that’s what I try to help each of my new team members to become, if they’re not one already.
How MD Anderson is harnessing the power of data
As a drug discovery scientist trained in molecular biology, I’ve always been fascinated by the idea of doing things at scale. Bringing all the data together and identifying hidden patterns in it that no one else can see — then using those insights to inform drug discovery efforts — is much more satisfying to me than trying to find the answers to very specific questions. But we need both types of science to advance medicine, of course.
I just love the process of drug discovery. It brings together experts from so many different disciplines, including genomics, physics and chemistry, to name a few. It’s a really complex field.
It’s also exciting to be exploring drug discovery here at MD Anderson, where I get to work with people like Andy Futreal, Ph.D., who is leading initiatives to collect data and profile patients in really meaningful ways, and Tim Heffernan, Ph.D., who is leading our Therapeutics Discovery teams to explore new ideas through experiments that lead to the development of new drugs.
MD Anderson already has so many phenomenal initiatives, like the Patient Mosaic™, that don’t exist anywhere else. All that was lacking was a cohesive way to harness its collective data to effectively drive our decision-making. That’s why I was recruited: to develop a kind of “information superhighway” that sits right in the middle, allowing a continuous feedback loop to keep us all on track.
Although it’s still early, we have already been able to harness knowledge from our rare tumor patient samples and identify potential “Achilles’ heel” genes using our cutting-edge AI methods. We then demonstrated these genes’ importance using the translational biology expertise of our Therapeutics Discovery teams, and we are moving them to the drug discovery stage. This shows how MD Anderson’s unique capabilities enable us to quickly change the way we make advances that benefit patients.
As a co-lead for Computational Modeling for Precision Medicine in our Institute for Data Science in Oncology (IDSO), I’m spearheading that initiative with the Adaptive AI-Augmented Drug Discovery and Development program, “A3D3a.” I contrived the name deliberately so that we could call it “Ada.” It’s a tribute to one of my heroes, Ada Lovelace, the world’s first computer programmer.
The daughter of British aristocrat and Romantic poet Lord Byron, Lovelace worked with inventor Charles Babbage on machines that could make calculations at great scale. Then one day, she said to him, “Why don’t we make a machine that can be programmed to do whatever calculation we want it to?” She wrote the first computer program and with that, the era of computer programming was born.
Why I joined MD Anderson
The collective brain power at MD Anderson is truly unequaled. It simply doesn’t exist anywhere else. Neither does the ability to translate advances into benefits for patients so quickly. That’s why I believe MD Anderson is the only place on the planet where we can do this. But to be successful, our work needs to start and end with the patient. That means:
- coming up with each hypothesis based on the existing patient data
- validating it in an experimental setting relevant to patients
- taking it to the preclinical and clinical trial stages in our own hospital
- bringing it to our patients at the bedside and in the clinic, and
- using the feedback generated by that process to refine any new drug therapy or patient care practices.
For me, personally, that also means developing algorithms to help us learn more about cancer, and uncovering new information that can further refine our decision-making processes. Using AI to inform each and every one of the thousands of decisions our faculty and staff make each day is what I’ve dedicated my career to — and precisely why I joined MD Anderson.
A lot of drugs that get approved to treat cancer today are considered “me, too” drugs. This means they ride the coattails of the ones that came before them. But when you see a brand-new drug that you helped develop enter a Phase I clinical trial to be tested for the first time — and you know that it will soon start directly benefiting patients — it really is the most exciting thing in the world. That’s where I get my buzz every morning, and it’s my favorite part of the job.
It’s too soon for anything I’ve been working on here to be entering Phase I clinical trials. But by applying data science, we’ve already identified several targets for potential therapy development in very rare cancers, such as metastatic uveal melanoma, which is really difficult to treat. And that’s exactly the kind of thing we can only do at MD Anderson, because you need all the unique components of the entire pathway in one spot.
Now, with the help of AI knowledge bases like the CanSAR platform I created, we’ll soon be able to make drug discoveries that even places like MD Anderson couldn’t possibly have made on their own.
Advancing cancer surgery through data science
At MD Anderson, operating room lamps cast a brilliant glow on surgeons, nurses, anesthesiologists and other clinical team members as they work together to treat thousands of patients each year. Now that glow has been cast wider to include data scientists and engineers. Led by Jeff Siewerdsen, Ph.D., these quantitative scientists are regularly suiting up in scrubs to experience first-hand the operating room workflows they’re trying to improve.
Desire to make a positive impact leads to role at MD Anderson
MD Anderson’s Surgical Data Science Program was born from Siewerdsen’s observations over his 25 years as an academic researcher. In that time, he focused on developing new imaging technologies for diagnostic and interventional procedures. While his work produced numerous technologies and algorithms now used in operating rooms, he has strived more recently to conduct his research more closely with clinical teams impacted by the problems he has sought to address.
“Rather than continuing to add new technologies to address unmet clinical needs, I wanted to simplify, integrate and critically evaluate the value of new technologies using data science and systems engineering,” says Siewerdsen.
That opportunity arrived last year when MD Anderson recruited him.
“I was drawn to MD Anderson’s vision, strategic resources, expertise and capacity to bring major positive impact for patients and clinical teams,” recalls Siewerdsen, who was recently named to the National Academy of Inventors 2023 Class of Fellows. “To bring data science and systems engineering approaches to surgery – at MD Anderson’s scale – is a tremendous opportunity to show how these disciplines can make a tangible impact for patients and their clinical teams.”
Enabling surgery advances that benefit patients and clinicians
Drawing inspiration from his research as well as the “Surgineering” education program that he created at his previous institution, Siewerdsen established and leads a focus area within the newly launched Institute for Data Science in Oncology (IDSO). The IDSO Safety, Quality and Access focus area fosters collaboration with surgeons and clinical departments to integrate new technology and drive data science solutions to clinical practice.
One example is the creation of computational tools for operating room scheduling that make more efficient use of operating rooms, increasing patient access and improving clinician wellness by streamlining clinical workflows. Another is the use of machine learning for real-time analysis and prediction to help avoid surgical adverse events. A third involves surgical process modeling to refine workflows and quantitatively evaluate the benefit of emerging technologies before introducing them to the operating room.
“In the years ahead, my goal is not only to help move the needle on safety and quality but also to prove the hypothesis that quantitative scientists integrated with clinical operations are key to realizing major advances in surgery,” says Siewerdsen. “For MD Anderson’s patients, this means that surgery will be more accessible, safer and will use the most cutting-edge technologies to their fullest benefit.”
Getting to know Chief Data Officer Caroline Chung, M.D.
In October 2021, radiation oncologist Caroline Chung, M.D., became MD Anderson’s first-ever chief data officer.
Her charge? To shape MD Anderson’s data strategy and lead its implementation from an operational and cultural perspective.
Chung joined MD Anderson in 2016 from Princess Margaret Cancer Centre in Toronto, Canada, to spearhead the safe and novel uses of MRI to further advance personalized treatment in Radiation Oncology. Under her leadership, the effort has grown into a strategic initiative around Advanced Imaging. Her passion for maximizing the use of high-quality imaging data to improve patient outcomes also led her to co-chair the Tumor Measurement Initiative, MD Anderson’s platform to support standardized, automated, quantitative imaging-based tumor measurement. Most recently, Chung has led the effort to establish an institutional Data Governance and Provenance Office to ensure strong stewardship of our data while maximizing its effective use and accelerating discoveries that serve our mission of ending cancer.
Here, she answers nine questions about her new role, data and artificial intelligence (AI), and transforming personalized medicine for our cancer patients.
What brought you to MD Anderson?
The institution’s people, collaborative environment and tangible passion and dedication toward providing cancer patients with world-class care through innovation, research, education and prevention attracted me to MD Anderson and continue to inspire me every day. It is wonderful to work at an institution that has such strong alignment with my personal goals of delivering patient-centered, data-driven care and achieving the best outcomes possible – all of which we accomplish through data.
What are you most excited about in your new role as MD Anderson’s first chief data officer?
I am most excited about partnering with our patients and everyone across MD Anderson to shape our data-centric approach to ending cancer. We want to responsibly and effectively use the massive amounts of data generated each day to determine the best possible treatments and care paths for each patient, based on their diagnosis and personal goals. I am excited to work with everyone across MD Anderson to build a culture that recognizes the value and importance of ensuring high-quality data to serve our mission to end cancer.
What role does AI play in medicine?
A single cancer patient generates over 2 GB of data each year in imaging and electronic medical records data, so the sheer volume of information can be overwhelming. With more advances and new applications of technology, such as telemedicine, deeper characterization of tumors and -omics research, we’re expecting to see an even larger increase in data.
AI has the potential to help us gain the greatest value and use out of the exponential growth of data and paint a clearer picture of each person’s cancer journey, deliver more personalized care, increase safety of care, and improve patient-centered outcomes.
What is MD Anderson doing to pioneer AI-enabled solutions in medicine?
We need to look at the data in context in order for AI to reach its full potential. At MD Anderson, we are shaping our approach to data around a set of principles that start with the core view of keeping our data in context. I recently highlighted these messages in a perspective piece in Cancer Research. The key concept is metadata: the contextual data about the data. It provides information about who collected the data, what was collected, where, how and even why, which gives context to the data and thereby informs its definition, quality and lineage. This is important because it lets us trust the data and use it appropriately, so that we can have confidence in our data-driven decisions while serving as good stewards of the data.
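To make the idea of keeping data in context concrete, here is a minimal Python sketch of a measurement that travels together with its metadata. It is only an illustration under stated assumptions, not a description of any MD Anderson system: the class names, field names and example values are all hypothetical, chosen to mirror the who, what, where, how and why described above.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Metadata:
    """Context for one measurement. All field names are hypothetical."""
    who: Optional[str] = None    # person or device that recorded the value
    what: Optional[str] = None   # definition of the quantity that was measured
    where: Optional[str] = None  # site or department where it was captured
    how: Optional[str] = None    # method or protocol used
    why: Optional[str] = None    # purpose of the collection


@dataclass
class Observation:
    """A data value (the content) paired with its metadata (the context)."""
    value: float
    unit: str
    metadata: Metadata


def missing_context(obs: Observation) -> List[str]:
    """Return the names of the metadata fields that were left empty."""
    return [f.name for f in fields(obs.metadata)
            if getattr(obs.metadata, f.name) is None]


# Illustrative values only; they do not come from any real record.
scan = Observation(
    value=2.4,
    unit="cm",
    metadata=Metadata(
        who="MRI scanner 3",
        what="longest tumor diameter",
        where="Radiology",
        how="T1-weighted MRI",
    ),
)

print(missing_context(scan))  # ['why']
```

A check like `missing_context` is one simple way a pipeline could flag records whose context is incomplete before they are used in an analysis or a predictive model.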
What does this mean for cancer patients?
By looking at the data in context, we consider the patient as a whole person and record the situation at the moment the data was captured. We can then interpret the information in light of your whole journey up to the present moment and provide the contextual assessment that enables personalized medicine, when we have the ability to look at your:
- clinical story,
- physical exam findings,
- imaging data,
- patient-reported outcomes,
- genomic data,
- tumor profiling data,
- microbiome data and
- biosensor data.
By building a robust flow of data (the content) and metadata (the descriptors that provide the context around the content) and developing predictive models, AI has great potential to help us make profound leaps and bounds toward truly personalized, patient-centric care.
How do a data-driven approach and AI help us discover new treatments and improve diagnoses?
A data-driven approach aims to both ask and answer questions with an open mind based on objective investigation of the data.
In order to do so, we need to put an emphasis on gathering detailed, high-quality and consistent data. When multidisciplinary teams, including your clinical care team as well as researchers and data scientists from across the organization, come together around the data, we can consider new questions and insights that may not have been obvious before. It is a true testament to the power of team data science and a data-driven approach.
How will AI improve cancer treatment advances?
The human body is an intricate connection of systems (e.g., nervous system, gastrointestinal system, endocrine system). AI is enabling us to model much more complex systems and the complicated interaction between systems with greater efficiency. By generating a digital twin, AI could enable us to model different systems and even a patient’s entire body with all the different integrated systems to help inform and predict treatment responses. Using digital twin technology to accelerate our understanding of cancer could lead to the discovery of more effective treatments and guide personalized cancer treatments.
For example, MD Anderson’s Tumor Measurement Initiative focuses on improving the quality of our imaging measurements. Enhanced by technology such as AI, it has the potential to improve our ability to confidently determine tumor response to treatment. This will improve our ability to assess how effective a potential new drug or treatment will be against certain cancers. It may also improve our ability to make more timely adjustments to treatments in clinical care. Using the data and AI, we may see new patterns of treatment response that lead to future discoveries in cancer treatment. For example, the contextual data in combination with AI may reveal that tumors with specific genetic mutations behave a certain way, or that combining cancer treatments with other associated medications may lead to better responses.
AI can also accelerate tasks that would take humans years to complete. One example of a task that has been accelerated with the use of machine learning is the ability to model the shapes of large molecules and how they fold and interact with each other. Speeding up this process with AI will help scientists speed up the development of new drugs or biologicals personally tailored for each patient.
What are you passionate about outside of work?
My fascination with imaging and data is stimulated by my love of art, food and learning new things. I love to paint, cook and read on a vast range of topics. I love sharing the products of my experiences and experiments (I never follow recipes) with those around me. It fascinates me how the brain takes in, interprets and is affected by all our senses – sight, sound, touch, smell and taste.
If you weren’t a doctor, what would you be?
Being in Texas and collaborating with NASA, I might say space cowboy, which is what I dressed up as for Halloween in 2020 when kids came by for socially distanced treats.
At the end of the day, I love what I do and feel blessed to be constantly inspired by my patients — by their strength, their struggles, their hopes and resilience. They are a grounding motivation for all my efforts in clinical care, research, and in my new leadership role.
Related Articles and News
MD Anderson News Releases
- MD Anderson’s Institute for Data Science in Oncology establishes internal advisory council to maximize impact (June 2024)
- MD Anderson’s Institute for Data Science in Oncology announces appointment of inaugural IDSO Affiliates (March 2024)
- MD Anderson, TACC and the Oden Institute announce funding for the next round of collaborative cancer research projects (December 2023)
- Collaboration on Data and Computational Sciences Announces Next Round of Projects to Advance Cancer Breakthroughs (November 2021)
- MD Anderson advances data collaboration through technology agreement with Syntropy (April 2021)
- MD Anderson and UT Austin collaboration to end cancer welcomed enthusiastically by state and federal stakeholders (November 2020)
Articles
- "Cancer needs a robust 'metadata supply chain' to realize the promise of artificial intelligence" (Elsevier Pure)
- "35 chief digital officers of health systems to know | 2022" (Becker's Hospital Review)
- "MD Anderson researchers harness AI to transform cancer care" (NVIDIA)
- "MD Anderson launches $100M data science institute" (Becker's Hospital Review)
- "MD Anderson names first chief data officer" (Modern Healthcare)
- "Realizing the power of data science to advance cancer research and cancer care" (MD Anderson, Cancerwise)