Clinical Reporting: Mass General
Signals Over Noise: Cleaning Up Cancer Trial Data
Clinical trial reporting is the bedrock of evidence-based oncology, yet it's riddled with inefficiencies, mismatched data, and underreported adverse events. In this episode, Dr. Danielle Bitterman of Mass General Brigham joins unNatural Selection to discuss how her team is using AI to clean up the clinical data pipeline—automating the extraction of outcomes, improving the accuracy of adverse event tracking, and scaling trial infrastructure without compromising safety. We explore how artificial intelligence is not just a tool for speed, but a mechanism for survival in a world where better data leads to better treatments.
-
(Auto-generated by Spotify. Errors may exist.)
Doctor Danielle Bitterman is an Assistant Professor at Harvard Medical School who is dedicated to developing and implementing advances in natural language processing and large language models for safe, equitable healthcare.
She is faculty in the Artificial Intelligence and Medicine Program at Mass General Brigham and a radiation oncologist at Brigham and Women's Hospital.
1:22
Her expertise includes language model evaluation and risk monitoring, information extraction from the electronic health records, and translational studies of AI in the clinic.
Doctor Bitterman's research has been published in high-impact venues spanning medicine and computer science, including Nature Medicine, Lancet Digital Health, the Journal of Clinical Oncology, and NeurIPS.
1:43
Danielle, welcome to unNatural Selection.
1:45
Speaker 1
Thanks so much for having me, really glad to be here.
1:48
Speaker 2
I always start with the same level setting question just to hear from your perspective what your field is like.
So could you please let us know what need or impact drives your work and how do you view your role in addressing it?
2:00
Speaker 1
Yeah.
So my research, really ever since I became involved in natural language processing and artificial intelligence, has come from an outstanding need that I see every day in clinic: our cancer patients urgently need new and better treatments for their cancer.
2:26
We've made huge amounts of progress with advancing multimodality treatment.
Cancer survival has improved over the past decades, but there's still so much more we need to do.
But one challenge that slows our ability to figure out what the next best treatment is, both quickly and resource-efficiently, and then get these treatments to patients
2:51
in a cost-effective way, is how long and how resource-intensive it is to collect the data we need for patients who are enrolled on clinical trials.
It's a very manual process.
3:07
It takes a long time.
It requires a lot of people, and there's a lot of redundancy in the process.
And I saw a lot of potential for artificial intelligence to really help facilitate that and advance cancer care that way.
3:22
Unpacking Ambiguity in Clinical Trial Data Reporting
Interesting.
So you're focusing primarily on the clinical reporting problem in the clinical trial setting.
Is the poor quality of clinical reporting a technical problem, a systems issue, or something deeper, like maybe a cultural mismatch between research and care?
3:39
Speaker 1
That's such a good question. I'll focus on what we're laser focused on right now, which is specifically adverse event reporting. That's particularly challenging because on a cancer clinical trial, you generally have to collect every adverse event that occurs in that patient.
3:57
And the potential list of adverse events is very, very long, collected at serial time points over months of their care, sometimes over years of their care.
And so it's both a technical and, as you mentioned, a cultural problem.
Providers are primarily documenting in the medical record for the purpose of providing care to that patient.
4:19
Secondarily, it's a source for outcomes.
So there is often ambiguity in the language used to describe the adverse events.
The timing of it is often ambiguous and the severity of it might be ambiguous.
4:37
So that's kind of cultural, and cultural is maybe the wrong word.
It's kind of just a mismatch of how we collect adverse events for many trials, not for all, but for many.
4:49
Speaker 2
And skills too, right?
I mean, clinicians aren't necessarily trained in research and vice versa.
4:56
Speaker 1
Exactly, right.
And the PI of the study is not the person who's providing the direct care for every single patient on that trial.
And frankly, it's probably not realistic to have every clinician constantly holding the dual priorities of research reporting and clinical care.
5:19
We want our clinicians to have clinical care as their number one priority.
So trying to flip their roles is not, I believe, the solution. But we can support them, and ensure as much as possible that they're documenting in a way that will help the downstream reporting.
5:45
That's somewhere I think we can really make inroads, in that communication of what's happening to the patient.
From the more technical side, studies have shown this, and so have we. The study we are working on is developing methods for immunotherapy-related adverse event extraction.
6:07
This is an NIH-funded study.
We've shown that expert humans disagree up to 50% of the time on what adverse event is even present.
That's partially due to the ambiguity of the language, and partially it's just the reality of language, which is something I've found really fascinating as I've dove into natural language processing.
6:32
Language is inherently ambiguous, and it's very difficult to get two humans to agree on things.
So while we can improve that agreement, we likely won't ever achieve 100% agreement.
6:47
But if we could at least collect it in a consistent way, I believe that would be an improvement in terms of understanding what's really happening to patients at scale.
6:59
How AI Cleans Up Inconsistent Clinical Trial Data
That's interesting.
So when you're talking about the disagreement in language, you're talking about two trained professionals that are looking at the same thing but interpreting it in different ways.
And are we talking about slight variations, where there could be a mapping that says this is a synonym of that?
7:15
Or are we talking about two totally different things?
7:18
Speaker 1
So it's a range.
More commonly it's, I would say, smaller-scale differences, but things that can end up mattering.
Is this patient having a CTCAE grade 2 adverse event, a moderate adverse event, or a grade 3 adverse event, which is severe?
7:38
There's oftentimes room for differences in interpreting those guidelines, in whether an adverse event is moderate or severe.
That might seem minor, but oftentimes clinical trials will report only the rate of grade 3 or greater adverse events.
7:56
So if something that one person would have called a grade 3 ends up being recorded as a grade 2, it might not show up in the final report for that study.
So even though it seems minor in some cases, it has important consequences for the interpretation of the study.
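To make the stakes of that grading example concrete, here is a toy sketch (illustrative only; this is made-up data, not the team's code or the study's results) of how a single grade 2 versus grade 3 disagreement changes a trial's headline rate of severe adverse events:

```python
# Toy illustration: a trial report that counts only CTCAE grade >= 3
# adverse events is sensitive to a single grading disagreement.

def grade3_plus_rate(events):
    """Fraction of adverse events graded 3 or higher (severe on the CTCAE scale)."""
    severe = [e for e in events if e["grade"] >= 3]
    return len(severe) / len(events)

# The same five events graded by two reviewers; they disagree on one.
reviewer_a = [
    {"name": "colitis", "grade": 3},
    {"name": "fatigue", "grade": 1},
    {"name": "rash", "grade": 2},
    {"name": "pneumonitis", "grade": 3},
    {"name": "nausea", "grade": 2},
]
reviewer_b = [dict(e) for e in reviewer_a]
reviewer_b[0]["grade"] = 2  # reviewer B calls the colitis moderate, not severe

print(grade3_plus_rate(reviewer_a))  # 0.4
print(grade3_plus_rate(reviewer_b))  # 0.2
```

One changed grade halves the reported severe-event rate, which is exactly why consistency in grading matters for comparing trial results.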
8:18
Speaker 2
Does that imply also that there might be an issue in the input of records? Because if they disagree on how to interpret some of the symptoms, AI can comb through the data and clean it up and report it.
8:35
But how do we address the issue of maybe some things not being reported at all?
8:41
Speaker 1
Great question.
Yeah.
So just relying on what's currently in there can't catch things that aren't reported.
So that's where I see the future next step of this project, which we haven't gotten to yet, but where we are taking it next is this:
8:57
Once you are able to improve the current standard reporting, what you're referring to is patient-reported outcome reporting, I believe, which is very important and being used in many places, but there are real technical challenges with that too.
Not every patient has easy access, and patients with cancer on clinical trials don't always have the time to do huge amounts of patient-reported outcome reporting.
9:23
But now, with large language models, there potentially opens an avenue where you could look at the records, see what things seem to really be missing, and then in a more targeted and easier-to-use conversational fashion, potentially reach out to those patients or to the clinicians.
9:41
Now, this is where I think we need to sit with patients and clinicians, and we're planning to do this, to understand how they would really want that implemented.
This is an issue with technology: there's a huge amount of potential, but it can also create new burdens, and we don't want to shift a lot of burden onto patients who are enrolled on trials.
9:59
So we need to figure out that balance, and that's going to require real engagement with the people involved in trials, both the patients and the clinicians.
10:05
Speaker 2
Yeah, no.
And I was actually thinking about the clinicians, right.
So if there's disagreement between two experts on symptoms or the nomenclature, is there also potential for one clinician to misinterpret a symptom and not record it properly?
10:20
So with input from one professional, or comparison between professionals, you might be missing some data, because they're not all reporting it the same way, or reporting it at all.
10:30
Speaker 1
Yes, I see what you mean.
Yes, that absolutely happens. I have colleagues who've done studies looking at disagreements on the attribution side: whether the side effect is due to immunotherapy or due to something else.
10:49
And oftentimes there's a lot of disagreement there, and it doesn't end up being accounted for until weeks after the adverse event occurred.
Sometimes it's after all the data has been collected, and then the central site might go back and look for records.
11:09
But that's a bottleneck: you can't have a central site reviewing every single page of records for every single patient. We don't want to totally duplicate the manual reporting process, both at the site and at the main site, or wherever the auditing is being done.
11:39
So yeah, information can drop out at that point as well.
11:43
Speaker 2
Yeah.
And potentially you can use these advanced models to help fill in the gaps, right?
So if you start seeing some patterns downstream, you could potentially prompt the clinician at the next visit to say, hey, by the way, ask this patient if they have a headache, or any number of symptoms that might explain how they went from point A to point C without a B in there.
12:02
Speaker 1
Right, precisely, yes. That's, much more eloquently, what I was referring to: you can prompt and support the clinicians in documenting and collecting enough information. Or, for instance, for this person, maybe you might want to think about ordering a test that could confirm, or at least provide more suggestive evidence for, what's really happening to that patient.
12:23
Speaker 2
And are there types of information that AI consistently misses or misinterprets in oncology records?
Obviously it'll clean up some human elements, but then what are the biggest challenges in cleaning up this data from a technology standpoint?
12:38
Augmenting Research with Transparent AI Systems
So right now we're really focusing on the adverse reactions. The system that we've developed performs well, right now at about 98%.
So it actually is very good at finding the adverse events.
12:55
What we're moving on to now is the grading, how severe it is, which, as I mentioned, is hard for humans.
That's certainly more challenging for models, but it's also more challenging for humans.
So it's hard to know if that's a uniquely technologic issue.
13:10
Sometimes, as you said, there isn't enough information documented in the medical records to really clarify.
But that does require systems that embed clinical knowledge into the LLMs, beyond what generalist LLMs can do.
13:26
This is why we can't just use ChatGPT out of the box to do it.
It doesn't perform well at this, because it's a generalist model.
It hasn't been developed and engineered in a way that aligns with the specifics of how we run cancer clinical trials.
13:45
Speaker 2
How do clinicians actually interact with the AI system?
Is it transparent to them, or is it something that they're actively engaging with?
And I guess a follow-on question to that is: do they trust it yet, or do they still see it as a research tool?
14:02
Speaker 1
So at this stage, we're having the research coordinators interact with it, not the clinicians, because in the research clinic it's the research coordinators who are doing a lot of the hands-on work. And it is fully transparent.
I'm a huge believer in transparency in the use of AI, especially in these early days; we want to ensure trust.
14:25
And I think not being transparent about it is the biggest way you could lose trust early on.
So we have an interface where it's clear, where we tell them that this is an AI-generated report.
14:41
We show where in the records the model is basing its final determinations.
And the goal is not to replace the research coordinators; it's to augment them and allow them to do more with better quality, which will allow us to run more trials and find better treatments for patients at a larger scale.
15:05
So we give the suggestion to the research coordinator and they confirm or adjust it if needed.
15:12
Speaker 2
What are some concrete ways that your team is improving this signal-to-noise ratio in reporting?
Is it generating views in the EHR? Generating PDF reports?
How do your researchers currently interface with this technology?
15:32
Speaker 1
It's implemented within our system, totally behind the MGB firewall. The notes for the patients on a trial are pulled and surfaced in a kind of browser, again totally behind MGB firewalls, where the clinical evidence, usually the note, is shown side by side with the AI system's determination, as well as, as I mentioned, the highlighting of which parts of the data are driving that determination.
16:09
And so the research coordinator can see the source data and the determination side by side, and then make their final decision based on that.
16:23
Speaker 2
Interesting.
I would imagine that in the process of doing this, you're obviously creating some tremendous value out of the data that you're collating and pulling together.
Is there a process also for transforming that into a more structured system that can be used for other research and other purposes in the future?
16:41
Early Successes in AI-Powered Clinical Data Extraction
Oh yeah, absolutely.
So this has a dual use for retrospective research.
Everything we extract on the back end from the clinical notes, the radiology, anything unstructured, the system automatically structures. The way we show it to the research coordinators is informed by a lot of field studies we've done with research coordinators, to figure out what's most useful to them.
17:09
We show them the more visual explanation of where these determinations come from.
But on the back end, it's a fully structured database.
And so this could theoretically be run over historical data to collect the same information.
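As a minimal sketch of what that back-end structuring might look like (the field names here are hypothetical illustrations, not the MGB schema): the system turns a free-text note span into a structured record that stores both the determination and the evidence span shown to the research coordinator, so the same extraction can later be re-run over historical data.

```python
# Sketch of a structured adverse-event record extracted from an
# unstructured note. Field names are hypothetical, for illustration only.
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class AdverseEventRecord:
    patient_id: str            # trial-internal identifier
    event: str                 # e.g. a CTCAE adverse-event term
    grade: Optional[int]       # None when the note doesn't support a grade
    source_span: str           # evidence text surfaced to the coordinator

record = AdverseEventRecord(
    patient_id="P001",
    event="pneumonitis",
    grade=None,  # severity left for the research coordinator to confirm
    source_span="CT chest shows new ground-glass opacities...",
)

# A plain dict is ready for a database row or retrospective analysis.
print(asdict(record))
```

Keeping the evidence span alongside the determination is what enables the side-by-side review interface described above, while the flat records accumulate into the structured database usable for other research.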
17:31
Speaker 2
And how long has this project been running so far?
17:37
Speaker 1
So the project officially started last September; we're in the first year of it.
We've made really a lot of progress in the first year, more than I even anticipated.
I've had a great team working on it.
17:53
So we are currently running a clinical trial of this system itself, like a meta clinical trial.
Specifically, we're using it to see if it can help.
18:09
As I mentioned, the focus of this grant is immunotherapy-related adverse events, and there is a biorepository of immunotherapy-related adverse events being developed.
The goal is to collect blood samples and tissue samples from patients who experience these adverse events, to better understand what puts people at risk for them, and then what might be better treatments for these adverse events.
18:36
But these biorepositories have difficulty identifying patients quickly enough between the time they develop the adverse event and when those samples can be collected.
We want to collect the samples, at least the initial samples, soon after the adverse event, while it's in an active phase.
18:56
So we are looking to see whether running our system improves the time in which we can find these patients and quickly get them registered onto the biobank to get those high-quality samples.
19:11
Speaker 2
The reason I asked how long the project's been running is that I was going to ask if you've had any positive signal so far, any great learnings or results.
But obviously you're less than a year into it, so you may not yet. Have you seen any positive trends?
19:28
Speaker 1
We have.
Before we moved into the clinical trial, we did more retrospective comparisons, and we're still finishing those.
So for those results, maybe stay tuned for the World Medical Innovation Forum; by then we'll have full results.
19:48
But we're doing comparisons of research coordinators alone versus research coordinators who are augmented by our system, using historical patients to simulate what might happen.
And we're seeing that it improves time by at least half, and it greatly reduces the variability between research coordinators.
20:10
And the accuracy of the system, again, as I mentioned, especially for detection, is really excellent.
It's about 98% right now.
20:17
Speaker 2
That's amazing.
Congratulations on those results.
20:20
AI's Vision for Better Trials and Patient Care
So, assuming that this technology works as you envision, how do you see the future of clinical trials in oncology changing once your technology is out there at scale?
What does the future state of the field and outcomes look like if we actually see this in a production setting?
20:44
Speaker 1
Yeah.
So I see this really accelerating the number of trials that different hospitals can run, because with the same number of research coordinators, they can collect information on many more patients in the same amount of time.
21:04
So I see it allowing us to open more clinical trials and get more patients onto clinical trials that way.
I also see it improving, as I mentioned, the quality of the adverse event reporting, by improving consistency.
21:21
So at least when you're comparing the results of two different trials, you can say, OK, at least I know the adverse events for both trials were reported in the same way, as opposed to it being really difficult to know what they reported.
21:39
You know, if the team from trial A had done the adverse event reporting for trial B, would that have changed the results?
So I really see us being able to better put our finger on: these are the benefits of the strategy under investigation.
22:00
These are the risks of the trial under investigation.
Given those two, we can have better shared decision making between clinicians and patients about a new treatment strategy.
22:15
So I see a lot of benefit there, because, speaking as a radiation oncologist, we tend to have a less quantitative understanding of the risks of our different treatments.
And I see a lot of benefit extending beyond the Phase 2-3 clinical trials to Phase 4 pharmacovigilance.
22:39
We could now automatically collect these signals and say, OK, over the long term, here are the potential risks and benefits of a treatment.
And then we can provide more clarity about that for our patients as well, which is only going to become more and more important as more patients survive longer with cancer.
23:01
We're going to have more and more long-term survivors.
23:03
Speaker 2
Yeah.
And you mentioned patients there.
I was going to ask you.
Here's a final question.
How does this transform the patient experience?
I mean, from my vantage point, it looks like clinicians will sound more informed, maybe better coordinated, maybe more insightful on what the patient is experiencing and what their journey looks like.
23:21
Is this all part of the equation of why you see this as transformative, not only from a reporting standpoint, but for the actual experience of the patients going through the system?
23:30
Speaker 1
Absolutely.
And we don't have to totally go into this, but along the same line of interest, we have a project on using AI to help communicate information about clinical trials to patients, and provide a kind of copilot for them on the side, to help them understand.
23:49
So I could see those two coming together in the future, providing information about the potential adverse events so patients can better understand the potential risks of the trial they're consenting to.
So they know what they're getting into, which, in the long term, really empowers patients and allows them to make the decisions that work for them.
24:08
And that ultimately leads to more trust in the healthcare system, and it improves healthcare overall for our patients.
And then once they're on a trial, you could see it as a copilot helping them understand, at different time points, what information is going to be collected on them.
24:25
And what it means if outcome A is found versus outcome B, really allowing them 24/7 support and guidance on that as well.
So that project is just kicking off.
I'm really excited about it as well; there are a lot of ways that those two can synergize.
24:42
Thank You and Next Steps for unNatural Selection
Well, Danielle, thank you so much for being a part of this.
I think your work is tremendous and I congratulate you on the development so far.
Looking forward to hearing more developments as we approach September and I look forward to meeting you at the World Medical Innovation Forum.
24:56
Speaker 1
Thanks so much.
Thank you so much for having me and looking forward to meeting you as well.
