Key words
MR-diffusion/perfusion - neural networks - vascular - staging - diagnostic radiology
Introduction
The basic goal of all contrast agent (CA)-based perfusion measurement methods is to
obtain detailed information about the structure and function of the vascular network
by observing its dynamic response to a defined CA bolus. For this purpose, MR imaging
is an ideal measurement method as both high temporal and spatial resolution may be
achieved using current-generation scanners. If the relationship between contrast agent
concentration and signal intensity is known, the contrast agent concentration can
be calculated for each voxel and timestep. After this has been achieved, tissue models
of different complexity may be fitted to the concentration time curves and the results
used for diagnosis, treatment monitoring, or basic research.
The goal of this review is to explore the potential of novel deep learning-based processing
methods which may be able to capture hitherto unknown perfusion parameters such as
temporal curve shape or spatial enhancement patterns. It may be possible to identify,
similarly to MR fingerprinting in structural MR imaging, perfusion signatures which
carry information about the underlying microvascular architecture. As ample literature
on the technical details of perfusion MRI exists, conventional acquisition and processing
methods are only presented in brief.
In the first section, the influence of the vascular architecture and function on the
CA dynamics is reviewed. This is followed by a brief recapitulation of conventional
T1-weighted and T2*-weighted image acquisition methods and their relative strengths
and weaknesses. In the main section, the potential of deep learning-based perfusion
processing is reviewed and discussed, followed by an overview of current and potential
future clinical applications. The review closes with a look at future research directions.
Background
Vascular networks and blood flow
The architecture and integrity of the tissue microvasculature determine the temporal
and spatial signal dynamics in response to an external CA bolus. It must be emphasized
that, while the signal changes are rapid and of large magnitude, the measured
blood flow and diffusion effects are largely static over the measurement duration:
the changes in contrast agent concentration do not represent the return of a perturbed
system to its equilibrium – the CA bolus is merely the measurement vehicle with which
static effects such as blood flow, contrast agent extravasation, and diffusion are
measured. The relevant effects can be largely classified into two categories: flow
effects that take place inside the vascular system and exchange effects taking place
between intra- and extravascular spaces. As physiological microvascular flow is nearly
always laminar and therefore deterministic, the flow of each measured intravascular
proton is theoretically predetermined by the vascular architecture and its flow patterns
[1]. Due to resolution limits, however, the precise morphology and flow velocity of
each capillary segment cannot be known. Recent models suggest [2] [3] that the capillary
network is organized according to common principles, and that the microarchitecture
can be described by a few organizational parameters [4] [5]. This makes it possible
to simulate realistic microvascular networks relatively easily, which can in turn
be used to explore the influence of these organizational parameters on the CA dynamics
[6]. Concepts of statistical mechanics and graph theory may be of use to explore
this space further [7]. Exchange effects arise due to the permeability of the vessel
walls, allowing the
transport of fluid and solutes, including the CA itself, between blood and the extravascular
space. This exchange is driven by concentration gradients, hydrostatic and oncotic
pressure differentials and, partly, active transport. Tumor growth is associated with
neoangiogenesis [8], with the newly developed vasculature being significantly more fragile and permeable
than the physiological vasculature [9]. Imaging permeability effects requires much
longer measurement times [10], as the exchange effects are at least two orders of
magnitude slower than directed blood flow.
Basic measurement principles
For the acquisition of T1-weighted perfusion imaging, called dynamic contrast-enhanced
(DCE) MRI, dynamic MRI measurements using a heavily T1-weighted MR sequence with sufficient
spatial and temporal resolution are necessary. This is possible using either spin
echo or gradient echo-based stimulation schemes, although 3D gradient echo sequences
have become the de-facto standard. An example of a DCE acquisition can be seen in
[Fig. 1C]. Recently, compressed sensing-based sequences have been introduced, allowing shorter
acquisition times and adaptive readout schemes. The absolute contrast agent concentration
can be calculated from the relative change in signal intensity before and after application
of the CA. For this, the absolute T1 time of the tissue must be known beforehand,
usually by applying a quantitative T1 mapping sequence such as a variable flip angle
(VFA) or modified Look-Locker (MOLLI) sequence. The main advantage of T1w imaging-based
perfusion measurements is that the signal intensity-contrast agent concentration relationship
is only weakly influenced by tissue effects, allowing the modeling of permeability
effects in which contrast agent leaves the vasculature.
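The conversion from signal intensity to contrast agent concentration described above can be sketched as follows. This is a minimal illustration, not a validated implementation: it assumes the steady-state signal equation of a spoiled gradient echo sequence, negligible T2* decay, fast water exchange, and a linear relationship between relaxation rate and concentration. All function and parameter names are hypothetical.

```python
import numpy as np

def spgr_signal(m0, t1, tr, flip_angle_deg):
    """Steady-state spoiled gradient echo signal (T2* decay neglected)."""
    alpha = np.deg2rad(flip_angle_deg)
    e1 = np.exp(-tr / t1)
    return m0 * np.sin(alpha) * (1 - e1) / (1 - np.cos(alpha) * e1)

def spgr_signal_to_concentration(signal, baseline, t10, tr, flip_angle_deg, r1):
    """Convert a signal-time curve to CA concentration.

    signal         : signal intensities over time
    baseline       : mean pre-contrast signal S0
    t10            : pre-contrast T1 from a mapping sequence (s)
    tr             : repetition time (s)
    flip_angle_deg : flip angle (degrees)
    r1             : assumed CA relaxivity (1 / (mM s))
    """
    cos_a = np.cos(np.deg2rad(flip_angle_deg))
    e10 = np.exp(-tr / t10)
    # Rescaled ratio S(t)/S0; the unknown factor M0*sin(alpha) cancels out
    a = (np.asarray(signal, dtype=float) / baseline) * (1 - e10) / (1 - cos_a * e10)
    # Invert the signal equation for E1(t) = exp(-TR / T1(t))
    e1 = (1 - a) / (1 - a * cos_a)
    r1_t = -np.log(e1) / tr
    # Linear relaxivity model: R1(t) = R10 + r1 * C(t)
    return (r1_t - 1.0 / t10) / r1
```

The inversion only works if the baseline T1 is known, which is why a separate T1 mapping step precedes the dynamic acquisition.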
Fig. 1 Comparison of DCE and DSC MRI. A Raw DSC MRI of the human brain using single-shot gradient echo planar imaging at
time point t1 (left) and t10 (right). B Calculated CBF (left) and CBV (right) maps. C Raw DCE MRI of the human brain using a 3D gradient echo TWIST sequence at time point
t1 (left) and t10 (right). D Fitted Tofts model with parameters ktrans (left), vep (right).
For T2*-weighted perfusion imaging, called dynamic susceptibility contrast (DSC) MRI,
a gradient echo sequence is usually applied, most commonly echo planar imaging due
to its uniquely high acquisition speed [11], as shown in [Fig. 1A]. As the vessel
geometry and susceptibility difference between vessel and parenchyma
have a large influence on signal dephasing speed, it is common practice to saturate
the interstitial space in regions of high permeability by applying a small pre-bolus
of contrast agent before starting the dynamic acquisition. This attenuates the interstitial
signal changes during passage of the main bolus, allowing correct quantification of
cerebral blood flow (CBF) and cerebral blood volume (CBV). Still, additional postprocessing
correction of contrast agent extravasation is highly advantageous [12]. The relative
signal intensity of T2-weighted spin echo and T2*-weighted gradient echo sequences
depends on the vessel geometry inside the voxel. This effect is used in vessel
size imaging (VSI) [13] [14] or vessel architecture imaging (VAI) [15] to determine
vessel diameters.
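For DSC data, a common first processing step is to convert the signal drop during bolus passage into a change in the transverse relaxation rate, which is then taken as proportional to the contrast agent concentration. The sketch below assumes a mono-exponential T2* decay, negligible T1 effects, and an unknown proportionality constant, so it yields relative rather than absolute concentrations. The function name and arguments are hypothetical.

```python
import numpy as np

def dsc_signal_to_delta_r2s(signal, te, n_baseline):
    """Convert a DSC signal-time curve to the relaxation rate change (1/s).

    signal     : signal intensities over time
    te         : echo time (s)
    n_baseline : number of pre-bolus time points used to estimate S0
    """
    signal = np.asarray(signal, dtype=float)
    s0 = signal[:n_baseline].mean()
    # S(t) = S0 * exp(-TE * dR2*(t))  =>  dR2*(t) = -ln(S(t) / S0) / TE
    return -np.log(signal / s0) / te
```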
Quantitative analysis
The challenge of quantitative perfusion and permeability modelling is fitting complex
nonlinear models to noisy measurement data while having to correctly describe the
aberrant contrast agent dynamics created by pathological tissue. Therefore, care has
to be taken when using regularization algorithms in order to not correct away important
abnormalities. From a mathematical viewpoint, dynamic contrast agent curves are stochastic
time series with overlying noise. Depending on the specific measurement sequence,
the noise distribution of the signal intensity [16] and contrast agent concentration is not necessarily Gaussian. This has important
implications for noise estimation and significance testing.
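One well-known source of non-Gaussian noise is magnitude reconstruction: if the real and imaginary channels each carry Gaussian noise, the magnitude signal follows a Rician distribution, which biases the mean upward at low signal-to-noise ratio. The following sketch simulates this effect under the simplifying assumption of a single receive channel. The function name is hypothetical.

```python
import numpy as np

def magnitude_signal(true_signal, sigma, n_samples, seed=0):
    """Simulate magnitude MR signal with Gaussian noise in both channels."""
    rng = np.random.default_rng(seed)
    real = true_signal + rng.normal(0.0, sigma, n_samples)
    imag = rng.normal(0.0, sigma, n_samples)
    # The magnitude of a complex Gaussian variable is Rician distributed
    return np.hypot(real, imag)

# At high SNR the mean is close to the true signal; without any signal
# the mean stays clearly above zero (Rayleigh limit of the distribution).
high_snr_mean = magnitude_signal(50.0, 1.0, 200_000).mean()
noise_only_mean = magnitude_signal(0.0, 1.0, 200_000).mean()
```

Because concentration is a nonlinear function of signal intensity, this bias propagates into the concentration curves in a sequence-dependent way.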
Perfusion modeling
The basis of perfusion modeling is the indicator-dilution theorem, which connects
the local dynamics of centrally injected contrast agent with the local tissue blood
flow and blood volume [17]
[18]. These parameters can be determined directly from the concentration time curve without
having to assume an underlying tissue model. In order to account for variations in
the central circulation, the bolus curve either of a large perfusing artery or an
interpolated “local” vessel is used as an arterial input function (AIF) [19]. The
AIF is usually selected either manually or automatically [20] [21]. The AIF is deconvolved
from the voxel CA concentration time curve, resulting in
the tissue residuum function, from which CBF and CBV can be directly calculated. Examples
of the resulting maps are shown in [Fig. 1B]. While the calculation of conventional parameters like CBF, CBV, time to peak (TTP),
and mean transit time (MTT) is well-established and a mainstay of neurooncological
and neurovascular diagnosis, novel processing methods like wavelet-based analysis
[22] [23], Bayesian vascular models [24], and control-point interpolation methods
[25] may be able to capture pathological changes better. In addition, direct inference
from the tissue residuum function may be able to capture important features which
are missed by calculated single parameters. Changes in the microvascular architecture
impact the transit delay of the CA, leading to changes in the residuum function shape.
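The deconvolution step described above can be illustrated with a truncated singular value decomposition, one of several possible regularization schemes. This is a sketch under simplifying assumptions: uniform sampling, no bolus delay correction, and no tissue density or hematocrit scaling, so the outputs are in relative units. The function name and the truncation threshold are hypothetical choices.

```python
import numpy as np

def svd_deconvolution(tissue, aif, dt, rel_threshold=0.2):
    """Estimate the flow-scaled residue function and derived parameters.

    tissue, aif   : concentration-time curves on the same time grid
    dt            : sampling interval (s)
    rel_threshold : singular values below this fraction of the largest
                    one are discarded (assumed value, needs tuning)
    """
    tissue = np.asarray(tissue, dtype=float)
    aif = np.asarray(aif, dtype=float)
    n = len(aif)
    # Lower-triangular Toeplitz matrix: (A @ k)[i] = dt * sum_j aif[i-j] * k[j]
    idx = np.arange(n)
    lag = idx[:, None] - idx[None, :]
    a = dt * np.where(lag >= 0, aif[np.clip(lag, 0, n - 1)], 0.0)
    u, s, vt = np.linalg.svd(a)
    # Invert only the singular values above the truncation threshold
    keep = s > rel_threshold * s.max()
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    k = vt.T @ (s_inv * (u.T @ tissue))  # k(t) = CBF * R(t)
    cbf = k.max()                        # peak of the flow-scaled residue function
    cbv = tissue.sum() / aif.sum()       # ratio of the areas under the curves
    mtt = cbv / cbf                      # central volume theorem
    return k, cbf, cbv, mtt
```

The returned curve `k` is the quantity from which direct inference on the residue function shape would start, rather than from the single parameters derived from it.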
Permeability modeling
For modeling of permeability effects, a dynamic tissue model with fixed exchange coefficients
is assumed. Parameter fitting is usually accomplished by either nonlinear least squares
(LS) fitting or Bayesian optimization [26]
[27]. Since the introduction of the Tofts [28], Brix [29], and Patlak models [30],
increasingly complex models have been proposed. Examples include the 2-compartment
exchange model (2CXM) [31], the compartmental tissue uptake (CTU) model [32], and
the adiabatic approximation to the tissue homogeneity (TH) model [33]. The maps resulting
from a Tofts model fit can be seen in [Fig. 1D]. While research on tissue models
for DCE perfusion has enjoyed constant popularity
among MRI researchers and robust fitting algorithms have been developed, clinical
utilization of the derived parameters remains low. Most diagnostic guidelines still
rely on qualitative descriptors of contrast agent bolus curves.
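As an illustration of parameter fitting for a simple tissue model, the standard Tofts model can be written as dCt/dt = Ktrans * Cp - kep * Ct with kep = Ktrans / ve. Integrating both sides makes the problem linear in Ktrans and kep, so it can be solved by ordinary least squares. Note that this linearized variant is used here only for brevity and is not the nonlinear least squares or Bayesian fitting mentioned above. Function names are hypothetical.

```python
import numpy as np

def cumtrapz0(y, dt):
    """Cumulative trapezoidal integral that starts at zero."""
    y = np.asarray(y, dtype=float)
    out = np.zeros_like(y)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * dt)
    return out

def fit_tofts_linear(c_tissue, c_plasma, dt):
    """Linear least-squares fit of the standard Tofts model.

    Uses Ct(t) = Ktrans * int(Cp) - kep * int(Ct), which is linear in
    the two unknowns. Returns (ktrans, kep, ve) with ve = ktrans / kep.
    """
    c_tissue = np.asarray(c_tissue, dtype=float)
    design = np.column_stack([cumtrapz0(c_plasma, dt), -cumtrapz0(c_tissue, dt)])
    (ktrans, kep), *_ = np.linalg.lstsq(design, c_tissue, rcond=None)
    return ktrans, kep, ktrans / kep
```

With noisy data the integrated tissue curve also carries noise, so the estimates of such a linear fit can be biased; this is one reason why nonlinear or Bayesian approaches are often preferred.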
Deep learning
Deep learning algorithms are essentially complex nonlinear functions which can represent
nearly any underlying distribution, provided that sufficiently varied input data
are available [34] [35]. This essentially implies that most, if not all, conventional
processing steps that
are applied to the raw perfusion data can be learned by a neural network. This raises
the question as to whether it is useful to train a neural network to provide known
perfusion parameters as an output. The ultimate goal of perfusion imaging is to provide
functional information not available from purely morphological imaging and to use
it to improve detection rates, diagnostic accuracy, and outcome prediction. Neural
networks can be trained to directly output this information by training them on data
obtained from medical records or histopathological reports, e. g., tumor grading or
staining density. The prediction of clinical outcome parameters directly based on
raw perfusion data is not an easy task, however, and requires both a large amount
of training data and enough training time to provide sensible results. As an intermediate
step, the output parameters of conventional quantitative processing methods can be
learned instead, essentially teaching a neural network the mathematical transformations
behind the processing pipeline. This is far easier than predicting clinical or outcome
parameters, as the quantitative parameter is usually known for each voxel separately,
and there is a direct (quasi-)deterministic relationship between input and output
variables. Therefore, less training data is required, and training converges faster.
Furthermore, even if no new information is gained compared to conventional algorithms,
implementing the processing in a common neural network may have other advantages,
such as higher processing speed or better interoperability between data acquired by
MR devices from different vendors. The raw data from T1- or T2*-weighted perfusion
imaging take the form of a four-dimensional array – three space dimensions are acquired
for each timestep. Two different classes of neural networks have been employed to
model this spatiotemporal data: convolutional neural networks and recurrent neural
networks.
Convolutional neural networks are derived from fully connected networks and incorporate
hidden layers which perform spatial convolution steps, helping them capture complex
relationships at different resolution scales. For perfusion imaging, the 3D images
captured at each timestep are usually assigned as different input channels. This has
the advantage of easily capturing spatial relationships but requires that the input
data always have the same number of timesteps. In addition, the results are not invariant
under temporal shifts, such as when the acquisition was started earlier or later.
A possible solution is four-dimensional convolutional neural networks with modified
loss functions [36] [37]. Finally, as convolutional neural networks are commonly
used for tissue segmentation, their incorporation into perfusion processing workflows
can help in extracting tissue parameters [38].
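The consequences of treating timesteps as input channels can be made concrete with a toy example. The sketch below uses a single 1x1x1 convolution, that is, a weighted sum over the channel axis, as a stand-in for the first layer of a convolutional network. The array layout and function names are assumptions for illustration only.

```python
import numpy as np

def time_as_channels(volume_series):
    """Arrange a (T, Z, Y, X) series as a batch of one with T channels."""
    return np.asarray(volume_series, dtype=float)[None, ...]

def pointwise_conv(batch, weights):
    """1x1x1 convolution: one weight per channel, i.e. per timestep."""
    return np.einsum("c,bczyx->bzyx", weights, batch)

rng = np.random.default_rng(0)
series = rng.random((10, 2, 3, 3))   # 10 timesteps of a small 3D volume
weights = rng.random(10)             # length is tied to the number of timesteps

out = pointwise_conv(time_as_channels(series), weights)
# Starting the acquisition two frames later changes the result
delayed = np.roll(series, 2, axis=0)
out_delayed = pointwise_conv(time_as_channels(delayed), weights)
```

Each weight is bound to one fixed temporal position, which is why both the number of timesteps and the acquisition start matter for this type of input encoding.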
Recurrent neural networks, on the other hand, are natively designed for sequential
input data: there is a one-to-one relationship between each temporal position in the
input data and a network layer [39]. Each layer, in addition to input and output nodes, consists of several hidden nodes,
with weight matrices shared between different timesteps. This makes them invariant
under time shifts, an important advantage when considering perfusion data. Training
recurrent networks has specific challenges such as vanishing or exploding gradient
problems [39]. Several network architectures were designed to deal with these problems, the most
prominent being long short-term memory (LSTM) nets [40]
[41]. The architecture of an LSTM network designed to learn CBV values from DSC data
is shown in [Fig. 2]. LSTM networks have shown high promise in modeling a wide range
of sequential problems in radiology, such as predicting IDH genotype in gliomas [42],
classifying breast lesions [43], segmenting tissue or organs [44] [45], differentiating
the origins of spinal metastases [46], and, recently, predicting DCE model parameters
[47]. It is also possible to learn the necessary transformations for perfusion modelling
of DSC data, as demonstrated in [Fig. 3] (own work). As can be seen in the right part
of [Fig. 3B], the root mean squared error between the predicted and the conventionally
calculated CBV is very small.
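The forward pass of a stacked LSTM regressor of the kind outlined in [Fig. 2] can be sketched in plain NumPy. This is an illustrative reconstruction, not the implementation used for [Fig. 3]: the weights are random and untrained, training is omitted, and the gate ordering, initialization, and all names are assumptions. In practice a deep learning framework would be used.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm_regressor(n_layers, n_hidden, seed=0):
    """Random (untrained) weights for L stacked LSTM layers and a linear head."""
    rng = np.random.default_rng(seed)
    layers = []
    for layer in range(n_layers):
        n_in = 1 if layer == 0 else n_hidden  # first layer sees scalar c(t_n)
        layers.append({
            "w": rng.normal(0.0, 0.1, (4 * n_hidden, n_in + n_hidden)),
            "b": np.zeros(4 * n_hidden),
        })
    head = {"w": rng.normal(0.0, 0.1, (1, n_hidden)), "b": np.zeros(1)}
    return layers, head

def lstm_regressor_forward(curves, layers, head):
    """Map curves of shape (n_voxels, N) to one output value per voxel."""
    x_seq = np.asarray(curves, dtype=float)[:, :, None]  # (voxels, N, 1)
    for p in layers:
        m = p["b"].size // 4
        h = np.zeros((x_seq.shape[0], m))  # hidden state
        c = np.zeros_like(h)               # cell state
        outputs = []
        for n in range(x_seq.shape[1]):
            # The same weights are applied at every time point of a layer
            z = np.concatenate([x_seq[:, n, :], h], axis=1) @ p["w"].T + p["b"]
            i, f, g, o = np.split(z, 4, axis=1)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
            outputs.append(h)
        x_seq = np.stack(outputs, axis=1)  # (voxels, N, M) feeds the next layer
    # Last output of the last layer goes into the fully connected (M, 1) layer
    return (x_seq[:, -1, :] @ head["w"].T + head["b"])[:, 0]
```

Because the weights are shared across time points, the same parameters can process curves with different numbers of time points, in contrast to the channel-based encoding used with convolutional networks.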
Fig. 2 Recurrent neural network architecture for the prediction of CBV from DSC contrast
agent curves with N time points. The network consists of L layers with N LSTM cells,
each with M hidden features, in each layer. For each voxel separately, the concentration
at each time point c(t_n) is processed by a separate LSTM cell with hidden state h_{l,n}.
The weighting function w_{l,n} is the same inside each layer. The last output of the
last layer is given as the input for a fully connected network (FCN) layer with (M, 1)
nodes. The final output is the trained parameter, in this case the CBV.
Fig. 3 Demonstration of the capabilities of an exemplary LSTM network. A Exemplary
mean squared error (MSE) loss, evaluated on a validation subset, over the training
epochs, showing rapidly decreasing loss when using the Adam optimizer with a learning
rate of 1e-6. B Comparison of the conventionally obtained CBV, as obtained by a Tikhonov-regularized
singular value decomposition (TiSVD, left), and the learned CBV (LSTMpred, middle)
on a test case which was not in the training or validation cohort. The right image
shows the root MSE between the CBV values obtained by the two different methods.
Pseudocolor scale is identical across the methods with a range of [0, 30] ml/min.
Model interpretability and error estimation
In order to compare perfusion parameters from different voxels or measurements, it
is necessary to have an estimation of the parameter errors. This error arises from
two main sources: from intrinsic MRI measurement noise, and from deviations between
real voxel tissue and the chosen tissue model. Only when the error is known is it
possible to correctly assess the magnitude of differences and do significance testing.
There are several methods for error estimation, with the most common model-free being
bootstrapping [48]. This method treats the model algorithm as essentially a black box and considers
the output error after artificially adding noise to the input parameter. For deep
learning, different technique-specific methods have been proposed, such as dropout
methods [49] or neural networks based on Bayesian reasoning [50]
[51].
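The model-free error estimation described above, in which the processing algorithm is treated as a black box and noise is added to its input, can be sketched as follows. The example assumes additive Gaussian noise with a known standard deviation, which, as noted earlier, is not always appropriate for MR data. The function name and arguments are hypothetical.

```python
import numpy as np

def noise_perturbation_error(estimator, curve, sigma, n_repeats=200, seed=0):
    """Estimate mean and standard deviation of a black-box parameter estimate.

    estimator : callable mapping a concentration-time curve to a parameter
    curve     : measured concentration-time curve
    sigma     : assumed standard deviation of the added Gaussian noise
    """
    rng = np.random.default_rng(seed)
    curve = np.asarray(curve, dtype=float)
    # Re-run the unchanged estimator on many noisy copies of the input
    samples = np.array([
        estimator(curve + rng.normal(0.0, sigma, curve.shape))
        for _ in range(n_repeats)
    ])
    return samples.mean(axis=0), samples.std(axis=0, ddof=1)
```

Any estimator can be plugged in, for example a conventional deconvolution or a trained network. The cost is one full evaluation per repetition, which favors fast inference methods.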
When using black-box machine learning algorithms, the outcome predictions must be
taken at face value as the algorithm natively does not provide a reason for its prediction.
Recently, the field of neural network interpretability has made significant advances
in providing measures which help with the interpretation of results [52]. Specifically
for LSTM networks, gradient-based attribution methods [53] or structure modifications
allowing direct variable importance output [54] have been proposed. These can be
used to extract associations between contrast agent
curve shape and outcome parameters [42].
Current and future clinical applicability
The quantitative evaluation of DSC MRI, that is, perfusion modeling, currently has
far more clinical applications than permeability modeling using DCE MRI. CBF and CBV
maps are used in the diagnosis of stroke, glioma, head-and-neck tumors, and sometimes
in cardiac imaging. On the other hand, most clinical applications of DCE MRI
rely on purely qualitative or semiquantitative assessment. In the PI-RADS guidelines
for the diagnosis of prostate cancer, only the presence or absence of early enhancement
is scored since the available evidence for pharmacokinetic modeling is deemed insufficient
[55]. Similarly, breast cancer diagnosis using the BI-RADS criteria also uses a classification
of the signal dynamics into one of three curve types according to rise speed and washout
[56]. This does not imply, however, that there is no evidence for the usefulness of these
models, and a large number of small-scale studies exist [57]. Widespread adoption of quantitative DCE is hindered by several factors such as
insufficient standardization of acquisition and processing. The relatively poor current
performance of quantitative DCE models in distinguishing healthy from malignant tissue
may stem from deficits in the handling of the noise levels inherent in fast T1w imaging
and in cleanly separating perfusion effects such as bolus delay and dispersion from
permeability effects. This may change in the future, however, as deep learning-based
modelling becomes commonplace. The optimal mathematical framework for correctly handling
both intrinsic and extrinsic noise is given by stochastic analysis, in particular
by stochastic differential equations.
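The semiquantitative curve-type classification mentioned above for breast MRI can be illustrated with a small rule-based sketch. The three types are commonly labeled persistent, plateau, and washout. The 10 % tolerance and the choice of the early time point are assumptions made for illustration and would have to be taken from the applicable guideline in practice.

```python
import numpy as np

def classify_enhancement_curve(signal, n_baseline, early_index, tolerance=0.10):
    """Assign a signal-time curve to one of three delayed-phase curve types.

    signal      : signal intensities over time
    n_baseline  : number of pre-contrast time points
    early_index : index of the early post-contrast time point (assumed)
    tolerance   : relative change regarded as a plateau (assumed 10 %)
    """
    signal = np.asarray(signal, dtype=float)
    s0 = signal[:n_baseline].mean()
    early = signal[early_index] - s0   # early enhancement
    late = signal[-1] - s0             # enhancement at the last time point
    if early <= 0:
        return "no enhancement"
    change = (late - early) / early
    if change > tolerance:
        return "persistent"
    if change < -tolerance:
        return "washout"
    return "plateau"
```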
Outlook
Currently, CA-based perfusion MRI is a well-established functional imaging method
with a multitude of appropriate processing methods. While new processing methods continue
to be developed, progress has slowed somewhat in recent years. Machine learning
in general, and deep learning in particular, offer promising new avenues for better
and more reliable processing methods. The inherently stochastic nature of neural networks
represents an ideal fit for modeling the inherently noisy CA dynamics. Due to their
flexibility, these methods may be capable of modelling the complex dynamics inherent
in aberrant blood flow patterns without having to specify a particular model beforehand.
The main challenge in applying deep learning algorithms to perfusion MRI data remains
the necessity of high-quality and plentiful data for training. Not only does the raw
perfusion data need to be acquired under standardized conditions using comparable
sequence parameters, but the trained goal parameters, whether segmentations, clinical
outcome parameters, or conventional perfusion parameters, need to be high-quality
too. Particularly the prediction of clinical outcome parameters is demanding due to
the often highly nonlinear and indirect relationship between perfusion data and final
outcome.
A possible avenue to circumvent the problem of always having to learn the complete
problem set is using physics-informed neural networks, which can learn complex tasks
while preserving physical or heuristic relationships specified as differential equations
[58]. These may be able to directly learn, for example, DCE parameters for a specified
model. An even newer generalization of physics-informed neural networks, universal
neural differential equations, provides an explicit way of doing this [59] [60].
The disadvantage, however, is that the explicit model-free nature of deep learning
is partially lost.
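The idea behind a physics-informed loss can be sketched for the standard Tofts model, whose differential form is dCt/dt = Ktrans * Cp - kep * Ct. The loss combines a data term with a penalty on the residual of this equation. In an actual physics-informed network the time derivative would be obtained by automatic differentiation of the network output; the forward finite difference used here, as well as all names and the weighting factor, are simplifying assumptions.

```python
import numpy as np

def tofts_physics_informed_loss(c_pred, c_measured, c_plasma,
                                ktrans, kep, dt, weight=1.0):
    """Data loss plus penalty on the residual of the Tofts equation.

    c_pred     : predicted tissue concentration curve (e.g. network output)
    c_measured : measured tissue concentration curve
    c_plasma   : plasma concentration curve (arterial input)
    Returns (total_loss, data_loss, physics_loss).
    """
    c_pred = np.asarray(c_pred, dtype=float)
    c_measured = np.asarray(c_measured, dtype=float)
    c_plasma = np.asarray(c_plasma, dtype=float)
    data_loss = np.mean((c_pred - c_measured) ** 2)
    # Forward finite difference as a stand-in for automatic differentiation
    dcdt = (c_pred[1:] - c_pred[:-1]) / dt
    residual = dcdt - (ktrans * c_plasma[:-1] - kep * c_pred[:-1])
    physics_loss = np.mean(residual ** 2)
    return data_loss + weight * physics_loss, data_loss, physics_loss
```

Minimizing such a loss jointly over the network weights and the model parameters ties the prediction to the chosen tissue model, which illustrates why part of the model-free flexibility is given up.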
In conclusion, deep learning-based processing of perfusion MRI data holds high promise
for diagnosis and treatment monitoring in oncology. The novel methods may be uniquely
suited for the inherently noisy time series obtained for each voxel and can learn
almost any sensible parameter. A special focus should be on connecting architectural
modeling and perfusion parameters, as this may allow monitoring of microstructural
changes in the microvascular architecture induced by neoangiogenesis or as treatment
response. Due to the rapid progress in the field, further research is urgently needed.