Planta Med 2021; 87(12/13): 964-988
DOI: 10.1055/a-1529-8339
Natural Product Chemistry and Analytical Studies
Reviews

Quality Control of Herbal Medicines: From Traditional Techniques to State-of-the-art Approaches

1   School of Health Sciences, Universidade do Vale do Itajaí – UNIVALI, Itajaí/SC, Brazil
Maira R. de Souza
2   Laboratory of Pharmacognosy and Quality Control of Phytomedicines, Faculty of Pharmacy, Universidade Federal do Rio Grande do Sul-UFRGS, Porto Alegre/RS, Brazil
Johan Viaene
3   Department of Analytical Chemistry, Applied Chemometrics and Molecular Modelling, Center for Pharmaceutical Research (CePhaR), Vrije Universiteit Brussel – VUB, Brussels, Belgium
Tania M. B. Bresolin
1   School of Health Sciences, Universidade do Vale do Itajaí – UNIVALI, Itajaí/SC, Brazil
André L. de Gasper
4   Herbarium Dr. Roberto Miguel Klein, Department of Natural Sciences, Universidade Regional de Blumenau – FURB, Blumenau/SC, Brazil
Amélia T. Henriques
2   Laboratory of Pharmacognosy and Quality Control of Phytomedicines, Faculty of Pharmacy, Universidade Federal do Rio Grande do Sul-UFRGS, Porto Alegre/RS, Brazil
Yvan Vander Heyden
3   Department of Analytical Chemistry, Applied Chemometrics and Molecular Modelling, Center for Pharmaceutical Research (CePhaR), Vrije Universiteit Brussel – VUB, Brussels, Belgium

Supported by: Conselho Nacional de Desenvolvimento Científico e Tecnológico
 

Abstract

Herbal medicines are important options for the treatment of several illnesses. Although their therapeutic applicability has been demonstrated throughout history, several concerns about their safety and efficacy are raised regularly. Quality control of articles of botanical origin, including plant materials, plant extracts, and herbal medicines, remains a challenge. Traditionally, qualitative (e.g., identification and chromatographic profile) and quantitative (e.g., content analyses) markers are applied for this purpose. The compound-oriented approach may stand alone in some cases (e.g., atropine in Atropa belladonna). However, for most plant materials, plant extracts, and herbal medicines, it is not possible to assure quality based only on the content or presence/absence of one (sometimes randomly selected) compound. In this sense, pattern-oriented approaches have been extensively studied, introducing the use of multivariate data analysis on chromatographic/spectroscopic fingerprints. The use of genetic methods for plant material/plant extract authentication has also been proposed. In this study, traditional approaches are reviewed, although the focus is on the applicability of fingerprints for quality control, highlighting the most used approaches, as well as demonstrating their usefulness. The literature review shows that a pattern-oriented approach may be successfully applied to the quality assessment of articles of botanical origin, while also providing directions for a compound-oriented approach and a rational marker selection. These observations indicate that it may be worth considering the inclusion of fingerprints and their data analysis in the regulatory framework for the quality control of herbal medicines, since this is the foundation of the holistic view that these complex products demand.


Abbreviations

AFLP: amplified fragment length polymorphism
ANN: artificial neural networks
APCI: atmospheric pressure chemical ionization
API: active pharmaceutical ingredient
ASEAN: Association of Southeast Asian Nations
BOLD: barcode of life data system
CAD: charged aerosol detector
CE: capillary electrophoresis
COI: cytochrome oxidase
COW: correlation optimized warping
DAD: diode array detector
DoE: design of experiments
DRIFTS: diffuse reflectance infrared Fourier transform spectroscopy
ECD: electron capture detector
EEC: Eurasian Economic Commission
EI: electron impact
ELSD: evaporative light scattering detector
EMA: European Medicines Agency
ER: error rate
ESI: electrospray ionization
EU: European Union
FD: fluorescence detector
FDA: Food and Drug Administration
FID: flame ionization detector
FN: false negative
FP: false positive
FRAP: ferric reducing antioxidant power
GMP: good manufacturing practices
HCA: hierarchical cluster analysis
HM: herbal medicine
ICH: International Conference on Harmonization
iPLS: interval partial least squares
IRCH: International Regulatory Conference on Herbals
ITS: internal transcribed spacer
kNN: k-Nearest Neighbors
LDA: linear discriminant analysis
LLE: liquid-liquid extraction
LOO-CV: leave-one-out cross validation
MAE: microwave-assisted extraction
MCCV: Monte Carlo cross validation
MIR: mid-infrared
MLR: multiple linear regression
MSC: multiplicative scatter correction
NER: non-error rate
NIR: near-infrared
NP: natural product
OPLS: orthogonal projection to latent structures
OPLS-DA: orthogonal projection to latent structures discriminant analysis
OTU: operational taxonomic units
OVAT: one-variable-at-a-time
PC: principal component
PCA: principal component analysis
PCR: principal component regression
PE: plant extract
PLS: partial least squares
PLS-DA: partial least squares discriminant analysis
PM: plant material
PSE: pressurized solvent extraction
QbD: quality by design
QDA: quadratic discriminant analysis
RAPD: random amplified polymorphic DNA
RH: relative humidity
RMSEC: root mean squared error of calibration
RMSECV: root mean squared error of cross-validation
RMSEP: root mean squared error of prediction
RSM: response surface methodology
SCAR: sequence characterized amplified region
SFE: supercritical fluid extraction
SFC: supercritical fluid chromatography
SIMCA: soft independent modeling by class analogy
siPLS: synergy interval partial least squares
SNV: standard normal variate
SPE: solid-phase extraction
SVM: support vector machines
TCM: traditional Chinese medicine
TN: true negative
TP: true positive
UAE: ultrasound-assisted extraction
UVE-PLS: uninformative variable elimination partial least squares
WHO: World Health Organization
 

Introduction

Research into complex NPs, such as botanical extracts, has been facing a decrease in funding from the pharmaceutical industry and some governmental institutions. Even so, secondary metabolites remain a “muse” for the development of therapeutics [1]. HMs and other articles of botanical origin are still relevant as important options for the treatment of several ailments, mainly reinforced by traditional medicine systems, such as the Chinese and Ayurveda [2], [3]. On the other hand, the market of NPs with unproven therapeutic effects also calls attention, especially because of the medical fake news in social media [4]. Independent of the context in which they are used, these products have a chemically complex composition that may act at multiple targets to exert pharmacological and/or toxicological effects [5].

The chemical mixture is a very important feature, but it is also the main challenge of HMs. Most of their therapeutic properties are credited to the synergistic effects between several compounds of this complex mixture. On the one hand, the lack of 1 single chemical entity responsible for the HMʼs pharmacological properties may hamper the appropriate assessment of the productʼs safety and efficacy profile from nonclinical and clinical studies [6], [7]. The assessment problem is mainly because of the challenge of properly evaluating the relevant quality attributes of the product, especially in early study phases, leading to standardization issues. Moreover, the lack of appropriate control of cultivation conditions, the use of wild-collected PM, and the lack of standardization of the processes for obtaining PEs may also lead to variations in the chemical composition, thus impairing the reproducibility of the HMʼs pharmacological and therapeutic effects [5], [8], [9].

On the other hand, for marketed HMs, the complex matrix remains a continuous problem for industries and regulatory agencies, adding challenges to the quality assurance of these products. This leads to one of the major difficulties for the pharmaceutical industry regarding articles of botanical origin: quality control. Initially, quality control was performed based on macroscopic and microscopic characteristics of the parts of the medicinal plants. With advances in the knowledge of chemical aspects of NPs and the development of new analytical techniques, additional tests were included in the pharmacopeial monographs, such as wet-chemical tests. Nowadays, most quality assessment approaches are based on botanical and chemical descriptors for identification tests, including chromatographic profiles [10], along with the quantitative determination of 1 or more marker compounds [11].

The main problem with compound-oriented approaches is that, for most plant species, it is not possible to explain the therapeutic activities based on one or a few compounds. Thus, it is inappropriate to infer efficacy and safety based solely on their contents [12], [13]. This becomes an even bigger problem when considering mixtures of different plants, such as in TCM.

In this sense, methodologies that give a more complete image of the chemical complexity of HMs may play a significant role in quality control. The use of chromatographic or spectroscopic techniques to generate chemical fingerprints of PEs, processed with data analysis techniques, has shown to be a promising strategy to deal with challenges in the quality evaluation of articles of botanical origin [13], [14]. DNA-based methods can generate genomic information, adding some extra tools to the list of multivariate methods for the quality control of these materials [15].

In this review, an overall assessment of the quality control of PM/PE/HM is provided. Traditional approaches are reviewed, although the focus is on the application of multivariate techniques. Two main aspects of pharmacopeial tests are covered: identification (qualitative analysis) and assay (quantitative analysis). Some additional hot topics, such as HMʼs stability and pharmacological assay for quality control, are also covered. For the sake of clarity, in this review, HM will be the standard term for finished products (e.g., tinctures, tablets, pills); “plant material” (PM) will be used for raw materials, such as plant parts; and “plant extracts” (PE), for extracts and other processed APIs of botanical origin (e.g., fractions and enriched extracts).

Current Regulatory Context

The regulatory status of HMs differs from country to country [16], [17], [18], [19], [20], ranging from dietary supplements and other poorly or self-regulated categories to medicinal products under stricter regulations [15], [16], [17]. This implies heterogeneity in quality control requirements around the globe [15], [18], [20]. This situation makes it difficult for consumers to distinguish between products with different quality attributes and efficacy/safety profiles [16]. The lack of appropriate legal control and standardization of quality specifications for HMs in several countries also weakens the confidence of healthcare professionals and favors the emergence of adulterated or poor-quality products, thus increasing the risks for consumers [15], [21]. Moreover, current regulatory requirements regarding the evaluation of HM stability still present several outstanding issues concerning the assessment of relevant quality attributes of those products (See Stability section).

Given the above, the need for stringent and science-based regulatory control of HMs, to protect public health, has motivated the WHO to develop global guidelines, in addition to monographs on selected medicinal plants, to assist its member states in developing their own regulatory framework and to promote the appropriate use of plant species [22], [23]. Furthermore, the current globalization has motivated the search for international harmonization of regulatory policies and quality standards for comparable products, thus providing conditions for mutual acceptance [19], [22]. Successful examples of regional harmonization, such as those of the EU [19], [22], may serve as references for other regional groups of countries to develop their own regulatory networks. On a global scale, however, the development of harmonized regulatory policies with a solid scientific basis remains a major challenge, depending on political engagement [19].

Regardless of the regulatory categories to which HMs belong, the establishment of appropriate quality criteria, covering all stages of the production process, is of utmost importance for their standardization.


Markers

The choice of chemical markers is critical for quality assurance and standardization of PEs/HMs [24], [25]. According to the EMA, “markers” are chemically defined constituents or groups of constituents of a medicinal species that are of interest for quality control purposes [26], including both quantitative and qualitative analysis, and which should be suitable for their intended purpose [27]. Ideally, when chemical constituents with demonstrated clinical activities are present, they need to be monitored both qualitatively and quantitatively [11], [24], [27]. However, for most commercial pharmaceutical products derived from medicinal plant species, clinical trials of good methodological quality involving appropriate characterization of the PE are scarce or nonexistent [11], [28]. This significantly hinders attempts to assign the observed therapeutic activities to specific constituents [11].

For some species, it is possible to select active markers (i.e., constituents or groups of constituents) that are generally accepted to contribute to the overall therapeutic activity of the HM [27]. Most compounds in this category show pharmacological activities that have been demonstrated by in vitro and/or in vivo studies. However, the ensemble of therapeutic properties of the HM cannot be attributed exclusively to them [24]. Unfortunately, for the majority of HMs, active principles or active markers are either unknown or unavailable for routine analysis [24]. In such cases, it is acceptable to select 1 or more analytical markers (i.e., constituents whose contributions to the therapeutic activities of the HM are usually unknown and, therefore, are monitored for analytical purposes only) [11], [24], [27]. In this context, the PMs/PEs as a whole have to be considered as the active ingredients in the HM [11], and the quantitative assessment of analytical markers is carried out with the main objective of allowing the estimation of the active ingredientsʼ content in the finished product (HM) [27]. However, controlling the content of analytical markers within specification limits does not ensure batch-to-batch uniformity [11], [24], [27], which makes it necessary to use complementary or orthogonal methods, such as the monitoring of chromatographic fingerprints to assess other relevant aspects of the HMʼs quality [10].

From a qualitative perspective, the detection of characteristic secondary metabolites, regardless of their biological activities, can be a useful tool for the identification of an article of botanical origin [28]. Ideally, constituents or groups of constituents specific to the species of interest should be selected as identity markers [28], [29]. However, only a few examples of species-specific markers are reported in the literature so far [29]. Therefore, the monitoring of multiple identity markers is often necessary to provide the appropriate degree of discrimination between related species [24], [29]. Furthermore, although the detection of identity markers may provide positive evidence for the presence of the medicinal species of interest, it does not prove the absence of contaminants or adulterants [29]. This indicates the need for complementary methods for the latter purpose [24], [29]. In addition to the above-mentioned marker categories, some authors used the term “negative markers”, referring to substances whose presence is undesirable in the PMs/PEs/HMs because of toxic or allergenic properties [11], [24]. In some cases, these substances are controlled as impurities within a maximum concentration limit [11].

Regardless of the categories of quality markers, their choice must be conducted on a rational basis, and the established acceptance criteria must be technically justified [27], taking into account the circumstances and objectives of each analysis.



Identification: Qualitative Analysis (Traditional Approaches)

Botanical methods: identification challenge

It is estimated that there are more than 390 000 plant species in the world and about 1 100 000 scientific names [30]. More than 28 000 plant species with some medicinal use are registered [31]. The rules for the scientific nomenclature of plant species have changed over time [32], [33], [34], [35]. More recently, nomenclature changes have been proposed as knowledge on phylogenetic relationships among plant species evolves [36].

The difficulty in identifying plants may arise from the wide phenotypic variation. This variation occurs between different populations of the same species [36] or within the individual itself [37]. One of the most notable changes occurs in the leaves since they can be in full sun or shade. Sun and shade leaves can vary in thickness and size [38], [39], which makes the identification of plants, based only on the leaf (i.e., vegetative character) more complex. Ideally, for good identification, plants in the reproductive stage should be collected. With vegetative and reproductive structures, taxonomists can study samples using identification keys and species descriptions that, unfortunately, do not exist for all species. This analysis is usually made with macroscopic characters by the naked eye or with stereoscopic microscopy.

A taxonomic analysis is fundamental, as well as cataloging samples in herbaria [40]. Sometimes, several anatomical characteristics, such as trichome types [41], cork layer [42], or stomata type, can be used to ensure plant identification [43]. Notwithstanding that, in most of these analyses, integral and preferably living samples are recommended. When necessary, anatomical analyses can be done using rehydrated material. In all cases, samples are processed in microtomes or with freehand cuts, fixed on slides, and then analyzed following a standard protocol [44]. Specific stains, such as Astra blue, toluidine blue, safranin, or Lugol, are often used to stain specific cells. However, a morphological analysis may not always ensure accurate identification, and other techniques, such as a chromosome count [45], [46], multivariate analysis of morpho-anatomic data [47], and scanning electron microscopy, can be used to try to identify a species [48]. All these techniques require reagents and time to prepare slides, but they may help to ensure the correct identification of the studied species.

Nevertheless, even these techniques may be insufficient for correct identification, since (commercially available) dry samples may be contaminated or substitutes may have been used. Mistakes or fraud, which can pose a health risk, can be revealed by a DNA-based approach [49].


Chemical methods

The assessment of chemical composition complements the botanical identification in traditional approaches [29]. It not only reflects the identity of the plant species but also assists in the characterization of subspecific variants, such as varieties, cultivars, and chemotypes [29]. The chemical composition also reflects the production processes of PEs and HMs, especially the extraction stages, which play a critical role in the final composition of the material of interest [17], [29]. In addition to being relevant for the authentication of starting material, adequate monitoring of the constituent profile of PMs/PEs/HMs is essential to ensure the consistency of their therapeutic profiles [17], [29], [50].

Sample preparation

Sample preparation is one of the most important steps in the quality control of articles of botanical origin. It is centered on the extraction process and aims to recover the analytes from the plant tissue to make them available for chemical analysis [51]. This is reflected in the statement by Choi and Verpoorte [52], “What you see is what you extract”.

Traditionally, maceration, decoction, and infusion methods are used. However, over the years, the extraction methods were upgraded, adding heat to accelerate and improve the extraction process (e.g., reflux, Soxhlet extraction) [53]. Today, several other sources of energy are used to improve the extraction procedure, such as soundwaves (UAE), microwaves (MAE), and pressure (PSE) [54]. These modern techniques can improve analyte recovery and extraction repeatability, usually reducing time and solvent consumption [53], [55]. Clean-up procedures, such as LLE and SPE, which aim to concentrate markers and to remove interferences, must also be considered [55]. Although this step is traditionally avoided, some special conditions (e.g., low concentration of the analytes) require the use of clean-up methods.

From the studies reviewed (188 papers, Table 1S, Supporting Information), it is observed that about 50% used UAE to recover the analytes from plant tissue ([Fig. 1]). For the sake of clarity, only studies focusing on quality control were included in this section. In fact, in general, UAE has been broadly applied for the extraction of metabolites from plant sources [56]. However, it is known that UAE may cause temperature increases and radical chemical species formation [55]. Therefore, when known labile analytes are targeted, this technique should be avoided, or experimental conditions should be properly controlled.

Fig. 1 Pie chart representing the most frequently used extraction methods, based on the 188 studies considered in this review (Table 1S, Supporting Information). PSE = pressurized solvent extraction; SFE = supercritical fluid extraction; UAE = ultrasound-assisted extraction.

Techniques that use pressure (PSE and SFE) represented 5% of the studies ([Fig. 1]). By controlling factors like temperature and pressure, it is possible to increase extraction efficiency by decreasing solvent viscosity and increasing its diffusivity, leading to an improved solvent passage through the plant material. Extraction-time and solvent-volume reduction are some additional advantages of these approaches [55], [57]. Despite their advantages, PSE and SFE are less popular, probably because of their higher costs.

Regardless of the extraction approaches, water, hydroalcoholic mixtures (water/ethanol or water/methanol), or alcohol (ethanol or methanol) were mostly used as solvents (80%, Table 1S, Supporting Information). These solvents are largely applied for the extraction of NPs, probably because of their polarity, which is in accordance with the polarity of several classes of secondary metabolites [53]. Less polar solvents are usually used when a targeted extraction is aimed, for instance, chlorinated solvents to obtain steroids and terpenoids [53], [58]. Solvent properties may also be modified by acidifiers or alkalizers, specifically for alkaloids extraction [59], [60]. This can also help in clean-up methods. In fact, in this review (Table 1S, Supporting Information), less than 10% of the studies used a clean-up procedure (e.g., LLE or SPE) and in one-third of the latter, they were applied to obtain alkaloid-enriched fractions.

Clean-up is a post-extraction procedure that aims to concentrate the analytes. Two main approaches are used: LLE or SPE. Clean-up is an additional sample preparation step, and for that reason, it is generally avoided. Usually, researchers aim to keep pre-analytical methods as simple as possible and to focus on the analytical steps of the study [52].

It became clear through this review that researchers do not put much effort into sample preparation optimization. Only 7% of the studies (Table 1S, Supporting Information) applied some kind of extraction optimization. This raises an alert that extra attention should be paid to sample preparation. If NPs are not properly extracted, all subsequent analytical efforts can be compromised, resulting in incorrect content estimations or non-representative metabolic profiles. This is a recurrent problem already noticed in other papers [52], [53], [55]. Moreover, less than 50% of the studies that performed optimization used a DoE approach.

DoE is a multivariate approach to simultaneously evaluate the influence of several factors on a given response. In comparison to the OVAT approach, it has the advantage that interaction effects occasionally may also be evaluated. Thanks to a proper distribution of the experiments in DoE, the experimental domain can be properly mapped in a given number of experiments [55]. The most used DoE approach is RSM. It uses a second-order polynomial model to predict the best levels of the evaluated factors that give an acceptable or optimal response [55].
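
The RSM idea described above can be illustrated with a minimal sketch. It assumes two hypothetical coded factors (for instance, solvent fraction and extraction time), a 3 x 3 factorial design, and simulated, noise-free responses; none of these values come from the reviewed studies. A full second-order polynomial is fitted by least squares, and the stationary point of the fitted surface is taken as the candidate optimum.

```python
from itertools import product

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def terms(x1, x2):
    """Terms of a full second-order polynomial in two coded factors."""
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]

def fit_quadratic(points, responses):
    """Least-squares fit via the normal equations (X'X) b = X'y."""
    X = [terms(*p) for p in points]
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * y for row, y in zip(X, responses)) for i in range(k)]
    return solve(XtX, Xty)

def stationary_point(b):
    """Point where both partial derivatives of the fitted surface are zero."""
    _, b1, b2, b11, b22, b12 = b
    return solve([[2 * b11, b12], [b12, 2 * b22]], [-b1, -b2])

# Hypothetical 3 x 3 factorial in coded units (-1, 0, +1).
points = list(product([-1.0, 0.0, 1.0], repeat=2))
# Simulated responses (assumption): a surface peaking at x1 = 0.2, x2 = -0.4.
responses = [80 - 5 * (x1 - 0.2) ** 2 - 3 * (x2 + 0.4) ** 2 for x1, x2 in points]

coeffs = fit_quadratic(points, responses)
optimum = stationary_point(coeffs)
print("coefficients:", [round(c, 3) for c in coeffs])
print("stationary point (coded units):", [round(v, 3) for v in optimum])
```

A stationary point may be a maximum, a minimum, or a saddle point; in this simulated case both quadratic coefficients are negative and the interaction term is essentially zero, so it is a maximum. With real, noisy data, the fitted model would also need to be checked for adequacy before its predictions are trusted.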

A few examples of DoE applications can be found in the reviewed literature. Most used an L9 (3^4) orthogonal array [61], [62] or a Box-Behnken design [59], [63] to find appropriate conditions.
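
As an illustration of what an L9 (3^4) orthogonal array looks like, the sketch below builds one standard construction (9 runs, 4 factors, 3 levels coded 0, 1, 2) and checks its defining property. The construction is a textbook one and is not taken from the cited studies; assigning the four columns to actual extraction factors is left to the experimenter.

```python
from itertools import combinations

def l9_orthogonal_array():
    """Build an L9 (3^4) array: 9 runs, 4 factors, 3 levels (coded 0, 1, 2)."""
    runs = []
    for a in range(3):
        for b in range(3):
            # Columns 3 and 4 are modular combinations of the first two,
            # which keeps every pair of columns balanced.
            runs.append((a, b, (a + b) % 3, (a + 2 * b) % 3))
    return runs

def is_orthogonal(runs, levels=3):
    """True if every pair of columns contains each level pair exactly once."""
    n_cols = len(runs[0])
    for i, j in combinations(range(n_cols), 2):
        pairs = {(run[i], run[j]) for run in runs}
        # 9 runs and 9 distinct pairs means each pair occurs exactly once.
        if len(pairs) != levels ** 2 or len(runs) != levels ** 2:
            return False
    return True

design = l9_orthogonal_array()
for run in design:
    print(run)
print("orthogonal:", is_orthogonal(design))
```

The array covers four three-level factors in 9 runs, whereas the full factorial would require 3^4 = 81 runs. The price of this economy is that interaction effects cannot be estimated separately from the main effects.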

Many analytical studies could benefit from an optimization of the sample preparation. Extraction methods are usually selected based on the researcherʼs experience. Methods chosen in this way may not yield sufficiently suitable information, particularly since no validation is performed. A wide range of methodologies is available, and their applicability to articles of botanical origin should be explored both by the academic world and the industry. In this context, DoE provides a rational approach to properly explore the experimental domain, usually with an economical number of experiments.


Phytochemical screening

Phytochemical tests can be useful to indicate the presence of some adulterants or foreign matters in PM [64]. Some classical methods are based on colorimetric reactions that indicate the presence of certain groups of secondary metabolites, such as alkaloids, steroids, carbohydrates, tannins, flavonoids, saponins, triterpenoids, coumarins, and phenols [65], [66], [67]. Although these tests have been used historically for identification purposes, especially before the popularization of modern chromatographic techniques, most of them show poor specificity and, therefore, the WHO has discouraged their inclusion in pharmacopeial monographs, unless they can provide useful information [10]. Nevertheless, several authors, mainly in developing countries, have proposed the detection of secondary metabolites by classical phytochemical tests when defining quality standards for medicinal plant species [65], [66], [67], [68], [69], [70], [71], [72], [73], [74] (Table 2S, Supporting Information). Among the reviewed phytochemical screening studies, the most frequently used tests were those aimed at detecting the presence of flavonoids and other polyphenols, tannins, alkaloids, saponins, steroids, and triterpenoids. Products of the primary metabolism, such as carbohydrates, amino acids, and proteins, were also evaluated in some studies.


Chromatographic methods

Chromatographic profiles largely reflect the chemical composition of the sample as visualized by the chosen detector. This composition is dependent to a significant degree on the species, plant part, environmental conditions, and extraction method [17], [29]. Accordingly, chromatographic techniques have been widely applied for the identification of articles of botanical origin and the targeted detection of known adulterants [2], [10], [17]. To this end, the methods should ideally be able to distinguish the herbal materials from potential adulterants and substitutes [10], which implies the need to establish a set of markers characterizing the plant species of interest [75]. Generally, appropriate species identification depends on the monitoring of a broad set of constituents, and most chromatographic methods are limited in terms of their ability to accommodate the ensemble of such diverse analytes [75]. Consequently, it is desirable to use a combination of independent methods covering different groups of constituents to achieve the required selectivity level [20], [75].

Among the chromatographic techniques applied to phytochemical profile monitoring, TLC stands out as one of the most popular methods, being included in most pharmacopeial monographs on articles of botanical origin [3], [10], [17], [76]. This technique is known to possess some advantages, such as simplicity of analysis and sample preparation; a relatively low cost; the possibility of processing several samples in parallel; the versatility of experimental conditions, and the possibility of using specific derivatization reagents [3], [17], [77]. However, as a predominantly manual technique, TLC is commonly associated with poor reproducibility, since its performance is strongly influenced by experimental factors (e.g., temperature, humidity, and chamber saturation) [3], [17].

Many of these limitations were overcome with the development of HPTLC, which involves the use of stationary phases with reduced particle size and is usually carried out with equipment that allows for automation of sample application, development, and documentation, resulting in a better standardization of the chromatographic conditions and, therefore, improved reproducibility [3], [17], [77]. Moreover, HPTLC applications are not limited to identification tests since the technique is also suitable for the assessment of batch-to-batch consistency, chromatographic profile monitoring during stability studies, and in-process control throughout product manufacturing [20], [78].

To demonstrate that TLC or HPTLC methods are suitable for their intended purpose, they need to be validated [20], [77], [78]. According to the main guidelines on validation of analytical procedures [79], [80], [81], identification methods such as qualitative TLC and HPTLC can be validated by only confirming their selectivity [22], which may be insufficient, considering the peculiarities of this technique. Consequently, some authors suggested that those methods should be validated on their robustness and reproducibility as well [20], [77]. Forced-degradation studies should also be considered for methods intended to be applied during stability studies [20], [77], [78]. Some interesting approaches for standardization and validation of HPTLC-based analytical methods have been suggested by Koll et al. [20] and Reich et al. [77].

Some examples of TLC and HPTLC applications for the qualitative analysis of PMs/PEs are listed in Table 3S (Supporting Information). The diversity of classes of secondary metabolites identified in these analyses illustrates the versatility of the technique. It is worthwhile mentioning that most of the reviewed studies (70%) did not report any validation or suitability evaluation of the proposed methods.




Fingerprint Approaches

Herbal fingerprints can be developed by a multitude of chemical analysis methods [14]. Some methods can be applied on samples without or after minimal sample preparation. Other methods require either simple or advanced extraction and/or clean-up. The complexity of the sample, the nature of active chemical constituents expected to be present, matrix components that potentially might interfere with the analysis, and the cost-to-benefit ratio all may be taken into consideration in the development of a sample pretreatment that is fit for purpose. Another crucial step is the technique used to develop the chemical fingerprint. Once a qualitative fingerprint is developed, its characteristic information needs to be extracted and translated into numbers that can be compared. In identification tests, the goal is to prove that a fingerprint under study has a sufficient degree of similarity to a specific class (species, origin, or based on other quality criteria), and to show that it differs from related classes. Visual assessment is a crucial tool, but it is of limited benefit when differences between groups are small and when fingerprints contain information that complicates their interpretation. A wide range of multivariate data analysis techniques is available, of which some were developed for specific challenges encountered in herbal fingerprints. Some challenges are artefacts of the methods used in the development of the fingerprints; others are caused by the inherent variability in the qualitative and quantitative composition of PMs/PEs. Sample pretreatment, the chemical analysis technique, and the data processing methods all are major factors in the quality of the fingerprint and its analysis [13]. The fingerprint approach has matured over the last decade from a research topic to a methodology that is slowly finding its way into pharmacopeias and other guidelines.
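
One simple way to translate fingerprints into comparable numbers is a similarity coefficient against a class reference. The sketch below is only an illustration: the fingerprints are invented six-point intensity vectors, the reference is the mean of hypothetical authentic samples, the Pearson correlation is one of several possible similarity measures, and the 0.90 acceptance threshold is an assumption rather than a value from any guideline.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length fingerprints."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def mean_fingerprint(fingerprints):
    """Point-by-point mean of several aligned fingerprints."""
    return [sum(col) / len(fingerprints) for col in zip(*fingerprints)]

# Invented, already aligned fingerprints of authentic samples (toy data).
authentic = [
    [0.1, 0.9, 0.3, 0.0, 0.6, 0.2],
    [0.2, 1.0, 0.3, 0.1, 0.5, 0.2],
    [0.1, 0.8, 0.4, 0.0, 0.6, 0.3],
]
reference = mean_fingerprint(authentic)

THRESHOLD = 0.90  # hypothetical acceptance limit, for illustration only

def matches_reference(fingerprint):
    """True if the fingerprint is sufficiently similar to the class reference."""
    return pearson(fingerprint, reference) >= THRESHOLD

candidate_similar = [0.1, 0.9, 0.35, 0.05, 0.55, 0.2]
candidate_different = [0.7, 0.1, 0.0, 0.9, 0.1, 0.6]
print(matches_reference(candidate_similar), matches_reference(candidate_different))
```

A single coefficient summarizes the whole profile and can hide small but relevant differences, which is one reason why multivariate techniques are used on fingerprint data.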

Botanical methods: DNA-based approach

Commercialized samples may be contaminated, substitutes may have been used, as highlighted by Pretorius et al. [82], or the wrong species may be used when there is nomenclatural confusion, especially when popular names are used (e.g., from TCM). All these mistakes or frauds can lead to health risks [83], [84]. In this context, DNA-based authentication techniques came into use to identify plants more reliably [85]. The most common techniques for species identification, such as RAPD [86], AFLP [87], and SCAR [88], among others [89], [90], [91], [92], [93], [94], search for polymorphisms in DNA by amplifying regions of the genome. These techniques are considered simple and fast, but in many cases, their reproducibility is challenging [85].

An alternative is the DNA barcode technique that uses a small part of DNA to identify species. For plants, several genomic regions (plastidial or nuclear) have been proposed since a single region has not been effective. The Consortium for the Barcode of Life Plant Working Group proposed in 2009 the use of plastid regions rbcL + matK as a barcode for plants [95]. However, other proposals were made, such as the combination of rbcL + trnL-F [96] or ycf1 [97] and the use of ITS or at least the ITS2 region [98], [99], [100].

A 2008 review [85] identified 82 publications that used DNA to identify medicinal plants; the majority of the studied species came from the Chinese pharmacopeia [101], [102], [103], [104]. More recently, a significant increase in the use of DNA barcodes was seen, as demonstrated by Mishra et al. [105], and several adulterations could be detected in the industry [105], [106]. Implementing the technique in the industry is thus considered fundamental [106]. The adulterations point to species that were not listed on the packaging or to the use of substitutes [82].

However, when the sequences are very degraded, and only small fragments of DNA can be obtained, some authors have suggested mini-barcodes [107], [108], which were successfully used to study medicinal plants, such as Ginkgo biloba [109], Serenoa repens [110], and species of the Apiaceae family [111]. Nevertheless, when many samples are mixed, the traditional sequencing technique is not suitable, and a metabarcoding analysis could be employed [112]. In this approach, environmental or highly processed sequences are analyzed and converted into OTUs, which are then compared with databases, such as BOLD. Species threatened with extinction and protected by international agreements were detected in this way in TCM [113]. The technique was also indicated as a complement to detect the presence of given species in HMs [18], [114].

We highlight that DNA techniques generally do not identify chemotypes or samples produced in different seasons [105]. For this purpose, especially in initial bioprospecting studies, the combination of DNA barcode (to ensure the identification of species) and other methodologies, such as chemical analysis, should be used.


Chemical methods: Analytical Techniques

A wide range of analytical techniques can generate fingerprints. This includes spectral techniques (e.g., infrared [IR] spectroscopy, NIR spectroscopy, NMR spectroscopy, and MS), as well as chromatographic techniques (e.g., LC, GC, and HPTLC) [115], [116]. The latter separate the sample compounds before the spectral measurement by given detectors. Occasionally this results in a sample being characterized by a large number of spectra that hold largely the same information as UV or MS data but structured in a more comprehensive way (hyphenated data). In this sense, it is not surprising that chromatographic techniques are prioritized for fingerprint generation, representing 68% of the studies reviewed ([Fig. 2]; Table 4S, Supporting Information).

Fig. 2 The most used analytical techniques for fingerprint development, based on the 102 studies reviewed (Table 4S, Supporting Information). NIR = near-infrared spectroscopy; MIR = mid-infrared spectroscopy; UV/DAD = ultraviolet/diode array detector; ELSD = evaporative light scattering detector.

In this review, more than 80% of the studies used an LC approach, about 11% GC, and less than 5% HPTLC (Table 4S, Supporting Information). LC is the most popular technique, mainly because of its high resolution, selectivity, sensitivity, versatility, and multiple detector possibilities. GC has a more limited applicability because of the requirement that analytes must be (semi-)volatile, which is not the case for several NPs. Finally, HPTLC is the least popular, probably because of its lower resolution [13].

In LC, once compounds are separated on the column, a detector is put in series with the flow path to generate a signal. A wide range of detectors is available, for instance, detectors based on the absorbance of electromagnetic radiation in the UV/Vis region (UV or DAD), on fluorescence (FD), on evaporative light scattering (ELSD), as well as MS. The complexity, comprehensiveness, and usability of the registered information highly depend on the detector and on the way it is used.

When using a DAD, for instance, a 3-dimensional chromatogram is generated, when plotting the absorbance versus wavelength and time. This may contribute to the identification of the analytes of interest by comparing UV/Vis spectra with reference standards, in addition to their retention behavior [117]. Furthermore, DAD allows for peak-purity evaluation, which is useful to point out the occurrence of coelutions [117], [118]. However, both the identification of the analytes and the peak-purity assessment have limitations, especially when analyzing samples containing structurally related constituents with similar UV spectra or in the case of a complete coelution [118]. Even so, about 47% of the studies in this review used a UV/DAD detector (Table 4S, Supporting Information). This may be justified because of its good combination of sensitivity, linearity, versatility, and reliability. In addition, it is one of the most affordable detectors [117].

Another very popular detector for LC fingerprinting is the MS detector. It accounted for more than 50% of the studies (Table 4S, Supporting Information). The advent of ionization techniques, such as ESI and APCI, has significantly broadened the application of HPLC coupled to mass spectrometers (HPLC-MSn or HPLC-MS/MS), enabling the analysis of a wide range of plant secondary metabolites with high selectivity and sensitivity, thus overcoming some of the limitations of DAD [117], [118], [119].

Spectroscopic or spectrometric techniques can also be applied for fingerprint generation. They account for 32% of the qualitative studies reviewed (Table 4S, Supporting Information). The most popular technique was ¹H NMR (18%, Table 4S, Supporting Information). ¹H NMR is a powerful technique that provides exhaustive information on the chemical structure of compounds [120]. ¹H NMR spectra of complex matrices may be very noisy, causing low sensitivity and signal overlap, hampering metabolite identification. However, its popularity may be explained by its sample preparation simplicity, as well as short analysis time [13], [115].

Other popular spectroscopic techniques are NIR and MIR, accounting for about 12% of the qualitative studies (Table 4S, Supporting Information). These methods can give structural information; however, proper identification requires additional methods. In addition, frequent signal overlap may hamper data interpretation. Spectral data may be particularly useful when subjected to chemometrics, for instance, to build classification models. The main advantage of the IR techniques is that sample preparation may not be required (or can be kept to a minimum); in addition, NIR is a nondestructive technique [116], [121], [122], [123].


Chemical methods: Chemometric data analysis

In chemometrics, calculations are applied based on chemical signals. Building a calibration curve model based on a series of standards of a compound to determine its concentration in a sample is a basic and frequently used chemometric approach. The term chemometrics also encompasses techniques to process more complex chemical signals, such as chemical profiles. In contrast to single-signal-based assessments (univariate analysis), multivariate analysis methods consider the entire profiles. They systematically translate, for example, similarity and dissimilarity between the profiles or fingerprints into numbers. The approaches can be divided into unsupervised and supervised methods. Unsupervised methods are used for data exploration or pattern identification and apply only the information from the fingerprints (i.e., the multivariate signal). Supervised methods, on the other hand, link these patterns to additional information, for instance, class information or biological activity, to build prediction models, either for qualification or for quantification purposes [124].

Data structure

Chemometric methods all start with a data matrix. For simple 2-dimensional fingerprints, the data matrix X (n × p) is a table that lists the consecutively measured signals (1 to p) for each sample (1 to n), where p is the number of data points measured for each sample and n the number of samples. When hyphenated techniques are used, a third dimension is required to represent the spectral information acquired, resulting in a matrix X (n × p × q), where q is the number of spectral points measured for each sample. This 3-dimensional data matrix is a cube of data, where the numbers in a layer belong to 1 sample.

In practice, however, the 3-dimensional character of the data is often simplified by extracting a table of features. For instance, from 3-dimensional LC-MS data, often a table of signals above a user-defined threshold value is extracted and represented with the samples as rows and m/z-retention time pairs as columns. A benefit of this data extraction is that techniques for 2-dimensional tables can be applied, which are much more abundant than methods to process 3-dimensional data sets. Additional benefits are that methods for 2-dimensional data require less computation power and memory and are usually less time-consuming than those for 3-dimensional data. Different methods exist to process the information in these data matrices, including similarity analysis, exploratory data analysis, and classification methods. Methods to process 3- and multi-dimensional data sets also exist but are outside the scope of this review; interested readers are referred to [124], [125].
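As a rough illustration of this feature-extraction step, the sketch below unfolds a small synthetic (n × p × q) array into an (n × m) feature table. The function name `extract_feature_table`, the toy data, and the rule that a position is kept when it exceeds the threshold in at least one sample are assumptions made for illustration; array indices stand in for actual retention times and m/z values.

```python
import numpy as np

def extract_feature_table(cube, threshold):
    """Unfold an (n x p x q) data cube into an (n x m) feature table.

    Assumed rule: a (time, m/z) position becomes a feature when its
    signal exceeds `threshold` in at least one sample.
    """
    n, p, q = cube.shape
    unfolded = cube.reshape(n, p * q)              # one row per sample
    keep = (unfolded > threshold).any(axis=0)      # columns with a signal above threshold
    time_idx, mz_idx = np.unravel_index(np.flatnonzero(keep), (p, q))
    features = list(zip(time_idx.tolist(), mz_idx.tolist()))
    return unfolded[:, keep], features

# Hypothetical toy cube: 3 samples, 4 time points, 5 m/z channels.
cube = np.zeros((3, 4, 5))
cube[0, 1, 2] = 120.0   # sample 0: signal at time index 1, m/z index 2
cube[1, 1, 2] = 95.0    # sample 1: same feature, lower intensity
cube[2, 3, 0] = 300.0   # sample 2: signal at time index 3, m/z index 0
cube[2, 0, 4] = 8.0     # below the threshold, so not kept as a feature

table, features = extract_feature_table(cube, threshold=10.0)
print(features)   # (time index, m/z index) pairs used as columns
print(table)      # samples as rows, retained features as columns
```

The resulting table has the samples-as-rows layout that the 2-dimensional methods discussed in the following sections expect.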


Calibration sets and test sets

When building and validating a model using a modeling (regression or classification) approach, the modeling is applied to the data of a calibration set. This allows establishing a calibration model, in its simplest form, a linear calibration curve, based on the signals and concentrations of a set of standards. Application of the model to a second set of samples (validation or test set) with known concentrations allows comparing the predicted and known concentrations, thus determining the accuracy (predictive properties) of the model. The evaluation of the agreement between known and predicted concentrations is a crucial step in the validation of the model.
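The univariate case can be sketched in a few lines. The concentrations and signals below are invented numbers used only to show the workflow (fit on standards, predict a test set, compare with the known values); the error measure shown, a root mean square error of prediction, is one common choice and not necessarily the one used in the studies reviewed.

```python
import numpy as np

# Hypothetical calibration standards: concentration (e.g., µg/mL) and detector signal.
conc_cal = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
signal_cal = np.array([10.2, 19.8, 40.5, 79.9, 160.1])

# Linear calibration curve: signal = slope * concentration + intercept.
slope, intercept = np.polyfit(conc_cal, signal_cal, deg=1)

def predict_concentration(signal):
    """Invert the calibration curve to estimate a concentration from a signal."""
    return (signal - intercept) / slope

# Hypothetical test set with known concentrations, not used to build the model.
conc_test = np.array([3.0, 6.0, 12.0])
signal_test = np.array([30.1, 60.4, 119.5])

conc_pred = predict_concentration(signal_test)
rmsep = float(np.sqrt(np.mean((conc_pred - conc_test) ** 2)))
print(conc_pred, rmsep)
```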

The strategy to establish and validate multivariate models is rather similar. Samples with known properties (geographical origin, botanical species/genus, given activities, or other qualitative/quantitative properties of interest) are used to establish a model that can distinguish the classes or quantify the property. Models of different complexities are built. They differ in their fit to the calibration samples. The predictive properties of the models can be evaluated based on a cross-validation method. In this latter methodology, a (selection of) sample(s) is excluded from the data set, and the model is built with the remaining samples, while it is applied to the left-out sample(s) to evaluate the predictive properties of the model. Repeating the procedure until all samples have been left out and predicted once provides information on the prediction error of the model. The best model is the one with the best compromise between model fit (for calibration data) and predictive properties from cross-validation samples. Once the optimal model is determined, it is applied to the test set (if available), resulting in predictions for the test set samples. These are then compared to the known values to assess the predictive quality of the developed model [126].
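A minimal sketch of this strategy is given below, assuming principal component regression as the model type, so that "complexity" is the number of components, and leave-one-out as the cross-validation scheme. Both are illustrative choices rather than the approach of any particular study, and the data are synthetic; the function names are hypothetical.

```python
import numpy as np

def pcr_predict(X_cal, y_cal, X_new, n_pc):
    """Principal component regression: fit on calibration data, predict X_new."""
    x_mean, y_mean = X_cal.mean(axis=0), y_cal.mean()
    Xc = X_cal - x_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_pc].T                                # loadings, shape (p, n_pc)
    T = Xc @ P                                     # calibration scores
    b, *_ = np.linalg.lstsq(T, y_cal - y_mean, rcond=None)
    return (X_new - x_mean) @ P @ b + y_mean

def rmsecv(X, y, n_pc):
    """Leave-one-out cross-validation error for a model with n_pc components."""
    idx = np.arange(len(y))
    residuals = [pcr_predict(X[idx != i], y[idx != i], X[i:i + 1], n_pc)[0] - y[i]
                 for i in idx]
    return float(np.sqrt(np.mean(np.square(residuals))))

# Synthetic fingerprints: two latent sources of variation plus measurement noise.
rng = np.random.default_rng(0)
n, p = 20, 30
latent = rng.normal(size=(n, 2)) * np.array([3.0, 1.0])
X = latent @ rng.normal(size=(2, p)) + 0.05 * rng.normal(size=(n, p))
y = latent[:, 0] + 2.0 * latent[:, 1]

# Calibration set for model building, test set kept aside for the final check.
X_cal, y_cal, X_test, y_test = X[:15], y[:15], X[15:], y[15:]

cv_errors = {k: rmsecv(X_cal, y_cal, k) for k in (1, 2, 3)}
best_k = min(cv_errors, key=cv_errors.get)

y_pred = pcr_predict(X_cal, y_cal, X_test, best_k)
rmsep = float(np.sqrt(np.mean((y_pred - y_test) ** 2)))
print(cv_errors, best_k, rmsep)
```

Because the synthetic property depends on two latent sources, a one-component model underfits, which shows up as a larger cross-validation error than for the two-component model.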

Ideally, each class should be well-represented in roughly equivalent numbers to build reliable chemometric models that can be applied to new samples with a sufficient degree of certainty. Nevertheless, studies with more restricted sample sizes may also have a value in a study context, occasionally resulting in local models. In the latter case, caution is required when applying the models to new samples. When a given source of variability (for instance seasonal) is not modeled, the model needs to be updated first to include the new source of variability before application on such new samples. The minimum sample size for each set is difficult to define, but models can reasonably expect to improve with an increasing number of represented samples. Minimal sample size will also depend on the complexity of the classes to be described; higher variability demands a higher sample size.


Data pretreatment

Very few techniques to develop profiles or fingerprints register only the information in which an analyst is interested (i.e., only on the compounds present in the sample). Examples of artefacts or experimental error affecting the profile are the change in absorbance due to the change in percentage organic modifier or variability in retention times between samples or injections. Moreover, many multivariate profiles (for instance, NIR spectra) contain physicochemical information from different sources. For instance, a spectrum is determined by the chemical properties of the compounds in the sample but also by the physical aspects of the sample (e.g., particle size or shape) or by compounds added during processing (e.g., excipients). The measured signal is thus a combined response originating from a (very) complex mixture. In this profile, only a limited amount of information is related to the property of interest, while much information is from other sources.

Data pretreatment is performed to deal with this data complexity. It may help to minimize the influence of unwanted variability and to maximize that of the information linked to the property of interest [126], [127], [128], [129]. Blank or baseline correction may be required to maintain the information specific to a sample, for instance by filtering out the changes due to the mobile phase gradient; examples can be found in [130]. Compound peaks are prone to variability in retention time, which affects proper data analysis or model building. Since most multivariate methods, discussed in the following sections, are based on matrix calculations, a prior alignment of the peaks across the sample set can significantly improve the quality and simplicity of the model created [131], [132], [133], [134], [135]. These techniques are called warping methods, with COW being rather popular. COW works by cutting the profile into segments of a certain length (the segment length parameter is to be optimized) and allowing these to stretch or compress by a number of time points (the slack parameter is also to be optimized). A representative chromatogram, occasionally an average or median chromatogram, is used as a template to which the others are aligned. Color maps of the matrices of correlation coefficients C (n × n) between the profiles before and after warping are often drawn to evaluate the effectiveness of the warping procedure [136]. Examples can be found in Table 4S (Supporting Information), and a representation of the warping effect on a set of chromatograms is shown in [Fig. 3]. Some other pretreatment methods are of general use, for instance, mean centering or normalization, both of which can be applied per row or per column. Other data pretreatments include baseline subtraction, calculation of derivatives (first or second), or direct orthogonal signal correction.
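Some of these pretreatments can be sketched on synthetic profiles as follows. Note that the alignment shown is a plain rigid shift chosen by maximum correlation with the template; it is a much simpler stand-in and not COW, which stretches and compresses segments. All function names and data are hypothetical.

```python
import numpy as np

def column_center(X):
    """Mean centering per column: subtract each variable's mean across samples."""
    return X - X.mean(axis=0)

def row_normalize(X):
    """Normalization per row: scale each fingerprint to unit total absolute signal."""
    return X / np.abs(X).sum(axis=1, keepdims=True)

def first_derivative(X):
    """Finite-difference first derivative along the time (or wavelength) axis."""
    return np.diff(X, axis=1)

def align_by_shift(profile, template, max_shift):
    """Rigid alignment (NOT COW): integer shift that maximizes the correlation
    with the template. np.roll wraps around, which is harmless here because
    the profile edges are at baseline."""
    best_shift, best_r = 0, -np.inf
    for s in range(-max_shift, max_shift + 1):
        r = np.corrcoef(np.roll(profile, s), template)[0, 1]
        if r > best_r:
            best_shift, best_r = s, r
    return np.roll(profile, best_shift), best_shift

t = np.arange(100)
template = np.exp(-0.5 * ((t - 50) / 3.0) ** 2)        # reference peak at point 50
shifted = 1.5 * np.exp(-0.5 * ((t - 54) / 3.0) ** 2)   # same peak, 4 points later, more intense

r_before = np.corrcoef(shifted, template)[0, 1]
aligned, shift = align_by_shift(shifted, template, max_shift=10)
r_after = np.corrcoef(aligned, template)[0, 1]

X = np.vstack([template, aligned])
X_norm = row_normalize(X)            # removes the intensity difference between the rows
X_centered = column_center(X_norm)   # each variable now has zero mean across samples
X_deriv = first_derivative(X)        # one point shorter than the original profiles
print(shift, r_before, r_after)
```

Comparing the correlation before and after alignment is the same idea, on a single pair of profiles, as the color maps of correlation coefficients drawn before and after warping.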

Fig. 3 Fingerprints and color maps of their correlation coefficients (at the bottom) before (a) and after (b) warping. Source: Reproduced with permission from [136].

Meaningless parts of the profiles may be excluded, for instance, the first part of a chromatogram that can be affected largely by the solvent peak [137]. Some methods select fragments of a chromatogram, related to a certain property, using variable-selection algorithms.

A trial-and-error approach is often applied to determine the suitable data pretreatment for a given set of fingerprints and a given type of analysis. This makes it impossible to develop a standard approach that suits all situations. Usually, different pretreatment approaches are tested when solving a given problem. Their outcome on the final solution is compared. It may also be recommended to compare to the solution obtained without data pretreatment (i.e., on the raw data) to verify whether pretreatment is necessary.


Similarity analysis

Similarity analysis calculates a parameter that expresses the similarity between pairs of fingerprints. In the studies considered in this review, 19% applied a similarity analysis step in their chemometric analysis. Two types of parameters are applied (i.e., correlation- or distance-based). The more similar a pair of fingerprints is, the higher the correlation parameter and the lower the distance. When considering more than 2 samples, the similarity parameters can be calculated between each pair of fingerprints and visually represented in color maps (similar to those shown in [Fig. 3]) of the matrices C (n × n) or D (n × n), where C is the correlation-parameter and D the distance-parameter matrix. The numbers in C or D can be used to calculate warning and control limits, against which the correlation and distance values of new samples are assessed [138]. Statistically, for similar samples, 95% of the correlation values are contained within the interval bordered by the lower and upper warning limits, and 99.7% within the lower and upper control limits. Since a low correlation indicates dissimilarity, for correlation parameters, the lower limits are used to find samples with atypical behavior compared to the reference set. A similar approach applies to the distance values, where samples more distant than the upper limits derived from the reference samples reveal atypical behavior. Both correlation-based and distance-based parameters have been used in this context to study the similarity of LC-UV fingerprints for green tea samples [138]. For 2 different datasets, it was possible to detect outlying or dissimilar samples based on the dissimilarity of their fingerprints when compared to those obtained from genuine samples.
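The procedure can be sketched as follows on synthetic fingerprints. Several details are assumptions made for illustration and are not taken from [138]: the limits are set at the mean ± 2 and ± 3 standard deviations of the pairwise reference values, and a new sample is summarized by its mean correlation with, and mean Euclidean distance to, the reference fingerprints.

```python
import numpy as np

def similarity_matrices(X):
    """Correlation matrix C (n x n) and Euclidean distance matrix D (n x n)."""
    C = np.corrcoef(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return C, D

def pairwise_values(M):
    """Values above the diagonal of a symmetric matrix (each pair once)."""
    return M[np.triu_indices_from(M, k=1)]

t = np.arange(200)

def peak(center, height, width=4.0):
    return height * np.exp(-0.5 * ((t - center) / width) ** 2)

# Hypothetical reference set: 12 noisy replicates of a genuine three-peak profile.
rng = np.random.default_rng(1)
genuine = peak(40, 1.0) + peak(90, 0.6) + peak(150, 0.8)
X_ref = genuine + 0.02 * rng.normal(size=(12, t.size))

C, D = similarity_matrices(X_ref)
r_ref, d_ref = pairwise_values(C), pairwise_values(D)

# Assumed limits: lower limits for correlation, upper limits for distance.
r_warning = r_ref.mean() - 2 * r_ref.std(ddof=1)
r_control = r_ref.mean() - 3 * r_ref.std(ddof=1)
d_warning = d_ref.mean() + 2 * d_ref.std(ddof=1)
d_control = d_ref.mean() + 3 * d_ref.std(ddof=1)

def assess(x_new):
    """Compare a new fingerprint with the reference set (assumed summary: means)."""
    r_new = float(np.mean([np.corrcoef(x_new, x)[0, 1] for x in X_ref]))
    d_new = float(np.mean(np.linalg.norm(X_ref - x_new, axis=1)))
    return {"r": r_new, "d": d_new,
            "warning": bool(r_new < r_warning or d_new > d_warning),
            "atypical": bool(r_new < r_control or d_new > d_control)}

new_genuine = genuine + 0.02 * rng.normal(size=t.size)
new_suspect = peak(40, 1.0) + peak(120, 0.9) + 0.02 * rng.normal(size=t.size)
print(assess(new_genuine))
print(assess(new_suspect))
```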


Exploratory data analysis

Unsupervised exploratory data analysis methods [139], [140], [141], [142] have in common that only the X matrix is used in the calculations. This is the essential difference between unsupervised and supervised data analysis methods. The purpose of exploratory techniques is similar to that of a histogram or a boxplot in univariate data analysis (i.e., data visualization and outlier identification). In supervised methods, a (measured) property is modeled as a function of the profile, which is similar to building a univariate calibration model to estimate the concentration of a compound of interest. Both unsupervised (exploratory) and supervised models are built with samples with known properties (concentration, geographic origin, pharmacological activity, or other), before being applied to unknown samples, as is also done when a calibration curve is built.

Exploratory methods, such as PCA, are not suitable for classical classification purposes (i.e., the prediction of a class for a given sample). They can indicate (dis)similarity; however, it is not a predictive response. They are sometimes called unsupervised classification methods. Since it is not possible to estimate the error, it is not possible to validate exploratory methods. Therefore, these analyses are less suitable for classification or quantification purposes. The representativity of the results obtained from the exploratory analysis depends also on the sample size. It means that small-sized sample sets probably will not reflect the entire population in sample sets with much diversity.


Principal component analysis (PCA)

PCA is an unsupervised exploratory technique, which does not use class information in its algorithm. It is also often applied as an unsupervised classification approach. In PCA, the data dimensionality of X is reduced while maintaining most information on the data structure [139], [140], [141], [142]. From the original variables, a set of new variables (latent variables called PCs) is derived that in decreasing order capture the largest (remaining) variability found in the original variables. The samples are then visually represented as dots based on their projection scores, usually, on the first 2 or 3 PCs (e.g., PC1 – PC2 or PC1 – PC3 score plot). Samples with high similarity are expected to cluster in the score plot, while dissimilar samples are expected to be more distant.

Another output from PCA is the indication of which variables provide similar and dissimilar information. This is visualized in loadings plots. Variables with similar behavior are arranged in the same direction from the center, while variables with opposite behavior tend to be situated on opposite sides across the center of the plot. The distance from the center can be used as a measure for the importance (weight) of a given variable [143]. The greater the distance, the more important the variable (to distinguish among samples).

Differences in scales of the variables can be reduced by the application of an appropriate data pretreatment before the application of the PCA algorithm. Both score and loadings plots can be combined in a so-called biplot. PCA is widely applied as a visualization tool in herbal fingerprinting (Table 4S, Supporting Information), mainly in the early stages to screen for outliers. About 85% of the studies reviewed used PCA.
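A compact sketch of PCA on a column-centered matrix is given below, computed here through a singular value decomposition, which is one of several equivalent ways to obtain the PCs. The data are synthetic (two groups of five samples described by six variables), and the function name `pca` is an illustrative choice.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA of a column-centered data matrix via singular value decomposition."""
    Xc = X - X.mean(axis=0)                          # mean centering per variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates on the PCs
    loadings = Vt[:n_components].T                   # variable contributions to the PCs
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, explained

# Synthetic fingerprints: group A is high in variables 0-2, group B in variables 3-5.
rng = np.random.default_rng(2)
group_a = np.array([5.0, 5.0, 5.0, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=(5, 6))
group_b = np.array([0.0, 0.0, 0.0, 5.0, 5.0, 5.0]) + 0.1 * rng.normal(size=(5, 6))
X = np.vstack([group_a, group_b])

scores, loadings, explained = pca(X, n_components=2)
pc1 = scores[:, 0]
print(pc1)             # the two groups fall on opposite sides of PC1
print(loadings[:, 0])  # variables 0-2 and 3-5 have loadings of opposite sign
print(explained)       # fraction of the total variance captured per PC
```

Plotting the first two columns of `scores` against each other gives the score plot, and the corresponding columns of `loadings` give the loadings plot.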


Hierarchical cluster analysis

Another type of exploratory analysis is HCA [141], [144]. It is a less popular technique than PCA; 29% of the studies summarized in Table 4S (Supporting Information) used HCA. Distance or correlation parameters can be used to sequentially group samples in clusters. The algorithm can start by considering all samples as individual groups and sequentially grouping them until all samples are grouped. Alternatively, all samples initially are considered as one group, which then is split into several subgroups. The results are represented in tree-like structures, called dendrograms. HCA can be used as an alternative technique for PCA to visualize the structure of a data set. Applications of HCA are shown in Table 4S (Supporting Information).
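The agglomerative variant can be sketched in plain Python as below. Euclidean distance and average linkage are assumed here as the distance and grouping criteria (other choices exist), the function name is hypothetical, and the six two-variable "fingerprints" are invented. The list of merges with their linkage distances is the information a dendrogram displays.

```python
from itertools import combinations
from math import dist

def agglomerative_clustering(X, n_clusters):
    """Agglomerative HCA with Euclidean distance and average linkage (assumed choices).

    Starts with every sample as its own cluster and repeatedly merges the two
    closest clusters until `n_clusters` remain.
    """
    clusters = [[i] for i in range(len(X))]
    merges = []   # (cluster a, cluster b, linkage distance) in merge order

    def linkage(a, b):
        # Average of all pairwise distances between members of a and b.
        return sum(dist(X[i], X[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]))
        merges.append((clusters[a], clusters[b], linkage(clusters[a], clusters[b])))
        clusters[a] = clusters[a] + clusters[b]   # a < b, so deleting b keeps a valid
        del clusters[b]
    return clusters, merges

# Hypothetical data: samples 0-2 and samples 3-5 form two obvious groups.
X = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
clusters, merges = agglomerative_clustering(X, n_clusters=2)
print(clusters)
for a, b, height in merges:
    print(a, b, round(height, 3))
```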


Discrimination and classification modeling

After an initial exploratory data analysis, discrimination and classification methods are often used as identification tools [141], [142], [145], [146], [147]. They allow determining whether or not an unknown sample belongs to a given class. In discrimination and classification modeling, a mathematical model is built between fingerprint information in the X (n × p) matrix and property of the samples, collected in a y vector (n × 1), where n is the number of samples and p the number of variables measured for each sample. When the property y is categorical–such as botanical species or geographic origin–discrimination or classification models allow distinguishing between the classes. The model is developed using a set of samples (calibration set) with known class information. The models predict the class of a sample based on its information in the X matrix. A key difference between classification and discrimination techniques is that with a discrimination model, a sample is obligatorily assigned to one of the modeled classes, while with classification models, the possibility exists to assign the sample to none, one, or more than one of the modeled classes. Model quality is assessed from the percentage of correct and incorrect class assignments. This can be done per class by calculating the number of assigned members (TPs), assigned nonmembers (FPs), not assigned members (FNs), and not assigned nonmembers (TNs). Several parameters [145] exist in which these are combined and calculated as per [Table 1]. These parameters can also be determined in the cross-validation and external validation steps.

Table 1 Model parameters per class i and across all classes j.

Parameter | Per class i | Overall
Sensitivity | TPi/(TPi + FNi) |
Specificity | TNi/(TNi + FPi) |
Precision | TPi/(TPi + FPi) |
Nonerror rate (NER) | (Sensitivityi + Specificityi)/2 |
Error rate (ER) | 1 − NER | 1 − NERoverall
Unassigned samples* | Unassignedi/Totali |
Accuracy | |

* Calculated for classification methods only; ** Also known as correct classification rate
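The per-class parameters of [Table 1] can be computed from the known and predicted classes as sketched below. The function name and the example labels are hypothetical, and only the per-class formulas given in the table are implemented.

```python
def class_metrics(y_true, y_pred, label):
    """Per-class model parameters for class `label`, following the Table 1 formulas."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)   # assigned members
    fn = sum(t == label and p != label for t, p in pairs)   # not assigned members
    fp = sum(t != label and p == label for t, p in pairs)   # assigned nonmembers
    tn = sum(t != label and p != label for t, p in pairs)   # not assigned nonmembers
    # Note: the ratios are undefined when a denominator is zero
    # (e.g., a class to which no sample was assigned).
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    ner = (sensitivity + specificity) / 2
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "NER": ner, "ER": 1 - ner}

# Hypothetical known and predicted classes for 10 samples.
y_true = ["A", "A", "A", "A", "B", "B", "B", "B", "B", "B"]
y_pred = ["A", "A", "A", "B", "B", "B", "B", "B", "B", "A"]

metrics_a = class_metrics(y_true, y_pred, "A")
print(metrics_a)
```

For class A in this example, TP = 3, FN = 1, FP = 1, and TN = 5, giving a sensitivity of 0.75, a specificity of 5/6, and a precision of 0.75.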

A large number of techniques [141], [146], [147] is available to build discrimination and classification models, including LDA and QDA (2%, Table 4S, Supporting Information), SVM (3%, Table 4S, Supporting Information), PLS-DA and OPLS-DA (39%, Table 4S, Supporting Information), SIMCA (4%, Table 4S, Supporting Information), ANN (5%, Table 4S, Supporting Information), kNN (3%, Table 4S, Supporting Information), and others. The majority of these techniques are based on the definition of latent variables (e.g., discriminant functions, support vectors, PLS components, or PCs). The models are defined by combining the X and y data to distinguish the known classes (e.g., different species) in a data set. Each technique separates the classes in a different way, and the success rates differ accordingly. Once the model is developed based on a calibration set, samples of an independent test set can be projected in the latent variable space to determine their predicted class and to validate model performance. If the model is reliable, it can be applied to real unknowns to estimate their class.
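As one simple instance of this workflow, the sketch below builds a two-class linear discriminant (Fisher's LDA) on a small calibration set and projects test samples onto the resulting discriminant function. LDA is chosen only because it is compact; the function names, the two-variable data, and the midpoint decision threshold are assumptions made for illustration.

```python
import numpy as np

def fit_lda(X, y):
    """Two-class Fisher LDA: discriminant direction w and decision threshold."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix (must be invertible: more samples than variables).
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    threshold = w @ (m0 + m1) / 2        # midpoint between the projected class means
    return w, threshold

def predict_lda(X, w, threshold):
    """Project samples onto the discriminant function and assign class 0 or 1."""
    return (X @ w > threshold).astype(int)

# Hypothetical calibration set: two variables (e.g., two peak areas), two classes.
X_cal = np.array([[1.0, 1.2], [1.3, 0.9], [0.8, 1.1], [1.1, 0.8],
                  [4.0, 3.1], [4.2, 2.8], [3.8, 3.2], [4.1, 2.9]])
y_cal = np.array([0, 0, 0, 0, 1, 1, 1, 1])

w, threshold = fit_lda(X_cal, y_cal)

# Independent test set with known classes, used only to check the predictions.
X_test = np.array([[1.2, 1.0], [3.9, 3.0]])
y_test = np.array([0, 1])
y_pred = predict_lda(X_test, w, threshold)
print(y_pred, (y_pred == y_test).mean())
```

A discrimination model of this kind always assigns a sample to one of the two modeled classes, even when the sample resembles neither.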

kNN is a somewhat different technique: the Euclidean distances between the unknown sample and a set of samples of known classes are calculated, and the unknown sample is assigned to the class most represented among its k nearest neighbors. ANN works differently again, in a way loosely modeled on how a human brain works. The (input) variables from the X matrix are linked to a series of nodes by giving weights to the variables of the multivariate profile. An output function determines whether the information is transferred to the next layer of nodes (occasionally hidden layers) or not. The weights in the network are initially set (often randomly) to determine the output of the network (forward propagation), in this case, the class of a sample. The class prediction can be right or wrong, and this information is internalized by the network through a process called backpropagation (i.e., learning from the error made in the prediction to adjust the weights in the network). Forward and back propagation are iterated until the weights are optimized in such a way that the model predictions are sufficiently correct. The best performing technique for a given data set often has to be determined via a trial-and-error approach. However, proper calibration and test sets are very important, particularly for complex models (such as SVM and ANN), to avoid overfitting.
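The kNN rule can be written in a few lines of plain Python. The sketch below covers only kNN, not ANN; the function name, the class labels, and the two-variable data are hypothetical.

```python
from collections import Counter
from math import dist

def knn_predict(X_cal, y_cal, x_new, k=3):
    """Assign x_new to the class most represented among its k nearest
    calibration samples (Euclidean distance)."""
    neighbors = sorted(zip(X_cal, y_cal), key=lambda pair: dist(pair[0], x_new))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical calibration samples of known class.
X_cal = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.0), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
y_cal = ["species_A", "species_A", "species_A", "species_B", "species_B", "species_B"]

pred_1 = knn_predict(X_cal, y_cal, (1.1, 1.0), k=3)
pred_2 = knn_predict(X_cal, y_cal, (4.9, 5.1), k=3)
print(pred_1, pred_2)
```

The value of k is a tuning parameter and, like the complexity of other models, can be selected by cross-validation on the calibration set.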

Applications of exploratory and classification techniques on data obtained with different analytical methods are summarized in Table 4S (Supporting Information). For instance, to study the chemical variation between 6 batches of total flavones of sea buckthorn, HPLC-UV fingerprints were developed. The areas of 40 peaks were measured to build matrix X. PCA allowed observing the separation between the 6 batches ([Fig. 4 a]); PLS-DA clustered the batches in 4 classes ([Fig. 4 b]). In this study, instead of a simple visual inspection of the PLS-DA loading plot ([Fig. 4 c]), HCA was applied to the loading matrix to detect the peaks responsible for the classification obtained by PLS-DA ([Fig. 4 d]). Using this approach, quercetin, kaempferol, isorhamnetin, oleanolic acid, and ursolic acid, in addition to some unknown compounds, were found mainly responsible for PLS-DA classification [148]. However, the data set and the number of representative samples per class are very small. Therefore, one may wonder whether the observed classification will be maintained when a large data set with more samples per class is used and whether the same responsible variables will be indicated.

Fig. 4 PCA score plot (a), PLS-DA score plot (b), PLS-DA loading plot (c), and HCA dendrogram of the loading matrix of PLS-DA (d), built by the analysis of 40 peak areas obtained by HPLC-UV analysis of total flavones of sea buckthorn. Source: Reproduced with permission from [148].

Discrimination/classification methods can also be used to discriminate between, for instance, different species, treatments, and ages. This is also an evaluation of the chemical variation between the different populations/classes. In this sense, Lee et al. [149] used a combined ¹H NMR fingerprint and PLS-DA approach to discriminate different ginseng species. The authors evaluated American (Panax quinquefolius) and Asian ginseng (Panax ginseng), consisting of samples from 2 different regions. From PLS-DA score plots, it was observed that the technique properly classified the ginseng samples ([Fig. 5 a]). The loading plot ([Fig. 5 b]) analysis allowed the identification of discriminant metabolites, which included some amino acids (e.g., glutamine, leucine, alanine) and carbohydrates (e.g., sucrose, fructose, glucose). Some metabolites were proposed as biomarkers to differentiate between the ginseng species. Again, the data set and the number of samples per class are very small to draw general conclusions.

Fig. 5 PLS-DA score plot (a) and loading plot (b) built from ¹H NMR fingerprint obtained from different samples of American ginseng (Panax quinquefolius) and Asian ginseng (Panax ginseng), consisting of samples from 2 Chinese regions. GB = American ginseng; MS = Asian ginseng from Fushun, China; TH = Asian ginseng from Tonghua, China. Source: Reproduced with permission from [149].



Assay: Quantitative Analysis (Traditional approaches)

The quantitation of one or more markers or active constituents has been an approach traditionally adopted in the quality control of articles of botanical origin. It is also included in most monographs on medicinal plants in official compendia [10], [11]. The analytical methods used for this purpose have, however, evolved, with the gradual replacement of less specific methods, such as titrimetric or spectrophotometric, with chromatographic methods, which are more appropriate for the characterization of the chemical composition of complex matrices [11]. In recent years, appropriate analytical method development approaches, preferably using QbD tools, such as DoE for method optimization, have been encouraged by regulatory authorities [150], [151]. In addition to the analytical step, the extraction procedure also plays a critical role in quantitative analyses, since it affects analyte recovery (see Sample Preparation Section).

Among the papers reviewed, 69 focused primarily on the development and validation of analytical methods for the quantitative determination of secondary metabolites. Of these, 5 involved the application of traditional analytical methods based on nonchromatographic techniques, such as spectrophotometry; 15 were related to HPTLC methods, 45 to LC methods (HPLC or UHPLC), and 4 to GC methods ([Fig. 6]; Tables 5S–7S, Supporting Information).

Fig. 6 Plot representing the most used analytical methods in the quantitative analysis, based on 69 studies (Tables 5S–7S, Supporting Information). FID = flame ionization detector; DAD = diode array detector; ELSD = evaporative light scattering detector; FD = fluorescence detector; CAD = charged aerosol detector.

Spectrophotometry

UV spectrophotometry-based methods are often used for the quantitative determination of groups of constituents, usually belonging to the same secondary metabolites class [3], [11], [24]. The most common approach is to select a single marker compound as standard, while the contents of the other compounds are expressed in terms of this reference substance [152]. An important drawback of this approach is that differences in specific absorbance coefficients of the target compounds (e.g., between heterosides and their corresponding aglycones) are disregarded, possibly leading to inaccurate results [24], [152]. Moreover, spectrophotometric quantification of a certain group of constituents ignores its relative composition, which compromises the achievement of clinically meaningful results since different compounds are unlikely to present identical pharmacological properties [24]. Furthermore, those methods often lack appropriate selectivity and are not stability-indicating [3], [24]. Based on the above, the EDQM has recommended their replacement by chromatographic methods for the quantitative analysis of targeted compounds [11]. Similarly, the WHO established that assay methods included in pharmacopeial monographs on HMs should be validated and based on stability-indicating chromatographic techniques [10].
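The effect of differing specific absorbances can be illustrated with a short calculation. The following Python sketch assumes the Beer-Lambert law expressed with specific absorbance, A(1%, 1 cm), and uses hypothetical values for a glycoside and its aglycone; none of the numbers or names come from the reviewed studies.

```python
def content_as_reference(absorbance, specific_absorbance_ref, path_cm=1.0):
    """Concentration (g/100 mL) computed as if all absorbance came from the reference.

    Beer-Lambert with specific absorbance: A = A(1%, 1 cm) * c * l,
    where c is in g/100 mL and l in cm.
    """
    return absorbance / (specific_absorbance_ref * path_cm)


# Hypothetical values, for illustration only.
A_SPEC_AGLYCONE = 500.0   # reference substance used for the calculation
A_SPEC_GLYCOSIDE = 300.0  # compound actually present in the sample

true_conc = 0.002  # g/100 mL of glycoside in the test solution
measured_absorbance = A_SPEC_GLYCOSIDE * true_conc * 1.0

reported_conc = content_as_reference(measured_absorbance, A_SPEC_AGLYCONE)
relative_bias = (reported_conc - true_conc) / true_conc

print(f"reported: {reported_conc:.4f} g/100 mL, bias: {relative_bias:.0%}")
```

Under these assumed values, expressing the glycoside as the aglycone underestimates its content by 40%, because the ratio of the two specific absorbances (300/500) enters the result directly.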

For HMs whose pharmacological activities are associated with antioxidant properties, spectrophotometric methods can be used to assess the antioxidant activity or the total phenol content [153]. Examples of UV-spectrophotometry-based methods applied to the quantitative analysis of PMs/PEs are listed in Table 8S (Supporting Information). However, only 1 paper [154] reported a validation study according to ICH guidelines. Although most reported methods follow traditionally used approaches, it is important to note that this does not make validation redundant. PMs/PEs/HMs present complex matrices with different individual characteristics, and, therefore, the adequacy (fit for purpose) of the analytical methods must be demonstrated on a case-by-case basis.


Chromatographic methods

HPLC and UHPLC

HPLC is one of the most widely used techniques in the quality control of PMs/PEs and HMs [3], [17], [118]. Its advantages include the ability to provide high resolution of a wide variety of analytes, as well as its versatility, owing to the range of stationary phases available and the possibility of coupling to different detector types (DAD, FD, ELSD, MS, and CAD) [3], [17]. These detectors provide different levels of selectivity and sensitivity [17], [117].

DAD and MS detectors were previously addressed (See Analytical Techniques Section). FD provides higher levels of sensitivity and selectivity than UV or DAD, which is an advantage in analyses of complex matrices [117], [155]. However, the application of this detector is limited since most analytes of plant origin are devoid of natural fluorescence [117]. ELSD is a more affordable alternative to MS for the determination of analytes with little or no absorption in the UV-Vis region, such as terpenes, saponins, and some alkaloids [117]. Unlike DAD, ELSD provides a mass-dependent response that is not influenced by the physicochemical or spectral properties of the analytes, thus allowing their quantitative determination even if specific standards are not available [117]. Nevertheless, the ELSD response is influenced by the mobile-phase composition and does not fit directly into a linear correlation model, thus requiring a suitable mathematical transformation [117].
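One transformation often applied to ELSD data is a log-log linearization of a power-law response, area = a · mass^b. The sketch below assumes this model and uses synthetic calibration data; the function names and all values are hypothetical and are not taken from [117].

```python
import math


def fit_loglog(masses, areas):
    """Least-squares fit of log10(area) = log10(a) + b * log10(mass)."""
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(a) for a in areas]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    log_a = y_mean - b * x_mean
    return 10 ** log_a, b


def predict_mass(area, a, b):
    """Invert area = a * mass**b to estimate the injected mass."""
    return (area / a) ** (1.0 / b)


# Synthetic calibration data generated from area = 2.0 * mass**1.5 (hypothetical).
cal_masses = [0.5, 1.0, 2.0, 4.0, 8.0]
cal_areas = [2.0 * m ** 1.5 for m in cal_masses]

a_hat, b_hat = fit_loglog(cal_masses, cal_areas)
unknown_mass = predict_mass(2.0 * 3.0 ** 1.5, a_hat, b_hat)
```

Because the exponent b depends on the mobile-phase composition, such a calibration would in practice need to be established under the same chromatographic conditions as the samples.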

The use of CAD in LC is relatively new [117], [156]. This detector is derived from ELSD and shares many of its characteristics [117], such as its universal character and the possibility of applying a universal response factor in quantitative analysis, as well as the limitations regarding the mobile phase and the nonlinear responses [156]. The list of detectors mentioned in this paper is not exhaustive. A review of the LC detectors applied to the analysis of NPs can be found in [117].

In the LC-based methods reviewed in this paper, the DAD was most frequently used. It is reported in 80% of the studies, followed by mass spectrometers, which were used in 38% of the studies. DAD-MS hyphenated detection was also observed (13 studies, 29%), especially in cases involving the quantification of several markers and the simultaneous collection of qualitative information. The use of ELSD was reported in 3 studies (7%), mainly for the quantification of substances with poor UV-Vis absorption, such as diterpenoid alkaloids, baccharane glycosides, and phytosterols. ELSD and DAD results were compared [157] for phytosterol quantification. ELSD was found suitable for stigmastanol detection, unlike DAD, and provided more sensitive results regarding the other analytes. FDs were reported in 2 papers (4%) and were more sensitive than DAD for aloe-emodin quantification [158]. CAD was reported in 1 study, which aimed at determining the content of phenolic compounds [156] (Table 5S, Supporting Information and [Fig. 6]).

In addition to advances in detection systems, significant improvements in the efficiency of LC have also occurred in recent decades, thanks to the development of equipment capable of operating under higher pressures (UHPLC) and stationary phases with reduced particle sizes (below 2 µm). This provided a considerable reduction in analysis time [17], [117]. Of the studies reviewed, 36 (80%) used classic HPLC, while 9 (20%) involved UHPLC-based methods ([Fig. 6]). Despite its advantages over traditional HPLC, UHPLC still has a relatively high cost, which, along with the fact it is a more recent technology, may explain the lower number of studies reviewed.

Most analytical methods in the reviewed studies were validated according to the general recommendations of the ICH Q2 guideline [79]. AOAC, FDA, and ANVISA guidelines [80], [81], [159] were also mentioned in some studies. It is worthwhile mentioning that selectivity is quite important for analytical methods applied to plant matrices and is recommended by the above-mentioned guidelines. However, a significant number of the reviewed studies did not report an adequate assessment of this property. Few studies evaluated the possible interference of matrix constituents on the analytical response. The assessment of potential interferences, likely to be formed during stability studies, was an even less frequent concern. Similarly, most studies did not report the assessment of the robustness of the developed methods.

The abovementioned information, although referring to a small number of studies, illustrates the need to fill a gap between the current regulatory requirements and the analytical development and validation practices commonly adopted in academic research, so that the methods developed can be used reliably, including in GMP and GLP environments. A summary of the reviewed papers can be found in Table 5S (Supporting Information).


HPTLC

The application of HPTLC for the qualitative analysis of PMs/PEs/HMs is discussed earlier in this paper. This technique can also be applied in quantitative analyses, thanks to the development of modern detection systems that allow densitometric evaluation [23], [160]. Moreover, automation of sample application and detection, closed developing chambers, and software-assisted documentation provided significant improvements to the technique, especially regarding reproducibility and results traceability. They allow compliance with regulatory guidelines for quantitative method validation, as well as with pharmaceutical GMP requirements [23], [77], [78], [160]. Some examples of HPTLC-based methods applied to the quantitative determination of secondary metabolites are presented in Table 6S (Supporting Information).

Note also that, in certain cases involving problematic analytes, such as some unstable substances, HPTLC may be preferred over other chromatographic techniques [118]. Advantages of HPTLC include its versatility and flexibility regarding the selection of experimental parameters, such as mobile and stationary phases, development conditions and derivatizing agents; minimum solvent consumption; simpler sample preparation; and the possibility of processing several samples on the same plate, resulting in high-throughput and low-cost analyses [23], [77], [78], [118], [160]. However, HPTLC may not be able to separate certain analytes with the same efficiency as HPLC, in part due to limitations in developing distance, elution mode, and plate performance [161], [162]. Moreover, this technique may be susceptible to changes in experimental parameters and environmental factors, especially when an open developing system is used [23], which emphasizes the need for standardization of experimental procedures and appropriate control of critical experimental parameters [23], [77].

In the studies reviewed, the ICH Q2 guideline was used as a reference for validation of HPTLC methods, although some peculiarities of this technique, especially when applied to complex matrices, are not included in the guideline. Most studies (93%) involved the assessment of method robustness. Nevertheless, the choice of analytical variables to be studied for this purpose may not have been sufficiently comprehensive in some cases. The selectivity evaluation, despite being especially important, was not reported in one-third of the reviewed studies. In some studies, the chromatograms showed poor resolution between peaks of interest, indicating a lack of selectivity. Forced degradation studies, investigating the stability-indicating potential of the methods, were reported in 40% of the reviewed papers. As pointed out elsewhere in this paper, this indicates the need for harmonization between the validation approaches used in the academic environment and those recommended in validation guidelines and by regulatory authorities.


Gas Chromatography (GC)

GC is widely used for the quantification of volatile compounds in plant matrices, most of which are difficult to analyze with LC because of their physicochemical properties [17], [118], [163], [164]. GC is characterized by a high sensitivity, good separation efficiency, and simplicity of sample treatment, unless derivatization is needed [17], [164]. However, the application of GC for NPs has some limitations, such as the need for derivatization to enable the determination of less volatile analytes, thus increasing analysis time and costs, and the possibility of degradation of thermolabile compounds [17], [118]. This may explain, at least partly, the limited number of GC-based methods in the studies reviewed (Table 7S, Supporting Information).

The most common detectors associated with GC, in the analysis of PMs/PEs/HMs, are the FID, ECD, and MS detectors [17], [118], [164]. Among these, FID stands out as one of the most popular, due to its ability to provide almost universal responses to organic compounds within a broad linear range and with good sensitivity, as well as to its relatively low cost and simplicity of operation [165]. The hyphenation of GC with EI-MS through a direct interface is also widely used. Besides providing high sensitivity and selectivity, this type of detection allows online identification of constituents using retention indices and mass spectra [17], [118], [163], [164]. Since EI provides consistent fragmentation patterns, mass spectrum libraries can be used for identification purposes, which is a major advantage [17].

In the reviewed studies, FID and MS detection were used equally often for quantitative purposes (Table 7S, Supporting Information). In 1 study, MS was used in scan mode for the characterization of Copaifera spp. oil before the quantitative determination of β-caryophyllene, α-copaene, and α-humulene using FID [166]. Most validation studies reported for GC methods also presented some gaps, especially concerning the assessment of selectivity and robustness. Only 1 paper reported a full validation, according to the ICH guidelines. Nevertheless, important aspects of the selectivity, such as the absence of interference by matrix components or possible degradation products, were not evaluated.



Stability

Stability studies are an essential part of the HMsʼ regulatory quality requirements, ensuring the consistent maintenance of product quality, safety, and efficacy throughout its shelf life [167]. Today, 5 global authorities and 15 countries define guidelines for stability testing (parameters and procedures) on HMs [168], [169]. However, several still neglected issues may hamper the industrial production of safe and effective products, as well as their evaluation in clinical trials. Negative results of the latter could be related to the loss of stability of some active constituents [170].

Stability testing aims to provide experimental evidence on the impact of a variety of environmental factors, such as temperature, humidity, and light, on the quality of PE and HM, over time, during the proposed shelf life, supporting the recommended storage and in-use conditions [171], [172]. On-going stability programs or re-tests are also required for marketed products [171], [172], [173].

However, evaluating the stability of HM presents a key challenge compared to chemically defined substances. Articles of botanical origin are complex from the chemical and analytical points of view, due to the high number of constituents from different chemical classes (polar to nonpolar, acidic to basic, very low to high molecular mass compounds) having different analytical behavior, besides a low concentration, especially in the final dosage form [27], [174]. In addition, the different compounds present in the plant matrix may undergo intra- or intermolecular reactions under the stress conditions during stability studies, such as heat, light, and humidity, generating potentially less effective and/or toxic products [174]. Another important issue is the lack of knowledge about the constituents responsible for the pharmacological effect, and therefore, the importance of their monitoring during the shelf life of the product [27]. This issue becomes even more complex when dealing with associations of medicinal species. Consequently, the definition of objective acceptance criteria and quality specifications to be applied throughout the stability studies for PE and HM is often a difficult task.

Therefore, considering that the PM/PE is regarded as the API in the HM, the official guides do not merely accept the monitoring of the stability of the constituents with known therapeutic activity. This requires additional studies, such as a comparison of appropriate fingerprint chromatograms from the beginning and the end of the study [168] or biological assays reflecting the drugʼs known or intended mechanism of action [9]. However, the definition of evaluation criteria for a fingerprint analysis is quite complex, as experimental results depend on factors such as i) the analytical technique (TLC, HPTLC, HPLC, UHPLC, etc.) used and its resolving capacity for constituents with similar chemical structure, such as some degradation products of the original constituents; ii) the selectivity and response of the detection system (TLC staining methods, UV radiation absorption, mass spectrometry); iii) the kinetics of the degradation products. Consequently, an initial and a final fingerprint analysis alone may not be sufficient to understand the overall degradation process.
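As a simple illustration of why intermediate time points matter, the sketch below follows the similarity of a fingerprint to its initial profile over a stability study. The choice of cosine similarity, the time points, and the intensity values are assumptions made for illustration; they are not prescribed by the cited guidelines.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two fingerprints sampled on the same axis."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical aligned fingerprints: months of storage -> detector responses
# at the same retention times. The fourth variable mimics a degradation product.
fingerprints = {
    0: [5.0, 40.0, 12.0, 0.0, 8.0],
    3: [5.0, 36.0, 12.0, 3.0, 8.0],
    6: [5.0, 30.0, 11.0, 8.0, 8.0],
    12: [5.0, 20.0, 10.0, 16.0, 7.0],
}

reference = fingerprints[0]
similarity_by_month = {
    month: cosine_similarity(reference, profile)
    for month, profile in sorted(fingerprints.items())
}

for month, value in similarity_by_month.items():
    print(f"{month:>2} months: similarity to initial profile = {value:.3f}")
```

A single similarity value per time point summarizes the overall drift but does not indicate which constituents changed, which is one reason multivariate tools that retain variable-level information are of interest.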

Given these limitations, we highlight the advantages of multivariate analysis. It allows a global assessment of the changes that occurred during a stability study and provides more in-depth information regarding the changes in the chromatographic profile. It may result in statistical criteria for scenarios where a significant change is observed (see Fingerprint Approaches section). These changes may compromise the productʼs safety and efficacy profile [174]. Thus, biomonitoring of samples submitted to stability tests is also relevant to establish the thresholds for changes allowed in the biological activity spectrum. In practice, univariate chromatographic methods are the most popular for stability monitoring (Table 5S, Supporting Information).

Stability studies monitored by multivariate analysis have not yet been properly explored (Table 5S, Supporting Information) except for the use of PCA to check fingerprint dissimilarities [175]. However, the observed changes in the chemical profile are not necessarily associated with changes in biological activity (therapeutic or toxicological). Thus, stability monitoring through multivariate analysis should be accompanied by biomonitoring as stated above.

However, the choice of biological methods for biomonitoring is not a trivial task. The wide variety of active constituents and thus, the potential biological effects of a medicinal plant, highlight the importance of investigating which classes of constituents are associated with a given biological effect. After all, one medicinal plant can have different therapeutic applications. In this context, the choice of biomonitoring models plays a central role.

Lang et al. [176], for instance, evaluated the in vivo antihyperalgesic and anti-inflammatory activities of lyophilized and spray-dried extracts of Sphagneticola trilobata. A decrease in the antihyperalgesic activity of the spray-dried extract was related to the thermodegradation of chlorogenic acids. The kaurenoic acid content, however, remained stable, explaining the unchanged anti-inflammatory activity [177]. In another study [178], the withaferin A and withanolide A contents in Withania somnifera roots aqueous extract were monitored during real-time and accelerated stability studies, along with an evaluation of the in vivo immunomodulatory activity. The authors reported a content decrease and fingerprint alteration, along with a decrease in biological activity [178]. This evidence highlights the importance of robust and validated analytical methods to monitor the stability of constituents, associated with proper biomonitoring methods.

In addition, it is important to consider the dosage forms commonly used for HM. They may include powdered PM, soft extracts, tinctures, spray-dried extracts, freeze-dried powdered material, and other herbal preparations [179]. These herbal preparations are used as such or mixed with excipients to formulate solid oral dosage forms, such as tablets, capsules, and sachets, or liquid dosage forms, such as syrups, liquid extracts, and tinctures. Other dosage forms include, for instance, creams, ointments, and semisolid preparations [27]. Thus, the stability concerns may focus on the different steps of the HM production: i) the drying of the PM in the oven or under sunlight, during which it may be affected by environmental factors, such as air, moisture, heat, light, and microbes [179]; ii) the PE preparation steps, in which the herbal constituents are exposed to potentially degrading agents (e.g., solvents, heat, shear, light); iii) the HM production, where herbal preparations are incorporated in a dosage form. It may include tableting, shearing, heating, and air/radiation exposure steps. In those steps, potential risks of degradation occur, not only due to the extrinsic factors involved but also to interaction with the excipients used [27]. All steps involve specific risks of degradation, loss of potency, and/or generation of artefacts that can also be active, contributing to the pharmacological or biological activity or implying toxic effects [180]. Throughout all steps, selective stability-indicating methods, for the extract components and the final formulation, are of extreme importance.

It should be noted that nonvalidated analytical methods may hide degradation problems due to a lack of selectivity. Thus, proper selectivity evaluation is necessary for stability-indicating methods [181], [182]. As shown in Table 5S (Supporting Information), only 1 of 45 studies carried out a validation to demonstrate that the methodology is stability-indicating. Although it is not a regulatory requirement, such studies provide important information about the stability of phytochemicals and help guide the choice of dosage form, its composition, packaging, and storage conditions.

Recently, the concept of QbD, which incorporates product development information in the regulatory application phase of pharmaceutical products, has been adopted by regulatory agencies [167]. Although introduced at the beginning of the 21st century, aiming to systematize the development of pharmaceutical products under a risk-based approach [183], [184], this concept is rarely extended to PEs/HMs [185]. Finally, it is important to highlight that stability is a direct consequence of the development, and the more assertive and in-depth this is, the fewer problems will be faced regarding stability.

In this context, tools such as QbD, multivariate analysis, and biomonitoring, should be applied systematically in studies on HM development and stability, to integrate academic efforts and regulatory needs, to expand the availability of safe HMs for the population.


Fingerprint approaches

Pharmacological data as a response

Pharmacopoeial monographs for the quality control of PM have evolved from a botanical description to chemical tests, including both qualitative and quantitative analysis [11]. These can attest to plant identity and can indicate safety and efficacy, especially when active markers are determined. Unfortunately, such monographs do not exist for most PEs/HMs. Therefore, pharmacological assays, used both to attest PM/PE/HM quality and to determine bioactive markers, seem a promising approach for quality control [5] and for monitoring the stability (see Stability section).

Most studies selected in this review (78%, Table 9S, Supporting Information) use multivariate calibration techniques to link chemical fingerprints to a given pharmacological activity to indicate potential biomarkers. A similar strategy has also been used for drug discovery [136], [186], [187], which is outside the scope of this review. The main principle is to model measured bioactivity as a function of the fingerprint data by regression tools [14], [188]. In all case studies (24 studies, Table 9S, Supporting Information), fingerprints were generated by separation techniques (mainly LC, 87%) and correlated to pharmacological results obtained by in vitro assays (87%). Although spectroscopic fingerprints can also be used, their interpretability may be hampered, mainly because of sample complexity [189]. In addition, the application of in vivo assays is also restricted, because of both limited reproducibility and high animal demand, since multivariate calibration requires rather extended data sets [190], [191].

As a first step when linking chemical and pharmacological data, matrix X pretreatment is often a must. For chromatographic profiles, peak alignment is frequently used to minimize inter-analysis peak shifts. In addition, other pretreatments, such as autoscaling, column centering, and normalization can also be used (see Pretreatment section). The chemical data (fingerprints) in matrix X are commonly visualized by unsupervised exploratory data analysis (e.g., PCA and HCA), both before and after pretreatment. As previously mentioned (see Exploratory Data Analysis section), these techniques can identify cluster tendency, as well as outliers. This information may be useful to avoid data analysis problems in further steps.
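The scaling pretreatments named above can be sketched in a few lines. The example assumes a small matrix X with samples in rows and fingerprint variables in columns; the function names and the intensity values are hypothetical, and peak alignment is not covered.

```python
import math


def column_center(X):
    """Subtract each variable's (column's) mean."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[v - m for v, m in zip(row, means)] for row in X]


def autoscale(X):
    """Column-center, then divide each column by its standard deviation (n - 1)."""
    n = len(X)
    centered = column_center(X)
    sds = [
        math.sqrt(sum(row[j] ** 2 for row in centered) / (n - 1))
        for j in range(len(X[0]))
    ]
    return [[v / s for v, s in zip(row, sds)] for row in centered]


def normalize_rows(X):
    """Scale each fingerprint (row) to unit total intensity."""
    normalized = []
    for row in X:
        total = sum(row)
        normalized.append([v / total for v in row])
    return normalized


# Rows are samples, columns are fingerprint variables (hypothetical intensities).
X = [
    [10.0, 200.0, 1.0],
    [12.0, 180.0, 3.0],
    [14.0, 220.0, 2.0],
]
X_auto = autoscale(X)
X_norm = normalize_rows(X)
```

Autoscaling gives every variable the same variance, so minor peaks weigh as much as major ones, whereas row normalization compensates for differences in total signal between samples. Which of the two is appropriate depends on the data and the question asked.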

When matrix X (chemical fingerprints) and vector y (pharmacological response) are ready to be linked, supervised methods are used, often multivariate calibration approaches. A model is built, from which the response (y) can be predicted based on the chemical data (X), described as:

y = Xb + e

where y is the response, X the predictor matrix, b the vector of regression coefficients, and e the residual vector [131]. For biomarker identification, the predictive power of the model is less important, but the interpretation of the regression coefficients b in comparison to the original chromatographic fingerprints may indicate compounds that might be selected as active markers [131]. Several techniques can be used to build such models, with PLS (56%), OPLS (26%), and MLR (17%) being the most frequently used.

PLS models apply latent variables (LVs) based on the maximized covariance between X and y. Although this might be an advantage for bioactivity prediction, it may hamper regression coefficient interpretation, since small orthogonal variations might also be modeled [14], [131], [136], [192]. An alternative is the use of OPLS, which removes the orthogonal (uncorrelated) variation in matrix X before building a PLS model. As a consequence, regression-coefficient interpretability is improved [14], [131]. Finally, MLR models y as a function of 2 or more independent (predictor) variables from matrix X. Residuals are minimized by least-squares methods. The main mathematical drawback is that MLR requires the number of variables to be smaller than the number of samples, which, in practice, demands a variable selection to meet this requirement [14], [131], [188].
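How a regression coefficient vector b arises from a PLS model can be sketched for the simplest case of a single latent variable and a single response (PLS1). This is a minimal illustration under that assumption, not the procedure of any cited study; real applications select the number of LVs by cross-validation, and the data below are hypothetical.

```python
import math


def center_columns(X):
    """Return the column-centered matrix and the column means."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[v - m for v, m in zip(row, means)] for row in X], means


def pls1_one_component(X, y):
    """Regression coefficients b of a one-latent-variable PLS1 model.

    X and y are mean-centered internally; b applies to centered data.
    """
    Xc, x_means = center_columns(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    p = len(Xc[0])

    # Weight vector w: direction in X space with maximal covariance with y.
    w = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]

    # Scores t = Xw and the inner regression coefficient q of y on t.
    t = [sum(xj * wj for xj, wj in zip(row, w)) for row in Xc]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    # With a single LV, the coefficients in terms of X are simply b = w * q.
    b = [wj * q for wj in w]
    return b, x_means, y_mean


# Hypothetical data: 4 samples x 3 fingerprint variables; y follows variable 0.
X = [
    [1.0, 2.0, 5.0],
    [2.0, 1.0, 5.0],
    [3.0, 2.0, 6.0],
    [4.0, 1.0, 5.0],
]
y = [2.0, 4.0, 6.0, 8.0]

b, x_means, y_mean = pls1_one_component(X, y)
# Variables ranked by absolute coefficient, as candidates for active markers.
ranked = sorted(range(len(b)), key=lambda j: abs(b[j]), reverse=True)
```

In this toy example the variable with the largest absolute coefficient is the one that covaries most strongly with the response, which mirrors how regression coefficient plots are read against the original fingerprint.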

Tistaert et al. [192] compared several modeling techniques to indicate antioxidant compounds from Mallotus spp., which are used in traditional medicine in Vietnam and China. Chemical data were obtained by HPLC-DAD. Models from PLS, OPLS, MLR, PCR, and UVE-PLS are compared in [Fig. 7]. MLR modeling selected only 3 variables related to the maximum of a peak, while the remaining ones were related to peak tails or minor peaks. MLR thus was found not so suitable for the intended purpose in this case study. PLS and OPLS regression coefficient plots, on the other hand, gave a better indication of the compounds that probably cause the antioxidant activity. The PLS plot was found to be noisier than that of OPLS because PLS does not remove the orthogonal variation of matrix X. Therefore, OPLS was considered the best technique to indicate the bioactive compounds. Similar results were observed in other studies [136], [137].

Fig. 7 Chromatographic fingerprints (top figure) and the regression coefficients from MLR, PCR, UVE-PLS, PLS, and OPLS models for the antioxidant activity of Mallotus spp. extracts. Source: Reproduced with permission from [192].

Other approaches can also be used for the identification of active markers in PMs/PEs/HMs. Wu et al. [193] used a combination of exploratory methods (not calibration methods) to identify markers for the quality control of Suhuang antitussive capsules. This TCM is a mixture of 9 medicinal plants, which significantly increases sample complexity. Different batches of the medicine were analyzed by HPLC-DAD. First, 22 common peaks were detected in the chromatograms by similarity analysis. Their areas were used to build matrix X and PCA was performed. The analysis of loading plots ([Fig. 8]) allowed the selection of 13 compounds as highly influencing the sample distribution in the score plot. These compounds were evaluated for their in vitro anti-inflammatory activity, and 4 were identified as bioactive and potential markers for the quality control of Suhuang antitussive capsules.
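The idea of reading loadings to find the peaks that drive the sample distribution can be sketched as follows. The example computes the first principal component by power iteration on the covariance matrix of a small, hypothetical peak-area table; whether and how the data were scaled in [193] is not stated here, so the mean-centering used below is an assumption.

```python
import math


def first_pc_loadings(X, n_iter=200):
    """Loadings of the first principal component via power iteration.

    X: samples (rows) by peak areas (columns); columns are mean-centered here.
    Assumes the starting vector is not orthogonal to the first component.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[v - m for v, m in zip(row, means)] for row in X]

    # Covariance matrix C = Xc^T Xc / (n - 1).
    C = [
        [sum(row[i] * row[j] for row in Xc) / (n - 1) for j in range(p)]
        for i in range(p)
    ]

    v = [1.0 / math.sqrt(p)] * p  # deterministic starting vector
    for _ in range(n_iter):
        v_new = [sum(C[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(a * a for a in v_new))
        v = [a / norm for a in v_new]
    return v


# Hypothetical peak areas: 5 batches x 4 common peaks.
X = [
    [10.0, 50.0, 5.0, 20.0],
    [11.0, 80.0, 5.2, 21.0],
    [9.0, 30.0, 4.9, 19.0],
    [10.5, 65.0, 5.1, 20.5],
    [9.5, 40.0, 4.8, 19.5],
]

loadings = first_pc_loadings(X)
# Peaks ordered by their influence on PC1 (largest absolute loading first).
influential = sorted(range(len(loadings)), key=lambda j: abs(loadings[j]), reverse=True)
```

Without autoscaling, the peak with the largest absolute variation dominates PC1, so the ranking reflects peak size as well as batch-to-batch variability.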

Fig. 8 The PCA biplots for the areas of 22 common peaks in different Suhuang antitussive capsules. a PC1 vs. PC2, b PC2 vs. PC3. Source: Reproduced with permission from [193].

The combination of chemical and pharmacological data using multivariate analysis can also be applied to predict pharmacological activity. Although less applied (22%, Table 9S, Supporting Information), the same chemometric methods are commonly used (i.e., PLS, OPLS, MLR). However, what matters here is the predictive capacity of the model [131].

Dumarey et al. [194] compared several calibration models, from MLR, PCR, PLS, UVE-PLS, and OPLS for antioxidant capacity prediction from green tea HPLC-DAD fingerprints. The dataset was split into a calibration set and a test set, allowing the evaluation of both the fit and the predictive ability of the models. The RMSECV was determined based on LOO-CV and MCCV, where 1 or a random number of objects, respectively, is iteratively removed as a test set and the remaining ones used to build the model [195]. The RMSEC and RMSEP were estimated based on the calibration and test sets, respectively. The RMSEC is a measure for model fit, while the others estimate its predictive properties. The authors observed that all techniques provided a model to predict the total antioxidant activity with a precision comparable to the reference method. Because of its simplicity, reproducibility, and interpretability, the OPLS model was considered the best.
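The three error measures can be sketched with a deliberately simple stand-in model. The example below uses a univariate least-squares line instead of PLS or OPLS, leave-one-out cross-validation only, and n as the denominator of each RMSE; the data and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for any calibration model."""
    n = len(xs)
    xm, ym = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xm) * (y - ym) for x, y in zip(xs, ys))
    sxx = sum((x - xm) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, ym - slope * xm


def predict(model, xs):
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def rmsecv_loo(xs, ys):
    """Leave-one-out CV: each object is predicted by a model built without it."""
    preds = []
    for i in range(len(xs)):
        model = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(predict(model, [xs[i]])[0])
    return rmse(ys, preds)


# Hypothetical data: one fingerprint-derived predictor and a measured response.
x_cal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_cal = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
x_test = [1.5, 3.5, 5.5]
y_test = [1.4, 3.6, 5.4]

model = fit_line(x_cal, y_cal)
rmsec = rmse(y_cal, predict(model, x_cal))    # fit to the calibration set
rmsecv = rmsecv_loo(x_cal, y_cal)             # internal predictive ability
rmsep = rmse(y_test, predict(model, x_test))  # external predictive ability
```

For a least-squares model the RMSECV cannot be smaller than the RMSEC, since each left-out object is predicted without having contributed to the fit. A large gap between the two is a common sign of overfitting.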


Marker content as response

Fingerprint analysis for the quantification of markers is limited because, in general, this is achieved by traditional approaches, mainly using separation techniques (see Traditional Approach section). When fingerprints are used for marker quantification, spectroscopic data, mainly MIR/NIR, are applied. Such data are preferred over chromatographic fingerprints because of their fast acquisition, low cost, and minimal sample preparation requirements [121], [122], [123].

The principles of using fingerprints of IR spectra for marker quantification are similar to those for biological activities. A matrix X, containing the chemical data (IR spectra), is linked by multivariate calibration modeling to a vector y, containing the marker content determined by a reference method (e.g., HPLC-DAD). Usually, matrix X must again be preprocessed, which may include the earlier mentioned methods (see Pretreatment Section), but also, for instance, MSC or SNV. The last 2 correct the scatter in the spectra originating from physical aspects of the sample [196]. Matrix X can also be analyzed by exploratory data analysis (e.g., PCA and HCA) (see Exploratory Data Analysis Section). As for the other calibration approaches, PLS is most frequently applied (commonly referred to as NIR-PLS).
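The two scatter corrections can be sketched as follows, assuming their usual textbook definitions: SNV centers and scales each spectrum by its own mean and standard deviation, and MSC regresses each spectrum on a reference (here the mean spectrum) and removes the fitted offset and slope. The spectra are hypothetical.

```python
import math


def snv(spectrum):
    """Standard normal variate: center and scale each spectrum individually."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum x is regressed on the reference r (x = a + b * r + residual)
    and corrected as (x - a) / b.
    """
    n = len(spectra)
    ref = [sum(s[j] for s in spectra) / n for j in range(len(spectra[0]))]
    r_mean = sum(ref) / len(ref)
    srr = sum((r - r_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / len(s)
        b = sum((r - r_mean) * (v - s_mean) for r, v in zip(ref, s)) / srr
        a = s_mean - b * r_mean
        corrected.append([(v - a) / b for v in s])
    return corrected


# Hypothetical spectra: the same profile with different offset and scaling,
# mimicking additive and multiplicative scatter effects.
base = [0.10, 0.30, 0.80, 0.40, 0.20]
spectra = [
    base,
    [0.05 + 1.5 * v for v in base],
    [0.20 + 0.7 * v for v in base],
]
snv_spectra = [snv(s) for s in spectra]
msc_spectra = msc(spectra)
```

In this idealized case both corrections make the three spectra coincide, because they differ only by an offset and a scale factor. Real spectra also differ chemically, and that difference is what the subsequent calibration model uses.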

Mavimbela et al. [196] used such an IR-PLS approach to quantify the triterpene glycoside sutherlandioside B in the leaves of Sutherlandia frutescens, a medicinal plant from southern Africa. Sixty samples were collected from different regions in South Africa, and the marker content was determined by UHPLC-MS. Powdered plant material was also analyzed by MIR, and the obtained data were split into calibration (70%) and validation sets. Before PLS model building, different pretreatments were performed, and the best results were found for the second derivative of the spectra. A plot of the sutherlandioside B concentrations determined by UHPLC-MS versus those predicted by MIR is given in [Fig. 9].

Fig. 9 Plot of MIR-PLS predicted concentrations against the UHPLC-MS reference concentrations for sutherlandioside B. Calibration set samples (70% of samples) are shown in red and validation set samples (30% of samples) in blue. Source: Reproduced with permission from [196].

Another application of the IR-PLS technique was described by Petrakis et al. [197] to assess the content of adulterants in commercial saffron samples. The authors spiked pure saffron samples with 5, 10, and 20% (w/w) of different adulterants and analyzed the samples by DRIFTS. The data set was split into calibration (70%) and test sets, allowing the estimation of RMSEC, RMSECV, and RMSEP. The X matrix was pretreated by calculating the first derivatives after Savitzky-Golay smoothing and mean centering. Variable selection was performed, using iPLS and siPLS. In both cases, the aim is to find intervals in the spectrum that contain important information to give a better model and prediction of the response than the application of the full spectrum. iPLS splits the data in a given number of intervals and calculates a PLS model for each interval; siPLS creates a PLS model for a combination of 2 to 4 intervals. The best results were observed for siPLS.
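The interval selection idea behind iPLS can be sketched as below. For brevity, the example uses a one-latent-variable PLS1 model and leave-one-out RMSECV to score each interval, which are simplifying assumptions rather than the settings of [197]; siPLS would score combinations of intervals in the same way. The data are hypothetical.

```python
import math


def pls1_fit(X, y):
    """One-latent-variable PLS1 on mean-centered data (simplified stand-in)."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[v - m for v, m in zip(row, x_means)] for row in X]
    yc = [v - y_mean for v in y]
    w = [sum(row[j] * yi for row, yi in zip(Xc, yc)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return [wj * q for wj in w], x_means, y_mean


def pls1_predict(model, row):
    b, x_means, y_mean = model
    return y_mean + sum(bj * (v - m) for bj, v, m in zip(b, row, x_means))


def rmsecv_loo(X, y):
    """Leave-one-out RMSECV for the simplified PLS model."""
    sq_errors = []
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        sq_errors.append((y[i] - pls1_predict(model, X[i])) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))


def ipls(X, y, n_intervals):
    """Return [((start, stop), RMSECV), ...], one entry per spectral interval."""
    p = len(X[0])
    width = p // n_intervals
    results = []
    for k in range(n_intervals):
        start = k * width
        stop = p if k == n_intervals - 1 else start + width
        sub = [row[start:stop] for row in X]
        results.append(((start, stop), rmsecv_loo(sub, y)))
    return results


# Hypothetical data: 6 spectra x 6 variables; only variables 2 and 3 track y.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X = [
    [0.5, 0.2, 1.0, 0.5, 0.7, 0.4],
    [0.1, 0.6, 2.1, 1.0, 0.2, 0.5],
    [0.5, 0.5, 2.9, 1.6, 0.8, 0.9],
    [0.1, 0.3, 4.0, 2.0, 0.1, 0.0],
    [0.5, 0.5, 5.1, 2.4, 0.8, 0.4],
    [0.1, 0.8, 6.0, 3.0, 0.2, 0.5],
]
ranking = sorted(ipls(X, y, n_intervals=3), key=lambda item: item[1])
best_interval = ranking[0][0]
```

The interval with the lowest RMSECV is retained, and its error is usually compared with that of the full-spectrum model before the reduced model is accepted.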




Conclusions and Future Perspectives

Plants have remained important therapeutic agents for millennia. Even with the advent of rational drug discovery in the early 19th century, plants persisted as a major source of bioactive compounds (e.g., paclitaxel, vincristine, vinblastine, camptothecin, and artemisinin) [198]. The evolution of the use of medicinal plants throughout recent history was accompanied by the evolution of analytical techniques, allowing a more in-depth study of their chemical composition. This also reflects the effort to overcome analytical challenges such as the complexity of plant matrices, the need to assess a large diversity of chemical constituents, the low concentration of some analytes of interest, and the significant differences in analyte concentrations [118]. Meanwhile, advances in pharmacological evaluation also enabled a better understanding of the mechanism of action of isolated compounds, as well as of compound interactions and synergistic effects in complex extracted mixtures [199].

Advances in knowledge of the chemical and biological aspects of medicinal plants also raised a red flag for traditional quality control approaches [200]. Although the regulation of marketed HMs has evolved so that, today, several plants have their minimum quality requirements established in official compendia, most specifications are not sufficiently based on scientific evidence and therefore do not adequately reflect the productʼs safety and efficacy profile. Moreover, the intrinsic variability of articles of botanical origin imposes difficulties in standardization, thus representing an additional challenge regarding the definition of clinically relevant quality specifications [9]. These limitations have motivated the development of pattern-oriented analytical approaches to complement the traditionally used compound-oriented approaches, thus allowing a more complete assessment of the relevant quality attributes of the product, reflecting its inherent complexity [17].

The appropriate authentication of PM can also be a difficult task since adulteration, substitution with related species, and contamination are quite common [75]. For most traditionally used plants, correct identification can easily be achieved (e.g., Ginkgo biloba, Psidium guajava), but in many cases, correct species identification is difficult since little is known about the most closely related species. An accurate morphological study combined with DNA techniques (barcoding) may be used to guarantee identification. Several independent tests, usually involving a combination of botanical and chemical tests, are often needed to ensure the appropriate level of identification [2], [10].

Appropriate quality control approaches are essential to deal with batch-to-batch reproducibility of the HMʼs safety/efficacy profile, as well as to ensure the consistency of the therapeutic effects [9]. Thus, the definition of quality specifications and the choice of analytical techniques should be made in such a way that they are consistent with the purposes of the analysis [3]. Since each analytical technique has advantages and limitations, it is often necessary to apply orthogonal methods for the integrated assessment of the relevant quality aspects of PMs/PEs/HMs, including authenticity, strength, and purity [3], [9].

This review shows that multivariate techniques have been regularly applied in the study of NPs, especially for qualitative purposes. However, the quantification of one or a few markers (sometimes randomly selected) remains the main instrument for quality assurance. In this scenario, the possibility of applying state-of-the-art approaches, including multivariate analyses, to establish correlations between the phytochemical profiles of PMs/PEs/HMs and their pharmacological properties, may encourage a science-driven improvement of the regulatory framework for this category of products by favoring the establishment of more objective and appropriate requirements for quality assessment, ideally involving a combination of compound- and pattern-oriented approaches. In this sense, biomonitoring, aligned with a relevant therapeutic target, is necessary. However, this implies the need for the qualification of personnel both in regulatory and regulated sectors and may be time-consuming. Despite potential difficulties in implementing these new quality approaches, the efforts are worthwhile given the potential to promote access of the population to HMs with favorable risk-benefit profiles.


Method

Literature research

The entry terms were defined based on medical subject headings. Two scientific databases were used, and 641 references were retrieved (period covered: up to 2020). After removal of duplicates, the references were selected based on their titles and abstracts. Incomplete information or lack of results were exclusion criteria. Given the methods used, the coverage is limited to academic publications, mostly in English.



Contributorsʼ Statement

Conception and design of the work: L. C. Klein-Junior, M. R. de Souza, J. Viaene, T. M. B. Bresolin, A. L. Gasper, A. T. Henriques, Y. Vander Heyden; data collection: L. C. Klein-Junior, M. R. de Souza, J. Viaene, T. M. B. Bresolin, A. L. Gasper; analysis and interpretation of the data: L. C. Klein-Junior, M. R. de Souza, J. Viaene, T. M. B. Bresolin, A. L. Gasper, A. T. Henriques, Y. Vander Heyden; drafting the manuscript: L. C. Klein-Junior, M. R. de Souza, J. Viaene, T. M. B. Bresolin, A. L. Gasper, A. T. Henriques, Y. Vander Heyden; critical revision: L. C. Klein-Junior, M. R. de Souza, J. Viaene, T. M. B. Bresolin, A. L. Gasper, A. T. Henriques, Y. Vander Heyden.



Conflict of Interest

The authors declare that they have no conflict of interest.

Acknowledgements

The authors are grateful to The National Council for Scientific and Technological Development (CNPq), Brazil.

# Dedicated to Professor Arnold Vlietinck on the occasion of his 80th birthday.



Correspondence

Prof. Dr. Yvan Vander Heyden
Department of Analytical Chemistry, Applied Chemometrics and Molecular Modelling
Center for Pharmaceutical Research (CePhaR)
Vrije Universiteit Brussel – VUB
Laarbeeklaan 103
1090 Brussels
Belgium   
Phone: + 32 24 77 47 34   
Fax: + 32 24 77 47 35   

Publication History

Received: 17 December 2020

Accepted after revision: 09 June 2021

Article published online:
19 August 2021

© 2021. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany


Fig. 1 Pie chart representing the most frequently used extraction methods, based on the 188 studies considered in this review (Table 1S, Supporting Information). PSE = pressurized solvent extraction; SFE = supercritical fluid extraction; UAE = ultrasound-assisted extraction.
Fig. 2 The most used analytical techniques for fingerprint development, based on the 102 studies reviewed (Table 4S, Supporting Information). NIR = near-infrared spectroscopy; MIR = mid-infrared spectroscopy; UV/DAD = ultraviolet/diode array detector; ELSD = evaporative light scattering detector.
Fig. 3 Fingerprints and color maps of their correlation coefficients (at the bottom) before (a) and after (b) warping. Source: Reproduced with permission from [136].
Fig. 4 PCA score plot (a), PLS-DA score plot (b), PLS-DA loading plot (c), and HCA dendrogram of the loading matrix of PLS-DA (d), built by the analysis of 40 peak areas obtained by HPLC-UV analysis of total flavones of sea buckthorn. Source: Reproduced with permission from [148].
Fig. 5 PLS-DA score plot (a) and loading plot (b) built from ¹H NMR fingerprints obtained from different samples of American ginseng (Panax quinquefolius) and Asian ginseng (Panax ginseng), consisting of samples from 2 Chinese regions. GB = American ginseng; MS = Asian ginseng from Fushun, China; TH = Asian ginseng from Tonghua, China. Source: Reproduced with permission from [149].
Fig. 6 Plot representing the most used analytical methods in the quantitative analysis, based on 69 studies (Tables 5S–7S, Supporting Information). FID = flame ionization detector; DAD = diode array detector; ELSD = evaporative light scattering detector; FD = fluorescence detector; CAD = charged aerosol detector.
Fig. 7 Chromatographic fingerprints (top figure) and the regression coefficients from MLR, PCR, UVE-PLS, PLS, and OPLS models for the antioxidant activity of Mallotus spp. extracts. Source: Reproduced with permission from [192].
Fig. 8 The PCA biplots for the areas of 22 common peaks in different Suhuang antitussive capsules. a PC1 vs. PC2, b PC2 vs. PC3. Source: Reproduced with permission from [193].