Key words
natural product - extraction - library - stability - screening
Introduction
Natural products often possess biological activity and are a valuable source of drug
leads [1], [2], [3], [4]. Many natural products occupy unique areas of chemical space [5] and they often differ from synthetic compounds by the presence of multiple chiral
centers, an abundance of O over N atoms, and more H-bond donors and acceptors [6], [7], [8], [9]. Despite these structural differences and rich history in drug development, natural
product-based hit discovery is currently almost nonexistent in large pharmaceutical
companies. This is mainly due to the perception that (i) the time between isolation
and structure elucidation is too long, (ii) there are diminishing returns due to high
dereplication rates, (iii) there is a resupply issue for hit-to-lead and preclinical
studies, and (iv) natural products are not “drug-like” and make poor leads. These
perceptions can be countered as follows. Advances in dereplication, isolation techniques,
and structure elucidation have reduced the time from extract to pure compound to under
one month on average in well equipped, organized, and experienced labs [1], [10], [11]. The resupply issue can be resolved by focusing on microorganisms, with other organism
types such as marine invertebrates and plants only screened if there is well documented
collection information and a resupply plan. Advances in synthesis have also solved
some supply issues as demonstrated by the totally synthetic anticancer drug eribulin
mesylate (trade name: Halaven®; launched 2010) that was derived from the complex,
sponge-derived lead halichondrin B ([Fig. 1]) [12], [13]. Natural product anticancer drugs such as trabectedin (trade name: Yondelis®; launched
in 2007) [14], [15] and homoharringtonine (trade name: Synribo®; launched 2012) [16] are produced semisynthetically for clinical use to overcome resupply issues ([Fig. 1]).
Fig. 1 Structures of the complex, sponge-derived anticancer lead halichondrin B and its
synthetically inspired drug eribulin produced by total synthesis; ester hydrolysis
of the naturally occurring ester mixture to give cephalotaxine, which is then esterified
to give the anticancer drug homoharringtonine (omacetaxine mepesuccinate); the ascidian-derived
anticancer drug trabectedin is semisynthetically produced from the bacterial-derived
cyanosafracin B.
The concern that natural products are not “drug-like” is paradoxical given that natural
products have been lead structures for many drugs [2], [4]. Christopher Lipinski, who introduced the “rule-of-five” for guiding drug-like properties
required for oral absorption, suggested that these rules were not suitable for natural
products due to the potential for active transport [17]. The disconnect with the rule-of-five and some natural products is demonstrated
in Ganesanʼs analysis of the 24 natural product drug leads discovered from 1970–2006
that led to a drug approval [18]. In this study, it was found that half of these leads were inside the “Lipinski
universe” (0 or 1 rule violations) chemical space, while the other half were in a
natural product “parallel universe”. Revealingly, leads from the “parallel universe”
had an equivalent chance to leads from the “Lipinski universe” of producing an oral
drug; these leads were Lipinski-compliant in terms of clogP and H-bond donors, as
well as potentially being able to access carrier mediated or active transport mechanisms
[18].
In the 1980s, advances in robotics and computing, coupled with enhanced access to
biological reagents and proteins, enabled large pharmaceutical companies to develop
high-throughput screening (HTS) capabilities [19], [20], [21]. Initially HTS was used to screen modest sized synthetic compound libraries (thousands
of compounds) and natural product extract libraries, but the increased screening capacity
created soon led to an explosion in the number of synthetic compounds added to screening
libraries using combinatorial chemistry (hundreds of thousands of compounds). During
this time, HTS has had an impact on drug discovery leading to the identification of
numerous new drug leads and at least 12 drugs on the market [22].
For over 20 years, HTS facilities have also been used in academic settings and smaller
biotechnology companies to screen for new drug leads. As large pharmaceutical companies
retreat further from natural product-based lead discovery (and basic research), there
are now opportunities for university-based research groups to be at the forefront
of lead discovery in natural products. It is also now feasible for these groups to
use in-house resources and contract research organizations (CROs) to validate these
drug leads and undertake preclinical drug development [23].
In this review, we discuss how to assemble a natural product-focused library that
would consist of structurally diverse extracts, prefractionated extracts, and pure
compounds. Practical methods to maintain, store and efficiently screen these natural
product-derived samples are also discussed, as are the logistics of moving from an assay
hit to pure bioactive compounds.
Initiating a Natural Product Library
Extraction methods
The early years of natural product chemistry research were dominated by large scale
plant-based studies that often used harsh extraction methods. These extraction techniques
included the use of boiling solvents, acid-base extraction, and steam distillation.
As isolation techniques and technology have improved, extraction conditions have become
milder, and the amount of compound required for structure elucidation has dropped
to well below 1 mg. Extraction methods [24], [25] developed to increase extraction efficiency include the use of ultrasound [26], [27], microwave [27], [28], pressure (accelerated solvent extraction) [29], supercritical fluid [30], [31], ionic liquids [32], and deep eutectic solvents [33], [34]. Although these newer methods can be useful, a simple method such as extraction
with slightly aqueous MeOH using gentle agitation captures most of the drug-like molecules
and is usually more than adequate for biological screening purposes. Another popular
extraction solvent for screening libraries is EtOAc, which extracts less polar material
compared to MeOH, resulting in cleaner extracts of mid-polarity compounds. Extraction
conditions for the compounds of interest can be subsequently optimized upon re-extraction
or scale-up studies.
Sources of natural products
Plants: Plants have been used as the basis of medicines for thousands of years and even
today these traditional medicines are relied upon for health care in many parts of
the world [35], [36]. Plant secondary metabolites are often stored inside cells, which need to be ruptured
to increase the extractable yield. The plant material is usually dried, either at
slightly elevated temperatures or using freeze drying, and ground to a fine powder
before extraction. The choice of solvent depends upon the desired polarity range with
MeOH or EtOAc being excellent options for a single solvent-derived extract library
as previously discussed. Extracts can also be generated by successively extracting
with nonpolar to polar solvents such as heptane, CH2Cl2, MeOH, and H2O, while hot H2O can be used to mimic the administration of many traditional medicines as infused
teas. Before screening, plant extracts can be passed through a polyamide column (eluent
MeOH) or polyvinylpyrrolidone (PVP) to remove polyphenols that can interfere with
various enzyme biological assays [37], [38], [39]. Alkaloids [40] can also be enriched from plant samples using an acid/base extraction protocol and
detected using Dragendorffʼs reagent and/or using (+)-electrospray (ESI)-MS [37], [41].
Microorganisms: Microbes have been the mainstay of industrially focused natural product research
due to their propensity to produce novel molecules and the use of fermentation as
a renewable source of compound resupply. Fungi and bacteria can be cultivated on both
solid and liquid media under a variety of conditions; however, bacteria are predominantly
grown in liquid media, while fungi are grown using both solid and liquid media. Important
parameters for secondary metabolite production are the choice of media, temperature,
aeration, and duration of the fermentation. On a small scale, liquid cultivations
can be performed in test tubes, flasks, or specially designed microtiter plates [42], [43], [44], which can be freeze-dried and extracted with an appropriate solvent or directly
extracted with an organic solvent that is immiscible with the broth [e.g., EtOAc,
n-butanol (n-BuOH), methyl ethyl ketone (MEK), tert-butyl methyl ether (TBME)]. Centrifugal evaporation is particularly useful for organisms
grown in high salt content media to minimize “bumping” that can occur during freeze
drying. For larger scale liquid cultivations, multiple flasks or a bioreactor are
used with the biomass usually separated from the broth using centrifugation. The wet
biomass can then be extracted directly, or freeze dried and then extracted. Broth
metabolites can be concentrated by elution through a resin column with increasing
amounts of organic solvents (usually MeCN or MeOH) in H2O. Commonly used resins are the brominated-polystyrene Sepabeads SP207 (Mitsubishi
Chemical Corp), the cross-linked polystyrene Diaion HP20 (Mitsubishi Chemical Corp),
and the XAD (The Dow Chemical Company) polymeric adsorbent resins. These resins can also
be added during cultivation to capture compounds, often in increased yield, and to
help with downstream purification [45]. There have also been reports of larger scale solid fermentations [46], [47].
Altering media and cultivation conditions can considerably influence the secondary metabolite
profile. This is exemplified by the OSMAC (one strain many compounds) approach
[48], which is particularly compatible with the number of cultivations available using
the microtiter plate format, and the use of additives to alter secondary metabolite
profiles [49], [50], [51] or activate cryptic biosynthesis pathways [50], [52], [53].
Marine invertebrates: As a result of potential logistical difficulties associated with marine invertebrate
recollection, gentle extraction methods are generally used to minimize compound degradation.
A commonly used method is to cut the invertebrate into smaller pieces, place them in a plastic
container, immerse them in EtOH, and refrigerate until required. EtOH is often used instead
of MeOH to help distinguish between naturally occurring esters and artifactual
esterification. The Crews Group have reported a method that they use in the field
where specimens are immersed in an MeOH : H2O (1 : 1) solution and after approximately 24 h the liquid is decanted and discarded
[54]. The specimen is placed in a Nalgene container, shipped back to the laboratory at
ambient temperature and then stored at 4 °C until further processed [54]. Marine invertebrate specimens can also be freeze dried, powdered and extracted,
but care needs to be taken as occasionally some metabolites can sublime into the cold
trap.
Prefractionation and purified NP libraries
One way to reduce the complexity of an extract is to fractionate before biological
testing using column-based [55], [56], [57], [58], [59], [60], [61], [62], [63] or liquid-liquid separation methods [64], [65], [66]. The production and the guiding principles behind the generation of prefractionated
libraries have been recently reviewed [67], [68]. Prefractionated extracts are considerably less complex than crude extracts, which
enables screening at a high concentration and simplifies dereplication. Although enriched
extracts allow the detection of minor components during screening, it is not possible
to remove all interfering or frequent hitting compounds that will also be enriched.
Prefractionated extracts also facilitate the unmasking of active compounds from cytotoxic
compounds and agonists from antagonists.
The ultimate prefractionated library is a pure natural product library that has a
structurally diverse set of natural products with known structures and physicochemical
properties and > 95 % purity. Pure natural product libraries are assembled in an iterative
manner from in-house isolated compounds and natural products purchased from commercial
sources. The largest commercial player in this area is AnalytiCon Discovery GmbH (Potsdam,
Germany; http://www.ac-discovery.com/). AnalytiCon have assembled a library of 25 000 pure natural
products over the last 20 years, based on about 2000 different chemotypes, that
are available in a ready-for-screening format [69], [70]. The only other description [71] of a pure natural product library was a plant-based library from MolecularNature
Ltd., which is now available via PhytoQuest Ltd. (Aberystwyth, United Kingdom; http://www.phytoquest.co.uk/).
Screening of this type of library is especially attractive for groups not interested
in bioassay-guided isolation that still want to access natural product diversity.
A few words of caution: although it may seem attractive to generate large numbers
of pure natural products and prefractionated extracts for screening, there is considerable
effort and monetary investment required to generate, store and undertake quality control
analyses [1]. There is also an increase in screening costs for each extra screening point and,
as a consequence, the optimal number of fractions needs to be carefully considered.
There also can be a loss in biodiversity coverage when screening prefractions, especially
if the screening capacity is limited. To achieve optimal metabolite coverage in the
minimum number of screening points, organism taxonomy studies along with robust assessments
of the extract quality and chemical diversity must be undertaken before generating
fractions [72], [73], [74], [75].
Before assembling the library, a decision is required about whether samples will be
tested at equivalent µg/mL concentrations, which requires the weighing of each extract
or prefraction, or at equivalent doses relative to a broth volume or amount of material
extracted. Both methods have their advantages and disadvantages.
Maintaining and Storing a Natural Product Library
Organism storage
Dried plant material is usually stored at room temperature as a finely ground powder
to minimize storage space and enhance extraction. Plant samples, especially intact
plant parts, need to be periodically inspected for microorganism growth to identify
contamination. Microorganism stock cultures are usually stored frozen (liquid nitrogen
and − 80 °C freezer) [76], [77] as glycerol stocks or freeze-dried [78]. It is prudent to duplicate the microorganism collection in two separate locations
in case of a catastrophic event that would destroy the freezer contents. It is also
prudent to keep multiple stock copies of each microorganism to give the best chance
of revival. Marine invertebrate samples can be stored in a − 4 °C freezer with or
without the extraction solvent.
Extract and pure natural product storage and stability
Extracts and pure natural products are usually best stored dried under a dry, inert
atmosphere at − 20 °C or below; however, these solid materials are not easily manipulated
for screening purposes. One way around this problem is to aliquot the solid material
at the desired screening concentrations, remove the solvent and store ready for screening.
This dry plating method also has its disadvantages: the screening concentration is
fixed, there could be issues with extract dissolution and some assays require the
testing material to be added after the reagents and media. An alternative method is
to dissolve the extracts and compounds in dry DMSO, which is a hygroscopic liquid
that freezes at 18.5 °C, and dispense aliquots of this solution at the desired concentration.
The dispensing is best performed using automated robots but can also be performed
manually on a smaller scale. It has been shown that storing the DMSO stock solution
as a solid can cause issues with compound stability and solubility that worsen with
each freeze-thaw cycle [79], [80], [81]. A stability study of synthetic compound DMSO stocks stored at room temperature
showed that the concentration of 8 % of the compounds had significantly decreased after
3 months, 17 % after 6 months, and 48 % after 1 year [82]. Interestingly, a recent paper has suggested that compound purity is the most important
factor for stability of synthetic compounds [81]. On face value this implies that natural product extracts could be especially unstable
but in reality the presence of other compounds increases compound solubility and stability
inside the extracts. The presence of antioxidants also helps with compound stability
in extracts.
It is almost impossible for the average laboratory to totally exclude H2O from DMSO solutions during the storage and dispensing phases. In this situation,
DMSO stocks should be dispensed and stored for only a limited amount of time (say
< 2 months) and the number of freeze-thaw cycles kept to a minimum. An example of
a sophisticated compound and extract storage and dispensing facility is the Queensland
Compound Library (http://www.griffith.edu.au/science-aviation/queensland-compound-library),
which is housed at Griffith University (Brisbane, Australia). In this facility, compounds
and extracts have a unique 2D barcode, which is used to track samples through the
whole process from the automated weighing station to the plate dispensing for screening.
The solid samples are stored in the dark under an inert and dry atmosphere at − 20 °C
[83]. The screening samples are dissolved in dry DMSO and stored in liquid form under
a dry N2 atmosphere for up to 5–6 years. Individual or sets of these screening samples are
then assembled using the 2D barcode and transferred using an acoustic dispenser [84] in any format required for screening (e.g., 96-, 384-, or 1536-well microtiter plates).
Periodic analyses of the screening samples using LC-MS are undertaken for quality
control purposes.
Although state-of-the-art storage, dispensing, and analysis facilities increase the
probability of identifying bioactive molecules, inexpensive and simple procedures
like minimizing water in the DMSO stocks, limiting freeze-thaw cycles, and not using
“expired DMSO stocks” go a long way to improving assay outcomes without significant
monetary outlays.
Practicalities of Screening Natural Product Libraries
A natural product extract can contain hundreds to thousands of compounds that span
wide ranges of molecular weight, polarity, and abundance. Before bioassay-guided isolation
can be used to identify the compounds responsible for the activity, medicinally relevant
and robust assays need to be developed, which can be divided into in vitro biochemical and cell-based assays. The biochemical assays are grouped into two subtypes
based on their complexity and troubleshooting difficulty: binary assays or direct
binding assays involving two partners (ligand and an analyte or binding partner) or
ternary assays (ligand, analyte, and a ligand-analyte modulator). The use of more
complex, more biologically relevant cell-based assays has steadily increased over
the past several years [85], and they can be further divided into reporter, morphometric, and homogeneous assays.
Reporter assays use fluorescent and/or luminescent labels to monitor changes in protein
expression or protein-protein clustering or interaction, while morphometric assays
involve the measurement of changes in growth, cytotoxicity, and subcellular morphologic
features using quantitative microscopy and label-free techniques. Finally, homogeneous
assays include a lysis step before readout and could almost be considered as biochemical
assays. Electrophysiology and flow cytometry assays stand on their own as they involve
a cell sorting step and a cell-by-cell readout.
Developing representative screening sets to interrogate screen robustness
The concept of the natural product matrix effect: An extract is a complex, genus- and media-specific mixture or matrix that can interfere
with the assay dynamic range, signal-to-noise ratio, and reproducibility. Although
a solvent-vehicle can be used as a baseline control, an inactive extract representative
of the libraryʼs “average” matrix can also be used. It is important that the inactive
extract matrix be representative of the average matrices of the other extracts. For
example, the “generic inactive extract” for a marine invertebrate n-BuOH extract library can be produced by pooling whole EtOH extracts from three to
five randomly selected library specimens, drying the extracts and partitioning between
H2O and n-BuOH. The n-BuOH layer is dried, weighed and adjusted to the desired w/v concentration required
for the stock solution. This solution is left on the laboratory bench for one week
under ambient laboratory fluorescent light to generate the final “generic inactive
extract”.
Positive and negative controls: This “generic inactive extract” is then used as the negative control or baseline
for the screening campaign. If an active small molecule has been previously identified,
positive controls and/or standard curves should be generated using the same “generic
inactive extract” spiked with the known active compound. Simple positive controls
in the solvent should also be run in parallel to establish the “matrix effect” on
the assay dynamic range, signal to noise ratio, and reproducibility.
For some biological targets, no active molecules have been identified. For homogeneous
assays that use recombinantly expressed purified proteins or membrane preparations,
chaotropic agents such as guanidine and SDS can be used as positive controls due to their
denaturing effect. Care needs to be taken as these types of positive controls can
be misleading when trying to establish assay tolerability to the vehicle-solvent concentration,
as they can trigger the expected assay response, even at a high vehicle-solvent concentration,
while displaying the hallmarks of a working assay (e.g., stable baseline, no precipitation,
and a large dynamic range). However, the assay may no longer be able to detect an
active compound using these conditions. Residual solvent in the samples, such as n-BuOH, can itself act as a chaotropic agent, disrupting the structure of, or worse,
denaturing proteins [86]. A rule-of-thumb is to assume that there is no tolerance to the solvent in these
assays until an active compound has been identified suitable for use as a positive
control.
The solvent effect can be avoided if the vehicle-solvent used to transfer a library
into the assay microtiter plate can be evaporated and the extract resuspended in assay
buffer. In most cases, the resuspension of extracts/prefractions is straightforward
except for a small number of extracts that can emulsify during the brisk agitation
in the assay buffer. The effect of this transfer, drying, and resuspension cycle should
also be evaluated on the positive control (generic inactive extract spiked with active
compound).
Assay optimization and pilot study: The experimental conditions of each assay need to be optimized with regard to parameters
such as reagent and extract concentrations, reagent addition times, endpoint window
to account for different instrument reading times, replicate reproducibility, and
DMSO tolerance. Once these parameters have been established, a pilot study is undertaken
on a subset of representative extracts and prefractions, and the pure natural product
library. The data from the representative extracts are then analyzed for assay robustness
and from these results, promising hits are moved to the dereplication stage. These
data can also be extrapolated to give an estimated overall assay hit rate. The pure
natural product library is used to identify compounds of interest and frequent hitters
that are flagged for dereplication.
For an example of this process, consider the marine invertebrate n-BuOH extract library, which has already been formatted in a screening-ready format
in deep 96-well plates. The extracts were formatted at several w/v concentrations
(25, 2.5, and 0.25 mg/mL), relative to the initial concentration of the whole extract
before n-BuOH/H2O partition. These concentrations offer flexibility and accuracy when plating sub-mg
to mg amounts in assay wells, where incubation volumes range from 10 to 250 µL.
Cytotoxicity is not an issue for cell-based assay protocols that use short incubation
times (30–45 minutes) or homogenous assays, and extracts in the several hundreds of
µg/mL can be used in these assays. For cell-based assays that require longer incubation
times, usually only concentrations less than 10 µg/mL can be used to avoid observing
toxicity. Cell toxicity should be tested with a large concentration range of “generic
inactive extract”, e.g., 0 to 500 µg/mL, using alamarBlue, cytosolic lactate dehydrogenase
release, or similar readouts, in order to estimate the EC50 profile of the library genus matrix. The incubation duration depends on the assay
type and can vary from 30 minutes for ion channel cell-based assays [87], [88] to 15–18 hours for a lipid droplet formation assay [89], [90], or up to four days for a stem cell commitment assay that was developed to identify
molecules stimulating commitment of neural precursor cells to neuronal lineage [91], [92]. Care needs to be taken when using DMSO-containing libraries as most cell-based
assays can only tolerate < 0.5 % DMSO.
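The relationship between stock concentration, well volume, and DMSO tolerance can be checked with a short calculation. The Python sketch below is illustrative only: the function name `plan_transfer` is hypothetical, and it assumes that the stock is dissolved in neat DMSO and that the final well volume includes the transferred stock. The stock concentrations, the 10 µg/mL target, and the 0.5 % DMSO limit are taken from the text; the 100 µL well volume is an arbitrary value within the 10 to 250 µL range.

```python
def plan_transfer(stock_mg_per_ml, target_ug_per_ml, well_volume_ul):
    """Return (stock volume in µL, final DMSO % v/v) for one assay well.

    Assumption: the stock is in neat DMSO and the final well volume
    (stock plus assay medium) equals well_volume_ul.
    """
    if stock_mg_per_ml <= 0 or well_volume_ul <= 0:
        raise ValueError("stock concentration and well volume must be positive")
    stock_ug_per_ml = stock_mg_per_ml * 1000.0
    # C1 * V1 = C2 * V2, solved for V1
    stock_volume_ul = target_ug_per_ml * well_volume_ul / stock_ug_per_ml
    dmso_percent = 100.0 * stock_volume_ul / well_volume_ul
    return stock_volume_ul, dmso_percent


# Compare the three stock concentrations for a 10 µg/mL test in a 100 µL well.
for stock in (25.0, 2.5, 0.25):
    volume, dmso = plan_transfer(stock, target_ug_per_ml=10.0, well_volume_ul=100.0)
    status = "ok" if dmso < 0.5 else "exceeds 0.5 % DMSO"
    print(f"{stock} mg/mL stock: {volume:.2f} µL, {dmso:.2f} % DMSO ({status})")
```

Under these assumptions, only the more concentrated stocks keep the final DMSO content below the 0.5 % limit, at the cost of sub-microliter transfer volumes that generally call for acoustic or pin-tool dispensing.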
The next step is to decide whether the natural product library will be screened at
a single concentration or at multiple concentrations in singlicate or duplicate experiments.
The two extremes are libraries screened at a single concentration in singlicate and
libraries screened using quantitative HTS (qHTS), in which samples are screened at
multiple concentrations to generate concentration-response curves [93], [94]. The limited supply of some natural product extracts is also a consideration when
deciding on screening numbers. We would recommend screening at two concentrations
when practicable, usually 10- to 100-fold apart ([Fig. 2]). Besides providing concentration-dependence information, these data are useful
for ranking hits in order of potency, as well as identifying extracts that are only
active at the lower concentration. Extracts not active at the higher concentration
could have their activity masked by a matrix effect or cell cytotoxicity or be false
positives. Examples of screenings of a natural product library at low to very low
concentrations are starting to appear in the literature [95]. We recommend that this strategy be used for cell-based assays, but it can also
be used for biochemical assays.
Fig. 2 Screening at high concentrations (Zone 1) leads to higher hit rates (continuous line)
that identify a predominance of compounds with low target affinity (dashed line).
For example, screening a set of marine invertebrate extracts at 10 µg/mL w/v with
compounds at a relative abundance of 1–10 % (average MW of 500) is equivalent to testing
pure compounds in the 0.2 to 2 µM range, which often results in the identification
of hits in the mid- to low µM affinity range. Although mid- to low nM actives can
also be identified when screening in Zone 1, the extract matrix effect and compound
interference can also mask the activity. Screening in Zone 1 for cell-based assays
where cytotoxicity often leads to “bell-shaped” dose-response curves will miss relatively
abundant mid- to low nM actives. Screening the extracts at 10- and 100-fold lower
concentrations (Zone 2 and Zone 3) reduces the matrix and interference effects leading
to identification of extracts that would have been missed or not prioritized when screening
in Zone 1. As a consequence, we recommend screening natural product extracts and prefractions
when practicable using at least two concentrations, especially for cell-based screening.
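The conversion used in the Fig. 2 caption, from an extract concentration in µg/mL to the approximate molar concentration of an individual component, can be written out explicitly. The following Python sketch is a minimal illustration; the function name `compound_micromolar` is hypothetical, and the zone concentrations of 1 and 0.1 µg/mL are simply the 10- and 100-fold dilutions of the 10 µg/mL example. It reproduces the 0.2 to 2 µM range quoted for Zone 1 with an average MW of 500 and a relative abundance of 1–10 %.

```python
def compound_micromolar(extract_ug_per_ml, abundance_fraction, mol_weight):
    """Convert an extract concentration to the µM concentration of one component.

    abundance_fraction is the mass fraction of the component in the extract
    (0.01 for 1 %); mol_weight is in g/mol.
    """
    if mol_weight <= 0:
        raise ValueError("molecular weight must be positive")
    compound_ug_per_ml = extract_ug_per_ml * abundance_fraction
    # µg/mL equals mg/L; dividing by g/mol gives mmol/L, so scale by 1000 for µmol/L
    return compound_ug_per_ml / mol_weight * 1000.0


# Zones 1 to 3: 10 µg/mL and its 10- and 100-fold dilutions.
for zone, extract_conc in enumerate((10.0, 1.0, 0.1), start=1):
    low = compound_micromolar(extract_conc, 0.01, 500.0)
    high = compound_micromolar(extract_conc, 0.10, 500.0)
    print(f"Zone {zone}: {extract_conc} µg/mL extract, {low:.3g} to {high:.3g} µM per component")
```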
Matrix components that lead to assay interference
Colored or fluorescent compounds: These types of compounds can interfere with colorimetric and fluorescent assays
and result in both false positives and negatives depending upon the assay readout.
For example, compounds with molecular weights in the range of 300 to 600 Da can be
yellow to red in color, which can interfere with the colorimetric readouts, or absorb
and fluoresce in a yellow-green spectrum range similar to fluorescein, which is commonly
used in fluorescence polarization-based binding competition assays [96]. As these compounds are often present in the assay in the µM range, which is around
1000-fold higher than the assay probe present in the nM range, most of the energy
of excitation will be absorbed by the compounds and probe-derived fluorescence will
decrease accordingly. In fluorescence polarization-based assays, fluorescence interference
will decrease the total fluorescence (parallel and perpendicular) and increase the
calculated fluorescence polarization ratios. Fluorescence interference in these types
of assays can be decreased by using far-red fluorescent probes [96].
Micelle formation and aggregation: Some extracts, prefractions, and pure compounds can form micelles in assay buffer
that can destabilize a purified system (recombinant protein, membrane preparation)
in a biochemical assay. The micelles can cause toxicity in cell-based assays, while
in homogeneous assays they can trap fluorescent probes, fluorescent chemosensors,
or fluorescence-tagged reagents leading to a high local concentration that can interfere
with the fluorescence readout (quenching) or prevent functional chemosensing. Micelles
are usually formed by detergent-like molecules such as fatty acids and sulfated compounds,
but this behavior can also be observed with other types of molecules. Simple assays
are available to measure critical micelle concentration (CMC) and can easily be implemented
in 96- or 384-well plate format [97]. Detergent-like molecules present can also form colloidal aggregations that cause
assay interference [98], [99], [100], [101], [102], which can usually be identified through the addition of detergent to disrupt aggregation
[98], [103]. Finally, there have also been reports of detergent-like molecules being released
from the plasticware that can cause assay interference [104], [105].
Media-derived interference: The media used to grow microorganisms is often a soup of diverse ingredients that
include salts, amino acids, proteins, and complex biological products. Some assays
are sensitive to the different salts [106] such as FLIPR assays that detect Ca2+ flux [107], or assays involving metalloproteases, which contain cations in their active sites
[108]. Other examples include the production of large amounts of aluminum dioxalate by
some fungi grown using vermiculite-based solid media [109] and the bacterial biotransformation of soybean-derived glycosylated isoflavones
to the biologically active aglycones genistein and daidzein.
Pan-active bioactive compounds: Broad-spectrum kinase and protease inhibitors are pivotal modulators of cell protein-protein
interactions and cell signaling. The abundance of these broad-spectrum cell-signaling
modulators in some natural extracts may “paralyze” cells, without the usual hallmarks
of cytotoxicity, and mask compounds of biological interest, which need a “functional
cell” to modulate a phenotype, like cell differentiation, secretion, and trafficking.
Examples include bacteria and marine invertebrate extracts that contain staurosporine-like
kinase inhibitors [110], and marine algae and venom extracts that contain protease inhibitors [111], [112].
Moving from the assay to bioactive compounds
Assay quality control: Z-factor (or Z′) analysis is a statistical method used to indicate whether an assay
is suitable for HTS campaigns [113]. Z-factor is an important quality control that should be calculated for every completed
assay microtiter plate. The value should remain relatively constant during screening,
but if a variation is observed, then there could be issues with the screen performance.
These variations can be caused by poor library quality, batch-to-batch cell and protein
quality differences, incorrect assay optimization, or instrument issues. Assay quality, as judged by the Z-factor,
ranges from poor (Z < 0), which occurs when the negative and positive control signals overlap too much,
through marginal (0 < Z < 0.5) to excellent (0.5 < Z < 1).
In many instances, assays cannot be optimized beyond the marginal level, but this
is often sufficient for screening with extra care taken to analyze the data. For example,
for fluorescence polarization assays, the signal-to-noise ratio is usually in the
1.5 to 4 range leading to Z-factors of around 0.5, but this assay format is precise,
robust, and reproducible and, in our opinion, the technology of choice for homogeneous-type
assays to screen natural products [114].
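The per-plate calculation can be sketched as follows. This is a minimal illustration that assumes the commonly used definition computed from control wells, Z = 1 − 3(σpos + σneg)/|μpos − μneg|, and uses hypothetical control readings. The function names and the handling of the exact boundary values are our assumptions.

```python
from statistics import mean, stdev

def z_factor(positive, negative):
    """Z = 1 - 3*(sd_pos + sd_neg) / |mean_pos - mean_neg| for one plate's control wells."""
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z-factor is undefined")
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation

def classify(z):
    """Map a Z-factor onto the quality bands described in the text.

    Assumption: values exactly on a boundary fall into the higher band.
    """
    if z < 0:
        return "poor"
    if z < 0.5:
        return "marginal"
    return "excellent"

# Hypothetical control readings from a single plate (arbitrary signal units).
positive_controls = [98.0, 102.0, 100.0, 101.0, 99.0]
negative_controls = [10.0, 12.0, 11.0, 9.0, 13.0]

z = z_factor(positive_controls, negative_controls)
print(f"Z-factor = {z:.2f} ({classify(z)})")
```

Tracking this value plate by plate makes any drift during a campaign visible early, before a full set of plates has been compromised.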
The hit rate determined during the pilot study of the various library subclasses can
also be extrapolated to the larger screening sets. If the hit rates diverge considerably
between the pilot study and the screening campaign, then it is worth further analysis
of the hits as this could be a flag for poor assay performance and extract quality.
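One way to judge whether pilot and campaign hit rates "diverge considerably" is a simple statistical comparison. The sketch below applies a pooled two-proportion z-test. The choice of test, the function name, and all counts are assumptions made for illustration only and are not part of the workflow described here.

```python
from math import erf, sqrt

def compare_hit_rates(pilot_hits, pilot_n, campaign_hits, campaign_n):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_pilot = pilot_hits / pilot_n
    p_campaign = campaign_hits / campaign_n
    pooled = (pilot_hits + campaign_hits) / (pilot_n + campaign_n)
    se = sqrt(pooled * (1 - pooled) * (1 / pilot_n + 1 / campaign_n))
    if se == 0:
        raise ValueError("hit rates are both 0 % or both 100 %; nothing to compare")
    z = (p_campaign - p_pilot) / se
    # Standard normal CDF evaluated at |z|, via the error function.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - normal_cdf)

# Hypothetical counts: 1.0 % hit rate in the pilot versus 3.0 % in the campaign.
z, p_value = compare_hit_rates(20, 2_000, 3_000, 100_000)
print(f"z = {z:.2f}, p = {p_value:.2g}")
```

A large |z| only indicates that the difference is unlikely to be sampling noise. Whether it reflects assay drift, extract quality, or a genuine difference between library subclasses still has to be established by inspecting the hits.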
Prioritizing extracts/prefractions for evaluation and bioassay-guided isolation: During the screening campaign, it is recommended that carefully selected assay hits
from the different screening libraries enter the dereplication [68], [70], [115], [116] stage to identify commonly occurring active compounds and to get a head start on
identifying new bioactive molecules ([Fig. 3]). Once the screening campaign is completed, then the overall hit rate of the screen
can be calculated. The hit rate is usually between 0 % and 5 % depending upon the assay
setup and hit selection cutoff, but for some screens such as cell line cytotoxicity
and gram-positive bacteria whole cell assays, the hit rate can be up to 15 % depending
upon the library.
Fig. 3 Schematic of a typical dereplication process. Up to 1 mg of crude extract or prefraction
is separated using HPLC (C18 column) and the eluting compounds analyzed using a photodiode
array detector (PDA). The eluent is then split with a majority flowing into a microtiter
plate and the remainder analyzed using ESI-MS and MS/MS. The microtiter plate contents
are dried using centrifugal evaporation and DMSO is added to each well ready for screening.
It is prudent to add the crude extract or originating fraction to the plate at two
concentrations for comparative purposes. The (A) retention time (RT), (B) UV spectra,
(C) MS, and (D) MS/MS of compounds present in the active fractions are analyzed and
compared to in-house databases and commercial software such as the Dictionary of Natural
Products (Taylor & Francis Group), MarinLit (Royal Society of Chemistry), AntiBase
(Wiley-VCH), and SciFinder (American Chemical Society). Work is discontinued on samples
where the activity is accounted for by the identified compound or compounds, while like
extracts are grouped for further evaluation.
First, letʼs consider a manageable hit rate of around 0.2 to 0.5 %, which would involve
further evaluation of 200 to 500 samples from screening of a 100 000 member library.
The active samples are first cherry-picked from the library and then retested in the
screening assay to confirm activity. The next stage is to obtain a dose response over a 4-log
concentration range for the retested positive samples and then test them in appropriate orthogonal
and secondary assays. An orthogonal assay interrogates the same target as the HTS assay but
uses a different screening format, which helps to identify false positive results.
Orthogonal assays are especially important when screening crude extracts due to the
potential for a variety of interference compounds. Frequently run secondary assays
include whole cell cytotoxicity, antifungal, gram-positive and gram-negative bacteria
screening, and protease, kinase, and GPCR panel profiling to analyze for specificity.
The retesting and secondary screening phases usually take at least 2 weeks depending
upon the amount of secondary screening required. The number of samples for further
evaluation should now have been reduced to below 50 and after dereplication will result
in a reasonable number for bioassay-guided isolation.
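The arithmetic behind this workload, and one possible layout for the dose-response step, can be sketched as follows. The half-log spacing, the top concentration of 100 (in whatever unit the library is plated), and the function names are assumptions chosen for illustration; only the library size, the hit rates, and the 4-log range come from the text.

```python
def expected_hits(library_size, hit_rate):
    """Number of samples that go forward for retesting at a given hit rate."""
    return round(library_size * hit_rate)

def dose_series(top_conc, logs=4, points_per_log=2):
    """Concentrations spanning `logs` orders of magnitude below `top_conc`.

    points_per_log=2 gives half-log steps (an assumption, not a requirement).
    """
    n_points = logs * points_per_log + 1
    return [top_conc * 10 ** (-i / points_per_log) for i in range(n_points)]

for rate in (0.002, 0.005):
    hits = expected_hits(100_000, rate)
    print(f"{rate:.1%} of 100 000 samples -> {hits} samples to retest")

series = dose_series(100.0)
print(", ".join(f"{c:.3g}" for c in series))
```

With these assumed settings each confirmed hit occupies nine wells per replicate, which is worth factoring into plate and sample-consumption planning before cherry-picking begins.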
However, what do you do with a hit rate of around 5 to 10 %, which would involve further
evaluation of 5000 to 10 000 samples from screening a 100 000 member library? An example
of a screen with a potentially high hit rate is the screening of bacterial-derived
extracts and prefractions against the gram-positive bacterium Staphylococcus aureus. The hit rate could be lowered by raising the cutoff threshold, but this is risky
as extracts or prefractions that contain lower abundance actives will not be selected
and potential leads not identified. The only practical way forward with this number
of extracts is to undertake considerable secondary screening or cross-referencing
with existing screening data, as well as undertaking considerable dereplication, either
through sheer force of numbers or by rapid analysis for common hitting compounds by
LC-MS/MS without sample collection. Extracts can be further clustered by their biological
profiles, dereplication profiles, and taxonomy.
The next step is to undertake bioassay-guided isolation [24], [117], [118] to identify the bioactive compounds present in the extract. Guided by the dereplication
profile, a chromatography step is undertaken, and the fractions are subjected to testing
alongside the crude extract or originating fraction. This process is repeated until
the most active compound or compounds are identified that account for the activity
of the extract or prefraction. For biological evaluation, the compound needs to have
> 95 % purity, especially for structure-activity studies. Always be on the lookout
for samples where there could be a minor component present that is significantly adding
to, or responsible for, the activity. For example, for µM active compounds, the presence
of < 1 % of an nM active compound could account for the activity yet escape
detection by NMR or even LC-MS. If this is a possibility, then isolate related compounds
to see if they are active or collect sub-fractions of the peak and look for fractions
that are equally active.
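The scale of this effect is easy to check with a back-of-the-envelope calculation. The sketch below assumes the simplest case: the major component is inactive, the two compounds have similar molecular weights, and they do not interact in the assay. Under those assumptions the sample appears active at the minor component's IC50 divided by its fraction of the sample. The function name and the potency values are hypothetical.

```python
def apparent_ic50(minor_ic50, minor_fraction):
    """Apparent IC50 of a sample whose only active constituent is a minor component.

    minor_ic50     -- IC50 of the minor component (any concentration unit)
    minor_fraction -- its fraction of the sample, between 0 (exclusive) and 1
    """
    if not 0 < minor_fraction <= 1:
        raise ValueError("minor_fraction must be in (0, 1]")
    return minor_ic50 / minor_fraction

# Hypothetical contaminant with an IC50 of 10 nM at different impurity levels.
for fraction in (0.01, 0.005, 0.001):
    ic50_nM = apparent_ic50(10.0, fraction)
    print(f"{fraction:.1%} impurity -> apparent IC50 of about {ic50_nM:.0f} nM")
```

In this idealized case a 1 % contaminant with an IC50 of 10 nM would make an otherwise inactive sample look like a 1 µM active, which is why a > 95 % purity figure alone does not rule out a minor active component.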
Conclusion
The unique chemical structures and biological activities of natural products make
them attractive candidates for drug lead discovery and as chemical probes. Most large
pharmaceutical companies have abandoned natural product-based lead discovery, which
has created great opportunities for university-based research groups and small biotechnology
companies. Expertise is often available in these organizations to validate the drug
leads and undertake preclinical drug development using in-house resources and CROs.
However, if natural product-based libraries are going to be efficiently used to screen
for new drug leads, they need to be of the highest quality. To start assembling this
type of library, we first recommend that analysis methods (including dereplication)
to assess metabolite diversity are implemented, all organisms used to generate the
library are taxonomically identified, and a pure natural product library is assembled
to aid in pilot screening studies and dereplication. The library should be a balanced
mixture of crude extracts, prefractions, and pure compounds to optimize taxonomic
and chemical space coverage, while minimizing the number of screening points. The
library should be subdivided into smaller screening sets based on taxonomy, microorganism
cultivation conditions, and sample preparation methods. The library should also be
dynamic with extracts and prefractions added and removed from the library in response
to chemical analyses and screening results. The next step in the process is to develop
(or work out with collaborators) robust assays that are suitable for screening natural
product samples. Orthogonal assays also need to be developed to help identify false
positive results, as well as secondary assays to help select the best hits for further
evaluation. Before full scale campaign screening is undertaken, a pilot study needs
to be completed and the data analyzed. We also recommend that the library is screened
at two concentrations when practicable, especially for cell-based assays. Active samples
are dereplicated, and selected samples then undergo bioassay-guided isolation to identify
the active compound(s) responsible for the activity. With luck, these biologically
active compounds have the potential to be new drug leads and chemical probes.
Acknowledgements
This paper was prepared with the support of NHMRC grant AF511105. MSB is supported
by a Wellcome Trust Seeding Drug Discovery Award (094 977/Z/10/Z).