
DOI: 10.1055/a-2665-1298
Quantitative Analysis of Pravastatin Sodium Polymorphs: a Comparative Study of Chemometric Techniques Combined with Powder X-ray Diffraction, Mid-Infrared, and Raman Spectroscopy
Funding None.
Abstract
Pravastatin sodium (PS) is a hydrophilic statin lipid-lowering drug that reduces low-density lipoprotein levels in the blood by inhibiting 3-hydroxy-3-methylglutaryl coenzyme A reductase. It is known to exist in 17 crystalline forms, and the powder diffraction patterns of some forms overlap, which makes the purity of a given crystalline form difficult to control. In this study, we aimed to determine the crystalline purity of PS using powder X-ray diffraction (PXRD), mid-infrared (MIR) spectroscopy, and Raman spectroscopy. Partial least squares (PLS) models were constructed, and their predictive ability was assessed with datasets partitioned by the SPXY, K_S, and Random methods at different partitioning ratios. PLS calibration curves were established from the relationship between the PXRD, MIR, and Raman data and the content of one solid form of PS (PS-A) over different ranges (full and partial spectra), using preprocessing algorithms such as multiplicative scatter correction, standard normal variate, Savitzky–Golay filtering, and derivative spectroscopy, alone or in combination. The calibration model established with PXRD (y = 0.999x + 0.008, R² = 0.999) performed best, with a low detection limit (1.52%) and quantification limit (4.60%). In addition, analysis of blind samples showed that the confidence intervals of the values predicted by MIR and Raman were wider, indicating larger uncertainty in their parameter estimates. The calibration model established by PXRD is therefore the better choice for determining the crystalline purity of PS in actual production, and it can provide more reliable methodological support for the quality control of pharmaceutical products.
Introduction
Polymorphism is a common phenomenon in solid drugs, wherein the same drug substance presents different solid forms owing to variations in molecular arrangement or conformation. The physicochemical properties of these crystalline forms can influence the drug's processing characteristics and bioavailability and ultimately affect its overall efficacy.[1] [2] [3] [4] The thermodynamic stability of drug crystals can vary with their chemical composition, which poses a significant challenge to maintaining the quality and efficacy of pharmaceuticals.[5] The U.S. Food and Drug Administration, in its draft guidance, recommends monitoring and controlling polymorphs in drug substances and drug products to ensure the reproducibility of drug production and quality.[6] [7]
Pravastatin sodium (PS) ([Fig. 1]) is a statin drug used for the treatment of lipid disorders, which acts mainly by inhibiting 3-hydroxy-3-methylglutaryl coenzyme A reductase.[8] PS tablets were first marketed in Japan in 1989 and then gradually in the United States, Europe, and other countries and regions. PS has 17 reported crystal forms with similar physicochemical properties,[9] and several companies are developing or have developed PS active pharmaceutical ingredients (APIs) with preparation processes optimized for different crystal forms. PS-A and PS-D are two crystal forms of PS found during the characterization of APIs. PS-A is stable and has low hygroscopicity, which favors the preparation and storage of the drug and extends its shelf life, and it is used as the medicinal crystal form in the pharmaceutical industry. PS-D, in contrast, is more hygroscopic, which affects the stability of PS. The purity of the crystalline forms is difficult to control because the powder diffraction patterns of the polymorphs partially overlap. It is therefore necessary to establish a quantitative model to control the purity of the pharmaceutical crystalline form. Given the complexity of PS polymorphism, this paper focuses on the quantitative modeling of the pharmaceutical crystalline form A. Although the polymorphs of PS have been investigated in several previous studies, quantitative analysis of these forms remains underexplored.


Several methods are available for the quantitative analysis of crystalline forms, including powder X-ray diffraction (PXRD),[10] [11] [12] [13] [14] differential scanning calorimetry (DSC),[14] [15] [16] [17] spectroscopic methods such as near-infrared (NIR) spectroscopy,[18] [19] [20] Raman spectroscopy,[19] [21] [22] [23] [24] and mid-infrared (MIR) spectroscopy,[10] [19] [25] [26] as well as solid-state nuclear magnetic resonance spectroscopy[27] [28] and terahertz spectroscopy.[29] PXRD and DSC are commonly used but are affected by factors such as preferred orientation and sample packing. The combination of spectroscopic techniques, including NIR, MIR, and Raman spectroscopy, with chemometrics has become an active research topic.[30] [31] [32] [33] Quantitative spectral analysis relies on differences in crystal packing and molecular vibrations between crystal forms, but overlapping characteristic peaks, subtle spectral differences, and nonlinear relationships make the analysis more difficult. Multivariate models such as partial least squares (PLS) regression and principal component regression can address these problems and achieve quantitative analysis by extracting the relevant information.[34] [35] [36] In addition, raw spectra contain noise, so the data need to be divided into datasets and preprocessed to reduce the impact of noise on prediction accuracy. Dataset division methods include SPXY (sample set partitioning based on joint x–y distances), Kennard–Stone (K_S), and random selection (Random). Preprocessing methods include multiplicative scatter correction (MSC), standard normal variate (SNV), Savitzky–Golay filtering, and derivative spectra.[37] [38] [39]
This work aimed to establish a method for the quantitative analysis of the PS-A crystalline form. A review of the literature and patents shows that quantification of this crystalline form has not been reported. Here, PS-A in PS binary mixtures was quantified using PXRD, MIR, and Raman methods. The predictive ability of the PLS models was investigated with the SPXY, K_S, and Random division methods at different division ratios. PLS calibration curves were established from the relationship between the PXRD, MIR, and Raman data and the PS-A content over different ranges (full and partial spectra), using preprocessing algorithms including MSC, SNV, Savitzky–Golay filtering, and derivative spectroscopy, alone or in combination. A method suitable for quantifying the PS-A content in PS binary mixtures was identified by comparing the performance of the calibration models established by the different methods.
Material and Methods
Material and Sample Preparation
PS API (101240701∼101240706) was purchased from Shanghai Tianwei Biopharmaceutical Co., Ltd. (Shanghai, China) with a purity > 99%, whereas PS-A and PS-D were prepared in the laboratory and characterized by PXRD, MIR, and Raman. PS-A and PS-D were passed through a 100-mesh sieve and then mixed. To minimize sampling differences caused by sample inhomogeneity, binary mixtures were prepared separately for each technique.
For the PXRD analysis, the samples were mixed using ultrasonication. Binary mixtures (40 mg in total) containing 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100% PS-A were prepared, with the remaining mass made up by PS-D. An appropriate amount of ether was added, and each mixture was sonicated for 1 minute and then dried in an oven at 40°C for 90 minutes. Pure PS-A and PS-D were tested before and after this treatment to exclude any phase change during sonication and drying.
For MIR analysis, binary mixtures (1.3 mg) containing 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100% PS-A were prepared, with the remaining mass made up by PS-D. Each mixture was ground with 200 mg of KBr for 3 minutes, pressed into a pellet, and set aside.
For Raman analysis, binary mixtures (100 mg) containing 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100% PS-A were prepared, with the remaining mass made up by PS-D. The samples were kept in polyethylene tubes sealed with sealing film, vortex-mixed for 5 minutes, and then shaken in a thermostatic shaker at constant temperature for 6 hours before use.
All chemicals used were of analytical grade. Samples were weighed gravimetrically. The PXRD and Raman samples were weighed on a Mettler Toledo balance with a readability of 1/100,000 g (minimum weight 10.00 mg), and the MIR samples were weighed on a Mettler Toledo balance with a readability of 1/1,000,000 g (minimum weight 1.136 mg).
Data Acquisition and Analysis
Data Acquisition
PXRD data were collected using a BRUKER D8 ADVANCE A25 powder X-ray diffractometer (Bruker, United States) at room temperature. Cu Kα radiation (λ = 1.5418 Å) was used, with a 2θ scanning range of 3 to 35°, a step size of 0.02°, and a scanning time of 1 second per step. Data were processed using Jade 6.5 software.
The MIR data were collected on an IRTracer-100 spectrometer (Shimadzu, Japan) using the potassium bromide pellet method. The samples were scanned over the range of 400 to 4,000 cm−1 with 32 scans and a resolution of 2 cm−1.
Raman data were acquired with a DRX3xi Raman imaging microscope (Thermo Scientific, United States), using a 532 nm laser with an output power of 8 mW, an exposure time of 0.01 seconds, a total of 50 exposures, and a scanning range of 50 to 3,400 cm−1.
Data Set Division and Division Ratio
Three algorithms, namely SPXY, K_S, and Random, were used to divide the dataset, and a PLS model for the quantitative analysis of the PS-A crystalline form was then established for each division to compare the advantages and disadvantages of the division methods. The division ratio H, defined as the ratio of the prediction set to the calibration set, was varied from 1:4 to 1:1. The division ratios were evaluated in terms of the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), and root mean square error of cross-validation (RMSECV), together with the correlation coefficients of the calibration set (RC) and prediction set (RP), and the optimal division method and ratio were selected.
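The SPXY division described above can be sketched in a few lines. This is a minimal illustration, not the authors' MATLAB implementation: the function name `spxy_split` and its arguments are hypothetical, and the sketch assumes the commonly described form of SPXY, in which spectral and content distances are each normalized by their maximum, summed, and fed to a Kennard–Stone style max-min selection.

```python
import math


def spxy_split(X, y, n_cal):
    """Pick n_cal calibration indices by SPXY; the rest form the prediction set.

    X is a list of spectra (lists of floats), y a list of PS-A contents.
    """
    n = len(X)
    if not 2 <= n_cal <= n:
        raise ValueError("n_cal must be between 2 and the number of samples")
    # Pairwise distances in spectral space (dx) and in content space (dy).
    dx = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    dy = [[abs(y[i] - y[j]) for j in range(n)] for i in range(n)]
    max_dx = max(map(max, dx)) or 1.0  # guard against identical samples
    max_dy = max(map(max, dy)) or 1.0
    # Joint x-y distance: each part is normalized so both carry equal weight.
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)]
         for i in range(n)]
    # Start from the two most distant samples.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: d[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_cal:
        # Add the sample whose nearest selected neighbour is farthest away.
        best = max(remaining, key=lambda i: min(d[i][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return sorted(selected), remaining
```

Dropping the `dy` term from the joint distance turns the same loop into a plain Kennard–Stone (K_S) selection, which is why the two methods can be compared at identical division ratios.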
Data Analysis
Many preprocessing algorithms are available, each with its own advantages and disadvantages. In this work, MSC, SNV, Savitzky–Golay filtering, and derivative spectroscopy were used to remove irrelevant baseline (offset) interference, correct scattering effects, and enhance the spectral signals of interest.
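Two of the scatter corrections named above, SNV and MSC, are simple enough to sketch directly. The functions below are an illustrative standard-library version under the usual textbook definitions (SNV scales each spectrum by its own mean and standard deviation; MSC regresses each spectrum on the mean spectrum and removes the fitted offset and slope). The names `snv` and `msc` are hypothetical and do not come from the original MATLAB workflow.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(v - mean) / sd for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_points = len(spectra[0])
    reference = [statistics.fmean(s[k] for s in spectra) for k in range(n_points)]
    ref_mean = statistics.fmean(reference)
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = statistics.fmean(s)
        # Least-squares fit: s ≈ intercept + slope * reference
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(reference, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected
```

Both corrections remove additive offsets and multiplicative scaling between spectra; MSC depends on the whole set through the reference spectrum, whereas SNV treats each spectrum independently.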
MATLAB 2020 was used for preprocessing, combined with PLS regression to establish calibration models for PXRD, MIR, and Raman. The PLS algorithm extracts latent components from the independent variables (X) and the dependent variable (Y) simultaneously, so as to best explain the relationship between X and Y while retaining the key information of the original variables, and thus efficiently establishes a quantitative relationship between the spectral data and the crystal form content. The optimal number of PLS factors was selected by full cross-validation (leave-one-out). The quality of the model was evaluated based on the coefficient of determination (R²), RMSEC, RMSEP, and RMSECV. The root mean square errors were calculated according to Equation (1):

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}{n}} \tag{1}$$

where $y_i$ is the theoretical value, $\hat{y}_i$ is the calculated value, and $n$ is the number of samples. The limit of detection (LOD) and limit of quantification (LOQ) were estimated as 3.3 and 10 times the standard deviation of the blank divided by the slope of the calibration curve, respectively. The standard deviation of the blank was replaced by the standard deviation of three measurements at the lowest concentration. To verify the accuracy of the calibration model, several samples with known PS-A concentrations were each measured three times by PXRD, MIR, and Raman. LOD and LOQ were calculated using Equations (2) and (3):

$$\mathrm{LOD} = \frac{3.3\,\sigma}{S} \tag{2}$$

$$\mathrm{LOQ} = \frac{10\,\sigma}{S} \tag{3}$$

where $\sigma$ is the standard deviation of the predicted content values, and $S$ is the slope of the calibration curve.
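As a small worked illustration of these figures of merit, the sketch below computes a root mean square error and the LOD and LOQ from replicate predictions, using the 3.3σ/S and 10σ/S definitions stated above. The helper names are hypothetical, and any input values used with them should be treated as illustrative rather than as measurements from this study.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root mean square error between theoretical and calculated values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def lod_loq(replicate_predictions, slope):
    """LOD = 3.3*sigma/S and LOQ = 10*sigma/S.

    sigma is the sample standard deviation of replicate predictions at the
    lowest concentration; slope is the slope S of the calibration curve.
    """
    sigma = statistics.stdev(replicate_predictions)
    return 3.3 * sigma / slope, 10.0 * sigma / slope
```

Because both limits share the same σ and S, their ratio is always 10/3.3 ≈ 3.03, which matches the ratio of the reported PXRD values (4.60% / 1.52%).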
Results and Discussion
Characterization of Pravastatin Sodium Crystal Types A and D
The solid forms of PS (PS-A and PS-D) were characterized using PXRD, MIR, and Raman spectroscopy, and the results are described below.
Powder X-ray Diffraction
As shown in [Fig. 2A], the characteristic peaks of PS-A (2θ = 3.9°, 4.5°, 6.2°, 7.3°, 8.6°, 9.2°, 10.0°, 11.7°, 12.0°, 17.0°, 19.9°) and PS-D (2θ = 3.5°, 6.3°, 9.7°, and 16.9°) were observed in the experimental PXRD patterns, consistent with the patent.[9] For PS-A quantification, the average PXRD patterns of PS binary mixtures with different PS-A contents were collected. As shown in [Fig. 2B], the characteristic peaks of PS-A in the binary mixtures become more intense as the PS-A content increases.


Mid-Infrared
The MIR spectra of PS-A and PS-D solid powder samples, obtained in KBr, are shown in [Fig. 3A]. Although the two polymorphs show similar spectra, small differences can be detected. The bands were assigned following Martín-Islan et al.[40] The strong, broad band at 3,700–3,100 cm−1 is attributed to the ν(OH) vibration of the hydroxyl groups in the pravastatin molecule and of water molecules in the lattice. The band at 3,040–2,800 cm−1 is attributed to the ν(C–H) vibration. The strong, narrow band at 1,727 cm−1 is attributed to the ν(C=O) vibration of the ester group; both forms show this band at the same frequency, and PS-D shows a small shoulder at 1,711 cm−1. The most intense absorption band, at 1,569 cm−1, is attributed to the ν(C=O) vibration of the carboxylate. PS-A shows a 7 cm−1 shift of this maximum to higher frequency (1,577 cm−1), and PS-D shows an additional small shoulder at higher frequency (1,606 cm−1). Several bands attributable to δ(C–H) vibrations appear in the range 1,480–1,230 cm−1, where small differences are observed. The band at 1,220–1,140 cm−1 can be attributed to the ν(C–O) vibration of the ester group, while the band at 1,120–1,000 cm−1 corresponds to the ν(C–O) vibration of the hydroxyl groups. In the range of 890–800 cm−1, PS-D shows a peak at 854 cm−1, whereas PS-A does not. As can be seen in [Fig. 3B], the intensity of the characteristic MIR peaks (marked with orange boxes) varies with increasing PS-A content.


Raman
Raman spectra of PS-A and PS-D solid powder samples, obtained with a 532 nm laser, are shown in [Fig. 4A]. Although the two polymorphs show similar spectra, minor differences can be detected. The Raman spectra were assigned as follows. The C–H stretching vibrations in the range of 3,000–2,800 cm−1 belong to methyl and methylene groups. The strong, narrow band at 1,725 cm−1 is attributed to the C=O stretching vibration of the ester group. The most intense band, at 1,647 cm−1, is attributed to the C=O stretching vibration of the carboxyl group, and both forms show this band at the same frequency. The bands in the range of 1,300–1,000 cm−1 are attributed to the C–O stretching vibration of the ester group. In the range of 200 to 50 cm−1, the two forms differ in peak position: PS-A shows a peak at 134 cm−1, whereas PS-D shows a double-shouldered peak. Similarly, [Fig. 4B] shows how the intensity of the characteristic Raman peaks (marked with orange boxes) changes with PS-A content.


Analysis of the Results of Different Dataset Division Methods and Division Ratio of the Sample
We explored the effect of dataset division on the PLS model under different division ratios. The K_S and SPXY methods were each used to divide the samples at different division ratios, and PLS quantitative models were then established. Over the H interval [0.25, 1], the number of PXRD calibration samples was decreased from 16 to 10 in steps of 1 to change the division ratio. Likewise, the number of MIR calibration samples was decreased from 38 to 24, and the number of Raman calibration samples from 36 to 24, both in steps of 1. Each division was evaluated using the PLS modeling results. The number of principal factors of the PLS model was determined by leave-one-out cross-validation. The Random method cannot divide the samples into calibration and prediction sets of arbitrary size, so its optimal ratio was not explored; only its performance was compared with that of the other two methods.
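Choosing the number of PLS factors by leave-one-out cross-validation, as described above, can be sketched as follows. This is an assumption-laden illustration rather than the authors' MATLAB code: it uses a single-response PLS1 fitted by the NIPALS algorithm, and all function names (`pls1_fit`, `pls1_predict`, `rmsecv_loo`, `best_n_factors`) are hypothetical.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_factors):
    """Fit PLS1 by NIPALS; returns the pieces needed to predict new samples."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[k] for row in X) / n for k in range(m)]
    y_mean = sum(y) / n
    E = [[row[k] - x_mean[k] for k in range(m)] for row in X]  # centred X residual
    f = [v - y_mean for v in y]                                # centred y residual
    factors = []
    for _ in range(n_factors):
        w = [_dot([E[i][k] for i in range(n)], f) for k in range(m)]  # weights ~ X'y
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [_dot(row, w) for row in E]                                # scores
        tt = _dot(t, t)
        p = [_dot([E[i][k] for i in range(n)], t) / tt for k in range(m)]  # X loadings
        q = _dot(f, t) / tt                                            # y loading
        E = [[E[i][k] - t[i] * p[k] for k in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        factors.append((w, p, q))
    return x_mean, y_mean, factors


def pls1_predict(model, x):
    """Predict one sample by deflating it factor by factor."""
    x_mean, y_mean, factors = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in factors:
        t = _dot(e, w)
        y_hat += q * t
        e = [v - t * pk for v, pk in zip(e, p)]
    return y_hat


def rmsecv_loo(X, y, n_factors):
    """Leave-one-out RMSECV: refit without sample i, then predict sample i."""
    sq = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_factors)
        sq += (pls1_predict(model, X[i]) - y[i]) ** 2
    return math.sqrt(sq / len(X))


def best_n_factors(X, y, max_factors):
    """Return the factor count with the lowest RMSECV, plus all scores."""
    scores = {a: rmsecv_loo(X, y, a) for a in range(1, max_factors + 1)}
    return min(scores, key=scores.get), scores
```

Picking the strict minimum of RMSECV is the simplest rule; in practice a more parsimonious factor count is often preferred when the RMSECV curve flattens, which helps limit overfitting on small calibration sets.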
The detailed analysis is included in the [Supplementary Material] (available in online version). The optimal division ratio for each dataset division method is shown in [Table 1], where Cal denotes the number of calibration samples and Val the number of prediction samples. The methods were compared by their RMSEP values at their respective optimal ratios. SPXY gave the smallest RMSEP and is therefore the optimal dataset partitioning method for the PLS models of PXRD, MIR, and Raman.
Abbreviations: PLS, partial least squares; PXRD, powder X-ray diffraction; Raman, Raman spectroscopy; MIR, mid-infrared; RMSEC, root mean squared error of calibration; RMSEP, root mean squared error of prediction; Cal, the number of calibration samples; Val, the number of prediction samples.
Quantification of PS-A in Binary Mixtures
Raw spectral data generated by all of the techniques discussed below are presented in the [Supplementary Material] (available in online version).
Quantitative Modeling of Powder X-ray Diffraction
PXRD patterns of 16 different PS binary mixtures were collected, and calibration curves were plotted. The raw PXRD data were preprocessed using MSC, SNV, Savitzky–Golay filtering, and derivatives, alone or in combination. The preprocessed patterns are shown in [Fig. 5]. After preprocessing, four diffraction ranges of the 16 sets of PXRD data (3–35°; 3–5°; 3–5° plus 6–14°; and 3–5° plus 15–20°) were selected to establish PLS calibration models for the quantitative determination of PS-A content. The results are shown in [Table 2].
Abbreviations: PLS, partial least squares; RMSEC, root mean squared error of calibration; RMSECV, root mean squared error of cross-validation; RMSEP, root mean squared error of prediction.


Based on the R² values in [Table 2], satisfactory PLS regression models could be established for the four diffraction ranges after processing with MSC, SNV, Savitzky–Golay (S-G) filtering, and derivatives. Taking into account the RMSEC, RMSEP, and RMSECV values, the number of PLS factors, and the LOD and LOQ values in [Table 3], the better models were MSC (3–35°), S-G + first-order derivative (3–5°), SNV (3–5°, 6–14°), and SNV (3–5°, 15–20°). The actual (41.548, 76.231, and 94.701%) and predicted PS-A weight percentages for the quantitative models are shown in [Table 3]. The best-performing calibration model was built using PLS after MSC at 3 to 5° and 6 to 14° ([Fig. 6]). Confidence intervals and prediction intervals were then computed for the calibration model. Confidence intervals were used to assess the stability of the model estimates: narrower intervals indicate greater stability, and wider intervals imply greater uncertainty. Prediction intervals give the range expected for new observations, quantify prediction uncertainty, and can also be used to identify potential outliers. Four samples from the prediction set were used to validate the accuracy of the constructed model. Their predicted values were obtained in MATLAB 2020 and fitted against the true values, giving the results shown in [Fig. 7A]. The narrow 95% confidence and prediction intervals suggest that the model is reliable, and all four prediction samples fell within the prediction intervals.




Abbreviations: PS-A, pravastatin sodium crystal form A; PXRD, powder X-ray diffraction; Raman, Raman spectroscopy; MIR, mid-infrared.
Quantitative Modeling of Mid-Infrared
Thirty-five sets of MIR spectra of different PS binary mixtures were collected for calibration modeling. Before modeling, the raw MIR data were preprocessed using MSC, SNV, Savitzky–Golay filtering, and derivative spectroscopy, alone or in combination; the preprocessed results are shown in [Fig. 8]. After preprocessing, four spectral ranges of the 35 sets of MIR data (4,000–400 cm−1; 898–666 cm−1; 1,790–1,000 cm−1; and 898–666 plus 1,790–1,000 cm−1) were used to build PLS calibration models to quantify the PS-A content. The results are shown in [Table 4].
Abbreviations: PLS, partial least squares; PS-A, pravastatin sodium crystal form A; RMSEC, root mean squared error of calibration; RMSECV, root mean squared error of cross-validation; RMSEP, root mean squared error of prediction.


According to the data in [Table 4], the calibration models established for the four spectral ranges after treatment with MSC, SNV, Savitzky–Golay (S-G) filtering, and derivative spectroscopy, alone or in combination, are similar, with the R² values, PLS factor numbers, RMSEC, RMSEP, and RMSECV all indicating good calibration ability. The LOD and LOQ values were smallest for the model treated by MSC (4,000–400 cm−1), which was theoretically the most suitable regression model. However, judged by actual quantitative performance, the best calibration model was the one processed by S-G + second-order derivative (898–666, 1,790–1,000 cm−1) ([Fig. 6B]). Although this model exhibits a good linear relationship, with an R² value of 0.997 indicating a high goodness of fit, the validation results for the 13 samples of the prediction set show that its predictions are not satisfactory, which is consistent with the wider confidence intervals ([Fig. 7B]). The wider confidence intervals indicate large uncertainty in the predictions, and the prediction accuracy of the model in practical applications needs to be improved. Possible reasons are that the model overfits the data, that the data themselves are more variable, or that the model fails to capture the key features in the data.
Quantitative Modeling of Raman
Thirty-six sets of Raman spectra of PS binary mixtures were collected for calibration modeling. Before modeling, the raw data were preprocessed using MSC, SNV, Savitzky–Golay filtering, and derivative spectra, alone or in combination. The results are shown in [Fig. 9]. Four spectral regions (3,400–50 cm−1; 400–50 cm−1; 1,700–900 cm−1; and 400–50 plus 1,700–900 cm−1) were used for the PLS regression analysis. As shown in [Table 5], all four spectral regions are suitable for quantification. Judged by actual quantitative performance, however, the optimal PLS model for quantifying PS-A in binary mixtures was obtained by applying MSC to the full Raman range (3,400–50 cm−1). MSC proved useful because it corrects for light scattering from the powder; the resulting RMSEC and RMSEP values were 6.634% and 4.228%, respectively. As shown in [Fig. 6C], the plot of predicted versus measured PS-A content gave an R² value of 0.953. However, the predictions for the 11 prediction samples were not very good, in line with the wider confidence intervals in [Fig. 7C]. A possible contributing factor is the insufficient sample size, as larger sample sets provide more information and reduce the estimation uncertainty.
Abbreviations: PLS, partial least squares; PS-A, pravastatin sodium crystal form A; RMSEC, root mean squared error of calibration; RMSECV, root mean squared error of cross-validation; RMSEP, root mean squared error of prediction.


Comparison of the Three Technologies
The aim of this study was to find the most suitable method for the quantitative determination of PS-A in binary mixtures. PXRD is a nondestructive test method. MIR with KBr pellets is one of the most widely used methods for the analysis of solid samples. Raman requires the least sample preparation, and its measurements are noncontact and nondestructive. Each technique has its advantages and disadvantages, and sometimes a combination of two or three techniques is required for effective quantitative analysis. The results show that the models developed by PXRD, MIR, and Raman could all predict PS-A in binary mixtures, but PXRD was superior to the spectroscopic methods. This is evident from the 95% confidence and prediction intervals of the fitted curves, which are wider for the MIR and Raman models and narrower for the PXRD model, indicating higher accuracy of the PXRD measurements. In addition, the PXRD model has lower LOD (1.52%) and LOQ (4.60%) values, and its RMSEC, RMSEP, and RMSECV values are considerably smaller than those of the MIR and Raman models. To further compare the accuracy of the three techniques, a set of validation samples containing different amounts of PS-A was analyzed ([Table 3]). For example, PXRD was the most accurate for validation sample 1, with the difference decreasing as the PS-A content increased. Overall, PXRD gave the more accurate predictions.
PXRD, MIR, and Raman can all predict the polymorph content of PS with varying degrees of accuracy and specificity. The accuracy and specificity of PXRD are usually higher because it provides detailed crystal structure information, whereas the MIR and Raman models were calibrated less accurately, which makes them less predictive than PXRD.
Application of the Methodology
To test the applicability of the established models, six batches of blind samples were analyzed. The measured data were imported into the established PXRD, MIR, and Raman models to obtain predicted values, and 95% confidence intervals were calculated for the predictions. As shown in [Table 6], the PXRD predictions for the unknown samples ranged from 96 to 100%, above the 95% threshold, with a 95% confidence interval of [97.877, 99.295]. Some of the MIR and Raman predictions were relatively low, and their 95% confidence intervals were [93.377, 98.848] and [96.586, 99.777], respectively. Compared with the PXRD method, the MIR and Raman confidence intervals were relatively wide, indicating higher uncertainty in the parameter estimates. The PXRD method showed high accuracy and reliability in predicting the crystalline purity of PS owing to its high specificity and the detailed crystal structure information it provides. In contrast, the MIR and Raman methods, although they have some predictive ability, showed higher uncertainty in their predictions because of their sensitivity to the nature of the sample and to spectral properties.
Abbreviations: MIR, mid-infrared; PS-A, pravastatin sodium crystal form A; PXRD, powder X-ray diffraction; Raman, Raman spectroscopy.
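The text does not state how the 95% confidence intervals for the blind-sample predictions were computed. One common choice, shown here purely as an assumption, is a Student t interval for the mean of the batch predictions. The function name `mean_ci_95` is hypothetical, and the small table of two-sided 95% t critical values is hard-coded from standard statistical tables to keep the sketch free of external libraries.

```python
import math
import statistics

# Two-sided 95% Student t critical values, keyed by degrees of freedom.
T_CRIT_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365}


def mean_ci_95(values):
    """95% confidence interval for the mean of a small set of predictions."""
    n = len(values)
    mean = statistics.fmean(values)
    # Half-width = t * s / sqrt(n), with s the sample standard deviation.
    half_width = T_CRIT_95[n - 1] * statistics.stdev(values) / math.sqrt(n)
    return mean - half_width, mean + half_width
```

For six blind batches the interval uses five degrees of freedom (t ≈ 2.571). Under this assumption, a wider interval for one technique reflects a larger spread among its six predictions.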
Given the advantages of the PXRD method in determining the crystal form purity of PS, it is recommended that the quantitative model established by PXRD be utilized to determine crystal purity in subsequent studies.
Discussion
This study successfully developed quantitative models of the PS-A content in binary mixtures of PS using PXRD, MIR, and Raman techniques combined with PLS and various preprocessing algorithms. The PXRD model demonstrated clear advantages, with an LOD of 1.52% and an LOQ of 4.60%, substantially better than those of the spectroscopic techniques. This result shows that PXRD-based quantification is less affected by signal overlap, owing to the distinct diffraction peaks of the crystal structure, which makes it more accurate and reliable for the quantitative analysis of the PS-A crystal form. In contrast, while MIR and Raman techniques are uniquely valuable in chemical analysis, their performance in quantifying the PS-A crystal form is limited by the overlap of spectral signals, resulting in lower predictive power than PXRD.
However, applying crystal form quantitative models of APIs to formulations has certain limitations. Although PXRD shows high accuracy and reliability in the quantitative analysis of API crystal forms, excipients in formulations may introduce complex background signals that interfere with PXRD detection and reduce the accuracy of the quantitative analysis. PXRD can be coupled with other analytical techniques, such as MIR and Raman, to reduce this interference. These spectroscopic techniques provide information on the crystalline form of the drug and the molecular structure of the excipients, reduce excipient interference, detect signals from low levels of APIs, and, with the help of microscopic imaging, reveal the spatial distribution of the drug and excipients, further improving the accuracy of quantitative analysis. When building quantitative models of complex polymorphic systems, it is crucial to thoroughly assess the generalization capability of the model to ensure its reliability and accuracy in different application scenarios, such as screening and optimizing polymorphs of APIs, monitoring crystallization during manufacturing, and quantifying crystalline forms in complex formulations. The complexity of the crystal structure, the diversity of formulation processes, the breadth of the data distribution, and the ability of the model to predict new crystalline forms all affect generalization. The physical and chemical properties of APIs in different crystal forms vary widely, and the forms are susceptible to transformation during processing. Variables in the formulation process can also lead to instability and diversity of crystal forms, further affecting the ability of the model to generalize.
To improve the generalization ability of the model, it is necessary to combine more types of samples, covering samples from different sources, different preparation methods, and under different storage conditions. This not only helps to improve the prediction accuracy of the model, but also enhances its reliability in real-world applications, thereby improving the model's adaptability and robustness in complex tasks.
Conclusion
In this study, quantitative models of the PS-A content in binary mixtures of PS were successfully developed using PXRD, MIR, and Raman techniques in combination with PLS and various preprocessing algorithms. The PXRD model showed higher accuracy and specificity, and its limits of detection and quantification were 1.52 and 4.60%, respectively, which were significantly better than those of the MIR and Raman models. The poorer predictions of the MIR and Raman models may stem from the lower accuracy of their calibration, which makes them less predictive than PXRD. In addition, the developed models were tested with blind samples. The confidence intervals of the values predicted by MIR and Raman were wider, and the uncertainties of the parameter estimates were larger, than those of PXRD. The calibration model developed by the PXRD method was therefore chosen for determining the crystalline purity of PS in actual production, providing reliable methodological support for the quality control of pharmaceutical products.
Conflict of Interest
None declared.
Acknowledgments
In completing the Pravastatin Sodium API research project, we received support and help from many parties. We especially thank Shanghai Tianwei Biopharmaceutical Co., Ltd., for the gift of Pravastatin Sodium API, which provided the basis for the smooth progress of the project. This work was supported by the National Key Laboratory of Lead Druggability Research, which is an important component of the construction of the “Shanghai Municipal Professional Service Platform for Drug Solid State and Quality Control Technology” (23DZ2292600). The successful completion of the Pravastatin Sodium API project has laid a solid foundation for subsequent drug development and research.
Supporting Information
This section includes (1) the preparation of pravastatin sodium crystal form A, pravastatin sodium crystal form D, and the binary mixture samples; (2) sample collection and preprocessing; (3) analysis of the results of different dataset division methods and division ratios; (4) quantitative model construction and evaluation; (5) plots of the PXRD, MIR, and Raman raw data and trend plots of the PLS modeling results at different division ratios for the KS and SPXY methods ([Supplementary Figs. S1]–[S5] [available in online version]); and (6) the samples used to build and validate the quantitative PXRD, MIR, and Raman models ([Supplementary Tables S1]–[S3] [available in online version]).
# These authors contributed equally to this work.
References
- 1 Ludík J, Kostková V, Kocian Š, Touš P, Štejfa V, Červinka C. First-principles models of polymorphism of pharmaceuticals: maximizing the accuracy-to-cost ratio. J Chem Theory Comput 2024; 20 (07) 2858-2870
- 2 Alam A, Sharma KP, Singh P. et al. Drug polymorphism: an important preformulation tool in the formulation development of a dosage form. Curr Phys Chem 2024; 14 (01) 2-19
- 3 Shi Q, Chen H, Wang Y, Xu J, Liu Z, Zhang C. Recent advances in drug polymorphs: aspects of pharmaceutical properties and selective crystallization. Int J Pharm 2022; 611: 121320
- 4 Higashi K, Ueda K, Moribe K. Recent progress of structural study of polymorphic pharmaceutical drugs. Adv Drug Deliv Rev 2017; 117: 71-85
- 5 Gabdulkhaev MN, Ziganshin MA, Larionov RA. et al. Fast heating inhibits endothermic solid-solid polymorphic transition giving a melting of low temperature polymorph with the next cold crystallization. Thermochim Acta 2023; 726: 179561
- 6 Brittain HG. Theory and Principles of Polymorphic Systems. Polymorphism in Pharmaceutical Solids. 2nd ed. Boca Raton: CRC Press; 2016: 1-23
- 7 Strachan CJ, Pratiwi D, Gordon KC. et al. Quantitative analysis of polymorphic mixtures of carbamazepine by Raman spectroscopy and principal components analysis. J Raman Spectrosc 2004; 35: 347-352
- 8 Ramkumar S, Raghunath A, Raghunath S. Statin therapy: review of safety and potential side effects. Zhonghua Minguo Xinzangxue Hui Zazhi 2016; 32 (06) 631-639
- 9 Keri V, Deak L, Forgacs I, Szabo C, Nagyne AE. Pravastatin sodium substantially free of pravastatin lactone and EPI-pravastatin, and compositions containing same. U.S. Patent 20050215636A1. September 29, 2005
- 10 Sevgi P, Huseyin EB. Effect of L-alanyl-glycine dipeptide on calcium oxalate crystallization in artificial urine. J Cryst Growth 2021; 566-567: 126176
- 11 Eliska S, Jan R, Argyro C. et al. Low-temperature polymorphs of lacosamide. J Cryst Growth 2021; 562: 126085
- 12 Sundaram M, Natarajan S, Dikundwar GA. et al. Quantification of solid-state impurity with powder X-ray diffraction using laboratory source. Powder Diffr 2020; 35 (04) 1-7
- 13 Zappi A, Maini L, Galimberti G, Caliandro R, Melucci D. Quantifying API polymorphs in formulations using X-ray powder diffraction and multivariate standard addition method combined with net analyte signal analysis. Eur J Pharm Sci 2019; 130: 36-43
- 14 Bellur Atici E, Karlığa B. Quantitative determination of two polymorphic forms of imatinib mesylate in a drug substance and tablet formulation by X-ray powder diffraction, differential scanning calorimetry and attenuated total reflectance Fourier transform infrared spectroscopy. J Pharm Biomed Anal 2015; 114: 330-340
- 15 Lin SY. An overview of famotidine polymorphs: solid-state characteristics, thermodynamics, polymorphic transformation and quality control. Pharm Res 2014; 31 (07) 1619-1631
- 16 Riekes MK, Pereira RN, Rauber GS. et al. Polymorphism in nimodipine raw materials: development and validation of a quantitative method through differential scanning calorimetry. J Pharm Biomed Anal 2012; 70 (11) 188-193
- 17 Li Y, Chow PS, Tan RB. Quantification of polymorphic impurity in an enantiotropic polymorph system using differential scanning calorimetry, X-ray powder diffraction and Raman spectroscopy. Int J Pharm 2011; 415 (1–2): 110-118
- 18 da Silva VH, Gonçalves JL, Vasconcelos FV, Pimentel MF, Pereira CF. Quantitative analysis of mebendazole polymorphs in pharmaceutical raw materials using near-infrared spectroscopy. J Pharm Biomed Anal 2015; 115: 587-593
- 19 Bhavana V, Chavan RB, Mannava MKC, Nangia A, Shastri NR. Quantification of niclosamide polymorphic forms - a comparative study by Raman, NIR and MIR using chemometric techniques. Talanta 2019; 199: 679-688
- 20 Hennigan MC, Ryder AG. Quantitative polymorph contaminant analysis in tablets using Raman and near infra-red spectroscopies. J Pharm Biomed Anal 2013; 72: 163-171
- 21 Farias M, Carneiro R. Simultaneous quantification of three polymorphic forms of carbamazepine in the presence of excipients using Raman spectroscopy. Molecules 2014; 19 (09) 14128-14138
- 22 Farias MADS, Soares FLF, Carneiro RL. Crystalline phase transition of ezetimibe in final product, after packing, promoted by the humidity of excipients: Monitoring and quantification by Raman spectroscopy. J Pharm Biomed Anal 2016; 121: 209-214
- 23 Nagy B, Farkas A, Balogh A. et al. Quantification and handling of nonlinearity in Raman micro-spectrometry of pharmaceuticals. J Pharm Biomed Anal 2016; 128: 236-246
- 24 Pazesh S, Lazorova L, Berggren J, Alderborn G, Gråsjö J. Considerations on the quantitative analysis of apparent amorphicity of milled lactose by Raman spectroscopy. Int J Pharm 2016; 511 (01) 488-504
- 25 Agatonovic-Kustrin S, Glass BD, Mangan M, Smithson J. Analysing the crystal purity of mebendazole raw material and its stability in a suspension formulation. Int J Pharm 2008; 361 (1–2): 245-250
- 26 Hu Y, Erxleben A, Ryder AG, McArdle P. Quantitative analysis of sulfathiazole polymorphs in ternary mixtures by attenuated total reflectance infrared, near-infrared and Raman spectroscopy. J Pharm Biomed Anal 2010; 53 (03) 412-420
- 27 Tinmanee R, Larsen SC, Morris KR, Kirsch LE. Quantification of gabapentin polymorphs in gabapentin/excipient mixtures using solid state 13C NMR spectroscopy and X-ray powder diffraction. J Pharm Biomed Anal 2017; 146: 29-36
- 28 Virtanen T, Maunu SL. Quantitation of a polymorphic mixture of an active pharmaceutical ingredient with solid state (13)C CPMAS NMR spectroscopy. Int J Pharm 2010; 394 (1–2): 18-25
- 29 Darkwah J, Smith G, Ermolina I, Mueller-Holtz M. A THz spectroscopy method for quantifying the degree of crystallinity in freeze-dried gelatin/amino acid mixtures: an application for the development of rapidly disintegrating tablets. Int J Pharm 2013; 455 (1–2): 357-364
- 30 Rodionova OY, Titova AV, Godin FY, Balyklova KS, Pomerantsev AL, Rutledge DN. Monitoring of the natural aging of Diclofenac tablets, NIR and MIR-ATR spectroscopy coupled with chemometrics data analysis. J Pharm Biomed Anal 2022; 219: 114917
- 31 Sousa Sampaio PN, Calado CRC. Antimicrobial evaluation of the Cynara cardunculus extract in Helicobacter pylori cells using mid-infrared spectroscopy and chemometric methods. J Appl Microbiol 2022; 133 (03) 1743-1756
- 32 Čapková T, Pekárek T, Hanulíková B, Matějka P. Application of reverse engineering in the field of pharmaceutical tablets using Raman mapping and chemometrics. J Pharm Biomed Anal 2022; 209: 114496
- 33 Moroni AB, Vega DR, Kaufman TS, Calvo NL. Form quantitation in desmotropic mixtures of albendazole bulk drug by chemometrics-assisted analysis of vibrational spectra. Spectrochim Acta A Mol Biomol Spectrosc 2022; 265: 120354
- 34 Kachalkin MN, Ryazanova TK, Sokolova IV. Quantitative determination of ademetionine in tablets utilizing ATR-FTIR and partial least squares methods approaches. J Pharm Biomed Anal 2024; 241: 115991
- 35 Alaoui Mansouri M, Kharbach M, Bouklouze A. Current applications of multivariate curve resolution-alternating least squares (MCR-ALS) in Pharmaceutical Analysis. J Pharm Sci 2024; 113 (04) 856-865
- 36 Zhao X, Wang N, Zhu M. et al. Application of transmission Raman spectroscopy in combination with partial least-squares (PLS) for the fast quantification of paracetamol. Molecules 2022; 27 (05) 1707
- 37 Padhi S, John R, Tripathi K. et al. A comparison of spectral preprocessing methods and their effects on nutritional traits in cowpea germplasm. Legume Sci 2024; 6: e2977
- 38 Cai Y, Ma X, Huang B, Zhang R, Wang X. LIBS combined with SG-SPXY spectral data pre-processing for cement raw meal composition analysis. Appl Opt 2024; 63 (06) A24-A31
- 39 Mokari A, Guo S, Bocklitz T. Exploring the steps of infrared (IR) spectral analysis: pre-processing, (classical) data modelling, and deep learning. Molecules 2023; 28 (19) 6886
- 40 Martín-Islan AP, Cruzado MC, Asensio R, Sainz-Díaz CI. Crystalline polymorphism and molecular structure of sodium pravastatin. J Phys Chem B 2006; 110 (51) 26148-26159
Publication History
Received: 06 March 2025
Accepted: 24 July 2025
Article published online: 22 August 2025
© 2025. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution License, permitting unrestricted use, distribution, and reproduction so long as the original work is properly cited. (https://creativecommons.org/licenses/by/4.0/)
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany