This column discusses the similarities and differences between the validation approaches for analytical methods for QC product analysis and bioanalysis.
In previous columns I have mainly discussed aspects of method validation for product analysis performed in quality control (QC) laboratories, but not for bioanalytical methods. To be specific, a bioanalytical method in the context of this column is the analysis of small drug molecules and any metabolites in biological samples generated from non-clinical and clinical studies. The data generated from these studies are used to evaluate the pharmacokinetics, bioavailability, bioequivalence and toxicokinetics, which are used to support regulatory submissions of new drugs or licensing generic versions of existing drugs.
Typically, the biological fluids analysed are plasma, derived from whole blood by centrifugation, and urine, which needs no further explanation! So let's start from the firm foundation that in both QC and bioanalytical methods we use chromatographs for the analyses and we need validated methods. These methods are used for the quantitative determination of drugs and their metabolites in biological samples; they must therefore generate reproducible and reliable results that allow the study samples to be interpreted satisfactorily, and it is essential that the bioanalytical methods used are fully validated.
Before discussing the detail of the validation of the methods themselves we will look at the differences in analysis and the HPLC equipment used between QC and bioanalysis. I'm not going to write about reference standards as all analytical methods are predicated on having standards of known purity and stability and these must be documented appropriately.
Methods used in the QC laboratory determine whether the manufactured product meets its specifications, and a wide variety of analytical techniques can be used, including chromatography, spectroscopy and traditional wet chemistry, such as Karl Fischer titration. The analysis covers the raw materials used at the start of the manufacturing process, key intermediates and the final product, and includes analysis of known impurities. Usually the amounts to be determined are known from the product specification and span a narrow dynamic range, for example, 50–150% or 75–125% of the nominal amount for the active ingredient, in contrast to impurity analysis, which will typically be between 0.05% and 0.5% of the nominal amount of the active component. Many different analytical methods are used for product batches in QC analysis; some are specific to the product and some are general analytical techniques applicable to all products, for example, loss on drying.
In contrast, there are relatively few bioanalytical methods, but they are applied more intensively to samples from both non-clinical and clinical studies. A clinical study can generate up to 5000 samples for analysis, depending on the complexity of the study design and its objectives. Instead of the narrow concentration range found in a product specification, bioanalytical assays can cover 2–4 orders of magnitude. This is especially true if a drug is given intravenously, because the assay will follow the time course of the absorption, distribution and elimination of the drug in the body, often until the drug is no longer detected. Thus, the analytical method follows the dynamic analyte concentrations from ingestion of the drug to its final elimination.
The chromatographic pumps used for bioanalysis are broadly the same as those used for QC analysis; however, the autosamplers and detectors may differ, as we will now discuss in more detail.
Autosampler: A traditional autosampler can be used for bioanalysis, but it is important to ensure that the probe can sample just the volume programmed into the instrument or data system. The older style valve autosamplers, where the sample itself was used to flush the valve and avoid carryover, cannot be used because bioanalytical sample extracts range from 50–200 μL and there is insufficient volume to flush the valve. Thus any autosampler used must have minimal carryover to avoid contaminating the next sample injection.
Many bioanalytical methods use solid-phase extraction to isolate the analytes from the biological sample and the extracts are placed in 96-well plates. Therefore, to avoid sample transfer to vials, the autosampler must also be capable of accepting these plates to inject the sample extracts.
Detector: The typical HPLC detector for the majority of product analysis is a UV detector. This is also used in bioanalysis, but the main detector used is a mass spectrometer, typically a triple quadrupole instrument. This is partly because more potent compounds are being produced in pharmaceutical research and development, which results in assay concentration ranges in the nanogram/mL and picogram/mL region, but also partly because of the specificity and sensitivity required for the analysis. In this latter context, MS–MS detectors are used along with the associated software to allow the use of multiple reaction monitoring, as well as positive and negative ionization modes (with or without chemical ionization). The additional expense of a mass spectrometer detector is offset by the scientific requirements for sensitivity and selectivity as well as the ability to shorten run times, often to less than 5 min per injection.
The parameters and criteria for validation of methods used in QC analysis have been developed by the International Conference on Harmonisation (ICH) starting in the 1990s. ICH is a collaboration between the three main regulatory bodies in the US, Europe and Japan and their counterparts in the pharmaceutical industry. There are several publications available on the web at www.ich.org under the quality section and the main ones concerned with method validation and application in the pharmaceutical QC laboratory are shown in Table 1.
Table 1: ICH method validation publications.
Similarly, bioanalysts have not been idle and have developed their own consensus views on bioanalytical method validation in a series of conferences organized jointly since 1990 by the American Association of Pharmaceutical Scientists (AAPS) and the US regulatory agency, the Food and Drug Administration (FDA). Three bioanalytical consensus conferences have been held in Crystal City near Washington DC, in 1990, 2000 and 2006. The first conference resulted in the first publication on bioanalytical method validation;1 after the second conference the FDA issued a guidance for industry on the subject, published in 2001,2 as shown in Table 2.
Table 2: Bioanalytical method validation conferences.
An important conclusion of the first conference was that bioanalytical methods validation was divided into two parts:
1. Initial validation of the method before analysis of study samples.
2. Control when analysing study samples using the validated method.
The contents of the FDA guidance document will be discussed in more detail in the next two sections with specific focus on chromatographic analysis. I will not discuss validation and application of ligand binding assays (e.g., immunoassays) that are also covered by the conference reports and the FDA guidance because this is not within the remit of LCGC Europe.
Pre-study validation: We'll start the discussion of bioanalytical method validation, performed before the method is applied to study samples, with the calibration curve, which defines the relationship between instrument response and analyte concentration over the method's dynamic range, because this is the key to the quality of the overall validation and the method's application.
Calibration curve: The first issue is to establish which calibration curve model will be used in the method validation, in essence the instrument response versus analyte concentration. There is some debate over whether this should be performed during method development or at the start of the validation. I personally think this is a method development issue but, regardless of where it is performed, the fact remains that it must be done. A calibration curve should be constructed for each analyte to be measured by a method, using spiked samples in the same biological matrix as the study samples. Selection of the calibration model is essential for a bioanalytical assay because of the large dynamic range over which an analyte will be measured; as most quantification methods are based on regression analysis, the most appropriate model must be selected, with unweighted linear regression or weighted linear regression (either 1/x or 1/x²) being the usual choices.
Using unweighted least squares linear regression to construct a calibration line is acceptable when the concentration range is relatively small. However, over the larger concentration ranges usually found in bioanalysis the problem is that errors at the highest concentration standards can distort the fit of the lower standards and bias measurements at low concentrations. To overcome this, a weighting of 1/x or 1/x² can be applied to give more weight to the lower concentration standards; the problem is to determine whether weighting is necessary and, if so, which weighting is best.
The FDA guidance for industry notes: "Standard curve fitting is determined by applying the simplest model that adequately describes the concentration–response relationship using appropriate weighting and statistical tests for goodness of fit".2 The best way is to analyse the residuals of the standards: the ideal fit is one in which the residuals are evenly distributed over the concentration range (homoscedastic) and do not increase at the limits of the standard curve. Once the calibration model has been selected it can be applied in the method validation.
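As a simple illustration of this residual analysis, the short Python sketch below fits a calibration line by weighted least squares and then looks at the relative residuals of the back-calculated standards; the concentrations, responses and the 1/x² weighting are purely hypothetical and are not taken from any particular method.

```python
import numpy as np

# Hypothetical calibration standards: nominal concentrations (ng/mL) and
# peak-area ratios (analyte/internal standard); values are illustrative only.
conc = np.array([1, 2, 5, 20, 100, 500, 1000], dtype=float)
response = np.array([0.011, 0.019, 0.052, 0.21, 0.98, 5.1, 9.8])

def weighted_linear_fit(x, y, weighting="1/x2"):
    """Fit response = slope*conc + intercept by weighted least squares."""
    if weighting == "1/x":
        w = 1.0 / x
    elif weighting == "1/x2":
        w = 1.0 / x**2
    else:                              # unweighted
        w = np.ones_like(x)
    # np.polyfit applies its weights to the residuals, so pass sqrt(w)
    slope, intercept = np.polyfit(x, y, 1, w=np.sqrt(w))
    return slope, intercept

slope, intercept = weighted_linear_fit(conc, response)

# Back-calculate each standard and check the relative residuals: for a good
# model they should be small and evenly spread over the whole range.
back_calc = (response - intercept) / slope
relative_residual_pct = 100.0 * (back_calc - conc) / conc
for c, resid in zip(conc, relative_residual_pct):
    print(f"{c:7.1f} ng/mL  residual {resid:+6.1f}%")
```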
How many standards are required for a calibration curve? This depends on the range of the assay, but the FDA advises a minimum of 6–8 standards covering the range from the lower limit of quantification (LLOQ) to the upper limit of quantification (ULOQ), plus two matrix blanks, one with internal standard and one without.2 The calibration curve standards must be extracted at the same time as the validation samples, or indeed the biological samples when the assay is applied to real samples.
This is different from QC analysis involving chromatography, where the calibration models are simpler, such as single point, response factor and occasionally point-to-point calibration; the reason is that the concentrations or amounts determined lie in a relatively limited range compared with bioanalysis, as we will discuss below.
Dynamic range of the method: This is determined simply by the acceptable precision and accuracy of the assay. The 1990 consensus conference and the FDA guidance determined that this was ±15% over the range of the assay, except at the LLOQ where the acceptable precision and accuracy is ±20%.1,2 It is the LLOQ that the FDA focus on in their guidance, as the lowest standard on the calibration curve has to meet specific acceptance criteria: the analyte response at the LLOQ should be at least five times the response of a blank sample, and the analyte peak should be identifiable, discrete and reproducible with a precision of 20% and accuracy of 80–120%.2
After reading this section, chromatographers working in QC laboratories must be shaking their heads in disbelief at a precision of ±15% when typically a system suitability test (SST) has a pharmacopoeial limit of ±2% following five standard injections.3 However, what you must realise is that the analyte is in a complex biological matrix that must be removed, and the sample preparation used to remove it introduces much of the variation in the method. Often the analytes are bound to components of the matrix and this binding must be disrupted to recover them. This gives rise to one of the more interesting problems in bioanalysis: the samples derived from studies in animals or humans are different from the spiked samples used to prepare the standards, QC and validation samples, whether in the method validation or in the application of the method. We will return to this point later in this column.
Where study samples are above the method ULOQ, they will need to be diluted to bring them within the dynamic range of the assay. This dilution process also needs to be validated, along with the minimum and maximum dilution factors to be used, for example, 1:1 to 1:100.
Assay selectivity: Selectivity is the ability of an analytical method to differentiate and quantify the analyte in the presence of other components in the sample. For selectivity, analyses of blank samples of the appropriate biological matrix (plasma, urine or other matrix) should be obtained from at least six sources.2 Note here the correct use of the term selectivity instead of the more commonly used specificity; even with a highly selective MS detector, bioanalytical methods can suffer interference from endogenous compounds, amongst other things. Selectivity should be assessed at the lower limit of quantification (LLOQ) rather than at a higher concentration. The major problem with selectivity is that it has no dimensions and therefore cannot be measured per se; instead, a chromatographer reviews the chromatograms to see if there is any interference.
Moreover, selectivity is not an event — it is a journey because each run needs to be assessed, hence the need for blank samples in the analytical run on each and every occasion to assess if there are any problems from the matrix or the chemicals used in the method.
Method accuracy, precision and analyte recovery: The accuracy and precision of a bioanalytical method are determined by replicate analysis of samples containing known amounts of the analyte. Within-run (single batch) accuracy and precision are measured at three concentrations with a minimum of five determinations per concentration. Of the three concentrations, one should be close to the LLOQ, one in the mid-range of the assay and the other at the ULOQ. Acceptance criteria for both precision and accuracy are within 15% [coefficient of variation (CV) and calculated concentration, respectively] except at the LLOQ, where the values for both parameters should be within 20%.
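A minimal sketch of the arithmetic, using made-up replicate results, is shown below: precision is expressed as the CV of the replicates and accuracy as the mean result relative to the nominal concentration, each checked against the 15% limit.

```python
import statistics

def precision_accuracy(measured, nominal, limit_pct=15.0):
    """Within-run precision (CV%) and accuracy (% of nominal) for one
    concentration level, with a simple pass/fail check against the limit."""
    mean = statistics.mean(measured)
    cv_pct = 100.0 * statistics.stdev(measured) / mean
    accuracy_pct = 100.0 * mean / nominal
    passed = cv_pct <= limit_pct and abs(accuracy_pct - 100.0) <= limit_pct
    return cv_pct, accuracy_pct, passed

# Illustrative replicate results (ng/mL) for a mid-range QC at 50 ng/mL
replicates = [48.2, 51.5, 49.8, 52.1, 47.6]
cv, acc, ok = precision_accuracy(replicates, nominal=50.0)
print(f"CV = {cv:.1f}%, accuracy = {acc:.1f}% of nominal, pass = {ok}")
```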
Precision is also determined between runs (inter-batch precision), which measures precision with time and may involve different analysts, equipment, reagents and laboratories. This is simply the repetition of the first batch at least three times and, if resources allow, the analysis should be performed on different days with different analysts and equipment.
The terminology of precision used in bioanalysis differs from that of QC analysis, where the terms repeatability, intermediate precision and reproducibility are used instead. When determining precision within a laboratory, repeatability and intermediate precision correspond to the bioanalytical terms within-day and between-day determinations, while reproducibility is used for interlaboratory studies in QC analysis.4
The recovery is the efficiency of the extraction of the analyte from the biological matrix and is calculated by comparing the detector responses of extracted samples and stock solutions of the same concentration; with thought, this experiment can be included in the precision and accuracy runs to save overall effort and maximize instrument time. Recovery of the analyte need not be 100%, but the extent of recovery of the analyte and of the internal standard should be consistent, precise and reproducible. Recovery experiments should be performed by comparing the analytical results for extracted samples at three concentrations (low, medium and high) with unextracted standards that represent 100% recovery.2
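The recovery calculation itself is straightforward, as the hypothetical example below shows: the mean response of the extracted samples is expressed as a percentage of the mean response of the unextracted standards, which represent 100% recovery.

```python
def recovery_pct(extracted_responses, unextracted_responses):
    """Mean extraction recovery (%) at one concentration: response of
    extracted samples relative to unextracted standards (100% reference)."""
    mean_extracted = sum(extracted_responses) / len(extracted_responses)
    mean_unextracted = sum(unextracted_responses) / len(unextracted_responses)
    return 100.0 * mean_extracted / mean_unextracted

# Illustrative peak areas at the mid-range QC concentration
print(f"Recovery = {recovery_pct([8150, 8320, 8040], [9980, 10120, 10050]):.1f}%")
```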
Analyte stability: Stability of the analytes, both in the biological fluid and during preparation and while awaiting analysis, needs to be determined as part of the validation. Typically, the short-term stability experiments are performed in the original method validation, whereas longer term stability in the biological matrix starts at about the same time but continues either until the analyte concentration starts to fall or until the anticipated storage time is reached, which could be 6–12 months. Experiments covered in the bioanalytical method validation guidance include freeze and thaw stability, short-term (bench-top) stability, long-term storage stability, stock solution stability and post-preparative (processed sample) stability.2
The conditions used when conducting the various stability experiments should reflect the anticipated conditions under which samples will be stored, and the design of the validation experiments should generate sufficient data to allow solid conclusions to be drawn from the results.
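For illustration only, stability at each condition can be summarized as the mean stored result expressed as a percentage of the nominal concentration; the acceptance window mentioned in the comment below is an assumption of common practice rather than a regulatory figure, and the numbers are made up.

```python
def stability_pct(stored_results, nominal):
    """Percent of nominal remaining after storage; many laboratories set
    their own acceptance window (often mean within +/-15% of nominal)."""
    mean_stored = sum(stored_results) / len(stored_results)
    return 100.0 * mean_stored / nominal

# Illustrative freeze-thaw results for a low QC nominally at 3.0 ng/mL
print(f"After 3 freeze-thaw cycles: {stability_pct([2.8, 2.9, 2.7], 3.0):.0f}% of nominal")
```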
Reporting the validation: At the end of the validation a formal report is issued that details the work done and the outcome of the overall validation, including the operating range of the assay, stability and sample handling recommendations, and the biological matrices for which the method is validated. The method is now ready for use in assaying samples from studies.
When using the method to analyse samples, a system suitability test is performed to check that the chromatograph is working; however, this is not as formal as a QC laboratory system suitability test (SST) as outlined in a pharmacopoeia.3 The bioanalytical laboratory determines the extent of the work and how many standards are injected before starting an analytical run.
Sample types per run: Each analytical run must consist of, as a minimum, a calibration curve with blank samples, quality control (QC) samples and the study samples to be analysed.2
Quality control samples: The number of QC samples needed for a run depends on the number of samples to be analysed. As the number of unknown samples can vary between 20 and 400+, the number of QC samples also varies: the smallest number is six for about 40–60 samples, and this rises as the batch size increases. For about 150–200 samples, about 12–15 QC samples will be required, distributed throughout the batch to ensure the assay operates correctly throughout the run. This is equivalent to reinjecting a standard throughout a QC analysis to check that the chromatographic response is acceptable. The precision and accuracy acceptance criteria for the QC samples need to be determined in advance, but the FDA guidance, based on the first consensus conference, notes that the QC samples at the mid and high (ULOQ) concentrations need to be within ±15% of their nominal concentration, while the QC samples at the LLOQ must be within ±20%. Two of the six QC samples may be outside ±15% of their respective nominal values, but not both at the same concentration. The standards and QC samples used in a run should be placed so that they can detect assay drift over the run; this is important given the number of samples that could be assayed during a single run.2
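The sketch below illustrates one way this acceptance rule could be coded: at least two thirds of the QC results must fall within their tolerance of nominal, and the failing results must not all be at the same concentration level. The per-level tolerances and the QC results are hypothetical, and a laboratory's own SOP would define the exact rule applied.

```python
from collections import defaultdict

def run_acceptable(qc_results):
    """Minimal sketch of a 'four out of six' QC acceptance rule.
    qc_results: list of (level, nominal, measured, tolerance_pct) tuples."""
    total = len(qc_results)
    failures = 0
    per_level = defaultdict(lambda: [0, 0])       # level -> [n_results, n_failures]
    for level, nominal, measured, tol in qc_results:
        per_level[level][0] += 1
        if abs(measured - nominal) / nominal * 100.0 > tol:
            failures += 1
            per_level[level][1] += 1
    if failures > total / 3:                       # more than a third outside limits
        return False
    # reject if every QC at any single concentration level has failed
    return all(fails < n for n, fails in per_level.values())

# Illustrative QC results: low QCs judged against 20%, mid and high against 15%
qcs = [("low", 3.0, 2.6, 20), ("low", 3.0, 3.3, 20),
       ("mid", 50.0, 46.1, 15), ("mid", 50.0, 61.0, 15),   # one mid QC fails
       ("high", 400.0, 392.0, 15), ("high", 400.0, 412.0, 15)]
print("Run accepted:", run_acceptable(qcs))
```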
Calibration curve data: Some standards can be dropped from the calibration line calculation if they do not fit the line, but this needs to be justified and documented; at least four out of six non-zero standards should be within the acceptance criteria, including the LLOQ and the highest concentration calibration standard. The justification for dropping a standard is based on the back-calculated value of each standard, typically within ±15%. However, it is important that when excluding any standards the calibration model used does not change. Furthermore, calculation of concentrations in unknown samples by extrapolation of the standard curve below the LLOQ or above the highest standard is not recommended; such samples are reported as not quantifiable or are diluted to bring them onto the calibration curve, respectively.
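For completeness, a similar check can be applied to the back-calculated standards themselves, as in the illustrative sketch below, which uses the ±20% LLOQ and ±15% tolerances mentioned above with made-up values.

```python
def standards_within_limits(standards):
    """Minimal sketch: flag calibration standards whose back-calculated
    concentration deviates from nominal by more than the stated tolerance
    (20% at the LLOQ and 15% elsewhere, as in the text above).
    standards: list of (nominal, back_calculated, is_lloq) tuples."""
    accepted = []
    for nominal, back_calc, is_lloq in standards:
        tol = 20.0 if is_lloq else 15.0
        deviation_pct = 100.0 * (back_calc - nominal) / nominal
        accepted.append(abs(deviation_pct) <= tol)
    return accepted

# Illustrative back-calculated values for a seven-point curve (LLOQ first)
stds = [(1.0, 1.15, True), (2.0, 1.9, False), (5.0, 5.2, False),
        (20.0, 21.0, False), (100.0, 97.0, False), (500.0, 520.0, False),
        (1000.0, 980.0, False)]
print(standards_within_limits(stds))
```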
Incurred sample reanalysis: The major problem with bioanalysis is that samples generated for method validation studies are prepared by spiking analytes into a biological matrix. Typically, the solution used to spike the drug concentrations is organic, for example, methanol or an organic-aqueous solution depending on the water solubility of the compounds being spiked. This is a source of a potential problem as the method validation samples and the QC samples used to monitor the performance of the assay when it is applied may not be characteristic of the samples taken from the animals and humans in the study. This problem of the difference between in-vivo or incurred samples versus the in-vitro or spiked samples has been debated since the first of the AAPS/FDA consensus conferences in 1990. There are many reasons for the differences between spiked and incurred samples such as protein binding, conversion between parent drug and metabolites and matrix effects.
The discussion resurfaced at the third AAPS/FDA Bioanalytical Workshop held in 2006, which indicated that there should be analysis of incurred samples. Incurred sample reanalysis is mainly limited to pharmacokinetic studies to ensure that the conclusions drawn are based on robust bioanalytical data, and it is currently only a US requirement. Early in the study analysis, 20 samples are selected for reanalysis that are close to the highest and lowest concentrations measured from several subjects; 67% of the reanalysed results should be within 20% of the original results. These results are reported in a separate table from the main study results.
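This 67%-within-20% check is easy to express in code, as the sketch below shows; the original and repeat concentrations are hypothetical, and the use of the original value as the comparison basis is my assumption for illustration.

```python
def isr_pass(original, repeat, limit_pct=20.0, required_fraction=0.67):
    """Minimal sketch of the incurred sample reanalysis check described above:
    the repeat result should be within limit_pct of the original result for at
    least the required fraction of the reanalysed samples."""
    within = sum(
        abs(rep - orig) / orig * 100.0 <= limit_pct
        for orig, rep in zip(original, repeat)
    )
    return within / len(original) >= required_fraction

# Illustrative original and repeat concentrations (ng/mL) for ten samples
orig = [12.1, 450.0, 3.2, 88.0, 610.0, 5.5, 150.0, 22.0, 330.0, 9.8]
rept = [11.5, 478.0, 3.9, 85.0, 540.0, 5.1, 163.0, 27.5, 345.0, 10.4]
print("ISR acceptable:", isr_pass(orig, rept))
```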
Reintegration and reanalysis: The criteria for data reintegration and for repeat sample analysis should be documented in standard operating procedures (SOPs). In the former case, the SOP should define the criteria for a reintegration and how it will be performed, along with documentation of the data; both the original and reintegrated results should be reported. In the latter case, the SOP should define the criteria for repeating sample analysis, such as inconsistent replicate analysis, samples outside of the assay range, sample processing errors, equipment failure, poor chromatography and inconsistent pharmacokinetic data. Reassays should be done in triplicate if there is sufficient sample volume available, and the way of reporting the result (the original value, the mean of the reassays or rejection of a value) must be defined and applied consistently.2
Report: The study report will present the results along with example chromatograms, but the ones selected must be defined before the samples are analysed; this avoids selection of the best chromatograms for the report while the horror stories hide in the archive.
In this column I have presented an overview of the validation of bioanalytical methods and the controls used when analysing study samples. Where appropriate I have highlighted the differences from QC product analysis. For more details of bioanalytical method validation read the FDA guidance on method validation and also look on the AAPS website (www.aaps.org) for further papers on the subject and the consensus conference reports.
1. V.P. Shah et al., Eur J Drug Metab Pharmacokinet., 16, 249–255 (1992).
2. Food and Drug Administration. Guidance for Industry: Bioanalytical Method Validation. Rockville, Maryland, USA (2001).
3. United States Pharmacopeia, General Chapter <621>.
4. ICH, Q2 (R1): Validation of Analytical Procedures: Text and Methodology.
Bob McDowall, PhD, is principal at McDowall Consulting, Bromley, Kent, UK. He is also a member of the Editorial Advisory Board for LCGC Europe.