50 Standardization and quality assurance of quantitative determinations

Lothar Siekmann, Gerhard Röhle

50.1 Specimen – analyte in the sample matrix

Specimens to be examined in medical laboratories are characterized by extraordinary variety due to their different origin and composition. Materials to be examined include:

Body fluids such as blood, cerebrospinal fluid, ascites, or pleural fluid
Excretions such as urine, saliva, sputum, or feces
Tissue samples.

Usually the specimen represents a mixture of many different chemical substances and more or less differentiated organic structures.

The sum of all components and characteristics of a sample except for the analyte itself is referred to as the sample matrix. Each component of a specimen can become the subject (analyte) of a laboratory diagnostic evaluation.

The analyte may represent quantities (measurands) as different as:

Physical properties
Chemical elements, ions, inorganic molecules
Low-molecular-mass organic structures
Macromolecules with known or only approximately known structure
Cells or cellular systems.

50.2 Quality characteristics of methods of determination

The enormous variety of components of biological specimens requires that the methods used for the detection or quantitative determination of individual analytes should meet high expectations. For characterizing a quantitative analytical method, a number of different quality characteristics are taken into consideration. Among these, specificity is of special significance in the present context.

Specificity

Specificity characterizes the ability of a method to measure the analyte without erroneous interference by other components contained within the sample matrix.

Other quality characteristics include:

Trueness
Precision
Limit of quantitation
Linearity
Traceability.

They are dealt with in further detail in the following text.

50.3 Standardization – traceability

Standardization should always be based on the application of the concept of traceability. This concept is the subject of the ISO 17511 standard /1/ and is explained by the model in Fig. 50-1 – Traceability of quantitative clinical chemistry analytical methods according to ISO 17511. The concept describes a hierarchical structure of measurement procedures and calibrators, from the patient sample as the lowest level of the hierarchy to the highest level (i.e., the definition of the measurand in SI units) /2/.

(a) Definition of the quantity: if a quantity can be described by a defined molecular structure, it can be specified as an amount of substance concentration (e.g., in mol/L).

(b) The primary reference measurement procedure is the determination of the purity (certification) of a primary reference material. Such certified reference materials, whose purity was determined by metrology institutes, are available for many measurands.

(c) Primary reference materials can be used to calibrate secondary reference methods, such as those developed and applied by reference laboratories.

(d) Secondary reference methods are procedures that are required to possess a high degree of specificity and are therefore suitable to deliver results of highest trueness and precision in a complex biological matrix. Such methods can be expected to determine the true value within narrow limits of measurement uncertainty. A typical method principle for the development of a reference measurement procedure is isotope dilution mass spectrometry.

Secondary reference methods are intended for the certification of manufacturer’s calibrators, of control samples for internal and external quality assurance and, to some extent, for the certification of panels of patient samples as part of the development and testing of diagnostic test kits. Matrix reference materials, which can be obtained from metrology institutes (e.g., cholesterol in human serum), are also certified with such secondary reference methods.

(e) Manufacturer’s calibrators are used to calibrate in-house measurement procedures.

(f) These manufacturer’s procedures (in-house) are used to calibrate the product calibrators.

(g) The product calibrators are part of the test kits available to routine laboratories for diagnostic purposes.

(h) Diagnostic laboratories use routine methods to determine analytical results in the patient samples with the test kits provided by manufacturers (calibrators, reagents, equipment).

The traceability of a measuring result in a patient sample (i) up to the highest level – represented by the definition of the measurand (a) – is assured if all the individual steps in this hierarchical model are traceable.

Each measurement procedure (reference, manufacturer’s, or routine procedure) and each value assigned to a reference material or calibrator has a certain measurement uncertainty [u_c(y)]. The measurement uncertainty of the result of a patient sample is calculated from all the individual contributions [u_c(y)] of the hierarchical chain according to the rules for the calculation of overall measurement uncertainty.
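As a minimal sketch of how the contributions along the chain might be combined, the following Python snippet assumes that the individual standard uncertainties are independent and expressed in the same relative units, so that they add in quadrature (root sum of squares). The step names and all numerical values are hypothetical illustrations, not data from this chapter.

```python
import math

def combined_uncertainty(contributions):
    """Combine independent standard uncertainties in quadrature (root sum of squares)."""
    return math.sqrt(sum(u ** 2 for u in contributions.values()))

# Hypothetical relative standard uncertainties (%) for the steps of the chain
chain = {
    "primary reference material": 0.2,
    "secondary reference procedure": 0.5,
    "manufacturer's calibrator": 0.8,
    "product calibrator": 1.0,
    "routine procedure": 2.0,
}

u_c = combined_uncertainty(chain)
print(f"combined relative standard uncertainty: {u_c:.2f} %")
```

Under these assumptions the largest single contribution dominates the combined value, which is why shortening the chain or improving its weakest step has the greatest effect.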

In the hierarchical model of traceability described here, individual steps, except for the highest levels (a) through (d), may be skipped. For example, it is possible to skip the in-house manufacturer’s procedure and the manufacturer’s calibrator and certify the product calibrators (g) from a manufacturer directly with a secondary reference measurement procedure (d).

In theory it would be possible, for example, to measure patient samples (i) directly with a secondary reference measurement procedure (d). In practice, however, this would make little sense due to the enormous costs and time required.

The prerequisite for traceability to the highest level is the availability of primary and secondary reference measurement procedures and primary calibrators. Unless these are available, the traceability ends at the level of the manufacturer’s calibrator or manufacturer’s procedure. Standardization in the sense of consistent measurement results across manufacturers cannot then be achieved.

50.4 Application of the concept of traceability for standardization

The implementation of the traceability concept has been monitored globally by a committee (Joint Committee for Traceability in Laboratory Medicine, JCTLM, www.bipm.org/jctlm/) since 2002. The leading members of the committee are the International Committee of Weights and Measures (CIPM) represented by the International Bureau of Weights and Measures (BIPM) in Paris, the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC), and the International Laboratory Accreditation Cooperation (ILAC).

The JCTLM receives applications for the listing of reference materials, reference measurement procedures, and services by reference laboratories. Following a review by expert subcommittees with respect to compliance with the requirements of ISO standards 15193 /3/, 15194 /4/ and 15195 /5/, an up-to-date listing is published annually on the JCTLM’s website. Laboratories that apply for a listing according to the services offered (quantities) must be accredited to ISO 15195. They are also required to regularly participate in an external quality assurance program for reference laboratories (www.dgkl-rfb.de:81).

The EU Directive on In Vitro Diagnostic Medical Devices (98/79/EC) stipulates: the traceability of values assigned to calibrators and/or control materials must be assured through available reference measurement procedures and/or available reference materials of a higher order.

This requirement applies equally to the following two groups involved with in vitro diagnostic medical devices:

The manufacturers of in vitro diagnostic devices
The organizers of external quality control programs.

The directive of the German Medical Association /6/ has required the use of reference method values as target values in external quality control for a multitude of quantities since 1987. Since then, the consistency of results obtained with test procedures from different manufacturers has improved continuously.

This must be seen as a positive result of the application of the traceability concept for the standardization of clinical chemistry methods of analysis.

50.5 Quality assurance using control samples

For internal quality assurance of quantitative determinations, control samples are added to the series of patient samples as “random samples”.

If the true target values are available for the control specimens, the routine result for an analyte contained within the control sample may be compared to its target value (control of trueness). In the case of repeated determinations of the analyte in samples of the same control specimen (e.g., in every analytical series) the variation of results can be calculated given an adequately large number of single results (control of precision). If the lack of trueness and the lack of precision of the control measurements are still within predetermined limits, it can be presumed that the results measured in the patient samples also fulfill the requirements set by the quality assurance standards.
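The two checks described above can be sketched in Python as follows. This is only an illustration: the control results, the target value, and the acceptance limits are hypothetical, and the mean of the series (rather than a single result) is compared with the target value.

```python
import statistics

def control_check(results, target, max_bias_pct, max_cv_pct):
    """Evaluate trueness (bias vs. target) and precision (CV) of control results."""
    mean = statistics.mean(results)
    sd = statistics.stdev(results)              # sample standard deviation
    bias_pct = 100 * (mean - target) / target   # relative deviation of the mean
    cv_pct = 100 * sd / mean                    # coefficient of variation
    return {
        "bias_pct": bias_pct,
        "cv_pct": cv_pct,
        "trueness_ok": abs(bias_pct) <= max_bias_pct,
        "precision_ok": cv_pct <= max_cv_pct,
    }

# Hypothetical control results (mg/dL) from eight analytical series
results = [101, 99, 102, 100, 98, 103, 101, 100]
check = control_check(results, target=100, max_bias_pct=3.0, max_cv_pct=3.0)
print(check)
```

If either flag is False, the series would be examined before patient results are released; how this is handled in practice depends on the applicable quality assurance rules.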

A biological specimen to be used as a control sample for the purposes of quality assurance should meet the following criteria:

It must be homogeneous within a lot in order to exclude sample-to-sample differences
A lot should be large enough so that one-time expenses (e.g., determination of target values, stability checks) are reduced to a minimum
The material should be stable enough to allow for prolonged storage without showing any alterations
The general characteristics of the control specimen and those of the patient samples should not differ from each other (commutability).

The combination of the latter two requirements represents an essential problem of a system using control samples. The virtually indispensable requirement for stability of control material usually means that it has to be more or less altered in comparison to the usually unstable native specimen. The problems arising from this dilemma tend to remain within tolerable limits in the case of some control materials (e.g., serum).

In the case of other control materials, large difficulties may arise given the current state of the art in the preparation of control samples. For instance, highly specialized analytical systems designed to differentiate between denatured and normal blood cells will hardly be capable of correctly counting fixed blood cells contained in a control blood sample.

50.6 Location parameters – target values

For the purpose of external and internal quality assurance, the terms listed in Tab. 50-1 – Terms in association with location parameters and target values are used in conjunction with target values and location parameters. It follows from the requirements of the In Vitro Diagnostic Medical Devices Directive and the Directive of the German Medical Association that, whenever possible, reference method values should be used as target values in internal and external quality control /7/.

50.7 Reliability – uncertainty of measurement

Every measurement has an uncertainty associated with it (Fig. 50-2 – Measurement values of two laboratories for a parameter using two different routine methods). Measurement uncertainty should be calculated according to the Guide to the Expression of Uncertainty in Measurement (www.bipm.org/en/publications/guides/gum.html).

Among sources of error, systematic errors are differentiated from random ones. Refer to Tab. 50-2 – Types of errors.

The error of measurement of an analytical result consists of systematic and random components of error. Numerical values can be assigned to the error of measurement as well as to its systematic and random components. Refer to Fig. 50-3 – Relationships between inaccuracy, incorrectness and imprecision as well as accuracy, trueness and precision.

The accuracy of a measurement value depends on the trueness and the precision of the method of measurement. No numerical values can be assigned to these terms.

Accuracy, trueness (accuracy of the mean) and precision relate to each other in the same way as inaccuracy (error of measurement), inaccuracy of the mean (systematic error), and imprecision (random error); the terms of the two groups accordingly have opposite meanings.

Refer to Tab. 50-4 – Characterization of analytical reliability.

Because the true value is not determinable, for the practical purposes of quality assurance it is substituted by a defined true value (target value), e.g., a reference method value or, if the latter is not available, a method dependent assigned value. Under such circumstances, the term “conventionally true value” is used. The terms “conventional bias” and “conventional inaccuracy” are derived accordingly.
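The relationship between these terms can be illustrated with a short Python sketch. It assumes the sign convention "measurement value minus target value" and uses hypothetical repeat measurements of one control sample; the deviation of each single value from the target (conventional inaccuracy) is split into the conventional bias of the mean and a random component.

```python
import statistics

def decompose_errors(values, target):
    """Split each deviation from the target into a systematic and a random part."""
    mean = statistics.mean(values)
    conventional_bias = mean - target          # systematic component, same for all values
    rows = []
    for x in values:
        random_error = x - mean                # random component of this single value
        conventional_inaccuracy = x - target   # total deviation of this single value
        rows.append((x, conventional_inaccuracy, conventional_bias, random_error))
    return rows

# Hypothetical repeat measurements of one control sample (mmol/L) and its target value
values = [5.2, 5.4, 5.3, 5.5, 5.1]
target = 5.0

rows = decompose_errors(values, target)
for x, inaccuracy, bias, random_error in rows:
    print(f"{x:.1f}: inaccuracy {inaccuracy:+.2f} = bias {bias:+.2f} + random {random_error:+.2f}")
```

By construction, the random components sum to zero over the series, so only the bias remains when the mean is compared with the target value.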

The relationships between measurement values, expectation values and target values (reference method or method dependent assigned values) are shown in

Fig. 50-2 – Measurement values of two laboratories for a parameter using two different routine methods
Tab. 50-3 – Relationships between location parameters, target values and measurement values.

For clarification, the meanings of the discrepancies are summarized in Tab. 50-1 – Terms in association with location parameters and target values.

50.8 Accuracy

Lothar Thomas

Biological variation (accuracy) is one of the most important factors that contribute to laboratory results. Biological variation describes the fluctuation of a measured laboratory parameter around its homeostatic set point under steady-state conditions. Accuracy of a method should be taken into account in the assessment of a laboratory result. Biological variation is differentiated into two categories /11, 12/:

Within-subject biological variation (CV_i), defined as the variation around the homeostatic set point of a person in a steady-state condition
Between-subject biological variation (CV_g), defined as the variation between the homeostatic set points of different individuals.
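As a rough sketch of how the two components might be estimated, the following Python snippet uses a balanced one-way analysis of variance on hypothetical repeated results. It is a simplification: analytical variation is not separated from the within-subject component, and published studies use more elaborate designs and outlier handling.

```python
import statistics

def biological_variation(data):
    """Estimate CV_i and CV_g (%) from equal numbers of repeated results per subject."""
    n = len(next(iter(data.values())))               # results per subject (balanced design)
    subject_means = [statistics.mean(v) for v in data.values()]
    grand_mean = statistics.mean(subject_means)
    # Pooled within-subject variance (still includes analytical variation)
    within_var = statistics.mean([statistics.variance(v) for v in data.values()])
    # Variance of subject means, corrected for the within-subject share
    between_var = max(statistics.variance(subject_means) - within_var / n, 0.0)
    cv_i = 100 * within_var ** 0.5 / grand_mean
    cv_g = 100 * between_var ** 0.5 / grand_mean
    return cv_i, cv_g

# Hypothetical results for three subjects, three samplings each
data = {
    "subject A": [70, 72, 74],
    "subject B": [80, 82, 84],
    "subject C": [90, 92, 94],
}

cv_i, cv_g = biological_variation(data)
print(f"CV_i = {cv_i:.1f} %, CV_g = {cv_g:.1f} %")
```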

In a study /12/, the biological variation of routine laboratory parameters was measured; the results are shown in Tab. 50-8 – Biological variation of laboratory parameters.

50.9 Statistical tools

Statistical tools are used in the quality assurance of quantitative determinations. Refer to:

Tab. 50-5 – Statistical tools used in quality assurance of quantitative determinations
Tab. 50-6 – Description of percentile.

50.10 Applied quality assurance

The goal of quality assurance of quantitative determinations as part of laboratory diagnostic evaluations is, on the one hand, to determine how widely the measurement values vary due to random errors (control of precision) and, on the other hand, to check the extent of systematic errors if the necessary prerequisites are in place (control of trueness).

Insofar as the terms trueness and accuracy are used here in the context of applied quality assurance, they always refer to the conventional trueness or accuracy.

Essential components of quality assurance are internal laboratory controls of precision and trueness as well as external quality assurance as accomplished by means of inter laboratory surveys.

Procedures involved in quality assurance are listed in Tab. 50-7 – Procedures involved in quality assurance.

50.11 Interpreting results of an inter laboratory survey

In inter laboratory surveys, usually two samples with different concentrations of the analyte are sent to the participants. The results of all participating laboratories can be summarized in so-called Youden diagrams (Fig. 50-7 – Youden diagram showing the serum creatinine results of an inter laboratory survey conducted by the Reference Institute for Bioanalytics).

Each point in a Youden diagram represents both results from a given laboratory: the value for sample A is read off the abscissa and that for sample B off the ordinate. A laboratory whose two measurement results both coincide with the target values will have its representing point situated at the center of the diagram. Frequently the results are distributed in the shape of an elliptical cloud along the diagonal running from the lower left to the upper right. This pattern corresponds to a predominantly systematic deviation of both results towards either higher or lower values.
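The reading of a Youden diagram described above can be sketched in Python without plotting. The function name, the tolerance of 1%, the target values, and the laboratory results are hypothetical; the classification is a simplified illustration, not the evaluation scheme of any survey organizer.

```python
def classify_youden(result_a, result_b, target_a, target_b, tol_pct=1.0):
    """Classify one laboratory's pair of results by the direction of its deviations."""
    dev_a = 100 * (result_a - target_a) / target_a   # relative deviation, sample A
    dev_b = 100 * (result_b - target_b) / target_b   # relative deviation, sample B
    if abs(dev_a) <= tol_pct and abs(dev_b) <= tol_pct:
        label = "close to target"
    elif dev_a * dev_b > 0:
        label = "same direction (suggests systematic deviation)"
    else:
        label = "opposite or mixed direction (suggests random deviation)"
    return dev_a, dev_b, label

# Hypothetical target values (mg/dL) for samples A and B and results of three laboratories
target_a, target_b = 1.00, 4.00
labs = {"lab 1": (1.10, 4.40), "lab 2": (1.00, 4.02), "lab 3": (0.92, 4.30)}

labels = {}
for name, (a, b) in labs.items():
    dev_a, dev_b, label = classify_youden(a, b, target_a, target_b)
    labels[name] = label
    print(f"{name}: A {dev_a:+.1f} %, B {dev_b:+.1f} % -> {label}")
```

Points classified as "same direction" lie along the diagonal from the lower left to the upper right, which is where the elliptical cloud typically forms.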

The Youden diagram shows the results of an inter laboratory survey performed for serum creatinine, whose correct determination still presents problems with the currently available routine methods. For comparison, all results obtained with the Jaffé method are shown on the left (highlighted by the black dots), and all results obtained with the enzymatic methods on the right of the graphic (highlighted by the black dots). The directive of the German Medical Association requires the use of a reference method value as the target value.

Interpretation shows that the individual results (i) vary widely and are partly outside the control limits when the Jaffé method is used, and (ii) correlate much better with the reference method target values with only few results outside the control limits when the enzymatic methods are used.

Similarly, the website of the Reference Institute for Bioanalytics (www.rbf.bio) allows the participants’ results to be displayed both by method principle and grouped by the manufacturer’s test kit used.

In this way the reliability of commercial test methods can be shown (coincidence with target values, variation of results from different laboratories).

50.12 Criteria of acceptance

Except for random coincidences, the measurement value of an analytical determination always shows more or less extensive random and systematic deviations from the target value.

The extent of the random errors in an analytical system can be calculated as part of the control of precision; the extent of systematic errors of measurement with regard to the target value can only be ascertained using trueness control procedures. Unknown discrepancies between the target value and the true value are not taken into consideration. The obvious goal of analysis is for these errors of measurement to remain within acceptable limits.

The definition of limits for an acceptable scatter and an acceptable systematic error of measurement with regard to analytical results is only possible as per agreement. Two steps are necessary for such an agreement:

A basic standard must be selected which is reasonably related to the desirable analytical precision and trueness
The basic standard must be converted by means of agreed upon factors or formulas into applicable criteria of acceptance.

There is extensive literature on this problem /9/. One of the earliest suggestions for a basic standard is the reference interval of a quantity /10/.

The following two examples should help in understanding the principle of this concept:

1. Let the reference interval for serum chloride be 98–108 mmol/L (103 mmol/L ± 5%). A permissible relative error of measurement of 6%, for instance, would imply that a measurement result in the pathologically low range (e.g., 97 mmol/L), just like one in the pathologically high range (e.g., 109 mmol/L), would have to be considered adequately correct if the actual concentration were 103 mmol/L. The tolerable error of measurement for serum chloride would therefore have to be lowered.

2. Let the reference interval for serum urea be 10–50 mg/dL (30 mg/dL ± 67%). Under this condition, even measurement results with errors or deviations of, for instance, 20% could lead to erroneous interpretation only if the actual urea concentration is close to the limits of the reference interval. The tolerable error of measurement for urea can therefore be set more generously than for chloride.

The examples demonstrate that it can make sense for the purpose of quality assurance to use the reference interval of a quantity as the basic standard for the criteria of acceptance.
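The arithmetic behind the two examples can be checked with a short Python sketch. The function names are chosen here for illustration only; the reference intervals and error percentages are those of the examples.

```python
def error_band(true_value, rel_error_pct):
    """Range of results possible for a given true value and relative error."""
    delta = true_value * rel_error_pct / 100
    return true_value - delta, true_value + delta

def stays_within(ref_low, ref_high, true_value, rel_error_pct):
    """True if every result in the error band still lies inside the reference interval."""
    low, high = error_band(true_value, rel_error_pct)
    return ref_low <= low and high <= ref_high

# Serum chloride: true value 103 mmol/L, relative error 6 %
chloride_band = error_band(103, 6)
chloride_ok = stays_within(98, 108, 103, 6)

# Serum urea: true value 30 mg/dL, relative error 20 %
urea_band = error_band(30, 20)
urea_ok = stays_within(10, 50, 30, 20)

print(chloride_band, chloride_ok)
print(urea_band, urea_ok)
```

For chloride the band extends beyond both reference limits, whereas for urea it remains well inside the interval, which is the contrast the two examples are meant to show.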

To convert this basic standard into applicable criteria of acceptance it has been postulated that:

The analytical scatter in conjunction with the control of precision, expressed as the coefficient of variation, may amount to at most 1/12 of the width of the reference interval expressed as a percentage of its mean.
In conjunction with the “control of trueness” (more precisely, control of accuracy), the relative error or deviation of a measurement value with regard to the target value may amount to at most 1/4 of the width of the reference interval expressed as a percentage of its mean.

For serum chloride, whose reference interval has a width of approximately 10% of its mean, it thus follows that:

The CV of the precision control measurement values would have to be no more than about 0.83% (one twelfth of 10%).
A measurement value for the “control of accuracy” would be allowed to deviate from the target value by at most 2.5% (one quarter of 10%).
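The conversion of the reference interval into criteria of acceptance can be written as a small Python function. The function name and the returned keys are illustrative assumptions; the factors 1/12 and 1/4 are those postulated above. With the exact interval 98–108 mmol/L the width is about 9.7% of the mean, so the computed limits come out slightly below those obtained with the rounded figure of 10%.

```python
def acceptance_limits(ref_low, ref_high):
    """Derive acceptance criteria (%) from the width of a reference interval."""
    mean = (ref_low + ref_high) / 2
    width_pct = 100 * (ref_high - ref_low) / mean   # width as a percentage of the mean
    return {
        "width_pct": width_pct,
        "max_cv_pct": width_pct / 12,               # control of precision
        "max_deviation_pct": width_pct / 4,         # control of accuracy
    }

chloride = acceptance_limits(98, 108)   # serum chloride, mmol/L
urea = acceptance_limits(10, 50)        # serum urea, mg/dL

for name, limits in (("chloride", chloride), ("urea", urea)):
    print(name, {key: round(value, 2) for key, value in limits.items()})
```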

This concept formed the basis for the criteria of acceptance defined in Part B1 of the directive of the German Medical Association notwithstanding some compromises which were agreed upon on account of the still suboptimal state of the art concerning analytical procedures /6/.

Besides the reference interval other approaches for defining a basic standard for criteria of acceptance were also discussed:

The demands placed on the reliability of analytical results by clinicians
The (limited) performance capability of current analytical procedures
The intra-individual and inter-individual biological variation of a quantity.

As part of the intensive efforts to reach an internationally acceptable agreement on the important question of criteria of acceptance, the biological variation is currently almost the only area of agreement.

Despite this agreement concerning the basic standard, a final binding settlement should not be anticipated in the near future.

So far in this discussion, essential aspects of the criteria of acceptance used in conjunction with the control of trueness have not been taken into consideration, e.g.:

The difference in quality of the target values. If the target value is a reference method value, adherence to a defined tolerable error of measurement may be much more problematic than if the target value represents a specific target value used in conjunction with a routine method (Fig. 50-7 – Youden diagram showing the serum creatinine results of an inter laboratory survey conducted by the Reference Institute for Bioanalytics).
The dependence of errors of measurement on the concentration. As is observed for many quantities, relative errors of measurement become significantly larger as the concentration decreases. Thus, for instance, when glucose is determined in the hypoglycemic range using customary routine methods, it is impossible to adhere to the tolerable error of measurement if it is defined as a fixed percentage valid across the entire range of measurement.

50.13 Regulation of quality assurance in Germany

A laboratory director can generally be expected to make use of all the available means of quality assurance in order to ensure the reliability of the results obtained at his laboratory and to meet the relevant medical requirements. For most analyses there are no regulations prescribing the extent of any such measure.

The German Medical Association has published a directive for some areas of laboratory medicine that defines a minimum quality assurance program for a number of analyses.

The Directive of the German Medical Association for Quality Assurance of Quantitative Analyses in Medical Laboratories (2008) /6/ contains:

A part A, which describes the “General Requirements for Quality Assurance of Medical Laboratory Examinations”. Many of the requirements stated there correspond to those of ISO standard 15189 /10/.
A subpart B1, which deals with “quantitative analyses in medical laboratories”. This part contains specific regulations for internal and external quality assurance, including a table listing the control limits for numerous quantities. The requirements are summarized in Tab. 50-9 – Requirements of the German Medical Association.

Further subparts, B2 through Bx, comprising the requirements for qualitative analyses, pathogens, and spermatologic investigations are in development.

References

1. International Organisation for Standardization. ISO 17511. In vitro diagnostic medical devices – Measurement of quantities in biological samples – Metrological traceability of values assigned to calibrators and control materials, ISO, 2003.

2. Beastall GH. Traceability in laboratory medicine: what is it and why is it important for patients? eIFCC 2018; 29 (4): 242–7.

3. International Organisation for Standardization. ISO 15193. In vitro diagnostic medical devices – Measurement of quantities in samples of biological origin – Requirements for content and presentation of reference measurement procedures, ISO, 2009.

4. International Organisation for Standardization. ISO 15194. In vitro diagnostic medical devices – Measurement of quantities in samples of biological origin – Requirements for certified reference materials and the content of supporting documentation, ISO, 2009.

5. International Organisation for Standardization. ISO 15195. Laboratory medicine – Requirements for reference measurement laboratories, ISO, 2003.

6. Fraser CG, Hyltoft Petersen P. Desirable standards for laboratory tests if they are to fulfill medical needs. Clin Chem 1993; 39: 1447–55.

7. Haeckel R, Gurr E, Hoff T, on behalf of the working group Guide Limits on the German Society of Clinical Chemistry and Laboratory Medicine (DGKL). J Lab Med 2016; 40 (4): 263–70.

8. Tonks DB. A study of the accuracy and precision of clinical chemistry determinations in 170 Canadian laboratories. Clin Chem 1963; 9: 217–33.

9. Bundesärztekammer. Richtlinie der Bundesärztekammer zur Qualitätssicherung laboratoriumsmedizinischer Untersuchungen. Dt Ärztebl 2008; 105: A341–A355.

10. International Organisation for Standardization. ISO 15189. Medical laboratories – Particular requirements for quality and competence, ISO, 2003.

11. Sandberg S, Carobene A, Aarsand AK. Biological variation – eight years after the 1st strategic conference of EFLM. Clin Chem Lab Med 2022; 60 (4): 465–8.

12. Diaz-Garzon J, Fernandez-Calle P, Aarsand AK, Sandberg S, Coskun A, Carobene A, Jonker N et al. Long-term within- and between-subject biological variation of 29 routine laboratory measurands in athletes. Clin Chem Lab Med 2022; 60 (4): 618–28.

Table 50-1 Terms in association with location parameters and target values

Term	Explanation
True value	The true value of a quantity is a theoretical location parameter; in a strict sense it is the correct value. It cannot be entirely accurately determined by the use of any real measurement procedure; at the most it can be closely approximated (e.g., by using a reference method).
Expectation value	The expectation value is the mean of the distribution function associated with the measurement values which is characteristic for each method of measurement (i.e., the individual analytical method used by each laboratory). The larger the number of individual measurement values for a quantity of the given sample, the better its mean value will represent the theoretical expectation value. A systematic error which is usually present causes a discrepancy between the expectation and the target value.
Target value	In the quality assurance applied in medical laboratories, “target value” is used as a collective term for conventionally true values of quantities (e.g., in control samples). Conventionally true values may deviate from the true value to an unknown degree. Thus, in a strict sense, they are more or less incorrect. Since the true value of a quantity is unknown, however, a reference method value, a standardized reference method value, or simply a method dependent assigned value is defined as the target value, depending on the given conditions, for the practical purpose of checking trueness.
Reference method value	A reference method value is the result of determining the target value by means of a reference method. It is the best available estimate of the true value.
Assigned value	An assigned value is a method dependent target value for a defined routine method. It is the result of determining the target value by complying with a specified test protocol according to a certain method. Non specificities of the method due to each sample matrix in use may lead to more or less sizable discrepancies between the true and the assigned value. In the context of quality assurance, the comparison of the method dependent assigned value to the expectation value of one’s own identical method allows an assessment as to what degree the method can be used without objections.
Consensus value	This (unofficial) Anglo-American term is used in conjunction with results determined by different laboratories for a given quantity in samples of the same control specimen (e.g., as is customary in an inter laboratory survey). The mean (or median) of all existing values (total mean), but also that of a subsample of values determined by the same method (method mean), is referred to as the “consensus value”. The position of the “consensus value” of all values strongly depends on the position and size of the subsamples involved. Therefore its informational value is very limited. The position of the “consensus value” of a subsample is influenced by the number of individual values and the hard-to-control circumstances surrounding their measurement.

Table 50-2 Types of errors

Types of errors	Explanation
Systematic error	Essentially two factors are involved in systematic errors made during quantitative determinations using routine methods. (1) A constant systematic error (system dependent bias) may, for instance, be due to incorrect calibration of the analytical system. It causes a constant deviation or error of measurement in all determinations. (2) A variable systematic error (sample dependent bias) is caused, for example, by the non-specificity of a method. Its extent may vary from sample to sample depending on how much impact each sample matrix has on the method used.
Random error	Random error is caused by uncontrollable variations both within a given measurement system and during its use. In repeat determinations of an analyte in the same specimen, random errors cause the individual results to vary more or less extensively around their mean. Random errors are in principle unavoidable; by applying analytical thoroughness, however, their extent may be reduced to a minimum characteristic of each method used.

Table 50-3 Relationships between location parameters, target values and measurement values*

Discrepancy	Meaning
A–E (A minus E) True value – measurement value	Inaccuracy of a measurement value (MVx). The inaccuracy refers to a value determined by a certain method (e.g., method 1) in a given laboratory (e.g., laboratory 1). It is caused by different types of errors. A random error results in a more or less sizable deviation from the expectation value; the extent of the deviation depends on the precision of the method but is not predictable for each individual case. A constant systematic error (bias), e.g., due to erroneous calibration, results in a constant deviation, valid for other samples as well, from the optimal value for the given method and the given sample (assigned value). Non-specificity of the analytical method adds another component of distortion to the measurement value because of the matrix of the given sample; in repeat measurements it does so systematically by the same amount. The matrices of other samples cause different, variable systematic errors. Another analytical method for the same parameter in the same sample (method 2) is characterized by different precision, different bias and different sources of non-specificity. Therefore the different types of errors also exert a different effect on the accuracy of the measurement value.
A–D (A minus D) True value – expectation value	Bias With increasing numbers of repeat measurements their mean will be increasingly less influenced by random errors. In the case of an infinite number of individual measurement values, the mean of the expectation value and its random error equal zero.
C–D Assigned value – expectation value	Conventional bias with regard to the assigned value The determination of an assigned value involves several assigned value laboratories and multiple measurements. Proceeding like this results in the constant systematic errors (bias) contained in the individual measurement systems ideally canceling each other out. The discrepancy C–D is thus caused by the bias of the individual measurement system. If no reference method value exists for the parameter, a reliable assigned value offers the next best possibility for calculating the conventionally systematic error.
C–E Assigned value – measurement value	Conventional inaccuracy with regard to the assigned value The discrepancy C–E in addition to C–D contains the random error of an individual measurement.
A–C True value – assigned value	Bias of the assigned value The extent of the bias remains unknown because the true value is not determinable. The bias of the assigned value is caused by the non specificity of the method. In the case of the same sample it exerts its effect on all measurements, determination of the assigned value or individual measurement alike, using the same method.
A–B True value – reference method value	Bias of the reference method value Although the exact extent of the bias of the reference method value remains unknown because the true value is not determinable, its value is considered to be very low according to current scientific knowledge.
B–D Reference method value – expectation value	Conventional bias with regard to the reference method value Because of the low bias of a reference method value, the measurable conventional bias with regard to the reference method expectation value is the best available estimate of the absolute incorrectness (A–D).
B–E Reference method value – measurement value	Conventional inaccuracy with regard to the reference method value The discrepancy B–E additionally contains the random error of an individual measurement.

* For explanation refer to Fig. 50-2 – Measurement values of two laboratories for a parameter in a given sample using two different routine methods.

Table 50-4 Characterization of analytical reliability

Accuracy

The measurement value (including its random error) coincides with the true value. Accuracy has no numerical value.

Inaccuracy (error of measurement)

Discrepancy between the measurement value and the true value due to systematic and random errors.

Trueness

The expectation value (mean of repeat measurements) coincides with the true value. Trueness has no numerical value.

Bias (systematic error)

Discrepancy between the expectation value (mean of repeat measurements) and the true value due to systematic errors.

Precision

Coincidence between repeat measurements. Precision has no numerical value.

Imprecision (random error)

Quantifiable degree of dispersion among repeat measurements due to random errors.

Table 50-5 Statistical tools used in quality assurance of quantitative determinations

Tool	Significance
Sample of values	A sample of values contains all the results which were obtained under identical given conditions (e.g., the determination of the same analyte in the identical specimen using identical methods) and thus may be grouped together.
Mean	The mean x̄ is the sum of the individual results x_i (x₁ + x₂ + ... + x_n) of a sample of values divided by the number of results n (arithmetic mean). The mean characterizes the location of the sample of values if the values are symmetrically distributed, i.e., if they follow, for example, a normal distribution (bell-shaped Gaussian curve). Refer to Fig. 50-4 – Bell shaped Gaussian curve and Fig. 50-5 – Example of an empirical frequency distribution curve; normally distributed sample of values.
Standard deviation	The standard deviation s is the distance between the mean x̄ of a sample of values and the x value of one of the two inflection points of the bell-shaped curve (Fig. 50-4). It is a measure of the width of the bell-shaped (Gaussian) curve and thus of the dispersion of the values contained in the sample. It can be calculated from very few values. If the underlying distribution is normal, 68% of all sample values lie within the range from x̄ – 1s to x̄ + 1s. Describing the dispersion of values by means of the standard deviation is reliable only if the values are normally distributed (Fig. 50-4 – Bell shaped Gaussian curve).
Coefficient of variation	The standard deviation s relative to the mean x̄, expressed as a percentage, is referred to as the coefficient of variation (CV). A more accurate term would be “relative standard deviation”, since it is not, in fact, a coefficient. The term coefficient of variation, however, has established itself internationally. Coefficients of variation facilitate the comparison of the dispersion of samples of values with different means and are independent of the unit of measurement used.
Percentiles	A percentile describes the position of a value sorted by ascending order within a row of values of a given sample. For examples see Tab. 50-6 – Description of percentile. Details are neglected for the sake of better comprehension.
Median	The median is another term for the 50th percentile. It lies in the middle of a sample of values that has been sorted in ascending order: one half of the values are below, the other half above the median. For the values in Fig. 50-6 – Example of an empirical frequency distribution curve; not normally distributed, the mean and standard deviation can, of course, also be calculated. However, if the corresponding bell-shaped curve is reconstructed from these data, the resulting curve does not even come close to describing the actual distribution of values: the mean x̄ lies clearly above the point where the values are concentrated, and almost all the values fall into the range x̄ ± 1s, even though this ought to be the case for only 68% of them. The median, in contrast, clearly marks the point where the values are concentrated. The 16th and 84th percentiles likewise give realistic information about the dispersion, both in terms of its extent and of its asymmetry, which is due to the stronger tendency toward higher values.
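
The tools in this table can be tried out on a small, deliberately skewed sample of values. The following sketch is not taken from the chapter: the data are made up, and the helper `percentile_nearest_rank` uses the nearest-rank convention, which is only one of several percentile definitions in use and is assumed here for simplicity.

```python
import statistics


def percentile_nearest_rank(sorted_values, p):
    """p-th percentile (integer p) by the nearest-rank convention."""
    n = len(sorted_values)
    rank = max(1, -(-p * n // 100))  # ceil(p * n / 100) in integer arithmetic
    return sorted_values[rank - 1]


# Hypothetical, right-skewed sample of values (already sorted in ascending order).
values = [2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 7, 8, 10, 14, 22, 40]

mean = statistics.fmean(values)
sd = statistics.stdev(values)        # sample standard deviation (divisor n - 1)
cv_percent = 100 * sd / mean         # coefficient of variation
median = statistics.median(values)
p16 = percentile_nearest_rank(values, 16)
p84 = percentile_nearest_rank(values, 84)

# Share of values inside mean ± 1 s; about 68% would be expected
# only for normally distributed values.
inside = sum(mean - sd <= v <= mean + sd for v in values) / len(values)

print(f"mean={mean:.2f} s={sd:.2f} CV={cv_percent:.0f}%")
print(f"median={median} P16={p16} P84={p84}")
print(f"share within mean ± 1s: {inside:.0%}")
```

For these made-up numbers the mean (7.85) lies well above the median (5), and 18 of the 20 values fall within the mean ± 1s range, which mirrors the behaviour described for the skewed distribution above.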

Table 50-6 Description of percentile

Number (N) of values	N = 50	N = 100	N = 150	N = 287
Percentile (Abbreviation)	Position in the row of the value corresponding to the percentile
1. (1%)	1.	1.	1.	3.
16. (16%)	8.	16.	24.	46.
50. (50%)	25.	50.	75.	144.
84. (84%)	42.	84.	126.	241.
100. (100%)	50.	100.	150.	287.

Remark: The preference for listing the 16th (16%) and 84th (84%) percentiles is based on the analogy between the range from the 16th to the 84th percentile and the x̄ ± 1s range, which is associated with normally distributed values and also contains 68% of the values.

Table 50-7 Procedures involved in quality assurance

Procedure	Explanation
Control of precision	A given quantity is determined at regular intervals (e.g. with every analytical series) by means of an individual measuring system in an individual laboratory in samples of a suitable control specimen. Once a larger number of measurement values from these repeat determinations (e.g., 20) is available, their location is calculated, for example as the mean, and their dispersion as the standard deviation and the coefficient of variation. The dispersion should not exceed a preset level (criterion for acceptance). For a control specimen to be suitable for the control of precision, it must contain the analyte in a relevant concentration and must be homogeneous and stable. Furthermore, it should be available in amounts large enough to allow as many repeat determinations as possible. For most parameters of laboratory diagnostic evaluations it should be realistic to meet these prerequisites. Target values are not necessary for the control of precision.
Control of trueness and accuracy	A given quantity is determined at regular intervals by means of an individual measuring system in an individual laboratory using samples of a suitable control specimen. The difference between an individual measurement value and the target value is considered to be the error of measurement. The discrepancy between the mean of repeat determinations (expectation value) and the target value reflects the systematic error. The discrepancies with regard to the target value can be stated in absolute terms, although the relative deviation from the target value is frequently of greater practical use. Inaccuracy or bias should not exceed a preset level (criterion for acceptance). A mandatory prerequisite for specimens suitable for the control of trueness is that a target value exists for the quantity. A reference method value or standardized reference method value is of the highest value for assessing trueness (Fig. 50-2 – Measurement values of two laboratories for a parameter in a given sample using two different routine methods). A method-dependent assigned value that was determined according to a specified test protocol is less valuable. Even less valuable is a method-dependent target value that was defined without a standardized test protocol.
Interlaboratory surveys	Essential characteristics of an interlaboratory survey are that samples of the specimen are analyzed in different laboratories and that the results for one or more analytes are compared with each other and, if applicable, with a target value. If a larger number of laboratories participate, it is useful for one person to be in charge of organizing the entire survey and interpreting the data. If the result of an individual measurement is compared with a target value when interpreting data from an interlaboratory survey, the discrepancy between the two yields the conventional inaccuracy of the measurement value (Fig. 50-2). Potential target values for quantities in the control samples of an interlaboratory survey are, as in the internal laboratory control of trueness, reference method values, standardized reference method values, and various assigned values. If none of these target values is available for a quantity, consensus values (mean or median of all results for a parameter or, better still, of all results obtained by the same measurement method) may serve as target values in an interlaboratory survey. How valuable a consensus value is in this context depends strongly on the number of measurement values from which it is calculated.
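
The control of precision and of trueness described in this table can be sketched in a few lines. The example is a minimal illustration under stated assumptions: the function name `evaluate_control_series`, the 20 control results, the target value, and both acceptance limits are hypothetical and do not come from any guideline.

```python
import statistics


def evaluate_control_series(results, target, max_cv_percent, max_bias_percent):
    """Summarize repeat control measurements against preset acceptance limits."""
    mean = statistics.fmean(results)
    sd = statistics.stdev(results)
    cv_percent = 100 * sd / mean                   # imprecision (relative)
    bias_percent = 100 * (mean - target) / target  # conventional bias (relative)
    return {
        "mean": mean,
        "sd": sd,
        "cv_percent": cv_percent,
        "bias_percent": bias_percent,
        "precision_ok": cv_percent <= max_cv_percent,
        "trueness_ok": abs(bias_percent) <= max_bias_percent,
    }


# Hypothetical series of 20 control measurements with a target value of 5.0.
results = [5.1, 5.0, 5.2, 4.9, 5.1, 5.0, 5.3, 5.1, 5.0, 5.2,
           5.1, 4.9, 5.0, 5.2, 5.1, 5.0, 5.1, 5.2, 5.0, 5.1]

summary = evaluate_control_series(
    results, target=5.0, max_cv_percent=3.0, max_bias_percent=2.0
)

for name, value in summary.items():
    print(name, value)
```

The precision check needs no target value, whereas the trueness check cannot be carried out without one. The sign of `bias_percent` is kept so that a deviation above or below the target remains visible.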

Table 50-8 Biological variation of laboratory parameters in athletes /12/

Parameter	Unit	Analytical imprecision CV_a (%)	Within-subject variation CV_i (%) (95% CI)	Between-subject variation CV_g (%) (95% CI)
Albumin	mmol/L	1.1	3.2 (2.9–3.5)	3.8 (2.9–5.1)
ALT	U/L	23	27.6 (24.2–31.4)	30.7 (23.1–42.6)
Amylase	U/L	1.4	8.8 (8.1–9.6)	22.7 (17.9–30.8)
AST	U/L	9.2	17.2 (15.5–19.1)	24.7 (18.9–33.8)
Calcium	mmol/L	1.3	1.9 (1.7–2.1)	2.3 (1.8–3.1)
Chloride	mmol/L	0.4	1.2 (1.1–1.3)	1.2 (0.9–1.6)
HDL-Cholesterol	mmol/L	1.2	8.7 (8.0–9.5)	18.8 (14.8–25.6)
LDL-Cholesterol	mmol/L	1.4	10.6 (9.7–11.6)	28.0 (22.2–38.1)
Total Cholesterol	mmol/L	1.3	7.0 (6.4–7.7)	13.8 (10.7–18.6)
Creatinine	mmol/L	0.3	4.5 (4–5)	13.3 (10.5–17.8)
ALP	U/L	1.5	7.0 (6.2–8.0)	15.0 (10.7–22.8)
Phosphate	mmol/L	2.1	9.3 (8.6–10.2)	7.8 (5.8–10.6)
Glucose	mmol/L	1.2	5.9 (5.4–6.5)	5.2 (5.4–6.5)
LDH	U/L	1.0	8.3 (7.6–9.1)	11.5 (8.8–15.6)
Magnesium	mmol/L	1.6	3.3 (3.0–3.7)	3.7 (2.9–5.0)
Potassium	mmol/L	0.6	4.6 (4.2–5.1)	4.3 (3.3–5.9)
Total protein	g/L	0.9	3.4 (3.2–3.8)	4.6 (3.6–6.2)
Sodium	mmol/L	0.4	0.5 (0.4–0.6)	0.4 (0.3–0.6)
Transferrin	μmol/L	1.4	4.8 (4.4–5.3)	12.5 (9.8–16.5)
Triglycerides	mmol/L	2.3	19.3 (17.7–21.1)	22.2 (16.9–30.3)
Urate	mmol/L	1.0	8.9 (8.2–9.7)	20.6 (26.3–27.8)
Urea	mmol/L	1.3	12.6 (11.2–14.4)	10.3 (6.9–16.1)

Table 50-9 Requirements of the German Medical Association

Requirements	Explanation
Internal quality assurance
Quantities	All quantitative analyses performed by a medical laboratory are subject to internal quality assurance according to the directive of the German Medical Association /9/. For approximately 100 different measurable quantities in blood, serum, plasma, urine, and cerebrospinal fluid, Tab. B 1a through c of the directive lists the control limits (maximum permissible deviations) for the evaluation of individual measurements of control samples. For all other quantities, a procedure for establishing the laboratory’s internal error limits is prescribed.
Individual measurements of control samples	At the beginning of each measurement campaign, a single measurement of the control sample shall be performed. Each day, control samples shall be used in at least two different concentration ranges. The evaluation of the results of the individual control sample measurements shall be performed prior to releasing the results for patient samples.
Evaluation of the individual control sample measurements	At the end of each control cycle, the relative root mean square error is calculated from the results of the individual control sample measurements. The result must not exceed the limits specified in Tab. B 1a through c of the directive.
External quality assurance
Mandatory participation	All medical laboratories that perform quantitative analyses for quantities listed in the directive (Tab. B 1a through c) are required to participate in interlaboratory surveys conducted by reference institutes.
Frequency	Each laboratory shall participate in at least one interlaboratory survey per quarter. Certificates issued by the reference institute confirming that the requirements of the survey have been met are valid for 6 months.
Number of samples	In each case, two different samples with different concentrations are sent to the participating laboratories.
Evaluation	The evaluation of the results for the two individual control sample measurements by the reference institute is based on control limits which are listed in the directive (Tab. B 1a through c) as maximum permissible deviations in interlaboratory surveys.
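
The end-of-cycle evaluation of control sample measurements can be illustrated as follows. The exact formula of the directive is not reproduced in this chapter, so the sketch assumes the usual definition: the square root of the mean squared deviation of the control results from the target value, expressed as a percentage of the target value. The function name, the data, and the limit are hypothetical.

```python
import math


def relative_rmse_percent(results, target):
    """Relative root mean square error of control results, in percent of target."""
    mean_squared_deviation = sum((x - target) ** 2 for x in results) / len(results)
    return 100 * math.sqrt(mean_squared_deviation) / target


# Hypothetical control cycle: target value 10.0, five single measurements.
results = [10.2, 9.9, 10.3, 10.1, 10.0]
rmse = relative_rmse_percent(results, target=10.0)

limit_percent = 5.0  # placeholder; real limits are listed in Tab. B 1a through c
verdict = "pass" if rmse <= limit_percent else "fail"
print(f"relative root mean square error: {rmse:.2f}% ({verdict})")
```

Because the deviations are taken from the target value rather than from the mean of the results, this single figure combines the systematic and the random component of the measurement error.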

Figure 50-1 Traceability of quantitative clinical chemistry analytical methods according to ISO 17511 /1/.

Figure 50-2 Measurement values (MV) of two laboratories for a parameter in a given sample using two different routine methods. The relationships between location parameters, target values and measurement values are shown in Tab. 50-3 – Relationships between location parameters, target values and measurement values.

TV, true value (not determinable); RMV, reference method value; AV, assigned value (method-dependent); EV, expectation value; MVx, individual measurement value with random errors

Discrepancies:
– A–D = Absolute bias (not determinable)
– A–E = Absolute inaccuracy (not determinable)
– B–D or C–D = Conventional bias with regard to the target value (RMV or AV)
– B–E or C–E = Conventional inaccuracy with regard to the target value (RMV or AV)

Each component of the error of measurement may be preceded by a negative or positive sign.

Figure 50-3 Relationships between inaccuracy, incorrectness and imprecision as well as accuracy, trueness and precision.

Figure 50-4 Bell shaped Gaussian curve.

Ideal form of the frequency distribution (density function) of a normally distributed sample of values with a listing of the relative proportion (%) of all values positioned in the segments –1 s to +1 s as well as below –1 s and above +1 s.

x = Rank of the values, e.g. of concentrations
x̄ = Mean
s = Standard deviation
N = Number of values

Figure 50-5 Example of an empirical frequency distribution curve; normally distributed sample of values.

Figure 50-6 Example of an empirical frequency distribution curve; not normally distributed sample of values with a listing of the position of the mean (x̄) and the standard deviation (s) as well as the median (M) plus the 16th and 84th percentiles (%).

x = Rank of the values, e.g. of concentrations
N = Number of values

Figure 50-7 Youden diagram showing the serum creatinine results of an interlaboratory survey conducted by the Reference Institute for Bioanalytics. Each point in these diagrams represents both results from a given participant; the value for sample A is read off the abscissa and that for sample B off the ordinate. In the diagram on the left, all results (538 of 711) from participants who used the Jaffé method are highlighted as dark points. On the right side, all results from participants who used enzymatic methods (59 of 711) are shown as dark points.