ACCURACY, RELIABILITY AND VALIDITY
The Board of Studies definitions are very brief and the following expanded definitions may be of use:
a) ACCURACY: Exactness or conformity to truth.
Science texts refer to accuracy in two ways:
(i) Accuracy of a result or experimental procedure can refer to the percentage difference between the experimental result and the accepted value. The stated uncertainty in an experimental result should always be greater than this percentage accuracy.
(ii) Accuracy is also associated with the inherent uncertainty in a measurement. We can express the accuracy of a measurement explicitly by stating the estimated uncertainty, or implicitly by the number of significant figures given. For example, we can measure a small distance with poor accuracy using a metre rule, or with much greater accuracy using a micrometer. Accurate measurements do not ensure that an experiment is valid or reliable. For example, consider an experiment for finding g in which the time for a piece of paper to fall once to the floor is measured very accurately. Clearly this experiment would not be valid or reliable (unless it was carried out in a vacuum).
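The percentage difference described in (i) can be sketched in a few lines of Python. This is only an illustration: the function name and the measured value of 9.6 m/s² are made up for the example, and 9.8 m/s² is assumed as the accepted value of g.

```python
def percentage_difference(measured, accepted):
    """Percentage difference between a measured value and the accepted value."""
    return abs(measured - accepted) / abs(accepted) * 100.0

# Hypothetical measurement of g, compared with an assumed accepted value.
measured_g = 9.6   # m/s^2 (made-up result for illustration)
accepted_g = 9.8   # m/s^2 (assumed accepted value)

diff = percentage_difference(measured_g, accepted_g)
print(f"Percentage difference: {diff:.1f}%")
```

A stated uncertainty smaller than this percentage difference would suggest that the uncertainty has been underestimated or that a systematic error is present.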
b) RELIABILITY: Trustworthy, dependable.
In terms of first-hand investigations, the Board seems to define reliability as repeatability or consistency. A reliable experiment gives consistent results (the same within the experimental uncertainty) when it is repeated many times. In terms of second-hand sources, reliability refers to how trustworthy the source is. For example, the NASA web site would be a more reliable source than a private web page. (This is not to say that all the data on the site is valid.) The reliability of a site can be assessed by comparing it with several other sites or sources.
c) VALIDITY: Derived correctly from premises already accepted, sound, supported by actual fact.
A valid experiment is one that fairly tests the hypothesis. In a valid experiment all variables are kept constant apart from those being investigated, all systematic errors have been eliminated and random errors are reduced by taking the mean of multiple measurements. An experiment could produce reliable results but be invalid (for example, Millikan consistently obtained the wrong value for the charge on the electron because he was working with the wrong coefficient of viscosity for air). An unreliable experiment must be inaccurate and invalid, as a valid scientific experiment would produce reliable results in multiple trials.
NOTE - The notes that follow from this point on are my own work.
ERRORS
The two different types of error that can occur in a measured value are:
Systematic error – this occurs to the same extent in each one of a series of measurements eg zero error, where for instance the needle of a voltmeter is not correctly adjusted to read zero when no voltage is present.
Random error – this occurs in any measurement as a result of variations in the measurement technique (eg parallax error, limit of reading, etc).
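The difference between the two types of error can be illustrated with a short Python sketch. All the numbers here are hypothetical: a true voltage of 5.00 V, a zero error of 0.30 V and a small set of random deviations chosen to average to zero.

```python
from statistics import mean

true_voltage = 5.00   # hypothetical true value, in volts
zero_error = 0.30     # hypothetical systematic offset of the voltmeter needle
random_scatter = [0.02, -0.03, 0.01, 0.00, -0.01, 0.01]  # hypothetical random deviations

# Every reading carries the same systematic offset plus its own random deviation.
readings = [true_voltage + zero_error + r for r in random_scatter]

mean_reading = mean(readings)
remaining_offset = mean_reading - true_voltage

print(f"mean reading     = {mean_reading:.2f} V")
print(f"remaining offset = {remaining_offset:.2f} V")
```

Averaging cancels most of the random scatter, but the mean is still wrong by the full zero error. A systematic error has to be removed by correcting the instrument or the method, not by repeating the measurement.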
When we report errors in a measured quantity we give either the absolute error, which is the actual size of the error expressed in the appropriate units, or the relative error, which is the absolute error expressed as a fraction of the actual measured quantity. Relative errors can also be expressed as percentage errors. So, for instance, we may have measured the acceleration due to gravity as 9.8 m/s² and determined the error to be 0.2 m/s². So, we say the absolute error in the result is 0.2 m/s² and the relative error is 0.2 / 9.8 = 0.02 (or 2%). Note relative errors have no units. We would then say that our experimentally determined value for the acceleration due to gravity is in error by 2% and therefore lies somewhere between 9.8 - 0.2 = 9.6 m/s² and 9.8 + 0.2 = 10.0 m/s². So we write g = 9.8 ± 0.2 m/s². Note that determination of errors is beyond the scope of the current course.
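The arithmetic in this example can be checked with a minimal Python sketch. The function name and the dictionary keys are chosen only for illustration.

```python
def describe_error(value, absolute_error):
    """Return the relative error, percentage error and range for value ± absolute_error."""
    relative = absolute_error / abs(value)
    return {
        "relative": relative,               # no units
        "percentage": relative * 100.0,     # relative error as a percentage
        "lower": value - absolute_error,    # same units as value
        "upper": value + absolute_error,
    }

# g = 9.8 ± 0.2 m/s^2, the example used in the text above.
result = describe_error(9.8, 0.2)
print(f"relative error   = {result['relative']:.2f}")
print(f"percentage error = {result['percentage']:.0f}%")
print(f"range            = {result['lower']:.1f} to {result['upper']:.1f} m/s^2")
```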
Consider three experimental determinations of g, the acceleration due to gravity.
Experiment A          Experiment B          Experiment C
8.34 ± 0.05 m/s²      9.8 ± 0.2 m/s²        3.5 ± 2.5 m/s²
8.34 ± 0.6%           9.8 ± 2%              3.5 ± 71%
We can say that Experiment A is more reliable (or precise) than Experiment B because its relative error is smaller and therefore if the experiment was repeated we would be likely to get a value for g which is very close to the one already obtained. That is, Experiment A has results that are very repeatable (reproducible). Experiment B, however, is much more accurate than Experiment A, since its value of g is much closer to the accepted value. Clearly, Experiment C is neither accurate nor reliable.
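The comparison above can be reproduced with a short Python sketch. It assumes an accepted value of 9.8 m/s² for g. It uses the relative error as a rough measure of reliability and the deviation from the accepted value as a rough measure of accuracy, following the reasoning in these notes. The variable names are illustrative only.

```python
ACCEPTED_G = 9.8  # m/s^2, assumed accepted value

# (measured value, absolute error) in m/s^2 for each experiment
experiments = {"A": (8.34, 0.05), "B": (9.8, 0.2), "C": (3.5, 2.5)}

summary = {}
for name, (value, abs_err) in experiments.items():
    relative_pct = abs_err / value * 100.0                          # spread of the result
    deviation_pct = abs(value - ACCEPTED_G) / ACCEPTED_G * 100.0    # distance from accepted value
    summary[name] = (relative_pct, deviation_pct)
    print(f"Experiment {name}: relative error {relative_pct:.1f}%, "
          f"deviation from accepted value {deviation_pct:.1f}%")

most_repeatable = min(summary, key=lambda n: summary[n][0])
closest_to_accepted = min(summary, key=lambda n: summary[n][1])
print(f"Smallest relative error: Experiment {most_repeatable}")
print(f"Closest to accepted value: Experiment {closest_to_accepted}")
```

Experiment A has the smallest relative error and Experiment B is closest to the accepted value, in agreement with the discussion above.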
In terms of validity, we could say that Experiment B is quite valid since its result is very accurate and reasonably reliable – repeating the experiment would obtain reasonably similar results. Experiment A is not valid, since its result is inaccurate and Experiment C is invalid since it is both inaccurate and unreliable.
How do you improve the reliability of an experiment? Clearly, you need to make the experimental results highly reproducible. You need to reduce the relative error (or spread) in the results as much as possible. To do this you must reduce the random errors by: (i) using appropriate measuring instruments in the correct manner (eg use a micrometer screw gauge rather than a metre ruler to measure the diameter of a small ball bearing); and (ii) taking the mean of multiple measurements.
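As a rough illustration of point (ii), the following Python sketch takes the mean of several repeated measurements and estimates their spread. The five readings are hypothetical. The sample standard deviation and the standard error of the mean are one common way of describing spread; they are an assumption of this sketch rather than something required by the notes.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical repeated measurements of g, in m/s^2.
readings = [9.7, 9.9, 9.8, 10.0, 9.6]

average = mean(readings)
spread = stdev(readings)                        # sample standard deviation of single readings
standard_error = spread / sqrt(len(readings))   # estimated uncertainty of the mean

print(f"mean           = {average:.2f} m/s^2")
print(f"spread         = {spread:.2f} m/s^2")
print(f"standard error = {standard_error:.2f} m/s^2")
```

The uncertainty of the mean is smaller than the spread of the individual readings, which is why taking more measurements improves reliability.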
To improve the accuracy and validity of an experiment you need to: keep all variables constant other than those being investigated; eliminate all systematic errors by careful planning and performance of the experiment; and reduce random errors as much as possible by taking the mean of multiple measurements.