Measures of spread are quantities that indicate how closely a set of data clusters around its centre. There are several different measures of spread.
Standard Deviation and Variance
A deviation is the difference between an individual value in a set of data and the mean for the data.
The larger the deviations, the greater the spread in the data. Values less than the mean have negative deviations, and values greater than the mean have positive deviations. If you add up all the deviations for a data set, they will cancel out, so their sum is zero. The standard deviation is a measure of spread found by taking the square root of the mean of the squares of the deviations.
The lowercase Greek letter sigma (σ) is the symbol for the standard deviation of a population, while the letter s stands for the standard deviation of a sample.
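In symbols, these two definitions are usually written as follows. The notation x_i for an individual value, \mu for the population mean, and \bar{x} for the sample mean is assumed here for illustration:

```latex
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}
\qquad
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```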
where N is the number of data values in the population and n is the number of data values in the sample.
Note that the formula for s has n − 1 in the denominator instead of n. This denominator compensates for the fact that a sample taken from a population tends to underestimate the deviations in the population, because the deviations are measured from the sample mean rather than from the true population mean.
Note that the standard deviation gives greater weight to the larger deviations, since it is based on the squares of the deviations.
The variance is a measure of dispersion that is equal to the square of the standard deviation.
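The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the data set are hypothetical, chosen so that the arithmetic is easy to follow by hand.

```python
import math

def deviations(data):
    """Return each value's deviation from the mean of the data."""
    mean = sum(data) / len(data)
    return [x - mean for x in data]

def population_std(data):
    """Square root of the mean of the squared deviations (divide by N)."""
    squared = [d ** 2 for d in deviations(data)]
    return math.sqrt(sum(squared) / len(data))

def sample_std(data):
    """Same idea, but divide by n - 1 instead of n."""
    squared = [d ** 2 for d in deviations(data)]
    return math.sqrt(sum(squared) / (len(data) - 1))

# Hypothetical data set, chosen only for illustration (its mean is 5).
data = [2, 4, 4, 4, 5, 5, 7, 9]

sigma = population_std(data)       # treats the data as a whole population
s = sample_std(data)               # treats the data as a sample
population_variance = sigma ** 2   # variance is the square of the standard deviation
sample_variance = s ** 2
```

For this data set the deviations sum to zero, sigma is 2, and s is slightly larger because of the n − 1 denominator.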
For large samples, the calculation of the standard deviation can be quite tedious. If working with grouped data, estimate the standard deviation using the following formula.
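A commonly used form of this estimate treats every value in an interval as if it sat at the interval's midpoint. The subscript i (indexing the intervals) and the symbols \mu and \bar{x} for the estimated means are notation assumed for this sketch:

```latex
\sigma \approx \sqrt{\frac{\sum_i f_i (m_i - \mu)^2}{N}}
\qquad
s \approx \sqrt{\frac{\sum_i f_i (m_i - \bar{x})^2}{n - 1}}
\qquad
\text{with mean} \approx \frac{\sum_i f_i m_i}{\sum_i f_i}
```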
where f is the frequency for a given interval and m is the midpoint of the interval.
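As a rough sketch of the grouped-data estimate, the following Python function works from a frequency table. The table layout (lower bound, upper bound, frequency), the function name, and the numbers are all assumptions made for illustration.

```python
import math

def grouped_sample_std(intervals):
    """Estimate the sample standard deviation from (lower, upper, frequency) rows.

    Every value in an interval is treated as if it sat at the midpoint m,
    so the result is an estimate, not the exact standard deviation.
    """
    midpoints = [(lower + upper) / 2 for lower, upper, _ in intervals]
    freqs = [f for _, _, f in intervals]
    n = sum(freqs)
    # Estimated mean: frequency-weighted average of the midpoints.
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    # Each squared deviation is counted f times.
    squared = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints))
    return math.sqrt(squared / (n - 1))

# Hypothetical frequency table: (lower bound, upper bound, frequency).
table = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]
estimate = grouped_sample_std(table)
```

Here the midpoints are 5, 15 and 25, the estimated mean is 16, and the estimate is the square root of 490/9, or about 7.4.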
Quartiles and Interquartile Range
Quartiles divide an ordered set of data into four equal groups. The three dividing points are the first quartile (Q1), the median, and the third quartile (Q3).
The interquartile range, or IQR, is Q3 − Q1, which is the range of the middle half of the data.
The larger the interquartile range, the larger the spread of the central half of the data. Thus the interquartile range provides a measure of spread. The semi-interquartile range is one half of the interquartile range. Both of these ranges indicate how closely the data are clustered around the median.
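The quartiles and both ranges can be sketched in Python as follows. There are several conventions for locating quartiles; this sketch assumes the median-of-halves convention, and the function name and data are hypothetical.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves convention.

    Assumption: for an odd number of values the middle value is left out
    of both halves. Other conventions give slightly different quartiles.
    Needs at least two values.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    return median(lower), median(ordered), median(upper)

data = [7, 15, 36, 39, 40, 41, 42, 43, 47, 49]  # hypothetical values
q1, q2, q3 = quartiles(data)
iqr = q3 - q1        # range of the middle half of the data
semi_iqr = iqr / 2   # semi-interquartile range
```

Here Q1 = 36, the median is 40.5, and Q3 = 43, so the IQR is 7 and the semi-interquartile range is 3.5.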
A box and whisker plot illustrates the spread of a distribution of data.
The box shows the first quartile, the median, and the third quartile. The ends of the whiskers represent the lowest and highest values in the set of data. Thus, the length of the box shows the interquartile range, while the left whisker shows the range of the data below the first quartile, and the right whisker shows the range above the third quartile. A modified box and whisker plot shows outliers as separate points instead of including them in the whiskers.
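The values such a plot would draw can be computed as in the sketch below. Two assumptions are made here that the text does not specify: quartiles use the median-of-halves convention, and a value is treated as an outlier when it lies more than 1.5 × IQR beyond Q1 or Q3, which is a common rule of thumb. The function name and data are hypothetical.

```python
from statistics import median

def box_plot_summary(data, fence=1.5):
    """Return the values a modified box and whisker plot would draw.

    Assumptions: median-of-halves quartiles, and an outlier is any value
    more than `fence` * IQR below Q1 or above Q3.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[-half:])
    iqr = q3 - q1
    low_fence = q1 - fence * iqr
    high_fence = q3 + fence * iqr
    # Whiskers stop at the most extreme values that are not outliers.
    inside = [x for x in ordered if low_fence <= x <= high_fence]
    outliers = [x for x in ordered if x < low_fence or x > high_fence]
    return {
        "whisker_low": inside[0],
        "q1": q1,
        "median": median(ordered),
        "q3": q3,
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

data = [12, 14, 14, 15, 16, 17, 18, 19, 20, 45]  # hypothetical values
summary = box_plot_summary(data)
```

With this data the box runs from 14 to 19, the whiskers reach 12 and 20, and 45 is drawn as a separate outlier point.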
Percentiles are similar to quartiles, except that percentiles divide an ordered set of data into 100 equal intervals.
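As an illustration, Python's standard library can produce the 99 cut points between those 100 intervals. The sketch uses statistics.quantiles with its default interpolation method, which is only one of several conventions for percentiles; the data set is hypothetical.

```python
from statistics import quantiles, median

data = list(range(1, 201))  # hypothetical data: the integers 1 to 200

# n=100 asks for the 99 cut points that split the ordered data into 100 groups.
cut_points = quantiles(data, n=100)

p25 = cut_points[24]  # 25th percentile, the counterpart of Q1
p50 = cut_points[49]  # 50th percentile, the counterpart of the median
p75 = cut_points[74]  # 75th percentile, the counterpart of Q3
```

The 25th, 50th and 75th percentiles play the same role as the three quartiles, although the exact values depend on the interpolation convention used.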
A z-score is the number of standard deviations a given piece of data is from the mean. Thus, the z-score of a datum is given by the formula
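In standard notation, assuming x denotes the datum, \mu or \bar{x} the mean, and \sigma or s the standard deviation:

```latex
z = \frac{x - \mu}{\sigma} \quad \text{(population)}
\qquad
z = \frac{x - \bar{x}}{s} \quad \text{(sample)}
```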
Values below the mean have negative z-scores, and values above the mean have positive z-scores.
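A minimal Python sketch of the z-score calculation follows. It assumes the data form a whole population, so it uses the population standard deviation; the function name and data are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(data):
    """Return the z-score of every value, treating data as a whole population."""
    mu = mean(data)
    sigma = pstdev(data)  # population standard deviation
    return [(x - mu) / sigma for x in data]

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values; mean 5, sigma 2
scores = z_scores(data)
```

The value 2 lies 1.5 standard deviations below the mean, so its z-score is −1.5, while the value 9 lies 2 standard deviations above the mean, so its z-score is 2.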
By: Osman Osman
Great work. You have a good explanation,
but your work is lacking visuals.