What is the symbol for percentile

Quantile, percentile

Recall that the median is determined by the relative position of the data: if you sort the measurement results, the median is exactly the value in the middle. For example, if we know that the median of a test was 83, then we know that 50% of all results are less than 83 and 50% are greater. The median is an example of a percentile (also called a percentile rank); more precisely, the median is the 50th percentile.

Percentiles divide an ordered data set into one hundred parts that each contain the same number of measured values. Therefore, a subdivision into percentiles only makes sense for larger data sets.

In general, a subdivision of this type is called a quantile. In addition to percentiles, there are other important quantiles: quartiles (division into four sections), quintiles (division into five sections) and deciles (division into ten sections).
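As a small illustration of how these subdivisions relate, the following Python sketch counts the cut points for each kind of quantile using the standard library's `statistics.quantiles`. The data set (the numbers 1 to 1000) is made up for this example; the only point is that dividing data into k sections always requires k − 1 cut points.

```python
from statistics import quantiles

# Hypothetical data set, chosen only for illustration.
data = list(range(1, 1001))

# Number of sections for each kind of quantile named above.
sections = {"quartiles": 4, "quintiles": 5, "deciles": 10, "percentiles": 100}

# quantiles(data, n=k) returns the k - 1 cut points that split the data
# into k sections of equal probability.
cut_point_counts = {name: len(quantiles(data, n=k)) for name, k in sections.items()}

for name, count in cut_point_counts.items():
    print(f"{name}: {count} cut points")
```

This is also why only three quartiles, nine deciles and ninety-nine percentiles are normally reported.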

Definition

The percentile P (1 ≤ P ≤ 99) of a distribution is the value for which P% of all values are equal to or below it and (100 − P)% of all values are equal to or above it.

More generally, a quantile is a threshold that specifies what proportion of the values lies below it and what proportion lies above it.

Every distribution has a quantile function. Its domain is the interval from 0 to 1 (0% to 100%). Mathematically speaking, the quantile function is the inverse (inverse function) of the cumulative distribution function.

For example, if a value is in the 35th percentile, then that value is lower than 65% of all other values.
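The relationship between the quantile function and the cumulative distribution function can be tried out with a short Python sketch. It assumes a standard normal distribution purely for illustration (the text does not name a particular distribution) and uses `statistics.NormalDist` from the standard library, whose `inv_cdf` method plays the role of the quantile function.

```python
from statistics import NormalDist

# Assumption for illustration only: a standard normal distribution.
dist = NormalDist(mu=0.0, sigma=1.0)

# The quantile function takes a probability between 0 and 1 and returns a value.
median = dist.inv_cdf(0.5)   # 50th percentile
x35 = dist.inv_cdf(0.35)     # 35th percentile

# Applying the cumulative distribution function undoes the quantile function.
p_back = dist.cdf(x35)

print(f"median = {median:.4f}")
print(f"35th percentile = {x35:.4f}, cdf of that value = {p_back:.4f}")
```

Because the two functions are inverses of each other, feeding the 35th percentile back into the cumulative distribution function returns 0.35 (up to rounding).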

Example

  • If a test result fell in the 89th percentile, what percentage of all results are the same or lower?
    -> 89% of all other results are the same or lower.
  • If a test consisted of a hundred questions and a person answered 95 questions correctly, would that mean that the test result was in the 95th percentile?
    -> No. Percentiles provide information about the relative position of a measurement (in this case, a test result). When calculating the percentile, all other results must be taken into account. If the other participants also achieved quite high results and only 70% of all test results were equal to or lower than 95, then the value 95 is in the 70th percentile, even though the test was completed with 95 out of 100 points.
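The second answer can be checked with a few lines of Python. This is only a sketch: the function name `percentile_rank` and the ten test results are invented for illustration, and the rank is taken as the percentage of all results (including the one in question) that are equal to or lower than the given score.

```python
def percentile_rank(score, all_scores):
    """Percentage of results that are equal to or below `score`."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100.0 * at_or_below / len(all_scores)


# Hypothetical class of ten results (points out of 100), invented for illustration.
results = [88, 90, 91, 92, 93, 94, 95, 96, 97, 99]

rank_of_95 = percentile_rank(95, results)
print(f"A score of 95 is in the {rank_of_95:.0f}th percentile")
```

With these made-up results, seven of the ten scores are 95 or lower, so a score of 95 lands in the 70th percentile, not the 95th.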

Quartiles

While percentiles divide a distribution into 100 sections, this is often more than is needed. Quartiles (from Latin: "quarter values") therefore divide the distribution into only four sections, each with the same number of measured values. They are therefore also suitable for smaller amounts of data. Quartiles are the most important quantiles. They go by several names and notations:

  • Q0.25 = Q1 = first quartile = lower quartile
  • Q0.5 = Q2 = second quartile = median (middle quartile)
  • Q0.75 = Q3 = third quartile = upper quartile
  • Q1.0 and Q0 cover the whole data set and are therefore statistically irrelevant

The difference between the third and the first quartile is called the interquartile range.
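As a sketch of how the three quartiles and the interquartile range might be computed in practice, the following Python snippet uses `statistics.quantiles` from the standard library. The nine data values are invented for illustration, and `method="inclusive"` is just one of several possible calculation rules; other rules can give slightly different quartiles for the same data.

```python
from statistics import quantiles

# Hypothetical sample of nine measured values, invented for illustration.
data = [2, 4, 4, 5, 7, 9, 10, 12, 15]

# n=4 asks for the three cut points that split the data into four sections.
# "inclusive" treats the data as the whole population (minimum and maximum included).
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# Interquartile range: third quartile minus first quartile.
iqr = q3 - q1

print(f"Q1 = {q1}, Q2 (median) = {q2}, Q3 = {q3}, IQR = {iqr}")
```

The middle half of the data lies between Q1 and Q3, so the interquartile range describes the spread of that middle half.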

Calculation of quantiles

There are many different ways to calculate percentiles. They can lead to different results for the same data, but the results are usually quite close to each other. In all methods, however, the data must first be sorted by rank (i.e. from smallest to largest for numbers). The most natural way to find a percentile is to look for a value such that P% of all data are equal to or less than it. However, such a value does not always exist, and so one must settle for the value that comes closest to meeting this criterion. This is where the methods differ: each defines that approximate value in its own way.

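The empirical quantiles are calculated from the sorted values x1 ≤ ... ≤ xn, where n is the number of measured values and p is the quantile sought. If n · p is not an integer, the quantile is the value at position ⌈n · p⌉, that is, n · p rounded up. If n · p is an integer, a common convention is to take the mean of the values at positions n · p and n · p + 1.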

Example

Let us take the following ten measured values as an example (therefore n = 10):

x1, ..., x10 = (1, 2, 2, 3, 5, 8, 9, 12, 12, 13)

We want to calculate the third quartile, which is at p = 0.75. According to the formula for calculating empirical quantiles, we first determine n · p = 10 · 0.75 = 7.5, which is not an integer. Hence we determine the empirical quantile as x⌈7.5⌉ = x8. The brackets ⌈x⌉ round the value of x up, while ⌊x⌋ rounds it down. The third empirical quartile is therefore x8 = 12.
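The calculation above can be reproduced with a short Python sketch. The function name `empirical_quantile` is made up for this illustration. The non-integer case follows the worked example (round n · p up and take that position); the integer case, which the example does not exercise, is assumed here to use the common convention of averaging the two neighbouring values.

```python
import math


def empirical_quantile(values, p):
    """Empirical p-quantile (0 < p < 1) of a list of numbers."""
    xs = sorted(values)          # the data must be ordered first
    n = len(xs)
    k = n * p
    if math.isclose(k, round(k)):
        # Assumed convention: n*p is an integer, so average x_k and x_(k+1).
        k = int(round(k))
        return (xs[k - 1] + xs[k]) / 2
    # n*p is not an integer: take the value at position ceil(n*p) (1-based).
    return xs[math.ceil(k) - 1]


data = [1, 2, 2, 3, 5, 8, 9, 12, 12, 13]

third_quartile = empirical_quantile(data, 0.75)   # n*p = 7.5 -> x8
median = empirical_quantile(data, 0.5)            # n*p = 5   -> mean of x5 and x6

print(third_quartile, median)
```

For the ten values of the example this gives 12 for the third quartile, matching the hand calculation, and 6.5 for the median under the assumed averaging convention.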

However, Microsoft Excel calculates a different third quartile for the same data set, namely 11.25. This is because Excel tries to calculate an "exact" value, even if this value is not part of the original data set. Excel uses a technique called linear interpolation, which assumes that the values change linearly between neighbouring readings. For the sorted data, the result corresponds to the following, somewhat more complicated calculation: the position is k = (n − 1) · p + 1 = 9 · 0.75 + 1 = 7.75, which lies between x7 = 9 and x8 = 12, and the quantile is x7 + 0.75 · (x8 − x7) = 9 + 0.75 · 3 = 11.25.

It is usually not necessary to memorize this formula as Excel and other statistical programs are used for such calculations.
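For readers who would still like to see the interpolation spelled out, here is a minimal Python sketch. The function name `interpolated_quantile` is hypothetical, and the position rule (n − 1) · p is an assumption chosen because it reproduces the value 11.25 quoted above; it is not taken from Excel's documentation.

```python
def interpolated_quantile(values, p):
    """p-quantile (0 <= p <= 1) by linear interpolation between neighbours."""
    xs = sorted(values)
    n = len(xs)
    pos = (n - 1) * p            # fractional position, counted from 0
    lower = int(pos)             # index of the neighbour below (pos is non-negative)
    frac = pos - lower           # how far the position lies towards the next value
    if lower + 1 >= n:
        return float(xs[-1])     # p = 1: the largest value
    return xs[lower] + frac * (xs[lower + 1] - xs[lower])


data = [1, 2, 2, 3, 5, 8, 9, 12, 12, 13]

third_quartile = interpolated_quantile(data, 0.75)
print(third_quartile)
```

For p = 0.75 the position falls three quarters of the way between the seventh value (9) and the eighth value (12), which gives 9 + 0.75 · 3 = 11.25.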