Measuring Inequality in Terms of Entropy
Application of the Theil Index to my Trading Card Collection
In one of my recent articles, I introduced a statistical quantity called the Gini coefficient to measure the inequality of a given set of variables, in our case the values of my trading card collection. This article deals with the Theil Index or, more precisely, the Mean Log Deviation, another well-established inequality measure that derives from the more general concept of entropy.
Entropy — a Historical Aside
Entropy was first recognized by the German physicist Rudolf Clausius in the mid-nineteenth century as a physical property in classical thermodynamics that describes the direction or outcome of spontaneous changes in a system and predicts that certain processes are irreversible or impossible. The definition of entropy is vital to the formulation of the second law of thermodynamics, which states that the entropy of an isolated system will not decrease with time as the system strives towards a state of thermodynamic equilibrium, characterized by the property that the entropy is maximized. While most laws of physics apply equally whether time runs forward or backward, entropy is one of the few quantities that single out a particular direction for time.
In the 1870s, the Austrian physicist Ludwig Boltzmann explained entropy as a measure of the number of possible microscopic configurations of the underlying system's atoms and molecules. He derived that it is proportional to the logarithm of the number of possible microscopic states, the proportionality factor being called the Boltzmann constant. The statistical definition of entropy extends its empirical definition in the framework of classical thermodynamics. It motivates the interpretation of entropy as disorder or, more precisely, as a measure of uncertainty or lack of knowledge about the underlying microstates.
In 1948, the American mathematician and engineer Claude Shannon transferred the concept of entropy to the field of information theory as the amount of information that is needed to fully specify the microstate of a system. In general, information theory studies the transmission, processing, extraction, and utilization of information and aims at the resolution of uncertainty. The informational value of a communicated message depends on the degree to which it is surprising. To take an example from the trading card world: the knowledge that a given pack does not contain a very rare card provides very little information, because that is an event that almost always occurs. On the other hand, the knowledge that a particular pack contains that specific card has a very high informational value, because it reveals the outcome of an event with very low probability.
In information theory, entropy can be formally defined as:
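$$H(X) = -\sum_{i=1}^{n} P(x_i)\,\log P(x_i) \qquad \text{(Eq. 1)}$$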
where X is a discrete random variable with possible outcomes x_1, …, x_n, which occur with probabilities P(x_1), …, P(x_n). The base of the logarithm can be chosen arbitrarily; when information is measured in binary digits (bits), the binary logarithm is usually used.
Because the logarithm will play an important role in this and the following section, its definition, together with some basic examples and properties, is summarized in Table 1:
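In particular, the product and quotient rules, together with the fact that the logarithm of one is zero, will be used repeatedly in the derivations below:

$$\log(ab) = \log a + \log b, \qquad \log\!\left(\frac{a}{b}\right) = \log a - \log b, \qquad \log 1 = 0$$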
To illustrate the concept of entropy in information theory, let's consider a coin toss. At first, let the toss be fair, so that the probability of both heads and tails equals ½. This situation features the greatest uncertainty regarding which outcome will occur and thus should result in the highest possible entropy. In this example, the entropy is given by:
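$$H = -\left(\tfrac{1}{2}\,\log_2\tfrac{1}{2} + \tfrac{1}{2}\,\log_2\tfrac{1}{2}\right) = -\log_2\tfrac{1}{2} = 1 \text{ bit}$$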
If the probability of heads is 0.6 and that of tails is 0.4, the entropy is given by:
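$$H = -\left(0.6\,\log_2 0.6 + 0.4\,\log_2 0.4\right) \approx 0.97 \text{ bits}$$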
It can easily be shown that the entropy is maximized for a uniform probability distribution. On the other hand, minimal entropy is obtained when the probability of one event is 1 and that of the other is 0:
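$$H = -\left(1\cdot\log_2 1 + 0\cdot\log_2 0\right) = 0$$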
Here 0 log (0) is taken to be 0, which is consistent with the limit
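$$\lim_{p \to 0^{+}} p\,\log p = 0$$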
The result of zero entropy is in line with the general interpretation that each toss of the coin delivers no new information, as the outcome is always certain. On the other hand, in the situation of maximum uncertainty, where the probability is 50/50, the prediction of the outcome is most difficult and each toss delivers one full bit of information.
Measuring Inequality with the Help of Entropy — Introduction of the Theil Index and Mean Log Deviation
This section is quite heavy on mathematical formulae and requires some experience with the arithmetic manipulation of logarithmic expressions. If you are not interested in the mathematical derivation of the inequality measure we are about to use, you can skip this section and jump directly to the application in the following one.
Starting from Shannon's definition of entropy [1] in Eq. 1,
we would like to investigate the distribution of values in a trading card collection. In that case, the P(x_i) are given by the ratio of a particular card's value v_i to the total value of the collection V, which can be written as the total number of cards N multiplied by the mean value of the cards:
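$$P(x_i) = \frac{v_i}{V} = \frac{v_i}{N\,\bar{v}}$$

with v̄ denoting the mean value of the cards.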
Using the natural logarithm, we can define the entropy as
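$$H = -\sum_{i=1}^{N} \frac{v_i}{N\bar{v}}\,\ln\!\left(\frac{v_i}{N\bar{v}}\right)$$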
This is a measure of how randomly the values are distributed among all cards in the collection. As we learned above, entropy is maximized when all outcomes are equally likely or, in our case, when each card is equally valuable, i.e., when each value corresponds to the mean value. Therefore, the maximum entropy is given by
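$$H_{\max} = -\sum_{i=1}^{N} \frac{1}{N}\,\ln\!\left(\frac{1}{N}\right) = \ln N$$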
Employing the concept of entropy, the Dutch econometrician Henri Theil introduced an indicator of inequality. The so-called Theil T index measures how distant a given entropy is from the maximum entropy, that is
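$$T = H_{\max} - H = \ln N - H = \frac{1}{N}\sum_{i=1}^{N} \frac{v_i}{\bar{v}}\,\ln\!\left(\frac{v_i}{\bar{v}}\right)$$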
This difference between a given entropy and the maximum possible value is also referred to as redundancy in information theory.
In the following, we will use an inequality measure that is closely related to the Theil T index and is sometimes called the Theil L or, more intuitively, the mean logarithmic deviation (MLD):
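$$\mathrm{MLD} = \frac{1}{N}\sum_{i=1}^{N} \ln\!\left(\frac{\bar{v}}{v_i}\right) \qquad \text{(Eq. 3)}$$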
The MLD measures inequality as the deviation of the mean logarithm of the individual cards' contributions to the total value (second term below) from the logarithm of an equal distribution of these contributions (1/N). This can be seen from the following arithmetic transformation:
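$$\mathrm{MLD} = \frac{1}{N}\sum_{i=1}^{N} \ln\!\left(\frac{\bar{v}}{v_i}\right) = \frac{1}{N}\sum_{i=1}^{N} \ln\!\left(\frac{V/N}{v_i}\right) = \ln\!\left(\frac{1}{N}\right) - \frac{1}{N}\sum_{i=1}^{N} \ln\!\left(\frac{v_i}{V}\right)$$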
The MLD is zero when each card has the same value and increases as the values become more unequal.
Both quantities, MLD and Theil T, belong to the family of the generalized entropy (GE) inequality measures which satisfy the general formula [2]:
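$$GE(\alpha) = \frac{1}{N\,\alpha\,(\alpha-1)}\sum_{i=1}^{N}\left[\left(\frac{v_i}{\bar{v}}\right)^{\alpha} - 1\right]$$

(for α = 0 and α = 1 the expression is understood as the corresponding limit)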
where the parameter α represents the weight given to distances between values at different parts of the distribution. For α = 0 we obtain the MLD, and for α = 1 the Theil T index.
Even though the more popular Gini coefficient has a more intuitive geometric interpretation in terms of the Lorenz curve, it lacks one useful property. As opposed to the Gini coefficient, both the Theil T and the MLD are decomposable in an additive way, meaning that the overall inequality can be decomposed into the part that is due to inequality within certain subgroups plus the part that is due to differences between the subgroups [2]. The additive decomposition of the MLD in terms of n subgroups is given by
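$$\mathrm{MLD} = \sum_{j=1}^{n} \frac{N_j}{N}\,\mathrm{MLD}_j + \sum_{j=1}^{n} \frac{N_j}{N}\,\ln\!\left(\frac{\bar{v}}{\bar{v}_j}\right)$$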
where MLD_j represents the MLD of subgroup j, N_j the number of cards in that subgroup, and v̄_j its mean value. Thus, for a trading card collection, the inequality in terms of the MLD is the sum of the average inequality within each subgroup of cards, weighted by each subgroup's proportion of cards, plus the inequality between the different subgroups.
While the MLD has some further mathematically advantageous properties compared to the Theil T index, which are beyond the scope of this article [3], the main reason I will use it in the following discussion is that its formula is a little easier to apply.
Application of the MLD to my trading card collection
In this final part, I will demonstrate how to conveniently determine the MLD in Excel, using, once again, my trading card collection as an example. I will also illustrate the validity of the decomposition principle with a concrete example.
To calculate the MLD of my trading card collection, I first introduce an auxiliary column F, which simply contains the natural logarithm of the respective card values given in column E.
In the following, we would like to investigate the inequality of the value distributions for cards issued in different years. To easily obtain an aggregation of my data table by year, I create a pivot table
and select the following fields
Conveniently, Excel already determines the number of entries in row M, which I will need for my calculation. This could also be achieved by changing the value field settings accordingly. In the next step, we will calculate the MLD for each subgroup, making use of the following arithmetic transformation of Eq. 3:
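$$\mathrm{MLD} = \frac{1}{N}\sum_{i=1}^{N}\ln\!\left(\frac{\bar{v}}{v_i}\right) = \ln(\bar{v}) - \frac{1}{N}\sum_{i=1}^{N}\ln(v_i)$$

so that only the logarithm of the mean value and the mean of the log values in column F are required.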
Considering that the mean value of any subgroup is just the total value of that subgroup divided by its number of cards, the MLD in Excel corresponds to:
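$$\mathrm{MLD}_j = \ln\!\left(\frac{V_j}{N_j}\right) - \frac{1}{N_j}\sum_{i \in j}\ln(v_i)$$

where V_j is the total value of subgroup j. In the spreadsheet, this amounts to taking LN of the subgroup's total value divided by its count and subtracting the AVERAGE of the corresponding entries in column F; the exact cell references depend on the layout of the pivot table.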
We will compare and try to explain the inequalities in column E in the following section. But first, we will determine the MLD of the total collection, given in cell E14, using all of the sub-MLDs given in cells E4 to E13. To do so, we use the decomposition formula from the previous section, which expresses the total inequality as the sum of the average inequality within each year, weighted by the year's proportion of cards, plus the inequality between the different years. For convenience, we introduce two auxiliary columns: one for the proportion of cards and one for the difference between the logarithm of the overall mean value and the logarithm of each subgroup's mean value.
Now, utilizing the function SUMPRODUCT (which returns the sum of the products of corresponding ranges or arrays), we can easily calculate the total MLD in E16.
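Schematically, E16 is the sum of two SUMPRODUCT terms: the proportion column times the sub-MLDs in E4:E13 (the within-group part) plus the proportion column times the log-difference column (the between-group part); the concrete ranges depend on the sheet layout.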
The result exactly corresponds to the value determined directly in cell E14.
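For readers who prefer to verify the decomposition outside of Excel, the following short Python sketch computes the MLD once directly over all card values and once via the within/between decomposition. The values used here are made-up placeholders, not my actual card prices; any list of positive values grouped by year will do.

```python
import math

# Hypothetical card values grouped by year; placeholder numbers for illustration only.
collection = {
    1992: [400.0, 15.0, 8.0, 5.0],
    1993: [20.0, 12.0, 6.0, 4.0, 3.0],
    1994: [30.0, 10.0, 5.0, 2.0],
}

def mld(values):
    """Mean log deviation: the average of ln(mean / value) over all values."""
    mean = sum(values) / len(values)
    return sum(math.log(mean / v) for v in values) / len(values)

# Direct calculation over the whole collection
all_values = [v for vals in collection.values() for v in vals]
direct = mld(all_values)

# Decomposition: within-group part plus between-group part
n = len(all_values)
overall_mean = sum(all_values) / n
within = sum(len(vals) / n * mld(vals) for vals in collection.values())
between = sum(
    len(vals) / n * math.log(overall_mean / (sum(vals) / len(vals)))
    for vals in collection.values()
)

print(f"direct MLD:     {direct:.6f}")
print(f"decomposed MLD: {within + between:.6f}")  # identical to the direct value
```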
Discussion of Results
In Fig. 1, I plotted the inequality of my Shaq cards' values (in terms of their MLD) for different years. The plot shows rising inequality until 1998, indicating that an increasing number of rare and thus valuable inserts were produced over the years. The exception is 1992, Shaq's rookie year, which shows a large inequality due to the Stadium Club Beam Team booking at $400. Without that outlier, the MLD would be in the range of the succeeding years. The decrease of the MLDs after 1998 is rather an artifact of my collection being weighted towards early-to-mid-'90s cards and should not be taken as an indicator that cards, in general, became less valuable.
Fig. 1 also shows the standard deviation, the most fundamental statistical measure of variation, in grey. Even though it does not match the plot of the MLDs exactly, the correlation is still noticeable. The fact that this simple measure also gives a useful approximation of the inequality of my cards' values is more a coincidence than a general rule, as the standard deviation does not satisfy all of the mathematical properties required of an inequality measure.
References
[1] Shannon, Claude Elwood. "A Mathematical Theory of Communication." Bell System Technical Journal 27 (1948).
[2] Haughton, Jonathan; Khandker, Shahidur R. Handbook on Poverty and Inequality. Washington, DC: World Bank, 2009. https://openknowledge.worldbank.org/handle/10986/11985
[3] Cowell, Frank; Flachaire, Emmanuel. "Inequality Measures and the Median: Why Inequality Increased More Than We Thought." February 2017. http://www.ecineq.org/ecineq_nyc17/FILESx2017/CR2/p327.pdf