Power-Law and Log-Normal Distributions in Temporal Changes of Firm-Size Variables

In this paper the author shows that signed temporal changes of firm-size variables follow a power law for large changes, while a log-normal distribution is found for mid-sized changes. In the analyses, the author employed three databases: high-income data, high-sales data and positive-profits data of Japanese firms. It is particularly worth noting that the growth rate distributions of the temporal changes of firm-size data have no wide tail, unlike the distributions observed in assets and sales of firms, the number of employees and personal income data. An Extended-Gibrat's Law was also found in the growth rate distributions of temporal changes of firm-size variables; under the Detailed Balance, it induces both the power-law and the log-normal distributions in the temporal changes of firm size.


Introduction
Power-law distributions are frequently observed in economic data such as assets, the number of employees, personal income, sales, profits and income of firms (denoted by $x$). The power law is known as Pareto's Law (Pareto 1897), and the probability density function (pdf) is represented as

$$P(x) = C x^{-(\mu+1)}, \tag{1}$$

where $C$ is a normalization and the power $\mu$ is called the Pareto index. In general, the power law is valid only for large-sized companies (Badger 1980; Montroll and Shlesinger 1983). The size threshold of the region of large-sized firms is denoted by $x_{th}$. It has been hypothesized that the pdf follows the log-normal distribution for middle-sized firms below the size threshold $x_{th}$:

$$P(x) \propto \frac{1}{x} \exp\left[-\frac{1}{2\sigma^2} \ln^2 \frac{x}{\bar{x}}\right]. \tag{2}$$

Here, $\bar{x}$ is a mean value and $\sigma^2$ is a variance. Further research on the power-law distribution is warranted because a small number of firms account for a large share of overall sales and profits. Furthermore, since the majority of firms are middle-sized, the log-normal distribution also deserves attention.
It has been shown that these distributions can be explained by laws observed in a massive amount of digitized economic data. Fujiwara et al. (2003, 2004) point out that Pareto's Law can be derived from the Law of Detailed Balance and from Gibrat's Law (Gibrat 1931). Along these lines, Ishikawa (2006a, 2007a) shows that the log-normal distribution can also be deduced using the Law of Detailed Balance and Non-Gibrat's Law. The Detailed Balance is the time-reversal symmetry observed in an equilibrium system. Gibrat's Law states that the conditional pdf of the growth rate is independent of the initial value, whereas Non-Gibrat's Law describes how the conditional pdf depends on the initial value. Gibrat's Law is observed only for companies in the large scale region, and Non-Gibrat's Law for companies in the middle scale region.
It is interesting to note that there are two types of growth rate distributions. The shape of the growth rate distribution of profits or income of firms (Fig. 1) is different from that of assets and sales of firms, the number of employees or personal income (Fig. 2). This difference is observed not only for companies in the large scale region but also for companies in the middle scale region. The point is that the difference might be related to the difference between two Non-Gibrat's Laws for companies in the middle scale region. In Fig. 1, the probability of positive growth decreases and the probability of negative growth increases as the size class (bin) of $x$ increases in the middle scale region (Ishikawa 2006a, 2007a). In Fig. 2, by contrast, the probabilities of both positive and negative growth decrease as the size class of $x$ increases (Aoyama 2004). This size dependence in the middle scale region is significant because most companies and individuals fall in this size range.
In this study, we propose that the shape of the growth rate distribution is determined by the type of economic variable: concretely, by whether or not the variable is calculated by a subtraction. We test this hypothesis by employing data on sales, profits and income.

Firm Size Distributions
In this section, we will review the derivation of Pareto's Law and the log-normal distribution from the Detailed Balance and (Non-)Gibrat's Law, and confirm the laws by employing data for sales, profits and income of Japanese firms.
In Japan, firms having an annual income of more than 40 million yen have been officially designated as "high-income firms"; there are approximately 70 thousand of these firms. The exhaustive database was published by Diamond Inc. Sales data for the top 500 thousand Japanese firms are available in the database "CD Eyes 50" published by TOKYO SHOKO RESEARCH, LTD. This database is thought to be approximately exhaustive. It includes both positive and negative profits data. The number of firms with positive profits is approximately 300 thousand and that with negative profits is approximately 40 thousand. In this study we will exclude firms with negative profits, the number of which is significantly smaller than that with positive profits. This is due to the fact that the negative profits data gathered from high-sales data are exclusive. Although we do not have a complete picture of the positive profits data for companies in the middle scale region, we decided to employ these data to investigate the consistency of the laws with the data. In this study, we will examine the following three databases: high-income data (Database I), high-sales data (Database II) and positive-profits data (Database III).

Pareto's Law from the Detailed Balance and Gibrat's Law
Let firm sizes at two successive points in time be denoted by $x_1$ and $x_2$. The growth rate $R$ is defined as the ratio $R = x_2/x_1$. The Detailed Balance and Gibrat's Law (Gibrat 1931) are represented as follows:
• Detailed Balance
The joint pdf $P_{12}(x_1, x_2)$ is symmetric under the exchange $x_1 \leftrightarrow x_2$:

$$P_{12}(x_1, x_2) = P_{12}(x_2, x_1). \tag{3}$$

• Gibrat's Law
The conditional pdf of the growth rate $Q(R|x_1)$ is independent of the initial value $x_1$:

$$Q(R|x_1) = Q(R), \tag{4}$$

where the conditional pdf $Q(R|x_1)$ is defined as

$$Q(R|x_1) = \frac{P_{1R}(x_1, R)}{P(x_1)} \tag{5}$$

by using the pdf $P(x_1)$ and the joint pdf $P_{1R}(x_1, R)$.
These laws are confirmed in Databases I-III. In order to compare analyses in the next section, we will investigate data for 2003 ($x_0$), 2004 ($x_1$) and 2005 ($x_2$). In the scatter plot of each database, the Detailed Balance (3) is approximately confirmed by applying the one-dimensional Kolmogorov-Smirnov (K-S) test (see Appendix). Figs. 3-5 show time-reversal symmetry under the exchange $x_1 \leftrightarrow x_2$. Gibrat's Law (4) is also confirmed for each database. Figs. 6-8 show that the conditional pdf of the growth rate is approximately independent of the initial value, if the initial value is larger than some threshold $x_{th}$. Here the pdf for $r = \log_{10} R$, denoted by $q(r|x_1)$, is related to that of $R$ by

$$\log_{10} q(r|x_1) = \log_{10} Q(R|x_1) + r + \log_{10}(\ln 10). \tag{6}$$
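The relation between $q(r|x_1)$ and $Q(R|x_1)$ is a change of variables. A short sketch of the intermediate steps, assuming only that probability is conserved under $R = 10^r$:

```latex
\begin{align*}
q(r \mid x_1)\,dr &= Q(R \mid x_1)\,dR, \qquad R = 10^{r}, \\
\frac{dR}{dr} &= R \ln 10, \\
q(r \mid x_1) &= Q(R \mid x_1)\, R \ln 10, \\
\log_{10} q(r \mid x_1) &= \log_{10} Q(R \mid x_1) + r + \log_{10}(\ln 10).
\end{align*}
```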
Note that large negative growth cannot be observed if the data have a lower bound. This is notably observed in Figs. 3 and 6 for high-income Database I. This is also observed in Figs. 4 and 7 for high-sales Database II; however, the lower bound is probably obscure. The Detailed Balance and Gibrat's Law are also confirmed by employing personal income data in Japan (Fujiwara et al. 2003), as well as assets and sales data in France and the number of employees in the UK.
In the literature (Fujiwara et al. 2003), Pareto's Law is analytically derived from the Detailed Balance and Gibrat's Law. By using the relation $P_{12}(x_1, x_2)\,dx_1 dx_2 = P_{1R}(x_1, R)\,dx_1 dR$ under the change of variables from $(x_1, x_2)$ to $(x_1, R)$, these two joint pdfs are related to each other:

$$P_{1R}(x_1, R) = x_1 P_{12}(x_1, x_2). \tag{7}$$

From this relation, the Detailed Balance (3) is rewritten in terms of $P_{1R}(x_1, R)$ as

$$P_{1R}(x_1, R) = R^{-1} P_{1R}(x_2, R^{-1}). \tag{8}$$

Rewriting the joint pdf $P_{1R}(x_1, R)$ in terms of the conditional pdf $Q(R|x_1)$ defined by Eq. (5), the Detailed Balance is expressed as

$$\frac{P(x_1)}{P(x_2)} = \frac{1}{R} \frac{Q(R^{-1}|x_2)}{Q(R|x_1)}. \tag{9}$$

By the use of Gibrat's Law (4), the Detailed Balance is reduced to

$$\frac{P(x_1)}{P(x_2)} = G(R), \tag{10}$$

where we define $G(R) \equiv Q(R^{-1})/(R\,Q(R))$. By setting $R = 1$ after differentiating Eq. (10) with respect to $R$, we obtain the following differential equation

$$G'(1) P(x) + x \frac{dP(x)}{dx} = 0, \tag{11}$$

where $x$ denotes $x_1$. The solution is given by

$$P(x) = C x^{-G'(1)}. \tag{12}$$

This is identical to Pareto's Law (1) with $G'(1) = \mu + 1$. Note that Gibrat's Law is valid only in cases where the initial value is larger than some threshold $x_{th}$. This threshold is coincident with the threshold in Pareto's Law because there is no threshold in the Detailed Balance (Fig. 5 and Appendix).
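The link between $G'(1)$ and the Pareto index can be checked numerically. The following is a minimal sketch, not the procedure used in the paper: it assumes a piecewise power-law growth rate pdf $Q(R)$ that is independent of $x_1$, with illustrative slopes `T_PLUS = 2` and `T_MINUS = 1`, and all function names are hypothetical.

```python
# Sketch: check G'(1) = mu + 1 for an assumed growth rate pdf Q(R).
# The piecewise power-law form and the slopes below are illustrative
# assumptions, not values fitted to any database.

T_PLUS = 2.0   # assumed decay exponent for positive growth (R > 1)
T_MINUS = 1.0  # assumed decay exponent for negative growth (R < 1)
D = 1.0 / (1.0 / T_PLUS + 1.0 / T_MINUS)  # makes Q integrate to 1 on (0, inf)


def Q(R: float) -> float:
    """Growth rate pdf that does not depend on x1 (Gibrat's Law)."""
    if R >= 1.0:
        return D * R ** (-T_PLUS - 1.0)
    return D * R ** (T_MINUS - 1.0)


def G(R: float) -> float:
    """G(R) = Q(1/R) / (R * Q(R)), as defined in the text."""
    return Q(1.0 / R) / (R * Q(R))


def G_prime_at_one(h: float = 1e-4) -> float:
    """Central-difference estimate of G'(1)."""
    return (G(1.0 + h) - G(1.0 - h)) / (2.0 * h)


def pareto_pdf(x: float, mu: float) -> float:
    """Unnormalized Pareto pdf, P(x) = x^(-(mu + 1))."""
    return x ** (-(mu + 1.0))


mu = G_prime_at_one() - 1.0
print("G'(1) =", G_prime_at_one(), "-> Pareto index mu =", mu)
```

With this choice of $Q$, the ratio `pareto_pdf(x1, mu) / pareto_pdf(R * x1, mu)` agrees with `G(R)` for any `R`, which is the content of Eq. (10).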
In order to make Pareto's Law clear, we consider the cumulative number:

$$N(>x) \propto \int_x^{\infty} P(t)\,dt \propto x^{-\mu}. \tag{13}$$

Pareto's Law is confirmed in Databases I-III (Figs. 9-11). In Fig. 9 for the cumulative number plot of income, Pareto's Law holds for incomes over approximately 100 million yen (the number of firms in the region is approximately 25 thousand). This is related to the fact that Gibrat's Law is observed for $n = 2, \ldots, 5$ in Fig. 6. In Fig. 10 for the cumulative number plot of sales, Pareto's Law holds for sales over approximately 200 million yen (the number of firms in the region is approximately 315 thousand). This is related to the fact that Gibrat's Law is observed for $n = 3, \ldots, 20$ in Fig. 7. Each threshold comes from the lower bound of the data. In Fig. 11 for the cumulative number plot of profits, Pareto's Law holds for profits over approximately 100 million yen (the number of firms in the region is approximately 15 thousand). This corresponds to the fact that Gibrat's Law is observed for $n = 16, \ldots, 20$ in Fig. 8. This threshold does not come from the lower bound of the data. For $n = 1, \ldots, 15$, as $n$ increases, the growth rate distributions change according to a law that we have designated Non-Gibrat's Law.
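As a rough illustration of how a Pareto index relates to the cumulative number $N(>x)$, the sketch below draws synthetic sizes from a pure Pareto law and recovers $\mu$. It is an assumption-laden example: the data are simulated rather than taken from Databases I-III, and the maximum-likelihood (Hill) estimator is swapped in for reading the slope off a cumulative plot as the paper does. All names are hypothetical.

```python
import math
import random


def sample_pareto(n, mu, x_th, rng):
    """Draw n synthetic sizes from a pure Pareto law with index mu above x_th."""
    return [x_th * rng.paretovariate(mu) for _ in range(n)]


def cumulative_number(sizes, x):
    """N(>x): the number of firms whose size exceeds x."""
    return sum(1 for s in sizes if s > x)


def hill_estimate(sizes, x_th):
    """Maximum-likelihood (Hill) estimate of mu from the sizes above x_th."""
    tail = [s for s in sizes if s > x_th]
    return len(tail) / sum(math.log(s / x_th) for s in tail)


rng = random.Random(0)  # fixed seed so the run is reproducible
sizes = sample_pareto(20000, mu=1.0, x_th=1.0e5, rng=rng)
print("estimated Pareto index:", hill_estimate(sizes, 1.0e5))
print("fraction above 10 * x_th:", cumulative_number(sizes, 1.0e6) / len(sizes))
```

For $\mu = 1$, the fraction of firms above $10\,x_{th}$ should be close to $10^{-\mu} = 0.1$, which is the power-law decay of the cumulative number in Eq. (13).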

Log-normal Distribution from the Detailed Balance and Non-Gibrat's Law
In the literature (Ishikawa 2006a, 2007a), the log-normal distribution is analytically derived from the Detailed Balance and Non-Gibrat's Law. In order to identify Non-Gibrat's Law in the middle scale region, we approximate $\log_{10} q(r|x_1)$ in Fig. 8 by linear functions of $r$ as follows:

$$\log_{10} q(r|x_1) = c - t_+(x_1)\, r \quad \text{for } r > 0, \tag{14}$$
$$\log_{10} q(r|x_1) = c + t_-(x_1)\, r \quad \text{for } r < 0. \tag{15}$$

These approximations are not appropriate for $n = 1, \ldots, 5$; therefore, we consider the case for $n = 6, \ldots, 20$. Eqs. (14) and (15) are expressed as exponential functions:

$$Q(R|x_1) = d\, R^{-t_+(x_1) - 1} \quad \text{for } R > 1, \tag{16}$$
$$Q(R|x_1) = d\, R^{+t_-(x_1) - 1} \quad \text{for } R < 1, \tag{17}$$

where $d = 10^c/\ln 10$. Under these approximations, the Detailed Balance (9) is reduced to

$$\frac{P(x_1)}{P(x_2)} = R^{\,1 + t_+(x_1) - t_-(x_2)} \tag{18}$$

for the case of $R > 1$. Interestingly, $t_\pm(x)$ in the approximations (14) and (15) are uniquely fixed under the Detailed Balance. By setting $R = 1$ after differentiating Eq. (18) with respect to $R$, we obtain the following differential equation

$$\left[1 + t_+(x) - t_-(x)\right] P(x) + x \frac{dP(x)}{dx} = 0, \tag{19}$$

where $x$ denotes $x_1$. The same differential equation is obtained for $R < 1$. Similarly, from the second and third derivatives of Eq. (18), the following differential equations are obtained:

$$\frac{dt_+(x)}{dx} + \frac{dt_-(x)}{dx} = 0, \qquad \frac{dt_+(x)}{dx} + x \frac{d^2 t_+(x)}{dx^2} = 0. \tag{20}$$

The solutions $t_\pm(x)$ are uniquely fixed as

$$t_\pm(x) = t_\pm(x_{th}) \pm \alpha \ln \frac{x}{x_{th}}. \tag{21}$$

With Eq. (19), $t_\pm(x)$ also uniquely fix the pdf $P(x)$ as

$$P(x) = C x^{-(\mu+1)} \exp\left[-\alpha \ln^2 \frac{x}{x_{th}}\right], \tag{22}$$

where $\mu = t_+(x_{th}) - t_-(x_{th})$. These solutions satisfy Eq. (18) beyond perturbation around $R = 1$ under the restricted assumption of Eqs. (14) and (15). These analytic results are confirmed in Database III. By applying the linear approximations (14) and (15) to the data in Fig. 8, the relation between $x$ and $t_\pm(x)$ is obtained (Fig. 12). On the one hand, Fig. 12 shows that $t_\pm(x)$ hardly responds to $x$ for $n = 15, \ldots, 20$. This means that Gibrat's Law is valid for the region of large scale firms. On the other hand, $t_+(x)$ linearly increases and $t_-(x)$ linearly decreases symmetrically with $\log_{10} x$ for $n = 6, \ldots, 10$. This is Non-Gibrat's Law (21) derived analytically by the linear approximations (14) and (15).
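The claim that the solutions satisfy the Detailed Balance relation away from $R = 1$ can be checked numerically. The sketch below assumes the logarithmic form of $t_\pm(x)$ and the resulting pdf with $\mu = t_+(x_{th}) - t_-(x_{th})$; the parameter values are illustrative and the function names are hypothetical.

```python
import math

# Illustrative parameter values (assumptions for this sketch).
ALPHA = 0.14       # strength of the size dependence in the middle scale region
X_TH = 1.0e5       # threshold, in thousand yen
T_PLUS_TH = 2.0    # t_+(x_th)
T_MINUS_TH = 1.0   # t_-(x_th)
MU = T_PLUS_TH - T_MINUS_TH


def t_plus(x: float) -> float:
    """t_+(x) = t_+(x_th) + alpha * ln(x / x_th)."""
    return T_PLUS_TH + ALPHA * math.log(x / X_TH)


def t_minus(x: float) -> float:
    """t_-(x) = t_-(x_th) - alpha * ln(x / x_th)."""
    return T_MINUS_TH - ALPHA * math.log(x / X_TH)


def log_pdf(x: float) -> float:
    """ln P(x) up to an additive constant, for the power law with log-normal factor."""
    return -(MU + 1.0) * math.log(x) - ALPHA * math.log(x / X_TH) ** 2


def balance_residual(x1: float, R: float) -> float:
    """ln[P(x1)/P(x2)] minus (1 + t_+(x1) - t_-(x2)) * ln R, with x2 = R * x1."""
    x2 = R * x1
    lhs = log_pdf(x1) - log_pdf(x2)
    rhs = (1.0 + t_plus(x1) - t_minus(x2)) * math.log(R)
    return lhs - rhs


for x1, R in [(1.0e3, 1.5), (1.0e4, 3.0), (5.0e4, 10.0)]:
    print(f"x1 = {x1:.0e}, R = {R}: residual = {balance_residual(x1, R):.2e}")
```

The residuals vanish up to floating-point rounding for every $R > 1$ tried, not only near $R = 1$.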
Non-Gibrat's Law (21) and the resultant pdf (22) are considered as Gibrat's Law and Pareto's Law, respectively, for the case of $\alpha = 0$. We apply Eqs. (21) and (22) not only in the middle scale region but also in the large scale one. In this sense, we refer to Eq. (21) as Extended-Gibrat's Law. The parameters are estimated as follows: $\alpha \sim 0$ for $x > x_{th}$, $\alpha \sim 0.14$ for $x_{min} < x < x_{th}$, $t_+(x_{th}) \sim 2$, $t_-(x_{th}) \sim 1$, $x_{th} \sim 10^{2+0.2(16-1)} = 10^5$ thousand (= 100 million) yen and $x_{min} \sim 10^{2+0.2(6-1)} = 10^3$ thousand (= 1 million) yen. Rigorously, a constant parameter $\alpha$ must not take different values. In the database, however, a large number of firms stay in the same region for two successive years. This parameterization is approximately valid for describing the pdf. This is confirmed in Fig. 13. This figure shows that 14,800 firms (a number which accounts for approximately 8.3% of the total data and whose profits account for approximately 91.6% of the total profits) are included in the large scale region ($x \geq x_{th}$). In the middle scale region ($x_{min} \leq x_1 < x_{th}$), there are 130,018 firms (73.7% of data; 8.3% of total profits). A similar analysis was confirmed in the data from 2003 ($x_0$) to 2004 ($x_1$).
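The threshold values quoted above follow from the bin index $n$. The sketch below reproduces the arithmetic under the assumption, inferred from the expressions for $x_{th}$ and $x_{min}$, that the $n$-th logarithmic bin of profits starts at $10^{2+0.2(n-1)}$ thousand yen. The function name and its default arguments are hypothetical.

```python
import math


def bin_lower_edge(n: int, log10_start: float = 2.0, log10_width: float = 0.2) -> float:
    """Lower edge, in thousand yen, of the n-th logarithmic bin (n = 1, 2, ...)."""
    return 10.0 ** (log10_start + log10_width * (n - 1))


x_th = bin_lower_edge(16)   # start of the large scale region
x_min = bin_lower_edge(6)   # start of the middle scale region

# Convert from thousand yen to yen for readability.
print(f"x_th  ~ {x_th:.3g} thousand yen = {x_th * 1e3:.3g} yen")
print(f"x_min ~ {x_min:.3g} thousand yen = {x_min * 1e3:.3g} yen")
```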

Distributions in Temporal Change of Firm Size
In the analyses in the previous section, we investigated growth rate distributions of income, sales and profits. There is a noteworthy difference between them. As depicted in Fig. 1, the growth rate distributions of profits can be approximated by linear functions (14) and (15). The validity of the approximations is confirmed by the results. In Fig. 6, these approximations are also appropriate for the growth rate distributions of income. The growth rate distributions of sales are, however, difficult to approximate by the linear functions because the distributions with a curvature have wide tails (Fig. 7) as depicted in Fig. 2. This difference has been observed in other literature by employing not only Japanese firm data but also European and North American firm data (for instance, Amaral et al. 1997; Okuyama et al. 1999; Matia et al. 2004; Gabaix 2005). This aspect has also been observed in other quantities. The growth rate distributions of GDP have no wide tail (for instance, Canning et al. 1998). The growth rate distributions of personal income in Japan have wide tails (Fujiwara et al. 2003). The growth rate distributions of assets and sales in France and of the number of employees in the UK also have wide tails.
Where does this difference between the shapes of the growth rate distributions come from? On the one hand, income and profits of firms are, roughly speaking, calculated by subtracting total expenditure from total sales. Their values can be both positive and negative. On the other hand, assets and sales of firms, the number of employees and personal income are not calculated by any subtraction. Their values cannot be negative. From these facts, we make the simple assumption that the difference between the shapes of the growth rate distributions comes from whether or not the quantity is calculated by a subtraction. In order to verify this assumption, we will investigate the temporal change of firm size data, which is itself a difference of two values. If our assumption is correct, the growth rate distributions of the temporal change of firm size data can be approximated by linear functions.
Firstly, we analyze the temporal change of sales data, the number of which is the largest among the three databases, Databases I-III. In the analysis, we used sales data over 400 million yen, a value sufficiently larger than the obscure lower bound of the data (Figs. 4 and 7). These sales data are in the Pareto's Law region (Fig. 10). Let us consider two temporal changes $v_{12} = x_2 - x_1$ and $v_{01} = x_1 - x_0$. Here, $v_{12}$ is the change between 2004 ($x_1$) and 2005 ($x_2$), and $v_{01}$ is the change between 2003 ($x_0$) and 2004 ($x_1$). The temporal changes $v_{01}$ and $v_{12}$ can be both negative and positive. The data are classified into the following four cases: ($v_{01} > 0$, $v_{12} > 0$), ($v_{01} > 0$, $v_{12} < 0$), ($v_{01} < 0$, $v_{12} > 0$) and ($v_{01} < 0$, $v_{12} < 0$).
In each case, distributions of the growth rate of temporal sales changes $R = |v_{12}/v_{01}|$ are shown in Fig. 14. In all four cases, no wide tail is observed, as expected.
Here, we take the absolute value of $v$ because it can be negative. Furthermore, Extended-Gibrat's Law is approximately confirmed in each case (Fig. 15) as follows:

$$t_\pm(|v_{01}|) = t_\pm(|v_{th}|) \pm \alpha \ln \frac{|v_{01}|}{|v_{th}|}. \tag{25}$$

The distributions of the temporal sales changes $|v_{01}|$ and $|v_{12}|$ are shown in Fig. 16, in which not only Pareto's Law in the large scale region but also the log-normal distribution in the middle scale region is observed. Figure 16 shows that the Pareto indices for $|v_{01}|$ and $|v_{12}|$ are approximately the same in each figure. This fact and Extended-Gibrat's Law (25) suggest that there is a Detailed Balance under the exchange $|v_{01}| \leftrightarrow |v_{12}|$ in each case. The scatter plots of the temporal sales changes are shown in Fig. 17. In each case, by using the K-S test given in the Appendix, the following Detailed Balance is approximately observed:

$$P_{12}(|v_{01}|, |v_{12}|) = P_{12}(|v_{12}|, |v_{01}|). \tag{26}$$
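The classification into four sign cases and the growth rate of temporal changes described above can be sketched as follows. This is a minimal illustration with made-up numbers, not the author's code. The function names are hypothetical, and discarding firms with a zero change is an assumption of this sketch, since the text does not say how such firms are treated.

```python
def temporal_changes(x0, x1, x2):
    """Return (v01, v12) = (x1 - x0, x2 - x1) for three successive years."""
    return x1 - x0, x2 - x1


def classify(v01, v12):
    """Label the sign case as '++', '+-', '-+' or '--'; None if a change is zero."""
    if v01 == 0 or v12 == 0:
        return None  # assumption: firms with a zero change are left out
    return ("+" if v01 > 0 else "-") + ("+" if v12 > 0 else "-")


def change_growth_rate(v01, v12):
    """Growth rate of temporal changes, R = |v12 / v01|."""
    return abs(v12 / v01)


def group_by_case(firms):
    """Group (|v01|, |v12|, R) triples by sign case.

    `firms` is an iterable of (x0, x1, x2) sizes for three successive years.
    """
    groups = {"++": [], "+-": [], "-+": [], "--": []}
    for x0, x1, x2 in firms:
        v01, v12 = temporal_changes(x0, x1, x2)
        case = classify(v01, v12)
        if case is not None:
            groups[case].append((abs(v01), abs(v12), change_growth_rate(v01, v12)))
    return groups


# Made-up sales figures (x0, x1, x2) used only to exercise the functions.
example = [(100, 150, 250), (100, 150, 125), (100, 80, 120), (100, 80, 70), (100, 100, 120)]
for case, rows in group_by_case(example).items():
    print(case, rows)
```

Within each group, the pairs $(|v_{01}|, |v_{12}|)$ are the points of the scatter plot to which a symmetry test under $|v_{01}| \leftrightarrow |v_{12}|$ would be applied.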
In the temporal sales change data, the Detailed Balance (26) and Extended-Gibrat's Law (25) are observed. The distribution of the temporal sales change data, therefore, follows Pareto's Law in the large scale region and the log-normal distribution in the middle scale region:

$$P(|v|) = C |v|^{-(\mu+1)} \exp\left[-\alpha \ln^2 \frac{|v|}{|v_{th}|}\right]. \tag{27}$$

In the same manner as the analysis of the profits data, we confirm this in Fig. 18. The parameters are estimated as follows: $\alpha \sim 0$ for $|v| > |v_{th}|$, $\alpha \neq 0$ for $|v_{min}| < |v| < |v_{th}|$, $t_+(|v_{th}|) - t_-(|v_{th}|) \sim 1$, $|v_{th}| \sim 10^{4+0.5(5-1)} = 10^6$ thousand (= 1 billion) yen and $|v_{min}| \sim 10^{4+0.5(2-1)} = 10^{4.5}$ thousand (approximately 32 million) yen. In each case, approximately 5-10% of firms in the data are included in the large scale region and approximately 65-70% of firms in the data lie within the middle scale region.
Similar phenomena are observed in Databases I and III. In our analysis of the temporal high-income change in Database I, this phenomenon is confirmed for the case in which the growth rate distribution of firm size has no wide tail and the data are completely exhaustive. In the analysis of the temporal positive-profits change in Database III, this phenomenon is also confirmed for the case in which the growth rate distribution of firm size has no wide tail and the data cover the middle scale region.

Conclusion and Future Issues
In this study, we have shown that the signed temporal change of firm size data follows not only a power law in the large scale region but also the log-normal distribution in the middle scale region. In the analyses, we employed three databases: high-income data (Database I), high-sales data (Database II) and positive-profits data (Database III) of Japanese firms. It is particularly worth noting that the growth rate distributions of the temporal change of firm size have no wide tail, unlike those observed in the assets and sales of firms, the number of employees and personal income data. A growth rate distribution with no wide tail can be linearly approximated. This lack of a wide tail is also observed in firm size variables such as income and profits of firms. From these observations, we conclude that a quantity calculated by a subtraction has no wide tail in its growth rate distribution, and vice versa.
In the data of temporal firm size changes, the Detailed Balance was also confirmed. This leads to Extended-Gibrat's Law. At the same time, the Pareto indices are almost the same value in the large scale regions of two successive temporal change data. The Detailed Balance and Extended-Gibrat's Law lead to Pareto's Law in the large scale region and the log-normal distribution in the middle scale one. This is consistently confirmed in the empirical data.
From the growth rate distribution of the temporal firm size changes with no wide tail, it is possible to derive the following schemes analytically or numerically (Tomoyose et al. 2008). (Scheme A) The growth rate distribution of $x$, which cannot be negative (assets and sales of firms, the number of employees and personal income), has wide tails (Fig. 2). (Scheme B) The growth rate distribution of $x$, which can be negative (profits and income of firms), has no wide tail (Fig. 1). In addition, the difference between the Non-Gibrat's Laws might become more distinct. On the one hand, in the firm size growth rate distributions with no wide tail (Fig. 1), the probability of positive growth decreases and the probability of negative growth increases symmetrically as the size class (bin) of $x$ increases in the middle scale region. On the other hand, in the firm size growth rate distributions with wide tails (Fig. 2), the probabilities of positive and negative growth decrease simultaneously as the size class of $x$ increases.
www.economics-ejournal.org
Figure 19: Each p value of the one-dimensional K-S test for the scatter plot of high-income data points (Fig. 3).
Figure 20: Each p value of the one-dimensional K-S test for the scatter plot of high-sales data points (Fig. 4).