A latent class analysis towards stability and changes in breadwinning patterns among coupled households

Abstract A latent class model is proposed to examine couples' breadwinning typologies and to explain wage differentials according to socio-demographic characteristics, using data collected through surveys. We derive an ordinal variable indicating the couple's income provision-role type and assume the existence of an underlying discrete latent variable to model the effect of covariates. Inference is based on a two-step maximum likelihood approach that accounts for concomitant variables, an informative sampling scheme and missing responses. The weighted log-likelihood is maximised through the Expectation-Maximization algorithm, and information criteria are used for model selection. Predictions are made on the basis of the maximum posterior probabilities. Using data collected in Japan over thirty years, we compare couples' breadwinning patterns across time. We provide some evidence of the gender wage gap and show that it can be attributed to the fact that, especially in Japan, duties and responsibilities for child care are borne exclusively by women.


Introduction
A clear division of paid and unpaid work along gender lines in households is found in most countries, but the recent trend in several advanced societies is away from the male breadwinner model and towards the dual-earner family. However, despite the increase in the paid employment of married women in Japan as well as in Western industrialized societies, the notion of a gendered division of labour, namely that men should be the family's primary economic provider and women should stay at home to focus mainly on domestic labour and care for family members, has been accepted as a tradition. While the number of households with wives entirely dependent on their spouses' income has recently declined dramatically, most women in dual-earner households still earn much less than their spouses, and households where wives earn as much as or more than their husbands are very few.
The increasing rate of women's labour force participation does not necessarily mean that economic inequality within couples has changed. A dual-earner arrangement does not by itself liberate women from their traditional gender role, especially in a country with a strong male breadwinner model such as Japan. The Japanese tax and pension systems are legislated by regarding the male breadwinner household as the "standard household". Japanese government policies carried out over time have encouraged married women to stay at home and be dependent on their husbands, to work part-time or in non-regular jobs, and to have broken career patterns. Work and family reconciliation is difficult, and this perpetuates women's dependence on a male breadwinner.
As gender inequalities in the division of labour within the marriage or family are closely related to gender inequalities in other spheres of life, particularly in the labour market as well as in political and other societal arenas, understanding what determines gendered intra-household inequality is essential for researchers who also want to understand other aspects of gender stratification, as well as for politicians interested in adopting laws that favour an active role for women in the labour market. Many studies have argued that women's economic dependency on men is an important attribute of stratification systems and an essential force in the maintenance of gender inequality (see, among others, [26]). Recent studies explore whether gender pay inequalities persist because both men and women perceive them as fair (see [3]). However, as women's educational attainment has been rising, more and more women might be part of dual-career couples and other types of non-traditional couples in contemporary Japan. Having said that, some people may still rely on and find value in the conventional gender roles, as recently stressed by [27].
We aim to examine couples' breadwinning patterns, such as male-breadwinner or equal-income, among Japanese couples according to a latent class model [14] and to study how they have changed over the past four decades, during which women's labour force participation and educational attainment have risen. Moreover, after identifying distinct latent classes of breadwinning types, we aim to explore their associations with the couples' socio-economic statuses in order to detect the main determinants of this classification. In Japan, heterosexual marriage has until recently been considered the only way to form a family, and a marriage must be officially registered with a local government. Lately some people choose alternatives to marriage such as consensual unions and registered partnerships, but the way people live together has not significantly changed (1.1 percent lived in a consensual union in 2015). This is not only because married couples receive many legal and economic benefits not available to unmarried couples, but also because marriage has until recently been necessary for women's financial survival, social interaction and personal well-being [22].
Many studies concern the United States and Australia and investigate what differentiates dual-income couples from husband sole provider couples; for example, [21] shows what differentiates equal-provider couples from male-breadwinner couples among dual-income households and how this has changed over time. Few studies have been conducted in Europe; for example, [30] investigates women as main earners by comparing European countries according to their social policies. Previous research on this topic in Japan is scarce. Recently, [17] used a multinomial logit model to explore the factors that differentiate equal-provider couples in Japan, where wives share the breadwinning responsibility equally with their partners, from male-breadwinner couples among dual-income households. However, that study focused only on dual-earner families, and the respondents in sole-breadwinning families were omitted from the analysis. Furthermore, the breadwinning typologies commonly used in those studies are usually operationalized on the basis of a priori percentages of the wife's income in the total household income. Therefore, breadwinning categories such as "male breadwinner" or "equal" might be constructed arbitrarily.
Our analysis employs a latent class model and relies on the incomes of respondent and spouse declared in national surveys carried out in four waves between 1985 and 2015. We define an ordinal response able to disentangle the earning proportions, derived by comparing two underlying continuous variables, and we account for missing responses as well as for sampling probabilities. In this way, we obtain consistent estimates of the parameters of interest and reliable standard errors, adjusting for possible failures of the sample in covering the target population. The latent class model has been considered for the analysis of data arising in different contexts. With survey data it is a flexible tool to account for the heterogeneity among responses. The proposed model-based approach enables us to investigate the associations with the couple's characteristics through a suitable parameterization as well as to obtain data-driven couple typologies (see, among others, [19] and [4]). Maximum likelihood estimation of the model parameters is well established and is carried out through the Expectation-Maximization algorithm. However, even though missing observations arise frequently, weighted methods for estimating the parameters in the presence of missing responses have been employed only recently. Instead of performing case deletion, we retain the missing responses under the missing-at-random assumption and we impute the missing values of the concomitant variables. In this work we extend the proposal of [20] by considering comparisons across waves.
The paper is organized as follows: in Section 2 we describe the surveys and the data on which the empirical analyses are carried out. In Section 3 we discuss the latent class model with sample weights and maximum likelihood estimation of the model parameters. In Section 4 we report and compare the results across waves. Finally, a discussion concludes the paper.

Survey data
Data are obtained from three decades of Japanese cross-sectional data from the Social Stratification and Social Mobility (SSM) surveys, and from the last decade of the Japanese Stratification and Social Psychology (SSP) survey. The 1985 SSM survey consists of three surveys: a) for men (N=1,239, response rate around 61 percent), b) for men (N=1,234, response rate around 61 percent), and f) for women (N=1,474, response rate around 68 percent). Each survey used a different questionnaire, but some questions overlapped and almost the same variables are available. The 1995 survey consists of two surveys: a) for men and women (1,248 men and 1,405 women, response rate around 66 percent) and b) for men and women (1,242 men and 1,462 women, response rate around 67 percent). Each survey used a different questionnaire but, again, some questions overlapped and we combine the two surveys (see footnote 1). The respondents are interviewed and asked a wide range of questions concerning their socio-economic background, education and social consciousness, and the spouse's information when a respondent has a spouse. The surveys have been conducted through face-to-face interviews with a special focus on social stratification and inequality in contemporary Japan. They are based on nationally representative respondents selected through a multiple-stage sampling design. The respondents were aged between 20 and 69 in 1985, 1995 and 2005, and between 20 and 64 in the 2015 SSP survey. Therefore, we use the data collected from 2,473 men and 1,474 women in 1985, 2,490 men and 2,867 women in 1995, 2,660 men and 3,082 women in 2005, and finally 1,644 men and 1,931 women in 2015. The response rates are 63.3% in 1985, 66.4% in 1995 (61.8% for men and 71.1% for women), 44.1% in 2005 (40.9% for men and 47.2% for women), and 43.0% in 2015 (see footnote 2).
The 2005 and 2015 survey data are provided with individual sampling weights, which are not available for the 1985 and 1995 waves. The weights are obtained with respect to the following auxiliary variables: gender, age group, and region of residence.
The couple's income provision-role type is obtained from the declared incomes of both the survey respondent and the spouse. Incomes include earned and investment incomes and all other additional incomes such as pensions and dividends on stock shares. The measurement of the couple's income provision-role type is based on who may be the dominant provider and on the share of the wife's income in the household income. The relative importance of this concept is also highlighted by [18] and [21]. Five ordinal categories of the key variable are considered: (1) husband sole provider, which consists of couples where only the husband is employed; (2) husband provides majority, which consists of couples where the husband's earnings represent 60 percent or more of the sum of the spouses' incomes; (3) equal providers, which identifies couples where the wife's earnings represent between 40 and 60 percent, meaning that each partner contributes between 40 and 60 percent of total household income; (4) wife provides majority, which consists of couples where the wife's earnings represent 60 percent or more of the combined total income of the husband and wife; (5) wife sole provider, which consists of couples where only the wife is employed. Table 1 shows the weighted frequencies of the response variable for each year. We notice some changes in breadwinning patterns among married couples over the past four decades. The proportion of households with the husband as sole provider has declined from 42.9% in 1985 to 22.9% in 2015. However, despite the continuing rise in Japanese women's participation in the economy, husbands have until recently been the sole or the primary breadwinner in 65% of the couples, and equal-provider couples have accounted for only 11.8% up to 2015. The highest share of item nonresponse for household income and/or spouse income is observed in 2005.
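The five-category coding described above can be illustrated with a minimal Python sketch. The function name `provision_role`, its arguments and the handling of the boundary shares (exactly 40 and 60 percent) are assumptions made for illustration only; they are not taken from the authors' code, and the text does not state how couples with neither spouse employed are treated.

```python
from typing import Optional

# Labels of the five ordinal categories as listed in the text.
CATEGORIES = {
    1: "husband sole provider",
    2: "husband provides majority",
    3: "equal providers",
    4: "wife provides majority",
    5: "wife sole provider",
}


def provision_role(
    husband_income: Optional[float],
    wife_income: Optional[float],
    husband_employed: bool,
    wife_employed: bool,
) -> Optional[int]:
    """Return the ordinal category (1 to 5), or None for a missing response."""
    # Sole-provider categories depend on employment status only.
    if husband_employed and not wife_employed:
        return 1
    if wife_employed and not husband_employed:
        return 5
    # Assumption: couples with neither spouse employed are left unclassified.
    if not husband_employed and not wife_employed:
        return None
    # Item nonresponse on either income gives a missing response.
    if husband_income is None or wife_income is None:
        return None
    total = husband_income + wife_income
    if total <= 0:
        return None
    wife_share = wife_income / total
    # Assumption: a wife's share of exactly 40% counts as "husband provides
    # majority" and a share of exactly 60% as "wife provides majority".
    if wife_share <= 0.40:
        return 2
    if wife_share < 0.60:
        return 3
    return 4
```

For instance, `provision_role(500, 500, True, True)` falls in category 3, while a missing spouse income yields `None`, which the model later treats as a missing response rather than dropping the couple.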
The presence of a preschooler is coded as a binary variable with respect to children aged 0-6. Couples' relative education is measured by two categories: 1 husband has higher education than wife; 2 wife has higher education than husband. Wife's education: 1 less than high school; 2 high school; 3 two-year college degree; 4 four-year college degree or higher. Size of the place of living: 1 major cities; 2 > 200,000; 3 (100,000, 200,000]; 4 ≤ 100,000; 5 small towns and villages. Husband's income is measured in units of ten thousand yen a year.
Concomitant variables are selected, on the basis of previous research, among the available survey measurements according to their possible influence on the division of labour within households. They are shown in Table 2 together with the employed categorization. Educational level is considered as a proxy of human resources, and we expect this variable to be relevant for married couples' choice of breadwinning pattern. The relative education level of the spouses is considered to be connected to values and culture, which may influence the division of labour within the household. Whether the marriage is homogamous, hypergamous or hypogamous may be associated with patriarchal culture. More patriarchal households, which may be associated with female hypergamous couples, may prefer the traditional breadwinning type; relative education may also be related to the spouses' relative power within the marriage, see, among others, [24]. In addition, when available, we consider the variable related to the locality of the respondents. The continuous covariates, namely wife's and husband's age and husband's income, are categorized according to the quantiles illustrated in Table 2 and included as dummy variables to adjust for changes in their distributions throughout the thirty-year period. We define age and income bands since their effects on the couple's provision-role type might not be linear and might be due to the impact of various life events and career stages. Descriptive statistics for these covariates are presented in Table 3.

Latent class model
The latent class model belongs to the class of finite mixture models [16]. It includes an unobserved random variable with random parameters. The advantages of using latent variable models in social research are illustrated, among others, by [25] and [5], Chapters 1 and 2. With reference to a random unit drawn from the population of interest, let $Y_{ij}$ be the derived response variable observed for individual $i$, $i = 1, \ldots, n$, on the $j$-th ordered category, $j = 1, \ldots, r$. Its values are obtained by comparing two continuous variables; therefore, we assume that it is derived from an underlying unobserved latent variable denoted as $U_i$, $i = 1, \ldots, n$. This latent variable has a discrete distribution, left unspecified, with $k$ support points. In this way a semi-parametric model results [15]. As in the standard latent class model formulation, the observed responses are assumed to be independent of one another conditionally on the latent variable. This is named the local independence assumption. The parameters refer to the measurement model and to the latent model. Those of the measurement model are the conditional probabilities of the response variable given the latent variable; writing $Y_i$ for the category observed for individual $i$, these are
$$p(Y_i = j \mid U_i = u), \quad j = 1, \ldots, r, \quad u = 1, \ldots, k.$$
The latent model is defined by the probability of belonging to each latent class. Considering a set of covariates arranged in the vector $X$, where $x$ is the corresponding realization, they enter into the model through the following parameterization
$$\log \frac{p(U_i = u \mid X = x)}{p(U_i = 1 \mid X = x)} = \beta_{0u} + x^{\top}\beta_{1u}, \quad u = 2, \ldots, k, \qquad (1)$$
where $\beta_{0u}$ is an intercept specific to each latent class and $\beta_{1u}$ is the vector of regression parameters. Note that, in the parameterization above, the first latent class is taken as the reference category and has a special role when the latent classes are ordered according to the response categories. It allows us to compare the patterns with respect to the first latent class, and this choice makes the interpretation of the results easier. Other link functions can be chosen, such as those presented in [1].
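As a purely illustrative aid, the multinomial logit parameterization of the class membership probabilities described above, with the first latent class as reference, can be sketched in Python as follows. The function name `class_probabilities` and the coefficient values in the usage note are hypothetical; the authors' estimation relies on adapted functions of the R package LMest, not on this code.

```python
import math
from typing import List, Sequence


def class_probabilities(
    x: Sequence[float],
    intercepts: Sequence[float],
    slopes: Sequence[Sequence[float]],
) -> List[float]:
    """Class membership probabilities p(U = u | x) for u = 1, ..., k.

    intercepts[u - 2] and slopes[u - 2] hold the intercept and the regression
    coefficients of class u = 2, ..., k; class 1 is the reference category,
    so its linear predictor is fixed at zero.
    """
    linear = [0.0]
    for b0, b1 in zip(intercepts, slopes):
        linear.append(b0 + sum(b * v for b, v in zip(b1, x)))
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(linear)
    expo = [math.exp(v - m) for v in linear]
    total = sum(expo)
    return [e / total for e in expo]
```

With two classes this reduces to a binary logit: for example, `class_probabilities([1.0, 0.0], [0.2], [[-1.0, 0.5]])` gives a higher probability to the reference class because the linear predictor of the second class is negative.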
Concerning the missing responses, we assume that the response variable is independent of the missingness mechanism given the observed covariates $X$ and the latent variable $U$. This is a less stringent assumption than the missing-at-random assumption.
To account for the individual sampling weights, denoted as $w_i$, $i = 1, \ldots, n$, such as those provided within the surveys and related to the inverse of the selection probability, we estimate the model through a weighted log-likelihood. Given a sample of $n$ independent individuals for which we observe the responses $y_1, \ldots, y_n$, the latter is determined as
$$\ell(\theta) = \sum_{i=1}^{n} w_i \log p(y_i \mid x_i),$$
where $\theta$ denotes the overall vector of free parameters arranged in a suitable way and $x_i$ denotes the vector of the observed covariate configuration for individual $i$, $i = 1, \ldots, n$. The above quantity is maximized through the Expectation-Maximization (EM) algorithm [7] and [10], which represents the main tool to estimate models with latent variables. It is based on the complete data log-likelihood; for more details see [5].
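To make the weighted objective concrete, the following Python sketch evaluates a weighted log-likelihood for given parameter values. It only computes the quantity that the EM algorithm would maximize and is not an implementation of EM. The names `weighted_loglik`, `class_probs` and `phi` are hypothetical, and letting a missing response contribute a factor of one is an assumption consistent with the ignorability condition stated above, not a description of the authors' code.

```python
import math
from typing import List, Optional, Sequence


def weighted_loglik(
    y: Sequence[Optional[int]],
    w: Sequence[float],
    class_probs: Sequence[Sequence[float]],
    phi: Sequence[Sequence[float]],
) -> float:
    """Weighted log-likelihood of a latent class model with one response.

    y[i]           observed category (0-based) or None if missing
    w[i]           sampling weight of individual i
    class_probs[i] probabilities p(U = u | x_i) for each latent class u
    phi[u][j]      conditional probability p(Y = j | U = u)
    """
    total = 0.0
    for yi, wi, pi in zip(y, w, class_probs):
        if yi is None:
            # Assumption: a missing response is marginalized out, so it
            # contributes log(1) = 0 to the log-likelihood.
            continue
        # Manifest probability: mixture over the latent classes.
        lik = sum(p_u * phi_u[yi] for p_u, phi_u in zip(pi, phi))
        total += wi * math.log(lik)
    return total
```

Setting all weights to one recovers the usual unweighted log-likelihood, which corresponds to the situation of the 1985 and 1995 waves where sampling weights are not available.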
We apply a two-step approach to avoid bias in the estimated coefficients referred to the covariates. First, we estimate the model with the missing responses and sampling weights and we perform a model selection strategy to choose a suitable number of latent classes. This consists of estimating the model for each $k$ several times in order to check for local maxima. We rely on the Akaike Information Criterion (AIC, [2]) and on the Bayesian Information Criterion (BIC, [23]) to guide the selection. They are measures of the relative goodness of fit of a model, accounting simultaneously for its accuracy and complexity. The AIC is defined on the basis of the following index: $AIC_k = -2\hat{\ell}_k(\theta) + 2\,\#par$, where $\hat{\ell}_k$ denotes the maximum of the log-likelihood of the model with $k$ latent classes and $\#par$ denotes the number of free parameters in the model. The BIC differs from the AIC in the penalty term, which also includes the sample size through the term $\log(n)$: $BIC_k = -2\hat{\ell}_k(\theta) + \log(n)\,\#par$. Then, by fixing the parameters of the measurement model, we estimate the remaining parameters by adding the full set of covariates as in Equation (1). Standard errors for the parameter estimates are obtained from the observed information matrix computed through numerical methods.
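The selection of the number of latent classes through information criteria can be sketched as follows. The function names and the log-likelihood values in the usage note are hypothetical and serve only to show the mechanics; they are not the values reported in Table 4.

```python
import math
from typing import Dict, Tuple


def aic(loglik: float, n_par: int) -> float:
    """Akaike Information Criterion: -2 * log-likelihood + 2 * #par."""
    return -2.0 * loglik + 2.0 * n_par


def bic(loglik: float, n_par: int, n: int) -> float:
    """Bayesian Information Criterion: -2 * log-likelihood + log(n) * #par."""
    return -2.0 * loglik + math.log(n) * n_par


def select_k(fits: Dict[int, Tuple[float, int]], n: int, criterion: str = "bic") -> int:
    """Return the number of classes k with the smallest criterion value.

    fits maps k to (maximum log-likelihood, number of free parameters), where
    the log-likelihood is the best value found over several EM initialisations.
    """
    def score(k: int) -> float:
        loglik, n_par = fits[k]
        return bic(loglik, n_par, n) if criterion == "bic" else aic(loglik, n_par)

    return min(fits, key=score)
```

With the made-up fits `{1: (-120.0, 4), 2: (-100.0, 9), 3: (-99.0, 14)}` and `n = 1000`, both criteria prefer `k = 2`, since the small gain in log-likelihood from a third class does not offset the extra parameters.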
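By updating the probabilities of belonging to each latent class through Bayes' theorem [8], we compute the estimated posterior probability of belonging to each latent class, determined as
$$\hat{p}(U_i = u \mid y_i, x_i) = \frac{\hat{p}(U_i = u \mid x_i)\,\hat{p}(y_i \mid U_i = u)}{\sum_{v=1}^{k} \hat{p}(U_i = v \mid x_i)\,\hat{p}(y_i \mid U_i = v)}, \quad u = 1, \ldots, k. \qquad (2)$$
In this way, we obtain a suitable allocation rule for each respondent, and respondents are profiled according to the latent class composition.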
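The allocation rule based on the maximum posterior probability can be illustrated with a short Python sketch. The names `posterior`, `allocate`, `prior` and `phi` are hypothetical, and using the covariate-based class probabilities as the posterior when the response is missing is an assumption made here for illustration.

```python
from typing import List, Optional, Sequence


def posterior(
    y: Optional[int],
    prior: Sequence[float],
    phi: Sequence[Sequence[float]],
) -> List[float]:
    """Posterior class probabilities given the response, via Bayes' theorem.

    y      observed category (0-based) or None if missing
    prior  class probabilities given the covariates, p(U = u | x)
    phi    phi[u][j] = p(Y = j | U = u)
    """
    if y is None:
        # Assumption: with a missing response the data carry no extra
        # information, so the posterior equals the covariate-based prior.
        return list(prior)
    joint = [p_u * phi_u[y] for p_u, phi_u in zip(prior, phi)]
    total = sum(joint)
    return [j / total for j in joint]


def allocate(post: Sequence[float]) -> int:
    """Return the 1-based label of the class with maximum posterior probability."""
    return max(range(len(post)), key=post.__getitem__) + 1
```

Under this sketch a respondent with a missing provision-role type is still assigned to a class, using only the covariates.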

Results

4.1 Latent class model
We imputed the missing values of age and husband's income by using the weighted mean matching method, as explained in the previous section, with sampling weights and covariates as predictors [29]. In this way, it is also possible to allocate respondents with a missing provision-role type to a latent class. Under the missing-at-random assumption, [28] show that multiple imputation methods perform as well as latent class imputation; see also [31] for a comparison of the performance of imputation strategies with an ordinal response variable. The number of latent classes is selected for each wave by estimating models with a number of classes ranging from 1 to 4 and with different initialisations of the EM algorithm. The model is estimated by adapting the functions of the R package LMest [6]. The highest maximum log-likelihood at convergence, the number of free parameters and the corresponding AIC and BIC values are reported in Table 4. The selected number of latent classes is identical across waves. Table 5 shows the estimated conditional probabilities of the response for each wave (see footnote 3). The latent classes are characterized by qualitatively different couple typologies, and the results allow us to label the first latent class as that of traditional couples ($U_T$) and the other as that of new couples ($U_N$): the first consists exclusively of couples where the husband is the sole provider, whereas the second is a more heterogeneous class with a prevalence of couples where the husband provides the majority. The latent class of traditional couples ($U_T$) is characterized by a high degree of gender role specialization, that is, a strong gender-based division of work where the husband specializes in market work and the wife in domestic work and caregiving.
In 2015 we notice that ($U_N$) is composed of 58% of couples where the husband provides the majority, 16% of couples where the partners are equal providers and 7% of couples where the wife provides the majority. The latter percentage is higher than that estimated with data from the first wave.

4.2 Latent class model with covariates
Concomitant variables enter into the model as in Equation (1). The maximum log-likelihoods of the four models are $\hat{\ell} = -\ldots$ and $\hat{\ell} = -\ldots$ with 51 free parameters, respectively, $\hat{\ell} = -\ldots$ with 47 free parameters and $\hat{\ell} = -\ldots$ with 46 free parameters. Table 6 reports the estimates of the logit regression parameters affecting the transition from the latent class of traditional couples ($U_T$) to that of new couples ($U_N$) for each wave. The estimated intercepts are positive and significant, indicating a general tendency towards the latent class of new couples ($U_N$), especially in 2015.
The estimated coefficients can be compared across waves. We observe that having preschool children shows the largest estimated coefficient, whose negative sign indicates that couples with young children tend to belong to the cluster of traditional couples ($U_T$) rather than to that of new couples ($U_N$). In fact, the estimated odds of belonging to ($U_N$) for couples with preschool children are $\exp(\hat{\beta}) = \ldots$ times the estimated odds for those not having preschool children. The negative effect of having preschool children decreases in 2005 with respect to 1995 but increases again in 2015, meaning that having preschool children is strongly associated with belonging to ($U_T$). This result might suggest continued, or even strengthened, gender differentiation in contemporary Japanese society. Although mothers in many countries experience strong normative pressure to fulfil the norms of motherhood, Japanese mothers hold extremely high standards of mothering that lead them to perceive it as gruelling and difficult to manage successfully [11].
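The reading of a logit coefficient as an odds ratio can be illustrated with a small Python sketch. The coefficient and baseline probability used in the usage note are hypothetical values chosen only for illustration and are not the estimates reported in Table 6.

```python
import math


def odds_ratio(beta: float) -> float:
    """Odds ratio associated with a logit regression coefficient."""
    return math.exp(beta)


def shifted_probability(p_baseline: float, beta: float) -> float:
    """Probability of the non-reference class after applying the coefficient.

    p_baseline is the probability of belonging to the non-reference class for
    the baseline covariate category; the baseline odds are multiplied by
    exp(beta) and converted back to a probability.
    """
    odds = p_baseline / (1.0 - p_baseline) * math.exp(beta)
    return odds / (1.0 + odds)
```

For example, with a hypothetical coefficient of -1.0 the odds are multiplied by about 0.37, so a baseline probability of 0.8 of belonging to the new-couple class would drop to roughly 0.6.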
As for the wife's education, the hypothesized positive effect of women's tertiary education is found to be significant. Compared with high school graduates, women who have completed four years of college or more are expected to be more likely to belong to the class ($U_N$), since they may have less traditional gender-role attitudes. However, significant effects of women's tertiary education are not found in periods other than 2005. Interestingly, in 2015 we observe a negative effect of the lowest level of education on the transition to new couples ($U_N$). Having said that, the pattern of the effects of the couple's relative educational level seems constant and significant through the four waves: hypergamy, meaning that the husband has higher education than his spouse, implies that the couple is less likely to be a new couple, whereas hypogamy, meaning that the wife has higher education than her spouse, implies that the couple is more likely to be a new couple. The results suggest that hypogamy is linked to women's greater bargaining power within households. In addition, it contributes to more gender-egalitarian attitudes inside and outside the house.
The effect of the husband's age is positive for the transition to ($U_N$) only in 1985, and it is restricted to middle ages (from 39 to 53). On the other hand, in 1995 a wife's age of 46 and above has a great impact on the transition to the new type of family ($U_N$). In 2005, living in small towns contributes to this transition. The results can be understood in the light of the situation in Japanese society: the labour force participation rate of women is generally lower in urban areas than in smaller cities or rural areas. In large cities, commuting time to work tends to be longer than in rural areas. Moreover, in more urbanised regions, working hours are longer than in rural areas. Working parents who have young children often rely on grandparents for childcare, but in big cities it is difficult for working mothers to ask their parents to look after their children, because many grandparents live in another area of the city or are still working. Concerning the husband's income, we note that the median income of Japanese households decreased from 4,620,000 yen in 2005 to 4,270,000 yen in 2015, probably due to the global economic crisis. High husband's incomes in 2005 and 2015 determine a lower probability of belonging to the new family ($U_N$), meaning that couples where the husband has a relatively high income, for example over 6 million yen (approximately 44,600 euro), are less likely to be new families ($U_N$). The income effects are more clear-cut after 2000, although having an upper-middle level of income in 1985 and 1995 still implies belonging to traditional couples.
Table 6: Estimates of the regression parameters ($\beta$) for each wave and percentages of predicted couples in $U_T$ (categories are defined in Table 2). Significance at 10% (†), 5% (*), 1% (**). The missing category is taken as baseline.

4.3 Posterior probabilities
The spouses' allocation to a latent class is performed through the estimated maximum posterior probability as in Equation (2). The last row of Table 6 reports the estimated percentages of couples predicted in ($U_T$) for each wave. The lowest percentage is estimated for the 2015 wave and it is equal to 11.21%. Table 7 reports the composition of this subpopulation according to the covariates for each wave, in order to compare the values with those reported in Table 3 related to the observed sample proportions.
Traditional families are mainly prevalent when the husband and the wife are young. Categories from 5 to 8, corresponding to a wife's age from 44 to 58 (see Table 2), are less represented in 2015, meaning that, especially recently, some wives are induced to return to work during their middle ages. We expected that younger couples would support gender-egalitarian values more and that this would be reflected in gender equality in couples' earnings structures. The twentieth century gave rise to profound changes in traditional gender roles, as argued by [12], although the force of this "rising tide" has varied among rich and poor societies. The authors demonstrate that richer, postindustrial societies support the idea of gender equality more than agrarian and industrial societies, and that intergenerational differences in values are largest in postindustrial societies and relatively minor in agrarian societies, suggesting that the former are undergoing intergenerational changes in values. They also argue that cohort change in gender-role attitudes in postindustrial societies is unidimensional, with newer cohorts consistently more egalitarian than older cohorts. However, we found the opposite association between age and the probability of being in ($U_N$), since households with the wife in younger birth cohorts are more likely to be traditional families ($U_T$). It is still not normative for young married women to share equal financial responsibilities within the household. This is partly due to the chronic shortage of regular childcare arrangements.
Considering the last survey, none of the couples is predicted in the traditional family if the husband's income is less than 1,750,000 yen a year (approximately 13,000 euro), and 61% of the couples predicted as traditional families have preschool children. We observe that ($U_T$) is mainly characterized by relatively young households living in big cities, generally with the husband having an income of more than 4,250,000 yen a year, the wife having a high school education, the husband's education being higher than that of the wife, and one or more preschool children.

Discussion
About 20-30% of the couples are predicted in the latent class labeled as "Traditional couple", where only husbands undertake the financial responsibility for the family and wives bear domestic responsibility. These percentages show an increasing trend up to 2005. However, in 2015 we estimate a relevant increase of couples predicted in the latent class labeled as "New couple", that is, non-traditional families. One of the reasons why the new families have been more represented in the past decade is that being a conventional single-income household is becoming more difficult due to the recent economic crisis, which caused income stagnation and decline. Over the past two decades, the average male annual income fell steadily. Therefore, in cases where husbands do not have relatively profitable and economically sound jobs, wives tend to share the breadwinner role. According to a survey carried out at national level in 2018, the female employment rate in Japan is higher than that of working women in the United States, but women's entry-level positions and wages are lower than those of their male counterparts, demonstrating that highly qualified occupations are often precluded to women.
We disentangle how the couple typologies are related to the covariates and we highlight some considerations. Having young children has been constantly related to a greater likelihood that the respondents are in couples where the husband is the sole provider. This is partly because of gender stereotypes on wives' employment and on family roles in Japan, where women have to stay at home, cook, clean, and care for children when they have small babies. Another challenge for families with preschoolers is the chronic shortage of regular childcare arrangements. Childcare has become an important policy issue in Japan as well as in many other countries, but there is still a large number of children on waiting lists for childcare centers. Other childcare services, such as home-based group care by family day care providers or babysitters who watch the children either in their own homes or in the children's home, are not very popular in Japan and they are expensive. Moreover, when employed women become pregnant and hope to take maternity leave, it happens that employers place pressure on them to quit. This has led to the coining of the phrase "maternity harassment". These conditions have precluded mothers of preschoolers from staying in the workforce. At the same time, Japanese mothers, not fathers, are subjected to an extraordinary degree of excoriation by physicians, educators, preschool directors, and even high-level government officials. In recent years, municipal governments in Japan have initiated various types of support programs. However, these programs might promote the notion that child rearing should be standardized, and make it extremely difficult for mothers to fulfil the requirements of being "good wives, wise mothers". Moreover, the popularity of social media might exacerbate normative pressures to perform well, because mothers can look at what other mothers are doing, judge themselves in comparison, and feel inadequate in their mothering abilities.
Consequently, a deep sense of guilt might have pushed mothers with young children to withdraw from the labour market more in recent years. High-wage husbands imply a greater chance of building a traditional family. The result can be better understood in the light of the following arguments. When the husband's income is enough to maintain a family and run the home, wives tend to accept being economically dependent and staying at home. If the husband's earnings are not enough to take care of all the family's financial obligations, wives tend to enter the labour force to supplement the husband's income. Under the current Japanese social security system, there is a tax reduction for married couples if one person makes less than 1.5 million yen a year, and dependent spouses of regular employees are automatically entitled to the basic pension without making any payments. Critics say that this entitlement deters full-time housewives from participating in the labour market.
Concerning the relative educational level of the couple, we notice that hypergamous couples, where the husband is more educated than his wife, have a heightened likelihood of traditional, non-egalitarian marriage practices. This suggests that power imbalances within relationships may be related to the couple's choice of income provision-role type. This asymmetric gender relation within families seems to be associated with a patriarchal culture that induces a preference for traditional marriage practices. Having tertiary education does not necessarily imply a higher probability of belonging to a new type of couple than lower educational levels. This points to low economic returns to investment in education for women in Japan.
A worthwhile direction for future research would be to impute the missing responses by using a full information maximum-likelihood method in order to compare the results with those obtained with the current approach. Another research aspect would be to apply the proposed method to make comparative analyses of breadwinning patterns beyond Western societies, along the research lines proposed by [9] and [13]. Comparing countries where breadwinning and caregiving expectations are institutionalized differently would contribute to further improving the understanding of women's opportunities in the various institutional contexts.
1 The Social Stratification and Social Mobility Survey data up to 2005 are available from the Center for Social Research and Data Archives, Institute of Social Science https://csrda.iss.u-tokyo.ac.jp/.
2 In recent years, many countries, including Japan, have been facing an increasing rate of missingness in surveys. Substantial declines in survey response rates after 2005 are mainly related to the enactment of the Act on the Protection of Personal Information in Japan in 2003, which evoked growing awareness of privacy and confidentiality issues.
3 The R code is available from https://github.com/penful/breadwinners