Why Do Females Live Longer Than Males?

Jean Lemaire
Wharton School, University of Pennsylvania
Insurance and Risk Management Department
3641 Locust Walk, CPC 310
Philadelphia, PA 19104-6218, USA
Tel: 1 (215) 898-7765
Fax: 1 (215) 898-0310
[email protected]
NOVEMBER 13, 2000
In most countries, females live several years longer than males.
Many biological and behavioral reasons have been presented in the
scientific literature to explain this “female advantage.” A cross-sectional regression study, using 50 explanatory variables and data
collected from 169 countries, provides support to the behavioral
hypothesis. Four variables, unrelated to biological sex differences,
explain over 61% of the variability of the life expectancy
differential. One variable (the number of persons per physician)
summarizes the degree of economic development of a country. The
three other selected variables (the fertility rate, the percentage of
Hindus and Buddhists, and Europeans living in the former Soviet
Union) are social/cultural/religious variables.
Keywords: Life expectancy, cross-country regression, female survival advantage
This work was supported by an unrestricted grant to the Leonard Davis Institute of Health Economics
provided by the Merck Company. Many thanks to Narumon Saardchom for numerous suggestions, and
to Jianhua Huang for outstanding statistical and programming advice.
1. Introduction – Literature Overview
Mortality rates have decreased markedly in the twentieth century. The gap in
life expectancies between rich and poor, whites and non-whites, educated and less
educated, has narrowed significantly. However, the gender gap has become wider. In
most countries, male fetuses, infants, children, and adults exhibit greater mortality.
This directly affects the sex ratio of the population, and social and demographic
factors such as the chances of marriage, the duration of widowhood, the stability of
social security systems, the construction of unisex actuarial tables, the pricing of
annuities and second-to-die policies, and the valuation of pension plans. The
male/female sex ratio at conception is estimated to range between 1.2 and 1.5, the
first and in some respects the only biological advantage of the males. One hundred
years (and nine months) later, females outnumber males by a ratio of four to one. This
results in life expectancy differentials at birth averaging 4.51 years worldwide, with a
maximum of 12.3 years in Belarus. Males also have higher mortality rates in the vast
majority of animal species (Retherford, 1975.)
A wide variety of variables, and interactions among them, influence sex
differences in mortality. Factors explaining the “female advantage” (FA) can be
broadly subdivided into biological and behavioral causes.
Biological differences: women are biologically more fit than men, due to genetic and
hormonal differences; they benefit more from advances in medical science and
economic progress.
Behavioral differences: the lifestyle of men is damaging to their health; the FA
increases as discrimination against females subsides, following changes in cultural and
religious beliefs.
Identifying the causes of the FA is critical for an accurate forecast of mortality
in the 21st century. If the larger part of the FA is the result of behavior (such as
smoking, stress, exposure to AIDS, driving patterns), the FA should reduce, as
behaviors of the two genders tend to become similar in many societies. If the larger
part of the FA is due to biological causes, a significant difference will persist, barring
any spectacular medical breakthrough.
A vast body of literature, from many different disciplines (medicine, biology,
sociology, demography, and epidemiology) addresses the issue. Excellent summaries
are Waldron (1985) and Nathanson (1984). A comprehensive survey by Kalben
(2000) concludes that the causes of the FA supported by evidence are (i) biological,
(ii) the greater prevalence of smoking among males, and (iii) the better ability of
females to take advantage of socioeconomic and medical advances of the last 150
years. The theory that the FA is the result of greater male labor force participation and
occupational risk is not supported by evidence. There is no confirmation of the widely
held assumption that the FA will progressively disappear as women achieve equality
with men, particularly in employment.
Studies supporting the biological hypothesis
Wingard (1982) performs a multiple logistic analysis of the mortality of 6,928
adults from California, followed up for nine years. The study controls death rates
for sixteen health, social, demographic, and psychological factors: age, race,
socioeconomic status, occupation, health, use of health services, smoking and alcohol
use, physical activity, weight, sleeping patterns, marital status, social contacts, church
and group membership, and life satisfaction. The unadjusted male-to-female
mortality ratio is 1.5. Controlling for factors such as smoking and alcohol use decreases this
ratio, as more men smoke and drink. Controlling for other factors, such as physical
activity, increases the ratio, as more women than men are physically inactive. The
adjustment for all 16 factors increases the mortality ratio to 1.7. So this large set of
demographic and behavioral factors does not explain the FA. Other behaviors that
differ among men and women, such as suicides, homicides, and fatal accidents, only
account for a small proportion of total deaths and cannot explain the FA either. Men in
the 15-24 age group exhibit excess mortality from motor vehicle accidents, but this
only explains a small fraction of the overall sex mortality differential. It is concluded
that an explanation of the FA needs to incorporate biologic factors. Also, interactions
between biologic and behavioral factors need to be considered, as the impact of a
given genetic factor on the FA can vary considerably according to environmental
Madigan and Vance (1957) study a group showing few behavioral differences
between males and females: the teachers and staff of Roman Catholic Brotherhoods
and Sisterhoods, who lead very similar lives as regards diet, housing, work, recreation,
and medical care. Many sources of mortality differentials, such as pregnancies,
employment differences, service in armed forces, hazardous leisure and work
activities, do not exist in this group. Variables that could not be controlled include
smoking, alcohol consumption, and obesity. Over 41,000 subjects were observed
during a 54-year period. Because of their lifestyle, both Brothers and Sisters
experience lower mortality rates than the general population. However, sex mortality
differentials are similar, and even greater after the age of 45. An analysis of the
causes of death shows that women may be no more resistant than men to infectious
and contagious diseases, so that the gains achieved by women this century may be
explained by a better constitutional resistance to degenerative diseases. The increasing
FA may result from the transition from times where infectious diseases were the main
cause of death, to modern times where death mostly results from degenerative
diseases. The disappearance of infectious diseases unmasks an innate male survival
disadvantage from certain degenerative diseases.
There is substantial evidence that males have not benefited from medical
advances as much as females. Graney (1979) compares pre- and post-1946 mortality
rates, showing that infant mortality dropped drastically for both sexes after the
introduction of antibiotics; by far the greatest decline occurred for females. The
period from 1950 to 1969 saw a 17% decline in death rates from chronic heart
disease in the United States, reflecting a 22% reduction among women and a 7% decline
among men. The improvement in maternal mortality, from 66 per 10,000 in the 1920’s
to 1.5 per 10,000 in 1969, has benefited females exclusively. Mortality from cancer of
the reproductive organs is lower for males, so that females are enjoying more benefits
from the improvement of cancer detection and treatment (Retherford, 1975.)
Graney (1979) provides a genetic explanation of the FA. While any X
chromosome contains a large amount of genetic information, the Y chromosome
carried by males is smaller, has fewer genes, and carries less information. It has even
been suggested that a Y chromosome may act as no more than a blank (Scheinfeld,
1958.) A Y chromosome-bearing sperm is lighter and swims faster than its X-bearing
counterpart, resulting in more male conceptions. However, males lack the genetic
advantage of a second X chromosome. With their two X chromosomes, females use
the entire immunology system of both parents. Males do not have a second X
chromosome to provide extra protection. If an abnormal gene turns up in a male’s
single X, he is at its mercy, while a female will have a normal gene in her extra X to
counteract the defective gene.
Waldron (1976) provides a hormonal explanation of the FA. Males produce
more androgens than estrogens, while in females proportions are reversed.
Androgens, particularly testosterone, raise blood pressure and increase liver
production of LDL, the bad cholesterol. Estrogens act on the liver to produce more
immune globulin and more HDL, the good cholesterol. This makes the female
biochemical environment better able to fight bodily stresses. After menopause, the
decrease in estrogen levels seems to have an immediate impact on the cardiovascular
risk. The male to female ratio for myocardial infarction drops from 3:1 before age 50
to less than 2:1 after. The hormonal explanation is supported by Hamilton and Mestler
(1969), who compare the life expectancy at age 8 of castrated and intact men.
Castrated men live 10.2 years longer.
The greater adaptability of the female body may arise from the need to adjust
to the huge changes that take place during menstruation, childbearing, and menopause.
Graney (1979) suggests that biological differences between the sexes are related to
their differentiated social roles. To support the intense demands of pregnancy,
childbirth, and nursing, females have developed biological resources that are available
at other times as emergency reserves. Menopause promotes longer life by eliminating
the mortality risk from childbirth. Male biologic characteristics have evolved to meet
long-term demands of hunting, shelter-building, and even combat with other males; so
size and muscle mass are maximized in males, leaving fewer reserves to combat
emergencies such as acute infections.
Studies supporting the behavioral hypothesis
There is considerable evidence that changes in smoking habits this century
have contributed to the evolution of the FA. Retherford (1975) finds that the sex
difference in life expectancy between the ages of 37 and 87 was 2.71 years for
nonsmokers and 5.13 years for smokers in 1962. He concludes that nearly half of this
difference is due to tobacco smoking, and that about 75% of the increase of the life
expectancy difference between 1910 and 1962 is due to changes in smoking habits.
However, cigarette consumption is correlated with alcoholism, socioeconomic status,
psychological type, and marital status, and no attempt is made to control for these
variables. This results in an overestimation of the effect of smoking.
There is some evidence that the greater participation of men in the labor force,
and the subsequent exposure to occupational hazards, may contribute to the FA. Men
are more likely to be employed in jobs exposed to carcinogens, and have a higher rate
of fatal work accidents. This can only account for at most 5% of the male excess
mortality, of which about half can be explained by exposure to asbestos (Waldron,
1991.) The effect of occupational hazards has decreased substantially today, as safety
measures, better hygiene, and reduced working hours have improved work conditions,
while most jobs with exposure to carcinogens have been eliminated. The decrease in
cigarette consumption will further reduce any effect of occupational hazards, given the
interaction between smoking and carcinogens. Consequently, men’s employment in
riskier occupations contributes very modestly to the FA.
The FA is smaller in most developing countries. A first explanation of this
phenomenon is that some causes of death favor males (who are less vulnerable to
intestinal infections and tuberculosis, for instance) while some favor females (who are
less prone to die from violence and accidents.) An excess mortality for women will
automatically result in countries where the former causes are more prevalent (death
from intestinal infection is more frequent in developing societies, for instance.)
A second explanation is son preference (Das Gupta and Bhat, 1966.) In many
societies, particularly those with strong Hindu or Confucian traditions, the patriarchal
family structure and the low status of women induce a preference for sons over
daughters. Son preference is strong in Jordan, Syria, Bangladesh, Nepal, India, and
Pakistan, and is only slowly fading in China, South Korea, and Taiwan (Arnold and
Zhaoxiang, 1986.) Females are discriminated against throughout their lives: they are
weaned earlier than boys and have less access to education, health care, food supplies, and
other goods and benefits scarce in a poor society. Reasons for son preference are
numerous, and summarized by the south Indian proverb “Raising a daughter is like
watering your neighbor’s plant.” Males are valued in agricultural areas because of
their larger contribution to household production and the support of aging parents.
Education of female children is perceived as an investment that will shift outside the
family after marriage, after payment of dowry and wedding costs. Hindu sons have to
perform religious functions, such as the cremation of deceased parents.
Many other reasons have been put forward to explain the FA. They include the
loss of iron during menstruation, the tendency of women to visit doctors more often,
pressure on men not to miss work, the higher use of preventive care by women, type A
behavior, men’s fear of surviving their spouse, even the disappearance of whalebone
2. International Comparisons
Evolution is a fairly rapid and effective process of adaptation to changes in the
environment. However, the recent increase of the FA has been far too fast to be
explained by evolution only; it proves the importance of social, economic, and
environmental influences on mortality. Historically, males tended to survive longer
than females, a pattern that seems to have persisted from the origins of our species
until well into the modern era. Survival rates only began to change 150 years ago.
Around the turn of the century, the FA was small in a number of countries. It has
grown significantly since (Berin, Stolnitz, and Tenenbein, 1990.) Only recently has
some stabilization of the FA occurred in developed countries. Table 1, from Nadarajah
(1983) and recent data, shows the evolution of the FA in Sri Lanka since 1920.
Table 1. Expectation of Life at Birth in Sri Lanka, 1920-2000.
[Table body lost in extraction; surviving entries, in order: -2.0, -2.1, -2.1, -1.3, -1.0, -0.6]
Nadarajah’s comparison of causes of death in the age group 15-44 in Sri Lanka
in 1952-54 and 1970-72, summarized in table 2, supports the theory that women have
taken more advantage of medical improvements. Death rates for diseases and causes
that affect females more (tuberculosis, pneumonia, infectious and parasitic diseases,
and maternal deaths) have drastically declined during the period under study. Death
rates for causes that affect males disproportionately (diseases of the circulatory
system, accidents, suicide and violence) have increased.
Table 2. Death Rates in the Age Group 15-44 by Sex and Cause, Sri Lanka.
[Table values lost in extraction; surviving row labels: Diseases of the digestive system; Diseases of the circulatory system; Infectious and parasitic diseases; Accidents, suicides, violence; Maternal deaths; Other causes; All causes]
Few papers provide a comprehensive analysis of the secular trend in the FA.
They usually report the results of a longitudinal study, analyzing the evolution over
time of the causes of death in a given country. International comparisons are usually
descriptive, analyzing sex differentials by age groups and causes of death (Stolnitz,
1955, 1956, UN Secretariat, 1988.) Studies focus on immediate medical causes of
death, and do not explore the reasons for heart diseases, cancers, and violence.
A notable exception is a regression analysis by Preston (1976), based on
mortality data from 43 countries, most of them developed, during the period 1960-64.
Preston finds that the variable most strongly related to the FA
is the percentage of the labor force in agriculture (males and
females), with a correlation of –0.574. Variables evaluating sex differentiation in
education or in the labor force are poorly correlated with the FA. Stepwise regression
results in the selection of three variables to explain the sex mortality differential: the
percentage of the labor force in agriculture, forestry, hunting, and fishing; the
percentage of population residing in cities of more than 1 million inhabitants; and an
interaction term, the reciprocal of daily grams of animal protein per capita times the
percentage of males in level 1 school enrollment. All three regression coefficients are
significant at the 5% level. The square of the multiple correlation coefficient is 0.541.
To date, the Preston study is the most persuasive published demonstration of the
influence of socio-environmental factors on mortality.
In this article, we update and extend Preston’s work. We perform a cross-sectional study, analyzing the FA today across the world, using regression techniques.
We incorporate data from 169 countries in the world, in various stages of
development. We use a much larger set of explanatory variables. We also investigate
spatial autocorrelation.
3. Variables and Correlations
Data on the possible causes of the FA were collected from 169 countries, with
a total population of 5.964 billion. Admittedly, there can be wide variations of
demographic variables within large countries. For instance, India exhibits striking
diversity. The state of Kerala has features that are typical of a middle-income country:
a life expectancy of 72 years, an infant mortality rate of 17 per thousand, a fertility
rate (1.8 births per woman) under replacement level, and a female/male ratio above
unity (1.04). In Uttar Pradesh, the infant mortality rate is six times as high as in
Kerala, the fertility rate is 5.1, and the female/male ratio stands at 0.88, lower than any
country in the world (Murthi, Guio and Drèze, 1995.)
The values taken by 50 potential explanatory variables have been recorded.
Sources of data are the World Fact Book of the Central Intelligence Agency, the
Encyclopaedia Britannica Book of the Year 2000, the Food and Agriculture
Organization, the United Nations, the World Bank’s Development Indicators, and the
World Health Organization.
The FA, defined as the difference between the life expectancy at birth of
women and men, in years, is the dependent variable of this research. This measure of
the overall sex differential in mortality is the most commonly used, as it summarizes
mortality at all ages. It is suitable to make comparisons among populations with
different age structures, as it is not affected by the age distribution.
All variables are listed in this section, along with the correlation coefficient
with FA, and comments. It was decided to use unweighted correlations rather than to
assign each country a weight proportional to its population. Weighted correlations
would have given too much importance to the ten largest countries, with a combined
population exceeding 60% of the world’s total. Also, it was felt that small countries,
like Luxembourg or Norway, have their own specific cultural values and health care
systems. Using weighted correlations would have amounted to disregard that
In an independent sample, correlations exceeding 0.18 would be significant at
the 1% level. However, any cross-sectional data may be subject to some degree of
spatial correlation, which would make correlation coefficients less significant than
they appear to be. The problem of spatial correlation is investigated in the Appendix.
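The 0.18 threshold can be checked approximately. The sketch below uses the Fisher z approximation for the sampling distribution of a correlation coefficient under independence; the choice of this approximation, and of a one-sided versus two-sided test, are assumptions, since the text does not state how the threshold was derived. The function name `critical_correlation` is introduced here for illustration only.

```python
import math
from statistics import NormalDist


def critical_correlation(n, alpha, one_sided=True):
    """Smallest |r| significant at level alpha for n independent pairs.

    Uses the Fisher z approximation (an assumption, not the paper's stated
    method): under zero correlation, atanh(r) is roughly Normal(0, 1/(n-3)).
    """
    tail = alpha if one_sided else alpha / 2
    z = NormalDist().inv_cdf(1 - tail)      # standard normal quantile
    return math.tanh(z / math.sqrt(n - 3))  # map back to the r scale


# n = 169 countries, 1% level, as in the text.
r_one_sided = critical_correlation(169, 0.01, one_sided=True)
r_two_sided = critical_correlation(169, 0.01, one_sided=False)
print(round(r_one_sided, 2), round(r_two_sided, 2))
```

Under these assumptions the one-sided threshold comes out close to 0.18, while the two-sided threshold is somewhat higher, near 0.20. Either way, the caveat about spatial correlation applies: the calculation presumes independent observations.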
Variables have been subdivided into four categories, measuring economic
modernization, social/cultural/religious behavior, health care quality, as well as
geographic dummy variables. There is some overlap between categories. For instance,
a decrease of infant mortality results not only from an improvement in health care
facilities, but also from better female education.
Given the extreme skewness of the distribution of some variables, such as
persons per car, per physician, and per hospital bed, and maternal mortality, variance-stabilizing
logarithmic transformations were applied, resulting in all cases in a significant
increase of the correlation coefficient.
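The effect of the logarithmic transformation can be illustrated with a small sketch. The numbers below are made up for illustration and are NOT taken from the data set of this study; they merely mimic a right-skewed variable (persons per physician) whose relation to the FA is roughly linear on the log scale. The helper `pearson` is a plain implementation of the usual correlation coefficient.

```python
import math


def pearson(xs, ys):
    """Unweighted Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical, illustrative values only (not the paper's data).
persons_per_physician = [300, 500, 1000, 3000, 10000, 30000, 50000]
fa_years = [7.5, 7.0, 6.0, 5.0, 4.0, 3.0, 2.5]

r_raw = pearson(persons_per_physician, fa_years)
r_log = pearson([math.log(x) for x in persons_per_physician], fa_years)
print(f"raw: {r_raw:.3f}  log-transformed: {r_log:.3f}")
```

On the raw scale the few very large values dominate the sums of squares and weaken the linear association; after the transformation the correlation is stronger in absolute value, which is the pattern reported for the transformed variables.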
Variables measuring the degree of economic modernization
1. Percentage of population living in urban areas (0.5010).
Urbanization is strongly correlated with FA and with most measures of
economic development. It is an indicator of the degree of economic modernization,
but also a proxy for gender bias, as discrimination occurs mainly in rural areas, due to
the perceived larger value of men in an agricultural setting (Williamson, 1973,
concludes that urbanization is the strongest determinant of son preference.)
2. GNP per capita (0.1937)
3. GNP per capita, converted to international dollars using purchasing power parity
rates. An international dollar has the same purchasing power as a $US in the
United States (0.2591)
4. GDP per capita (0.2631)
5. Percentage of GDP from the services sector (0.3800)
6. Log (persons per car) (-0.5147)
7. Daily calorie intake (0.4632)
8. Difference between daily calorie intake and requirements (%). The FAO
determines calorie requirements per country, as a function of age and sex
distribution, average body weight, and temperature. The variable expresses, as a
percentage, the difference (positive or negative) between actual and required
intake (0.1592)
9. Prevalence of malnutrition among children under the age of five (%) (-0.6360)
10. Percentage of economically active females working in agriculture (-0.6325)
11. Percentage of individuals who have access to safe water (0.3675)
Social, cultural, religious variables
12. Percentage of smokers in the male population (0.1417)
13. Percentage of smokers in the female population (0.4852)
14. Difference between male and female smoking rates (-0.0835)
46.84% of the world’s male population smokes, versus 11.29% of the
female population. Male and female smoking are uncorrelated. There is little
variability in male smoking rates, which leads to a low correlation with FA. The
positive relationship between female smoking and FA does not mean that smoking
increases life expectancy, but rather that the FA is larger in countries where more
women smoke. Social pressures against female smoking exist in countries with a low
FA. Higher correlations may have been obtained, had we been able to factor in the lag
time between smoking and death. Smoking patterns have changed drastically in some
countries, but this has yet to affect mortality rates. The 32% smoking rate among
males from Singapore represents a decrease from 74% due to legislation introduced in
the 1970s. Cigarette consumption in China has increased more than three-fold since
the 1950s; this is expected to increase the proportion of deaths due to smoking from
13% in 1987 to about 33% (Pokorski, 2000.)
15. Illiteracy rate (%) for women above the age of 15 (-0.6288)
16. Illiteracy rate (%) for men above 15 (-0.5895)
17. Difference between female and male illiteracy rates (%) (-0.5626)
18. Enrollment ratio for women. Total school enrollment at first and second levels
divided by the population of the corresponding age groups (0.5053)
19. Enrollment ratio for men (0.4271)
20. Difference between male and female enrollment ratio (-0.4284)
21. Expected number of years of education for males (0.4821)
22. Expected number of years of education for females (0.5441)
23. Females per 100 males enrolled, second level (0.4656)
24. Females per 100 males enrolled, third level (0.2789)
25. Difference in school life expectancies (-0.4534)
In four cases (literacy, enrollment, school life expectancy, and smoking) data
enabled us to compute separately correlations for men and women. In all cases links
were found to be weaker for men. For instance, an improvement in female literacy
contributes more to a decrease of child mortality than a comparable improvement in male literacy, as females are
the main providers of care for children.
Variables related to education seem essential to understand the FA
phenomenon. While all authors agree that the “demographic transition” to lower
levels of mortality and fertility is linked to economic development, recent research
(Murthi, Guio and Drèze, 1995) and our correlation coefficients show that the income
effect can be slow and weak, and that education-related variables such as female
literacy have a more profound influence. Always accompanying economic
modernization is a transformation of value and belief systems. Education may be the
most effective agent of change in the belief system.
Female education is considered to be crucial even if family income is
controlled. It reduces the desired family size, while improving the ability to achieve
the planned number of births. The desire for a large family declines, as educated
women are more likely to resist the burden of repeated pregnancies. They have other
sources of fulfillment. They are less dependent on their sons for old-age security. Time
has a higher opportunity cost that reduces the value of the time-intensive activity of
child bearing and education. Educated women are more likely to work, which reduces
fertility, due to the burden of household work and employment. Child mortality
falls as educated women are more knowledgeable about nutrition and health care.
A lower fertility reduces child mortality through longer birth spacing.
It is likely that policies to improve female literacy will prove to be more
efficient in reducing mortality rates than measures aiming to change the nature of
marriage systems or the ingrained discrimination against women. Laws outlawing
dowry or arranged marriages for minors, specifying inheritance rights for women, or
forbidding the use of ultrasound to determine gender of fetuses, have generally proved
to be useless in changing century-old habits.
A comparison of correlations for all education variables demonstrates the
importance of basic education. Enrollment figures at the second and the third level are
not as related to FA and illiteracy as variables measuring education at the first level.
26. Females as a percentage of the labor force (0.2515)
27. Female contribution to the service industry, measured by percentage of females in
the service industry, out of working females, divided by percent of GDP from the
service sector (0.3924)
28. Percent of economic activity due to female labor (-0.0884)
Higher levels of female labor participation reduce sex discrimination, as they
raise the status of women in society, lower dowry levels (and consequently the costs
of rearing daughters), make women less dependent on sons for old-age security, and
make women more able to resist male pressure to discriminate in favor of boys.
29. Fertility: number of children per childbearing woman (-0.6348)
30. Homicide Rate (-0.1542)
31. Divorce rate (0.0924)
Homicides account for a small percentage of deaths, which explains the low
correlation. The divorce rate shows no correlation with FA, despite the vast body of
literature proving that marital status strongly influences survival, with divorced men
appearing to be more vulnerable to the disruption of social relationships (Retherford,
1975, Trowbridge, 199.) Social ties seem to protect people against mortality, and
women have more alternative ties outside marriage that they can fall back upon. The
weak relationship may be explained by the fact that many emerging countries do not
report a divorce rate, and by the heterogeneous cultural approaches toward divorce.
32. Percentage of Muslims (-0.2382)
33. Percentage of Christians (0.2674)
34. Percentage of Buddhists (-0.1375)
35. Percentage of Hindus (-0.1593)
36. Percentage of people with indigenous beliefs, Africa (-0.2825)
37. Percentage of non-religious people (0.4004)
Religious beliefs are subject to much uncertainty. The CIA reports 86% of
Christians in Bulgaria, Britannica 39%! Figures are probably underestimated in
communist countries, overestimated in Western Europe, and approximated in Eastern
Europe. There is no measure of the intensity of beliefs or of the degree of religious
practice. Figures from the Encyclopaedia Britannica appear to be more reliable, and have been used here.
Despite uncertainties, significant correlations appear. Christianity and atheism are
more prevalent in countries with a high FA. Hinduism, Buddhism, and Islam are
associated with a lower FA, probably a consequence of gender discrimination.
Variables measuring the quality of health care
38. Health care expenses, as a proportion of GNP (0.2010)
39. Cost of health care per capita, in international dollars (0.2637)
40. Log(persons per physician) (-0.6546)
41. Log(persons per hospital bed) (-0.6520)
42. Percentage of children under age one immunized by vaccination against diphtheria,
pertussis and tetanus (0.3017)
43. Infant mortality rate (-0.5422)
44. Log(maternal mortality rate) (-0.6170)
45. Index measuring the overall performance of the health care system (0.4094)
The 2000 World Health Report, published by the WHO, ranks all countries
according to the performance of their health care system. This ranking is a source of
controversy, mainly because the United States spends far more per person than any
other country, yet only ranks 37th in health care quality. Yet, the index has a
correlation exceeding 0.85 with life expectancy! Our correlation coefficients show
that health care quality is a better predictor of the FA than health care cost.
Other variables
46. Dummy variable for the Asia and Pacific regions (-0.1517)
47. Dummy variable for Africa (-0.4041)
48. Dummy variable for Latin America and the Caribbean (0.1229)
49. Dummy variable for Europe, North America, Israel, Australia, and New Zealand
50. Dummy variable for the six former Soviet Union European countries: Belarus,
Estonia, Latvia, Lithuania, Russia, and the Ukraine (0.5130)
4. Regression results
The literature review suggests that a combination of biologic, social, economic,
medical, and behavioral factors, and the interplay between them, can explain the FA.
Selection techniques of regression analysis were applied to identify the most
significant variables, among the 50 variables introduced in section 3.
The selected regression model contains four variables: the logarithm of the
number of persons per physician, the fertility rate, the percentage of people with
Hindu or Buddhist beliefs, and the dummy variable representing European countries
that belonged to the Soviet Union. The regression equation is
FA = 9.9042 – 0.4731 log (persons per physician) – 0.4444 (fertility)
– 0.0179 (% of Hindus and Buddhists) + 4.9218 (Soviet Union dummy)
p-values for the four selected variables are, respectively, 0.1591%, 0.0647%,
0.1723%, and 1.72E-11. With these four variables selected, no other variable is
significant, even at the 20% level.
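As a reading aid, the fitted equation can be written as a small function. This is a sketch only: the coefficients are those printed above, but the base of the logarithm is not stated in the text, so the natural logarithm is used here as an assumption (pass `log=math.log10` to try the alternative). The function name `predict_fa` and its argument names are introduced for illustration.

```python
import math

# Coefficients as printed in the regression equation.
INTERCEPT = 9.9042
B_LOG_PHYSICIAN = -0.4731
B_FERTILITY = -0.4444
B_HINDU_BUDDHIST = -0.0179
B_SOVIET = 4.9218


def predict_fa(persons_per_physician, fertility, pct_hindu_buddhist,
               former_soviet_european, log=math.log):
    """Fitted female advantage (years) from the four selected variables.

    ASSUMPTION: natural log by default; the text does not give the base.
    pct_hindu_buddhist is a percentage (0-100); former_soviet_european is
    True for the six European countries of the former Soviet Union.
    """
    return (INTERCEPT
            + B_LOG_PHYSICIAN * log(persons_per_physician)
            + B_FERTILITY * fertility
            + B_HINDU_BUDDHIST * pct_hindu_buddhist
            + B_SOVIET * (1 if former_soviet_european else 0))


# Hypothetical inputs, for illustration only.
base = predict_fa(1000, 3.0, 0.0, False)
one_child_fewer = predict_fa(1000, 2.0, 0.0, False)
print(f"effect of one child fewer: {one_child_fewer - base:+.4f} years")
```

Whatever the base of the logarithm, the differences between two otherwise identical countries depend only on the coefficients: one child fewer per woman raises the fitted FA by 0.4444 years, and the dummy for the former Soviet Union European countries adds 4.9218 years.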
Three countries (Afghanistan, Bangladesh and Namibia) exhibit a standardized
residual under –2; they are among the six countries for which the FA is negative.
Brazil, Kazakhstan, and Mauritania are the only three countries with a standardized
residual exceeding 2. Brazil and Kazakhstan are the only countries outside the former
Soviet Union European zone with an FA exceeding 10 years.
Several interaction terms were considered. It is for instance likely that gender
discrimination has more effect when food is scarce and doctors rare. Also, female
education may help reduce child mortality by enabling women to take better
advantage of available medical facilities. Female literacy and availability of doctors
and hospital beds could have a synergistic effect. While several interaction terms
prove to be mildly correlated with FA, none remains significant after the inclusion of
the selected variables.
The most significant variable selected is by far the dummy variable
characterizing the eastern European countries that formerly belonged to the Soviet
Union. Living in one of these six countries increases the FA by close to five years.
None of the 50 variables recorded in this study explains the collapse of the male life
expectancy in these countries following the demise of communism, which has been
dramatic: the FA exceeds 11 years in these six countries, and only in these countries.
Alcoholism has been proposed as a major reason for this phenomenon. Other factors
commonly mentioned are rapidly declining social and economic conditions, increasing
crime and corruption, a deteriorating health care system, chain-smoking, radiation
produced by decades of nuclear irresponsibility, and birth defects provoked by
environmental disasters.
The next most significant variable selected is fertility, the number of births per
woman. The FA increases by 0.44 years for each unit decrease of the number of
children. Fertility is highly correlated with many variables such as female illiteracy
(correlation = 0.866), female school enrollment ratio (-0.8084), maternal mortality
(0.8277), employment in agriculture (0.7746), even female smoking (-0.6528). It is
the best summary measure of the transformations in beliefs and values that result from
increased female education and the parallel decrease of employment of women in
agriculture. Fertility is also linked to son preference. In areas where the education of
daughters has little value, they are likely to be married earlier and start childbearing
sooner. Where women have a substantial productive role, fertility decreases, as
females marry later, have a base of power other than reproduction, and are more
assertive in personal choices.
The number of persons per physician is also highly significant; it is strongly
correlated with many of the variables measuring economic modernization:
urbanization (correlation = -0.7269), number of cars (0.7486), percentage working in
agriculture (0.8290), malnutrition (0.7842). It is also correlated with most health care
variables such as infant mortality (0.7903), maternal mortality (0.8415), and health
care quality index (-0.6974). Therefore we consider the number of persons per
physician to be the best summary variable, not only for the quality of a country’s
health care system, but also for its degree of economic modernization.
Also highly significant is the percentage of people with Hindu or Buddhist
beliefs. There is a strong body of evidence of discrimination against women in
countries where these two religions are prevalent. The Indian Medical Association
estimates that three million female fetuses are aborted each year after sex-selection
sonograms. Estimates of the number of annual female infanticides in India reach
several million. Midwives in India earn the equivalent of $0.50 and a sack of grain for
each live delivery of a girl, twice as much plus a sari if it’s a boy - and $5 to get rid of
a newborn female (Wall Street Journal, May 9, 2000). A comparative study of male
and female mortality during childhood by D’Souza and Chen (1980) shows
abnormally high death rates among girls living in Bangladesh rural areas, between the
ages of one month and 15 years. Some countries officially recognize son preference;
in several Chinese provinces, couples accepting a one-child certificate are
compensated by a larger monetary bonus if their child is a daughter; in some rural
areas, couples can have two children if the first is a girl. The maximum penalty for
female infanticide in China is only 13 years of prison (Arnold and Zhaoxiang, 1986).
The multiple correlation coefficient is 0.7815: over 61% of the variability of
the FA can be explained by the four selected variables (this is significantly higher than
Preston’s 54.1%). Therefore this study provides support to the hypothesis that
behavior strongly influences differential mortality between males and females. Three
of the selected variables are mostly determined by beliefs and family values. The
fourth variable is linked to economic development.
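The two figures quoted at the start of this paragraph are linked in the usual way: the share of variability explained is the square of the multiple correlation coefficient.

```latex
R^2 = 0.7815^2 \approx 0.6107
```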
5. Conclusions
The increase in the sex mortality differential during the 20th century has
paralleled important events: huge declines in (1) deaths from infectious and parasitic
diseases, (2) the size of the family, and (3) illiteracy rates, together with (4) reduced gender
discrimination and (5) increased urbanization. As these changes occurred
simultaneously, a high degree of multicollinearity between explanatory variables
results, which makes it unrealistic to expect a definitive answer to the question “Is the
female advantage a consequence of biological or behavioral causes?”
While biological differences are undeniable, the fact that the sex mortality
differential has changed rapidly over time is an indication that they are not the sole
reason for the FA. Our regression study, which incorporates data from 169 countries, at
various stages of development, emphasizes the importance of behavior, as three of the
four selected variables are based on social/cultural/religious values. Our fourth
selected variable summarizes the degree of economic modernization of a country.
Together, the four variables, which are totally unrelated to biological differences, explain
over 61% of the variability of the FA, a strong support of the behavioral hypothesis.
The impact of the interactions between biological and behavioral factors
should also not be neglected. The change of the sex mortality differential indicates
that, if indeed there is an innate male survival disadvantage (for degenerative and heart
diseases, for instance), it has a significant effect only in combination with factors
emerging in the course of socioeconomic modernization. Highly differentiated death
rates from cardiovascular diseases can only appear in conditions of low infant
mortality and high life expectancy.
Finally, it should be acknowledged that different explanations may account for
the FA at different ages: differentials in infants may be primarily biological, while
social/behavioral reasons may explain more of the FA in adults.
Appendix. Spatial Dependence in Regression Models
Regression analysis is based on the assumption that errors are independently
distributed. In a cross-sectional study, errors between neighboring countries are often
spatially correlated. This may result from the influence of unobserved, or
unobservable, variables that exhibit spatial dependence. For instance, unobserved
cultural factors may lead Pakistan, India and Bangladesh to common approaches to
gender bias or fertility. Or unobserved dietary similarities between Latin American
countries may impact mortality. As summarized by Tobler’s (1979) “first law of
geography,” “everything is related to everything else, but closer things more so.”
While there is a voluminous literature on serial dependence over time in the
analysis of time series, little attention has been paid to its counterpart in cross-sectional data, spatial autocorrelation, until recently. A recent review by Anselin and
Bera (1998) points to Paelinck and Klaasen (1979) as the pioneering approach to
spatial econometrics. Still, the most widely available statistical software packages do not
emphasize spatial problems; they contain no specific routines to perform maximum
likelihood estimation of spatial processes or specific tests for the presence of spatial
autocorrelation. S+ is among the few packages that have a spatial module.
Spatial autocorrelation between errors in adjoining locations is the result of a
mismatch between the spatial unit of observation (the country, in this case) and the
spatial extent of the variable under study (son preference, for instance). High values
for a random variable tend to cluster in space: locations tend to be surrounded by
neighbors with similar values. As a result, the sample contains less information than
an uncorrelated counterpart. This loss of information needs to be explicitly
acknowledged in estimation and tests.
A crucial issue in modeling spatial correlation lies in the specification of
“locational similarity,” the determination of those locations for which values of
random variables are correlated. Such locations are referred to as “neighbors,” even
though this does not necessarily mean that they are physically adjacent. Since, in a set
of n observations, it is impossible to estimate n x n covariance terms, the structure of
spatial dependence has to be specified through an exogenous model. This is usually
achieved through the construction of an n by n positive and symmetric matrix W
which expresses for each observation (row) the locations (columns) that are neighbors.
For instance, $w_{ij} = 1$ when i and j are neighbors, $w_{ij} = 0$ otherwise. To facilitate
interpretation, this weights matrix is often standardized so that the elements of a row
add up to one:
$$w_{ij}^{S} = \frac{w_{ij}}{\sum_j w_{ij}}$$
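The row standardization can be sketched in a few lines of Python. The function name `row_standardize` and the four-location example matrix are hypothetical, and leaving rows without neighbors as zeros is an assumption not taken from the text.

```python
def row_standardize(w):
    """Divide each row of the weights matrix by its row sum.

    Rows that sum to zero (observations with no neighbors) are left
    as zeros; this handling is an assumption, not taken from the text.
    """
    standardized = []
    for row in w:
        total = sum(row)
        if total == 0:
            standardized.append([0.0 for _ in row])
        else:
            standardized.append([value / total for value in row])
    return standardized


# Hypothetical contiguity matrix for four locations on a line: 0-1-2-3.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
W_S = row_standardize(W)
for row in W_S:
    print(row)
```

Note that the standardized matrix is in general no longer symmetric: in this example location 0 puts all of its weight on location 1, while location 1 splits its weight equally between locations 0 and 2.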
The selection of the elements that are nonzero in W is a matter of considerable
arbitrariness. The traditional approach, based on geography only, designates
observations as neighbors if they have a physical border in common. Another
approach consists in using distances: $w_{ij} = 1$ if the distance between the observations
$d_{ij} < \delta$, where $\delta$ is a cutoff value. Other suggestions are $w_{ij} = 1/d_{ij}$, or
$w_{ij} = e^{-\beta d_{ij}}$, or even $w_{ij} = b_{ij}^{\beta} d_{ij}^{\alpha}$, where $b_{ij}$ is the share of the common border
between i and j in the perimeter of i. In economic applications, weights are sometimes
based on selected socioeconomic characteristics such as per capita income. In social
sciences, weights may reflect whether or not two individuals belong to the same social
network. The weights need to be specified exogenously, to express any measure of
potential interaction between i and j.
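As an illustration of the distance-based schemes mentioned above (cutoff, inverse distance, and exponential decay), the following Python sketch builds the corresponding weights matrices. All function names, the distance matrix, and the parameter values are hypothetical; setting the diagonal to zero, so that no location is its own neighbor, is a common convention assumed here.

```python
import math


def cutoff_weight(d, delta):
    """Weight 1 if the distance is below the cutoff delta, else 0."""
    return 1.0 if d < delta else 0.0


def inverse_distance_weight(d):
    """Weight equal to the reciprocal of the distance."""
    return 1.0 / d


def exponential_weight(d, beta):
    """Weight decaying exponentially with distance: exp(-beta * d)."""
    return math.exp(-beta * d)


def build_weights(distances, weight_fn):
    """Apply weight_fn to every off-diagonal distance; the diagonal is zero."""
    n = len(distances)
    return [[0.0 if i == j else weight_fn(distances[i][j]) for j in range(n)]
            for i in range(n)]


# Hypothetical symmetric distance matrix (arbitrary units), three locations.
D = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 2.0],
    [4.0, 2.0, 0.0],
]
W_cutoff = build_weights(D, lambda d: cutoff_weight(d, delta=3.0))
W_inverse = build_weights(D, inverse_distance_weight)
W_exp = build_weights(D, lambda d: exponential_weight(d, beta=0.5))
print(W_cutoff)
print(W_inverse)
```

With a cutoff of 3, locations 0 and 2 (distance 4) are not neighbors, whereas the inverse-distance and exponential schemes give every pair a positive but decreasing weight.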
Once matrix W has been built, the spatial dependence is incorporated into a
linear regression model with k explanatory variables through equation
y = ρWy + Xβ + ε
where y is the n by 1 vector of observations of the dependent variable, X is an n by k
matrix of observations of the explanatory variables, ε is an n by 1 vector of error terms,
β is a k by 1 vector of regression coefficients, and ρ is the spatial autoregressive
parameter; it measures true contagion between countries such as diffusion of beliefs.
The product Wy results in a weighted average of the y values in their neighborhood
set. This concept of a spatial lag operator is similar to the inclusion of an
autoregressive term for a dependent variable in a time-series analysis. The spatial lag
for a given observation is always correlated, not only with its error, but also with error
terms at all other locations. Each country is correlated with every other country, in a
relationship that decays with the order of contiguity.
This model for spatial lag dependence can be re-formulated as
( I − ρW ) y = Xβ + ε
(I – ρW)y is called a spatially filtered dependent variable: the effect of spatial
autocorrelation has been taken out. This is roughly similar to the process of first
differencing a dependent variable in a time series.
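The spatial lag Wy and the spatially filtered variable (I − ρW)y can be computed directly, as in the following Python sketch. The function names, the weights matrix, the values of y, and the value of ρ are all hypothetical and serve only to illustrate the two operations.

```python
def mat_vec(w, y):
    """Return the matrix-vector product W y as a list."""
    return [sum(w_ij * y_j for w_ij, y_j in zip(row, y)) for row in w]


def spatial_lag(w, y):
    """Spatial lag Wy; with a row-standardized W this is a neighbor average."""
    return mat_vec(w, y)


def spatially_filtered(w, y, rho):
    """Return (I - rho W) y, that is, y minus rho times its spatial lag."""
    lag = spatial_lag(w, y)
    return [y_i - rho * lag_i for y_i, lag_i in zip(y, lag)]


# Hypothetical row-standardized weights for three locations on a line.
W = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
y = [4.0, 6.0, 8.0]   # hypothetical values of the dependent variable
rho = 0.5             # hypothetical spatial autoregressive parameter

print(spatial_lag(W, y))              # [6.0, 6.0, 6.0]
print(spatially_filtered(W, y, rho))  # [1.0, 3.0, 5.0]
```

In this small example each location's lag is the average of its neighbors' values, and the filtered series removes half of that average from each observation.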
An alternate way to incorporate spatial autocorrelation in regression is to
specify a spatial process for the error term. The model is
y = Xβ + ε
ε = λWε + ξ
where λ is the spatial autoregressive coefficient for the error lag Wε, and ξ is an
uncorrelated error term. λ is often called “the nuisance parameter,” reflecting the
interpretation of spatial error dependence as a nuisance, resulting from correlation in
measurement errors or in variables that are not crucial to the model (the unobserved
variables spill over across countries).
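Solving the second equation for ε makes the implied error structure explicit. The following is a standard derivation, assuming that I − λW is invertible and that ξ has constant variance σ² (the symbol σ² does not appear in the text).

```latex
\varepsilon = (I - \lambda W)^{-1}\xi, \qquad
y = X\beta + (I - \lambda W)^{-1}\xi, \qquad
\operatorname{Var}(\varepsilon) = \sigma^{2}\left[(I - \lambda W)'(I - \lambda W)\right]^{-1}
```

The off-diagonal terms of this covariance matrix are generally nonzero, so errors are correlated across locations even when W links only immediate neighbors.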
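The consequences for ordinary least-squares estimation depend on the model: with a
spatially lagged dependent variable, OLS estimators are biased and inconsistent; with
spatially correlated errors, they remain unbiased but are inefficient, and their standard
errors are unreliable. In both cases a maximum likelihood approach is appropriate.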
The traditional $R^2$ is not an appropriate measure of fit in the presence of spatial
autocorrelation. Models can be compared using the maximized log-likelihood, or an
adjusted form such as the Akaike Information Criterion, which takes into account the
number of parameters in the models. Anselin and Bera (1998) provide a summary of
estimation techniques and tests for spatial dependence. In particular, tests for the null
hypotheses H0: ρ=0 and H0: λ=0 are presented. Numerical procedures are usually
complicated, as many simplifying results from serial correlation in time series do not
hold in the case of spatial correlation. Estimation techniques require nonlinear
optimization of the likelihood function, and the manipulation of matrices of dimension
equal to the number of observations.
In order to test for the presence of spatial autocorrelation in our final regression
equation, a simple neighbors matrix W was built, in which entries are 1 when two
countries are physical neighbors, and 0 otherwise. It was felt that a more sophisticated
approach was not necessary: the spread of unobserved cultural values across countries
was not deemed to depend much on distance between capitals or length of the border.
The most common test for the assumption H0: λ=0, Moran’s test, the spatial
equivalent of the Durbin-Watson test, was applied to the residuals of the selected
regression equation. This test has been found to be powerful against a wide range of forms
of spatial dependence. The test statistic is
$$I = \frac{n}{S_0} \cdot \frac{e'We}{e'e}$$
where e is the vector of OLS residuals, n the number of observations, and $S_0$ a
standardization factor equal to the sum of the spatial weights, $\sum_i \sum_j w_{ij}$. Under
the null assumption of no spatial correlation among residuals, I is asymptotically
normally distributed with known first two moments (see Anselin and Bera, 1998).
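The computation of Moran's statistic from a vector of residuals and a weights matrix can be sketched as follows. This is a minimal illustration and not the code used in the study; the function name `morans_i`, the four-location contiguity matrix, and the residual vectors are hypothetical.

```python
def morans_i(residuals, w):
    """Moran's I = (n / S0) * (e'We / e'e) for residuals e and weights W."""
    n = len(residuals)
    s0 = sum(sum(row) for row in w)  # S0: sum of all spatial weights
    we = [sum(w_ij * e_j for w_ij, e_j in zip(row, residuals)) for row in w]
    e_we = sum(e_i * we_i for e_i, we_i in zip(residuals, we))  # e'We
    e_e = sum(e_i * e_i for e_i in residuals)                   # e'e
    return (n / s0) * (e_we / e_e)


# Hypothetical binary contiguity matrix: four locations on a line, 0-1-2-3.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 1.0, -1.0, -1.0]     # similar residuals sit side by side
alternating = [1.0, -1.0, 1.0, -1.0]   # neighbors always differ in sign

print(morans_i(clustered, W))
print(morans_i(alternating, W))
```

Residuals that cluster in space give a positive statistic (1/3 in this example), while residuals that alternate between neighbors give a negative one (−1 here).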
Moran’s statistic, the estimated value of the spatial correlation coefficient, was
0.163, indicating a modest but significant degree of unexplained correlation across countries.
Matrix W was then standardized so that the sum of its coefficients along each
row adds up to 1. This approach may better reflect the contagion of values and beliefs
across countries: the more neighbors a country has, the smaller the spillover of culture
across the border. China has more influence on Mongolia than Mongolia on China.
Moran’s revised spatial correlation for this modified matrix is XXXX
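The asymmetry between the two countries follows directly from row standardization. As an illustration only, assume that Mongolia has 2 land neighbors and China 14; these counts are not taken from the text, and the counts in the matrix actually used may differ.

```latex
w^{S}_{\text{Mongolia},\,\text{China}} = \frac{1}{2}, \qquad
w^{S}_{\text{China},\,\text{Mongolia}} = \frac{1}{14}
```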
It is concluded that a modest spatial correlation of errors is present in our data,
given the four selected variables. Consequently, the p-values for the four variables
might be slightly understated, and the multiple correlation coefficient might be
misleading. Nevertheless, the four selected variables are so significant that our
conclusions are very unlikely to be challenged by a more sophisticated maximum log-likelihood approach.
References
L. Anselin and A. K. Bera (1998). “Spatial Dependence in Linear Regression Models
with an Introduction to Spatial Econometrics.” In Handbook of Applied Economic
Statistics, A. Ullah and D.E.A. Giles, Editors, Marcel Dekker, New York, 237-289.
F. Arnold and L. Zhaoxiang (1986). “Sex Preference, Fertility, and Family Planning
in China.” Population and Development Review, 12, 221-245.
B.N. Berin, G.J. Stolnitz and A. Tenenbein (1990). “Mortality Trends of Males and
Females over the Ages.” Transactions of the Society of Actuaries, 41, 9-32.
M. Das Gupta and P.N. Mari Bhat (1966). “Intensified Gender Bias in India: A
Consequence of Fertility Decline.” Annual Meeting of the Population Association of
S. D’Souza and L.C. Chen (1980). “Sex Differentials in Mortality in Rural
Bangladesh.” Population and Development Review, 6, 257-270.
M.J. Graney (1979). “An Exploration of Social Factors Influencing the Sex
Differential in Mortality.” Social Symposium, 28, 1-26.
J.B. Hamilton and G.E. Mestler (1969). “Mortality and Survival: Comparison of
Eunuchs with Intact Men and Women in a Mentally Retarded Population.” Journal of
Gerontology, 24, 395-411.
B.B. Kalben (2000). “Why Men Die Younger: Causes of Mortality Differences by
Sex.” North American Actuarial Journal, 4, 83-111.
F.C. Madigan and R.B. Vance (1957). “Differential Sex Mortality: A Research
Design.” Social Forces, 35, 193-199.
M. Murthi, A-C. Guio and J. Drèze (1995). “Mortality, Fertility, and Gender Bias in
India: A District-Level Analysis.” Population and Development Review, 21, 745-780.
T. Nadarajah (1983). “The Transition from Higher Female to Higher Male Mortality
in Sri Lanka.” Population and Development Review, 9, 317-325.
C.A. Nathanson (1984). “Sex Differences in Mortality.” Annual Reviews Sociology,
J. Paelinck and L. Klaasen (1979). “Spatial Econometrics.” Saxon House,
R. J. Pokorski (2000). “Excess Mortality in Asia Associated with Cigarette Smoking.”
North American Actuarial Journal, 4, 101-113.
S.H. Preston (1974). “Mortality Patterns in National Populations.” Academic Press,
New York.
R.D. Retherford (1975). The Changing Sex Differential in Mortality. Greenwood
Press, Westport, Connecticut.
A. Scheinfeld (1958). “The Mortality of Men and Women.” Scientific American, 198,
G.J. Stolnitz (1955, 1956). “A Century of International Mortality Trends.” Population
Studies, IX, 24-55, and X, 17-42.
W. Tobler (1979). “Cellular Geography.” In Philosophy in Geography, S. Gale and G.
Olsson, Editors, Reidel, Dordrecht, 379-386.
C. Trowbridge (1994). “Mortality Rates by Marital Status.” Transactions of the
Society of Actuaries, XLVI, 321-344.
UN Secretariat (1988). “Sex Differentials in Life Expectancy and Mortality in
Developed Countries: An Analysis by Age Groups and Causes from Recent and
Historical Data.” Population Bulletin of the United Nations, 25, 65-106.
I. Waldron (1976). “Why Do Women Live Longer Than Men?” Social Science and
Medicine, 10, 349-362.
I. Waldron (1985). “What Do We Know about Causes of Sex Differences in
Mortality?” Population Bulletin of the United Nations, 18, 59-76.
I. Waldron (1991). “Effects of Labor Force Participation on Sex Differences in
Mortality and Morbidity.” In Women, Work, and Health, edited by M.
Frankenhaeuser, U. Lundberg, and M. Chesney, Plenum Press, New York, 17-38.
N.E. Williamson (1973). “Preferences for Sons Around the World.”
dissertation, Department of Sociology, Harvard University.
D.L. Wingard (1982). “The Sex Differential in Mortality Rates. Demographic and
Behavioral Factors.” American Journal of Epidemiology, 115, 205-216.