# Spatial analysis of PM_{10} and cardiovascular mortality in the Seoul metropolitan area

## Abstract

### Objectives

Numerous studies have revealed the adverse health effects of acute and chronic exposure to particulate matter less than 10 μm in aerodynamic diameter (PM_{10}). The aim of the present study was to examine the spatial distribution of PM_{10} concentrations and cardiovascular mortality and to investigate the spatial correlation between PM_{10} and cardiovascular mortality using spatial scan statistic (SaTScan) and a regression model.

### Methods

The spatial distribution of PM_{10} in the Seoul metropolitan area from 2008 to 2010 was estimated via kriging. Clusters of cardiovascular mortality were identified using SaTScan-based cluster exploration. Geographically weighted regression (GWR) was applied to investigate the correlation between PM_{10} concentrations and cardiovascular mortality.

### Results

An examination of the regional distribution showed that cardiovascular mortality was higher in provincial districts (gu) belonging to Incheon and the northern part of Gyeonggi-do than in other regions. In a comparison of PM_{10} concentrations and mortality cluster (MC) regions, all districts belonging to MC 1 and MC 2 fell within particulate matter (PM) 1 and PM 2, the classes with high concentrations of PM_{10}. In addition, the GWR showed that PM_{10} has a statistically significant relation to cardiovascular mortality.

### Conclusions

To investigate the relation between air pollution and health impact, spatial analyses based on kriging, cluster exploration, and GWR can be used for a more systematic and quantitative analysis. The results indicate that cardiovascular mortality is spatially related to the concentration of PM_{10}.

**Keywords:** Cardiovascular mortality; Geographically weighted regression; PM_{10}; SaTScan; Spatial epidemiology

## Introduction

Numerous studies on environmental epidemiology have reported the adverse effects of short- and long-term exposure to air pollution. Such exposure can exacerbate pre-existing heart and lung diseases, and it can lead to cardiovascular disease, respiratory disease, asthma, impaired lung function, and premature death [1,2].

Particulate matter less than 10 μm in aerodynamic diameter (PM_{10}) increases oxidative stress and inflammatory response, and it is also associated with autonomic nervous system disorders and changes in blood coagulation, which can lead to cardiovascular disease and cardiovascular-related mortality [3-5].

Environmental epidemiological studies have quantitatively evaluated the adverse health effects caused by air pollution through various statistical analyses, including time-series studies and case-control studies [6,7]. Studies that incorporate geospatial analysis methods are also actively being conducted. A geographic information system can effectively analyze spatial information by managing and analyzing spatial data obtained through a precise and systematic review of the physical space [8]. Spatial analysis has become an effective tool in the field of environmental epidemiology [9].

Health levels vary between different regions, and thus in the field of environmental health, it is important to identify regional characteristics and distribution. Health effects due to exposure to environmental pollutants are subject to spatial analysis, and it is necessary to determine the spatial correlation between them [10].

In spatial epidemiology, spatial analyses such as measuring distance, identifying clusters, spatial smoothing and interpolation, and spatial regression are used, and such methods are widely applied when researching the health effects of air pollution [11]. To identify the spatial correlation between potential health effects and environmental hazards, it is useful to identify clusters of health effects, assess the spatial distribution of disease risk, and conduct a spatial analysis comparing environmental data with health data [12,13].

In this study, spatial analyses were used to determine the correlation between PM_{10} and cardiovascular mortality in the Seoul metropolitan area. To identify the characteristics of the spatial distribution of PM_{10} and cardiovascular mortality, spatial analysis was conducted for each location, and cluster analysis was conducted to identify areas with a high cardiovascular mortality rate. Moreover, in order to determine the spatial correlation between PM_{10} and cardiovascular mortality, the spatial distribution of PM_{10} and the cluster areas of cardiovascular mortality were compared, and the spatial correlation was assessed quantitatively using regression analysis.

## Materials and Methods

### Study Scope and Data

The study areas were 79 provincial districts in Seoul, Incheon, and Gyeonggi-do within the metropolitan area (25 in Seoul, 10 in Incheon, and 44 in Gyeonggi-do). The metropolitan area is 11,745 km^{2}, covering approximately 11.8% of South Korea’s land area (99,720 km^{2}). As of 2010, the population of Seoul was 9,794,000, that of Incheon was 2,663,000, and that of Gyeonggi-do was 11,379,000. These three areas account for 49.1% of the entire South Korean population.

As of 2010, the National Institute of Environmental Research operated 102 urban air monitoring stations in the metropolitan area in order to assess urban air quality. PM_{10} data were taken from this urban air monitoring network, which measures PM_{10} on an hourly basis using the β-ray absorption method.

We obtained daily counts of deaths between January 1, 2008 and December 31, 2010 from the National Statistical Office, South Korea. This study focused on deaths caused by cardiovascular-related diseases (International Classification of Diseases 10th revision [ICD-10], code I00-I99) within the metropolitan area, based on the time of death.

### Spatial Distribution of PM_{10}

To investigate the spatial distribution of PM_{10}, kriging, which is widely used to estimate the spatial distribution of environmental factors, was applied. To estimate the concentration of PM_{10} unobserved in this study, ordinary kriging was applied by using the spherical theoretical variogram model as the weight. Compared to the Gaussian model or the exponential model, the calculated error value is minimal in the spherical model [14]. In general, the formula for ordinary kriging is as follows:

$$z^{*} = \sum_{i=1}^{n} \gamma_{i} z_{i}, \qquad \sum_{i=1}^{n} \gamma_{i} = 1$$

Here, z^{*} is the estimated value at the given point, z_{i} is the observed value at the i-th neighboring measurement point, γ_{i} is the weight for each of the neighboring data used, and n is the number of data used for the kriging calculation. The weights are a function of distance, sum to one so that the estimate is unbiased, and are determined so that the estimation variance between the predicted value and the observed value is minimal.
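As an illustration of how the kriging weights can be obtained, the following is a minimal sketch of ordinary kriging with a spherical variogram in plain Python. It is not the ArcGIS procedure used in this study. The station coordinates, PM_{10} values, and variogram parameters (`nugget`, `sill`, `rng`) are hypothetical, and the function names are chosen for this example only.

```python
import math


def spherical_variogram(h, nugget, sill, rng):
    """Spherical semivariogram; reaches the sill at separation distance rng."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging(points, values, target, nugget=0.0, sill=1.0, rng=10.0):
    """Return (estimate, weights) at target from observed points and values."""
    n = len(points)

    def gamma(p, q):
        return spherical_variogram(math.dist(p, q), nugget, sill, rng)

    # Variogram matrix bordered by the constraint that the weights sum to one.
    a = [[gamma(points[i], points[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(p, target) for p in points] + [1.0]
    weights = solve(a, b)[:n]  # the last unknown is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights


# Hypothetical monitoring stations (x, y in km) and PM10 values (ug/m^3).
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
pm10 = [50.0, 60.0, 55.0, 65.0]
estimate, weights = ordinary_kriging(stations, pm10, target=(0.5, 0.5))
```

Because the target in this example is equidistant from all four stations, the weights are equal and the estimate is the simple mean of the observations. With a zero nugget, kriging at a station location reproduces the observed value there.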

To assess the concentration of PM_{10} in the Seoul metropolitan area between 2008 and 2010, the concentration of PM_{10} in the 79 provincial districts was calculated using kriging and zonal statistics via ArcGIS 10.1 (ESRI Inc., Redlands, CA, USA).

### Software for The Spatial and Space-Time Scan Statistic

To assess the overall spatial correlation, Moran’s index (I) was calculated for the distribution of cardiovascular mortality [15]. Moran’s I ranges from -1 to +1, with a positive/negative sign representing positive/negative spatial autocorrelation [16]. OpenGeoDa was used to determine the spatial autocorrelation of cardiovascular mortality.
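For readers unfamiliar with the statistic, the following is a minimal sketch of the global Moran’s I computed from a spatial weight matrix. It is not the OpenGeoDa implementation. The four-region adjacency matrix and the values are hypothetical, and no significance test is included.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * num / den


# Hypothetical example: four regions in a row, neighbors share a border.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1.0, 2.0, 3.0, 4.0], adjacency)        # similar neighbors
alternating = morans_i([1.0, -1.0, 1.0, -1.0], adjacency)  # dissimilar neighbors
```

In this toy case, the smoothly increasing values give a positive index and the alternating values give a negative one, matching the interpretation of the sign described above.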

Cluster detection was performed using software for the spatial and space-time scan statistic (SaTScan) version 9.0 (Martin Kulldorff, Boston, MA, USA) with a Poisson model. Statistical cluster areas of hot spots are mapped based on SaTScan’s likelihood ratio [17]. This study used SaTScan’s Poisson model, and the likelihood ratio γ of the Poisson model is as follows:

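$$\gamma = \left(\frac{c}{E[c]}\right)^{c} \left(\frac{C - c}{C - E[c]}\right)^{C - c} I(c > E[c])$$

Here, *C* is the total number of cases in the study area and *c* is the observed number of cases within the scanning window. *E*[*c*] is the expected number of cases within the window under the null hypothesis, and *I*() is the indicator function, which equals 1 when the window contains more cases than expected and 0 otherwise.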
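The likelihood ratio of the Poisson model can be evaluated on the log scale for a single candidate window, as in the minimal sketch below. It is not SaTScan itself: the scan over all candidate windows and the Monte Carlo significance test are omitted. The expected count is assumed to be proportional to the population in the window (no covariate adjustment), and the case and population counts are hypothetical.

```python
import math


def expected_cases(total_cases, window_population, total_population):
    """Expected cases in a window under the null hypothesis of constant risk."""
    return total_cases * window_population / total_population


def poisson_log_likelihood_ratio(c, expected, total):
    """Log of the Poisson scan likelihood ratio for one candidate window.

    c: observed cases in the window, expected: E[c], total: C.
    Returns 0.0 when the window has no excess of cases (indicator equals 0).
    """
    if c <= expected:
        return 0.0
    inside = c * math.log(c / expected)
    outside = 0.0
    if total > c:
        outside = (total - c) * math.log((total - c) / (total - expected))
    return inside + outside


# Hypothetical window: 30 of 100 deaths occur where 20% of the population lives.
e = expected_cases(total_cases=100, window_population=2000, total_population=10000)
llr = poisson_log_likelihood_ratio(c=30, expected=e, total=100)
```

The window with the largest value over all candidates would be reported as the most likely cluster.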

### Geographically Weighted Regression

Geographically weighted regression (GWR) was performed in order to analyze the spatial correlation between PM_{10} and cardiovascular mortality. Compared to general regression analysis, which represents the study area with one regression coefficient, GWR estimates regression coefficients regionally, thereby explaining the changes in the independent variable depending on the location of the spatial unit [18].

GWR uses the weight matrix (W) between regions to calculate *β_{ji}*, the regression coefficient of variable *j* at geographical location *i*. The GWR model is as follows:

$$Y_{i} = \beta_{0i} + \sum_{j} \beta_{ji} X_{ji} + \epsilon_{i}$$

Here, *i* is the location of an area, *Y* represents cardiovascular mortality, the independent variable *X* is the PM_{10} concentration, and the regression coefficients and error term are represented as *β* and *ϵ*, respectively. The geographically weighted estimate of the regression coefficients at location *i* is

$$\hat{\beta}_{i} = (X^{T} W_{i} X)^{-1} X^{T} W_{i} Y$$

where W_{i} is the diagonal matrix of spatial weights for location *i* [19]. GWR was performed in ArcGIS version 10.1 to determine the relationship between PM_{10} and cardiovascular mortality.
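To make the idea of location-specific coefficients concrete, the following is a minimal sketch of GWR with a single predictor, in which a weighted least squares fit is repeated at every location. It is not the ArcGIS implementation used in this study. The Gaussian distance kernel, the fixed `bandwidth`, and the coordinates and values are assumptions made for this example.

```python
import math


def gwr_single_predictor(coords, x, y, bandwidth):
    """Fit y = b0 + b1 * x locally at every location.

    Returns a list of (intercept, slope) pairs, one per location. Each fit is
    a weighted least squares regression in which nearer observations receive
    larger weights through a Gaussian kernel.
    """
    results = []
    for ci in coords:
        w = [math.exp(-0.5 * (math.dist(ci, cj) / bandwidth) ** 2) for cj in coords]
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        slope = sxy / sxx
        results.append((ybar - slope * xbar, slope))
    return results


# Hypothetical districts on a line (km), PM10 (ug/m^3), and a mortality ratio
# constructed to follow one global line exactly.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
pm10 = [40.0, 45.0, 50.0, 55.0, 60.0]
smr = [0.2 + 0.02 * v for v in pm10]
local_fits = gwr_single_predictor(coords, pm10, smr, bandwidth=1.5)
```

When the data follow one global line, as in this constructed example, every local fit recovers the same intercept and slope. Regional differences in the coefficients appear only when the underlying relationship varies over space.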

## Results

Table 1 displays the basic statistics of the number of cardiovascular deaths and the PM_{10} concentration from 2008 to 2010, the latter estimated through kriging. In 2010, the annual average concentration of PM_{10} was 52.06 μg/m^{3}, showing considerable improvement compared with 2008 and 2009. The number of cardiovascular deaths remained at a constant level during the research period, and the reported number of cases over the three years was 778.34.

The global Moran’s I was 0.36 in 2008, 0.29 in 2009, 0.46 in 2010, and 0.44 for 2008 to 2010 combined, all of which were statistically significant. Moran’s I was 0.2 or higher in every period, indicating that cardiovascular mortality was spatially autocorrelated. In particular, Moran’s I was higher than 0.4 for 2010 and for the combined period between 2008 and 2010, signifying a strong spatial autocorrelation.

The standardized mortality ratio (SMR) and the results of the spatial scan statistics for cardiovascular mortality are shown in Figure 1. The SMR across the provincial districts (gu) of the Seoul metropolitan area ranged from 0.55 to 1.38 in 2008, 0.51 to 1.48 in 2009, and 0.70 to 1.69 in 2010. The SMR was higher within Incheon and the northern part of Gyeonggi-do compared to other regions between 2008 and 2010. In contrast, the SMR of Seoul and the neighboring Gyeonggi-do area was lower than that in other regions.

The cluster regions were determined by the likelihood ratio, and mortality cluster (MC) 1 denotes the highest-risk cluster, that is, the one with the largest likelihood ratio. Regions MC 1 and MC 2 consisted of provincial districts within Incheon and the northern part of Gyeonggi-do, respectively.

Table 2 displays the SaTScan results for the statistically significant hot-spot clusters of cardiovascular mortality between 2008 and 2010, including the population at risk within each cluster, the number of cardiovascular deaths, and the relative risk.

The concentration of PM_{10} was categorized into four equal intervals in order to compare the regional distribution of PM_{10} with the clusters of cardiovascular mortality. Particulate matter (PM) 1 signifies a region with a PM_{10} concentration higher than 75%, PM 2 50–75%, PM 3 25–50%, and PM 4 lower than 25%. Figure 2 overlays the PM_{10} concentration classes and the hot spots of cardiovascular mortality.

Table 3 presents the relationships between MCs and PM_{10} concentration. A comparison of the average PM_{10} class with the hot-spot clusters of cardiovascular mortality accumulated throughout 2008–2010 shows that the eight provincial districts of the MC 1 area all coincided with the PM 1 area, the class with the highest concentration of PM_{10}. A comparison of the share of cardiovascular MC regions falling in each PM_{10} class indicates that clusters of cardiovascular mortality correspond mostly with areas of high PM_{10} concentration.
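The comparison between PM_{10} classes and mortality clusters can be sketched as follows. This is an illustration only. It assumes that the four classes are quartiles of the district-level PM_{10} concentration, which is one reading of the 25%, 50%, and 75% cut points. The district values and cluster flags are hypothetical.

```python
import statistics


def pm_class(value, cuts):
    """Map a PM10 value to class 1 (highest) through 4 (lowest)."""
    q1, q2, q3 = cuts
    if value > q3:
        return 1
    if value > q2:
        return 2
    if value > q1:
        return 3
    return 4


def overlap_by_class(pm10, in_cluster):
    """Return (classes, shares): the PM class of each district and the share
    of cluster districts that fall in each PM class."""
    cuts = statistics.quantiles(pm10, n=4)  # three quartile cut points
    classes = [pm_class(v, cuts) for v in pm10]
    counts = {k: 0 for k in (1, 2, 3, 4)}
    for cls, flag in zip(classes, in_cluster):
        if flag:
            counts[cls] += 1
    total = sum(counts.values())
    shares = {k: (counts[k] / total if total else 0.0) for k in counts}
    return classes, shares


# Hypothetical district-level PM10 (ug/m^3) and mortality-cluster membership.
pm10 = [45.0, 48.0, 50.0, 52.0, 55.0, 58.0, 60.0, 63.0]
in_cluster = [False, False, False, False, False, True, True, True]
classes, shares = overlap_by_class(pm10, in_cluster)
```

A large share of cluster districts in classes 1 and 2 would correspond to the pattern reported in Table 3.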

In order to quantitatively analyze the correlation between the SMR of cardiovascular mortality and PM_{10}, spatial regression analysis using the GWR was conducted. This study excluded the 1% region with the highest level of PM_{10} concentration (one provincial district) as an outlier among the 79 provincial districts of the Seoul metropolitan area.

Table 4 presents the results of the GWR of cardiovascular mortality on PM_{10}, which indicate a statistically significant association between PM_{10} and cardiovascular mortality. The minimum, maximum, and average regional regression coefficients of PM_{10} in the GWR were 0.799, 1.562, and 0.956, respectively, indicating that the strength of the association between PM_{10} and cardiovascular mortality differed between regions. The adjusted R^{2} of the GWR was 0.205, and Moran’s I of the residuals was 0.039, which indicates that the residuals show a random spatial pattern and satisfy the basic assumption of the GWR.

## Discussion

In this research, various spatial epidemiological analyses were applied to the Seoul metropolitan area between 2008 and 2010 in order to determine the correlation between PM_{10} and cardiovascular mortality. Kriging was performed on the PM_{10} concentration to determine its spatial distribution, and spatial autocorrelation was determined by using Moran’s I. Disease mapping of cardiovascular mortality was conducted with SaTScan in order to explore the geographical distribution, and clusters of cardiovascular mortality were investigated. Also, the spatial correlation between PM_{10} and cardiovascular mortality was identified quantitatively using the GWR.

Disease surveillance studies examining the spatiotemporal distribution of disease using SaTScan have been actively conducted. Fukuda et al. [20] investigated clusters of mortality due to colon cancer, lung cancer, and breast cancer throughout Japan using SaTScan, and analyzed the relation between mortality clusters and social clusters (socio-economic indicators and population density). The results of the analysis revealed that the mortality clusters for lung cancer and breast cancer were located in urban areas. Horst and Coco [21] used SaTScan to analyze clusters of patients visiting basic health services and emergency rooms due to respiratory and gastrointestinal diseases. Su et al. [22] used SaTScan to spatially analyze clusters of cancer patients based on gender, type of cancer, and ethnicity in order to identify the range and characteristics of cancer center service areas.

Also, a comparison of four spatiotemporal analysis programs (SaTScan, ClusterSeer, GeoSurveillance, and R-Surveillance) across five categories relevant to disease surveillance (data processing, analysis methods, technical issues, analysis output, and ease of use) showed that SaTScan was the most robust program for detecting clusters of health effects [23].

Disease surveillance entails collecting, analyzing, and interpreting data in order to identify the frequency and distribution of diseases within the population [24]. It is an important process that supplies valuable information for policy decisions and implementation. Utilizing spatial cluster analysis enables basic data to be used for preventative activities by identifying areas with relatively high disease risk as well as disease clusters.

Internationally, the relationship between air pollution and health effects is being studied using various spatial analyses, and a significant spatial correlation between health and air pollution is being reported. Corburn et al. [25] used SaTScan to identify asthma clusters in New York City, and they analyzed the correlation between land use, housing type, and air pollution. The analysis results revealed a quantitative correlation of 69% between asthma clusters and air pollution facilities. Jephcote and Chen [26] utilized the GWR in order to examine the relationship between respiratory hospitalizations of children under the age of 15, socio-economic indicators, and PM_{10} from automobile emissions in Leicester, UK. The results of their study indicated that respiratory hospitalization of children is related to socio-economic indicators and PM_{10} from automobile emissions, and that this relationship differed between regions.

In this research, a comparison of the PM_{10} concentration and the hot spots of cardiovascular mortality identified through SaTScan showed that in most cases, the cardiovascular mortality cluster fell within the area with high levels of PM_{10}. Also, the GWR identified a significant correlation between the PM_{10} concentration and cardiovascular mortality, and the regional regression coefficients showed that the spatial correlation differed between regions, which is consistent with the results of previous research. The regional differences in the spatial correlation between PM_{10} and cardiovascular mortality may be due to various factors, such as the concentration levels and components of the PM_{10}, the demographic characteristics and sensitivity of the population exposed to the PM_{10}, and socio-economic factors [27,28].

The spatial analysis using the GWR scientifically suggests regional differences between air pollution and health effects, and conducting policy intervention based on this information can help in the effective distribution of limited resources by prioritizing the policy statement and the target area.

The development of geographic information systems and their combination with statistical techniques have made possible a more scientific analysis of the relationship between environmental exposure and health risks within spatial epidemiology. Spatial epidemiology involves integrated analysis, such as visualization, exploratory analysis, and modeling based on spatial data regarding environmental exposure and health effects, and it examines spatial correlations and patterns in order to predict and prevent health risk [10,29]. This research utilized spatial interpolation, disease mapping, and GWR analysis among various spatial methods in order to identify the correlation between air pollution and health effects, and it produced significant results on this correlation through systematic and quantitative analysis.

However, there are certain limitations to this research. First, it is an ecological study that uses population data, and it does not integrate individual-level data due to the limitations of the data. Therefore, the research does not consider personal characteristics or dynamic factors such as population movement. Second, cardiovascular mortality can occur due to factors other than air pollution, including environmental factors, socio-economic factors, medical factors, and dietary factors. This study only analyzed health effects caused by PM_{10}. In the future, it will be necessary to configure the model to consider a number of factors.

The spatial analyses and findings of the research on the relationship between air pollution and health effects can be utilized as a scientific method to identify vulnerable areas for evidence-based public health policies.

## Acknowledgements

This study is part of the research ‘Evaluation of Impacts of Climate Change on Air Pollution and Health (III)’ conducted by the Korean Environmental Institute, funded by the Ministry of Environment and the National Institute of Environmental Research.

## Notes

The authors have no conflicts of interest with material presented in this paper.