In this study, multivariate regression has been applied in a new way to build a mathematical model that guides further drilling for detailed iron exploration in the Koohbaba area, Northwest of Iran. To locate the additional drilling sites, the ratio of ore length to total core in the drilled boreholes has been modeled from the geophysical exploration dataset using linear, quadratic, and cubic regression models. The ore length to total core ratio of the selected boreholes has been taken as the dependent variable, and the outputs of three magnetic data products, UP10 (10 m upward continuation), RTP (reduction to the pole), and A.S. (analytic signal), have been taken as the independent variables. Based on the probability value (p-value), the coefficients of determination (R² and adjusted R²), and the efficiency formula (EF), the fourth regression model has given the best results. The accuracy of the model has been confirmed by the ratio observed in the existing boreholes and by four additional boreholes drilled in the study area. The results of the regression analysis are therefore reasonable and can be used to site additional drilling for detailed exploration.
Locating mineral deposits is the ultimate aim of geophysical surveys in modern exploration. Several geophysical maps are therefore generated to image the subsurface mineralization. To determine the best zone, some boreholes must be drilled. Although drilling is the most reliable test of deposit potential, it is also the most expensive stage of mineral exploration. Proper methods are therefore essential to decrease the drilling risk and improve the accuracy of drilling site selection [4,5]. Statistical methods play an important role in raising the success rate and reducing the cost of mineral exploration [6,7,8].
In the past few years, the quantitative study of geoscientific data has increased rapidly. Several probabilistic, statistical, and mining models have been proposed for mineral exploration [4,5], such as logistic regression [2,10], ridge regression, multitemporal nonlinear regression, weights of evidence [13,14], artificial neural networks, Bayesian network classifiers, and multiple regression analyses [17,18]. All of these methods have shown promising results and have been applied successfully to mineral resource appraisal.
Multivariate regression analysis has been used successfully to model subsurface mineralization from geochemical datasets [19,20], to map iron minerals, to predict rock properties, and to improve models that estimate sediment yield. In this study, a novel application of the multivariate regression method is proposed to site additional drilling based on the geophysical exploration dataset. To this end, six multivariate regression models have been fitted for iron exploration, using the RTP (reduction to the pole), A.S. (analytic signal), and UP10 (10 m upward continuation) layers as indicators of magnetic susceptibility concentrations. The model outputs have been compared with the log reports from eight boreholes in the Koohbaba (Qoja-Kandi) area, and the comparison has demonstrated adequate accuracy. The borehole log reports, together with a multivariate regression analysis of the ground magnetic data layers, have then been used to plan further drilling for iron exploration.
This study has been conducted in the Koohbaba area within the Urumieh-Dokhtar Magmatic Arc (UDMA), in the Northwest of Iran. The study area covers 1.44 km² in East Azerbaijan, between 46°59′40″ and 47°1′50″ east longitude and 36°51′30″ and 36°53′40″ north latitude (Figure 1). Magmatic activity in the UDMA began in the Eocene and continued to the Pliocene, with its climax in the Middle Eocene. Moreover, according to recent studies [24,25], the UDMA is dominated by Eocene magmatic rocks, a finding confirmed by geochemical analysis.
The UDMA is dominated by plutonic rocks together with felsic volcanic rocks and forms an intrusive–extrusive complex more than 4 km thick [28,29]. The study area contains a wide range of lithologies: schist and shale (Kahar formation), dolomite and limestone (Elika formation), shale, sandstone, and limestone (Shemshak formation), as well as limestone, marl, sandstone, conglomerate, and andesite. Omrani et al. have explained that the UDMA volcanic rocks span a wide range of compositions, including andesite, minor basalts, and dacites. Magnetite associated with the andesite units is responsible for the iron mineralization in the study area. The UDMA hosts large metal deposits such as iron and copper [32,33], as shown in Figure 2. Moreover, Mansouri et al. have introduced some iron ore deposits close to the research area.
Regression analysis describes the relation between one or more responses (dependent variables) and one or more predictors (independent variables), and it predicts the values of the responses for a given set of predictors. Usually, these variables are quantitative, i.e., measured on an interval or ratio scale. The following mathematical formula can express the simple relationship between those variables:
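For a single predictor, the relationship is commonly written in the form below. The symbols (intercept $\beta_0$, slope $\beta_1$, random error $\varepsilon$) follow standard textbook notation and are assumed here for illustration.

```latex
Y = \beta_0 + \beta_1 X + \varepsilon
```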
Multivariate regression is a useful statistical method for evaluating the linear relationships between several independent and dependent variables. It extends multiple regression, with one equation for each response variable. One advantage of multivariate analysis is that the type I error can be determined, and it does not depend on the number of variables. Consequently, this method is applied in this study. The linear regression model can be expressed as follows [36,37]:
In the matrix form, it becomes:
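In standard notation, assumed here for a single response with $n$ observations and $k$ predictors, the linear model and its matrix form can be written as below. The design matrix $\mathbf{X}$ is taken to carry a leading column of ones for the intercept, and $\boldsymbol{\varepsilon}$ is the error vector.

```latex
Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_k X_{ik} + \varepsilon_i,
\qquad i = 1, \dots, n

\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}
```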
The error terms are assumed to satisfy the following:
E(ε) = 0
Cov(ε) = σ²I
In multivariate regression, the strength of the relationship between the dependent variable Y and the independent variables Xi is measured by the coefficients of determination (R² and adjusted R²). These are the most frequently used statistics for assessing the goodness of fit of a model. The mathematical expressions are as follows:
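The two coefficients are usually defined as below. The notation is an assumption of this sketch: $y_i$ is an observed value, $\hat{y}_i$ the fitted value, $\bar{y}$ the mean of the observations, $n$ the number of observations, and $k$ the number of predictors excluding the intercept.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}

R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}
```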
Ground magnetic data were collected in the Koohbaba area following the method used by Mansouri et al. The data were acquired with a GSM-19T magnetometer (±0.2 nT absolute accuracy). As shown in Figure 3, the total magnetic intensity (TMI) map shows magnetic anomalies trending E–W in the north and center of the site. In general, there are three dipolar magnetic anomalies: one in the north, and two in the west and east of the center. These three dipolar anomalies are related to magnetite dikes in andesite units. Accordingly, the multivariate regression method has been used to determine further drilling sites for iron exploration.
The ratio of ore length to total core in the drilled boreholes has been taken as the dependent variable because it measures how well each drilling point was sited. The ratio lies between 0 and 1, and the best boreholes have values closer to 1. The log reports have been collected, and the magnetite thickness of each borehole has been measured with respect to the total core (Table 1). Intervals are counted as ore when the grade exceeds 20% total Fe. Table 2 presents the descriptive statistics of the ratio used in the regression analysis.
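The ratio can be computed directly from a core log. The following is a minimal sketch: the function name `ore_ratio`, the interval representation, and the example borehole are all hypothetical and are not taken from Table 1.

```python
from typing import List, Tuple


def ore_ratio(total_core_m: float,
              ore_intervals_m: List[Tuple[float, float]]) -> float:
    """Return ore length / total core for one borehole.

    ore_intervals_m holds (from_depth, to_depth) pairs, in metres, for the
    core intervals whose grade exceeds the cut-off (20% total Fe in the text).
    """
    if total_core_m <= 0:
        raise ValueError("total core length must be positive")
    for top, bottom in ore_intervals_m:
        if bottom < top:
            raise ValueError("interval bottom must not be above its top")
    thickness = sum(bottom - top for top, bottom in ore_intervals_m)
    if thickness > total_core_m:
        raise ValueError("ore thickness cannot exceed total core length")
    return thickness / total_core_m


# Hypothetical borehole: 100 m of core with two magnetite intervals.
ratio = ore_ratio(100.0, [(12.0, 20.0), (45.5, 53.0)])
print(f"ore length / total core = {ratio:.3f}")
```

A borehole with no interval above the cut-off gets a ratio of 0, and a borehole that is ore over its full length gets 1, which matches the range stated above.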
|Borehole ID||Total core (m)||Magnetite range (m), grade > 20% total Fe||Magnetite thickness in total core (m), grade > 20% total Fe||Ore length/total core|
|Dependent||Sample||Y||ore length/total core||8||0.156||0.13||0.522||−0.436|
|Unique independent||Raster maps (pixel)||X1||UP10||8||52261.8||3464.8||0.065||−1.273|
In regression analysis, the choice of independent variables is critical because they must be relevant to the model. Three geophysical raster maps, namely upward continuation (UP10), reduction to the pole (RTP), and analytic signal (A.S.), have therefore been generated with Oasis Montaj V.8.4. Upward continuation is well suited to exploring deep and semi-deep iron and porphyry-Cu deposits. It computes the magnetic field at a greater distance from the source, which attenuates the short-wavelength response of shallow sources and yields a smoother map. Figure 4a shows the UP10 map of the Koohbaba area.
The RTP transformation converts magnetic anomalies to a symmetrical pattern, so the anomalies are positioned more accurately with respect to their sources and can be interpreted more easily [44,45]. In this study, the TMI map has been reduced to the pole using a magnetic declination of 4.93° and an inclination of 55.43°. Figure 4b shows the RTP raster map of the study location. The A.S. technique is a widely used filter for enhancing the magnetic field and locating the edges of magnetic anomalies. The basic concept of this method has been discussed in the literature [29,46,47]. The A.S. raster map is shown in Figure 4c.
After the maps were generated, the values of the UP10, RTP, and A.S. raster layers were extracted at the exact locations of the eight boreholes (where the dependent variable is known), giving the values of three unique independent variables (UIVs). The statistical parameters of the UIVs (UP10, RTP, and A.S.) used in the regression modeling are listed in Table 2.
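The extraction step can be sketched as a nearest-cell lookup on a north-up grid. This is an illustration only: the function `sample_raster`, the grid origin, the cell size, and the borehole coordinates are hypothetical, and the study itself performed this step in GIS software.

```python
import numpy as np


def sample_raster(grid, x_min, y_max, cell_size, x, y):
    """Value of the raster cell that contains map coordinate (x, y).

    grid[0, 0] is the north-west cell; (x_min, y_max) is the outer corner
    of that cell, and cell_size is the edge length of a square cell.
    """
    grid = np.asarray(grid)
    col = int(np.floor((x - x_min) / cell_size))
    row = int(np.floor((y_max - y) / cell_size))
    if not (0 <= row < grid.shape[0] and 0 <= col < grid.shape[1]):
        raise IndexError("point lies outside the raster")
    return float(grid[row, col])


# Hypothetical 4 x 5 layers on a 10 m grid (placeholder values, not survey data).
layers = {
    "UP10": np.arange(20.0).reshape(4, 5),
    "RTP": np.arange(20.0).reshape(4, 5) * 2.0,
    "AS": np.arange(20.0).reshape(4, 5) + 100.0,
}
boreholes = [(12.0, 37.0), (41.0, 8.0)]  # hypothetical (x, y) collar positions

# One row per borehole, one column per independent variable.
X = np.array([[sample_raster(g, 0.0, 40.0, 10.0, x, y) for g in layers.values()]
              for x, y in boreholes])
print(X)
```

Each row of `X` then pairs with the ore length to total core ratio of the same borehole when the regression is fitted.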
In regression analysis, finding the best model requires fitting and comparing several candidate models. In this study, six multivariate regression models, including linear, quadratic, and cubic forms, have been employed (Table 3) to find the best drilling points for iron exploration.
|Types of regression||Number of coefficients||Formula|
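Candidate models of increasing degree can be built from the same three predictors and fitted by ordinary least squares. The sketch below assumes polynomial terms without cross-products; these term sets are illustrative and are not the formulas of Table 3. The helper names and the synthetic data are hypothetical.

```python
import numpy as np


def design_matrix(X, degree):
    """Intercept column plus powers 1..degree of every column of X.

    Cross-product terms are deliberately left out (an assumption of this
    sketch), so three predictors give 1 + 3 * degree coefficients.
    """
    X = np.asarray(X, dtype=float)
    cols = [np.ones(X.shape[0])]
    for power in range(1, degree + 1):
        cols.extend(X[:, j] ** power for j in range(X.shape[1]))
    return np.column_stack(cols)


def fit_ols(A, y):
    """Least-squares coefficients for the model y = A @ beta + error."""
    beta, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return beta


# Synthetic, noise-free data for eight sample points (not the borehole data).
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(8, 3))
y = 0.2 + 0.5 * X[:, 0] - 0.3 * X[:, 1] ** 2

beta_linear = fit_ols(design_matrix(X, 1), y)     # 4 coefficients
beta_quadratic = fit_ols(design_matrix(X, 2), y)  # 7 coefficients
print(np.round(beta_quadratic, 6))
```

With only eight observations, a model cannot have more than eight free coefficients and still be determined by the data, so under this term layout a degree-3 model (10 coefficients) would need some terms excluded. Predictors with large magnitudes may also need rescaling before their powers are taken, to keep the fit numerically stable.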
The models become more complex from Y1 to Y6 as the number of coefficients increases. Several criteria are considered in selecting the best model. The values of R², adjusted R², and the p-value (ANOVA test) for the regression models are presented in Table 4, and the estimated regression coefficients are presented in Tables 5 and 6. Independent variables that do not appear in the tables have been excluded and do not affect the statistical models.
|Model 1||Model 2||Model 3|
|Variables||Coefficients ( )||Variables||Coefficients ( )||Variables||Coefficients ( )|
|Model 4||Model 5||Model 6|
|Variables||Coefficients ( )||Variables||Coefficients ( )||Variables||Coefficients ( )|
To select the best model, several criteria have to be considered. First, the computed variance and mean of the random errors are acceptable for all regression models. Furthermore, based on Table 4, the p-values (ANOVA test) of the models are acceptable (≪0.05). Together, these values demonstrate the accuracy of the regression models.
As Table 4 shows, the lowest R² has been obtained by the first model (Y1), and the highest by the last three models (Y4, Y5, and Y6). The fourth model is a second-degree function and is less complex than Y5 and Y6 (Table 3). Although R² is suitable for comparing models with the same number of independent variables, it is not adequate for comparing models with different numbers of independent variables, because adding independent variables never decreases R². Hence, adjusted R² has been computed, and model 4 has the highest value (0.966). Therefore, based on the coefficients of determination, model Y4 is the most appropriate model in this study, and it can be applied to determine further drilling exploration sites in the study area.
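The effect of the penalty in adjusted R² can be checked numerically. The sketch below uses the usual definitions of the two coefficients; the function name `r2_scores` and the sample values are hypothetical and do not come from Table 4.

```python
import numpy as np


def r2_scores(y, y_hat, n_predictors):
    """Return (R^2, adjusted R^2) for observations y and fitted values y_hat.

    n_predictors counts the independent variables, excluding the intercept.
    """
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    n = y.size
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more observations than coefficients")
    ss_res = np.sum((y - y_hat) ** 2)      # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return r2, r2_adj


# Hypothetical ratios for eight boreholes and one set of fitted values.
y = [0.05, 0.10, 0.12, 0.20, 0.25, 0.08, 0.30, 0.15]
y_hat = [0.06, 0.09, 0.13, 0.19, 0.24, 0.10, 0.29, 0.15]

# The same quality of fit is penalised more when it uses more predictors.
print(r2_scores(y, y_hat, n_predictors=3))
print(r2_scores(y, y_hat, n_predictors=6))
```

For identical fitted values, R² is the same in both calls while adjusted R² is lower for the model with six predictors, which is why adjusted R² is the fairer basis for comparing Y1 to Y6.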
According to the EF values (Table 4), Y4 is the best model, followed by Y5 and Y6. This agrees with the ranking obtained from the coefficients of determination and p-values (ANOVA test). To determine further drilling sites in the study area (Koohbaba), a raster map has been computed from Y4 using the raster calculator tool in ArcGIS V.10.1. Figure 5a shows the final raster map produced by the fourth regression model.
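The raster calculation amounts to evaluating the fitted equation cell by cell on the three input layers. The following sketch shows the idea with NumPy rather than ArcGIS. It assumes a second-degree model without cross-product terms, and the coefficient vector `beta` is a placeholder, not the Y4 coefficients of Tables 5 and 6.

```python
import numpy as np


def apply_quadratic_model(up10, rtp, a_s, beta):
    """Evaluate, for every cell,
    y = b0 + b1*UP10 + b2*RTP + b3*AS + b4*UP10^2 + b5*RTP^2 + b6*AS^2.

    The three layers must share one grid; beta holds b0..b6 in that order.
    """
    layers = [np.asarray(g, dtype=float) for g in (up10, rtp, a_s)]
    if not (layers[0].shape == layers[1].shape == layers[2].shape):
        raise ValueError("all layers must have the same shape")
    beta = np.asarray(beta, dtype=float)
    if beta.shape != (7,):
        raise ValueError("beta must hold exactly seven coefficients")
    terms = [np.ones_like(layers[0])] + layers + [g ** 2 for g in layers]
    return sum(b * t for b, t in zip(beta, terms))


# Tiny hypothetical layers and placeholder coefficients.
up10 = np.array([[1.0, 2.0]])
rtp = np.array([[3.0, 4.0]])
a_s = np.array([[5.0, 6.0]])
beta = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]

predicted_ratio = apply_quadratic_model(up10, rtp, a_s, beta)
print(predicted_ratio)
```

The resulting array plays the role of the predicted ore length to total core map, which can then be classified to rank candidate drilling sites.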
Moreover, four new boreholes (boreholes No. 9–12) have been drilled in the study area at sites belonging to class 5. The results are presented in Table 7 and Figure 5. As before, intervals are counted toward the magnetite thickness when the grade exceeds 20% total Fe. These results are promising and consistent with the model.
|Borehole ID||Total core (m)||Magnetite range (m)||Magnetite thickness (m)||Ore length/total core|
The conclusions are presented as follows:
Regression analysis is a suitable and direct statistical method for identifying favorable drilling sites with high accuracy in Koohbaba, Northwest of Iran.
The applicability of multivariate regression analysis has been confirmed in this area. In this study, multivariate regression has been used in a new way to create a mathematical model, with reasonable accuracy, for iron mineral exploration from geophysical data.
Six multivariate regression models (two linear, two quadratic, and two cubic equations) have been employed to identify the additional drilling area. According to the coefficients of determination (R² and adjusted R²), the p-value, and EF, the fourth regression model (a quadratic equation) has performed best, and this has been confirmed by the ore length/total core ratios of both the earlier boreholes and the newly drilled boreholes.
The accuracy of the model has been confirmed by drilling four new boreholes. This additional field investigation has shown promising results.
The results confirm that a regression model built on geophysical data is an effective approach, as it can reduce the time and cost of exploration.
In conclusion, 17092.11 m² of the study area has been identified as a candidate zone for detailed study and additional drilling for iron exploration.
[1] Oh HJ, Lee S. Regional probabilistic and statistical mineral potential mapping of gold–silver deposits using GIS in the Gangreung area, Korea. Resour Geol. 2008 Jun;58(2):171–87.
[2] Harris DV, Pan G. Mineral favorability mapping: a comparison of artificial neural networks, logistic regression, and discriminant analysis. Nat Resour Res. 1998 Aug;8(2):17.
[3] Marjoribanks RW. Geological methods in mineral exploration and mining. 2nd edn. Berlin, New York: Springer; 2009.
[4] Xiong Y, Zuo R, Carranza EJM. Mapping mineral prospectivity through big data analytics and a deep learning algorithm. Ore Geol Rev. 2018 Nov;102:811–7.
[5] Mansouri E, Feizi F, Rad AJ, Arian M. Remote-sensing data processing with the multivariate regression analysis method for iron mineral resource potential mapping: a case study in the Sarvian area, central Iran. Solid Earth. 2018 Mar;9(2):373–84.
[6] Chen Y, Wu W. Mapping mineral prospectivity using an extreme learning machine regression. Ore Geol Rev. 2017 Jan;80:200–13.
[7] Ramezanali AK, Feizi F, Jafarirad A, Lotfi M. Application of best-worst method and additive ratio assessment in mineral prospectivity mapping: a case study of vein-type copper mineralization in the Kuhsiah-e-Urmak area, Iran. Ore Geol Rev. 2020 Feb;117:103268.
[8] Feizi F, KarbalaeiRamezanali A, Mansouri E. Calcic iron skarn prospectivity mapping based on fuzzy AHP method, a case study in Varan area, Markazi province, Iran. Geosci J. 2017 Feb;21(1):123–36.
[9] Chen C, Dai H, Liu Y, He B. Mineral prospectivity mapping integrating multi-source geology spatial data sets and logistic regression modelling. Proceedings 2011 IEEE international conference on spatial data mining and geographical knowledge services. Fuzhou, China: IEEE; 2011 Jun. p. 214–7.
[10] Xiong Y, Zuo R. GIS-based rare events logistic regression for mineral prospectivity mapping. Comput Geosci. 2018 Feb;111:18–25.
[11] Hang R, Liu Q, Song H, Sun Y, Zhu F, Pei H. Graph regularized nonlinear ridge regression for remote sensing data analysis. IEEE J Sel Top Appl Earth Observ Remote Sens. 2017 Jan;10(1):277–85.
[12] Kim HJ, Seo DK, Eo YD, Jeon MC, Park WY. Multi-temporal nonlinear regression method for Landsat image simulation. KSCE J Civ Eng. 2019 Feb;23(2):777–87.
[13] Tangestani MH, Moore F. Porphyry copper potential mapping using the weights-of-evidence model in a GIS, northern Shahr-e-Babak, Iran. Aust J Earth Sci. 2001 Oct;48(5):695–701.
[14] Agterberg FP, Bonham-Carter GF. Measuring the performance of mineral-potential maps. Nat Resour Res. 2005 Mar;14(1):1–17.
[15] Singer DA, Kouda R. Application of a feedforward neural network in the search for Kuroko deposits in the Hokuroku district, Japan. Math Geol. 1996 Nov;28(8):1017–23.
[16] Porwal A, Carranza EJM, Hale M. Bayesian network classifiers for mineral potential mapping. Comput Geosci. 2006 Feb;32(1):1–16.
[17] Carranza EJM. Catchment basin modelling of stream sediment anomalies revisited: incorporation of EDA and fractal analysis. Geochem Explor Environ Anal. 2010 Nov;10(4):365–81.
[18] Carranza EJM. Mapping of anomalies in continuous and discrete fields of stream sediment geochemical landscapes. Geochem Explor Environ Anal. 2010 May;10(2):171–87.
[19] Granian H, Tabatabaei SH, Asadi HH, Carranza EJM. Multivariate regression analysis of lithogeochemical data to model subsurface mineralization: a case study from the Sari Gunay epithermal gold deposit, NW Iran. J Geochem Explor. 2015 Jan;148:249–58.
[20] Ramezanali AK, Feizi F, Jafarirad A, Lotfi M. Geochemical anomaly and mineral prospectivity mapping for vein-type copper mineralization, Kuhsiah-e-Urmak area, Iran: application of sequential gaussian simulation and multivariate regression analysis. Nat Resour Res. 2020 Feb;29(1):41–70.
[21] Ma YZ. Pitfalls in predictions of rock properties using multivariate analysis and regression methods. J Appl Geophys. 2011;75:390–400.
[22] Grauso S, Pasanisi F, Tebano C, Grillini M, Peloso A. Investigating the sediment yield predictability in some Italian rivers by means of hydro-geomorphometric variables. Geosciences. 2018 May;8:249.
[23] Kananian A, Sarjoughian F, Nadimi A, Ahmadian J, Ling W. Geochemical characteristics of the Kuh-e Dom intrusion, Urumieh–Dokhtar magmatic arc (Iran): implications for source regions and magmatic evolution. J Asian Earth Sci. 2014 Aug;90:137–48.
[24] Yeganehfar H, Ghorbani MR, Shinjo R, Ghaderi M. Magmatic and geodynamic evolution of Urumieh–Dokhtar basic volcanism, Central Iran: major, trace element, isotopic, and geochronologic implications. Int Geol Rev. 2013 Apr;55(6):767–86.
[25] Babazadeh S, Ghorbani MR, Cottle JM, Bröcker M. Multistage tectono-magmatic evolution of the central Urumieh-Dokhtar magmatic arc, south Ardestan, Iran: insights from zircon geochronology and geochemistry. Geol J. 2019 Jul;54(4):2447–71.
[26] Omrani J, Agard P, Whitechurch H, Benoit M, Prouteau G, Jolivet L. Arc-magmatism and subduction history beneath the Zagros mountains, Iran: a new report of adakites and geodynamic consequences. Lithos. 2008 Dec;106(3–4):380–98.
[27] Arian M. Physiographic-tectonic zoning of Iran’s sedimentary basins. Open J Geol. 2013;3(3):169–77.
[28] Alavi M. Tectonics of the Zagros orogenic belt of Iran: new data and interpretations. Tectonophysics. 1994;229:211–38.
[29] Ramezanali AK, Mansouri E, Faranak F. Integration of aeromagnetic geophysical data with other exploration data layers based on fuzzy AHP and C-A fractal model for Cu-porphyry potential mapping: a case study in the Fordo area, central Iran. Boll di Geofisica Teorica ed Applicata. 2017;58(1):55–73.
[30] Feizi F, Mansouri E, Ramezanali AK. Prospecting of Au by remote sensing and geochemical data processing using fractal modelling in Shishe-Botagh area (NW Iran). J Indian Soc Remote Sens. 2016 Aug;44(4):539–52.
[31] Mansouri E, Feizi F, Karbalaei-Ramezanali A. Identification of magnetic anomalies based on ground magnetic data analysis using multifractal modelling: a case study in Qoja-Kandi, East Azerbaijan province, Iran. Nonlinear Process Geophys. 2015 Oct;22(5):579–87.
[32] Feizi F, Karbalaei-Ramezanali A, Tusi H. Mineral potential mapping via TOPSIS with hybrid AHP–Shannon entropy weighting of evidence: a case study for porphyry-Cu, Farmahin area, Markazi province, Iran. Nat Resour Res. 2017 Oct;26(4):553–70.
[33] Feizi F, Mansuri E. Separation of alteration zones on ASTER data and integration with drainage geochemical maps in Soltanieh, Northern Iran. Open J Geol. 2013;03(02):134–42.
[34] Duleba A, Olive D. Regression analysis and multivariate analysis. Sem Reprod Med. 1996 May;14(2):139–53.
[35] Rencher AC. Methods of multivariate analysis. Wiley series in probability and mathematical statistics. 2nd edn. New York: Wiley; 2002.
[36] Chung CF, Agterberg FP. Regression models for estimating mineral resources from geological map data. J Int Assoc Math Geol. 1980 Oct;12(5):473–88.
[37] Zhang D. A coefficient of determination for generalized linear models. Am Stat. 2017 Oct;71(4):310–6.
[38] Johnson RA, Wichern DW. Applied multivariate statistical analysis. 6th edn. New Jersey: Pearson; 2007.
[39] Chung CJF, Van Westen CJ. Multivariate regression analysis for landslide hazard zonation. Geographical information systems in assessing natural hazards, vol. 5. Netherlands, Dordrecht: Springer; 1995. p. 107–33.
[40] Scott AJ, Holt D. The effect of two-stage sampling on ordinary least squares methods. J Am Stat Assoc. 1982;77(380):7.
[41] Chang J, Olive DJ. OLS for 1D regression models. Commun Stat Theory Methods. 2010 May;39(10):1869–82.
[42] Akossou AYJ, Palm R. Impact of data structure on the estimators R-square and adjusted R-square in linear regression. Int J Math Comput. 2013;20:10.
[43] Cornell JA. Factors that influence the value of the coefficient of determination in simple linear and nonlinear regression models. Phytopathology. 1987;77(1):63.
[44] Abedi M, Norouzi GH. Integration of various geophysical data with geological and geochemical data to determine additional drilling for copper exploration. J Appl Geophys. 2012 Aug;83:35–45.
[45] Golshadi Z. Interpretation of magnetic data in the Chenar-e Olya area of Asadabad, Hamedan, Iran, using analytic signal, Euler deconvolution, horizontal gradient and tilt-derivative methods. Boll di Geofisica Teorica ed Applicata. 2016 Dec;57(4):329–42.
[46] Nabighian MN. The analytic signal of two-dimensional magnetic bodies with polygonal cross-section: its properties and use for automated anomaly interpretation. Geophysics. 1972 Jan;37(3):507–17.
[47] Nabighian MN. Additional comments on the analytic signal of two-dimensional magnetic bodies with polygonal cross-section. Geophysics. 1974 Feb;39(1):85–92.
[48] Li J, Heap AD. A review of spatial interpolation methods for environmental scientists. Canberra: Geoscience Australia; 2008. p. 154.
[49] Vicente-Serrano SM, Saz-Sanchez MA, Cuadrat JM. Comparative analysis of interpolation methods in the middle Ebro valley (Spain): application to annual precipitation and temperature. Clim Res. 2003;24:161–80.