Abstract
The objective of this study is to evaluate the efficiency of two well-known algorithms (Ocean Colour 4 for MERIS [OC4Me] and neural net [NN]) used to calculate chlorophyll-a (Chla) from the Sentinel-3 Ocean and Land Colour Instrument (OLCI), by comparison with in situ measurements covering the Mediterranean Sea. The in situ data set was obtained from the Copernicus Marine Environmental Monitoring Service (CMEMS), specifically from the product INSITU_MED_NRT_OBSERVATIONS_013_035, and Chla values at different depths were extracted. The concentration of Chla over the penetration depth was calculated. Then, the water was classified into two categories, Case-1 and Case-2. For Case-2 waters, OC4Me shows a moderate correlation with the in situ data for a time window of 0–2 h. In contrast, the NN algorithm yields only very weak correlations. For Case-1 waters, the Bias of OC4Me is closer to zero than that of NN. Pearson correlations are also higher for OC4Me (r > 0.5) than for NN. Overall, OC4Me performed better than NN.
1 Introduction
Chlorophyll-a (Chla) absorbs light and uses this energy to drive photosynthesis. It can be considered a proxy for phytoplankton biomass [1]. The concentration of Chla is a critical water quality parameter for many environmental issues such as eutrophication. Eutrophication can have severe consequences for the aquatic environment, namely an increase in hypoxia, fish deaths, and the presence of harmful algae [2,3].
Chla can be measured using various methods such as spectrophotometry, high-performance liquid chromatography (HPLC), and fluorometry. However, these methods require an experienced analyst to generate consistently reliable results. Moreover, because samples have to be collected at regular time intervals, Chla cannot be monitored continuously in this way [4]. Satellites improve the temporal and spatial coverage of Chla compared to in situ sampling: satellite images cover large water areas repeatedly, whereas field methods would require a large number of measurements.
Chla can be estimated from satellites using empirical, analytical, and semi-analytical methods. Empirical algorithms are derived by statistical regression [5] or endmember selection [6]. Analytical algorithms are based on simplified solutions of the radiative transfer equation. Semi-analytical methods lie between the analytical and the empirical ones and combine techniques from both. Other techniques include spectral inversion procedures, which match spectral measurements to bio-optical forward models [7].
Areas that cannot be described by only one optical constituent of the water column are termed Case-2 waters. Different substances, particulate compounds, a large variety of organic macromolecules, living organisms such as phytoplankton, zooplankton, and bacteria, as well as their debris and excrement, exist in the water column. All of these have different optical properties with regard to scattering, absorption, and partly fluorescence [8]. In Case-2 waters, the optical properties are controlled not only by phytoplankton and other particles such as coloured dissolved organic matter but also by substances such as inorganic particles in suspension and yellow substances. Models can also be applied in areas where phytoplankton is the main component influencing the optical properties of the water column [9], known as Case-1 waters.
Various sensors have been used to estimate Chla, including Landsat, MERIS, MODIS [10], and hyperspectral systems [11]. Sensors such as MODIS, SeaWiFS, and MERIS have a high temporal acquisition frequency and bands ideally positioned for the detection of water quality parameters such as Chla. In contrast, multispectral sensors such as Landsat, IKONOS, and SPOT have a few broad bands with higher spatial resolutions (4–30 m) and are primarily designed for terrestrial applications [10].
In addition, several studies have been performed to calculate Chla in the Mediterranean Sea [12,13]. For example, in 2007, SeaWiFS data were used for the calculation of Chla in the Mediterranean Sea, because the standard algorithms OC4v4, OC3, and Algal1 failed [12]. In situ Chla and optical measurements were used to validate two standard regional bio-optical algorithms (BRIC and DORMA) and a global one (OC4v4). DORMA is based on an Ocean Colour 2-band algorithm (OC2). BRIC is a regional algorithm for the retrieval of Chla from SeaWiFS in oligotrophic conditions. The results of this study indicate that OC4v4 performed better than BRIC and DORMA. Furthermore, a new regional algorithm was developed, named Mediterranean Ocean Colour 4 bands (MedOC4). MedOC4 had the best performance because it minimized the difference between the Chla concentrations calculated from the satellite and the in situ measurements [12].
In 2012, a computing system consisting of three separate processing chains was developed by the Satellite Oceanography Group (GOS) of Rome for the calculation of water quality parameters such as Chla and the attenuation coefficient [13]. It was designed for the Mediterranean and the Black Seas. This method was applied to satellite data from SeaWiFS, MODIS-Aqua, and MERIS. The Chla derived from the satellite data was validated against in situ measurements. The in situ Chla data sets consisted of 21 cruises, covering different areas of the Mediterranean Sea, and one fixed station located in Italy. The SeaWiFS Chla product performed better over the Mediterranean Sea than those of MODIS and MERIS. Despite their good general agreement with in situ observations, MODIS- and MERIS-derived Chla showed an underestimation relative to the in situ data [13]. In 2016, the Ocean and Land Colour Instrument (OLCI), which is based on ENVISAT's Medium Resolution Imaging Spectrometer (MERIS), was launched on board Sentinel-3A. Compared to MERIS, it provides a higher number of bands (21 instead of 15) and an improved temporal coverage of the global ocean (<4 days instead of approximately 15 days) [14].
The main purpose of the present research is to evaluate the efficiency of the algorithms used by Sentinel-3 for the calculation of Chla against in situ data covering the Mediterranean Sea. An additional question, which was considered later, is the optimum time window for the best correlation between the Chla calculated from the satellite and the in situ data set. The time window is defined as the time difference between the in situ and the satellite measurements. The results are summarized in scatter plots and tables. Two algorithms were tested, named Ocean Colour 4 for MERIS (OC4Me) and neural net (NN). OC4Me applies a fourth-order polynomial equation to a Maximum Band Ratio (MBR) between the irradiance reflectances at the wavelengths of 443, 490, 510, and 555 nm. NN uses an Inverse Radiative Transfer Model and neural networks to estimate the water constituents.
2 Methods and materials
2.1 Study area
The study area is the Mediterranean Sea, which is considered to be a mid-latitude, mostly oligotrophic and ultra-oligotrophic basin. Higher values of biomass, and consequently of Chla, may be found in areas influenced by river runoff or by deep convection events [15,16]. The Mediterranean Sea is regarded as a good oceanographic test area because of its complex ocean dynamics and its dense anthropogenic pressure [15]. A map illustrating the position of the in situ measurements and the area of study is shown in Figure 1. The in situ measurements used for the evaluation were selected based on the availability of satellite data at a similar date.
The points (in situ measurements) are spread across different areas of the Mediterranean Sea, more specifically the Ionian Sea, the Ligurian Sea, and the Tyrrhenian Sea. Nevertheless, the majority of them are concentrated in the Ligurian Sea.
An open-access data set from the Copernicus Marine Environmental Monitoring Service (CMEMS) with in situ Chla concentrations (INSITU_MED_NRT_OBSERVATIONS_013_035) was chosen for the comparison with the corresponding satellite data. Measurements of Chla fluorescence are saved in the variable FLU2 [17]. The WGS84 geographic coordinate system was selected as the most appropriate one for the study. The in situ parameters from CMEMS representing Chla concentrations were Chla fluorescence (FLU2), Chla (CPHL), and Chla adjusted (CPHL_ADJUSTED). According to CMEMS, CPHL_ADJUSTED refers to quality-checked and eventually corrected Chla data. The measurements of Chla are always related to pressure or depth. A total of 67 NetCDF files with in situ measurements of Chla were initially downloaded. Files containing the variable CPHL were used for the comparison, because of the lack of satellite images for the rest of the in situ data set. Those NetCDF files contained a set of 692 points with in situ data, which were selected for the comparison.
The Sentinel-3 satellite data considered in this study were acquired between 2016 and 2018. Sentinel-3 carries the OLCI instrument, an imaging spectrometer that measures solar radiation reflected by the Earth. Compared to MERIS, it provides additional spectral channels, a different camera arrangement, and simplified onboard processing. The Level 2 full-resolution (WFR) non-time-critical (NTC) products of Sentinel-3 were selected, providing a spatial resolution of 300 m and a temporal resolution of 1–3 days, depending on the region of interest. OLCI has 21 spectral bands ranging from 400 to 1020 nm [14]. The characteristics of the bands are presented in Table 1.
Band  Central wavelength (nm)  Bandwidth (nm) 

Oa1  400  15 
Oa2  412.5  10 
Oa3  442.5  10 
Oa4  490  10 
Oa5  510  10 
Oa6  560  10 
Oa7  620  10 
Oa8  665  10 
Oa9  673.75  7.5 
Oa10  681.25  7.5 
Oa11  708.75  10 
Oa12  753.75  7.5 
Oa13  761.25  2.5 
Oa14  764.375  3.75 
Oa15  767.5  2.5 
Oa16  778.75  15 
Oa17  865  20 
Oa18  885  20 
Oa19  900  10 
Oa20  940  20 
Oa21  1,020  40 
Two algorithms were examined for their efficiency in estimating Chla from Sentinel-3 OLCI satellite data. These two algorithms are the OC4Me MBR and NN, as referenced in ESA [14] and EUMETSAT [18]. Level 2 products of Sentinel-3 use these algorithms for the calculation of Chla. The algorithms are widely known and freely available to the public. Moreover, the focus of this study is not to calibrate or develop new algorithms but rather to test the ones used for Sentinel-3 OLCI satellite data.
A total of 114 images from Sentinel-3 were found to be appropriate for the comparison with the in situ data. The images were selected based on the sensing date, which had to be the same as the date of the in situ measurement.
The methodology of our analysis was separated into two main parts. The first consisted of the processing of the in situ data and the second of the processing of the satellite data set. A flow diagram of the methodology is shown in Figure 2.
2.1.1 Analysis of in situ measurements
Regarding the in situ measurements, the variables FLU2, CPHL, CPHL_ADJUSTED, LATITUDE, and LONGITUDE were extracted from the NetCDF (.nc) files downloaded from CMEMS. For the comparison with the satellite measurements, in situ data containing CPHL were used. To assure the quality of the in situ measurements, the quality controls provided by CMEMS were taken into consideration. The values and descriptions of the quality controls are presented in Table 2. Moreover, POSITION_QC and TIME_QC, the quality controls for position and time, were used.
Code  Meaning  Comment 

0  No QC was performed  — 
1  Good data  All real-time QC tests passed 
2  Probably good data  — 
3  Bad data that are potentially correctable  These data are not to be used without scientific correction 
4  Bad data  Data have failed one or more of the tests 
5  Value changed  Data may be recovered after transmission error 
6  Not used  — 
7  Nominal value  Data were observed but not reported. Example of an instrument target depth 
8  Interpolated value  Missing data may be interpolated from neighbouring data in space or time 
9  Missing value  An observation was performed, but is not available 
Data with quality-control codes 1, 2, and 7 were selected for the comparison with satellite data. In addition, negative in situ measurements of Chla were excluded. Chla values measured at different depths/pressures were preferred. Code 7 corresponds to values that were observed but not reported, e.g. the depth of a Chla concentration; in this case, the nominal values were used. To convert pressure to water depth, the methodology proposed by Saunders was applied [19]. It uses the following two equations, which take into account the pressure in decibars (dbar) and the latitude:
where p is the water pressure in dbar, φ is the latitude in degrees, z is depth in metres, and c _{1} is a function of latitude.
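The pressure-to-depth conversion can be sketched in a few lines of Python. This is a minimal illustration, assuming the commonly quoted form of the Saunders relation, z = (1 − c _{1})p − c _{2}p^{2}, with c _{1} = (5.92 + 5.25 sin^{2}φ) × 10^{−3} and c _{2} = 2.21 × 10^{−6}; the coefficient values and the function name `pressure_to_depth` are assumptions of this sketch and should be checked against [19].

```python
import math


def pressure_to_depth(p_dbar, lat_deg):
    """Convert pressure (dbar) to depth (m) at a given latitude (degrees).

    Coefficients are assumed from the commonly quoted Saunders relation;
    verify them against the original reference before use.
    """
    # c1 is the latitude-dependent term, c2 is a constant
    c1 = (5.92 + 5.25 * math.sin(math.radians(lat_deg)) ** 2) * 1e-3
    c2 = 2.21e-6
    return (1.0 - c1) * p_dbar - c2 * p_dbar ** 2


if __name__ == "__main__":
    # Example: a 50 dbar measurement at 43.5 degrees N (Ligurian Sea latitude)
    print(round(pressure_to_depth(50.0, 43.5), 2))
```

In the upper ocean the depth in metres is numerically slightly smaller than the pressure in dbar, so the conversion changes shallow Chla depths by well under 1 %.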
To verify the Chla calculated from the satellite data, it should be compared with the in situ concentration of Chla named C _{M} [15]. For the calculation of C _{M}, the following formula (equation 3) was applied [15].
where k _{d} is the attenuation coefficient (m^{−1}), C(z) is the measured Chla concentration expressed in mg m^{−3}, and Z _{pd} is the penetration depth [20], defined as the inverse of k _{d}, i.e. Z _{pd} = 1/k _{d} [21].
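As an illustration of how C _{M} can be evaluated numerically, the sketch below assumes the widely used weighted-average form, in which C(z) is weighted by exp(−2 k _{d} z) and integrated from the surface to Z _{pd}, then divided by the integral of the weight alone. Both this weighting and the function name `weighted_chla` are assumptions of the sketch, not a transcription of equation (3).

```python
import math


def weighted_chla(profile, k_d, steps=1000):
    """Weighted mean Chla (mg m^-3) between the surface and Z_pd = 1/k_d.

    profile: callable returning C(z) in mg m^-3 for a depth z in metres.
    Assumes the weight exp(-2 * k_d * z); trapezoidal integration.
    """
    z_pd = 1.0 / k_d
    dz = z_pd / steps
    num = 0.0
    den = 0.0
    for i in range(steps + 1):
        z = i * dz
        w = math.exp(-2.0 * k_d * z)
        # Trapezoid rule: end points carry half weight
        edge = 0.5 if i in (0, steps) else 1.0
        num += edge * w * profile(z) * dz
        den += edge * w * dz
    return num / den


if __name__ == "__main__":
    # A uniform profile must return its own value regardless of k_d
    print(weighted_chla(lambda z: 0.5, k_d=0.1))
```

Because the weight decays with depth, near-surface concentrations dominate C _{M}, which is why a profile whose Chla increases with depth yields a value below its simple depth average.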
To evaluate equation (3), the concentration profile C(z) is needed. The vertical profile of Chla is assumed to follow a Gaussian distribution [22], given by equation (4):
The concentration of Chla at a specific depth (C(z)) depends on four parameters: the background Chla concentration (B _{0}), the height parameter (h), the width of the peak (σ), and the depth of the Chla peak (z _{m}). They were restricted to positive values, because negative numbers have no physical meaning [22]. A typical profile of Chla in relation to depth is shown in Figure 3.
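For reference, a Gaussian profile with these four parameters is commonly written as below. This is a sketch consistent with the parameter definitions above; the normalisation of h by σ√(2π) is an assumption, since some formulations use h directly as the peak amplitude.

```latex
C(z) = B_{0} + \frac{h}{\sigma\sqrt{2\pi}}
       \exp\!\left[-\frac{(z - z_{m})^{2}}{2\sigma^{2}}\right]
```

With this form, C(z) tends to B _{0} far from the peak and reaches B _{0} + h/(σ√(2π)) at z = z _{m}.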
To assess the goodness of fit of equation (4), two statistical indexes (the Nash–Sutcliffe coefficient [NS] and Willmott's index [d]) were determined. A fit was considered good when d ≥ 0.75 and NS ≥ 0.36 [24]. For the parameters of the distributions to have a physical meaning, they were constrained to positive values [23].
Regarding the satellite data, a matchup file was generated between the dates of the satellite data and the in situ measurements. For each in situ measurement, the corresponding pixel value was used. A window of 1 × 1 pixel was used for the matchup file, and pixels identified as “water” were examined. The attenuation coefficient is a direct product of Sentinel-3 images and, consequently, no estimation was needed. It was applied for the calculation of Z _{pd}.
For the calculation of k _{d} from Sentinel-3 data, a semi-analytical approach is used, which is based on incorporating spectral inherent optical properties (IOPs), namely the absorption coefficient and the backscattering coefficient, into a solution of the radiative transfer equation. k _{d} is estimated for Case-1 waters. A linear relationship between k _{d} and Chla is established through statistical analyses of simultaneous field data. A more detailed description of the estimation of k _{d} can be found in Morel et al. [25].
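The two goodness-of-fit indexes used to accept or reject a fitted profile can be sketched as follows. The formulas are the standard definitions of the Nash–Sutcliffe coefficient and Willmott's index of agreement; the function names and the helper `is_good_fit` are hypothetical and only illustrate the acceptance rule d ≥ 0.75 and NS ≥ 0.36.

```python
def nash_sutcliffe(obs, pred):
    """NS = 1 - sum((o - p)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def willmott_d(obs, pred):
    """d = 1 - sum((o - p)^2) / sum((|p - mean(o)| + |o - mean(o)|)^2)."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    potential = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred)
    )
    return 1.0 - sse / potential


def is_good_fit(obs, pred, d_min=0.75, ns_min=0.36):
    """Acceptance rule from the text: d >= 0.75 and NS >= 0.36."""
    return willmott_d(obs, pred) >= d_min and nash_sutcliffe(obs, pred) >= ns_min


if __name__ == "__main__":
    measured = [0.10, 0.50, 0.90, 0.40]   # illustrative values, mg m^-3
    fitted = [0.12, 0.47, 0.88, 0.43]
    print(is_good_fit(measured, fitted))
```

A profile that is reproduced exactly gives NS = d = 1, whereas a fit that only returns the mean of the observations gives NS = d = 0 and is rejected.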
2.1.2 Analysis of satellite data
Two algorithms were examined for their efficiency in estimating Chla from satellite images: OC4Me and an NN algorithm. In the first one, Chla is calculated directly using the MBR. The algorithm applies a fourth-order polynomial equation (equation 5) [18] and was developed by Morel et al. [25]. A more general description of the algorithm can be found in the study of O'Reilly et al. [26].
where R = log10 (MBR) and the coefficients A _{0} = 0.450, A _{1} = −3.259, A _{2} = 3.522, A _{3} = −3.359, and A _{4} = 0.949; and C is the derived concentration of Chla in mg m^{−3}.
The MBR (equation 6) is calculated as follows:
where R _{443}, R _{490}, R _{510}, and R _{555} are the irradiance reflectances at the wavelengths of 443, 490, 510, and 555 nm, respectively, as described by EUMETSAT [18]. For Sentinel-3 OLCI, the nominal band corresponding to 555 nm is centred at 560 nm (Oa6).
The second algorithm, i.e. NN, uses an Inverse Radiative Transfer Model to estimate the water constituents. It was initially developed by Doerffer and Schiller for Case-2 waters [8]. Later, it was updated to become the Case-2 Regional/Coast Colour (C2RCC) processor applied by Sentinel-3. The processor relies on an extensive database of simulated water-leaving reflectances and related top-of-atmosphere radiances. Neural networks are trained to invert the spectrum for atmospheric correction, as well as to retrieve the IOPs of the water body. The IOPs are then converted into Chla and total suspended matter concentrations using arithmetic conversion factors.
The methodology proposed by Matsushita et al. was selected for distinguishing Case-1 from Case-2 waters [27]. Case-1 waters should satisfy the following relationship (equation 7):
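The OC4Me calculation can be sketched as below, using the coefficients A _{0} to A _{4} quoted above and assuming the usual form of the polynomial, log10(C) = A _{0} + A _{1}R + A _{2}R^{2} + A _{3}R^{3} + A _{4}R^{4} with R = log10(MBR). The function name `oc4me_chla` and the example reflectances are hypothetical; the operational product also applies flags and corrections that are not reproduced here.

```python
import math

# Polynomial coefficients A0..A4 as quoted in the text
A = (0.450, -3.259, 3.522, -3.359, 0.949)


def oc4me_chla(r443, r490, r510, r560):
    """Chla (mg m^-3) from the Maximum Band Ratio and a 4th-order polynomial.

    Assumes log10(C) = sum(A_k * R**k) with R = log10(MBR); r560 is the
    OLCI band used in place of the nominal 555 nm.
    """
    mbr = max(r443, r490, r510) / r560
    r = math.log10(mbr)
    log_c = sum(a * r ** k for k, a in enumerate(A))
    return 10.0 ** log_c


if __name__ == "__main__":
    # Illustrative reflectances for clear, blue water (MBR = 3)
    print(round(oc4me_chla(0.030, 0.020, 0.010, 0.010), 2))
```

Higher blue-to-green ratios give lower Chla, which is the expected behaviour for oligotrophic Mediterranean waters.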
where α is defined as the absorption coefficient at 412 and 443 nm wavelengths.
The last equation can be related to the ratio between the reflectances at 412 and 443 nm being greater than or equal to one, as described by Matsushita et al. [27]. Thus, the reflectances calculated at the wavelengths of 412 and 443 nm are compared. If the reflectance at 412 nm is higher than or equal to the one at 443 nm, then the pixel is Case-1 water. Otherwise, it is considered to be Case-2 water. The same behaviour, with reflectance at 412 nm being lower than at 443 nm for Case-2 waters, can be seen in the study of Bulgarelli and Zibordi [28]. Moreover, the two algorithms have associated error layers. A threshold of 0.2 mg m^{−3} was applied to the error layers in both cases to avoid measurements with severe errors.
The Sentinel-3 images were selected based on the sensing date, which had to be the same as the date of the in situ measurement. Images for which Gaussian distributions could not be fitted to the in situ data (as described in Section 2.1.1) were excluded.
Moreover, the Chla calculated from the satellite using the two algorithms (OC4Me and NN) is always estimated for the euphotic zone.
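The reflectance comparison and the error-layer screening can be sketched as follows. The function names are hypothetical, and treating a pixel whose error equals exactly 0.2 mg m^{−3} as acceptable is an assumption of this sketch, since the text does not state whether the threshold is inclusive.

```python
def classify_water(r412, r443):
    """Return 'Case-1' if reflectance at 412 nm >= reflectance at 443 nm."""
    return "Case-1" if r412 >= r443 else "Case-2"


def passes_error_threshold(chla_error, threshold=0.2):
    """Keep a pixel only if its error-layer value (mg m^-3) is within the threshold.

    The inclusive comparison (<=) is an assumption.
    """
    return chla_error <= threshold


if __name__ == "__main__":
    # Illustrative pixel: reflectances at 412/443 nm and an error-layer value
    pixel = {"r412": 0.020, "r443": 0.015, "chla_err": 0.08}
    if passes_error_threshold(pixel["chla_err"]):
        print(classify_water(pixel["r412"], pixel["r443"]))
```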
2.1.3 Comparison of in situ measurements and satellite data
Generally, the time window reported in the literature varies from 8 to 12 h [15]. This raised a further question regarding the optimum time window for the best correlation between the Chla calculated from the satellite and the in situ data set. Therefore, the described process was performed for different time windows (from 0–1 h up to 0–7 h), for Case-1 and Case-2 waters separately.
To validate the Chla calculated from satellite data against the corresponding in situ values, statistical indexes (coefficient of determination [r ^{2}], Pearson correlation [r], and Bias) were calculated. The indexes applied are listed in Table 3.
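Selecting the matchups that fall inside a given time window can be sketched with the standard library alone. The tuple layout of a matchup and the function names are hypothetical, and the inclusive upper bound of each window is an assumption.

```python
from datetime import datetime, timedelta


def within_window(t_insitu, t_satellite, max_hours):
    """True if the absolute time difference is at most max_hours (inclusive, assumed)."""
    return abs(t_satellite - t_insitu) <= timedelta(hours=max_hours)


def select_matchups(matchups, max_hours):
    """Filter (t_insitu, t_satellite, chla_insitu, chla_satellite) tuples by time window."""
    return [m for m in matchups if within_window(m[0], m[1], max_hours)]


if __name__ == "__main__":
    # Illustrative matchups only, not values from the study
    data = [
        (datetime(2017, 5, 1, 10, 0), datetime(2017, 5, 1, 9, 40), 0.42, 0.47),
        (datetime(2017, 5, 1, 15, 0), datetime(2017, 5, 1, 9, 40), 0.55, 0.47),
    ]
    for hours in range(1, 8):  # windows 0-1 h up to 0-7 h
        print(hours, len(select_matchups(data, hours)))
```

Because each window contains all shorter ones, the number of points can only grow as the window widens, which matches the increasing point counts reported for the 0–1 h to 0–7 h windows.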
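Statistical index  Equation  Units 
r ^{2}  $r^{2} = (r)^{2}$  — 
r  $r = \dfrac{\sum_{i=1}^{n}(x_{i}-\bar{x})(y_{i}-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}}\,\sqrt{\sum_{i=1}^{n}(y_{i}-\bar{y})^{2}}}$  — 
Bias  $\mathrm{Bias} = \dfrac{1}{n}\sum_{i=1}^{n}(y_{i}-x_{i})$  mg m^{−3} 
y is the value of Chla calculated from the satellite data and x is the corresponding in situ value; n is the number of measurements.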
To test the level of significance of a correlation coefficient, a t-test was performed. The equation used to calculate the t-statistic is the following:
where n is the number of measurements, r is the Pearson correlation, and r ^{2} is the coefficient of determination.
A p-value was then generated from a Student's t distribution (TDIST function in Microsoft Excel), using a two-tailed distribution, and compared with a significance level of 0.05 (α = 0.05). If the p-value is less than the significance level, there is a significant linear relationship between the Chla calculated from the in situ and the satellite data.
Because of the large amount of data, Python code was written for several steps of the proposed methodology, such as extracting parameters of interest from the NetCDF files, fitting Gaussian distributions as described in equation (4), getting pixel information at the selected longitudes and latitudes from the images, and finally validating the results. Moreover, a geodatabase was constructed with tables and queries for the processing and organization of the data. In addition, ModelBuilder in ArcGIS was applied to make spatial joins between the in situ data and the relevant images.
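The validation statistics can be sketched in plain Python. The sketch assumes the standard Pearson formula, Bias defined as the mean of satellite minus in situ values (consistent with the reported means, e.g. 0.54 − 0.49 = 0.05 mg m^{−3} for OC4Me, Case-1, 0–1 h), and the usual t-statistic t = r√(n − 2)/√(1 − r ^{2}). The function names are hypothetical. The two-tailed p-value itself requires a Student's t distribution with n − 2 degrees of freedom, which is not in the standard library, so it is left out here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between in situ (x) and satellite (y) values."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bias(x, y):
    """Mean of (satellite - in situ); positive values mean overestimation."""
    return sum(b - a for a, b in zip(x, y)) / len(x)


def t_statistic(r, n):
    """t-statistic for testing a correlation coefficient with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


if __name__ == "__main__":
    insitu = [1.0, 2.0, 3.0, 4.0, 5.0]      # illustrative values only
    satellite = [1.0, 2.0, 3.0, 4.0, 6.0]
    r = pearson_r(insitu, satellite)
    print(round(r, 3), round(r * r, 3), round(bias(insitu, satellite), 2))
    # r = 0.74 with 114 points, as reported for OC4Me, Case-2, 0-2 h
    print(round(t_statistic(0.74, 114), 2))
```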
3 Results
Two data sets were created, one with in situ measurements from CMEMS and one with Sentinel-3 satellite data. To eliminate possibly incorrect measurements, the quality controls provided by CMEMS were applied. The in situ data set included values ranging from 0.01 to 4 mg m^{−3}, which were compared to the Chla calculated from the satellite data. Two algorithms (OC4Me and NN) used by the Sentinel-3 satellite were evaluated. Case-1 and Case-2 waters were distinguished based on the reflectance at the wavelengths of 412 and 443 nm.
The performance of the OC4Me algorithm against the in situ data set was examined first. Figure 4 and Table 4 illustrate the results. Figure 4 shows the in situ and the modelled concentrations in a scatter plot for a time window of 0–2 h for Case-2 waters. The rest of the scatter plots for Case-1 and Case-2 are shown in the Appendix (Figures A1a–g and A2a–g). A logarithmic scale is used in the scatter plots, as in similar studies, and because it allows the Chla concentrations derived from in situ and satellite data to be visualized clearly. Table 4 presents basic statistical values calculated for the in situ and the modelled Chla of Case-1 and Case-2 waters using the OC4Me algorithm.
Time window (h)  In situ concentrations (mg m^{−3})  Modelled concentrations (mg m^{−3})
Case-1  Case-2  Case-1  Case-2
Min.  Max.  Mean  Min.  Max.  Mean  Min.  Max.  Mean  Min.  Max.  Mean
0–1  0.02  1.61  0.49  0.02  1.83  0.53  0.08  2.66  0.54  0.01  3.83  1.21 
0–2  0.02  1.61  0.47  0.01  1.86  0.56  0.04  2.66  0.46  0.01  3.83  1.23 
0–3  0.02  1.61  0.49  0.01  2.10  0.61  0.04  2.66  0.46  0.01  3.97  1.19 
0–4  0.01  2.03  0.55  0.01  2.57  0.66  0.01  2.76  0.51  0.01  3.97  1.21 
0–5  0.01  3.06  0.61  0.01  2.57  0.70  0.01  2.76  0.52  0.01  3.97  1.23 
0–6  0.01  3.06  0.63  0.01  3.56  0.72  0.01  2.76  0.5  0.01  3.97  1.20 
0–7  0.01  3.65  0.66  0.01  3.89  0.78  0.01  2.76  0.5  0.01  3.97  1.21 
In Figure 4, the modelled concentrations of Chla did not exceed 4 mg m^{−3}. In situ concentrations starting from 0.01 to more than 1 mg m^{−3} were estimated. Very few measurements exceed the green line having slopes less than 0.5. The rest of the measurements are spread between slopes greater than 1/2 (green line) and 0.5. Moreover, an overestimation of the modelled Chla can be observed.
In Table 4, the minimum values of the modelled and in situ data sets are similar for both Case-1 and Case-2 waters, ranging from 0.01 to 0.08 mg m^{−3}. The mean in situ values are similar regardless of the time window for Case-1 and Case-2 waters, ranging from 0.47 to 0.78 mg m^{−3}. For the modelled concentrations, the statistics (mean, max) for Case-1 are closer to the corresponding in situ measurements. The maximum values of the modelled concentrations calculated for Case-1 are lower than those of Case-2. As the time difference increases, the concentrations observed (Case-1 and Case-2) are higher for most of the statistical values. Furthermore, an overestimation is observed in the mean values for Case-2 waters, whereas for Case-1 waters the in situ and the modelled mean values are very close.
Using the data sets created for the different time windows, the coefficient of determination, Pearson correlation, p-values, Bias, and the number of points were calculated; they are presented in Table 5.
Time window (h)  Case-1  Case-2
r ^{2}  r  p-value  Bias (mg m^{−3})  Number of points  r ^{2}  r  p-value  Bias (mg m^{−3})  Number of points
0–1  0.25  0.50  3.0 × 10^{−3}  0.05  33  0.51  0.71  1.2 × 10^{−8}  0.68  48 
0–2  0.39  0.62  2.8 × 10^{−9}  −0.01  74  0.55  0.74  3.8 × 10^{−21}  0.67  114 
0–3  0.39  0.62  1.8 × 10^{−12}  −0.03  103  0.37  0.61  2.1 × 10^{−19}  0.58  178 
0–4  0.31  0.56  1.7 × 10^{−13}  −0.04  149  0.32  0.57  8.7 × 10^{−22}  0.55  241 
0–5  0.37  0.61  6.5 × 10^{−21}  −0.09  193  0.31  0.56  3.0 × 10^{−25}  0.53  293 
0–6  0.37  0.61  5.9 × 10^{−24}  −0.12  223  0.33  0.57  4.0 × 10^{−32}  0.49  350 
0–7  0.35  0.59  1.7 × 10^{−26}  −0.16  266  0.34  0.58  1.0 × 10^{−39}  0.43  421 
For Case-1 and Case-2, the best coefficient of determination is calculated for a time window of 0–2 h, with values of 0.39 and 0.55, respectively. For Case-1, the coefficient of determination ranges from 0.25 to 0.39, and for Case-2 from 0.31 to 0.55. It is noteworthy that, relative to the 0–1 h window, widening the time window raises the coefficient for Case-1 (from 0.25 to 0.31–0.39), whereas for Case-2 it drops from 0.51–0.55 to 0.31–0.37 once the window exceeds 2 h. The strength of the correlation is described following the classification of Evans [29], applied here to the coefficient of determination.
Therefore, for Case-2 the correlation for the time windows of 0–1 h and 0–2 h can be considered as “moderate.” For time windows greater than 2 h, correlations can be considered as “weak.” For Case-1, they can be characterized as “weak.” Pearson correlations were calculated for Case-1 and Case-2 waters with values of at least 0.5 (r ≥ 0.5), indicating a strong correlation between the Chla calculated from the satellite and the in situ data. All the p-values estimated for the different time windows and Cases are lower than 0.05; therefore, the relationship between the Chla calculated from the satellite and the in situ data set is statistically significant.
As for r ^{2}, the statistical index of Bias takes its higher values for Case-2. A positive value indicates an overestimation and a negative value an underestimation of the Chla calculated from the satellite. It varies from −0.16 to 0.05 mg m^{−3} (Case-1) and from 0.43 to 0.68 mg m^{−3} (Case-2). Moreover, the number of points can play an essential role in the values calculated for the statistical index of Bias. The number of points selected for the validation of Case-2 is higher than the corresponding number for Case-1. In general, the OC4Me algorithm performs better for Case-2 waters in terms of the coefficient of determination. However, in terms of the statistical index of Bias, the OC4Me algorithm performs better for Case-1 waters.
The efficiency of the NN algorithm for the calculation of Chla was examined as well (Figure 5 and Table 6). Figure 5 shows the in situ and the modelled concentrations in a scatter plot for a time window of 0–2 h for Case-2 waters. The rest of the scatter plots for Case-1 and Case-2 are shown in the Appendix (Figures A3a–g and A4a–g). Table 6 shows basic statistical values for the in situ and the modelled Chla for Case-1 and Case-2 waters.
Time window (h)  In situ concentrations (mg m^{−3})  Modelled concentrations (mg m^{−3})
Case-1  Case-2  Case-1  Case-2
Min.  Max.  Mean  Min.  Max.  Mean  Min.  Max.  Mean  Min.  Max.  Mean
0–1  0.02  1.22  0.44  0.03  1.19  0.49  0.05  0.70  0.21  0.07  0.75  0.34 
0–2  0.02  1.27  0.44  0.01  1.37  0.49  0.03  0.72  0.22  0.07  0.75  0.31 
0–3  0.02  1.54  0.47  0.01  2.10  0.55  0.03  1.20  0.22  0.07  0.78  0.29 
0–4  0.02  2.03  0.52  0.01  2.57  0.62  0.03  1.20  0.22  0.06  0.93  0.29 
0–5  0.01  2.31  0.58  0.01  2.57  0.67  0.02  1.20  0.21  0.06  1.20  0.29 
0–6  0.01  2.31  0.60  0.01  3.56  0.68  0.02  1.20  0.21  0.06  1.20  0.30 
0–7  0.01  2.31  0.61  0.01  3.89  0.76  0.02  1.20  0.21  0.06  1.20  0.31 
In Figure 5, it is observed that the modelled concentrations of Chla did not exceed 1 mg m^{−3}. In situ concentrations starting from 0.01 to more than 1 mg m^{−3} were estimated. A higher number of values exceed the green line having slopes less than 0.5 in the case of NN algorithm in comparison to the OC4Me, where very few data were estimated (Figure 4). In addition, an underestimation of the modelled Chla can be observed.
In Table 6, the minimum values of the modelled and in situ data sets are similar for both Case-1 and Case-2 waters, ranging from 0.01 to 0.07 mg m^{−3}. The mean in situ values are similar regardless of the time window for Case-1 and Case-2 waters, ranging from 0.44 to 0.76 mg m^{−3}. The differences in the modelled concentrations between Case-1 and Case-2 are small. As the time difference increases, the in situ concentrations observed (Case-1 and Case-2) are higher for most of the statistics.
Using the data sets created for the different time windows, the coefficient of determination, Pearson correlation, p-values, Bias, and number of points were calculated and are presented in Table 7.
Time window (h)  Case-1  Case-2
r ^{2}  r  p-value  Bias (mg m^{−3})  Number of points  r ^{2}  r  p-value  Bias (mg m^{−3})  Number of points
0–1  0.0001  0.01  9.58 × 10^{−1}  −0.23  30  0.03  0.17  3.12 × 10^{−1}  −0.15  36 
0–2  0.09  0.30  1.36 × 10^{−2}  −0.23  67  0.16  0.40  1.03 × 10^{−4}  −0.19  89 
0–3  0.15  0.39  1.25 × 10^{−4}  −0.25  93  0.07  0.26  1.58 × 10^{−3}  −0.27  140 
0–4  0.12  0.35  5.79 × 10^{−5}  −0.3  129  0.07  0.26  1.86 × 10^{−4}  −0.33  195 
0–5  0.14  0.37  7.39 × 10^{−7}  −0.36  165  0.1  0.32  7.44 × 10^{−7}  −0.37  235 
0–6  0.14  0.37  8.98 × 10^{−8}  −0.39  192  0.15  0.39  1.14 × 10^{−11}  −0.38  286 
0–7  0.12  0.35  7.41 × 10^{−8}  −0.4  229  0.16  0.40  1.31 × 10^{−14}  −0.45  343 
For a time window of 0–2 h, the coefficient of determination is 0.09 for Case-1 and 0.16 for Case-2. For Case-2, this is the best value (reached again for 0–7 h), whereas for Case-1 the highest value (0.15) occurs for 0–3 h. The statistical index of Bias is always negative, indicating an underestimation of Chla from the satellite data. It ranges from −0.15 to −0.45 mg m^{−3} for Case-2 and from −0.23 to −0.40 mg m^{−3} for Case-1. As the time window increases, a larger underestimation is introduced into our data sets. Following the classification described by Evans [29], all the correlations can be characterized as “very weak”. The number of points considered for the validation of Case-2 is higher than for Case-1.
For both Case-1 and Case-2 and the time window 0–1 h, no correlation was estimated (r < 0.3) using the Pearson correlation. For the rest of the time windows and Case-1 waters, values between 0.3 and 0.5 (0.3 ≤ r ≤ 0.5) indicate a moderate strength of correlation. For Case-2 and time windows 0–3 h and 0–4 h, a weak strength of correlation was estimated. For the rest of the Case-2 time windows, a moderate strength of correlation was calculated. Except for the 0–1 h window (p = 0.958 for Case-1 and p = 0.312 for Case-2), all the p-values estimated for the different time windows and Cases are lower than 0.05; therefore, for these windows the relationship between the Chla calculated from the satellite and the in situ data set is statistically significant.
4 Discussion
By comparing Tables 5 and 7, “very weak” correlations are estimated using the NN algorithm in comparison with OC4Me, where “weak” to “moderate” correlations were calculated. Moreover, higher values of the statistical index Bias for Case1 waters were calculated for the NN (−0.23 to −0.40 mg m^{−3}) algorithm compared to OC4Me (−0.16 to 0.05 mg m^{−3}). Higher values of Pearson correlations were estimated (r > 0.5) for OC4Me algorithm than NN. For Case2 and time windows 0–1 h, 0–2 h greater values were calculated for r and strong correlations could be observed. Thus, OC4Me performed better than the NN algorithm having higher r ^{2} and r for both Case1 and Case2 waters and lower values of Bias for Case1. The optimum time window was estimated at around 0–2 h using OC4Me, having the best coefficient of determination and a strong correlation between Chla calculated from the satellite and the in situ data set.
The number of points selected for the validation varies between cases, which can influence the Bias and r^{2} results. Generally, the coefficients of determination obtained with the OC4Me and NN algorithms cannot be considered sufficient for a remote-sensing product intended for quantitative observation of phytoplankton biomass.
The NN algorithm of Sentinel3 was designed for Case2 waters. Calculating Chla with NNs requires a careful and elaborate determination of the multiple coefficients (training phase) [30]. Using a training data set may improve the calculation of Chla [31]. Consequently, the lack of such a training data set could explain the behaviour of the algorithm relative to the in situ concentrations.
One of the validations of this algorithm was performed using the Mermaid (status of 2012) in situ data set. A mean Bias of −0.21 mg m^{−3} was calculated for cases with high sun glint and of −0.27 mg m^{−3} for cases without high sun glint. The coefficients applied in the equation for the calculation of Chla were obtained by regression from North Sea data [32]. The Bias values calculated in the present study are similar to those reported in that validation.
In Tilstone et al. [33], measurements from stations across the North Sea, English Channel, Celtic Sea, Mediterranean Sea, and along the Iberian coast, collected between June 2001 and March 2012, were considered. The accuracy of a range of ocean colour Chla algorithms was assessed in North West European waters using two different atmospheric correction (AC) processors (COASTCOLOUR and MERIS Ground Segment processor version 8.0, MEGS8.0). Various algorithms, including OC4Me, were applied for the calculation of Chla. With the OC4Me algorithm, the COASTCOLOUR and MEGS8.0 processors gave coefficients of determination of 0.67 and 0.55, respectively. In the present study, the best value, 0.55, was obtained using the OC4Me algorithm for Case2, which is close to the values reported in that study. Furthermore, Tilstone et al. [33] applied an NN algorithm similar to the one considered here, for which coefficients of 0.61 and 0.31 were estimated using the COASTCOLOUR and MEGS8.0 AC models, respectively. Therefore, the AC method may play a significant role in the results [33].
Another study, performed by Toming et al. [34] in the Baltic Sea, found similar results for the coefficient of determination. It reported an r^{2} of up to 0.56 in coastal waters (R/V Salme measurements) and 0.43 in the open parts of the Baltic Sea (FerryScope measurements).
The attenuation coefficient at 490 nm, which is a direct product of Sentinel3, was used to calculate Chla at the penetration depth from the in situ data. This product was designed for Case1 waters [25]. Using an attenuation coefficient suited to Case2 waters, and thus calculating the concentration of Chla at the penetration depth more accurately, may improve the statistical indexes calculated for these waters.
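To illustrate how the attenuation coefficient enters this step, the following Python sketch assumes the common definition of the penetration depth as the inverse of the diffuse attenuation coefficient, z_pd = 1/Kd(490), and an optically weighted mean of the Chla profile with weights exp(−2 Kd z). Both choices, the discrete (non-integrated) averaging, the function names, and the sample profile are assumptions made for illustration and are not necessarily the exact procedure of this study.

```python
import math

def penetration_depth(kd490):
    """Penetration depth in metres, assumed equal to 1 / Kd(490)."""
    return 1.0 / kd490

def weighted_chla(depths, chla, kd490):
    """Optically weighted mean Chla above the penetration depth.

    Each sample at depth z gets the weight exp(-2 * Kd * z); samples deeper
    than the penetration depth are ignored. This discrete mean assumes
    roughly evenly spaced samples.
    """
    z_pd = penetration_depth(kd490)
    numerator = 0.0
    denominator = 0.0
    for z, c in zip(depths, chla):
        if z <= z_pd:
            weight = math.exp(-2.0 * kd490 * z)
            numerator += weight * c
            denominator += weight
    if denominator == 0.0:
        raise ValueError("no samples above the penetration depth")
    return numerator / denominator

# Hypothetical profile: depths in m, Chla in mg m^-3, Kd(490) in m^-1.
depths = [0.0, 10.0, 30.0]
chla = [0.2, 0.4, 1.0]
kd490 = 0.05

print("Penetration depth (m):", penetration_depth(kd490))
print("Weighted Chla (mg m^-3):", weighted_chla(depths, chla, kd490))
```

In this sketch, a larger Kd(490), as would be expected in more turbid Case2 waters, shortens the penetration depth and gives more weight to the near-surface samples.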
For both cases, another significant uncertainty is the accuracy of the in situ data set. The only way to check the accuracy of the in situ data set considered is through the quality controls provided by the CMEMS. The quality controls do not produce a number specifying the accuracy but offer only descriptive information, which may not be sufficient. Other issues with the in situ data could arise from various sources.
The exact methods used to produce the in situ Chla from CMEMS, such as spectrophotometry or HPLC, are not known, which may introduce uncertainties into our results and influence the apparent performance of the algorithms. In the work of Santos Dos et al. [35], different types of field methods were used to estimate Chla. Two of them, the spectrophotometric and the fluorimetric methods, overestimated the Chla concentration.
As mentioned in CMEMS, the in situ data set is compiled from several methods used by various institutions and researchers. If Chla values were derived from fluorescence data, the in situ concentrations could be a poor proxy for comparison with the Chla calculated from satellite data.
Variability in the sample collection strategies (i.e. sampler type, depth(s), etc.) may produce differences in the measured in situ Chla values. Collection of the in situ data at different periods and under various potential circumstances, such as waters with low or high turbidity or different weather conditions, could affect the accuracy of the measurements. Thus, all the mentioned uncertainties could affect the quality of the in situ data set and explain the performance of the algorithms.
Initially, we planned to separate the in situ data set into coastal and oceanic stations; however, the in situ metadata from the CMEMS did not allow such a distinction.
5 Conclusions
In this study, the efficiency of two algorithms, the OC4Me MBR based on a polynomial algorithm and the NN Chla concentration based on an Inverse Radiative Transfer Model, was tested against open access in situ measurements. The in situ data set obtained from the CMEMS with Chla concentrations (INSITU_MED_NRT_OBSERVATIONS_013_035) was selected for the comparison with the Sentinel3 satellite data.
We found “very weak” correlations between the Chla calculated from the NNs and the corresponding in situ data. In contrast, the OC4Me ocean colour model performed better against the in situ data set, with correlations classified from “weak” to “moderate.” Lower values of the statistical index of Bias for Case1 waters were calculated for the OC4Me algorithm. Higher values of Pearson correlation (r > 0.5) were estimated for the OC4Me algorithm than for NN. For Case2 and the 0–2 h time window, greater values of r were calculated and strong correlations could be observed. Therefore, OC4Me performed better than NN. The optimum time window was estimated at around 0–2 h, which gave a high coefficient of determination and Pearson correlation. Concentrations higher than 4 mg m^{−3} were not tested. Generally, sufficient correlations could not be obtained between the in situ and the satellite data, possibly because of uncertainties arising from the quality of the in situ data.
Acknowledgements
This research has been co-financed by the European Regional Development Fund of the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship, and Innovation, under the call RESEARCH–CREATE–INNOVATE (project code: T1EDK02966).
References
[1] Watanabe FSY, Alcântara E, Stech JL. High performance of chlorophylla prediction algorithms based on simulated OLCI Sentinel3A bands in cyanobacteriadominated inland waters. Adv Space Res. 2018;62:265–73. doi:10.1016/j.asr.2018.04.024
[2] Ferreira JG, Andersen JH, Borja A, Bricker SB, Camp J, Cardoso da Silva M, et al. Overview of eutrophication indicators to assess environmental status within the European marine strategy framework directive. Estuarine Coastal Shelf Sci. 2011;93:117–31. doi:10.1016/j.ecss.2011.03.014
[3] Boesch DF. Challenges and opportunities for science in reducing nutrient overenrichment of coastal ecosystems. Estuaries. 2002;25:886–900. doi:10.1007/BF02804914
[4] Environmental YSI. The basics of chlorophyll measurement; 2013. p. 1–3.
[5] Kabbara N, Benkhelil J, Awad M, Barale V. Monitoring water quality in the coastal area of Tripoli (Lebanon) using highresolution satellite data. ISPRS J Photogramm Remote Sens. 2008;63:488–95. doi:10.1016/j.isprsjprs.2008.01.004
[6] Tyler AN, Svab E, Preston T, Présing M, Kovács WA. Remote sensing of the water quality of shallow lakes: a mixture modelling approach to quantifying phytoplankton in water characterized by highsuspended sediment. Int J Remote Sens. 2006;27:1521–37. doi:10.1080/01431160500419311
[7] Odermatt D, Gitelson A, Brando VE, Schaepman M. Review of constituent retrieval in optically deep and complex waters from satellite imagery. Remote Sens Environ. 2012;118:116–26. doi:10.1016/j.rse.2011.11.013
[8] Doerffer R, Schiller H. The MERIS Case 2 water algorithm. Int J Remote Sens. 2007;28:517–35. doi:10.1080/01431160600821127
[9] Antoine D. Sentinel3 optical products and algorithm definition. OLCI level 2 algorithm theoretical basis document. ocean color products in case 1 waters; 2010. p. 1–19.
[10] Matthews MW. A current review of empirical procedures of remote sensing in Inland and nearcoastal transitional waters. Int J Remote Sens. 2011;32:6855–99. doi:10.1080/01431161.2010.512947
[11] Olmanson LG, Brezonik PL, Bauer ME. Airborne hyperspectral remote sensing to assess spatial distribution of water quality characteristics in large rivers: the Mississippi River and its tributaries in Minnesota. Remote Sens Environ. 2013;130:254–65. doi:10.1016/j.rse.2012.11.023
[12] Volpe G, Santoleri R, Vellucci V, Ribera d’Alcalà M, Marullo S, D’Ortenzio F. The colour of the Mediterranean Sea: global versus regional biooptical algorithms evaluation and implication for satellite chlorophyll estimates. Remote Sens Environ. 2007;107:625–38. doi:10.1016/j.rse.2006.10.017
[13] Volpe G, Colella S, Forneris V, Tronconi C, Santoleri R. The Mediterranean ocean colour observing system – system development and product validation. Ocean Sci. 2012;8:869–83. doi:10.5194/os88692012
[14] ESA. Sentinel3 user handbook; 2013.
[15] D’Ortenzio F, Marullo S, Ragni M, D’Alcalà MR, Santoleri R. Validation of empirical SeaWiFS algorithms for chlorophylla retrieval in the Mediterranean Sea: a case study for oligotrophic seas. Remote Sens Environ. 2002;82:79–94. doi:10.1016/S00344257(02)000263
[16] Antoine D, Morel A, Andre JM. Algal pigment distribution and primary production in the eastern Mediterranean as derived from coastal zone color scanner observations. J Geophys Res. 1995;100:16193–209. doi:10.1029/95JC00466
[17] Jaccard P, Hjemann DO, Ruohola J, Marty S, Kristiansen T, Sorensen K, et al. Copernicus marine environment monitoring service. Quality control of biogeochemical measurements; 2018.
[18] Eumetsat. Sentinel3 OLCI marine user handbook; 2018.
[19] Saunders MP. Practical conversion of pressure to depth. Phys Oceanogr. 1981;11:573–4. doi:10.1175/15200485(1981)011<0573:PCOPTD>2.0.CO;2
[20] Clark DK. Biooptical algorithms – Case 1 waters (ATBD 17, v1.2). Ocean color web page; 1997. p. 1–52.
[21] Strass VH. Meridional and seasonal variations in the satellitesensed fraction of euphotic zone chlorophyll. J Geophys Res. 1990;95:18289–301. doi:10.1029/JC095iC10p18289
[22] Platt T, Caverhill C, Sathyendranath S. Basinscale estimates of oceanic primary production by remote sensing: the North Atlantic. J Geophys Res. 1991;96:15147. doi:10.1029/91JC01118
[23] Silulwane NF, Richardson AJ, Shillington FA, MitchellInnes BA. Identification and classification of vertical chlorophyll patterns in the Benguela upwelling system and AngolaBenguela front using an artificial neural network. South Afr J Mar Sci. 2001;23:37–51. doi:10.2989/025776101784528872
[24] Lopes FB, Novo EMLdM, Barbosa CCF, Andrade EMd, Ferreira RD. Simulation of spectral bands of the MERIS sensor to estimate chlorophylla concentrations in a reservoir of the semiarid region. Rev Agro@mbiente Online. 2016;10:96. doi:10.18227/19828470ragro.v10i2.3482
[25] Morel A, Huot Y, Gentili B, Werdell PJ, Hooker SB, Franz BA. Examining the consistency of products derived from various ocean color sensors in open ocean (Case 1) waters in the perspective of a multisensor approach. Remote Sens Environ. 2007;111:69–88. doi:10.1016/j.rse.2007.03.012
[26] O’Reilly JE, Maritorena S, Mitchell BG, Siegel DA, Carder KL, Garver SA, et al. Ocean color chlorophyll algorithms for SeaWiFS. J Geophys Research: Ocean. 1998;103:24937–53. doi:10.1029/98JC02160
[27] Matsushita B, Yang W, Chang P, Yang F, Fukushima T. A simple method for distinguishing global Case1 and Case2 waters using SeaWiFS measurements. ISPRS J Photogramm Remote Sens. 2012;69:74–87. doi:10.1016/j.isprsjprs.2012.02.008
[28] Bulgarelli B, Zibordi G. On the detectability of adjacency effects in ocean color remote sensing of midlatitude coastal environments by SeaWiFS, MODISA, MERIS, OLCI, OLI and MSI. Remote Sens Environ. 2018;209:423–38. doi:10.1016/j.rse.2017.12.021
[29] Evans JD. Straightforward statistics for the behavioral sciences. Pacific Grove: Brooks/Cole Pub. Co.; 1996.
[30] IMT Neural Net; 2019. Available from: https://sentinels.copernicus.eu/web/sentinel/technicalguides/sentinel3olci/level2/imtneuralnet
[31] Ioannou I, Foster R, Gilerson A, Gross B, Moshary F, Ahmed S. Neural network approach for the derivation of chlorophyll concentration from ocean color. In: Hou WW, Arnone RA, eds. 2013/06. p. 87240PP. doi:10.1117/12.2018143
[32] Doerffer R. Algorithm theoretical bases document (ATBD) for L2 processing of MERIS data of case 2 waters, 4th reprocessing; 2015.
[33] Tilstone G, Mallorhoya S, Gohin F, Couto AB, Sá C, Goela P, et al. Which ocean colour algorithm for MERIS in north west European waters? Remote Sens Environ. 2017;189:132–51. doi:10.1016/j.rse.2016.11.012
[34] Toming K, Kutser T, Uiboupin R, Arikas A, Vahter K, Paavel B. Mapping water quality parameters with Sentinel3 ocean and land colour instrument imagery in the Baltic Sea. Remote Sens. 2017;9:1070. doi:10.3390/rs9101070
[35] Santos Dos ACA, Calijuri MC, Moraes EM, Adorno MAT, Falco PB, Carvalho DP, et al. Comparison of three methods for chlorophyll determination: spectrophotometry and fluorimetry in samples containing pigment mixtures and spectrophotometry in samples with separate pigments through high performance liquid chromatography. Acta Limnol Bras. 2003;15:7–18.
© 2021 Ioannis Moutzouris-Sidiris and Konstantinos Topouzelis, published by De Gruyter
This work is licensed under the Creative Commons Attribution 4.0 International License.