
Open Physics

formerly Central European Journal of Physics

Editor-in-Chief: Seidel, Sally

Managing Editor: Lesna-Szreter, Paulina


Open Access | ISSN (Online) 2391-5471

Local kernel nonparametric discriminant analysis for adaptive extraction of complex structures

Quanbao Li / Fajie Wei / Shenghan Zhou (corresponding author)

School of Reliability and System Engineering, Beihang University, Beijing 100191, China. Tel./Fax: +86 1082317804
Published Online: 2017-05-05 | DOI: https://doi.org/10.1515/phys-2017-0030

Abstract

Linear discriminant analysis (LDA) is one of the most popular methods for linear feature extraction. It usually performs well when the global data structure is consistent with the local data structure. Other frequently used feature extraction approaches usually require linearity, independence, or large-sample conditions. In real-world applications, however, these assumptions are not always satisfied or cannot be tested. In this paper, we introduce an adaptive method, local kernel nonparametric discriminant analysis (LKNDA), which integrates conventional discriminant analysis with nonparametric statistics. LKNDA is adept at identifying both complex nonlinear structures and ad hoc rules. Six simulation cases demonstrate that LKNDA combines the advantages of parametric and nonparametric algorithms and achieves higher classification accuracy. The quartic unilateral kernel function may provide better robustness of prediction than other functions. LKNDA offers an alternative solution for discriminant cases with complex nonlinear features or unknown feature structure. Finally, an application of LKNDA to the complex feature extraction of financial market activities is proposed.

PACS: 42.30.Sy; 24.10.Lx; 06.60.Ei

1 Introduction

Linear discriminant analysis (LDA) generally refers to Fisher’s discriminant analysis (1936), an excellent method of dimensionality reduction and classification based on a projection algorithm [1]. LDA is widely used in pattern recognition, business intelligence and genome sciences for its good classification accuracy and high computational efficiency. Numerous extensions of LDA have been proposed over recent decades. These extensions serve to, for example, address the small sample size (SSS) problem, handle large incremental samples, relax the normality assumption, and extract nonlinear and nonparametric features.

Currently, a prevalent extension for the nonlinear problem is kernel discriminant analysis (KDA) [2–4]. KDA first maps low-dimensional data into a high-dimensional space and subsequently projects the high-dimensional data back onto a low-dimensional one. It is able to recognize certain simple nonlinear relationships. However, in complex nonlinear structures, KDA is not as effective as nonparametric methods with a local classifier, such as k-nearest neighbor (KNN). Nonparametric discriminant analysis (NDA) relaxes the normality assumption of traditional LDA [5]. NDA provides a unified view of the parametric nearest-mean reclassification algorithm and the nonparametric valley-seeking algorithm. Diaf combined NDA and KDA to introduce a nonparametric Fisher’s discriminant analysis with kernels [6]. Weighted LDA is commonly used to handle unbalanced samples [7]. Nearest neighbor discriminant analysis (NNDA) can be regarded as an extension of NDA with a new between-class scatter matrix [8]. The above discriminant analyses are parametric and nonparametric methods with a global classifier, which identify nonlinear features only to a limited extent. Fan proposed a parametric discriminant analysis with a local classifier in 2011, named local linear discriminant analysis (LLDA), which is well suited to complex nonlinear structures [9]. For each testing sample, LLDA first extracts the k-nearest subset from the entire training set and then classifies it by LDA; the k-nearest subset is determined by Euclidean distance. Shi and Hu (2012) presented an LLDA utilizing a composite kernel derived from a combination of local linear models with interpolation [10]. Li et al. proposed NDA with kernels and verified the feasibility of the algorithm on 3D model classification [11]. Zeng (2014) proposed weighted marginal NDA to efficiently utilize the marginal information of the sample distribution [12]. NDA has also been extended to a semi-supervised dimensionality reduction technique that combines the discriminating power of NDA with the locality-preserving power of manifold learning [13]. Du (2013) embedded sparse representation in NDA for face recognition [14]. Adaptive slow feature discriminant analysis is an attractive biologically inspired learning method for extracting discriminant features from time series [15]. Fast incremental LDA feature extraction is derived by optimizing the step size in each iteration using steepest descent and conjugate direction methods [16].

In this paper, we generalize LLDA to local kernel nonparametric discriminant analysis (LKNDA), a nonparametric discriminant analysis with a local classifier. LKNDA performs more accurately and robustly than LLDA in almost all cases. LKNDA improves conventional discriminant analysis with inspiration from nonparametric statistics: it considers the weights of different samples in the subset and modifies the kernel function of nonparametric statistics into a unilateral kernel function. LKNDA defines a generalized nearest-neighbor function, which will be useful in real cases in further study. LKNDA relaxes the normality assumption and performs well on nonlinear or nonparametric problems. Compared with the KNN method, LKNDA has the same time complexity and higher accuracy at class margins. The latter part of this paper proposes an application in the complex feature extraction of financial market activities.

2.1 Overview of the conventional NDA

NDA, developed by Fukunaga and Mantock (1983), is a nonparametric extension of Fisher’s LDA [5]. Discriminant analysis is designed to identify the optimal projection direction that maximizes the ratio of between-class scatter to within-class scatter. NDA introduces a weighting function to emphasize the boundary information. Thus, it can address non-normal data distributions by incorporating data direction and boundary structure.

To maximize the objective function of NDA:
$J(w)=\frac{w^{T}S_{B}w}{w^{T}S_{W}w}.$ (1)

w is the projection matrix, SW is the within-class scatter matrix and SB is the between-class scatter matrix. The two scatter matrices are defined as follows [5, 6]:
$S_{W}=\frac{1}{N}\sum_{i=1}^{C}\sum_{l=1}^{N_{i}}(x_{l}^{i}-\mu_{i})(x_{l}^{i}-\mu_{i})^{T}$ (2)
$S_{B}=\frac{1}{N}\sum_{i=1}^{C}\sum_{l=1}^{N_{i}}\sum_{j=1,\,j\neq i}^{C}\omega(i,j,l)\,\bigl(x_{l}^{i}-m_{j}(x_{l}^{i})\bigr)\bigl(x_{l}^{i}-m_{j}(x_{l}^{i})\bigr)^{T}$ (3)

where
$m_{j}(x_{l}^{i})=\frac{1}{k}\sum_{p=1}^{k}nn(x_{l}^{i},j,p)$ (4)
$\omega(i,j,l)=\frac{\min\{d^{\alpha}(x_{l}^{i},nn(x_{l}^{i},i,k)),\,d^{\alpha}(x_{l}^{i},nn(x_{l}^{i},j,k))\}}{d^{\alpha}(x_{l}^{i},nn(x_{l}^{i},i,k))+d^{\alpha}(x_{l}^{i},nn(x_{l}^{i},j,k))}.$ (5)

µi is the mean vector of class i. $m_{j}(x_{l}^{i})$ is the mean vector of the k nearest neighbors, from class j, of vector $x_{l}\in X_{i}$. $nn(x_{l}^{i},j,k)$ is the k-th nearest neighbor from class j to sample $x_{l}\in X_{i}$. $d(x_{l}^{i},nn(x_{l}^{i},j,k))$ is the Euclidean distance from $x_{l}\in X_{i}$ to its k-th nearest neighbor from class j. ω(i, j, l) is a weighting function that de-emphasizes the effect of samples with large magnitudes, which are far from the decision boundary.
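Under the definitions above, the two scatter matrices can be sketched directly in NumPy. The helper names `kth_nn_dist`, `knn_mean` and `nda_scatter` are hypothetical, and the plain loops mirror Eqs. (2)–(5) rather than any optimized implementation; the control parameter α of Eq. (5) defaults to an assumed value of 2.

```python
import numpy as np

def kth_nn_dist(x, Xc, k, exclude_self=False):
    """Distance from x to its k-th nearest neighbor within Xc."""
    d = np.sort(np.linalg.norm(Xc - x, axis=1))
    return d[k] if exclude_self else d[k - 1]

def knn_mean(x, Xc, k):
    """Mean of the k nearest neighbors of x within Xc -- Eq. (4)."""
    idx = np.argsort(np.linalg.norm(Xc - x, axis=1))[:k]
    return Xc[idx].mean(axis=0)

def nda_scatter(X, y, k=3, alpha=2):
    """Within- and between-class scatter of Fukunaga-Mantock NDA, Eqs. (2)-(5)."""
    classes = np.unique(y)
    N, D = X.shape
    Sw, Sb = np.zeros((D, D)), np.zeros((D, D))
    for i in classes:
        Xi = X[y == i]
        mu_i = Xi.mean(axis=0)
        for x in Xi:
            dw = (x - mu_i)[:, None]
            Sw += dw @ dw.T                               # Eq. (2) term
            # k-th same-class neighbor distance, excluding x itself
            d_ii = kth_nn_dist(x, Xi, k, exclude_self=True) ** alpha
            for j in classes:
                if j == i:
                    continue
                Xj = X[y == j]
                d_ij = kth_nn_dist(x, Xj, k) ** alpha
                w = min(d_ii, d_ij) / (d_ii + d_ij)       # Eq. (5)
                db = (x - knn_mean(x, Xj, k))[:, None]
                Sb += w * db @ db.T                       # Eq. (3) term
    return Sw / N, Sb / N
```

The boundary-emphasis effect comes from the weight w, which approaches 1/2 near the class boundary and falls toward 0 deep inside a class.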

2.2 Local kernel nonparametric discriminant analysis

Local kernel nonparametric discriminant analysis is a nonparametric method for solving complex nonlinear problems. For each testing sample, LKNDA fetches a local subset from the entire training set using the nearest-neighbor function. LKNDA also assigns larger weights to samples that are more similar to the sample being estimated.

(1) Local subset extraction

To discriminate each sample xE of the testing set, we first extract a nearest-neighbor subset with bandwidth K. nn(xE, k) is the nearest-neighbor function, which returns the k-th nearest neighbor of xE in the training set under a given similarity standard.

The generalized similarity calculation model satisfies the following conditions:

1. Non-negativity: for all i and j, 0 ≤ s(i, j) ≤ 1; in particular, s(i, i) = 1.

2. Symmetry: for all i and j, s(i, j) = s(j, i).

There are many calculation models for similarities, each applicable to specific conditions. Euclidean distance is the most common spatial distance measure in Rn. Cosine similarity is another spatial similarity measure, and the Jaccard coefficient measures attribute similarity. In applying LKNDA, the choice of distance formula should account for the data characteristics and the research purpose.

Parameter K can be specified manually, determined by cross-validation, or chosen by the selection principle of Fan’s LLDA [9]: for a large sample, let K be 5%–10% of the sample size; for a small sample, 10%–20%.

NN denotes the ordered subset of the top K nearest-neighbor samples. In the subset NN, the number of classes is C, the sample size is K, and the sample size of class i is Ni. Thus,
$NN=\{nn(x_{E},k)\mid k=1,\dots,K\}$ (6)
$K=\sum_{i=1}^{C}N_{i}$ (7)
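A minimal sketch of the extraction step of Eq. (6), assuming Euclidean distance as the similarity standard (the function name `local_subset` is illustrative):

```python
import numpy as np

def local_subset(x_E, X_train, y_train, K):
    """Ordered K-nearest-neighbor subset NN of a test point, Eq. (6),
    together with the neighbors' class labels."""
    order = np.argsort(np.linalg.norm(X_train - x_E, axis=1))[:K]
    return X_train[order], y_train[order]
```

The returned subset is ordered by increasing distance, which is exactly the rank r used later by the unilateral kernel weighting.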

(2) Local subset prediction

The maximization objective function of NDA is:
$J(w)=\frac{w^{T}S_{B}w}{w^{T}S_{W}w}$ (8)
with within-class and between-class scatter matrices
$S_{W}=\frac{1}{K}\sum_{i=1}^{C}\sum_{l=1}^{N_{i}}(x_{l}^{i}-\mu_{i})(x_{l}^{i}-\mu_{i})^{T}$ (9)
$S_{B}=\frac{1}{K}\sum_{i=1}^{C}\sum_{l=1}^{N_{i}}\sum_{j=1,\,j\neq i}^{C}\mathcal{K}\!\left(\frac{r(x_{l}^{i})-1}{K}\right)(x_{l}^{i}-\mu_{j})(x_{l}^{i}-\mu_{j})^{T}$ (10)

where µi is the mean vector of class i in the local ordered subset NN. $w\in\mathbb{R}^{D\times d}$ is the projection matrix, the parameter of J(w). D is the pre-projection dimension, which equals the number of attributes in LKNDA, while d is the post-projection dimension and satisfies d ≤ C − 1. $nn(x_{l}^{i},j,k)$ is the k-th nearest neighbor from class j to sample $x_{l}\in X_{i}$. $r(x_{l}^{i})$ is the rank (numerical order) of $x_{l}^{i}$ in NN. $\mathcal{K}(x)$ is the unilateral kernel function, discussed in the next subsection; for $x\in[0,1)$, $\mathcal{K}(x)>0$.

J(w) is invariant under any rescaling of w. To solve the optimization problem, we add a constraint that normalizes the denominator:
$\max:\;J(w)=\frac{w^{T}S_{B}w}{w^{T}S_{W}w}$ (11)
$\text{s.t.}:\;\|w^{T}S_{W}w\|=1.$ (12)

Using the Lagrangian method,
$c(w)=w^{T}S_{B}w-\lambda\,(w^{T}S_{W}w)$ (13)
$\Rightarrow\;\frac{\partial c}{\partial w}=2S_{B}w-2\lambda S_{W}w=0$ (14)
$\Rightarrow\;S_{B}w=\lambda S_{W}w.$ (15)

Consequently, maximizing J(w) is equivalent to obtaining a projection matrix wopt whose columns are the eigenvectors corresponding to the largest eigenvalues of the generalized eigenequation SBw = λSWw.
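The eigenproblem step can be sketched as follows; `nda_projection` is a hypothetical name, and a plain NumPy solve of $S_{W}^{-1}S_{B}$ (as in line 9 of the pseudo-code below) stands in for a dedicated generalized-eigenvalue routine:

```python
import numpy as np

def nda_projection(Sw, Sb, d):
    """Projection matrix w_opt: eigenvectors of Sw^{-1} Sb for the d
    largest eigenvalues, i.e. the top solutions of S_B w = lambda S_W w."""
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    top = np.argsort(evals.real)[::-1][:d]
    return evecs[:, top].real
```

In practice Sw may be singular for very small local subsets, in which case a small ridge term (e.g. adding εI) is a common remedy.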

2.3 Unilateral kernel function

Nonlinear sciences have an enormous potential for applied mathematics [17]. There are two different definitions of kernel function: in machine learning, a kernel function maps a sample set into a high-dimensional space; in nonparametric statistics, a kernel function is a weighting function for nonparametric estimation. LKNDA allows neighbor samples to differ in importance: the greater the distance between xE and a sample in NN, the lower the reliability of the information it provides. Thus, we modify the kernel function of nonparametric statistics into a unilateral kernel function, which expresses these reliability differences. Table 1 shows six types of unilateral kernel functions, which are drawn in Figure 1.

Figure 1

Six types of unilateral kernel functions

Table 1

Six types of unilateral kernel functions

An ordered subset of neighbors contains K samples. According to the unilateral kernel functions in Table 1, the unnormalized weight of the i-th training sample is gi and the normalized weight is hi:
$g_{i}=\mathcal{K}\!\left(\frac{i-1}{K}\right)$ (16)
$h_{i}=\mathcal{K}\!\left(\frac{i-1}{K}\right)\Big/\sum_{i=1}^{K}\mathcal{K}\!\left(\frac{i-1}{K}\right)$ (17)

Figure 1 shows the patterns of the different unilateral kernel functions. For $0\le u_{1}<u_{2}\le 1$, $\mathcal{K}(u_{1})\ge\mathcal{K}(u_{2})$. The kernel weighting process means that, for any two samples, the weight of the more similar one is greater than or equal to that of the less similar one. The accuracy of LKNDA is usually affected by the neighborhood size K; nevertheless, a unilateral kernel function makes LKNDA more robust through this weighting procedure. This is demonstrated in Section 3.2.
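As an illustration, the weighting of Eqs. (16)–(17) might be coded as below. Since Table 1 is not reproduced here, the six kernel forms are assumptions modeled on the standard symmetric kernels of nonparametric statistics, restricted to [0, 1); `neighbor_weights` is an illustrative name.

```python
import numpy as np

# Hypothetical unilateral kernels on [0, 1), monotone non-increasing,
# adapted from the standard kernels of nonparametric statistics.
KERNELS = {
    "uniform":      lambda u: np.ones_like(u),
    "triangular":   lambda u: 1.0 - u,
    "epanechnikov": lambda u: 1.0 - u**2,
    "quartic":      lambda u: (1.0 - u**2) ** 2,
    "cosine":       lambda u: np.cos(np.pi * u / 2.0),
    "gaussian":     lambda u: np.exp(-u**2 / 2.0),
}

def neighbor_weights(K, kernel="quartic"):
    """Normalized weights h_i of Eq. (17) for an ordered subset of K
    neighbors, with u_i = (i - 1) / K as in Eq. (16)."""
    u = np.arange(K) / K
    g = KERNELS[kernel](u)          # unnormalized g_i, Eq. (16)
    return g / g.sum()              # normalized h_i, Eq. (17)
```

All six choices satisfy the monotonicity property stated above, so closer neighbors never receive smaller weights.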

If the nearest-neighbor function uses Euclidean distance, the kernel type is uniform, the bandwidth K is a fixed percentage of the total sample, and NDA is replaced by LDA for the local subset computation, then LKNDA degenerates into LLDA.

2.4 Pseudo-code of LKNDA

1: input data: xTest, xTrain, bandwidth K, kernel function $\mathcal{K}\left(.\right)$

2: 𝓚 ← {𝒦((i − 1)/K) | i = 1, …, K}

3: for each xE in xTest do

4:      NN ← {nn(xE, i)|i = 1,..., K}

5:      if all samples in NN have the same class Ci then

6:         xE is Ci

7:      else

8:         calculate SW and SB using NN and 𝓚

9:         w is formed by the d eigenvectors of $S_{W}^{-1}S_{B}$ corresponding to its d largest eigenvalues

10:         assign xE to the nearest class of wTxE in NN

11:      end if

12: end for
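The pseudo-code above can be turned into a runnable single-test-point sketch. The quartic kernel form, the small ridge regularization of SW, and the nearest-projected-class-mean rule at line 10 are implementation assumptions; `lknda_classify` is an illustrative name, and the 1/K scale factors of Eqs. (9)–(10) are omitted because they cancel in the eigenproblem.

```python
import numpy as np

def lknda_classify(x_E, X_train, y_train, K=15,
                   kernel=lambda u: (1.0 - u**2) ** 2):
    """Classify one test point with LKNDA (Euclidean neighbor function,
    assumed quartic unilateral kernel)."""
    order = np.argsort(np.linalg.norm(X_train - x_E, axis=1))[:K]
    NN, yNN = X_train[order], y_train[order]
    classes = np.unique(yNN)
    if classes.size == 1:                       # line 5: unanimous neighbors
        return classes[0]
    D = X_train.shape[1]
    wk = kernel(np.arange(K) / K)               # K((r - 1)/K), r = 1..K
    mus = {c: NN[yNN == c].mean(axis=0) for c in classes}
    Sw, Sb = np.zeros((D, D)), np.zeros((D, D))
    for r in range(K):
        x, c = NN[r], yNN[r]
        dw = (x - mus[c])[:, None]
        Sw += dw @ dw.T                         # Eq. (9), scale omitted
        for j in classes:
            if j != c:
                db = (x - mus[j])[:, None]
                Sb += wk[r] * db @ db.T         # Eq. (10), scale omitted
    Sw += 1e-6 * np.eye(D)                      # guard against singular S_W
    d = classes.size - 1
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    w = evecs[:, np.argsort(evals.real)[::-1][:d]].real
    # line 10: nearest projected class mean
    dists = {c: np.linalg.norm((x_E - mus[c]) @ w) for c in classes}
    return min(dists, key=dists.get)
```

Looping this function over a test set reproduces the full pseudo-code; the unanimity shortcut at line 5 is what keeps the average cost close to that of plain KNN.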

2.5 Time complexity

LDA is a highly efficient classification algorithm with time complexity O(d3 + nd2 + md), where n is the sample size of the whole training set, m is the size of the testing set, and d is the dimension of the projected samples. Usually d is very small and m < n, so the time complexity of LDA can be expressed as O(n).

The time complexity of LKNDA is O(mkn + t(d3 + kd2 + d)), where n and m are the numbers of observations in the training and testing sets respectively, the local subset has k samples, t is the number of discriminant analyses performed after the pruning step, and d is the dimension of the projected samples. Usually d and k are notably small and t < m < n, so the time complexity of LKNDA can be expressed as O(mn), equal to that of KNN.

3 Comparison

In this section, we assess LKNDA in two ways. First, we compare the accuracy of six methods using 2-dimensional and 3-dimensional composite data. Second, we make a parameter comparison over different combinations of bandwidth and unilateral kernel type.

3.1 Classification methods comparison

In this paper, simulations of composite data are used to evaluate the methods; 2-dimensional and 3-dimensional data can visually reflect the characteristics of the data. In the following discussion, we apply six data generating processes, shown in Figure 2: simple small samples, mildly hybrid and unbalanced simple triangle samples, multi-cluster samples, the Taiji diagram, superimposed curve samples and 3D spiral samples. To avoid the particularity of any single dataset, we repeated the simulation 100 times for each data generating process to obtain 100 datasets. For each dataset, we performed 10 rounds of random sampling, extracting 1/3 of the observations as the test sample and keeping the rest as the training sample. In this way, we obtain 1000 pairs of test and training samples for each data generating process.

Figure 2

Six data generating processes for composite data

This section computes the accuracy of six classification models: the naive Bayes classifier (NBC), the C5.0 decision tree classifier (C5.0), k-nearest neighbor (KNN), linear discriminant analysis (LDA), the support vector machine (SVM) and local kernel nonparametric discriminant analysis (LKNDA). To ensure a credible comparison, we use the same 1000 pairs of testing and training data for each classification model. The mean and standard deviation of prediction accuracy are summarized in Table 2.

Table 2

Prediction performance of six kinds of composite data

Figure 2(a) shows simple small samples with linear characteristics and presents a simple classification problem. LDA, SVM and LKNDA achieve nearly 100% accuracy of classification. KNN and C5.0 are inefficient for small sample classification and not ideal for this case. NBC strictly depends on the independence assumption and thus reaches a poor result.

Figure 2(b) shows mildly hybrid and unbalanced simple triangle samples. It has a single structure, obviously linear boundaries and adequate samples; thus, all six methods predict with approximately 92% accuracy. By inference, for simple classification problems with adequate samples, different methods show no significant difference.

Figure 2(c) shows multi-cluster samples. The structure is intuitively clear, with obvious category boundaries. C5.0, KNN, SVM and LKNDA achieve accuracy close to 100%. NBC and LDA perform poorly in this case. NBC is based on the marginal probability distribution and the independence assumption; the green and blue samples in Figure 2(c) have the same marginal probability distribution, which makes NBC unable to distinguish them. LDA is a projection method based on the means and variances of the classes; the three types of samples have the same mean, causing LDA to fail.

Figure 2(d) shows the Taiji diagram. It is an identification problem with complex nonlinear structure and clear edge margins. LKNDA performs best, then KNN, followed by C5.0. These three nonparametric methods have high accuracy. LDA, SVM and NBC do not perform well because they can only explain a linear or simple nonlinear classification situation.

Figure 2(e) shows superimposed curve samples. It is an identification problem with complex nonlinear structure and linear regularity. LKNDA performs best. KNN often misclassifies the points near the intersection point. Other methods fail in this case.

Figure 2(f) shows 3D spiral samples. It is a three-class identification problem with complex nonlinear structure and clear edge margins. LKNDA performs best, then C5.0. Other methods are invalid.

According to the above analysis, the applicable environments of the six classifiers can be summarized as in Table 3. NBC requires a large sample size and an independence assumption; it cannot handle dependence relationships, whether linear or nonlinear. C5.0 is a decision tree classifier based on information entropy. It requires a large sample size, and its dividing surfaces are limited to the x or y directions, which leads to classification errors. KNN is a competent nonparametric classification algorithm and can solve problems with complex nonlinear characteristics given an adequate sample size; however, it cannot capture the law of the points near the class interface. LDA is a linear projection classifier that performs well with small samples but is invalid in nonlinear environments. SVM solves simple nonlinear problems by establishing a classification hyperplane, but it is still insufficient for complex nonlinear problems. LKNDA, proposed in this paper, strives to absorb the advantages of both nonparametric and parametric methods: it takes the advantages of nonparametric methods in solving complex nonlinear problems while drawing on the benefits of parametric ones for pattern recognition with small samples.

Table 3

Conditions of classification algorithm

For small-sample classification tasks, if the data are linear, LDA is appropriate. If the data are simply nonlinear, SVM is fine. If the data are complex nonlinear with adequate samples, KNN is often applied. If KNN behaves poorly, or further prediction accuracy is desired, one can turn to LKNDA.

3.2 Parameter comparison of LKNDA

There are two parameters in LKNDA: the bandwidth K and the kernel type. In this section, we conduct two classification tasks using the Taiji diagram (Figure 2(d)) and the superimposed curve samples (Figure 2(e)). In each task, we let K = 5, 10, 15, 20, 30, 40 for each of the six unilateral kernel functions in Table 1. Table 4 summarizes the simulation results for the different parameter combinations; each combination was tested 1000 times to obtain a mean forecast accuracy. All standard deviations are approximately 0.15 and are omitted from the table. As Table 4 shows, different parameter values lead to different results.

Table 4

Mean of simulation accuracy for different bandwidth and unilateral kernel functions

In Figures 3(a) and 3(b), the x-coordinate is the parameter K and the y-coordinate is the mean simulation accuracy over 1000 runs. In each graph, the six colored lines represent the six types of unilateral kernel functions. Each line in Figure 3 displays a downward parabolic shape, meaning that there is an optimal bandwidth Kopt; the more K deviates from Kopt, the faster the accuracy declines. Both 3(a) and 3(b) show that prediction accuracy varies across kernel functions: the quartic and triangular kernels perform best, the cosine and Epanechnikov kernels are moderately good, and the Gaussian and uniform kernels are less effective. Combining K and kernel type, we conclude that K mainly determines prediction accuracy, while the kernel type mainly determines the robustness of prediction. With a suitable unilateral kernel function, the sensitivity of the accuracy to K decreases, which means the robustness of prediction increases markedly while the accuracy increases slightly.

Figure 3

Line chart of mean of simulation accuracy for different bandwidths and unilateral kernel functions

4.1 General framework of timing system

Market timing is a strategy for deciding when to buy or sell a financial asset by trying to predict future market price movements. The key to market timing is to identify the market trend, a perceived tendency of financial markets to move in a particular direction over time. Quantitative timing traders attempt to identify market trends using technical indicators. There are dozens of common technical indicators and many more developed by financial institutions, differing in algorithm, type, trading signal and scope of assets. The effectiveness of an indicator varies as the market environment changes: indicators that perform well in-sample are often disappointing in out-of-sample prediction.

In practice, a quantification team usually needs to carry out single-indicator optimization, single-indicator testing and timing-signal mixing to construct the timing trading system. In Figure 4, the single-indicator optimization stage selects robust optimal parameters in the feasible region according to preset rules. The single-indicator filtering stage screens usable technical indicators through in-sample, out-of-sample and extrapolation tests. The timing-signal mixing stage generates integrated trading signals by combining multiple indicators.

Figure 4

General flow chart of timing system

There are many ways to create a timing signal mixer. The most commonly used methods are equal voting and general linear weighting; more complex methods include mixed-integer genetic algorithms [18, 19] and neural networks [20–22]. This section only discusses the application of the LKNDA method to construct the timing signal mixer and the classification prediction of LKNDA in the mixer construction process. Index selection, parameter optimization, the timing system and the mixer’s overall effect are not discussed further.

4.2 An application for timing signal mixer

The CSI 300 is an influential stock market index covering the Shanghai and Shenzhen stock exchanges, with a wealth of related financial products. In the following study, CSI 300 data are used. The feature (class label) is calculated from the yield over the next five days. The independent variables are two market-state indicators and six timing indicators. LKNDA is used to build the mixer.

1. Features:

Market trend:
$F_{T}=\begin{cases}1, & \text{FRET}_{T}>1\%\\ -1, & \text{FRET}_{T}<-1\%\\ 0, & \text{otherwise}\end{cases}$ (18)

where T denotes the calculation day and $\text{FRET}_{T}$ is the yield of the next five days, $\text{FRET}_{T}=\frac{\text{CLOSE}_{T+5}-\text{CLOSE}_{T+1}}{\text{CLOSE}_{T+1}}$.

2. Market state indicators:

Relative position:
$S_{1,T}=\frac{\text{CLOSE}_{T}-\min\{\text{LOW}_{i}\}}{\max\{\text{HIGH}_{i}\}-\min\{\text{LOW}_{i}\}},\quad i\in[T-119,\,T]$ (19)

Oscillation intensity:
$S_{2,T}=\frac{\sum|\text{CLOSE}_{i}-\text{CLOSE}_{i-1}|}{\max\{\text{HIGH}_{i}\}-\min\{\text{LOW}_{i}\}},\quad i\in[T-59,\,T]$ (20)

3. Technical indicators:

Six technical indicators I1,T, I2,T, …, I6,T are listed in Table 5.

Table 5

Sample back-test performance of six technical indicators
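A sketch of how the market-trend feature and state indicators of Eqs. (18)–(20) could be computed from daily price arrays. The function name `timing_features` is illustrative, the neutral class 0 in Eq. (18) is a reconstruction of the garbled original, and the technical indicators of Table 5 are not reproduced.

```python
import numpy as np

def timing_features(close, high, low):
    """Market-trend label F_T (Eq. (18)) and state indicators S1, S2
    (Eqs. (19)-(20)) from daily CLOSE/HIGH/LOW arrays, index = trading day."""
    T = len(close)
    F = np.zeros(T)
    S1, S2 = np.full(T, np.nan), np.full(T, np.nan)
    for t in range(T):
        if t + 5 < T:                           # FRET_T: yield of next 5 days
            fret = (close[t + 5] - close[t + 1]) / close[t + 1]
            F[t] = 1 if fret > 0.01 else (-1 if fret < -0.01 else 0)
        if t >= 119:                            # 120-day relative position
            hi, lo = high[t - 119:t + 1].max(), low[t - 119:t + 1].min()
            S1[t] = (close[t] - lo) / (hi - lo)
        if t >= 59:                             # 60-day oscillation intensity
            hi, lo = high[t - 59:t + 1].max(), low[t - 59:t + 1].min()
            S2[t] = np.abs(np.diff(close[t - 59:t + 1])).sum() / (hi - lo)
    return F, S1, S2
```

The leading NaNs mark days whose rolling windows are incomplete; in the extrapolation test these rows would simply be dropped before fitting the mixer.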

Taking the market-trend state as the classification variable and the state and technical indicators as independent variables, the mixer was constructed. The signal mixer was tested by extrapolation, in which the sample set {Fi, S·,i, I·,i | i ∈ [1, T − 5]} was used to predict FT for day T. The results predicted by LKNDA are shown in Table 6. The prediction accuracy of LKNDA was 79.8%, and the probability of a contrary signal between the predicted and actual values was only 7.3%. The prediction accuracies of SVM, LDA, C5.0, NBC and KNN were 78.7%, 77.6%, 75.8%, 72.2% and 70.5%, respectively.

Table 6

Proportion of predicted value to actual value in total sample

According to the integrated signal generated by the mixer, the performance of the strategy is shown in Figure 5. The CSI 300 index is the gray line on the secondary axis. The red areas mark long signals, the green areas mark short signals, and the white bars mark empty-signal periods. The net value of the back-test is the gray line on the primary axis. The strategy achieved an annualized yield of 43.2%, a Sharpe ratio of 1.51 and a Calmar ratio of 1.82, far higher than the single-strategy extrapolation performance on the right side of Table 5. The timing signal mixer thus obtained excellent performance. On the one hand, LKNDA adapts well to extracting complex features; on the other hand, it can automatically adjust the weights of different strategies according to the market state and maximize the effect of the strategy group.

Figure 5

Integrated transaction signal of mixer and net value of back testing

5 Conclusions

This paper presents a supervised classification algorithm, local kernel nonparametric discriminant analysis, which relaxes the normality assumption of conventional discriminant analysis. Compared with NBC, LDA and SVM, LKNDA can effectively identify the classification of complex nonlinear structures. Compared with KNN and C5.0, it can accurately identify the characteristics of the local sample. The bandwidth K primarily determines prediction accuracy, and the type of unilateral kernel function primarily determines the robustness of prediction. Compared with KNN, LKNDA has the same time complexity O(mn), higher accuracy, and a smaller adequate sample size. Applied to the construction of the timing signal mixer, the method is very effective in extracting the complex features of the financial system, and the mixer performed well in the CSI 300 back-test. In future studies, the method will be tested in more cases.

Acknowledgement

This work was supported by the National Natural Science Foundation of China (Nos. 71501007, 71332003, 71672006). The authors would like to thank the referees and the editor who handled this manuscript for their invaluable comments and suggestions.

References

• [1]

Fisher R., The use of multiple measurements in taxonomic problems, Annals of Human Genetics, 1936, 7, 179-188

• [2]

Roth V., Steinhage V., Nonlinear discriminant analysis using kernel functions, Adv. Neural Inf. Process. Syst., 1999, 568-574

• [3]

Mika S., Rätsch G., Weston J., Schölkopf B., Müller K., Fisher discriminant analysis with kernels, Neural Networks for Signal Processing IX, IEEE, 1999, 41-48

• [4]

Baudat G., Anouar F., Generalized discriminant analysis using a kernel approach, Neural Comput., 2000, 12(10), 2385-2404

• [5]

Fukunaga K., Mantock J.M., Nonparametric discriminant analysis, IEEE Trans. Pattern Anal. Mach. Intell., 1983, 6, 671-678

• [6]

Diaf A., Boufama B., Benlamri R., Non-parametric Fisher’s discriminant analysis with kernels for data classification, Pattern Recognit. Lett., 2013, 34(5), 552-558

• [7]

Jarchi D., Boostani R., A new weighted LDA method in comparison to some versions of LDA, Proc. World Acad. Sci. Eng. Technol., 2006, 12, 233-238

• [8]

Qiu X., Wu L., Nearest neighbor discriminant analysis, Int. J. Pattern Recognit. Artif. Intell., 2006, 20, 1245-1259

• [9]

Fan Z., Xu Y., Zhang D., Local linear discriminant analysis framework using sample neighbors, IEEE Trans. Neural Networks, 2011, 22, 1119-1132

• [10]

Shi Z., Hu J., Local linear discriminant analysis with composite kernel for face recognition, IEEE Int. Conf. Neural Networks, 2012, 20, 1-5

• [11]

Li J., Sun W., Wang Y., Tang L., 3D model classification based on nonparametric discriminant analysis with kernels, Neural Comput. Appl., 2013, 22(3-4), 771-781

• [12]

Zeng Q., Weighted marginal discriminant analysis, Neural Comput. & Appl., 2014, 24(3-4), 503-511

• [13]

Xing X., Du S., Jiang H., Semi-supervised nonparametric discriminant analysis. IEICE T. Inf. Syst., 2013, E96.D(2): 375-378

• [14]

Du C., Zhou S., Sun J., Sun H., Wang L., Discriminant embedding by sparse representation and nonparametric discriminant analysis for face recognition, J. Cent. South Univ., 2013, 20(12), 3564-3572

• [15]

Gu X., Liu C., Wang S., Zhao C., Feature extraction using adaptive slow feature discriminant analysis, Neurocomputing, 2015, 154, 139-148

• [16]

Ghassabeh Y.A., Rudzicz F., Moghaddam H.A., Fast incremental LDA feature extraction, Pattern Recogn., 2015, 48(6), 1999-2012

• [17]

Pérez-García V.M., Fitzpatrick S., Pérez-Romasanta LA, Pesic M, Schucht P, Applied mathematics and nonlinear sciences in the war on cancer, Appl. Math. Nonlinear Sci., 2016, 1(2), 423-436

• [18]

Lin Y.C., Hwang K.S., Wang F.S., A mixed-coding scheme of evolutionary algorithms to solve mixed-integer nonlinear programming problems, Comput. Math. Appl., 2004, 47(8-9), 1295-1307

• [19]

Chung, T.S., Wang Z.Y., Li Y.Z., Optimal generation expansion planning via improved genetic algorithm approach, Int. J. Elec. Power Energy Syst., 2004, 26(8), 655-659

• [20]

McCulloch W.S., Pitts W., A logical calculus of the ideas immanent in nervous activity, Neurocomputing: Found. Res., MIT Press, 1943

• [21]

Werbos P., Beyond regression: new tools for prediction and analysis in the behavioral sciences, PhD thesis, Harvard University, 1974

• [22]

Feng J., Shi D., Complex Network Theory and Its Application Research on P2P Networks, Appl. Math. Nonlinear Sci., 2016, 1(1), 45-52

Accepted: 2017-02-23

Published Online: 2017-05-05

Citation Information: Open Physics, Volume 15, Issue 1, Pages 270–279, ISSN (Online) 2391-5471.
