### Feature extraction from time-frequency image based on GLCM

GLCM is a texture feature extraction method for image pixel processing13. Its basic principle is to count the number of pixel pairs in a specific direction and distance. The direction is generally divided into four directions: 0°, 45°, 90° and 135°. The distance is a positive integer, which can be set manually to generate a grayscale co-occurrence matrix. In order to reduce the amount of calculations, the time-frequency images are processed by modifying the image size and the gray level.

STFT is used to convert the bearing vibration signal of each sample into the corresponding time-frequency image, and Figure 1 shows the time-frequency images of a set of samples representing four types of bearing states. Like the time-frequency image generated by STFT14 has a large size which is 875×656, the image size is reduced to 128×128. Also, since the gray image has 256 gray levels, which requires a lot of computation, the gray level of the time-frequency image is set to 4, and the 4×4 grayscale co-occurrence matrix is ​​obtained by counting the number of pixel pairs whose direction is 90 degrees and whose distance is 2.

The four features including contrast, correlation, homogeneity and energy calculated from the time-frequency image grayscale co-occurrence matrix do not reach a good classification effect in the ‘experience. Therefore, in this study, the (4 times 4) The co-occurrence matrix is ​​converted into a 16-dimensional vector, which is the preliminary characteristic of each time-frequency image. Due to the large numerical range of each preliminary feature in the vector, the preliminary features are processed to avoid the impact of too large a numerical range on the classification effect. The mapminmax function in MATLAB shown in Equation. (1) is used to normalize preliminary features in the training set. As the mapminmax function consists of normalizing each row of data, the matrix composed of the preliminary characteristics of the training set is transformed into a row vector before normalization, and then returned as a matrix after normalization. When eq. (1) is used to normalize the preliminary characteristics in the test set, the values ​​of xmaximum and xmin are the maximum and minimum values ​​among the preliminary features of the training set rather than the test set.

$${text{y = (y}}_{{{text{max}}}} {text{ – y}}_{{{text{min}}}} {)} times { text{(x – x}}_{{{text{min}}}} {text{)/(x}}_{{{text{max}}}} {text{ – x} }_{{{text{min}}}} {text{) + y}}_{{{text{min}}}}$$

(1)

where xmaximum and xmin are the maximum and minimum values ​​of each row before normalization, respectively, and ymaximum Andymin are the maximum and minimum values ​​of each row after normalization, respectively.

The specific steps for extracting features from the GLCM-based time-frequency image are as follows:

Step 1::

Reduce the size of the time-frequency image from 875×656 to 128×128.

2nd step::

Convert color image to gray image. At this point, the pixel range of the time-frequency image is between 0 and 255.

Step 3::

Convert the gray image to a four-level gray image. At this time, the pixel range of the time-frequency image is 1 to 4.

Step 4::

Count the number of pairs of pixels whose direction is 90 degrees and whose distance is 2, and generate the (4 times 4) grayscale co-occurrence matrix.

Step 5 ::

Convert the (4 times 4) matrix to a 16-dimensional vector, and take this vector as the preliminary characteristic of the time-frequency image. The preliminary features of the training set and the test set are normalized, respectively, and then the features of the GLCM-based time-frequency image can be obtained.

The features of a set of samples with different states using the GLCM-based time-frequency image feature extraction method are given in Fig. 2.

### KELM based on MBASA

#### KELM

Detailed description of ELM can be shown in related literatures15.16ELM can be expressed as follows:

$$f(x) = g(x)omega = g(x)G^{T} left( {frac{I}{C} + GG^{T} } right)^{ – 1} T$$

(2)

where (omega = G^{T} left( {frac{I}{C} + GG^{T} } right)^{ – 1} T) denotes the weight connecting the hidden layer to the output layer, and VS denotes the regularization parameter.

KELM uses kernel function to override feature mapping (g(x)) of ELM, which makes KELM has better convergence and generalization performance than ELM17. KELM can be expressed as follows:

$$f(x) = left[ {begin{array}{*{20}c} {K(x,x_{1} )} vdots {K(x,x_{N} )} end{array} } right]left( {frac{I}{C} + GG^{T} } right)^{ – 1} T$$

$$= left[ {begin{array}{*{20}c} {K(x,x_{1} )} vdots {K(x,x_{N} )} end{array} } right]left( {frac{I}{C} + Delta_{KELM} } right)^{ – 1} T$$

(3)

where (Delta_{KELM}) denotes the matrix of the nucleus,

$$Delta_{{KELM_{i,j} }} = K(x_{i} ,x_{j} ) = g(x_{i} ) cdot g(x_{j} )$$

(4)

The Cauchy kernel is an excellent alternative kernel function, which is employed in this study and can be expressed in the following form:

$$K_{Cauchy} ({mathbf{x}}_{i} ,{mathbf{x}}_{j} ) = frac{1}{{1 + frac{{left| {{mathbf{x}}_{i} – {mathbf{x}}_{j} } right|^{2} }}{eta }}}$$

(5)

where (eta) is the Cauchy kernel parameter.

The regularization parameter and the kernel parameter of the KELM model can affect the performance of KELM, which should be carefully selected.

#### Optimization of KELM parameters based on MBASA

BASA mimics the activities of beetle antennae in nature18. However, the traditional beetle antenna search algorithm employs only one beetle, a beetle antenna search is difficult to find the optimal parameters when the ranges of the parameters to be optimized are wide. Thus, MBASA which employs several beetles is presented in this article. Searching for multi-beetle antennae increases the possibility of obtaining the optimal parameters.

The MBASA-based KELM parameter selection process can be described as follows:

Step 1: :

Set beetle positions in vector ({mathbf{x}}^{t}) to youth moment ( (t = 1,2, cdots)). Initialize MBASA settings, including beetle position ({mathbf{x}}^{0})antennae length (d^{0}) and step size (delta^{0}).

2nd step: :

Assess the fitness of each beetle A five-fold cross-validation method is used to assess the fitness of the beetles. In the quintuple cross-validation method, the training samples are also divided into 5 sample subsets, among which 4 sample subsets are used to train the KELM model, and the remaining subset is used to test the KELM model. Each subset can be used as a test subset in turn. Then, the total accuracy of the diagnosis (A_{i}^{{}}) 5 subsets of samples can be obtained as follows:

$$A_{i} = frac{{N_{correct,i} }}{{N_{total} }}$$

(6)

The physical form of Ihe beetle is defined as follows:

$$f({mathbf{x}}_{i} ) = 1 – A_{i}$$

(seven)

Step 3: :

Get the right and left side search behaviors.

In order to model the search behavior, a random search direction of beetles can be described as follows,

$${vec{mathbf{b}}} = frac{{{mathbf{rv}}left( {k,1} right)}}{{eps + left| {{mathbf{rv}}left( {k,1} right)} right|}}$$

(8)

where ({mathbf{rv}}left( {m,1} right)) is a m-dimensional vector with random values ​​between -1 and 1, m is the dimensions of the position, here, m is set to 2, and (eps = 2^{{{ – }52}}).

The activities of beetle antennae are mimicked by right-sided and left-sided searching behaviors, which are expressed as:

$$left{ begin{gathered} {mathbf{x}}_{i,r} = {mathbf{x}}_{i}^{t} + d^{t} {vec{ mathbf{b}}} hfill {mathbf{x}}_{i,l} = {mathbf{x}}_{i}^{t} – d^{t} {vec{ mathbf{b}}} hfill end{gathered} right.$$

(9)

where ({mathbf{x}}_{i,r}) is the position that is in the search area of ​​the Ithe right side of the beetle, ({mathbf{x}}_{i,l}) is the position that is in the search area of ​​the Ithe left side of the beetle, and (d^{t}) is the detection length of the antennas corresponding to the operating capability at youth moment.

Step 4: :

Update beetle positions.

The iterative model is generated as Eq. (9) to be associated with odor detection by considering seeking behavior,

$${mathbf{x}}_{i}^{t} = {mathbf{x}}_{i}^{t – 1} + delta^{t} {vec{mathbf{b }}}signleft( {fleft( {{mathbf{x}}_{i,r} } right) – fleft( {{mathbf{x}}_{i,l} } right)} right)$$

(ten)

where (delta^{t}) is the search step, and (signleft( cdot right)) is a sign function.

Step 5::

Compare the physical condition of ({mathbf{x}}_{i}^{t}) with the fitness of the current best position of the Ie beetle, if (fleft( {{mathbf{x}}_{i}^{t} } right) then (f_{best} = fleft( {{mathbf{x}}_{i}^{t} } right)), ({mathbf{x}}_{i,best} = {mathbf{x}}_{i}^{t})where ({mathbf{x}}_{i,best}) is the current best position of the Ie beetle, and (f_{i,best}) is the fitness of the current best position of the Ie beetle.

Step 6::

Update the length of the antennas (D) and step size (delta) as following,

$$d^{t} = 0.95d^{{t{ – }1}} + r0$$

(11)

$$delta^{t} = 0.95delta^{t – 1}$$

(12)

where (r0) is the constant.

Step 7::

Repeat steps 2 through 6 until the stop condition is met.

Step 8::

Obtain the best position of the best fitness among all the beetles, which are the optimal parameters of KELM.

### Bearing failure diagnosis process based on GLCM and KELM based on MBASA

The bearing fault diagnosis process based on GLCM and KELM based on MBASA is shown in Fig. GLCM. Then, based on the training samples, create the fitness function and optimize the regularization parameter and the kernel parameter of KELM using MBASA. Moreover, establish the MBASA-KELM model by the optimized regularization parameter and the kernel parameter of KELM. Finally, test the proposed diagnostic model.