Fast Fourier Transform

Diving Transformers World

A.I Hub
8 min readJul 17, 2024

In this article, we walk through the concept of fast Fourier transform (FFT) derived features, built on a Kaggle acoustic dataset. Along the way, we also cover features derived from aggregate functions, and we finish with the fundamentals of the Hilbert transform and the Hann window.

Table of Contents

  • FFT derived features
  • Features derived from aggregate functions
  • Features derived using the Hilbert transform and Hann window

FFT Derived Features


One of the model's features is the fast Fourier transform (FFT) applied to the entire segment. It is not used directly as a feature but as a basis for calculating multiple aggregation functions. The FFT is a fast implementation of the discrete Fourier transform.

We use the NumPy implementation of the FFT for one-dimensional arrays (np.fft.fft), which is extremely fast: NumPy builds on BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra PACKage), two libraries that provide routines for basic vector and matrix operations and for solving linear algebra equations. The output of the function is a one-dimensional array of complex values. We then extract the vectors of real and imaginary parts from this array and calculate the following features.

  1. Extract the real and imaginary parts of the FFT. This is the first step in further processing the fast Fourier transform of the acoustic signal.
  2. Calculate the mean, standard deviation, min and max for both the real and imaginary parts of the FFT, using the separated parts from the previous step.
  3. Calculate the mean, standard deviation, min and max for both the real and imaginary parts over the last 5K and 15K data points of the FFT vector.

The code to create the file segment, as well as the FFT and the FFT-derived features, is given here. First, we calculate the FFT for the subset of the acoustic data. Then we extract the real and imaginary parts of the FFT. From the real component we calculate, using pandas aggregate functions, the mean, standard deviation, max and min values; we then calculate the same values from the imaginary part of the FFT signal.
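The original code listing is not reproduced in this excerpt; the following is a minimal sketch of these FFT-derived aggregates, where the feature names and tail sizes follow the description above but are otherwise illustrative.

```python
import numpy as np

def fft_features(segment):
    """FFT-derived aggregates for one acoustic segment (names are illustrative)."""
    z = np.fft.fft(segment)               # complex one-dimensional FFT
    real, imag = np.real(z), np.imag(z)
    feats = {
        "Rmean": real.mean(), "Rstd": real.std(),
        "Rmax": real.max(),   "Rmin": real.min(),
        "Imean": imag.mean(), "Istd": imag.std(),
        "Imax": imag.max(),   "Imin": imag.min(),
    }
    # the same aggregates restricted to the last 5K and 15K FFT values
    for n in (5000, 15000):
        tail = real[-n:]
        feats[f"Rmean_last_{n}"] = tail.mean()
        feats[f"Rstd_last_{n}"] = tail.std()
        feats[f"Rmax_last_{n}"] = tail.max()
        feats[f"Rmin_last_{n}"] = tail.min()
    return feats
```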

We then follow with calculating features derived from various aggregated
functions.

Features Derived From Aggregate Functions

Image owned by jovian

The mean, standard deviation, max and min applied to the entire segment are calculated using the pandas aggregate functions mean, std, max and min.
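A minimal sketch of these whole-segment aggregates; the column name "acoustic_data" follows the Kaggle acoustic dataset convention and is an assumption here.

```python
import pandas as pd

# toy "segment" standing in for one slice of the acoustic data
seg = pd.DataFrame({"acoustic_data": [3, -1, 4, 1, -5, 9]})
x = seg["acoustic_data"]

features = {
    "ave": x.mean(),  # mean of the entire segment
    "std": x.std(),   # standard deviation
    "max": x.max(),   # maximum value
    "min": x.min(),   # minimum value
}
```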

We continue by computing additional aggregated features. Our model will include various signal processing techniques, as you will notice, and by measuring feature importance after training the baseline model, we will identify which features contribute most to the model's predictions.

Moving on, we calculate the mean change for the entire segment; here, "segment" refers to the original subset of acoustic data. The change is calculated with the NumPy function diff with parameter n=1, which takes an array of values and computes the difference between each pair of successive values. We then take the average of this array of differences. We also calculate the mean change rate for the entire acoustic data segment, the average of the non-zero successive changes divided by the original values in the data segment. The code for these features is as follows.
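A sketch of the two features described above; the exact filtering in the change-rate variant is an assumption modeled on a common Kaggle formulation.

```python
import numpy as np

def mean_change(x):
    """Average of successive differences over the segment."""
    return np.mean(np.diff(x))

def mean_change_rate(x):
    """Average relative change: successive diffs divided by the original
    values, keeping only non-zero, finite entries (assumed definition)."""
    x = np.asarray(x, dtype=float)
    rate = np.diff(x) / x[:-1]
    rate = rate[np.nonzero(rate)[0]]
    rate = rate[np.isfinite(rate)]
    return np.mean(rate) if rate.size else 0.0
```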

Additionally, we calculate the maximum and minimum of the absolute values over the entire segment: after taking absolute values, we compute the minimum and maximum. We aim to include a diverse range of features, to capture as much of the signal's patterns as possible when aggregating the temporal signal. The code for this follows.
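A minimal sketch of the absolute-value extremes for one segment.

```python
import numpy as np

x = np.array([3, -7, 2, -1, 5])  # toy acoustic segment
abs_x = np.abs(x)
abs_max = abs_x.max()  # largest oscillation amplitude
abs_min = abs_x.min()  # smallest absolute value
```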

A set of aggregate functions over the first and last 10K and 50K values of the acoustic data segment can also be calculated, as follows.

  • Standard deviation for first 50K and last 10K values per acoustic data
    segment.
  • Average value for first 50K and last 10K values per acoustic data
    segment.
  • Minimum values for first 50K and last 10K values per acoustic data
    segment.
  • Maximum values for first 50K and last 10K values per acoustic data
    segment.

These features aggregate a smaller part of the signal and will therefore capture signal characteristics from only a smaller interval before the failure. Combining aggregated features over the whole signal length with those over a smaller part of the signal adds more information about the signal. The code for these features follows.
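A sketch covering the windowed aggregates above; computing all four statistics on both the first and last 10K and 50K values (the feature names are illustrative).

```python
import numpy as np

def windowed_aggregates(x, sizes=(10000, 50000)):
    """std / mean / min / max over the first and last N values of a segment."""
    x = np.asarray(x, dtype=float)
    feats = {}
    for n in sizes:
        for name, part in (("first", x[:n]), ("last", x[-n:])):
            feats[f"std_{name}_{n}"] = part.std()
            feats[f"mean_{name}_{n}"] = part.mean()
            feats[f"min_{name}_{n}"] = part.min()
            feats[f"max_{name}_{n}"] = part.max()
    return feats
```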

Next, we include the ratio of the maximum to the minimum value for the entire acoustic data segment, and the difference between the maximum and minimum values. We also add the number of values exceeding an oscillation amplitude of 500 units, and the sum of all values in the segment. With this diversity of engineered features we try to capture some of the hidden patterns in the signal; in particular, these features include information from the extreme oscillations in the signal.
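A sketch of these extreme-oscillation features; dividing by the absolute value of the minimum in the ratio, and counting by absolute amplitude, are assumptions modeled on common Kaggle practice.

```python
import numpy as np

x = np.array([600, -200, 30, 510, -700, 100])  # toy segment

max_to_min = x.max() / np.abs(x.min())  # ratio of extremes (abs of min assumed)
max_min_diff = x.max() - x.min()        # spread between extremes
count_big = len(x[np.abs(x) > 500])     # values beyond 500 units of amplitude
segment_sum = x.sum()                   # sum of all values in the segment
```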

We continue to add diverse aggregated features that try to capture various characteristics of the original signal. We further calculate the mean change rate, excluding nulls, for the first 10K and last 50K data points of each acoustic data segment. Some of these features exclude the elements that are 0, to ensure only non-zero values enter the calculation of the aggregate function. The code using the nonzero function follows.
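A small illustration of how np.nonzero restricts an aggregate to non-zero values only.

```python
import numpy as np

x = np.array([0.0, 2.0, 0.0, -3.0, 5.0])

# np.nonzero returns the indices of the non-zero elements,
# so indexing with it drops the zeros before aggregating
nz = x[np.nonzero(x)[0]]
mean_nonzero = nz.mean()  # aggregate over non-zero values only
```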

A set of engineered features involves quantiles, specifically the 1%, 5%, 95% and 99% quantile values for the entire acoustic data segment, calculated using the NumPy quantile function. A quantile is a statistical term that refers to dividing a dataset into intervals of equal probability. For example, the 75% quantile is the point below which 75% of the data falls; the 50% quantile, below which 50% of the data falls, is also called the median. We also add the same quantiles computed on the absolute values. See the code for the calculation of these features.
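A sketch of the quantile features on a toy segment, using np.quantile for both the raw and absolute values.

```python
import numpy as np

x = np.arange(1, 101, dtype=float)  # toy "segment": values 1..100

# 1%, 5%, 95% and 99% quantiles of the signal and of its absolute values
q = {p: np.quantile(x, p) for p in (0.01, 0.05, 0.95, 0.99)}
q_abs = {p: np.quantile(np.abs(x), p) for p in (0.01, 0.05, 0.95, 0.99)}
```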

Another type of engineered feature is trend values, calculated with the add_trend_values function with the absolute flag off. Trend values capture the general direction in which the acoustic data signal is changing. For a signal oscillating around 0 at high frequency, the trend captures the change in the average value of the signal.

We also add absolute trend values, calculated with the add_trend_values function with the absolute flag on. We include this engineered feature to capture patterns that appear in the absolute value of the signal: here the trend is computed on the absolute values of the original signal, so it captures the direction of variation of the signal's absolute value. The corresponding code is given here.
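The add_trend_values helper itself is not shown in this excerpt; the following is a sketch of what such a helper might look like, using a least-squares line fit to measure the trend (the exact implementation and signature are assumptions).

```python
import numpy as np

def add_trend_values(x, abs_values=False):
    """Slope of a degree-1 least-squares fit to the signal; with
    abs_values=True, the trend of the absolute signal is measured.
    Hypothetical reimplementation of the article's helper."""
    x = np.abs(x) if abs_values else np.asarray(x, dtype=float)
    idx = np.arange(len(x))
    slope, _intercept = np.polyfit(idx, x, 1)  # linear fit -> trend slope
    return slope
```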

Next, we include the mean and standard deviation of the absolute values. We also calculate the median absolute deviation (mad), kurtosis, skewness (skew) and median, using pandas and SciPy implementations. The median absolute deviation is a robust measure of the variability of a univariate sample of quantitative data. Kurtosis measures the combined weight of a distribution's tails relative to its center. Skewness measures the asymmetry, or distortion from symmetry, of a distribution. The median, as we have already observed, is the value separating the higher half of the data from the lower half. All these aggregate functions capture complementary information about the signal. The code for the calculation of these aggregation functions is shown here.
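A sketch of these statistics on a toy segment; the median absolute deviation is computed directly from its definition, and kurtosis/skewness come from scipy.stats.

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 2.0, 3.0, 14.0])  # toy segment with a heavy right tail

median = np.median(x)                   # middle value of the data
mad = np.median(np.abs(x - median))     # median absolute deviation (robust spread)
kurt = stats.kurtosis(x)                # tail weight relative to the center
skew = stats.skew(x)                    # asymmetry of the distribution
```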

Next, we include several features calculated by using transformation
functions specific to signal processing.

Features Derived Using the Hilbert Transform and Hann Window


We also calculate the Hilbert mean. We apply the Hilbert transform to the acoustic signal segment using the scipy.signal.hilbert function, which computes the analytic signal via the Hilbert transform. We then take the mean of the absolute value of the transformed data. The Hilbert transform is used frequently in signal processing and captures important information about the signal. Because we use aggregation functions to generate features from our temporal data, we want to include a large, diverse range of existing signal processing techniques, adding complementary views of the signal for training the model.
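A sketch of the Hilbert mean: the magnitude of the analytic signal is the signal's envelope, and we average it over the segment.

```python
import numpy as np
from scipy.signal import hilbert

# toy segment: a pure cosine with 5 full cycles, unit amplitude
t = np.linspace(0, 1, 1000, endpoint=False)
x = np.cos(2 * np.pi * 5 * t)

analytic = hilbert(x)                   # analytic signal via the Hilbert transform
hilbert_mean = np.abs(analytic).mean()  # mean envelope amplitude of the segment
```

For a unit-amplitude cosine, the envelope is constant at 1, so the Hilbert mean recovers the oscillation's amplitude.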

Next, we include a feature derived from the Hann window mean value. We use the Hann window to reduce the abrupt discontinuities at the edges of the signal. The Hann window mean value is calculated by convolving the original signal with the Hann window and dividing by the sum of all values in the window.
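A sketch of the Hann window mean; the window width of 150 samples is an assumption.

```python
import numpy as np

def hann_window_mean(x, width=150):
    """Mean of the signal smoothed by a Hann window: convolve with the
    window, normalize by the window's sum, then average (width assumed)."""
    w = np.hanning(width)
    smoothed = np.convolve(x, w, mode="same") / w.sum()
    return smoothed.mean()
```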

We previously introduced the definition of the classical STA/LTA (short-term average over long-term average). We calculate a few features, such as the classical STA/LTA mean for 500-10K, 5K-100K, 3,333-6,666 and 10K-25K STA/LTA windows, using the STA/LTA function introduced previously. We include a variety of transformations to try to capture diverse signal characteristics in the aggregated engineered features.
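The STA/LTA function referenced above is not reproduced in this excerpt; the following sketch follows the widely used cumulative-sum formulation of the classic short-term/long-term average ratio on the squared signal (an assumption about the exact variant used).

```python
import numpy as np

def classic_sta_lta(x, length_sta, length_lta):
    """Classic STA/LTA: ratio of short-term to long-term moving averages
    of the squared signal, a standard seismology event detector."""
    x = np.asarray(x, dtype=float)
    csum = np.cumsum(x ** 2)
    sta = csum.copy()
    sta[length_sta:] -= csum[:-length_sta]
    sta /= length_sta                      # short-term average energy
    lta = csum.copy()
    lta[length_lta:] -= csum[:-length_lta]
    lta /= length_lta                      # long-term average energy
    sta[:length_lta - 1] = 0               # undefined until the LTA window fills
    lta[lta < np.finfo(float).tiny] = np.finfo(float).tiny  # avoid divide-by-zero
    return sta / lta
```

A feature is then an aggregate of this ratio, e.g. `classic_sta_lta(x, 500, 10000).mean()` for the 500-10K window pair.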

Finally, we will also calculate features based on moving averages.

Conclusion

In this article, we covered the fundamentals of FFT-derived features, aggregate functions and Hilbert/Hann derived features, and saw how they feed the machine learning model built earlier in this guide. The tactics discussed here help you clarify and improve your models. Although we did not explore transformers themselves, these techniques are enough to build a descriptive set of derived features using a range of applicable methods.
