Well well well, guess who’s back!😎

Vin Shady
In a galaxy far, far away, we once chatted about how to use numpy to run an FFT and squeeze out a spectrum and phase from a stock market signal.
Let me refresh your memory: the goal is to predict the daily price of the S&P 500 for the next 30 days using Fourier theory.

I'm not gonna bore you again with the whole speech about the working assumptions—go check out Part 1 if you missed it. 👌
In this piece, I'm diving into the actual prediction game. To do that, I’ll walk you through two radically different methods:

  • amplitude processing
  • frequency-based processing

But first, let me remind you that a spectrum is the frequency-domain representation of a time-domain signal. It's basically like breaking down your signal into a bunch of sine waves wiggling at specific frequencies. For us, that means identifying the cyclical components.
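If that sounds abstract, here's a two-line sanity check on a toy signal (made-up 5 Hz and 12 Hz cycles, nothing to do with the S&P yet): feed NumPy a sum of two sines and the amplitude spectrum points straight at them.

```python
import numpy as np

# Toy signal: two sine waves (5 Hz and 12 Hz), sampled at 100 Hz for 2 seconds
fs = 100
t = np.arange(0, 2, 1 / fs)
signal = 3 * np.sin(2 * np.pi * 5 * t) + np.sin(2 * np.pi * 12 * t)

spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(signal.size, d=1 / fs)
amplitudes = np.abs(spectrum) / signal.size

# The two strongest positive-frequency peaks are exactly our two cycles
pos = freqs > 0
top2 = np.sort(freqs[pos][np.argsort(amplitudes[pos])[-2:]])
print(top2)  # [ 5. 12.]
```

Swap the toy sines for returns and you get exactly the spectrum from Part 1.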

 

Signal Processing by Amplitude

The first assumption to make here – yes, again, deal with it – is that my stock market signal is going to be determined by the most powerful/important cycles.
To express that, we use Fourier to study the cycles (see: frequency decomposition). Then to capture the idea of “powerful,” we use a distance measure, with the most natural one being absolute value. And – lucky us – we’ve already calculated the absolute value of each frequency component relative to 0 – that’s the amplitude 😉. So, our amplitude spectrum is basically a neat little picture of how important each frequency is to our signal.

Next up, since we only care about the strongest ones, we’ve gotta nuke the weaker ones – that’s just how nature works, bro. It's called "filtering".😎

And here's where it gets wild – there’s a crap-ton of ways to do this:

  • Hard thresholding: you set a threshold and everything below it gets zeroed like your ex’s memory of your existence 👀
  • Adaptive thresholding: set a threshold, and stuff below it gets weakened based on how far it is from the cutoff
  • Top percentage filtering: you keep the top N% amplitudes and ghost the rest
  • Statistical thresholding: you filter using the standard deviation of the amplitudes, or whatever fancy stat makes you feel smart
  • Bayesian thresholding: keep the coefficients that boost a particular metric – usually SNR, error rate, or forecast accuracy. Basically, cook up whatever nerd recipe you like
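To make the first and third of those concrete, here's a minimal sketch on a dummy spectrum (random noise stands in for the real thing; the cutoff of 15 and the 10% are arbitrary illustration values, not recommendations):

```python
import numpy as np

rng = np.random.default_rng(0)
spectrum = np.fft.fft(rng.standard_normal(256))  # stand-in for a real spectrum
amps = np.abs(spectrum)

# Hard thresholding: everything below a fixed cutoff gets zeroed
cutoff = 15.0
hard = spectrum * (amps > cutoff)

# Top-N% filtering: keep only the strongest 10% of coefficients
n_keep = int(0.10 * spectrum.size)  # 25 coefficients
mask = np.zeros(spectrum.size, dtype=bool)
mask[np.argsort(amps)[-n_keep:]] = True
top_pct = spectrum * mask

print("kept (hard):", np.count_nonzero(hard))
print("kept (top 10%):", np.count_nonzero(top_pct))
```

Note the difference: hard thresholding doesn't control how many coefficients survive, top-N% does.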
 

Amplitude Filtering

You probably guessed it already: in this article we're going with the simplest method, hard thresholding.
Why?
Because this goddamn article already took me 3 weeks to write and I’d like to move on with my life, thanks!
cough cough... What I meant to say is that the idea is to arbitrarily pick a threshold and see if the filtering works. Yeah, not exactly a dream scenario when you put it like that.😑

To help us out, we can look at the number of frequencies kept relative to the total (in %), which is basically the N% method. But honestly, we're not going to bust our asses over this right now because I’m about to filter with over 200 different thresholds anyway... so here’s the code to give you a feel for the general idea:

import numpy as np

indices = np.abs(spectrum) > 1200  # filter by an arbitrary threshold
cleaned_spectrum = spectrum * indices  # apply the mask
cleaned_returns = np.real(np.fft.ifft(cleaned_spectrum))  # back to the time domain

print("number of frequencies kept:", np.sum(indices))
print("fraction of all frequencies:", np.sum(indices) / len(indices))
 
We can then compare the classic spectrum to the filtered one:
Spectrum FFT
Filtered Spectrum FFT

You can clearly see that components with amplitude under 1200 are now zeroed out... Why I’m even typing this out, I don’t know. It’s like narrating a porno—unnecessary and vaguely uncomfortable.

Alright alright, VAR check please 🟨 —maybe I went a bit too hard on that one.

Anyway, just to soothe our nerd conscience, we’ll go back into the time domain to display the reconstruction alongside the original signal. Obviously, we’re expecting something that only keeps the key cycles, and should be pretty close to the original...

signal and cleaned signal

And then—bam—horror. The "clean" signal looks nothing like the original. Well... not completely nothing like it.😱
Direct your refined reader’s eye to the general movement of both curves: they're kinda similar, but the filtered one has way less intense variation. For those of you itching to debate it, feel free to go wild in the comments.

Here’s how I see it:

  • We know the variance of our returns changes over time: heteroskedasticity, for those of you with a stick up your stats.
  • We know that to make a big-ass wave, you need to pile on smaller ones — that's the poetic imagery for the day.
  • So, to get big, abrupt movements, you need all those small-amplitude components.
  • But we filtered them out!
  • So now we’re stuck with a curve that doesn’t move abruptly.

Obviously, other factors are at play too—like the phase of small amplitudes, or the non-linearity of thresholding versus the linear nature of the transform, which can introduce distortions.

The point of all this rambling? It’s totally normal that our filtered curve doesn’t look like the original signal! And, more importantly, it means that our predictions—yes, we’re finally getting there—probably won’t include the big price moves they should. So we need to account for the real market explosiveness that the model fails to capture.
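Don't take my word for it. Here's a toy demonstration (a lone spike, not market data) of why amplitude thresholding eats abrupt moves: a spike spreads its energy across every frequency, so each individual coefficient looks weak and gets nuked.

```python
import numpy as np

# One sharp spike in an otherwise flat signal
n = 256
spike = np.zeros(n)
spike[100] = 10.0

# Its FFT spreads the energy evenly: every bin has the same modest amplitude
coeffs = np.fft.fft(spike)
print(np.abs(coeffs).min(), np.abs(coeffs).max())  # both ~10: uniformly "weak"

# Hard-threshold just above that level and the spike disappears completely
cleaned = np.real(np.fft.ifft(coeffs * (np.abs(coeffs) > 10.5)))
print(np.abs(cleaned).max())  # 0.0 — the abrupt move is gone
```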

And that’s how I justify my compulsive overanalysis of literally everything.😏

 

Predictions

And now we get to my favorite part—and yours too, admit it: prediction via Fourier.😎
The magic of Fourier prediction lies in the reconstruction of the signal from its frequency components. Here's the formula you’re going to tattoo on your brain till the end of time:

$f(x) = \int_{\mathbb{R}^n} \hat{f}(\xi)\, e^{2i\pi \xi \cdot x} \, d\xi$

We’re going to mindlessly follow the formula to carry out the reconstruction:

  1. Normalize the amplitude of each harmonic and recover the phase
  2. Compute angular frequencies ($2 \times \pi \times \xi$)
  3. Create an extended time axis to also include future data
  4. Build the cosine components: each frequency gets a cosine term scaled by its amplitude. We use cosine because we only care about the real part of the signal—go ahead, cry a bit for the lost phase that we will never analyze.
  5. Sum the components: by adding up all the harmonics, we mimic the integral and reconstruct the signal.

Here’s how that looks in code:

import numpy as np

def fft_forecast(signal, forecast_horizon, threshold, sampling_rate=1/7):
    """
    Extend and forecast a time series using its filtered FFT components
    Parameters
    ----------
    signal : ndarray
        Original time series data.

    forecast_horizon : int
        Number of future points to predict.
    threshold : float
        Minimum magnitude for FFT coefficients to be retained.
    sampling_rate : float, optional
        Samples per unit time (default is 1/7, e.g., weekly data).
    Returns
    -------
    extended_signal : ndarray
        Combined original and forecasted time series of length len(signal) + forecast_horizon.
    """
    num_samples = signal.size # Number of original samples
    time_step = 1 / sampling_rate # Time step between samples

    # Compute the FFT of the original signal
    fft_coeffs = np.fft.fft(signal)
    # Compute corresponding frequencies
    freqs = np.fft.fftfreq(num_samples, d=time_step)

    # Filter coefficients by magnitude threshold
    mask = np.abs(fft_coeffs) > threshold
    filtered_coeffs = fft_coeffs * mask

    # Compute amplitude and phase for each retained component (1 & 2)
    amplitudes = np.abs(filtered_coeffs) / num_samples
    phases = np.angle(filtered_coeffs)
    angular_freqs = 2 * np.pi * freqs

    # Build time array for original + forecast
    total_length = num_samples + forecast_horizon # Total length including forecast
    t = np.arange(total_length) * time_step
    
    # Reconstruct and forecast by summing cosine waves
    extended_signal = np.zeros(total_length)
    for k in np.where(mask)[0]:
        extended_signal += amplitudes[k] * np.cos(angular_freqs[k] * t + phases[k])
    return extended_signal

 

But why does this work?

The secret sauce here is the cosine function, which is infinite and therefore extendable to infinity. That allows us to stretch the time axis beyond the original span of the signal to include future forecasts. In other words, thanks to this sexy little function, we can predict way past the data’s bedtime.👍
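A quick sanity check on that claim, boiled down to one harmonic (same recipe as fft_forecast, toy cosine instead of market data): fit amplitude, phase, and frequency from the FFT, then evaluate past the observed window. For a perfectly periodic signal, the "future" comes out exact.

```python
import numpy as np

# A pure cosine with period 8, observed for 64 samples
n = 64
t_obs = np.arange(n)
signal = np.cos(2 * np.pi * t_obs / 8)

# Recover amplitude, phase, and frequency of the dominant bin from the FFT
coeffs = np.fft.fft(signal)
k = np.argmax(np.abs(coeffs[: n // 2]))  # dominant positive-frequency bin
amp = 2 * np.abs(coeffs[k]) / n          # factor 2: positive + negative bins
phase = np.angle(coeffs[k])
freq = k / n

# Evaluate the same cosine on FUTURE times — nothing stops t from exceeding n
t_future = np.arange(n, n + 16)
forecast = amp * np.cos(2 * np.pi * freq * t_future + phase)
truth = np.cos(2 * np.pi * t_future / 8)
print(np.max(np.abs(forecast - truth)))  # ~0: perfect extension of a periodic signal
```

The catch, of course, is that the market is not perfectly periodic — which is exactly why the forecasts need the scrutiny that follows.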

 Amplitude filtered forecasts

And voilà, here are my predictions! Are they good? Are they garbage? No idea—but challenge complete: I am predicting over a month of the S&P 500 (7 weeks × 5 trading days = 35 points). Suck it, linear regression.

 

So what if we pushed things a bit further?

Alright, I’ve got my predictions—should I stop there? HELL no. The idea is to poke at them a little, see if they even make sense. What if I’d picked a threshold of 1000? 1100? Something else? How the hell do I know if I’m right?

Exactly—I don’t. Just like in life, why bother knocking one door at a time when you could just blast the whole neighborhood with a megaphone? Let’s throw all the thresholds in the mix at once and combine them into one beautiful, precise result. Why settle for a single guess when you can have them all?
start = 100
end = 2000
step = 50

# Create an array of thresholds
th = np.arange(start, end + step, step)

# Run the reconstruction for each threshold
forecast_array = np.array([fft_forecast(x, horizon, i)[-horizon:] for i in th])

# Put the cumulative returns back on the price scale
forecast_price = close.iloc[-1] + np.cumsum(forecast_array.T, axis=0)
 
And with a sexy little visualization—look at that delicious 5% confidence interval, baby:
Amplitude filtered forecasts map

And aside from looking cute, we’ve done one more important thing: we’ve added probability to the prediction space spat out by our model—and that’s straight-up beautiful. Now we know where the price will be, and exactly what the odds are. Which usually isn’t easy to get, since our filtering method is nonlinear—so no neat Gaussian blob like in those crusty old autoregressive models.
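For the record, that "probability layer" is nothing exotic: at each future date you just read empirical quantiles across the threshold ensemble. A sketch with fake numbers standing in for forecast_price (same shape as above: horizon × number of thresholds):

```python
import numpy as np

# Fake stand-in for forecast_price: 35 forecast days x 39 thresholds
rng = np.random.default_rng(1)
horizon, n_thresholds = 35, 39
forecast_price = 5000 + rng.standard_normal((horizon, n_thresholds)).cumsum(axis=0)

# Empirical band across thresholds at each date: 5th / 50th / 95th percentiles
low, median, high = np.percentile(forecast_price, [5, 50, 95], axis=1)

print(low.shape)  # (35,): one band value per forecasted day
print(np.all((low <= median) & (median <= high)))  # True
```

Each column of the band is the model's "where will the price be, and with what odds" answer for that day.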

Let me remind you: the model might be total crap. But I’m giving you the method first, so you can apply it to your own projects.

With this, it’s not hard to factor in transaction fees and consider every possible order to get a future expectation map directly on your trades—and yeah I’m going a bit deep for this article, but we’ll get there 😉. And here we’ve got real info, ready to be weaponized in a trading strategy.

 

Intermission 2

Yeah yeah, I know—it’s a long-ass movie! Complain to the director on your way out.
Hey you in the back! No leaving popcorn in the seats!

Alright. Movie theater usher roleplay—check!😁

In this part, we saw how to make predictions under the assumption that only the big important cycles matter for price direction. To do that, we:

  1. ran an FFT (see article 1)
  2. filtered the amplitude using a hard threshold method
  3. reconstructed the signal using the FFT formula
  4. created a probability layer over the predictions based on the model

The idea now is to use the same method, but filter by frequency instead. The steps are basically the same, but the narrative and the results take a new twist.

See you again soon—bring snacks that don’t crunch so damn loud next time.✋

 

Don’t hesitate to comment, share, and most importantly, code!  
I wish you an excellent day and lots of success in your trading projects!  
Big kiss, and see you very soon! ✌️