N-Rho To Noise is a ratio built from two components. Rho is my own calculation: a signal that is differenced (to force the time series toward stationarity, which allows for more predictability), expressed relative to one unit of a noise measure. N is the number of times the signal is differenced. A simplified Q-learning reinforcement learning agent calibrates the length of the ratio to its optimal value.
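To make the idea of differencing N times concrete, here is a minimal Python sketch. The exact Rho calculation is not spelled out above, so `rho_to_noise` is only a hypothetical stand-in: it assumes the noise unit is the standard deviation of the last `length` differenced values. The function names and that noise definition are assumptions for illustration, not the indicator's actual formula.

```python
from statistics import pstdev


def difference(series, n):
    """Apply first differencing n times (each pass shortens the series by one)."""
    out = list(series)
    for _ in range(n):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def rho_to_noise(series, n, length):
    """Hypothetical stand-in for the ratio: latest differenced value / noise.

    Assumption: noise is the population standard deviation of the last
    `length` differenced values. The real Rho calculation may differ.
    """
    window = difference(series, n)[-length:]
    if not window:
        return 0.0  # not enough data left after differencing
    noise = pstdev(window)
    if noise == 0:
        return 0.0  # flat window: avoid dividing by zero
    return window[-1] / noise
```

For example, `difference([1, 4, 9, 16, 25], 2)` gives `[2, 2, 2]`: the quadratic trend is removed after two passes, which is the kind of stationarity that differencing aims for.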
- Purple indicates the undifferenced signal is above the RMSE error bands
- Red indicates both the differenced and undifferenced signals are above the threshold for a strong positive deviation, suggesting a short
- Blue indicates the undifferenced signal is below the RMSE error bands
- Green indicates both the differenced and undifferenced signals are below the threshold for a strong negative deviation, suggesting a long
- Strong long signal: the undifferenced Rho and the differenced Rho agree locally (a blue bar followed by a green bar)
- Strong short signal: the undifferenced Rho and the differenced Rho give the same signal (a purple bar followed by a red bar)
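The color rules and the two-bar confirmation can be sketched in Python roughly as follows. This is an illustration only: `band` and `threshold` are hypothetical parameters standing in for the RMSE error band and the strong-deviation threshold, the bands are assumed to be symmetric around zero, and giving red/green priority over purple/blue is also an assumption.

```python
def bar_color(undiff, diff, band, threshold):
    """Classify one bar from the undifferenced and differenced values.

    Assumptions: symmetric bands (+band / -band, +threshold / -threshold),
    and the stronger red/green condition is checked first.
    """
    if undiff > threshold and diff > threshold:
        return "red"     # both strongly positive: suggests a short
    if undiff < -threshold and diff < -threshold:
        return "green"   # both strongly negative: suggests a long
    if undiff > band:
        return "purple"  # undifferenced signal above the error band
    if undiff < -band:
        return "blue"    # undifferenced signal below the error band
    return None          # no colored bar


def strong_signals(colors):
    """Flag consecutive bar pairs: blue then green = long, purple then red = short."""
    signals = []
    for prev, cur in zip(colors, colors[1:]):
        if prev == "blue" and cur == "green":
            signals.append("long")
        elif prev == "purple" and cur == "red":
            signals.append("short")
        else:
            signals.append(None)
    return signals
```

Each entry returned by `strong_signals` corresponds to a pair of consecutive bars, so the output is one element shorter than the input list of colors.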
- Optimal length: the length value that the model selects as the best
- Optimal reward: the reward corresponding to the optimal length (green=strong value, orange=intermediate strength, red=poor)
- Average reward: the average reward of the set of lengths used over all episodes (green=strong value, orange=intermediate strength, red=poor)
- Cumulative reward: the sum of all the rewards
- Variance: a measure of how varied the data is (too much variance can suggest the model will not generalize well to unseen data)
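As a rough illustration of how these statistics could come out of a simplified Q-learning loop, here is a Python sketch. It treats each candidate length as an action in a stateless, epsilon-greedy Q-learning setup. The reward function, the learning rate `alpha`, the exploration rate `epsilon`, reporting the learned Q-value as the optimal reward, and computing variance over the collected rewards are all assumptions made for the example, not the indicator's actual settings.

```python
import random
from statistics import mean, pvariance


def calibrate_length(lengths, reward_fn, episodes=500, alpha=0.1, epsilon=0.1, seed=0):
    """Stateless epsilon-greedy Q-learning over candidate lengths (a sketch)."""
    rng = random.Random(seed)
    q = {length: 0.0 for length in lengths}  # one Q-value per candidate length
    rewards = []
    for _ in range(episodes):
        if rng.random() < epsilon:
            length = rng.choice(lengths)      # explore a random length
        else:
            length = max(q, key=q.get)        # exploit the current best length
        r = reward_fn(length)
        q[length] += alpha * (r - q[length])  # move the estimate toward the reward
        rewards.append(r)
    best = max(q, key=q.get)
    return {
        "optimal_length": best,
        "optimal_reward": q[best],            # assumption: the learned Q-value
        "average_reward": mean(rewards),
        "cumulative_reward": sum(rewards),
        "variance": pvariance(rewards),       # assumption: variance of rewards
    }


# Usage with a made-up reward that peaks at a length of 20.
stats = calibrate_length([10, 20, 30], lambda length: -abs(length - 20) / 20)
```

With this toy reward, the agent settles on a length of 20 because it is the only candidate whose reward is not negative. A real reward would have to be derived from how well the ratio performs at each length.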