- #1
kmrstats
Hi -
First timer here. Excuse me if this question is not up to the level I see posted on this forum, but here goes.
I have been asked to provide a daily signal generated from the number of occurrences of a set of specified phrases in a news data feed. The first thing I did was compute a moving average of the daily count of each phrase in the feed and generate a signal whenever the current count was above the moving average by a specified percentage. With this approach I didn't think the signal provided much value, because the phrase counts are very bursty. The count can be in the low teens for a number of days in a row, then jump to 100 for a couple of days, and then settle back into the low teens.
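To make the approach concrete, here is a minimal sketch of the moving-average rule described above. The function name, the 7-day window, the 50% threshold, and the sample counts are all assumptions chosen for illustration, not the actual settings or data; it also assumes the average is taken over the preceding days only, excluding the current day.

```python
from statistics import mean

def moving_average_signals(counts, window=7, pct=0.5):
    """Return one bool per day: True when that day's count exceeds the
    trailing moving average by more than `pct` (0.5 means 50%).

    Assumption: the average covers the `window` days before the current
    day, and no signal is emitted until a full window is available.
    """
    signals = []
    for day, count in enumerate(counts):
        history = counts[max(0, day - window):day]
        if len(history) < window:
            signals.append(False)  # not enough history yet
            continue
        signals.append(count > mean(history) * (1 + pct))
    return signals

# Made-up counts shaped like the pattern described: low teens, a short burst, low teens.
counts = [12, 14, 11, 13, 12, 15, 13, 100, 95, 14, 12]
print(moving_average_signals(counts, window=7, pct=0.5))
```

With these made-up numbers the rule fires on both burst days, but the burst then sits inside the window and pulls the average up to roughly 37, so the threshold stays inflated for the following week. That is one way the burstiness gets in the way of this rule.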
What type of statistics should I use to determine a statistically significant event given my scenario described above?
Thanks in advance