Intro
You can create a rolling average for a specified number of days using a SQL DataFlow or Magic ETL. To do this, you will need to join the data back onto itself multiple times.
Solution Details and Steps
MySQL
-
Create a MySQL DataFlow.
-
Add your input DataSet.
-
Create a transform using the following query:
SELECT a.`value`,(SELECT AVG(b.`value`) FROM input_DataSet b WHERE b.`date` <= a.`date` AND b.`date` >= SUBDATE(a.`date`, INTERVAL 5 DAY) ) AS ‘Rolling 5 Day Average’ ,a.`date` FROM input_DataSet
-
(Optional) If you would like to change the day comparison (changing 5 day average to 7 days), you can change the number in your SUBDATE() [SUBDATE(a.`date`, INTERVAL 5 DAY)].
-
If you would like to do a rolling sum or another calculation you can change your AVG() to the proper aggregation. [SELECT AVG(b.`value`) FROM input_DataSet b ].
-
You will need to change `date` to your date column.
-
You will need to change `value` to your value column.
-
Create an output to the DataFlow. The query should be SELECT * FROM transform_data_1.
-
Name the output and your DataFlow.
RedShift
-
Create a RedShift DataFlow.
-
Add your input DataSet.
-
Create a transform using the following query:
SELECT a.”value”,(SELECT AVG(b.”value”) FROM input_DataSet b WHERE b.”date” <= a.”date” AND b.”date” >= DATEADD(DAY, -5, a.”date”) ) AS “Rolling 5 Day Average”,a.”date” FROM input_DataSet
-
If you would like to change the day comparison (changing 5 day average to 7 days), you can change the number in your DATEADD() [DATEADD(DAY, -5, a.”date”)].
-
If you would like to do a rolling sum or another calculation you can change your AVG() to the proper aggregation. [SELECT AVG (b.”value”) FROM input_DataSet b ].
-
You will need to change “date” to your date column.
-
You will need to change “value” to your value column.
-
Create an output to the DataFlow. The query should be SELECT * FROM transform_data_1.
-
Name the output and your DataFlow.
Magic ETL
-
Create a new Magic ETL.
-
Add “Input DataSet” tile.
-
Add “Select Columns” tile. Select only your date and value columns.
-
You are going to create a “Date Operations” tile for every day you wish to compare. If you are showing 5 day average you will need to create 5 tiles.
-
You will select for all of these to use your “Select Columns” tile as their input.
-
Your “Date Operations” tiles will be adding a day to your original date column. Your first tile will add 1 day, your second will add 2 and this will continue.
-
You now want to add 5 different “Join Data” tiles.
-
Your joins are going to pull from the previous join and your date operation.
-
The join should be formatted with your previous join (original data) as your left table. Your right table will be your “Date Operations."
-
You should join on the original “Date” = “Date + _."
-
Your data should now look like the following screenshot if you “Run Preview.”
-
You can use a “Select Columns” to pull your original “Date” and “Value” columns. You will also pull all other value columns. (Value_1 – Value_5)
-
You can now start to add your values together to create your average.
-
You will need to add all of your value columns together.
-
Using the “Calculator” tile we can add two columns together at a time.
-
You now can divide by the number of value columns you have, using the “calculator” tile. In this example it is 6.
-
You can now clean up your columns using the “Select Columns” tile.
-
You only need to keep your “Date”, “Value”, and “Rolling Average” columns.
-
Your data should now look similar to:
-
You can now add an output to your Magic ETL.
Comments
0 comments
Please sign in to leave a comment.