Publishing data analysis post
parent
4247ac4d79
commit
61c8fac08c
@ -1,38 +1,84 @@
|
|||||||
---
|
---
|
||||||
title: "Getting Into Day Trading: Analyzing The Moving Average"
|
title: "Getting Into Day Trading: Analyzing The Moving Average"
|
||||||
date: 2017-11-04T14:11:54-04:00
|
date: 2017-11-04T14:11:54-04:00
|
||||||
draft: true
|
draft: false
|
||||||
tags: ["day trading", "data analysis", "julia"]
|
tags: ["day trading", "data analysis", "python"]
|
||||||
---
|
---
|
||||||
|
|
||||||
Now that we have a Julia environment good to go, and a dataset available, time to start doing some real analysis.
|
|
||||||
|
|
||||||
I know that I have this bit of data for the WLTW symbol, and what would be helpful is to see that data completely
|
I know that I have this bit of data for the WLTW symbol, and what would be helpful is to see that data completely
|
||||||
plotted in all of its glory. Let's take a look at the closing costs (y) plotted against the date (x).
|
plotted in all of its glory. Let's take a look at the closing costs (y) plotted against the date (x).
|
||||||
|
|
||||||
![Image](/img/post/WLTW_CLOSING_COSTS.png)
|
![Image](/img/post/WLTW_CLOSING_COSTS.png)
|
||||||
|
|
||||||
Not bad, we can see an ok trend going from January to December 2016. This data isn't very useful yet but I can
|
This is a good start, but how well do the SMAs track this closing cost? Let's first write a little Python that will grab
|
||||||
showcase some awesome Julia packages, and how I generated the graph.
|
the SMA for a given window, along with the row where that window starts, for use on the x-axis.
|
||||||
|
|
||||||
I used DataFrames.jl to store the data, Query.jl to grab a subset of the data, and Gadfly.jl to plot the data.
|
```python
|
||||||
All of these are excellent libraries for doing your thing when analyzing.
|
import numpy as np
|
||||||
|
|
||||||
```julia
|
def moving_avs(col, window):
|
||||||
data = readtable("prices.csv", header=true)
|
    moving_avs = {}
|
||||||
q = @from i in data begin
|
    for i in range(0, len(col), window):
|
||||||
@where i.symbol == "WLTW"
|
        moving_avs[i] = np.mean(col[i:i+window])
|
||||||
@select {i.date, i.close}
|
    return moving_avs
|
||||||
@collect DataFrame
|
|
||||||
end
|
|
||||||
|
|
||||||
p = plot(q, x=:date, y=:close, Geom.point, Guide.title("Closing Costs: WLTW - 2016"))
|
|
||||||
draw(PNG("wltw_closing_costs.png", 6inch, 4inch), p)
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Now I'd like to add the plots for the 3-day SMA, and the 5-day SMA to the plot of WLTW closing costs. What these
|
Using NumPy for the arithmetic and a pandas Series to hold my values, I can use this function to create a dictionary tracking exactly which
|
||||||
are, are the average of either the last 3 days or the last 5 days for a single datapoint. I believe that
|
row each SMA window starts on, along with the SMA for that window. `window` becomes the step size in the `range` call, and `np.mean` does
|
||||||
by doing so, we may be able to visualize if either datapoint is adequate in predicting trends in this data. I'll be looking for
|
the work of averaging each slice of the data array. Because the windows do not overlap, this gives one average per block of `window` rows rather than one per day.
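
To make the windowing concrete, here is a minimal sketch that compares the block-style averaging described above with a trailing rolling mean from pandas. The names `block_means` and `rolling_sma` and the price list are made up for illustration; they are not part of the post's code or of `prices.csv`.

```python
import numpy as np
import pandas as pd


def block_means(col, window):
    """Mean of each non-overlapping block, keyed by the block's starting position."""
    values = np.asarray(col, dtype=float)
    return {i: float(values[i:i + window].mean())
            for i in range(0, len(values), window)}


def rolling_sma(col, window):
    """Trailing simple moving average: one value per row once the window is full."""
    return pd.Series(col, dtype=float).rolling(window).mean()


# Made-up prices for illustration only (assumption, not real WLTW data).
prices = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0]

blocks = block_means(prices, 3)   # one entry per block of 3 rows
rolling = rolling_sma(prices, 3)  # one entry per row, NaN until 3 rows are seen

print(blocks)
print(rolling.tolist())
```

The block version yields far fewer points than the rolling version, which is worth keeping in mind when reading the plots and the correlation that follow.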
|
||||||
how close any given moving average is to the actual trend of the close costs for the WLTW security.
|
|
||||||
|
|
||||||
|
Now I can plug in my values to the function to generate some simple moving averages.
|
||||||
|
|
||||||
|
```python
|
||||||
|
import pandas as pd
data = pd.read_csv("prices.csv")
|
||||||
|
wltw = data[data["symbol"] == "WLTW"]
|
||||||
|
|
||||||
|
threedaysma = moving_avs(wltw["close"], 3)
|
||||||
|
fivedaysma = moving_avs(wltw["close"], 5)
|
||||||
|
```
|
||||||
|
|
||||||
|
Back to the original question: how well do the SMAs track the closing cost? Let's find out.
|
||||||
|
|
||||||
|
```python
|
||||||
|
import matplotlib.pyplot as plt
|
||||||
|
plt.plot(wltw["close"].values)  # plot by position so the x-axis lines up with the SMA keys
|
||||||
|
plt.plot(list(threedaysma.keys()), list(threedaysma.values()))
|
||||||
|
plt.show()
|
||||||
|
```
|
||||||
|
|
||||||
|
The resulting graph is here for the three day SMA:
|
||||||
|
|
||||||
|
![Image](/img/post/threedaysma.png)
|
||||||
|
|
||||||
|
That looks very promising for this small time range. The three-day SMA follows the closing cost very closely.
|
||||||
|
Now I don't know about you, but I'd like to see just how closely the three day SMA follows the closing cost. I learned
|
||||||
|
in statistics of a little measure called correlation. From the interwebs:
|
||||||
|
|
||||||
|
> Correlation is a statistical measure for how two or more variables fluctuate together.
|
||||||
|
|
||||||
|
Now, I won't go into too many details here, as I have mammoth libraries at my disposal. However, I can explain the basics of
|
||||||
|
the measure of correlation. The correlation coefficient lies between -1 and 1, inclusive. A value of 1 means that the two datasets are perfectly positively
|
||||||
|
correlated (fluctuate together), while a value of -1 means that the two datasets are perfectly negatively correlated (fluctuate inversely). Any number in between
|
||||||
|
indicates how strongly the datasets are correlated, positively or negatively, and 0 means that there is no linear correlation between them.
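
As a rough illustration of those endpoints, the sketch below computes the Pearson coefficient straight from its definition and checks it against `np.corrcoef`. The helper name `pearson` and the three small arrays are made up for this example.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation: covariance of x and y divided by the product of their spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of x
    dy = y - y.mean()  # deviations from the mean of y
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


# Made-up series (assumption): one rises with `a`, the other falls as `a` rises.
a = [1.0, 2.0, 3.0, 4.0]
up = [2.0, 4.0, 6.0, 8.0]
down = [8.0, 6.0, 4.0, 2.0]

r_up = pearson(a, up)      # perfectly positive relationship
r_down = pearson(a, down)  # perfectly negative relationship
r_numpy = np.corrcoef(a, up)[0, 1]  # off-diagonal entry of NumPy's 2x2 matrix

print(r_up, r_down, r_numpy)
```

The off-diagonal entry of the matrix returned by `np.corrcoef` is the coefficient between the two inputs; the diagonal is each input's correlation with itself.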
|
||||||
|
|
||||||
|
To calculate correlation, I use the NumPy function for the Pearson product-moment correlation, `np.corrcoef`, which takes two array-like inputs. First, I clean the data.
|
||||||
|
I'll do this by keeping only the close cost values at the rows where each 3-day SMA window starts (the dictionary keys), so both arrays have the same length.
|
||||||
|
|
||||||
|
```python
|
||||||
|
cleaned = wltw.iloc[list(threedaysma.keys()),:]
|
||||||
|
```
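
The alignment step can be sketched on a toy frame. The frame below is hypothetical, and its index is deliberately non-contiguous, as it might be after filtering a larger table by symbol. Since the SMA keys are positions rather than index labels, `iloc` is the selector that matches them.

```python
import numpy as np
import pandas as pd

# Hypothetical data (assumption): labels 0, 5, 9, ... stand in for rows left after a filter.
frame = pd.DataFrame(
    {"close": [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0]},
    index=[0, 5, 9, 14, 20, 27, 33],
)

window = 3
close = frame["close"].to_numpy()

# Block means keyed by the starting position of each window.
sma = {i: float(close[i:i + window].mean()) for i in range(0, len(close), window)}

# Select rows by position so they pair one-to-one with the SMA values.
cleaned = frame.iloc[list(sma.keys()), :]

# Both inputs now have the same length, so corrcoef can compare them.
r = np.corrcoef(cleaned["close"], np.array(list(sma.values())))[0, 1]
print(cleaned)
print(r)
```

Using `loc` with the same keys would look up index labels instead of positions, which would fail or pick the wrong rows on an index like this one.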
|
||||||
|
|
||||||
|
And now, to calculate our Pearson product-moment correlation coefficients:
|
||||||
|
|
||||||
|
```python
|
||||||
|
threedaysma_array = np.array(list(threedaysma.values()))
|
||||||
|
print(np.corrcoef(cleaned["close"], threedaysma_array))
|
||||||
|
|
||||||
|
#[[ 1. 0.96788571]
|
||||||
|
# [ 0.96788571 1. ]]
|
||||||
|
|
||||||
|
```
|
||||||
|
|
||||||
|
What this 2D output array tells us is that the data are very strongly correlated! For the points we cleaned, the correlation coefficient is almost 1. That
|
||||||
|
is great news, and we can most likely use this moving forward for forecasting and short-term trading.
|
||||||
|
|
||||||
|
BIN
static/img/post/threedaysma.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 20 KiB |