Time Series Analysis: Gold Price Exchange and Temperature Data, Exams of Complex analysis

An analysis of time series data, focusing on fluctuations in the NEXT FUNDS Gold Price Exchange Traded Fund and on quarterly average temperature data. It explores trend estimation techniques (moving average, parametric quadratic polynomial, local polynomial, and splines) and evaluates how well they capture trends and achieve stationarity. The document also covers differencing and analyzes the residuals from the trend models.

Typology: Exams

2024/2025

Available from 03/01/2025

TopScorer100

ISYE 6402 Homework Spring 2024

Part 1: NEXT FUNDS Gold Price Exchange

Background

In this problem, we will study fluctuations in the NEXT FUNDS Gold Price Exchange Traded Fund, a type of investment fund that aims to track the performance of gold prices. By investing in this fund, investors can gain exposure to the price movements of gold without having to physically own the metal. The fund holds physical gold as its underlying asset, and its value is based on the market price of gold. You will use the file Fund Prices Data.csv, which contains monthly prices from January 2010 to December 2022.

Instructions on reading the data

To read the data in R, save the file in your working directory (change the directory first if it differs from the R working directory) and read the data using the R function read.csv().

You will perform the analysis and modelling on the Close data column.

# Here are the libraries you will need:
library(mgcv)
library(TSA)
library(dynlm)
library(ggplot2)
library(reshape2)
library(greybox)
library(mlr)
library(lubridate)
library(dplyr)
library(data.table)

# Run the following code to prepare the data for analysis:
data <- read.csv("Fund Prices Data.csv")

Question 1a: Exploratory Data Analysis

Plot the Time Series and the ACF plot for the series. Comment on the stationarity of both time series based on these plots. Which (if any) stationarity assumptions are violated for the time series?

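# Monthly series of closing prices, starting in January 2010
fp <- ts(data$Close, start = 2010, frequency = 12)
plot(fp, col = "purple", lwd = 1.5, ylab = "Fund Price", main = "FP Time Series Plot")
# The vertical axis of an ACF plot shows autocorrelations, not prices
acf(fp, lag.max = 12 * 12, col = "purple", lwd = 1.5, ylab = "ACF", main = "FP ACF Plot")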
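To make the reading of an ACF plot concrete, the following is a minimal sketch on simulated data, not on the fund prices. It assumes a random walk as a stand-in for a nonstationary series; the names `rw`, `rho1.rw`, and `rho1.diff` are hypothetical. A slowly decaying sample ACF is the usual visual sign of a trend, while the ACF of a stationary series drops inside the confidence band quickly.

```r
# Simulated illustration (not the fund data): a random walk and its first difference
set.seed(1)
n <- 156                                   # 13 years of monthly observations
rw <- ts(cumsum(rnorm(n)), start = 2010, frequency = 12)

# plot = FALSE returns the sample autocorrelations instead of drawing them
acf.rw   <- acf(rw, lag.max = 24, plot = FALSE)$acf
acf.diff <- acf(diff(rw), lag.max = 24, plot = FALSE)$acf

# Element 1 is lag 0 (always 1); element 2 is the first lag
rho1.rw   <- acf.rw[2]
rho1.diff <- acf.diff[2]

# Approximate 95% band for the sample ACF of white noise
band <- 1.96 / sqrt(n)

print(c(random.walk = rho1.rw, differenced = rho1.diff, band = band))
```

For the random walk the lag-1 autocorrelation sits close to 1 and decays slowly, whereas for the differenced series it is typically near the band.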

Response: 1b

The plot of the fitted trend lines indicates that the local polynomial and splines models capture the trend better than the moving average and parametric quadratic models.
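The code that fits the four trend models is not part of this preview. The following is a minimal sketch of how such fits are commonly set up in R, run on simulated data rather than the fund prices. Every name here (`y`, `time.pts`, `fit.mav`, `fit.para`, `fit.loc`, `fit.spl`, `as.monthly`) and the bandwidth of 0.1 are assumptions for illustration, not the author's choices.

```r
library(mgcv)  # provides gam() and s() for the spline fit

# Simulated stand-in for the fund price series (hypothetical data, not the CSV)
set.seed(42)
n <- 156                                   # monthly, January 2010 to December 2022
t.idx <- seq_len(n)
y.num <- 1000 + 5 * t.idx + 0.05 * t.idx^2 + rnorm(n, sd = 50)
y <- ts(y.num, start = 2010, frequency = 12)

# Rescale the time index to (0, 1] so the bandwidth is easy to interpret
time.pts <- t.idx / n

# Helper: put fitted values back on the monthly time axis
as.monthly <- function(v) ts(as.numeric(v), start = 2010, frequency = 12)

# 1. Moving average: box kernel smoother evaluated at the observed time points
mav <- ksmooth(time.pts, y.num, kernel = "box", bandwidth = 0.1, x.points = time.pts)
fit.mav <- as.monthly(mav$y)

# 2. Parametric quadratic polynomial
para <- lm(y.num ~ time.pts + I(time.pts^2))
fit.para <- as.monthly(fitted(para))

# 3. Local polynomial regression (loess with its default span)
loc <- loess(y.num ~ time.pts)
fit.loc <- as.monthly(fitted(loc))

# 4. Splines via a generalized additive model
spl <- gam(y.num ~ s(time.pts))
fit.spl <- as.monthly(fitted(spl))

# Overlay the four trend estimates on the series
ts.plot(y, ylab = "Simulated price", main = "Trend estimates (simulated data)")
lines(fit.mav,  col = "red",    lwd = 2)
lines(fit.para, col = "orange", lwd = 2)
lines(fit.loc,  col = "green",  lwd = 2)
lines(fit.spl,  col = "blue",   lwd = 2)
legend("topleft", legend = c("MAV", "PARA", "LOC", "GAM"),
       col = c("red", "orange", "green", "blue"), lwd = 2)
```

A smaller kernel bandwidth makes the moving average follow short-term movements more closely, which is one reason its trend estimate can look less smooth than the others.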

Question 1c: Residual Analysis

Evaluate the quality of each fit using the residual analysis.

## Residual Process: MAV
resid.1 = fp - tsmav.fp
## Residual Process: Local Polynomial
resid.2 = fp - fit.loc.fp
## Residual Process: Spline
resid.3 = fp - fit.spl
## Residual Process: Parametric
resid.4 = fp - para.fit

y.min = min(c(resid.1, resid.2, resid.3, resid.4))
y.max = max(c(resid.1, resid.2, resid.3, resid.4))
ts.plot(resid.1, lwd = 2, ylab = "Residual Process", col = "plum", ylim = c(-1500, 2000))
lines(resid.2, col = "purple")
lines(resid.3, col = "green")
lines(resid.4, col = "blue")
legend(x = 2010, y = 2000,
       legend = c("Moving Average", "LOESS", "Splines", "Parametric quadratic"),
       lty = 1, col = c("plum", "purple", "green", "blue"))

acf(resid.1, lag.max = 24 * 4, main = "Moving Average model")

acf(resid.4, lag.max = 24 * 4, main = "Parametric model")

Response: 1c

The residuals from all four trend models show clear nonstationarity, which suggests that trend removal alone is not sufficient to account for the nonstationary variation in the time series.
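A numerical check can complement the visual reading of residual ACF plots. The sketch below uses simulated data and assumes a random walk with drift as a stand-in for a series with a stochastic trend; the names `x`, `resid.quad`, and `lb` are hypothetical. It fits a quadratic trend and applies the Ljung-Box test from base R to the residuals.

```r
# Simulated illustration: detrending a random walk leaves autocorrelated residuals
set.seed(7)
n <- 156
t.idx <- seq_len(n)
x <- cumsum(0.5 + rnorm(n))                # random walk with drift (stochastic trend)

# Remove a quadratic trend and keep the residuals
resid.quad <- residuals(lm(x ~ t.idx + I(t.idx^2)))

# Ljung-Box test: the null hypothesis is no autocorrelation up to lag 12
lb <- Box.test(resid.quad, lag = 12, type = "Ljung-Box")
print(lb)
```

A very small p-value indicates that the residuals remain strongly autocorrelated after the fitted trend is removed, which is the pattern the response describes.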

Question 1d: Differenced Data Modeling

Now plot the differenced time series and its ACF plot. Apply the four trend models in Question 1b to the differenced time series. What can you conclude about the differenced data in terms of stationarity? Which approach would you recommend (trend removal via fitting a trend vs. differencing) to obtain a stationary process?

ts.plot(diff(fp), col = "black", xlab = "", ylab = "Differenced FP", main = "Differenced FP Exchange Rate by Time")
grid()

acf(diff(fp), lag.max = 52 * 12, xlab = "Lag", ylab = "ACF", main = "Diff FP ACF Analysis")

# 2. Fit a parametric quadratic polynomial model
x1 <- time.pts[-1]
x2 <- time.pts[-1]^2
para.model <- lm(diff(fp) ~ x1 + x2)
# diff(fp) drops the first observation, so the fitted series starts in February 2010
para.fit <- ts(fitted(para.model), start = c(2010, 2), frequency = 12)
ts.plot(diff(fp), xlab = "", ylab = "Differenced FP", main = "Differenced Parametric Quadratic Polynomial Analysis")
grid()
lines(para.fit, lwd = 2, col = "orange")

# 3. Fit a local polynomial model
loc.model <- loess(diff(fp) ~ time.pts[-1])
# Same alignment as above: the first fitted value belongs to February 2010
loc.fit <- ts(fitted(loc.model), start = c(2010, 2), frequency = 12)
ts.plot(diff(fp), xlab = "", ylab = "Differenced FP", main = "Differenced Local Polynomial Analysis")
grid()
lines(loc.fit, lwd = 2, col = "green")

# 5. Compare all estimated trends
vals <- c(mav.fit, para.fit, loc.fit, gam.fit)
ylim <- c(min(vals), max(vals))
ts.plot(mav.fit, lwd = 2, col = "black", ylim = ylim, xlab = "", ylab = "FP", main = "Differenced Regression Model Comparison")
grid()
lines(mav.fit, lwd = 2, col = "red")
lines(para.fit, lwd = 2, col = "orange")
lines(loc.fit, lwd = 2, col = "green")
lines(gam.fit, lwd = 2, col = "blue")
legend("bottomright", legend = c("MAV", "PARA", "LOC", "GAM"), col = c("red", "orange", "green", "blue"), lwd = 2)

Response: 1d

The time series plots show how well each model fits the differenced data and indicate that the differenced data are stationary.

The fitted splines trend shows little variability. The parametric quadratic trend also varies little; the splines trend shows larger deviations, and the local polynomial trend shows the largest deviations in the combined graph. The moving average trend, however, has many 'kinks' that capture minor movements unlikely to be useful in determining the trend.

From this analysis, the differenced data appear to be stationary. Differencing is therefore a more effective approach than fitting a trend for removing the trend and making the time series stationary.
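A short derivation may help explain why differencing can succeed where trend fitting does not. It is a sketch under two textbook assumptions that are not stated in the original: a series with a linear deterministic trend plus a stationary component $Y_t$, and a random walk with drift $\mu$ driven by white noise $\varepsilon_t$.

```latex
% Case 1: deterministic linear trend plus stationary noise Y_t
X_t = a + b\,t + Y_t
\quad\Longrightarrow\quad
\nabla X_t = X_t - X_{t-1} = b + (Y_t - Y_{t-1})

% Case 2: random walk with drift \mu and white noise \varepsilon_t
X_t = \mu + X_{t-1} + \varepsilon_t
\quad\Longrightarrow\quad
\nabla X_t = X_t - X_{t-1} = \mu + \varepsilon_t
```

In both cases the differenced series has a constant mean and no remaining trend. A fitted smooth curve removes the deterministic part in the first case, but in the second case the residuals still contain the accumulated shocks, which would be consistent with residuals that remain nonstationary after trend removal.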

Part 2: Temperature Analysis

Background

In this problem, we will analyze quarterly average temperature data. The data file Temperature HW 2.csv contains average monthly temperatures from a southern region from January 1980 through December 2016. We will aggregate the data on a quarterly basis by taking the average temperature within each quarter. We will fit the models on the data through Quarter 4 of 2015 and evaluate the predictions for Q1 to Q4 of 2016.
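The data preparation code is not shown in this preview. As a minimal sketch of the aggregation and the train/test split described above, the following uses a simulated monthly series in place of the CSV file; the names `monthly`, `temp.q`, `temp.train`, and `temp.test` and the simulated values are hypothetical.

```r
# Hypothetical monthly series standing in for the temperature file
set.seed(3)
n.months <- 12 * 37                        # January 1980 through December 2016
month.idx <- seq_len(n.months)
monthly <- ts(20 + 8 * sin(2 * pi * (month.idx - 4) / 12) + rnorm(n.months),
              start = c(1980, 1), frequency = 12)

# Quarterly series: average of the three months in each quarter
temp.q <- aggregate(monthly, nfrequency = 4, FUN = mean)

# Training data through Q4 2015; the four quarters of 2016 are held out
temp.train <- window(temp.q, end = c(2015, 4))
temp.test  <- window(temp.q, start = c(2016, 1))

print(length(temp.train))
print(temp.test)
```

Using `window()` keeps the time attributes of the series, so later plots and forecasts line up on the same quarterly axis.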

Instructions on reading the data

To read the data in R, save the file in your working directory (change the directory first if it differs from the R working directory) and read the data using the R function read.csv().

plot(diff(temp), xlab = "Time", ylab = "Temperature", main = "Quarterly Temperature: 1-Differenced")

acf(diff(temp), lag.max = 12 * 12, main = "Quarterly Temperature ACF: 1-Differenced")

plot(diff(temp, 4), xlab = "Time", ylab = "Temperature", main = "Quarterly Temperature: 4-Differenced")

The plot of the 1st-order differenced data shows that the trend has been removed. The seasonality effect, however, still seems to be present: the ACF of the 1st-order differenced data exhibits a large first seasonal lag that decays slowly over multiples of that lag, indicating that 1st-order differencing does not remove the seasonality in the data.

Since the 1st-order difference does not adequately address seasonality, we can apply a lag-4 difference as shown above. The absence of a cyclical pattern in the ACF plot indicates that seasonality has largely been removed; however, there is still evidence of a trend in the data, given the slowly decaying autocorrelations.
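The difference between the two kinds of differencing can be checked on simulated data. This is a minimal sketch assuming a quarterly series with fixed quarterly levels, a small linear trend, and white noise; the levels, the slope, and the names `x`, `d1`, `d4`, and `q.means` are hypothetical.

```r
# Simulated quarterly series: fixed seasonal levels + small linear trend + noise
set.seed(5)
n <- 148
t.idx <- seq_len(n)
levels.q <- c(12, 26, 27.5, 16)            # hypothetical quarterly levels
x <- ts(levels.q[(t.idx - 1) %% 4 + 1] + 0.01 * t.idx + rnorm(n, sd = 0.7),
        start = c(1980, 1), frequency = 4)

d1 <- diff(x)            # first difference: removes the trend, keeps the seasonal pattern
d4 <- diff(x, lag = 4)   # seasonal difference: subtracts the same quarter of the previous year

# Sample ACF of the first difference at the seasonal lag (element 5 is lag 4)
acf.d1.lag4 <- acf(d1, lag.max = 8, plot = FALSE)$acf[5]

# Mean of the seasonal difference within each quarter
q.means <- tapply(as.numeric(d4), as.integer(cycle(d4)), mean)

print(acf.d1.lag4)
print(q.means)
```

In this setup the first difference keeps a strong autocorrelation at the seasonal lag, while the quarterly means of the lag-4 difference are all close to the same small constant (four times the slope), so no seasonal pattern in the mean remains.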

Question 2b: Seasonality Estimation

Separately fit a harmonic seasonality model and the ANOVA seasonality model to the temperature data. Evaluate the quality of each fit using residual analysis. Does one model perform better than the other? Which model would you select to fit the seasonality in the data?

times <- ts(seq(1:768))
Timereq <- times
Timereq2 <- times^2

## Estimate seasonality using ANOVA approach
td_lm <- dynlm(temp ~ season(temp))
summary(td_lm)

Time series regression with "ts" data:
Start = 1980(1), End = 2016(4)

Call:
dynlm(formula = temp ~ season(temp))

Residuals:
     Min       1Q   Median       3Q      Max
-1.97692 -0.42640  0.05287  0.46782       1.

Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)     12.1441     0.1137  106.85   <2e-16 ***
season(temp)Q2  13.7640     0.1607   85.63   <2e-16 ***
season(temp)Q3  15.2350     0.1607   94.78   <2e-16 ***
season(temp)Q4   3.8372     0.1607   23.87   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6914 on 144 degrees of freedom
Multiple R-squared: 0.989, Adjusted R-squared: 0.
F-statistic: 4302 on 3 and 144 DF, p-value: < 2.2e-

plot(temp, type = "l")
lines(fitted(td_lm), col = "blue")

## Estimate seasonality using harmonic model
td_lm2 <- dynlm(temp ~ harmonic(temp))
summary(td_lm2)

Time series regression with "ts" data:
Start = 1980(1), End = 2016(4)

Call:
dynlm(formula = temp ~ harmonic(temp))

Residuals:
     Min       1Q   Median       3Q      Max
-2.12674 -0.68405 -0.09583  0.73012       1.

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)
(Intercept)                20.35311    0.07497  271.48   <2e-16 ***
harmonic(temp)cos(2*pi*t)  -7.61751    0.10602  -71.85   <2e-16 ***
harmonic(temp)sin(2*pi*t)   4.96338    0.10602   46.81   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.912 on 145 degrees of freedom
Multiple R-squared: 0.9807, Adjusted R-squared: 0.
F-statistic: 3677 on 2 and 145 DF, p-value: < 2.2e-
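The two seasonality models can also be compared directly. The sketch below rebuilds both with base R's `lm()` on simulated quarterly data instead of `dynlm()` with `season()` and `harmonic()`; the quarterly levels are hypothetical values loosely based on the ANOVA coefficients above, and the names `quarter`, `tt`, `fit.anova`, and `fit.harm` are assumptions. With quarterly data, a single-frequency harmonic model (3 parameters) is a special case of the ANOVA model (4 parameters), so an F-test for nested models applies.

```r
# Simulated quarterly data with hypothetical seasonal levels
set.seed(11)
n <- 148
quarter <- factor(rep(1:4, length.out = n))
tt <- 1980 + (seq_len(n) - 1) / 4          # time in years: 1980.00, 1980.25, ...
y <- c(12, 26, 27.5, 16)[as.integer(quarter)] + rnorm(n, sd = 0.7)

# ANOVA seasonality model: one mean level per quarter
fit.anova <- lm(y ~ quarter)

# Harmonic seasonality model: one cosine/sine pair with a period of one year
fit.harm <- lm(y ~ cos(2 * pi * tt) + sin(2 * pi * tt))

# Residual standard errors of the two fits
print(c(anova = sigma(fit.anova), harmonic = sigma(fit.harm)))

# F-test of the harmonic model against the more general ANOVA model
print(anova(fit.harm, fit.anova))
```

In the output above, the ANOVA model has the smaller residual standard error (0.6914 versus 0.912) at the cost of one extra parameter; the F-test in this sketch is one way to judge whether that extra parameter is worthwhile.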