Time series analysis

Time series analysis (July 12-13, 2011)
Course Exercise Booklet

MATLAB function reference

Introduction to time series analysis
Exercise 1.1 Controlling frequency, amplitude and phase
Exercise 1.2 Signal representation in the Frequency Domain

Analysis in the Time Domain
Exercise 2.1 Event Spacing of the Cave Creek runoff data
Exercise 2.2 Autocorrelation of the Cave Creek runoff data
Exercise 2.3 Autocorrelation of the SPECMAP Stack

Signal Filtering
Exercise 3.1 Removing long-term trends from the Mauna data set
Exercise 3.2 Calculating the first difference curve of the Mauna data
Exercise 3.3 Using a moving average
Exercise 3.4 Filtering in the frequency domain

The Frequency Domain and Spectral Analysis
Exercise 4.1 Calculation of a periodogram
Exercise 4.2 Time in the Frequency domain
Exercise 4.3a Trends in the Frequency domain
Exercise 4.3b Trends in the Frequency domain (2)
Exercise 4.4 Processing unequally spaced data
Exercise 4.5 The CLEAN algorithm
Exercise 4.6 White noise in the Frequency Domain
Exercise 4.7 Welch-Overlapped-Segment-Averaging
Exercise 4.8 Red noise in the Time and Frequency Domains

The Time-Frequency Plane
Exercise 5.1 A Stationary Signal
Exercise 5.2 A nonstationary signal
Exercise 5.3 Evolutionary spectral analysis of a nonstationary signal
Exercise 5.4 Evolutionary spectral analysis of the ODP677 δ18O record

PCA / EOF Analysis
Exercise 7.1 PCA of a collection of time series
Exercise 7.2 PCA and the "Hockey Stick"

Correlation of two time series
Exercise 8.1 Correlation and autocorrelation
Exercise 8.2 A semi-empirical ice volume model

Exercise 1.1 Controlling frequency, amplitude and phase

The first task of this exercise is to calculate and plot a sinusoid with a given frequency. To do this we need to make time points, define the frequency of the sinusoid and then calculate the cycle itself.

>> time=[0:1:800]';  % Time points between 0 and 800 with a spacing of 1
>> f=0.01  % Define the frequency of the sinusoid as 0.01
>> signal=sin(2.*pi.*time.*f);  % Calculate the sinusoid
>> plot(time,signal)  % Plot the sinusoid against time

Now we can repeat the calculation but also define the amplitude of the sinusoid.

>> time=[0:1:800]';  % Time points between 0 and 800 with a spacing of 1
>> f=0.01  % Define the frequency of the sinusoid as 0.01
>> A=2.5  % Define the amplitude of the sinusoid as 2.5
>> signal=sin(2.*pi.*time.*f).*A;  % Calculate the sinusoid
>> plot(time,signal)  % Plot the sinusoid against time

Finally, calculate a sinusoid with a defined phase.

>> time=[0:1:800]';  % Time points between 0 and 800 with a spacing of 1
>> f=0.01  % Define the frequency of the sinusoid as 0.01
>> phi=pi./2  % Define the phase of the sinusoid as pi/2
>> signal=sin(2.*pi.*time.*f+phi);  % Calculate the sinusoid
>> plot(time,signal)  % Plot the sinusoid against time

It is possible to calculate three separate sinusoids with the same frequencies as the main Milankovitch cycles. These sinusoids can then be added together and plotted.

>> time=[0:1:800]';  % Time points between 0 and 800 with a spacing of 1
>> fe=1./100;  % Define the eccentricity frequency
>> eccen=sin(2.*pi.*time.*fe);  % Calculate a sinusoid with frequency fe
>> fo=1./41;  % Define the obliquity frequency
>> obliq=sin(2.*pi.*time.*fo);  % Calculate a sinusoid with frequency fo
>> fp=1./21;  % Define the precession frequency
>> prec=sin(2.*pi.*time.*fp);  % Calculate a sinusoid with frequency fp
>> final=eccen+obliq+prec;  % Sum the 3 sinusoids together to make a final signal
>> plot(time,final);  % Plot the signal against time

Does your signal, which contains the Milankovitch periods, look like known patterns of orbital-scale climate change?
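The three variations in this exercise are special cases of one general sinusoid, A sin(2πft + φ). As a minimal sketch, the snippet below wraps all three parameters into a single anonymous function. The name make_sinusoid is a hypothetical helper chosen for illustration; it is not part of the course toolbox.

```matlab
% Sketch: general sinusoid with amplitude A, frequency f and phase phi.
% make_sinusoid is a hypothetical helper name, not a course function.
make_sinusoid = @(t,A,f,phi) A.*sin(2.*pi.*f.*t+phi);

time = (0:1:800)';                            % time points 0..800, spacing 1
signal = make_sinusoid(time,2.5,0.01,pi./2);  % amplitude 2.5, period 100, phase pi/2

period = 1./0.01;                             % period implied by the frequency
```

With a phase of pi/2 the sine starts at its maximum, so the first value of signal equals the amplitude. Use plot(time,signal) to view the result.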

Exercise 1.2 Signal representation in the Frequency Domain

The aim of this exercise is for you to investigate how a signal made from a mixture of sine waves is represented in the Frequency Domain (in other words, what happens when you perform the Fourier transform). Using MATLAB, calculate 3 different sine waves (i.e. with different amplitudes, frequencies and phases) and add them together to give a composite signal.

1. Produce a plot which shows your composite signal.
2. Use fft_plot to produce a plot that shows the frequency spectrum of your composite signal.

The next part is the important bit:

3. Provide an interpretation of the frequency spectrum that you obtained from your composite signal. Think about the positions of the peaks in the diagram and their relative heights.
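fft_plot is a course function and its internals are not shown in this booklet. As a rough sketch of what a frequency spectrum of a composite signal involves, the following uses only MATLAB's built-in fft. The single-sided amplitude scaling, the chosen amplitudes and periods, and all variable names are assumptions made for illustration.

```matlab
% Sketch: single-sided amplitude spectrum of a composite signal (built-in fft only).
time = (0:1:799)';                      % 800 points, spacing 1
s1 = 1.0.*sin(2.*pi.*time./100);        % amplitude 1.0, period 100
s2 = 0.5.*sin(2.*pi.*time./40+pi./4);   % amplitude 0.5, period 40, phase pi/4
s3 = 2.0.*sin(2.*pi.*time./20+pi./2);   % amplitude 2.0, period 20, phase pi/2
signal = s1+s2+s3;                      % composite signal

N = numel(signal);
X = fft(signal);                        % discrete Fourier transform
amp = 2.*abs(X(1:N/2+1))./N;            % single-sided amplitude (valid for interior bins)
f = (0:N/2)'./N;                        % frequency axis, cycles per time unit

[~,idx] = max(amp);                     % index of the tallest peak
f_peak = f(idx);                        % frequency of the tallest peak
```

Because each period divides the record length exactly, every sinusoid falls on a single frequency bin and the peak height equals its amplitude. The phases change where each wave starts but not the peak heights.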

Exercise 2.1 Event Spacing of the Cave Creek runoff data

The time series we will study shows the monthly amount of runoff water (measured in inches) from Cave Creek in Kentucky. Use the event spacing method to estimate the period of each runoff cycle.

In MATLAB we first have to load the data into the memory. You can load the Cave Creek data using:

>> load cave_creek
>> whos  % This will tell you which variables are in the memory

At the bottom of the screen you will see some text which looks like:

  Name      Size     Bytes  Class
  month     216x1    1728   double array
  runoff    216x1    1728   double array
Grand total is 432 elements using 3456 bytes

This shows us that the variables month and runoff have been loaded into the memory. We can now make a plot of the data:

>> plot(month,runoff)
>> xlabel('month')  % Add a label to the x-axis
>> ylabel('runoff (inches)')  % Add a label to the y-axis

We will use the function ginput.m to mark the points on the chart which we think are events:

>> [x,y]=ginput  % Obtain data from the figure

You can now use the mouse to record the positions that you think correspond to events in the runoff data. When you have clicked on all the points of interest, press Enter to stop the function and return to the MATLAB prompt. The variables x and y now correspond to the coordinates of the positions you clicked on. You can mark these points on your chart in the following way:

>> hold on  % This allows you to add more data to the existing plot
>> plot(x,y,'o')  % Add the click positions to the plot as circles

We just need to find the difference between the values in x. In MATLAB this is simple:

>> x_diff=diff(x)
>> x_mean=mean(x_diff);  % mean is the command for calculating a mean
>> x_std=std(x_diff);  % std is the command for calculating a standard deviation

The event spacing of the Cave Creek runoff data = ____ ± ____ months

What is your interpretation of the mean event spacing that you calculated?
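The event spacing calculation can be tried without clicking on a figure. The sketch below replaces the output of ginput with a short list of made-up positions, so the numbers are purely illustrative and are not Cave Creek results. The variable x_picked is a hypothetical stand-in for x.

```matlab
% Sketch: event spacing from a set of picked positions.
% x_picked stands in for the x output of ginput; the values are made up.
x_picked = [3.2; 15.1; 26.8; 39.4; 51.0; 62.9];

x_diff = diff(x_picked);      % spacing between consecutive events
x_mean = mean(x_diff);        % mean event spacing
x_std  = std(x_diff);         % spread of the spacings (N-1 normalisation)

fprintf('Event spacing = %.2f +/- %.2f months\n',x_mean,x_std);
```

Six picked events give five spacings, so the mean is simply the distance between the first and last pick divided by five.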

Exercise 2.2 Autocorrelation of the Cave Creek runoff data

The time series we will study shows the monthly amount of runoff water (measured in inches) from Cave Creek in Kentucky. Use autocorrelation to estimate the period of the runoff cycle.

In MATLAB we first have to remove any data which might still be in the memory using the clear command and then load the Cave Creek data:

>> clear all, close all  % Remove all data from the memory
>> load cave_creek  % Load the data set

Our first task is to detrend the data. To do this we can use the function detrend_signal.m

>> runoff_d=detrend_signal(month,runoff,1);

This will produce the detrended data runoff_d, which has had a straight-line trend removed. A figure is produced by the function that shows the original data, the fitted line and the detrended data.

The data is now prepared for the autocorrelation. To do the analysis we use the function autocorr.m

>> [rt,lags]=autocorr(runoff_d);  % This function has two outputs, rt and lags
>> figure  % Make a new figure window
>> plot(lags,rt)  % Plot the final autocorrelogram
>> xlabel('lag number');  % Add a label to the x-axis
>> ylabel('r');  % Add a label to the y-axis

Again, here is the important part: what is your interpretation of the autocorrelogram and how does it compare to the Event Spacing method?
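To see what an autocorrelogram measures, it can help to compute one by hand. The following is a minimal sketch using built-in functions only; it is not the course autocorr.m, and the synthetic 12-month cycle, the maximum lag and the variable names are assumptions chosen for illustration.

```matlab
% Sketch: sample autocorrelation of a detrended series, built-in functions only.
% This is NOT the course autocorr.m; it only illustrates the calculation.
t = (0:1:215)';                       % 216 monthly points, as in cave_creek
x = sin(2.*pi.*t./12);                % synthetic series with a 12-month cycle
x = x-mean(x);                        % remove the mean before correlating

max_lag = 48;                         % largest lag to evaluate
lags = (0:max_lag)';
rt = zeros(max_lag+1,1);
for k = 0:max_lag
    % correlate the series with a copy of itself shifted by k points
    rt(k+1) = sum(x(1:end-k).*x(1+k:end))./sum(x.^2);
end

[~,idx] = max(rt(7:19));              % search lags 6..18 for the first repeat peak
lag_peak = lags(idx+6);               % lag of the strongest repeat
```

The correlation is 1 at lag 0 by construction, and the first repeat peak sits at the period of the cycle, which is the quantity the event spacing method estimates by eye.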

Exercise 2.3 Autocorrelation of the SPECMAP Stack

Load the SPECMAP file into MATLAB. You will find it contains two variables, age and data (the units of age are ka and the data are normalised oxygen isotope values).

Use MATLAB to produce a plot of the SPECMAP record. Add appropriate labels to both the x and the y axes.

Detrend the SPECMAP signal.

Perform an autocorrelation, including significance levels, on the detrended series.

Make an interpretation of the autocorrelation in terms of known climate variation.

Exercise 3.1 Removing long-term trends from the Mauna data set

We can use the function detrend_signal to remove long-term change from the Mauna data and keep only the higher frequency variation.

>> load mauna  % Load the data into MATLAB

There are two variables: time (units of days) and mauna (CO2 data, units of ppm).

First, try removing a simple linear trend (a straight line) of the form $y = a_1 x + a_0$:

>> xc=detrend_signal(time,mauna,1)

In this case the 1 tells MATLAB to remove a 1st order polynomial from the data, i.e. a straight line.

Next, try removing a quadratic function of the form $y = a_2 x^2 + a_1 x + a_0$:

>> xc=detrend_signal(time,mauna,2)

In this case the 2 tells MATLAB to remove a 2nd order polynomial from the data, i.e. a quadratic function.

Finally, try removing a cubic function of the form $y = a_3 x^3 + a_2 x^2 + a_1 x + a_0$:

>> xc=detrend_signal(time,mauna,3)

In this case the 3 tells MATLAB to remove a 3rd order polynomial from the data, i.e. a cubic function.
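The inner workings of detrend_signal are not listed in this booklet. One plausible way to remove a polynomial trend, sketched here under that caveat, is a least-squares fit with the built-in polyfit and polyval. The synthetic series and the centring of the time axis are assumptions made for the example.

```matlab
% Sketch: polynomial detrending with built-in polyfit/polyval.
% This is an assumption about what a detrending step does, not detrend_signal.m.
t = (0:1:229)';                           % synthetic time axis
x = 0.002.*t.^2+0.1.*t+sin(2.*pi.*t./12); % quadratic trend plus a 12-point cycle

order = 2;                                % polynomial order to remove
tc = (t-mean(t))./std(t);                 % centre and scale time for a stable fit
p = polyfit(tc,x,order);                  % least-squares polynomial coefficients
trend = polyval(p,tc);                    % fitted trend at each time point
xc = x-trend;                             % detrended series
```

Changing order to 1 or 3 reproduces the straight-line and cubic cases. Removing too high an order risks fitting, and therefore removing, part of the signal you want to keep.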

Exercise 3.2 Calculating the first difference curve of the Mauna data

Calculate the first difference curve for the Mauna CO2 data. Then plot both the original data and the first-difference curve.

>> load mauna  % Load the data into MATLAB

There are two variables: time (units of days) and mauna (CO2 data, units of ppm).

>> xd=diff(mauna);  % Calculate the first-difference of the CO2 data
>> xt=(diff(time)./2)+time(1:229);  % Calculate the mid-points of the time data

Now we can start the plotting. We want to plot two sets of axes in one window, so we can use the subplot command.

>> subplot(2,1,1)

This tells MATLAB to set the figure window into a 2 x 1 matrix of plotting areas, and chooses the 1st area to be active.

>> plot(time,mauna)  % Plot the original data
>> xlabel('Time (days)')  % Puts a title onto the x-axis
>> ylabel('CO2 content (ppm)')  % Puts a title onto the y-axis

Now we want to make the second plot, so we give the subplot command again, but now choose the 2nd area to be active.

>> subplot(2,1,2)
>> plot(xt,xd)  % Plot the first-difference data
>> xlabel('Time (days)')  % Puts a title onto the x-axis
>> ylabel('First Difference (ppm)')  % Puts a title onto the y-axis
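The mid-point calculation uses a fixed index of 229, which only works for a record of 230 points. The sketch below shows a length-independent form of the same calculation using end. The sample times and values are made up for illustration, and the extra variable rate is an optional addition, not part of the exercise.

```matlab
% Sketch: first difference with mid-point times for a series of any length.
time = [0; 30; 61; 91; 122];             % made-up sample times (unequal spacing)
x = [315.0; 315.4; 316.1; 316.3; 317.0]; % made-up data values

xd = diff(x);                            % first difference, one point shorter than x
xt = time(1:end-1)+diff(time)./2;        % mid-point of each sampling interval
rate = xd./diff(time);                   % change per unit time (allows unequal spacing)
```

Dividing by diff(time) matters only when the spacing is uneven. For equally spaced data the first difference and the rate differ by a constant factor.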

Exercise 3.3 Using a moving average

The MATLAB file porosity contains normalised porosity data from core GeoB4311-02 taken from the equatorial Atlantic. Use the maverage function to produce a smooth record of the data. What is a good smoothing window to use?

>> load porosity  % Load the porosity data

First look at the data by making a simple plot:

>> plot(age,porosity,'g')  % This plots the data as a green line
>> xlabel('age (kyr)')  % Add an x-axis title
>> ylabel('normalised porosity')  % Add a y-axis title
>> hold on  % Allow lines to be added to the plot

You can read the instructions on the use of maverage. One required input is a matrix that describes the moving average window to be used. For example:

>> f=[1 1 1 1 1]  % 5pt moving average with equal weights
>> f=[1 3 5 3 1]  % 5pt weighted moving average
>> [Xout,Yout]=maverage(age,porosity,f);  % Calculate the smoothed data
>> plot(Xout,Yout,'r')  % Plot the smoothed data as a red line

If you want to try different smoothing windows in a new plot, give the command:

>> figure

This will produce a new plot window in which you can plot the new data.
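A weighted moving average can be written as a convolution of the data with the normalised window. The sketch below illustrates this with the built-in conv; it is not the course maverage.m, and how that function treats the ends of the record is not known here. The step-shaped test data are made up.

```matlab
% Sketch: weighted moving average via convolution (not the course maverage.m).
x = (1:1:20)';                 % made-up evenly spaced positions
y = [zeros(10,1); ones(10,1)]; % step signal to show the smoothing effect

f = [1 3 5 3 1]';              % 5pt weighted window
w = f./sum(f);                 % normalise so the weights sum to 1
half = (numel(f)-1)./2;        % points lost at each end of the record

Yout = conv(y,w,'valid');      % smoothed values where the window fits fully
Xout = x(1+half:end-half);     % matching positions (window centres)
```

A longer window gives a smoother curve but shortens the record by more points at each end and spreads sharp features, such as the step here, over a wider interval.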

Exercise 3.4 Filtering in the frequency domain

Use the function filter_signal to filter the normalised porosity from core GeoB4311. Use the subplot command to make plots which compare the filtered porosity record to the SPECMAP stack. The normalised oxygen isotope data for the SPECMAP stack can be found in the MATLAB file SPECMAP.

Exercise 4.1 Calculation of a periodogram

In this exercise we will construct two signals with different frequencies and amplitudes and determine their variance. The signals are then combined together and the periodogram (unsmoothed variance spectrum) is calculated using the function pdg. We can then check that the variance returned in the spectrum is the same as in the individual input components.

>> time=[0:1:1000]';  % Form a time array between 0 and 1000
>> f1=1./100;  % Frequency of the first signal
>> f2=1./40  % Frequency of the second signal
>> data1=sin(2.*pi.*time.*f1);  % Calculate the first sinusoid
>> data2=sin(2.*pi.*time.*f2);  % Calculate the second sinusoid

Now we have the sinusoids but they have equal amplitudes and equal variances. If we change the amplitude then that will also change the variance.

>> data1=data1.*4.5;  % Set the amplitude of the first signal to 4.5
>> data2=data2.*7.0;  % Set the amplitude of the second signal to 7.0
>> v1=var(data1)  % Calculate the variance of the first signal
>> v2=var(data2)  % Calculate the variance of the second signal
>> input_data=data1+data2;  % Combine the signals to give the final time series
>> subplot(2,1,1)  % Create a plot in the upper half of the figure
>> plot(time,data1,time,data2)  % Plot the individual sinusoids
>> xlabel('time')  % Add an x-axis title
>> ylabel('signal')  % Add a y-axis title
>> subplot(2,1,2)  % Create a plot in the lower half of the figure
>> plot(time,input_data)  % Plot the final input time series
>> xlabel('time')  % Add an x-axis title
>> ylabel('signal')  % Add a y-axis title

The final part of this exercise is to calculate the periodogram and plot the results.

>> [f,pyy]=pdg(time,input_data);  % Calculate the periodogram
>> figure  % Make a new figure window
>> plot(f,pyy)  % Plot frequency against the variance
>> xlim([0 0.05])  % Set the maximum plotted frequency to 0.05
>> xlabel('frequency')  % Add an x-axis title
>> ylabel('variance')  % Add a y-axis title

You should now be able to compare the peak heights in the periodogram (i.e. the variances of the different spectral components) to the variances you calculated for the individual input sinusoids. Are they the same?
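A sinusoid of amplitude A has a variance of A^2/2, so the two components should contribute about 4.5^2/2 = 10.125 and 7.0^2/2 = 24.5. The sketch below checks this with a variance-preserving spectrum built from the built-in fft. It illustrates the idea only and is not the course function pdg. It also uses 1000 points rather than 1001, an assumption made so that both periods fit the record a whole number of times and each peak falls on a single bin.

```matlab
% Sketch: variance of sinusoids and a variance-preserving spectrum (built-in fft).
% This is an illustration of the idea, not the course function pdg.
time = (0:1:999)';                          % 1000 points: whole numbers of cycles
data1 = 4.5.*sin(2.*pi.*time./100);         % amplitude 4.5, period 100
data2 = 7.0.*sin(2.*pi.*time./40);          % amplitude 7.0, period 40
input_data = data1+data2;

N = numel(input_data);
X = fft(input_data-mean(input_data));       % transform of the mean-removed series
pyy = 2.*abs(X(2:N/2)).^2./N.^2;            % variance per positive-frequency bin
f = (1:N/2-1)'./N;                          % matching frequencies

p1 = pyy(10);                               % bin at f = 10/1000 = 1/100
p2 = pyy(25);                               % bin at f = 25/1000 = 1/40
total = sum(pyy);                           % equals the variance of the whole series
```

Note that var in MATLAB divides by N-1 by default, whereas the spectrum sums to the variance normalised by N, i.e. var(input_data,1). For long records the difference is small.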

Exercise 4.2 Time in the Frequency domain

We can compare the frequency spectra of the original SPECMAP stack and a modified version. These signals are basically the opposite of each other, but how are their frequency spectra different?

>> clear all  % Remove all existing data from the memory
>> close all  % Close all the existing figure windows
>> load SPECMAP2  % Load the data into the memory

There should now be 3 variables in MATLAB's memory. The first is age, then data1, which is the normal SPECMAP record, and data2, which is the flipped and reversed stack. First you can plot the data to check they are okay.

>> subplot(2,1,1)  % Create a plot in the upper half of the figure
>> plot(age,data1);  % Plot the original SPECMAP data
>> subplot(2,1,2)  % Create a plot in the lower half of the figure
>> plot(age,data2);  % Plot the reversed SPECMAP data

Now a periodogram can be calculated for each signal and they can be compared in the frequency domain.

>> [f1,pyy1]=pdg(age,data1);  % Periodogram of the original SPECMAP data
>> figure  % Create a new plot window
>> subplot(2,1,1)  % Create a plot in the upper half of the figure
>> plot(f1,pyy1)  % Plot the periodogram
>> xlabel('frequency')  % Add an x-axis title
>> ylabel('variance (original)')  % Add a y-axis title
>> [f2,pyy2]=pdg(age,data2);  % Periodogram of the reversed SPECMAP data
>> subplot(2,1,2)  % Create a plot in the lower half of the figure
>> plot(f2,pyy2)  % Plot the periodogram
>> xlabel('frequency')  % Add an x-axis title
>> ylabel('variance (flipped)')  % Add a y-axis title

You should now be able to compare the periodograms of the two signals. What differences can you see?
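The comparison can be rehearsed on a synthetic series before using SPECMAP. The sketch below assumes that "flipped and reversed" means a change of sign combined with a reversal in time, and uses only the built-in fft. The test series is made up and deliberately asymmetric so that the reversed copy looks clearly different in the time domain.

```matlab
% Sketch: flipping and reversing a series leaves its power spectrum unchanged.
t = (0:1:255)';
x = exp(-t./60).*sin(2.*pi.*t./32)+0.3.*cos(2.*pi.*t./10); % asymmetric test series
x_flip = -flipud(x);                   % reversed in time and flipped in sign

P1 = abs(fft(x)).^2;                   % power from the original series
P2 = abs(fft(x_flip)).^2;              % power from the flipped, reversed series
max_diff = max(abs(P1-P2));            % differences are at rounding-error level

ph1 = angle(fft(x));                   % the phase is where the two differ
ph2 = angle(fft(x_flip));
```

The power at each frequency is identical for the two series; only the phases change. A periodogram therefore carries no information about the order in which events occurred.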

Exercise 4.3a Trends in the Frequency domain

Using the raw Mauna Loa CO2 data we can investigate the effect long-term trends have in the frequency domain.

>> clear all  % Remove all existing data from the memory
>> close all  % Close all the existing figure windows
>> load mauna  % Load the Mauna Loa data into the memory

There are two variables in the data set: time (measured in days) and mauna, which is the CO2 data (measured in ppm).

>> subplot(2,1,1)  % Create a plot in the upper half of the figure
>> plot(time,mauna);  % Plot the original CO2 data
>> xlabel('time (days)')
>> ylabel('CO2 content (ppm)')
>> subplot(2,1,2)  % Create a plot in the lower half of the figure
>> [f,pyy]=pdg(time,mauna);  % Periodogram of the CO2 data
>> plot(f,pyy);  % Plot the periodogram
>> xlabel('frequency (1/days)')
>> ylabel('variance')

What structure do you see in the periodogram?

Exercise 4.3b Trends in the Frequency domain (2)

You can now try detrending the Mauna Loa CO2 data using the function detrend_signal. What effect does the detrending have on the periodogram?

>> xc=detrend_signal(time,mauna,1);  % Detrend the data with a 1st order polynomial

We can now repeat the previous part of the exercise, but using the detrended rather than the raw data.

>> figure  % Make a new figure window
>> subplot(2,1,1)  % Create a plot in the upper half of the figure
>> plot(time,xc);  % Plot the detrended CO2 data
>> xlabel('time (days)')
>> ylabel('detrended signal')
>> subplot(2,1,2)  % Create a plot in the lower half of the figure
>> [f,pyy]=pdg(time,xc);  % Periodogram of the detrended data
>> plot(f,pyy);  % Plot the periodogram
>> xlabel('frequency (1/days)')
>> ylabel('variance')

How has the structure of the periodogram changed now the data has been detrended?

Exercise 4.4 Processing unequally spaced data

If we have data that are unequally spaced in the time domain then we must interpolate them onto an equally spaced time axis before calculating a traditional periodogram. In the file interpolation there are two versions of the same signal, one equally spaced and one unequally spaced in time.

>> load interpolation

The variables time_reg and data_reg make up the equally spaced data set, whilst time_irreg and data_irreg are the unequally spaced data set. First we can plot the signal.

>> subplot(3,1,1)  % Activate the top third of the plot window
>> plot(time_reg,data_reg);  % Plot the regularly spaced signal
>> xlabel('time')
>> ylabel('signal')

You will see this is a complicated signal and to understand it we should study the periodogram of the equally spaced data.

>> [f,pyy]=pdg(time_reg,data_reg);  % Periodogram of the regularly spaced signal
>> subplot(3,1,2)  % Activate the middle third of the plot window
>> plot(f,pyy)  % Plot the periodogram
>> xlabel('frequency')
>> ylabel('variance')

Examination of the periodogram shows that the signal is made up of 8 sinusoids with equal amplitudes and frequencies at spacings of 0.05.

To calculate a periodogram for the irregularly spaced signal we must interpolate the data onto a regularly spaced time array. The simplest way to do this is to interpolate it onto the original time_reg points.

>> data_interp=interp1(time_irreg,data_irreg,time_reg,'linear');

The above command performs a one-dimensional linear interpolation of the irregular signal (time_irreg and data_irreg) onto the regular time array (time_reg). We can now use the original time_reg points with the data_interp array to calculate a periodogram for the interpolated signal.

>> [f,pyy]=pdg(time_reg,data_interp);  % Periodogram of the interpolated signal
>> subplot(3,1,3)  % Activate the bottom third of the plot window
>> plot(f,pyy)  % Plot the periodogram
>> xlabel('frequency')
>> ylabel('variance')

What effect does the interpolation appear to have on the signal?
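The effect of linear interpolation can be isolated with a simple test case. The sketch below is an idealised example, not the data in the file interpolation: the samples are evenly spaced but offset by half a step, so that every target time falls exactly midway between two samples. In that case the interpolated value is the average of its two neighbours, and a sinusoid of frequency f0 comes out scaled by cos(pi*f0*dt) with dt = 1. The chosen frequency and all variable names are assumptions.

```matlab
% Sketch: amplitude loss caused by linear interpolation (built-in interp1).
% Worst case: every target time falls midway between two samples.
f0 = 0.4;                               % signal frequency (cycles per time unit)
time_samp = (0:1:200)'+0.5;             % sample times, offset by half a step
data_samp = sin(2.*pi.*f0.*time_samp);  % sampled signal
time_reg = (1:1:200)';                  % target time axis

data_interp = interp1(time_samp,data_samp,time_reg,'linear');
data_true = sin(2.*pi.*f0.*time_reg);   % what the signal really is at time_reg

gain = cos(pi.*f0);                     % predicted amplitude factor at mid-points
err = max(abs(data_interp-gain.*data_true));
```

The factor is close to 1 at low frequencies and falls towards 0 as f0 approaches the Nyquist frequency of 0.5, so linear interpolation acts like a low-pass filter on the record.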

Exercise 4.5 The CLEAN algorithm

The CLEAN algorithm is just one method by which frequency spectra can be calculated directly from unequally spaced data without a need for interpolation. To apply CLEAN to the data simply give the command:

>> [f,pyy]=clean(time_irreg,data_irreg);

You can see from the input variables that we are putting the irregularly spaced data directly into the algorithm without any interpolation.

>> figure
>> plot(f,pyy)  % Plot the CLEAN spectrum
>> xlabel('frequency')
>> ylabel('variance')

The high frequency parts of the signal are not attenuated and CLEAN has successfully produced an accurate spectrum for the signal.

There are a number of alternative methods available for spectral analysis of unevenly spaced data. A few references are:

Roberts D.H., Lehar J., Dreher J.W. (1987). Time Series Analysis with Clean - I - Derivation of a Spectrum. Astronomical J., 93(4), 968.

Scargle J.D. (1982). Studies in astronomical time series analysis II. Statistical aspects of spectral analysis of unevenly sampled data. Astrophysical J., 263, 835-853.

And with specific reference to palaeoclimatic data:

Schulz M. and Stattegger K. (1997). SPECTRUM: Spectral analysis of unevenly spaced paleoclimatic time series. Comput. Geosci., 23, 929-945.

Heslop D. and Dekkers M.J. (2002). Spectral analysis of unevenly spaced climatic time series using CLEAN: signal recovery and derivation of significance levels using a Monte Carlo simulation. Phys. Earth Planet. Inter., 130, 103-116.

Exercise 4.6 White noise in the Frequency Domain

We will construct 3 white noise signals with different lengths (64, 256 and 1024 data points) and study the form of their periodograms. We can generate normally distributed random numbers (with a mean = 0 and variance = 1.0) in MATLAB using the randn function.

>> time1=[0:1:63];  % Generate a time array ranging from 0 to 63
>> signal1=randn(1,64);  % 64 normally distributed random numbers for the signal
>> time2=[0:1:255];  % Generate a time array ranging from 0 to 255
>> signal2=randn(1,256);  % 256 normally distributed random numbers for the signal
>> time3=[0:1:1023];  % Generate a time array ranging from 0 to 1023
>> signal3=randn(1,1024);  % 1024 normally distributed random numbers for the signal

It is a good idea to first plot the signals so you have some idea what they look like.

>> subplot(3,1,1)  % Activate the top one-third of the plot window
>> plot(time1,signal1)  % Plot signal1
>> subplot(3,1,2)  % Activate the middle one-third of the plot window
>> plot(time2,signal2)  % Plot signal2
>> subplot(3,1,3)  % Activate the bottom one-third of the plot window
>> plot(time3,signal3)  % Plot signal3

We can now calculate the periodogram for each signal and plot them in a new figure window.

>> figure  % Make a new plot window
>> [f,pyy]=pdg(time1,signal1);  % Calculate the periodogram of signal1
>> subplot(3,1,1)  % Activate the top one-third of the plot window
>> plot(f,pyy)  % Plot the periodogram of signal1
>> xlabel('frequency')
>> ylabel('variance')
>> [f,pyy]=pdg(time2,signal2);  % Calculate the periodogram of signal2
>> subplot(3,1,2)  % Activate the middle one-third of the plot window
>> plot(f,pyy)  % Plot the periodogram of signal2
>> xlabel('frequency')
>> ylabel('variance')
>> [f,pyy]=pdg(time3,signal3);  % Calculate the periodogram of signal3
>> subplot(3,1,3)  % Activate the bottom one-third of the plot window
>> plot(f,pyy)  % Plot the periodogram of signal3
>> xlabel('frequency')
>> ylabel('variance')

Exercise 4.7 Welch-Overlapped-Segment-Averaging

Using the pwelch function it is possible to investigate how the power spectrum of a white noise signal changes as the WOSA method is applied.

>> clear all  % Remove existing variables from the memory
>> close all  % Close all the existing figure windows
>> signal=randn(1,1024);  % 1024 normally distributed random numbers for the signal

We can now calculate the spectra for the signal with different segment lengths (1024, 256 and 64) and then plot the results.

>> [Pyy,f]=pwelch(signal,1024,[],[],1);  % Calculate with a 1024 point segment
>> subplot(3,1,1)  % Activate the top one-third of the plot window
>> plot(f,Pyy)  % Plot the spectrum
>> xlabel('frequency')
>> ylabel('power')
>> [Pyy,f]=pwelch(signal,256,[],[],1);  % Calculate with 256 point segments
>> subplot(3,1,2)  % Activate the middle one-third of the plot window
>> plot(f,Pyy)  % Plot the spectrum
>> xlabel('frequency')
>> ylabel('power')
>> [Pyy,f]=pwelch(signal,64,[],[],1);  % Calculate with 64 point segments
>> subplot(3,1,3)  % Activate the bottom one-third of the plot window
>> plot(f,Pyy)  % Plot the spectrum
>> xlabel('frequency')
>> ylabel('power')

Compare the frequency spectra. It is important to consider how close they are to the theoretical spectrum for white noise and their relative frequency resolutions.
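The trade-off behind segment averaging can be seen in a stripped-down version of the method. The sketch below uses non-overlapping segments with no taper (Bartlett-style averaging), which is a simplification of WOSA and is not how pwelch works internally; pwelch overlaps and windows its segments by default. Only built-in functions are used, and the variable names are assumptions.

```matlab
% Sketch: segment-averaged spectrum with non-overlapping, unwindowed segments.
% This is Bartlett-style averaging, a simplification of WOSA; it is not pwelch.
signal = randn(1024,1);                 % white noise, zero mean, unit variance
L = 64;                                 % segment length
K = floor(numel(signal)./L);            % number of whole segments (16 here)

segs = reshape(signal(1:K.*L),L,K);     % one segment per column
P_all = abs(fft(segs)).^2./L;           % periodogram of every segment (column-wise)
P_avg = mean(P_all,2);                  % average across segments

P_half = P_avg(1:L/2+1);                % keep frequencies 0 to 0.5
f = (0:L/2)'./L;                        % frequency axis for a spacing of 1
```

Shorter segments give more periodograms to average, so the estimate scatters less around the flat white noise level, but the frequency axis has fewer points: 33 for 64-point segments against 513 for a single 1024-point segment.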

Exercise 4.8 Red noise in the Time and Frequency Domains

The function AR1n can be used to generate red noise series with a given value of ρ, the lag-one autocorrelation coefficient. We will generate 3 different series and see what they look like in both the time and frequency domains.

>> clear all  % Remove existing variables from the memory
>> close all  % Close all the existing figure windows
>> rho=0.0  % Define the lag-one autocorrelation coefficient
>> [Rn1,Wn1]=AR1n(rho,2048);  % Create a series of length 2048

The AR1n function has two outputs: Rn1 is the red noise series and Wn1 is the white noise series that was used in the construction of Rn1.

>> figure  % Generate a new plot window
>> subplot(2,1,1)  % Activate the upper half of the plot window
>> plot([0:1:2047],Wn1,'b')  % Plot the white noise as a blue line
>> hold on  % Allow more data to be added to the chart
>> plot([0:1:2047],Rn1,'r')  % Plot the red noise as a red line

We saw in the previous exercise that to obtain a consistent spectrum we should use the WOSA method included in the function pwelch.

>> [Pyy,f]=pwelch(Rn1,64,[],[],1);  % Calculate with 64 point segments
>> subplot(2,1,2)  % Activate the lower half of the plot window
>> hold off  % Switch off the hold command
>> plot(f,Pyy)  % Plot the spectrum
>> xlabel('frequency')
>> ylabel('power')

Because we set ρ = 0 the calculated series is simply white noise and should have a flat spectrum. What happens if you have a non-zero value for the lag-one autocorrelation coefficient? Repeat the exercise with ρ = 0.8 and ρ = 0.99.
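Red noise of this kind is usually described by a first-order autoregressive recursion, in which each value is ρ times the previous value plus a new white noise value. The sketch below implements that recursion with the built-in filter. It is an assumption about what a function like AR1n might do, not its actual code, and any scaling that AR1n applies to the noise is not reproduced.

```matlab
% Sketch: AR(1) red noise, Rn(t) = rho*Rn(t-1) + Wn(t), using built-in filter.
% An assumption about what a function like AR1n might do, not its actual code.
rho = 0.8;                         % lag-one autocorrelation coefficient
n = 2048;                          % series length
Wn = randn(n,1);                   % white noise driving the process
Rn = filter(1,[1 -rho],Wn);        % apply the recursion to every point

% sample lag-one autocorrelation of the result, for comparison with rho
Rc = Rn-mean(Rn);
r1 = sum(Rc(1:end-1).*Rc(2:end))./sum(Rc.^2);
```

Setting rho to 0 returns the white noise unchanged. As rho approaches 1 each value depends more strongly on the one before it, so the series wanders slowly and its power shifts towards low frequencies.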

Exercise 5.1 A Stationary Signal

First construct a vector containing the points in time, in this case starting at 0 and finishing at 3199 in intervals of 1 unit.

>> time=[0:1:3199]';

We will calculate 4 sine waves with different frequencies, so first we define the frequency values:

>> f1=1./400;  % Long period Eccentricity
>> f2=1./100;  % Eccentricity
>> f3=1./41;  % Obliquity
>> f4=1./23;  % Precession

Now calculate a sine wave for each of the 4 frequencies:

>> signal1=sin(2.*pi.*f1.*time);
>> signal2=sin(2.*pi.*f2.*time);
>> signal3=sin(2.*pi.*f3.*time);
>> signal4=sin(2.*pi.*f4.*time);

Finally we add all 4 signals together to produce a composite signal:

>> signal=signal1+signal2+signal3+signal4;
>> plot(time,signal)

Now determine the periodogram using pdg, plot the resulting frequency spectrum and set the x-axis limits to 0 and 0.1.

>> [f,Pyy]=pdg(time,signal);

The frequency data is given in the variable f and the power of the signal at each frequency is stored in the variable Pyy.

>> figure
>> plot(f,Pyy)
>> xlim([0 0.1])
>> xlabel('Frequency')
>> ylabel('Variance')
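pdg is a course function, so its exact scaling is not known here. As a rough sketch of the idea, a basic periodogram can be formed from the built-in fft function as shown below. The scaling by N is one common convention and is an assumption, so the absolute values may differ from those returned by pdg.

```matlab
time   = (0:1:3199)';
signal = sin(2.*pi.*time./400) + sin(2.*pi.*time./100) + ...
         sin(2.*pi.*time./41)  + sin(2.*pi.*time./23);

N   = numel(signal);
Y   = fft(signal - mean(signal));   % discrete Fourier transform
nf  = floor(N/2) + 1;               % keep the non-negative frequencies only
f   = (0:nf-1)'./N;                 % frequency axis for a time spacing of 1
Pyy = abs(Y(1:nf)).^2./N;           % raw periodogram (assumed scaling)

figure
plot(f,Pyy), xlim([0 0.1])
xlabel('Frequency'), ylabel('Power')
```

The periods 400 and 100 fit a whole number of times into the 3200-point record, so they fall exactly on Fourier frequencies. The periods 41 and 23 do not, so their peaks are spread slightly over neighbouring frequencies.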

Exercise 5.2 A nonstationary signal

Now we'll construct a signal that changes through time. We will work with 4 different time segments and each one will contain a cycle with a different frequency. Finally, we will combine the segments and calculate a periodogram.

Segment 1: spans 0 to 799 kyr with a frequency of 1/400 kyr^-1
Segment 2: spans 800 to 1599 kyr with a frequency of 1/100 kyr^-1
Segment 3: spans 1600 to 2399 kyr with a frequency of 1/41 kyr^-1
Segment 4: spans 2400 to 3199 kyr with a frequency of 1/23 kyr^-1

As a first step we construct our 4 different time segments:

>> clear all, close all
>> time1=[0:1:799]';  % Time segment spanning 0 to 799
>> time2=[800:1:1599]';  % Time segment spanning 800 to 1599
>> time3=[1600:1:2399]';  % Time segment spanning 1600 to 2399
>> time4=[2400:1:3199]';  % Time segment spanning 2400 to 3199

Then define the 4 different frequencies:

>> f1=1./400;  % Long period Eccentricity
>> f2=1./100;  % Eccentricity
>> f3=1./41;  % Obliquity
>> f4=1./23;  % Precession

Now calculate the sine waves for each time segment:

>> signal1=sin(2.*pi.*f1.*time1);  % Long eccentricity spanning time 0 to 799
>> signal2=sin(2.*pi.*f2.*time2);  % Eccentricity spanning time 800 to 1599
>> signal3=sin(2.*pi.*f3.*time3);  % Obliquity spanning time 1600 to 2399
>> signal4=sin(2.*pi.*f4.*time4);  % Precession spanning time 2400 to 3199

Now we must combine the time segments and the signal segments. We do this using the square brackets operator, which allows us to glue existing variables together:

>> time=[time1;time2;time3;time4];

This tells MATLAB to make a new variable called time that starts with the vector time1. Underneath time1 it places the vector time2, underneath that it places time3, and finally it places time4 at the bottom. Now we follow the same procedure for the signal, plot the result and perform the spectral analysis:

>> signal=[signal1;signal2;signal3;signal4];
>> plot(time,signal)
>> [f,Pyy]=pdg(time,signal);
>> plot(f,Pyy)
>> xlim([0 0.1])
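The effect of the separator used inside the square brackets can be checked on two small vectors. This is a minimal sketch with made-up values; a, b, stacked and side are illustrative names only.

```matlab
a = [1;2;3];          % 3x1 column vector
b = [4;5;6];          % 3x1 column vector

stacked = [a;b];      % semicolon: b is placed underneath a, giving 6x1
side    = [a,b];      % comma: b is placed beside a, giving 3x2

disp(size(stacked))
disp(size(side))
```

Stacking with semicolons is what keeps time and signal as single columns of matching length, which is what the plotting and spectral functions expect.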

Exercise 5.3 Evolutionary spectral analysis of a nonstationary signal

Perform evolutionary spectral analysis on the nonstationary signal you produced in Exercise 5.2. Perform the analysis with a spacing of 10 kyr and try some different window lengths (for example 200 kyr, 500 kyr, 800 kyr and 1200 kyr), e.g.

>> evolpsd(time,signal,200,10);
>> evolpsd(time,signal,500,10);
>> evolpsd(time,signal,800,10);
>> evolpsd(time,signal,1200,10);

What influence does varying the window length have? What window lengths do you need to obtain a good resolution in both time and frequency?
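evolpsd is a course function and its internals are not shown. The idea behind evolutionary spectral analysis can be sketched as a periodogram computed inside a window that slides along the record. The code below is an illustration under that assumption only: it uses no tapering, and the names win, step, starts and S are hypothetical.

```matlab
% Rebuild the nonstationary signal: one period per 800-point segment
t = (0:3199)';
p = [400 100 41 23];                 % periods of the four segments
sig = zeros(size(t));
for k = 1:4
    idx = (k-1)*800 + (1:800);
    sig(idx) = sin(2*pi*t(idx)/p(k));
end

win  = 500;                          % window length
step = 10;                           % spacing between successive windows
starts = 1:step:(numel(t)-win+1);    % first sample of each window
nf = floor(win/2) + 1;               % number of non-negative frequencies
f  = (0:nf-1)'/win;                  % frequency axis, spacing 1/win
S  = zeros(nf,numel(starts));
for j = 1:numel(starts)
    seg = sig(starts(j):starts(j)+win-1);
    seg = seg - mean(seg);           % remove the window mean
    Y = fft(seg);
    S(:,j) = abs(Y(1:nf)).^2/win;    % periodogram of this window
end
tmid = t(starts) + (win-1)/2;        % time at the centre of each window

imagesc(tmid,f,S), axis xy, ylim([0 0.1])
xlabel('Time'), ylabel('Frequency'), colorbar
```

The frequency spacing is 1/win, so a long window separates neighbouring frequencies well but smears changes in time, while a short window does the opposite. A 200 kyr window, for example, cannot contain even one full 400 kyr cycle.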

Exercise 5.4: Evolutionary spectral analysis of the ODP677 δ18O record

Load the data into MATLAB from the file odp677:

>> load odp677

There are two variables: age and data.

The ODP677 δ18O record is nonstationary. Interpret its spectral content and the climatic information it carries using evolutionary spectral analysis. Do the relative contributions of the different Milankovitch cycles change through time?

>> evolpsd(age,data,window,spacing)

Adjust the window size to get the best balance between resolution in time and resolution in frequency.

Extra Hint: Before you start the analysis, plot the data. Do you think it is necessary to detrend the signal first?

Exercise 7.1 PCA of a collection of time series

The data are stored in the file pca_climate.mat. There are 4 variables: age (a common time scale for the data), ms (magnetic susceptibility), gs (grain size) and d18o (oxygen isotopes). We'll now plot the 3 time series.

>> clear all, close all  % Clear the memory and close all figures
>> load pca_climate  % Load the data file
>> figure  % Create a new figure
>> subplot(3,1,1)  % axes in 3 rows & 1 column, activate first set of axes
>> plot(age,zscore(ms))  % plot the standardized magnetic susceptibility data
>> ylabel('Normalized Susceptibility')  % label the y-axis
>> subplot(3,1,2)  % axes in 3 rows & 1 column, activate second set of axes
>> plot(age,zscore(gs))  % plot the standardized grain size data
>> ylabel('Normalized Grain size')  % label the y-axis
>> subplot(3,1,3)  % axes in 3 rows & 1 column, activate third set of axes
>> plot(age,zscore(d18o))  % plot the standardized oxygen isotope data
>> ylabel('Normalized d18o')  % label the y-axis
>> xlabel('Age [ka]')  % label the x-axis

We'll combine the 3 time series, which are on a common timescale, into a single matrix and then standardize the columns. If the time series were not on a common timescale then they would have to be interpolated first to obtain versions that are. PCA is performed using the function princomp. With the results we can find out what proportion of the variance the 1st PC explains and plot the scores as a function of age.

>> X=[ms,gs,d18o];  % form a matrix composed of the time series
>> X=zscore(X);  % standardize the columns of X
>> [loadings,scores,latent]=princomp(X);  % perform the PCA
>> latent=latent./sum(latent)  % normalize the PC contributions
>> latent(1)  % the proportion of variance explained by the 1st PC
>> figure  % Generate a new figure
>> plot(age,scores(:,1))  % plot the scores of the 1st PC
>> ylabel('PC Score')  % label the y-axis
>> xlabel('Age [ka]')  % label the x-axis

Using the example above, plot the scores of the 2nd PC as a function of age.
What proportion of the total variance does the 2nd PC account for?
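To see where the loadings, scores and latent values come from, the same quantities can be obtained from an eigen-decomposition of the covariance matrix of the standardized data. The sketch below uses synthetic data rather than pca_climate.mat, and only built-in functions; the variable names common, Z, V, D and explained are illustrative, and signs of the components may differ from those returned by princomp.

```matlab
n = 300;
common = cumsum(randn(n,1));                 % a shared underlying signal
X = [common + randn(n,1), ...                % series 1 follows the signal
     -common + randn(n,1), ...               % series 2 mirrors it
     randn(n,1)];                            % series 3 is pure noise

Z = bsxfun(@minus,X,mean(X));                % centre each column
Z = bsxfun(@rdivide,Z,std(Z));               % scale each column to unit std

[V,D] = eig(cov(Z));                         % eigenvectors and eigenvalues
[latent,order] = sort(diag(D),'descend');    % largest variance first
loadings  = V(:,order);                      % one column per component
scores    = Z*loadings;                      % component time series
explained = latent./sum(latent);             % proportion of total variance

disp(explained')
```

Because the columns are standardized, the eigenvalues sum to the number of series (3 here), so dividing by their sum gives the proportion of variance that each component explains.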

Exercise 7.2 PCA and the Hockey Stick

We'll run a test with a collection of red noise time series to analyze the PCA technique employed by Mann et al. (1999) and see how it compares to the classical PCA approach. To start we'll create a collection of age points between 1400 and 2000 AD and then generate 112 red noise time series with ρ=0.8.

>> clear all, close all
>> age=[1400:1:2000]';  % Create the age points
>> Rn1=AR1n(0.8,numel(age),112);  % Generate 112 red noise time series
>> figure  % New figure
>> plot(age,Rn1)  % Plot the time series
>> xlabel('Age [Yr]')  % Label the x axis
>> ylabel('Red Noise input')  % Label the y axis

Now calculate the 1st principal component of the data using the classical approach, where the standard deviation of each record is set to 1 and the mean is set to 0.

>> A=bsxfun(@minus,Rn1,mean(Rn1));  % Subtract the mean of each record
>> A=bsxfun(@rdivide,A,std(A));  % Divide by the std of each record
>> [coeffa,scorea] = princomp_hs(A);  % Perform the PCA on the data in A
>> figure  % New figure
>> plot(age,zscore(scorea(:,1)),'b')  % Plot the scores of the 1st component
>> xlabel('Age [Yr]')  % Label the x axis
>> ylabel('PCA Score')  % Label the y axis

Now we'll calculate the 1st principal component of the data using the Mann approach, where the standard deviation of each record is set to 1 and the mean is set to 0 over the period 1900-2000 AD only.

>> idx=find(age>=1900 & age<=2000);  % Find points in the calibration period
>> B=bsxfun(@minus,Rn1,mean(Rn1(idx,:)));  % Normalize the mean
>> B=bsxfun(@rdivide,B,std(B(idx,:)));  % Normalize the standard deviation
>> [coeffb,scoreb] = princomp_hs(B);  % Perform the PCA on the data in B
>> hold on  % Add to the current plot
>> plot(age,zscore(scoreb(:,1)),'r')  % Plot the scores of the 1st component
>> legend('Classical PCA','Mann PCA','Location','best')  % Add a legend

Look at the final plot and compare the time series generated by the classical and Mann approaches.
Remember, we are working with random data, which should show no preference for increasing or decreasing during the calibration period.
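The practical difference between the two normalizations can be checked directly on synthetic data. The sketch below is only an illustration: it generates the red noise with the built-in filter function instead of AR1n, and it takes the leading component from an SVD with no further centring, which is an assumption about what princomp_hs does rather than a description of it.

```matlab
age = (1400:1:2000)';
R   = filter(1,[1 -0.8],randn(numel(age),112));  % 112 AR(1) series in columns
idx = age>=1900 & age<=2000;                     % calibration period

% Classical: zero mean and unit standard deviation over the full record
A = bsxfun(@minus,R,mean(R));
A = bsxfun(@rdivide,A,std(A));

% Calibration-period convention: statistics taken from 1900-2000 only
B = bsxfun(@minus,R,mean(R(idx,:)));
B = bsxfun(@rdivide,B,std(B(idx,:)));

% Leading component from an SVD with no further centring (assumption)
[Ua,Sa,~] = svd(A,'econ');  pc1A = Ua(:,1).*Sa(1,1);
[Ub,Sb,~] = svd(B,'econ');  pc1B = Ub(:,1).*Sb(1,1);

fullMeanB = mean(B);   % generally non-zero: B is not centred on the full record
disp(max(abs(fullMeanB)))
```

Series whose calibration-period mean happens to differ from their long-term mean keep that offset in B, so they carry extra weight in the decomposition. Note that the sign of a principal component is arbitrary, so pc1A and pc1B may appear flipped.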

Exercise 8.1 Correlation and autocorrelation

We'll generate two random red noise time series consisting of 200 points and test the significance of their correlation with and without employing the effective sample size.

>> clear all, close all  % Clear the memory, close all existing figures
>> N=200;  % Define the length of the time series
>> X=AR1n(0.95,N);  % 200 point red noise series with ρ=0.95
>> Y=AR1n(0.99,N);  % 200 point red noise series with ρ=0.99
>> figure  % Generate a new figure
>> subplot(2,1,1)  % 2 x 1 subplot, activate the first plot
>> plot(1:200,X,'k')  % Plot the X time series
>> ylabel('X Series')  % label the y-axis
>> subplot(2,1,2)  % 2 x 1 subplot, activate the second plot
>> plot(1:200,Y,'k')  % Plot the Y time series
>> ylabel('Y Series')  % label the y-axis
>> xlabel('Time')  % label the x-axis

We can now form a bivariate plot and calculate the correlation.

>> figure, plot(X,Y,'.')  % Form an XY plot of the data
>> xlabel('X'), ylabel('Y')  % Label the axes
>> r = corrcoef(X,Y)  % Calculate the correlation matrix
>> r = abs(r(2,1));  % Find the absolute value of r

First we'll assess whether the correlation is significant whilst ignoring the existence of any autocorrelation.

>> t0 = r.*sqrt((N-2)./(1-r.^2))  % Calculate the t statistic
>> t_crit0=tinv(1-0.05./2,N-2)  % Calculate the critical value of t

Is t0 > t_crit0? If so, we can state that the two time series are significantly correlated at the α = 0.05 level. If not, the correlation is not significant. Now we'll assess whether the correlation is significant whilst taking the existence of autocorrelation into account.

>> Neff=N.*(1-0.95*0.99)./(1+0.95*0.99)  % Effective number of samples
>> t1 = r.*sqrt((Neff-2)./(1-r.^2))  % Calculate the t statistic
>> t_crit1=tinv(1-0.05./2,Neff-2)  % Calculate the critical value of t

Is t1 > t_crit1? If so, we can state that the two time series are significantly correlated at the α = 0.05 level. If not, the correlation is not significant.
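Written out, the two tests in this exercise use the expressions below, taken from the commands above, with \rho_X and \rho_Y denoting the lag-one autocorrelation coefficients of X and Y. The arithmetic is worked for the values used here (N = 200, \rho_X = 0.95, \rho_Y = 0.99).

```latex
\[
t = r\sqrt{\frac{N-2}{1-r^{2}}},
\qquad
N_{\mathrm{eff}} = N\,\frac{1-\rho_X\rho_Y}{1+\rho_X\rho_Y}
\]
\[
N_{\mathrm{eff}} = 200 \times \frac{1-0.9405}{1+0.9405}
                 = 200 \times \frac{0.0595}{1.9405}
                 \approx 6.13
\]
```

The degrees of freedom therefore fall from N - 2 = 198 to roughly N_eff - 2 ≈ 4.1, which both lowers the t statistic and raises the critical value it has to exceed.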

Exercise 8.2 A semi-empirical ice volume model

We'll test the significance of the correlation upon which ice volume predictions are made. First we'll load the data and plot the time series.

>> clear all, close all  % Clear the memory
>> load sealevel_data  % load the data
>> temp=detrend(temp);  % detrend the temperature rise
>> rate=detrend(rate);  % detrend the rate
>> subplot(2,1,1)  % 2 x 1 plots, use the 1st plot
>> plot(temp,'.k-')  % plot the temperature rise
>> ylabel('Warming above 1951-1980 mean')  % label the y-axis
>> subplot(2,1,2)  % 2 x 1 plots, use the 2nd plot
>> plot(rate,'.k-')  % plot the rate
>> ylabel('Rate of Change (cm/yr)')  % label the y-axis
>> xlabel('Time')  % label the x-axis

Now test the significance of the relationship without taking the autocorrelation into account.

>> r = corrcoef(temp,rate);  % Calculate the correlation matrix
>> r = abs(r(2,1));  % The absolute correlation coefficient
>> N=length(rate)  % Number of values in the time series
>> t0 = r.*sqrt((N-2)./(1-r.^2))  % calculate the t statistic
>> t_crit0=tinv(1-0.05./2,N-2)  % calculate the critical value of t

Now we'll repeat the test, but first estimate the effective number of samples based on the autocorrelation coefficients.

>> r = corrcoef(temp,rate);  % Calculate the correlation matrix
>> r = abs(r(2,1));  % The absolute correlation coefficient
>> N=length(rate)  % Number of values in the time series
>> rho_temp=ar1(temp);  % AR1 coefficient of temp
>> rho_rate=ar1(rate);  % AR1 coefficient of rate
>> Neff=N.*(1-rho_temp*rho_rate)./(1+rho_temp*rho_rate)  % effective N
>> t1 = r.*sqrt((Neff-2)./(1-r.^2))  % calculate the t statistic
>> t_crit1=tinv(1-0.05./2,Neff-2)  % calculate the critical value of t

Draw conclusions concerning the validity of the model, given the values of t0, t_crit0, t1 and t_crit1.
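ar1 is a course function, so how it estimates the coefficient is not shown. One simple estimator of a lag-one autocorrelation coefficient, given here as a hedged sketch and not as the ar1 implementation, is the correlation between a series and a copy of itself shifted by one step. The test series and the names x0, c and rho_est are illustrative.

```matlab
x = filter(1,[1 -0.8],randn(20000,1));   % synthetic AR(1) series with rho = 0.8

x0 = x - mean(x);                        % remove the mean
c  = corrcoef(x0(1:end-1),x0(2:end));    % correlate the series with its lag-1 copy
rho_est = c(2,1);                        % estimated lag-one autocorrelation

disp(rho_est)
```

For short records such as those in this exercise the estimate is much noisier than for the long synthetic series used here, which is worth keeping in mind when interpreting Neff.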

Using the maverage.m function

[Xout,Yout]=maverage(x,y,f)

This function applies a moving average to your data.

Inputs
x: x-axis data, this will normally be depth or time
y: y-axis data, this is the signal data
f: the moving average window

Output
Xout: the smoothed version of x
Yout: the smoothed version of y

Examples
For a traditional 3-point moving average use:
[Xout,Yout]=maverage(x,y,[1 1 1])

For a traditional 5-point moving average use:
[Xout,Yout]=maverage(x,y,[1 1 1 1 1])

To apply different weights to each position just enter them into f. For a 5-point weighted moving average use:
[Xout,Yout]=maverage(x,y,[1 3 5 3 1])
(this applies a weight of 1 to the first point, 3 to the second point, 5 to the third point, 3 to the fourth point and 1 to the fifth point).

You can use any window length and weightings you want.
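For readers who want to see what a weighted moving average does numerically, the sketch below reproduces the idea with the built-in conv function. It is not the maverage.m source: normalizing the weights to sum to 1 and dropping the end points where the window does not fit are both assumptions, and an odd-length symmetric window is assumed.

```matlab
x = (1:10)';                    % example x-axis data
y = 2.*x + 1;                   % example signal: a straight line
f = [1 3 5 3 1];                % weights of the moving average window

w    = f(:)./sum(f);            % normalize so the weights sum to 1 (assumption)
Yout = conv(y,w,'valid');       % weighted average at each full window position
half = (numel(f)-1)/2;          % points lost at each end (odd-length window)
Xout = x(1+half:end-half);      % x values at the window centres

% conv reverses the weights internally; this makes no difference for a
% symmetric window such as the one used here.
disp([Xout Yout])
```

Because the test signal is a straight line and the weights are symmetric, the smoothed values coincide with the original values at the window centres, which is a quick way to confirm that the weights and the alignment of Xout are correct.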

Using the filter_signal.m function

[Xout,Yout]=filter_signal(x,y,f,type)

This function applies a frequency domain filter to your data.

Inputs
x: x-axis data, this will normally be depth or time
y: y-axis data, this is the signal data
f: cut-off frequency. For a low-pass or high-pass filter this is a single cut-off value. For a band-pass filter you must provide two frequencies, which are the minimum and maximum frequencies that will be allowed through the filter, i.e. [fmin fmax].
type: the kind of filter the function should apply. You can enter:
'low' for a low-pass filter
'high' for a high-pass filter
'band' for a band-pass filter

Output
Xout: the output of x
Yout: the filtered version of y

Examples
[Xout,Yout]=filter_signal(x,y,f,'low') for a low-pass filter.
[Xout,Yout]=filter_signal(x,y,f,'high') for a high-pass filter.
[Xout,Yout]=filter_signal(x,y,[fmin fmax],'band') for a band-pass filter.
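The principle of a frequency domain filter can be sketched with the built-in fft and ifft functions: transform the signal, set the unwanted Fourier coefficients to zero, and transform back. The example below is a minimal low-pass illustration with a sharp cut-off on evenly spaced data. It is not the filter_signal.m source, which may for instance use a smoother transition, and the names fc, Y and ylow are illustrative.

```matlab
dt = 1;                                   % sample spacing
n  = 1000;                                % number of samples
x  = (0:n-1)'.*dt;                        % x-axis data
y  = sin(2.*pi.*x./100) + sin(2.*pi.*x./10);   % slow cycle plus fast cycle
fc = 0.05;                                % cut-off frequency

Y = fft(y);                               % move to the frequency domain
f = (0:n-1)'./(n.*dt);                    % frequency of each FFT bin
f = min(f, 1./dt - f);                    % fold so that f holds |frequency|
Y(f > fc) = 0;                            % remove everything above the cut-off
ylow = real(ifft(Y));                     % back to the time domain

plot(x,y,'k',x,ylow,'r')
xlabel('x'), ylabel('Signal')
legend('Original','Low-pass filtered')
```

A high-pass version would instead zero the bins with f < fc, and a band-pass version the bins outside [fmin fmax].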