Fitting a Sine Curve to Data

Steven R. Dunbar

Department of Mathematics

University of Nebraska-Lincoln

Lincoln, NE, USA  68588

sdunbar@math.unl.edu

http://www.math.unl.edu/~sdunbar

sinefit.mws

Copyright: 1999, 2005

Lesson Information

Physics Content

Making a best sine curve fit to a set of sparse data from observations of the star 51 Pegasi.

Making a best  sine curve fit to a set of sparse data from observation of the tides in the Bay of Fundy.

Mathematics Content

Curve fitting, and simple statistics, least-squares optimization

Keywords

best fit curve, least-squares fitting, sine curve fit, sparse data

Objective

Illustrate nonlinear curve fitting with Maple, using both elementary commands and sophisticated tools.

Audience

Anyone interested in using Maple to do simple curve curve fitting.

Time

30-60 minutes

Prerequisite

Understanding of curve fitting problem.

Understanding of mean of a set of data.

Understanding of sine and arcsine functions.

Understanding vertical, horizontal shifting of a graph, understanding change of scale of a graph.

Understanding of least-squares line fitting.

References

http://www.physics.upenn.edu/courses/gladney/talks/partnerships/Pegasi/index.html

Plan

We want to find the parameters A, f, phi, K, so that the curve

y = A*sin(f*(x - phi)) + K

fits the set of data points { [x_i, y_i]} well.

The mean of the curve is clearly K, so we can take the mean of the data to be K.

The vertical range of the curve is 2*A, so we can take the vertical range of the vertically rescaled data to be 2*A, and solve for A.

Then

(y - K)/A = sin(f*(x - phi))

and

(1/f)* arcsin( (y-K)/A ) + phi = x

so the line

(1/f)*z + phi = x

should be a good fit to the set of data points { [arcsin( (y_i -K)/A, x_i)] }

> restart;

Fitting a Sine Curve to Sparse Data from Observations of the Star 51 Pegasi

Data

Velocity of the star 51 Pegasi measured in m/s

> Velocity := [2.65, 2.80, 2.96, 3.80, 3.90, 4.60, 4.80, 4.90, 5.65, 5.92];

Velocity := [2.65, 2.80, 2.96, 3.80, 3.90, 4.60, 4.80, 4.90, 5.65, 5.92]

Days, measured in Julian Days

> Days := [-45.5, -48.8, -54.0, -13.5, -7.0, 42.0, 50.5, 54.0, 36.0, 14.5];

Days := [-45.5, -48.8, -54.0, -13.5, -7.0, 42.0, 50.5, 54.0, 36.0, 14.5]

> Pairs := zip((x,y) -> [x,y], Velocity, Days);

Pairs := [[2.65, -45.5], [2.80, -48.8], [2.96, -54.0], [3.80, -13.5], [3.90, -7.0], [4.60, 42.0], [4.80, 50.5], [4.90, 54.0], [5.65, 36.0], [5.92, 14.5]]

> plot( Pairs, symbol = DIAMOND, style = POINT);

[Plot]

>

Mean of the Data, the vertical translation of the data, range

> with(stats);

[anova, describe, fit, importdata, random, statevalf, statplots, transform]

> K := describe[mean](Days);

K := 2.820000000

> RescaledDays := map( y -> y - K, Days);

RescaledDays := [-48.32000000, -51.62000000, -56.82000000, -16.32000000, -9.820000000, 39.18000000, 47.68000000, 51.18000000, 33.18000000, 11.68000000]RescaledDays := [-48.32000000, -51.62000000, -56.82000000, -16.32000000, -9.820000000, 39.18000000, 47.68000000, 51.18000000, 33.18000000, 11.68000000]

> ScaledDayRange := describe[range](RescaledDays);

ScaledDayRange := -56.82000000 .. 51.18000000

> A := max( abs(op(1,ScaledDayRange)), abs(op(2, ScaledDayRange)));

A := 56.82000000

>

Transform the Data

> ArcsinScaledDays
:= map( y -> arcsin( (y-K)/A), Days);

ArcsinScaledDays := [-1.016754184, -1.139639582, -1.570796327, -.2913262132, -.1736986022, .7608619127, .9957022703, 1.121468282, .6235851710, .2070373460]ArcsinScaledDays := [-1.016754184, -1.139639582, -1.570796327, -.2913262132, -.1736986022, .7608619127, .9957022703, 1.121468282, .6235851710, .2070373460]

>

Fitting Transformed Data

> FittedLine := fit[leastsquare[[z,x], x=RecipFreq*z+phi]]([ArcsinScaledDays,Velocity]);

FittedLine := x = 1.010138883*z+4.246846268

> f := 1/coeff(rhs(FittedLine), z, 1);
phi := coeff(rhs(FittedLine), z, 0);

f := .9899628822

phi := 4.246846268

> DataPlot := plot( Pairs, symbol = DIAMOND, style = POINT):

> CurvePlot := plot( A*sin(f*(x-phi)) + K, x=Velocity[1]..Velocity[nops(Velocity)], color = blue):

> plots[display](DataPlot, CurvePlot);

[Plot]

> RMSError :=
sqrt(

   sum(

       ( A*sin(f*(Velocity[i]-phi))+K - Days[i])^2,

       i = 1..nops(Velocity)

   )

);

RMSError := 61.55198827

>

Some Additional Fitting by Eye

> f := 1.5;
CurvePlot := plot( A*sin(1.5*(x-phi)) + K, x=Velocity[1]..Velocity[nops(Velocity)], color = blue):

f := 1.5

> plots[display](DataPlot, CurvePlot);

[Plot]

> RMSError :=
sqrt(

   sum(

       ( A*sin(f*(Velocity[i]-phi))+K - Days[i])^2,

       i = 1..nops(Velocity)

   )

);

RMSError := 41.41143989

> phi := 4.0;
CurvePlot := plot( A*sin(f*(x-phi)) + K, x=Velocity[1]..Velocity[nops(Velocity)], color = blue):

phi := 4.0

> plots[display](DataPlot, CurvePlot);

[Plot]

> RMSError :=
sqrt(

   sum(

       ( A*sin(f*(Velocity[i]-phi))+K - Days[i])^2,

       i = 1..nops(Velocity)

   )

);

RMSError := 10.51640692

>

Curve Fitting by Least-Squares (Nonlinear) Optimization

> with(Optimization);

[ImportMPS, Interactive, LPSolve, LSSolve, Maximize, Minimize, NLPSolve, QPSolve]

> NonlinearResiduals := [ seq( A0*sin(f0*(Velocity[i]-phi0))+K0 - Days[i], i = 1..nops(Velocity)) ]:

> lssoln := LSSolve( NonlinearResiduals, initialpoint={A0=A, f0=f, phi0=phi, K0=K});

lssoln := [6.58359492644242117, [phi0 = 4.00788329266851751, A0 = 53.2912683442544974, f0 = 1.50491325251160024, K0 = 1.57411173451896635]]lssoln := [6.58359492644242117, [phi0 = 4.00788329266851751, A0 = 53.2912683442544974, f0 = 1.50491325251160024, K0 = 1.57411173451896635]]

> CurvePlot := plot( subs( lssoln[2], A0*sin(f0*(x-phi0)) + K0), x=Velocity[1]..Velocity[nops(Velocity)], color = blue):

> plots[display]( DataPlot, CurvePlot);

[Plot]

> RMSError :=
sqrt(

   sum(

       ( subs( lssoln[2], A0*sin(f0*(Velocity[i]-phi0))+K0) - Days[i])^2,

       i = 1..nops(Velocity)

   )

);

RMSError := 3.628662256

The RMSError and the objective function measured by the Least-Squares minimization are related by Least-Squares = (1/2)*RMSError^2:

> lssoln[1], (1/2)*RMSError^2;

6.58359492644242117, 6.583594885

>

Fitting a Sine Curve to Sparse Data from observations of the Tides in the Bay of Fundy

> restart;

The Bay of Fundy in Nova Scotia has some of the largest tides in the world:  the water level can rise and fall 50 feet in one day.  Nova Scotia bends when the tide comes in! As 14 billion tonnes (14 cubic kilometers) of sea water flow into Minas Basin twice daily, the Nova Scotia countryside actually tilts slightly under the immense load.  In this project I will analyze some data concerning those tides.

The table below gives times and heights for low and high tides at Cape Blomidon.  The heights are measured in meters and represent the depth of the water compared to a fixed "standard" waterline.

From the data, find a function that models the water depth at Cape Blomidon as a function of time.  The model should cover the entire range of time covered by the data.  Make clear the units and the meaning of the independent variable, and the function values.  

>

Date Day High*Time High*HeightHigh*Height Low*Time Low*Height High*Time High*HeightHigh*Height Low*Time Low*Height
28*Sep Tue





1924 .88
29*Sep Wed 138 12.71 747 .88 1400 12.63 2007 .78
30*Sep Thu 222 12.59 828 1.06 1441 12.57 2049 .88
Oct Fri 303 12.32 908 1.40 1520 12.35 2130 1.15
2*Oct Sat 345 11.94 948 1.84 1601 12.02

>

Data

> TidalHeights := [0.88, 12.71,0.88,12.63,0.78,12.59,1.06,12.57,0.88,12.32, 1.40, 12.35, 1.15,11.94, 1.84, 12.02];

TidalHeights := [.88, 12.71, .88, 12.63, .78, 12.59, 1.06, 12.57, .88, 12.32, 1.40, 12.35, 1.15, 11.94, 1.84, 12.02]

Times, measured from midnight, 28 September

> TidalTimes := map( evalf,
         [19+24/60,

         24+1+38/60, 24+7+47/60, 24+14, 24+20+7/60,

         48+2+22/60, 48+8+28/60, 48+14+41/60, 48+20+49/60,

         72+3+3/60,  72+9+8/60,  72+15+20/60,  72+21+30/60,

         96+3+45/60, 96+9+48/60, 96+16+1/60]);

TidalTimes := [19.40000000, 25.63333333, 31.78333333, 38., 44.11666667, 50.36666667, 56.46666667, 62.68333333, 68.81666667, 75.05000000, 81.13333333, 87.33333333, 93.50000000, 99.75000000, 105.8000000...TidalTimes := [19.40000000, 25.63333333, 31.78333333, 38., 44.11666667, 50.36666667, 56.46666667, 62.68333333, 68.81666667, 75.05000000, 81.13333333, 87.33333333, 93.50000000, 99.75000000, 105.8000000...

> Pairs := zip((x,y) -> [x,y], TidalTimes, TidalHeights);

Pairs := [[19.40000000, .88], [25.63333333, 12.71], [31.78333333, .88], [38., 12.63], [44.11666667, .78], [50.36666667, 12.59], [56.46666667, 1.06], [62.68333333, 12.57], [68.81666667, .88], [75.05000...Pairs := [[19.40000000, .88], [25.63333333, 12.71], [31.78333333, .88], [38., 12.63], [44.11666667, .78], [50.36666667, 12.59], [56.46666667, 1.06], [62.68333333, 12.57], [68.81666667, .88], [75.05000...Pairs := [[19.40000000, .88], [25.63333333, 12.71], [31.78333333, .88], [38., 12.63], [44.11666667, .78], [50.36666667, 12.59], [56.46666667, 1.06], [62.68333333, 12.57], [68.81666667, .88], [75.05000...

> plot( Pairs, symbol = DIAMOND, style = POINT);

[Plot]

>

Mean of the Data, the vertical translation of the data, range

> with(stats);

[anova, describe, fit, importdata, random, statevalf, statplots, transform]

> K := describe[mean](TidalHeights);

K := 6.750000000

> RescaledTidalHeights := map( y -> y - K, TidalHeights);

RescaledTidalHeights := [-5.870000000, 5.960000000, -5.870000000, 5.880000000, -5.970000000, 5.840000000, -5.690000000, 5.820000000, -5.870000000, 5.570000000, -5.350000000, 5.600000000, -5.600000000,...RescaledTidalHeights := [-5.870000000, 5.960000000, -5.870000000, 5.880000000, -5.970000000, 5.840000000, -5.690000000, 5.820000000, -5.870000000, 5.570000000, -5.350000000, 5.600000000, -5.600000000,...RescaledTidalHeights := [-5.870000000, 5.960000000, -5.870000000, 5.880000000, -5.970000000, 5.840000000, -5.690000000, 5.820000000, -5.870000000, 5.570000000, -5.350000000, 5.600000000, -5.600000000,...

> ScaledTidalHeightsRange := describe[range](RescaledTidalHeights);

ScaledTidalHeightsRange := -5.970000000 .. 5.960000000

> A := max( abs( op(1,ScaledTidalHeightsRange)), abs( op(2, ScaledTidalHeightsRange)));

A := 5.970000000

>

Transform the Data

> ArcsinScaledTidalHeights
:= map( y -> arcsin( (y-K)/A), TidalHeights);

ArcsinScaledTidalHeights := [-1.387507530, 1.512908336, -1.387507530, 1.396937719, -1.570796327, 1.361726790, -1.263314789, 1.346156370, -1.387507530, 1.202656293, -1.111010151, 1.216882418, -1.216882...ArcsinScaledTidalHeights := [-1.387507530, 1.512908336, -1.387507530, 1.396937719, -1.570796327, 1.361726790, -1.263314789, 1.346156370, -1.387507530, 1.202656293, -1.111010151, 1.216882418, -1.216882...ArcsinScaledTidalHeights := [-1.387507530, 1.512908336, -1.387507530, 1.396937719, -1.570796327, 1.361726790, -1.263314789, 1.346156370, -1.387507530, 1.202656293, -1.111010151, 1.216882418, -1.216882...

>

Fitting Transformed Data

> FittedLine := fit[leastsquare[[z,x], x=RecipFreq*z+phi]]([ArcsinScaledTidalHeights,TidalTimes]);

FittedLine := x = 2.336435638*z+65.75776848

> f := 1/coeff(rhs(FittedLine), z, 1);
phi := coeff(rhs(FittedLine), z, 0);

f := .4280023741

phi := 65.75776848

> DataPlot := plot( Pairs, symbol = DIAMOND, style = POINT):

> CurvePlot := plot( A*sin(f*(x-phi)) + K, x=TidalTimes[1]..TidalTimes[nops(TidalTimes)], color = blue):

> plots[display](DataPlot, CurvePlot);

[Plot]

> RMSError :=
sqrt(

   sum(

       ( A*sin(f*(TidalTimes[i]-phi))+K - TidalHeights[i])^2,

       i = 1..nops(TidalTimes)

   )

);

RMSError := 25.21098210

>

Some Additional Fitting by Eye

> AverageHigh := describe[mean]([ seq(TidalHeights[2*i], i = 1..nops(TidalHeights)/2 )]);
AverageLow := describe[mean]([ seq(TidalHeights[2*i-1], i = 1..nops(TidalHeights)/2 )]);

A := (AverageHigh - AverageLow)/2;

K := (AverageHigh + AverageLow)/2;

AverageHigh := 12.39125000

AverageLow := 1.108750000

A := 5.641250000

K := 6.750000000

> period := describe[mean]( [ seq( TidalTimes[i+2]-TidalTimes[i], i=1..nops(TidalTimes)-2) ]);

period := 12.34166667

> f := evalf(2*Pi/period);

f := .5091034685

> phi := solve( f*(TidalTimes[1]-x) = -Pi/2, x);

phi := 22.48541667

> CurvePlot := plot( A*sin(f*(x-phi)) + K, x=TidalTimes[1]..TidalTimes[nops(TidalTimes)], color = blue):

> plots[display](DataPlot, CurvePlot);

[Plot]

> RMSError :=
sqrt(

   sum(

       ( A*sin(f*(TidalTimes[i]-phi))+K - TidalHeights[i])^2,

       i = 1..nops(TidalTimes)

   )

);

RMSError := 1.210391489

>

Curve Fitting by Least-Squares (Nonlinear) Optimization

> with(Optimization);

[ImportMPS, Interactive, LPSolve, LSSolve, Maximize, Minimize, NLPSolve, QPSolve]

> A;f;phi;K;

5.641250000

.5091034685

22.48541667

6.750000000

> NonlinearResiduals := [ seq( A0*sin(f0*(TidalTimes[i]-phi0))+K0 - TidalHeights[i], i = 1..nops(TidalTimes)) ]:

> lssoln := LSSolve( NonlinearResiduals, initialpoint={A0=A, f0=f, phi0=phi, K0=K});

lssoln := [.146936077892452610, [A0 = 5.92429337951901846, f0 = .515526112861721675, K0 = 6.79905069581446232, phi0 = 22.5976309365197316]]lssoln := [.146936077892452610, [A0 = 5.92429337951901846, f0 = .515526112861721675, K0 = 6.79905069581446232, phi0 = 22.5976309365197316]]

> CurvePlot := plot( subs( lssoln[2], A0*sin(f0*(x-phi0)) + K0), x=TidalTimes[1]..TidalTimes[nops(TidalTimes)], color = blue):

> plots[display](DataPlot,CurvePlot);

[Plot]

> RMSError :=
sqrt(

   sum(

       ( subs( lssoln[2], A0*sin(f0*(TidalTimes[i]-phi0))+K0 - TidalHeights[i])^2),

       i = 1..nops(TidalTimes)

   )

);

RMSError := .5420997685

The RMSError and the objective function measured by the Least-Squares minimization are related by Least-Squares = (1/2)*RMSError^2:

> lssoln[1], (1/2)*RMSError^2;

.146936077892452610, .1469360795

>