Fitting a Sine Curve to Data
Steven R. Dunbar
Department of Mathematics
University of Nebraska-Lincoln
Lincoln, NE, USA 68588
sdunbar@math.unl.edu
http://www.math.unl.edu/~sdunbar
sinefit.mws
Copyright: 1999, 2005
Lesson Information
Physics Content
Making a best sine curve fit to a set of sparse data from observations of the star 51 Pegasi.
Making a best sine curve fit to a set of sparse data from observation of the tides in the Bay of Fundy.
Mathematics Content
Curve fitting, and simple statistics, least-squares optimization
Keywords
best fit curve, least-squares fitting, sine curve fit, sparse data
Objective
Illustrate nonlinear curve fitting with Maple, using both elementary commands and sophisticated tools.
Audience
Anyone interested in using Maple to do simple curve curve fitting.
Time
30-60 minutes
Prerequisite
Understanding of curve fitting problem.
Understanding of mean of a set of data.
Understanding of sine and arcsine functions.
Understanding vertical, horizontal shifting of a graph, understanding change of scale of a graph.
Understanding of least-squares line fitting.
References
http://www.physics.upenn.edu/courses/gladney/talks/partnerships/Pegasi/index.html
Plan
We want to find the parameters A, f, phi, K, so that the curve
y = A*sin(f*(x - phi)) + K
fits the set of data points { [x_i, y_i]} well.
The mean of the curve is clearly K, so we can take the mean of the data to be K.
The vertical range of the curve is 2*A, so we can take the vertical range of the vertically rescaled data to be 2*A, and solve for A.
Then
(y - K)/A = sin(f*(x - phi))
and
(1/f)* arcsin( (y-K)/A ) + phi = x
so the line
(1/f)*z + phi = x
should be a good fit to the set of data points { [arcsin( (y_i -K)/A, x_i)] }
| > | restart; |
Fitting a Sine Curve to Sparse Data from Observations of the Star 51 Pegasi
Data
Velocity of the star 51 Pegasi measured in m/s
| > | Velocity := [2.65, 2.80, 2.96, 3.80, 3.90, 4.60, 4.80, 4.90, 5.65, 5.92]; |
Days, measured in Julian Days
| > | Days := [-45.5, -48.8, -54.0, -13.5, -7.0, 42.0, 50.5, 54.0, 36.0, 14.5]; |
| > | Pairs := zip((x,y) -> [x,y], Velocity, Days); |
| > | plot( Pairs, symbol = DIAMOND, style = POINT); |
![[Plot]](images/sinefit_4.gif)
| > |
Mean of the Data, the vertical translation of the data, range
| > | with(stats); |
| > | K := describe[mean](Days); |
| > | RescaledDays := map( y -> y - K, Days); |
![]()
| > | ScaledDayRange := describe[range](RescaledDays); |
| > | A := max( abs(op(1,ScaledDayRange)), abs(op(2, ScaledDayRange))); |
| > |
Transform the Data
| > | ArcsinScaledDays
:= map( y -> arcsin( (y-K)/A), Days); |
![]()
| > |
Fitting Transformed Data
| > | FittedLine := fit[leastsquare[[z,x], x=RecipFreq*z+phi]]([ArcsinScaledDays,Velocity]); |
| > | f := 1/coeff(rhs(FittedLine), z, 1);
phi := coeff(rhs(FittedLine), z, 0); |
| > | DataPlot := plot( Pairs, symbol = DIAMOND, style = POINT): |
| > | CurvePlot := plot( A*sin(f*(x-phi)) + K, x=Velocity[1]..Velocity[nops(Velocity)], color = blue): |
| > | plots[display](DataPlot, CurvePlot); |
![[Plot]](images/sinefit_16.gif)
| > | RMSError :=
sqrt( sum( ( A*sin(f*(Velocity[i]-phi))+K - Days[i])^2, i = 1..nops(Velocity) ) ); |
| > |
Some Additional Fitting by Eye
| > | f := 1.5;
CurvePlot := plot( A*sin(1.5*(x-phi)) + K, x=Velocity[1]..Velocity[nops(Velocity)], color = blue): |
| > | plots[display](DataPlot, CurvePlot); |
![[Plot]](images/sinefit_19.gif)
| > | RMSError :=
sqrt( sum( ( A*sin(f*(Velocity[i]-phi))+K - Days[i])^2, i = 1..nops(Velocity) ) ); |
| > | phi := 4.0;
CurvePlot := plot( A*sin(f*(x-phi)) + K, x=Velocity[1]..Velocity[nops(Velocity)], color = blue): |
| > | plots[display](DataPlot, CurvePlot); |
![[Plot]](images/sinefit_22.gif)
| > | RMSError :=
sqrt( sum( ( A*sin(f*(Velocity[i]-phi))+K - Days[i])^2, i = 1..nops(Velocity) ) ); |
| > |
Curve Fitting by Least-Squares (Nonlinear) Optimization
| > | with(Optimization); |
| > | NonlinearResiduals := [ seq( A0*sin(f0*(Velocity[i]-phi0))+K0 - Days[i], i = 1..nops(Velocity)) ]: |
| > | lssoln := LSSolve( NonlinearResiduals, initialpoint={A0=A, f0=f, phi0=phi, K0=K}); |
![]()
| > | CurvePlot := plot( subs( lssoln[2], A0*sin(f0*(x-phi0)) + K0), x=Velocity[1]..Velocity[nops(Velocity)], color = blue): |
| > | plots[display]( DataPlot, CurvePlot); |
![[Plot]](images/sinefit_27.gif)
| > | RMSError :=
sqrt( sum( ( subs( lssoln[2], A0*sin(f0*(Velocity[i]-phi0))+K0) - Days[i])^2, i = 1..nops(Velocity) ) ); |
The RMSError and the objective function measured by the Least-Squares minimization are related by Least-Squares = (1/2)*RMSError^2:
| > | lssoln[1], (1/2)*RMSError^2; |
| > |
Fitting a Sine Curve to Sparse Data from observations of the Tides in the Bay of Fundy
| > | restart; |
The Bay of Fundy in Nova Scotia has some of the largest tides in the world: the water level can rise and fall 50 feet in one day. Nova Scotia bends when the tide comes in! As 14 billion tonnes (14 cubic kilometers) of sea water flow into Minas Basin twice daily, the Nova Scotia countryside actually tilts slightly under the immense load. In this project I will analyze some data concerning those tides.
The table below gives times and heights for low and high tides at Cape Blomidon. The heights are measured in meters and represent the depth of the water compared to a fixed "standard" waterline.
From the data, find a function that models the water depth at Cape Blomidon as a function of time. The model should cover the entire range of time covered by the data. Make clear the units and the meaning of the independent variable, and the function values.
| > |
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()
| > |
Data
| > | TidalHeights := [0.88, 12.71,0.88,12.63,0.78,12.59,1.06,12.57,0.88,12.32, 1.40, 12.35, 1.15,11.94, 1.84, 12.02]; |
Times, measured from midnight, 28 September
| > | TidalTimes := map( evalf,
[19+24/60, 24+1+38/60, 24+7+47/60, 24+14, 24+20+7/60, 48+2+22/60, 48+8+28/60, 48+14+41/60, 48+20+49/60, 72+3+3/60, 72+9+8/60, 72+15+20/60, 72+21+30/60, 96+3+45/60, 96+9+48/60, 96+16+1/60]); |
![]()
| > | Pairs := zip((x,y) -> [x,y], TidalTimes, TidalHeights); |
![]()
![]()
| > | plot( Pairs, symbol = DIAMOND, style = POINT); |
![[Plot]](images/sinefit_90.gif)
| > |
Mean of the Data, the vertical translation of the data, range
| > | with(stats); |
| > | K := describe[mean](TidalHeights); |
| > | RescaledTidalHeights := map( y -> y - K, TidalHeights); |
![]()
![]()
| > | ScaledTidalHeightsRange := describe[range](RescaledTidalHeights); |
| > | A := max( abs( op(1,ScaledTidalHeightsRange)), abs( op(2, ScaledTidalHeightsRange))); |
| > |
Transform the Data
| > | ArcsinScaledTidalHeights
:= map( y -> arcsin( (y-K)/A), TidalHeights); |
![]()
![]()
| > |
Fitting Transformed Data
| > | FittedLine := fit[leastsquare[[z,x], x=RecipFreq*z+phi]]([ArcsinScaledTidalHeights,TidalTimes]); |
| > | f := 1/coeff(rhs(FittedLine), z, 1);
phi := coeff(rhs(FittedLine), z, 0); |
| > | DataPlot := plot( Pairs, symbol = DIAMOND, style = POINT): |
| > | CurvePlot := plot( A*sin(f*(x-phi)) + K, x=TidalTimes[1]..TidalTimes[nops(TidalTimes)], color = blue): |
| > | plots[display](DataPlot, CurvePlot); |
![[Plot]](images/sinefit_104.gif)
| > | RMSError :=
sqrt( sum( ( A*sin(f*(TidalTimes[i]-phi))+K - TidalHeights[i])^2, i = 1..nops(TidalTimes) ) ); |
| > |
Some Additional Fitting by Eye
| > | AverageHigh := describe[mean]([ seq(TidalHeights[2*i], i = 1..nops(TidalHeights)/2 )]);
AverageLow := describe[mean]([ seq(TidalHeights[2*i-1], i = 1..nops(TidalHeights)/2 )]); A := (AverageHigh - AverageLow)/2; K := (AverageHigh + AverageLow)/2; |
| > | period := describe[mean]( [ seq( TidalTimes[i+2]-TidalTimes[i], i=1..nops(TidalTimes)-2) ]); |
| > | f := evalf(2*Pi/period); |
| > | phi := solve( f*(TidalTimes[1]-x) = -Pi/2, x); |
| > | CurvePlot := plot( A*sin(f*(x-phi)) + K, x=TidalTimes[1]..TidalTimes[nops(TidalTimes)], color = blue): |
| > | plots[display](DataPlot, CurvePlot); |
![[Plot]](images/sinefit_113.gif)
| > | RMSError :=
sqrt( sum( ( A*sin(f*(TidalTimes[i]-phi))+K - TidalHeights[i])^2, i = 1..nops(TidalTimes) ) ); |
| > |
Curve Fitting by Least-Squares (Nonlinear) Optimization
| > | with(Optimization); |
| > | A;f;phi;K; |
| > | NonlinearResiduals := [ seq( A0*sin(f0*(TidalTimes[i]-phi0))+K0 - TidalHeights[i], i = 1..nops(TidalTimes)) ]: |
| > | lssoln := LSSolve( NonlinearResiduals, initialpoint={A0=A, f0=f, phi0=phi, K0=K}); |
![]()
| > | CurvePlot := plot( subs( lssoln[2], A0*sin(f0*(x-phi0)) + K0), x=TidalTimes[1]..TidalTimes[nops(TidalTimes)], color = blue): |
| > | plots[display](DataPlot,CurvePlot); |
![[Plot]](images/sinefit_122.gif)
| > | RMSError :=
sqrt( sum( ( subs( lssoln[2], A0*sin(f0*(TidalTimes[i]-phi0))+K0 - TidalHeights[i])^2), i = 1..nops(TidalTimes) ) ); |
The RMSError and the objective function measured by the Least-Squares minimization are related by Least-Squares = (1/2)*RMSError^2:
| > | lssoln[1], (1/2)*RMSError^2; |
| > |