Labfans是一个针对大学生、工程师和科研工作者的技术社区。 | 论坛首页 | 联系我们(Contact Us) |
![]() |
![]() |
#1 |
高级会员
注册日期: 2019-11-21
帖子: 3,006
声望力: 66 ![]() |
![]()
We have been investigating a recent bug report about fitnlm, the Statistics and Machine Learning Toolbox function for robust fitting of nonlinear models.
Contents
![]() The bug report comes from Greg Pelletier, an independent research scientist and biogeochemical modeler in Olympia, Washington. Greg has been studying the vulnerability of sensitive marine organisms to increases in ocean acidification. One of the most important of these organisms is Mercenaria mercenaria, the hard clam. Especially here in New England, hard clams are known by their traditional Native American name, quahog. They have a well-deserved reputation for making excellent clam chowder. Acidification We are all aware of increasing levels of carbon dioxide in the earth's atmosphere. We may not be as aware of the effect this increase has on the health of the earth's oceans. According to NOAA, the ocean absorbs about 30% of the atmospheric carbon dioxide. A definitive and controversial 2009 paper by Justin Ries and colleagues, then at the Woods Hole Oceanographic Institution, is "Marine calcifiers exhibit mixed responses to CO2-induced ocean acidification", https://doi.org/10.1130/G30210A.1. The hard clam example in Greg's bug report comes from figure 1K in the Ries et al. paper. The independent variable in experiments is the ratio of alkalinity of sea water to the concentration of dissolved inorganic carbon. The dependent variable is the calcification rate, which compares how fast the organism builds its shells to how fast the shells are dissolving. ![]() Separable Least Squares The model chosen by Ries at al. is $$ y \approx \beta_1 + \beta_2 e^{\lambda t} $$ where $t$ is the ratio of alkalinity to dissolved carbon and $y$ is the calcification rate. The data have only four distinct values of $t$, with several observations of $y$ at each value. The parameters $\beta_1$, $\beta_2$ and $\lambda$ are determined by least squares curve fit. This is a separable least squares problem. For any given value of $\lambda$, the parameters $\beta_1$ and $\beta_2$ occur linearly and the least squares solution can be obtained by MATLAB's backslash. Gene Golub and Victor Pereyra described separable least squares in 1973 and proposed solving it by a variable projection algorithm. Since 1973 a number of people, including Pereyra, Linda Kaufman, Fred Krogh, John Bolstadt and David Gay, have contributed to the development of a series of Fortran programs named varpro. In 2011, Dianne O'Leary and Burt Rust created a MATLAB version of varpro. Their report, https://www.cs.umd.edu/~oleary/software/varpro/, is a good background source, as well as documentation for varpro.m. I have a section on separable least squares, and an example, expfitdemo, in NCM, Numerical Computing with MATLAB. I have modified expfitdemo to work on Greg's quahogs problem. Centering Data It turns out that the problem Greg encountered can be traced to the fact that the data are not centered. The given values of $t$ are all positive. This causes fitnlm to print a warning message and attempt to rectify the situation by changing the degrees of freedom from 22 to 23, but this only makes the situation worse. (We should take another look at the portion of fitnlm that adjusts the degrees of freedom.) It is always a good idea in curve fitting to center the data with something like t = t - mean(t)The values of $y$ are already pretty well centered. Rescaling $y$ with y = 10000*ymakes interpretation of results easier. Exponential Fitting With the data centered and scaled, we have three different ways of tackling Greg's problem. All three methods agree on the results they compute.
![]() Software The codes for this post are available here quahogs_driver.m and here varpro.m. Thanks Thanks to Greg Pelletier for the bug report and to Tom Lane for his statistical expertise. |
![]() |
![]() |