A question for the programmers.
I work extensively with the ‘LineThroughPt’ command, and while reproducing the theoretical example I’ve attached, I’ve encountered a mathematical inconsistency: the line produced by the Rhino command’s minimization does not match the one computed in Wolfram Mathematica using the ‘least squares method.’ As shown in the figure, the differences are quite evident.
I wonder which formulation the Rhino programmers have used and whether they are aware of this error. Thank you.
Hello, a developer answered a similar question, ending with this:
" But if they need details and a mathematical formula they should implement it themselves or use a scientific computing package with a robust documentation. Rhino is not really made with writing academic papers in mind and there is not one definite and simple math formula in Rhino’s LineThroughPt."
Thank you, Pascal, for your response, which I do not understand, however. Rhino is an extraordinary calculator for three-dimensional geometries, but here we are talking about the geometric regressions needed in a CAD environment, both 2D and 3D. I don’t expect to use it in a scientific context; I am simply asking to verify the theoretical foundations of the regression methods (linear, circular, and planar) implemented in Rhino. These are not complex calculations; a spreadsheet such as Excel can perform them easily and correctly. I consider these tools fundamental for reverse engineering, but they should be mathematically accurate to achieve correct geometries.
This difference is not due to a mathematical error. It arises because Rhino does not use what statisticians and Mathematica call “the least-squares method”. Instead, Rhino finds a line through points by computing an orthogonal distance regression.
The least-squares method is used in statistics to fit the graph of a function to a collection of sample points that are assumed to have no error in their x coordinate (the independent variable). All the error is assumed to be on the y coordinate (the dependent variable or observable). At a given x coordinate, we want the graph to be as close as possible to the sample at this x coordinate, minimizing the error in the y coordinate only. As a result, when fitting a linear function, the line obtained is the line that minimizes the sum of the squared vertical distances between the points and the line.
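Written out, this is the textbook formulation of that fit. The symbols are introduced here only for illustration: sample points (x_i, y_i) for i = 1..n, a line y = a x + b, and the sample means \bar{x} and \bar{y}. It assumes the x values are not all equal.

```latex
\min_{a,\,b} \; \sum_{i=1}^{n} \bigl( y_i - (a x_i + b) \bigr)^2
\qquad\Longrightarrow\qquad
a = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b = \bar{y} - a\,\bar{x}
```

Only the y residuals appear in the sum, which is why swapping the roles of x and y generally gives a different line.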
In Rhino, however, points are assumed to have measurement errors in all directions: x, y (and z if the points are 3D). In that case, the distance used to measure the error in the fit is the orthogonal distance between the point and the graph, i.e. the distance between the point and its closest point on the graph. When fitting a line to points, the result is the line that minimizes the sum of the squared orthogonal distances between the points and the line. This is called an orthogonal distance regression.
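For reference, the standard way to write this objective is sketched below. The notation is an assumption made for illustration and is not taken from Rhino's code: the line is described by a point c and a unit direction d, and the p_i are the sample points in 2D or 3D.

```latex
\min_{c,\; \|d\| = 1} \; \sum_{i=1}^{n}
\bigl\| (p_i - c) - \bigl( (p_i - c) \cdot d \bigr)\, d \bigr\|^2
```

A well-known result is that a minimizer passes through the centroid of the points, with d an eigenvector for the largest eigenvalue of the scatter matrix \sum_i (p_i - \bar{p})(p_i - \bar{p})^T. Whether Rhino computes it exactly this way is not something stated in this thread.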
This article has an image showing the differences between the two distances in the common case of a linear regression:
And for a fun example where results are completely different, yet mathematically correct in both cases, try fitting a line through the points (0,0), (1,0), (0,2), (1,2):
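To check this without Rhino or Mathematica, here is a minimal sketch in plain Python. It is not Rhino's implementation; the function names are made up for illustration, and the orthogonal fit uses the closed-form principal-axis angle that is valid in 2D only.

```python
import math

def centroid_and_scatter(points):
    """Return the centroid (mx, my) and the scatter sums sxx, syy, sxy."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return mx, my, sxx, syy, sxy

def fit_vertical(points):
    """Least squares on vertical distances: returns (slope, intercept)."""
    mx, my, sxx, _, sxy = centroid_and_scatter(points)
    slope = sxy / sxx  # fails if all x are equal (sxx == 0)
    return slope, my - slope * mx

def fit_orthogonal(points):
    """Orthogonal distance regression in 2D: returns (point, unit direction)."""
    mx, my, sxx, syy, sxy = centroid_and_scatter(points)
    # Angle of the principal axis of the 2x2 scatter matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))

def sum_sq_orthogonal(points, origin, direction):
    """Sum of squared orthogonal distances to the line origin + t * direction."""
    ox, oy = origin
    dx, dy = direction
    return sum(((x - ox) * dy - (y - oy) * dx) ** 2 for x, y in points)

points = [(0, 0), (1, 0), (0, 2), (1, 2)]

slope, intercept = fit_vertical(points)
origin, direction = fit_orthogonal(points)

print("vertical-distance fit: y = %.3f x + %.3f" % (slope, intercept))
print("orthogonal fit: through", origin, "along", direction)
print("orthogonal error of each line:",
      sum_sq_orthogonal(points, (0.5, 1.0), (1.0, 0.0)),
      sum_sq_orthogonal(points, origin, direction))
```

The vertical-distance fit is the horizontal line y = 1, while the orthogonal fit is the vertical line x = 0.5 through the centroid. Measured by orthogonal distance, the horizontal line has a squared error of 4 and the vertical line a squared error of 1, so each line is the correct answer to its own minimization problem.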
I’m sorry that thread got derailed a little bit and had to be closed, but if you’ve got any more questions on this topic, I’ll be happy to answer them by private message here or in a new thread.