Excel for Microsoft 365 Excel for Microsoft 365 for Mac Excel for the web Excel 2021 Excel 2021 for Mac Excel 2019 Excel 2019 for Mac Excel 2016 Excel 2016 for Mac Excel 2013 Excel 2010 Excel 2007 Excel for Mac 2011 Excel Starter 2010 More...Less This article describes the formula syntax and usage of the SUMXMY2 function in Microsoft Excel. Returns the sum of squares of differences of corresponding values in two arrays. SUMXMY2(array_x, array_y) The SUMXMY2 function syntax has the following arguments:
Copy the example data in the following table, and paste it in cell A1 of a new Excel worksheet. For formulas to show results, select them, press F2, and then press Enter. If you need to, you can adjust the column widths to see all the data.
This article describes the formula syntax and usage of the LINEST function in Microsoft Excel. Find links to more information about charting and performing a regression analysis in the See Also section. DescriptionThe LINEST function calculates the statistics for a line by using the "least squares" method to calculate a straight line that best fits your data, and then returns an array that describes the line. You can also combine LINEST with other functions to calculate the statistics for other types of models that are linear in the unknown parameters, including polynomial, logarithmic, exponential, and power series. Because this function returns an array of values, it must be entered as an array formula. Instructions follow the examples in this article. The equation for the line is: y = mx + b –or– y = m1x1 + m2x2 + ... + b if there are multiple ranges of x-values, where the dependent y-values are a function of the independent x-values. The m-values are coefficients corresponding to each x-value, and b is a constant value. Note that y, x, and m can be vectors. The array that the LINEST function returns is {mn,mn-1,...,m1,b}. LINEST can also return additional regression statistics. SyntaxLINEST(known_y's, [known_x's], [const], [stats]) The LINEST function syntax has the following arguments: Syntax
The following illustration shows the order in which the additional regression statistics are returned.
Remarks
ExamplesExample 1 - Slope and Y-InterceptCopy the example data in the following table, and paste it in cell A1 of a new Excel worksheet. For formulas to show results, select them, press F2, and then press Enter. If you need to, you can adjust the column widths to see all the data.
Example 2 - Simple Linear RegressionCopy the example data in the following table, and paste it in cell A1 of a new Excel worksheet. For formulas to show results, select them, press F2, and then press Enter. If you need to, you can adjust the column widths to see all the data.
Example 3 - Multiple Linear RegressionCopy the example data in the following table, and paste it in cell A1 of a new Excel worksheet. For formulas to show results, select them, press F2, and then press Enter. If you need to, you can adjust the column widths to see all the data.
Example 4 - Using the F and r2 StatisticsIn the preceding example, the coefficient of determination, or r2, is 0.99675 (see cell A17 in the output for LINEST), which would indicate a strong relationship between the independent variables and the sale price. You can use the F statistic to determine whether these results, with such a high r2 value, occurred by chance. Assume for the moment that in fact there is no relationship among the variables, but that you have drawn a rare sample of 11 office buildings that causes the statistical analysis to demonstrate a strong relationship. The term "Alpha" is used for the probability of erroneously concluding that there is a relationship. The F and df values in output from the LINEST function can be used to assess the likelihood of a higher F value occurring by chance. F can be compared with critical values in published F-distribution tables or the FDIST function in Excel can be used to calculate the probability of a larger F value occurring by chance. The appropriate F distribution has v1 and v2 degrees of freedom. If n is the number of data points and const = TRUE or omitted, then v1 = n – df – 1 and v2 = df. (If const = FALSE, then v1 = n – df and v2 = df.) The FDIST function— with the syntax FDIST(F,v1,v2)— will return the probability of a higher F value occurring by chance. In this example, df = 6 (cell B18) and F = 459.753674 (cell A18). Assuming an Alpha value of 0.05, v1 = 11 – 6 – 1 = 4 and v2 = 6, the critical level of F is 4.53. Since F = 459.753674 is much higher than 4.53, it is extremely unlikely that an F value this high occurred by chance. (With Alpha = 0.05, the hypothesis that there is no relationship between known_y’s and known_x’s is to be rejected when F exceeds the critical level, 4.53.) You can use the FDIST function in Excel to obtain the probability that an F value this high occurred by chance. For example, FDIST(459.753674, 4, 6) = 1.37E-7, an extremely small probability. You can conclude, either by finding the critical level of F in a table or by using the FDIST function, that the regression equation is useful in predicting the assessed value of office buildings in this area. Remember that it is critical to use the correct values of v1 and v2 that were computed in the preceding paragraph. Example 5 - Calculating the t-StatisticsAnother hypothesis test will determine whether each slope coefficient is useful in estimating the assessed value of an office building in Example 3. For example, to test the age coefficient for statistical significance, divide -234.24 (age slope coefficient) by 13.268 (the estimated standard error of age coefficients in cell A15). The following is the t-observed value: t = m4 ÷ se4 = -234.24 ÷ 13.268 = -17.7 If the absolute value of t is sufficiently high, it can be concluded that the slope coefficient is useful in estimating the assessed value of an office building in Example 3. The following table shows the absolute values of the 4 t-observed values. If you consult a table in a statistics manual, you will find that t-critical, two tailed, with 6 degrees of freedom and Alpha = 0.05 is 2.447. This critical value can also be found by using the TINV function in Excel. TINV(0.05,6) = 2.447. Because the absolute value of t (17.7) is greater than 2.447, age is an important variable when estimating the assessed value of an office building. Each of the other independent variables can be tested for statistical significance in a similar manner. The following are the t-observed values for each of the independent variables.
These values all have an absolute value greater than 2.447; therefore, all the variables used in the regression equation are useful in predicting the assessed value of office buildings in this area. |