Statistica 7 Software Handbook
Example 7: Detecting Outliers. Data File. This example is based on the data file Poverty.sta. The data are based on a comparison of 1960 and 1970 Census figures for a random selection of 30 counties. The names of the counties were entered as case names. The information for each variable can be viewed by selecting the Data tab and, in the Variables group, clicking All Specs (ribbon bar), or through the corresponding command on the classic menus. Open the data file: Ribbon bar. Select the Home tab. In the File group, click the Open arrow and, on the menu, select Open Examples.
The Open a Statistica Data File dialog box is displayed; Poverty.sta is located in the Datasets folder. Classic menus. Use the corresponding menu command to display the same Open a Statistica Data File dialog box. Research Question. Other examples illustrated how to analyze the correlates of poverty, that is, the variables that best predict the percent of families below the poverty line in a county. In the course of those analyses, at least one outlier was detected.
T2-2 Online Tutorial T2: Statistica Software Project 2. Click OK on the variable selection dialog and then click OK again on the Select dependent variables and predictors dialog to complete the selections. Click the Node Browser button to display the browser and then open the folder Classification and Discrimination (see Figure T2.2).
For this example, we are interested in locating any outliers that might exist in the data set. Starting the Analysis. Start the Basic Statistics and Tables module, which provides both graphical and quantitative approaches to detecting outliers. Select the Statistics tab.
In the Base group, click Basic Statistics to display the Basic Statistics and Tables Startup Panel. Classic menus. Use the corresponding menu command to display the Basic Statistics and Tables Startup Panel. To begin the analysis, double-click Descriptive statistics to display the Descriptive Statistics dialog box. Graphical approach.
A common graphical means of detecting outliers is to construct a box plot of the data. To do this, click the Variables button in the Descriptive Statistics dialog box to display a variable selection dialog box. Because we are interested in detecting any existing outliers, click the Select All button, and then click OK in the variable selection dialog box. Now, on the Quick tab in the Descriptive Statistics dialog box, click the Box & whisker plot for all variables button. The resulting plot clearly shows greater variability within the variable N_EMPLD than within the other variables. In this initial graph, potential outliers and extreme values are not displayed. To enable this feature, double-click in the background of the graph to display the graph's options dialog box.
Select the Box/Whisker tab (located under Plot). Click the More button to display the Box/Whiskers More Options dialog box, in which you can select additional options to compute the box and whiskers, control the display of outliers and extremes, and use the trimmed distribution of the dependent variable to compute the mean or median. In the Outliers drop-down list, select Outl. Click the Close button in the Box/Whiskers More Options dialog box, and click the OK button in the graph's options dialog box to update the graph with outliers and extreme values. As we suspected, there seems to be an outlier in the variable N_EMPLD. The Basic Statistics and Tables module also provides certain quantitative methods for detecting outliers, one of which is the Grubbs test.
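The way a box plot flags outliers and extreme values can be sketched outside Statistica. The following is a minimal Python sketch, assuming the widely used convention that a point is an outlier when it lies more than 1.5 box heights (interquartile ranges) beyond the box and an extreme value when it lies more than 3 box heights beyond it. The coefficients Statistica actually applies are controlled in the Box/Whiskers More Options dialog box and are not stated in this example; the function name `box_plot_flags`, the parameter `outlier_coef`, and the sample data are hypothetical.

```python
import statistics


def box_plot_flags(values, outlier_coef=1.5):
    """Split values into (outliers, extremes) using box-plot fences.

    Assumption (not taken from the Statistica example): outliers lie more
    than outlier_coef * IQR beyond the quartiles, and extremes lie more
    than 2 * outlier_coef * IQR beyond them.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    inner = outlier_coef * iqr       # distance to the outlier fence
    outer = 2 * outlier_coef * iqr   # distance to the extreme fence

    outliers, extremes = [], []
    for x in values:
        # Distance of x outside the box; negative when x is inside it.
        distance = max(q1 - x, x - q3)
        if distance > outer:
            extremes.append(x)
        elif distance > inner:
            outliers.append(x)
    return outliers, extremes


# Hypothetical sample, not taken from Poverty.sta.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 18, 100]
outliers, extremes = box_plot_flags(sample)
print("outliers:", outliers)
print("extremes:", extremes)
```

Quartile definitions differ between packages, so a borderline point may be classified differently here than in a Statistica graph.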
To perform the Grubbs test, return to the Descriptive Statistics dialog box and select the Robust tab. This tab contains options for including trimmed means, Winsorized means, and the Grubbs test statistic in the Descriptive Statistics spreadsheet. The Grubbs test for outliers (Grubbs 1969; Stefansky 1972) can be used to detect a single outlier at a time.
It works by quantifying how far the suspected outlier is from other data points. The Grubbs test statistic (G) is calculated as the ratio of the largest absolute deviation from the sample mean to the sample standard deviation.
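The definition of G above can be illustrated with a short calculation. The following is a minimal Python sketch, assuming the sample standard deviation uses the n - 1 denominator; the function name `grubbs_statistic` and the sample data are hypothetical, and the sketch does not reproduce the p-value that Statistica reports alongside G.

```python
import math
import statistics


def grubbs_statistic(values):
    """Return (G, suspect): the Grubbs statistic and the most deviant value.

    G = max |x_i - mean| / s, where s is the sample standard deviation
    (assumed here to use the n - 1 denominator).
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("all values are identical; G is undefined")
    # The suspected outlier is the point farthest from the sample mean.
    suspect = max(values, key=lambda x: abs(x - mean))
    return abs(suspect - mean) / sd, suspect


# Hypothetical sample, not taken from Poverty.sta.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
g, suspect = grubbs_statistic(sample)
print(f"G = {g:.3f} for suspected outlier {suspect}")
```

Because the suspect point itself contributes to both the mean and the standard deviation, G is bounded above by (n - 1) / sqrt(n) for a sample of size n, which is why the test is applied to one suspected outlier at a time.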
On the Robust tab, select the Grubbs test for outliers check box. Now click the Summary: statistics button to generate a spreadsheet that contains descriptive statistics for all variables. Here, we can see that the Grubbs Test Statistic for N_EMPLD is 4.88, with a p-value displayed as 0.00.
This small p-value is evidence that there is at least one outlier in the N_EMPLD variable. Recoding outliers. Once the presence of outliers has been detected, it is up to the researcher to determine whether the outlier represents a genuine property of the underlying phenomenon (variable) or is due to measurement errors or other anomalies that should not be modeled.