# Extreme case

```r
library(MMRcaseselection)
```

The extreme case is selected with regard to the cases’ values on an independent variable or the outcome. Extremeness is defined as the absolute difference between a case’s value on the chosen variable and the variable’s mean. For the outcome, for example, the extremeness of a case is $$|Y_i-\bar{Y}|$$, with $$i$$ being the case index and $$\bar{Y}$$ the mean of the outcome. The extreme case is the case with the maximum absolute difference (see Seawright 2016). Depending on the research question or substantive interest, one might instead be interested in the extreme case in the lower or upper range of a variable. Extremeness is then calculated as the signed difference $$Y_i-\bar{Y}$$, where negative values mark cases below the mean and positive values mark cases above it.
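The definition can be checked by hand with base R. The following is a minimal sketch that uses the outcome `mpg` from the built-in `mtcars` data. The object names `y`, `signed` and `absolute` are chosen here for illustration and are not part of the package.

```r
# Outcome values, named by case so the extreme case can be identified
y <- setNames(mtcars$mpg, rownames(mtcars))

signed   <- y - mean(y)   # signed difference from the variable's mean
absolute <- abs(signed)   # absolute difference, i.e. extremeness

names(which.max(absolute))            # most extreme case overall
#> [1] "Toyota Corolla"
names(which.max(signed))              # most extreme case in the upper range
#> [1] "Toyota Corolla"
names(signed)[signed == min(signed)]  # lower range; all tied cases are returned
#> [1] "Cadillac Fleetwood"  "Lincoln Continental"
```

The lower-range result shows that the extreme case need not be unique: two cars share the minimum value of `mpg`.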

The extreme_on_x() and extreme_on_y() functions take an lm object as input and calculate the extremeness of all cases. For extremeness on an independent variable, one additionally needs to specify the variable of interest as a character string. The output is a data frame in which cases are ordered by absolute extremeness in decreasing order. The data frame also reports the signed extremeness values, which show whether a case is extreme in the lower range of the variable (negative values) or the upper range (positive values).
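To make the described behavior concrete, here is a rough base R sketch of how such a calculation could be done from an lm object. The helper `extremeness_sketch()` and its column names `abs_extremeness` and `extremeness` are hypothetical. It is not the package's implementation, and it assumes that the variable appears untransformed in the model frame.

```r
# Hypothetical helper, for illustration only (not the package's code)
extremeness_sketch <- function(fit, variable) {
  stopifnot(inherits(fit, "lm"), is.character(variable), length(variable) == 1)
  mf <- model.frame(fit)                 # data actually used to fit the model
  stopifnot(variable %in% names(mf))
  signed <- mf[[variable]] - mean(mf[[variable]])
  out <- cbind(mf, abs_extremeness = abs(signed), extremeness = signed)
  # Order cases by absolute extremeness, largest first
  out[order(out$abs_extremeness, decreasing = TRUE), ]
}

fit <- lm(mpg ~ disp + wt, data = mtcars)
res <- extremeness_sketch(fit, "wt")
rownames(res)[1]   # case with the largest absolute extremeness on wt
#> [1] "Lincoln Continental"
```

Working from the model frame rather than the raw data means that only cases that entered the regression are ranked.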

```r
df <- lm(mpg ~ disp + wt, data = mtcars)
extreme_on_x(df, "wt")
#>                      mpg  disp    wt abs. extremeness extremeness
#> Lincoln Continental 10.4 460.0 5.424          2.20675     2.20675
#> Chrysler Imperial   14.7 440.0 5.345          2.12775     2.12775
#> Cadillac Fleetwood  10.4 472.0 5.250          2.03275     2.03275
#> Lotus Europa        30.4  95.1 1.513          1.70425    -1.70425
#> Honda Civic         30.4  75.7 1.615          1.60225    -1.60225
#> Toyota Corolla      33.9  71.1 1.835          1.38225    -1.38225
#> Fiat X1-9           27.3  79.0 1.935          1.28225    -1.28225
#> Porsche 914-2       26.0 120.3 2.140          1.07725    -1.07725
#> Fiat 128            32.4  78.7 2.200          1.01725    -1.01725
#> Datsun 710          22.8 108.0 2.320          0.89725    -0.89725
#> Merc 450SE          16.4 275.8 4.070          0.85275     0.85275
#> Toyota Corona       21.5 120.1 2.465          0.75225    -0.75225
#> Pontiac Firebird    19.2 400.0 3.845          0.62775     0.62775
#> Camaro Z28          13.3 350.0 3.840          0.62275     0.62275
#> Mazda RX4           21.0 160.0 2.620          0.59725    -0.59725
#> Merc 450SLC         15.2 275.8 3.780          0.56275     0.56275
#> Merc 450SL          17.3 275.8 3.730          0.51275     0.51275
#> Ferrari Dino        19.7 145.0 2.770          0.44725    -0.44725
#> Volvo 142E          21.4 121.0 2.780          0.43725    -0.43725
#> Duster 360          14.3 360.0 3.570          0.35275     0.35275
#> Maserati Bora       15.0 301.0 3.570          0.35275     0.35275
#> Mazda RX4 Wag       21.0 160.0 2.875          0.34225    -0.34225
#> Dodge Challenger    15.5 318.0 3.520          0.30275     0.30275
#> Valiant             18.1 225.0 3.460          0.24275     0.24275
#> Hornet Sportabout   18.7 360.0 3.440          0.22275     0.22275
#> Merc 280            19.2 167.6 3.440          0.22275     0.22275
#> Merc 280C           17.8 167.6 3.440          0.22275     0.22275
#> AMC Javelin         15.2 304.0 3.435          0.21775     0.21775
#> Merc 230            22.8 140.8 3.150          0.06725    -0.06725
#> Ford Pantera L      15.8 351.0 3.170          0.04725    -0.04725
#> Merc 240D           24.4 146.7 3.190          0.02725    -0.02725
#> Hornet 4 Drive      21.4 258.0 3.215          0.00225    -0.00225
```

The calculation of extremeness on the outcome only requires an lm object as input.

```r
df <- lm(mpg ~ disp + wt, data = mtcars)
Y_extreme <- extreme_on_y(df)
#> Lincoln Continental 10.4 460.0 5.424         9.690625   -9.690625
```