Software and datasets to support 'Modern Applied Statistics with S',
fourth edition, by W. N. Venables and B. D. Ripley.
Springer, 2002, ISBN 0-387-95457-0.
This file documents software changes since the third edition.
- eqscplot has new arguments ratio and uin.
- stepAIC will not drop strata terms in coxph or survreg models.
- profile.glm will report inadequate supplied glm fits, not just fail.
- new method confint.lm.
- fractions/rational allow missing values.
- mvrnorm has an 'empirical' argument.
- predict.lda and predict.qda try harder to avoid exponential underflow.
- new function fitdistr for ML estimation of univariate distributions.
- new function glmmPQL to use lme to fit GLMMs by PQL
- truehist allows rule for nbins to be specified as a character string.
- parcoord function.
- new datasets bacteria, epil, nlschools, SP500
- polr allows control argment for optim, reports lack of convergence.
- stepAIC works again if formula has an offset (R had changed).
- biplot.correspondence now shows the origin as a cross.
- polr was not preserving contrasts to put in the fit object.
- vcov methods for lme, gls, coxph and survReg.
- Added 'tol' argument to isoMDS.
- stepAIC now allow 'direction=both' starting from a full model.
- glm.nb allows R-style 'start' argument.
- truehist passes ... on to both plot.default() and rect().
- isoMDS now uses the C interface to optim.
- addterm, dropterm, stepAIC now work with lme and gls fits.
- huber checks for MAD equal to zero.
- glmmPQL now loads nlme if not already loaded.
- glmmPQL handles list 'random' arguments (7.0-11).
- The MASS datasets no longer require data(foo) to load them. (7.0-11)
- mvrnorm uses eigen(EISPACK=TRUE) for back-compatibility (7.0-11, R 1.7.0)
- print.summary.polr could lose dimnames for 1 coefficient.
- remove heart as survival in R now has it.
- confint.{lm,glm} didn't handle specifying parm in all cases.
- confint and confint.lm have been migrated to base in R.
- addterm.default, dropterm.default and stepAIC work better inside functions.
- glm.nb now sets AIC in the object, and has a logLik() method.
- truehist now accepts a 'ylab' argument.
- negative.binomial and neg.bin no longer generate objects with
package:MASS in their environment.
- stepAIC now drops (if allowed) 0-df terms sequentially from the right.
- lda(CV=TRUE) now works for rank-deficient fits.
- predict methods for lda, polr now check newdata types.
- model.frame.lda/polr now look for the environment of the original formula.
- polr has a new `model' argument defaulting to TRUE.
- fitdistr supports the trivial case of a Normal distribution.
- sammon and isoMDS now allow missing values in the dissimilarity matrix, and
isoMDS allows Minkowski distances in the configuration space.
- cov.trob works better if wts are supplied, and may converge a little faster
in any case.
- The ch11.R script now uses mclust not mclust1998.
- The default xlab for boxcox() is now greek lambda.
- glmmPQL now handles offset terms.
- add predict.rlm method to correct predict.lm in the case se.fit=TRUE.
- weighted rlm fits are handled better, and default to "inv.var".
- logtrans works without specifying 'data'.
- predict() method for glmmPQL.
- polr() has an option for probit or proportional hazard fits.
- neg.bin() and negative.binomial() had an error in the aic() formula.
- The ch05.R script now includes the code for Figure 5.8.
- Datasets austres, fdeaths, lh, mdeaths, nottem and rock
are now visible in the 'datasets' package of R 2.0.0 and so have
been removed here.
- Script ch07.R now gives details using the gam() function in package gam as
well as that in package mgcv.
- rlm's fitted component is now always unweighted.
- theta.{md,ml,mm} now have one help file with examples.
- polr() has a new method "cauchit" suggested by Roger Koenker.
(Requires R >= 2.1.0)
- polr() now works with transformed intercepts, and usually converges
better (contributed by David Firth).
- polr() handles a rank-deficient model matrix.
- polr() now returns the method used, and uses it for predictions.
- anova() method for polr (contributed by John Fox).
- predict.glmmPQL was not using the na.action in the object as intended.
- The default methods for addterm and dropterm and anova.polr now check
for changes in the number of cases in use caused e.g. by na.action=na.omit.
- Added vcov() method for rlm fits.
- eqscplot() accepts reversed values for xlim and ylim.
- Script ch10.R uses se.contrast to calculate se's missing from model.tables.
- profile() and confint() methods for polr().
- glm.convert() was not setting the `offset' component that R's glm objects
have.
- sammon() now checks for duplicates in the initial configuration.
- isoMDS() and sammon() work around dropping of names.dist in 2.1.0
- lda() now gives an explicit error message if all group means are the same.
- fitdistr() now has a logLik() method, chooses the optim() method if not
supplied, handles the log-normal by closed-form and no longer attempts to
handle the uniform.
- glm.nb() now accepts 'mustart'.
- glm.nb() now supports weights: they used to be ignored when estimating
theta.
- fitdistr() now supports geometric and Poisson distributions, and
uses closed-form results for the exponential.
- lm.ridge, lqs and rlm allow offset() terms.
- the 'prior' argument of predict.qda is now operational.
- script ch12.R now has b1() adapted for R's contour().
- anova.polr() quoted model dfs, not residual dfs.
- stepAIC() applied to a polr fit now gets the correct rdf's in the
anova table.
- lm.gls() now returns fitted values and residuals on the original
coordinates (not the uncorrelated ones).
- parcoord() now allows missing values and has a new argument
'var.label' to label the variable axes. (Contributed by Fabian Scheipl.)
- rlm() has a 'lqs.control' argument passed to lqs() where used for
initialization.
- rlm() could fail with some psi functions (e.g. psi.hampel) if 'init' was
given as a numeric vector.
- rlm() handles weighted fits slightly differently, in particular trying
to give the same scale estimate if wt.method="case" as if replicating the
cases.
- confint.nls copes with plinear models in R (now profile.nls does).
- The wrappers lmsreg() etc have been adapted to work in the MASS namespace.
- qda() accepts formulae containing backquoted non-syntactic names.
- polr() gives an explicit error message if 'start' is misspecified.
- glmmPQL() evaluates the formulae for 'fixed' and 'random', which may
help if they are given as variables and not values.
- There are anova() and logLik() methods for class "glmmPQL" to stop misuse.
- profile.polr() now works for a single-coefficient model.
- The print and print.summary methods for polr and rlm make use of
naprint() to print a message e.g. about deleted observations.
- Class "ridgelm" now has a coef() method, and works for n < p.
- lda() and qda() now check explicitly for non-finite 'x' values.
- ch06.R has been updated for multcomp >= 0.991-1
- profile.glm is more likely to find the model frame in complicated
scopes.
- message() is used for most messages.
- truehist() checks more thoroughly for erroneous inputs.
- polr(model=TRUE) works again.
- add logLik() method for polr.
- the summary() methods for classes "negbin" and "rlm" now default to
correlation = FALSE.
- there is a vcov() method for class "negbin": unlike the "glm" method
this defaults to dispersion = 1.
- coding for 'sex' in ?Melanoma has been corrected.
- the example for gamma.shape has a better starting point and so converges
- avoid abbreviation of survreg(dist=) in example(gehan)
- profile() and confint() methods for "glm" objects now handle
rank-deficient fits.
- profile.glm() produced an output in a format plot.profile could not
read for single-variable fits. Also for confint() on intercept-only
fits.
- The print() methods for fitdistr() and lm.ridge() now return invisibly.
- vcov() and profile() methods for polr() used starting values in the
external not internal parametrization, which could slow convergence.
- glm.nb() called theta.ml() incorrect when weights were supplied whch did not
sum to n.
- removed unused argument 'nseg' to plot.profile.
- 'alpha' in the "glm" and "polr" methods for profile() is now interpreted
as two-tailed univariate for consistency with other profile methods.
- 'mammals': corrected typos in names, some thanks to Arni Magnusson.
- profile.glm() now works for binomial glm specified with a matrix response
and a cmpletely zero row.
- there is a "negbin" method for simulate()
- the use of package mclust has been removed from the ch11.R script
because of the change of licence conditions for that package.
- change ch13.R script for change in package 'survival' 2.35-x.
- glmmPQL looks up variables in its 'correlation' argument (if a formula)
in the usual scope (wish of Ben Bolker: such arguments are unsupported).
- added a simulate() method for unweighted polr() fits.
- kde2d() allows a length-2 argument 'n'.
- the default for truehist(col=) is now set to a colour, not a colour number.
- the returned fitted values and (undocumented) linear predictor for
polr() did not take any offset into account (reported by Ioannis Kosmides).
- the vcov() method for polr() now returns on the zeta scale (suggested by
Achim Zeileis).
- fitdistr() gains a vcov() method (suggested by Achim Zeileis).
- ch06.R has R alternatives to fac.design.
- ch11.R has R alternatives for ggobi and factor rotation.
- hubers() copes in extreme cases when middle 50% of data is constant.
- tests/ now includes dataset for polr.R, so checking depends only on
base packages and lattice.
- The "glm" method for profile() failed when given a binomial model
with a two-column response.
- fitdistr() works harder to rescale the problem when fitting a gamma.
- cov.trob() handles zero weights without giving a warning (reported by
John Fox).
- boxcox() works better when 'y' is very badly scaled, e.g. around 1e-16
(patch by Martin Maechler).
- mvrnorm() no longer defaults to the deprecated EISPACK=TRUE (and
hence changes the results). It gains an argument 'EISPACK' for
back-compatibility.
- the "polr" method for profile() could lose dimensions in its return object
(reported by Joris Meys)
- kde2d() throws an error if given zero bandwidths or constant data.
- ldahist(sep = TRUE) was missing a dev.flush().
- addterm.glm() mis-calculated F statistics for df > 1.
- anova.loglm() needed revision for changes in R.
- The addterm() default method allows update() to fail.