Logistic Regression for Dichotomous Dependent Variables

Arguments

formula

a symbolic representation of the model to be estimated, in the form y ~ x1 + x2, where y is the dependent variable, x1 and x2 are the explanatory variables, and y, x1, and x2 are all contained in the same dataset. (You may include more than two explanatory variables.) The + symbol means "inclusion", not "addition". You may also include interaction terms together with their main effects in the form x1*x2, without computing them in prior steps; use I(x1*x2) to include only the interaction term and exclude the main effects; and use I(x1^2) for quadratic terms.
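
The formula forms above can be inspected with base R's model.matrix(), which shows the columns each formula generates. This is a minimal sketch on a made-up data frame d (hypothetical, not part of Zelig); it illustrates R formula expansion in general rather than anything zelig() does internally.

```r
# Toy data frame (hypothetical values, for illustration only)
d <- data.frame(
  y  = c(0, 1, 0, 1, 1, 0),
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(2, 1, 4, 3, 6, 5)
)

# x1*x2 expands to both main effects plus their interaction
both <- colnames(model.matrix(y ~ x1 * x2, d))

# I(x1*x2) adds a single product column and no main effects
only <- colnames(model.matrix(y ~ I(x1 * x2), d))

# I(x1^2) adds a quadratic term alongside x1
quad <- colnames(model.matrix(y ~ x1 + I(x1^2), d))

print(both)
print(only)
print(quad)
```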

model

the name of a statistical model to estimate. For a list of other supported models and their documentation see: http://docs.zeligproject.org/articles/.

data

the name of a data frame containing the variables referenced in the formula or a list of multiply imputed data frames each having the same variable names and row numbers (created by Amelia or to_zelig_mi).

...

additional arguments passed to zelig, relevant for the model to be estimated.

by

a factor variable contained in data. If supplied, zelig will subset the data frame based on the levels in the by variable, and estimate a model for each subset. This can save a considerable amount of effort. You may also use by to run models using MatchIt subclasses.
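
The effect of by can be approximated in base R by splitting the data on the factor and fitting one logit per subset. The sketch below assumes simulated data and uses glm() directly; the variable names g, x, and y are hypothetical, and this is not necessarily how zelig() implements by.

```r
set.seed(1)

# Simulated data: a two-level grouping factor and one predictor
d <- data.frame(
  g = factor(rep(c("a", "b"), each = 50)),
  x = rnorm(100)
)
d$y <- rbinom(100, 1, plogis(0.5 * d$x))

# Fit one logistic regression per level of g
fits <- lapply(split(d, d$g), function(sub) {
  glm(y ~ x, family = binomial(link = "logit"), data = sub)
})

# One column of coefficients per subset
coefs <- sapply(fits, coef)
print(coefs)
```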

cite

If set to TRUE (default), the model citation will be printed to the console.

below

(defaults to 0) The point at which the dependent variable is censored from below. If any values in the dependent variable are observed to be less than the censoring point, it is assumed that that particular observation is censored from below at the observed value. (See for a Bayesian implementation that supports both left and right censoring.)

robust

defaults to FALSE. If TRUE, zelig() computes robust standard errors based on sandwich estimators (see and ) and the options selected in cluster.

cluster

If robust = TRUE, you may select a variable to define groups of correlated observations. Let x3 be a variable that consists of either discrete numeric values, character strings, or factors that define strata. Then z.out <- zelig(y ~ x1 + x2, robust = TRUE, cluster = "x3", model = "tobit", data = mydata) means that the observations can be correlated within the strata defined by the variable x3, and that robust standard errors should be calculated according to those clusters. If robust = TRUE but cluster is not specified, zelig() assumes that each observation falls into its own cluster.
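
To make the robust and cluster options more concrete, here is a minimal base R sketch of a sandwich variance estimator for a logit fit, first with every observation as its own cluster and then with clusters defined by x3. It assumes simulated data, applies no finite-sample correction, and is not claimed to reproduce zelig()'s exact computation.

```r
set.seed(42)
n <- 200

# Simulated data: 20 clusters of 10 observations each (hypothetical)
d <- data.frame(x1 = rnorm(n), x3 = rep(1:20, each = 10))
d$y <- rbinom(n, 1, plogis(-0.3 + 0.8 * d$x1))

fit <- glm(y ~ x1, family = binomial, data = d)

X      <- model.matrix(fit)
scores <- X * (d$y - fitted(fit))  # per-observation score contributions (logit link)
bread  <- vcov(fit)                # inverse information matrix, (X'WX)^-1

# Each observation is its own cluster
meat_obs  <- crossprod(scores)
se_robust <- sqrt(diag(bread %*% meat_obs %*% bread))

# Observations may be correlated within levels of x3:
# sum the scores within each cluster before taking outer products
cl_scores  <- rowsum(scores, d$x3)
meat_cl    <- crossprod(cl_scores)
se_cluster <- sqrt(diag(bread %*% meat_cl %*% bread))

print(rbind(robust = se_robust, cluster = se_cluster))
```

Summing scores within a cluster before forming the outer product is what allows within-cluster correlation to enter the variance estimate; with one observation per cluster the two versions coincide.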

Value

Depending on the class of model selected, zelig will return an object with elements including coefficients, residuals, and formula which may be summarized using summary(z.out) or individually extracted using, for example, coef(z.out). See http://docs.zeligproject.org/articles/getters.html for a list of functions to extract model components. You can also extract whole fitted model objects using from_zelig_model.

Details

Additional parameters available to this model include:

Methods

show(signif.stars = FALSE, subset = NULL, bagging = FALSE)

Display a Zelig object

See also

Vignette: http://docs.zeligproject.org/articles/zelig_logit.html

Examples

library(Zelig)
data(turnout)
z.out1 <- zelig(vote ~ age + race, model = "logit", data = turnout,
                cite = FALSE)
summary(z.out1)
#> Model:
#>
#> Call:
#> z5$zelig(formula = vote ~ age + race, data = turnout)
#>
#> Deviance Residuals:
#>    Min      1Q  Median      3Q     Max
#> -1.927  -1.296   0.707   0.777   1.072
#>
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)
#> (Intercept)  0.03837    0.17692    0.22  0.82832
#> age          0.01126    0.00305    3.69  0.00023
#> racewhite    0.64555    0.13448    4.80  1.6e-06
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#>     Null deviance: 2266.7  on 1999  degrees of freedom
#> Residual deviance: 2228.8  on 1997  degrees of freedom
#> AIC: 2235
#>
#> Number of Fisher Scoring iterations: 4
#>
#> Next step: Use 'setx' method
summary(z.out1, odds_ratios = TRUE)
#> Model:
#>
#> Call:
#> z5$zelig(formula = vote ~ age + race, data = turnout)
#>
#> Deviance Residuals:
#>    Min      1Q  Median      3Q     Max
#> -1.927  -1.296   0.707   0.777   1.072
#>
#> Coefficients:
#>             Estimate (OR) Std. Error (OR) z value Pr(>|z|)
#> (Intercept)       1.03911         0.18384    0.22  0.82832
#> age               1.01133         0.00309    3.69  0.00023 ***
#> racewhite         1.90704         0.25646    4.80  1.6e-06 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> (Dispersion parameter for binomial family taken to be 1)
#>
#>     Null deviance: 2266.7  on 1999  degrees of freedom
#> Residual deviance: 2228.8  on 1997  degrees of freedom
#> AIC: 2235
#>
#> Number of Fisher Scoring iterations: 4
#>
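
The odds-ratio table is consistent with exponentiating the logit coefficients and scaling each standard error by its odds ratio (a delta-method approximation). The sketch below checks this from the rounded estimates printed in the summary output; that Zelig computes the table this way is an inference from the numbers, not something this page states.

```r
# Rounded estimates and standard errors copied from the logit summary output
est <- c("(Intercept)" = 0.03837, age = 0.01126, racewhite = 0.64555)
se  <- c("(Intercept)" = 0.17692, age = 0.00305, racewhite = 0.13448)

# Odds ratios and delta-method standard errors
or    <- exp(est)
se_or <- or * se

# Agrees with the Estimate (OR) and Std. Error (OR) columns up to rounding
print(round(cbind(OR = or, SE_OR = se_or), 5))
```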
x.out1 <- setx(z.out1, age = 36, race = "white")
s.out1 <- sim(z.out1, x = x.out1)
summary(s.out1)
#>
#>  sim x :
#>  -----
#> ev
#>       mean     sd   50%  2.5% 97.5%
#> [1,] 0.748 0.0115 0.748 0.727 0.771
#> pv
#>         0    1
#> [1,] 0.25 0.75
plot(s.out1)
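
As a rough cross-check on the simulated expected value (ev), the predicted probability at age = 36 and race = "white" can be computed directly from the printed point estimates with the inverse logit. This is only a sketch: it ignores the estimation uncertainty that sim() propagates, so it reproduces the mean only approximately and says nothing about the reported sd or interval.

```r
# Point estimates copied from the summary output
b <- c("(Intercept)" = 0.03837, age = 0.01126, racewhite = 0.64555)

# Covariate profile used in setx(): age = 36, race = "white"
x <- c(1, 36, 1)

# Linear predictor and inverse-logit transformation
eta <- sum(b * x)
p   <- plogis(eta)

print(round(p, 3))  # close to the ev mean of 0.748
```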