Saturday, September 15, 2012

R Bootstrapping, stratification

http://www.statmethods.net/advstats/bootstrapping.html


Bootstrapping a Single Statistic (k=1)

The following example generates the bootstrapped 95% confidence interval for R-squared in the linear regression of miles per gallon (mpg) on car weight (wt) and displacement (disp). The data source is mtcars. The bootstrapped confidence interval is based on 1000 replications.
# Bootstrap 95% CI for R-Squared
library(boot)
# function to obtain R-Squared from the data
rsq <- function(formula, data, indices) {
  d <- data[indices,] # allows boot to select sample
  fit <- lm(formula, data=d)
  return(summary(fit)$r.squared) # full component name; $r.square only works through partial matching
}
# bootstrapping with 1000 replications
results <- boot(data=mtcars, statistic=rsq,
   R=1000, formula=mpg~wt+disp)

# view results
results
plot(results)

# get 95% confidence interval
boot.ci(results, type="bca")
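
For intuition about what boot() is doing in the example above, the resampling loop can be sketched in base R. This is only a minimal sketch, not the boot package's implementation: it computes a plain percentile interval rather than the BCa interval requested from boot.ci(), so the endpoints will generally differ. The names boot_rsq and n_reps and the seed are illustrative assumptions.

```r
# Manual percentile bootstrap for R-squared, base R only.
# Assumption: a simple percentile interval, which is NOT the same
# as the BCa interval returned by boot.ci(results, type="bca").
set.seed(123)  # arbitrary seed, used only for reproducibility

n_reps <- 1000        # number of bootstrap replications
n <- nrow(mtcars)     # number of observations to resample

boot_rsq <- replicate(n_reps, {
  idx <- sample(n, size = n, replace = TRUE)   # resample rows with replacement
  fit <- lm(mpg ~ wt + disp, data = mtcars[idx, ])
  summary(fit)$r.squared                       # statistic for this replicate
})

# 2.5% and 97.5% quantiles of the bootstrap distribution
ci <- quantile(boot_rsq, probs = c(0.025, 0.975))
ci
```

The vector boot_rsq plays roughly the role of results$t in the boot() output, so hist(boot_rsq) gives a picture comparable to the histogram in plot(results).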





http://rss.acs.unt.edu/Rdoc/library/sampling/html/strata.html

library('sampling')


############
## Example 3
############
# Uses the 'swissmunicipalities' data for drawing a sample of units
data(swissmunicipalities)
# the variable 'REG' has 7 categories in the population; it is used as stratification variable
# Computes the population stratum sizes
table(swissmunicipalities$REG)
# do not run
#  1   2   3   4   5   6   7 
# 589 913 321 171 471 186 245 
# the sample stratum sizes are given by size=c(30,20,45,15,20,11,44)
# strata() matches these sizes to the strata in the order in which the strata appear in the data,
# not in sorted order of REG, which is why the counts tabulated below are permuted
# the method is simple random sampling without replacement (equal probability within each stratum)
st <- strata(swissmunicipalities, stratanames=c("REG"), size=c(30,20,45,15,20,11,44), method="srswor")
# extracts the observed data
# the order of the columns is different from the order in the swissmunicipalities database
table(getdata(swissmunicipalities, st)$REG)
# do not run
# 1  2  3  4  5  6  7 
# 20 15 45 30 20 11 44
z<-getdata(swissmunicipalities, st)
dim(z)
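#[1] 185  25
# 185 rows = sum of the stratum sample sizes; 25 columns = the 22 original ones plus ID_unit, Prob and Stratum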
dim(swissmunicipalities)
#[1] 2896   22
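
The same kind of draw can be sketched without the sampling package, which may help show what method="srswor" does inside each stratum. This is an illustrative sketch on mtcars stratified by cyl, not a replacement for strata(): the helper stratified_srswor and the stratum sizes are assumptions made up for the example, and unlike strata() the sizes are matched to strata by name rather than by order of appearance in the data.

```r
# Stratified simple random sampling without replacement, base R only.
# stratified_srswor() is a hypothetical helper, not part of 'sampling'.
stratified_srswor <- function(data, stratum, sizes) {
  # row indices of 'data' grouped by the value of the stratification variable
  groups <- split(seq_len(nrow(data)), data[[stratum]])
  picked <- lapply(names(sizes), function(s) {
    rows <- groups[[s]]
    # draw sizes[[s]] distinct rows from stratum s (SRSWOR within the stratum)
    rows[sample.int(length(rows), size = sizes[[s]])]
  })
  data[sort(unlist(picked)), , drop = FALSE]
}

set.seed(42)  # arbitrary seed, used only for reproducibility
# mtcars contains 11, 7 and 14 cars with 4, 6 and 8 cylinders
sizes <- c("4" = 5, "6" = 3, "8" = 6)
smp <- stratified_srswor(mtcars, "cyl", sizes)
table(smp$cyl)
```

Naming the sizes avoids the ordering surprise seen in the strata() example, at the cost of having to know the stratum labels in advance.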
