32 changes: 22 additions & 10 deletions DESCRIPTION
@@ -1,15 +1,27 @@
Package: TeachingSampling
Type: Package
Title: Selection of Samples and Parameter Estimation in Finite Population
Version: 4.2.0
Date: 2026-06-15
Authors@R: c(
person("Hugo Andres", "Gutierrez Rojas",
email = "hagutierrezro@gmail.com",
role = c("aut", "cre")),
person("Yury Vanessa", "Ochoa Montes",
email = "yury.ochoa@urosario.edu.co",
role = "ctb",
comment = "kish_allocation function"))
Description: Allows the user to draw probabilistic samples and make
inferences from a finite population based on several sampling designs,
including simple random, systematic, Bernoulli, Poisson, PPS,
stratified, and cluster sampling. Provides Horvitz-Thompson,
Hansen-Hurwitz, and generalised regression (GREG) estimators of
totals, means, ratios, regression coefficients, and quantiles,
along with exact and approximate variance estimators.
License: GPL (>= 2)
Version: 4.1.1
Date: 2020-04-21
Author: Hugo Andres Gutierrez Rojas <hagutierrezro@gmail.com>
Maintainer: Hugo Andres Gutierrez Rojas <hagutierrezro@gmail.com>
Depends:
R (>= 3.5),
dplyr,
magrittr
Description: Allows the user to draw probabilistic samples and make inferences from a finite population based on several sampling designs.
Depends: R (>= 3.5), dplyr, magrittr
Encoding: UTF-8
RoxygenNote: 7.1.0
NeedsCompilation: no
URL: https://github.com/psirusteam/TeachingSampling
BugReports: https://github.com/psirusteam/TeachingSampling/issues
Config/roxygen2/version: 8.0.0
1 change: 1 addition & 0 deletions NAMESPACE
@@ -48,6 +48,7 @@ export(T.SIC)
export(VarHT)
export(VarSYGHT)
export(Wk)
export(kish_allocation)
export(nk)
export(p.WR)
import(stats)
Binary file added R.zip
Binary file not shown.
53 changes: 47 additions & 6 deletions R/Deltakl.r
@@ -1,8 +1,49 @@
#' @export
#'
#' @title
#' Matrix of Joint Inclusion Probability Differences

Review comment (Owner): Keep the original function's title.

#' @description
#' Computes the matrix \eqn{\Delta_{kl} = \pi_{kl} - \pi_k \pi_l} for all
#' pairs of units in a finite population. This matrix appears in the exact
#' Horvitz-Thompson variance formula.
#' @return
#' An \code{N x N} matrix where entry \eqn{(k, l)} equals
#' \eqn{\pi_{kl} - \pi_k \pi_l}. Diagonal entries equal
#' \eqn{\pi_k(1 - \pi_k)}.
#' @details
#' The matrix \eqn{\Delta} is central to the variance of the
#' Horvitz-Thompson estimator:
#' \deqn{V(\hat{t}_{y,\pi}) = \sum_k \sum_l \Delta_{kl} \frac{y_k}{\pi_k}
#' \frac{y_l}{\pi_l}}
#' It requires computing both first-order (\code{\link{Pik}}) and
#' second-order (\code{\link{Pikl}}) inclusion probabilities, so it is only
#' feasible for small populations.
#' @author Hugo Andres Gutierrez Rojas <hagutierrezro at gmail.com>
#' @param N Population size. Recommended \code{N <= 15}.
#' @param n Sample size.
#' @param p Vector of probabilities for each possible sample in the support.
#' Must sum to 1.
#'
#' @references
#' Sarndal, C-E. and Swensson, B. and Wretman, J. (1992),
#' \emph{Model Assisted Survey Sampling}. Springer.\cr
#' Gutierrez, H. A. (2009), \emph{Estrategias de muestreo: Diseno de encuestas
#' y estimacion de parametros}. Editorial Universidad Santo Tomas.
#'
#' @seealso \code{\link{Pik}}, \code{\link{Pikl}}, \code{\link{VarHT}}
#'
#' @examples
#' U <- c("Yves", "Ken", "Erik", "Sharon", "Leslie")
#' N <- length(U)
#' n <- 2
#' p <- c(0.13, 0.2, 0.15, 0.1, 0.15, 0.04, 0.02, 0.06, 0.07, 0.08)
#' sum(p)
#' # Variance-Covariance matrix of the sample membership indicators
#' Deltakl(N, n, p)

Deltakl <- function(N, n, p){
Ind <- Ik(N,n)
P1 <- as.matrix(Pik(p, Ind))
Delta <-Pikl(N,n,p)-(t(P1)%*%P1)
return(Delta)
}
Deltakl <- function(N, n, p) {
Ind <- Ik(N, n)
P1 <- as.matrix(Pik(p, Ind))
Delta <- Pikl(N, n, p) - (t(P1) %*% P1)
return(Delta)
}
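
The documentation above defines \Delta_{kl} = \pi_{kl} - \pi_k \pi_l and the double-sum variance formula. The following is a minimal base-R sketch of that computation, not the package's implementation: it does not call `Ik()`, `Pik()` or `Pikl()`, it assumes the support is enumerated in `combn()` order, and the vector `y` is made-up data used only for illustration.

```r
# Sketch only: base R, independent of the package's Ik(), Pik() and Pikl().
# Assumptions: the support is enumerated in combn() order; y is hypothetical.
N <- 5
n <- 2
p <- c(0.13, 0.2, 0.15, 0.1, 0.15, 0.04, 0.02, 0.06, 0.07, 0.08)
y <- c(32, 34, 46, 89, 35)

# Indicator matrix: one row per possible sample, one column per unit
support <- combn(N, n)
Ind <- matrix(0, nrow = ncol(support), ncol = N)
for (s in seq_len(ncol(support))) {
  Ind[s, support[, s]] <- 1
}

# First-order inclusion probabilities: pi_k = sum of p over samples containing k
pik <- as.vector(crossprod(Ind, p))

# Second-order inclusion probabilities: pi_kl = sum of p over samples with k and l
pikl <- crossprod(Ind * p, Ind)

# Delta_kl = pi_kl - pi_k * pi_l
Delta <- pikl - outer(pik, pik)

# Variance of the Horvitz-Thompson total via the double-sum formula
z <- y / pik
V_formula <- as.numeric(t(z) %*% Delta %*% z)

# Same variance by enumerating every sample in the support
t_hat <- as.vector(Ind %*% z)
V_enum <- sum(p * (t_hat - sum(y))^2)
```

For a fixed-size design such as this one, the two variances should agree, the diagonal of `Delta` should equal `pik * (1 - pik)`, and every row of `Delta` should sum to zero, which gives a quick sanity check on any reformatting of `Deltakl`.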
90 changes: 81 additions & 9 deletions R/Domains.r
@@ -1,12 +1,84 @@
#' @export
#'
#' @title
#' Domain Indicator Matrix
#' @description
#' Creates a binary indicator matrix that identifies the domain membership
#' of each unit in the sample. Each column corresponds to one domain
#' (level of \code{y}) and each row to one unit.
#' @return
#' A binary matrix of dimension \code{n x D}, where \code{D} is the number
#' of domains (levels of \code{y}). Entry \eqn{(k, d) = 1} if unit \eqn{k}
#' belongs to domain \eqn{d}, and 0 otherwise. Column names are the domain
#' labels.
#' @details
#' This function is useful for domain estimation, where population totals or
#' means must be estimated for subgroups of the population. The indicator
#' matrix can be multiplied element-wise with the variable of interest to
#' restrict estimation to each domain.
#' @author Hugo Andres Gutierrez Rojas <hagutierrezro at gmail.com>
#' @param y A vector (factor or coercible to factor) identifying the domain
#' membership of each unit in the sample.
#'
#' @references
#' Sarndal, C-E. and Swensson, B. and Wretman, J. (1992),
#' \emph{Model Assisted Survey Sampling}. Springer.\cr
#' Gutierrez, H. A. (2009), \emph{Estrategias de muestreo: Diseno de encuestas
#' y estimacion de parametros}. Editorial Universidad Santo Tomas.
#'
#' @seealso \code{\link{E.SI}}, \code{\link{E.STSI}}
#'
#' @examples
#' ############
#' ## Example 1
#' ############
#' # This domain contains only two categories: "yes" and "no"
#' x <- as.factor(c("yes","yes","yes","no","no","no","no","yes","yes"))
#' Domains(x)
#'
#' ############
#' ## Example 2
#' ############
#' # Uses the Lucy data to draw a random sample of units according
#' # to a SI design
#' data(Lucy)
#' attach(Lucy)
#'
#' N <- dim(Lucy)[1]
#' n <- 400
#' sam <- sample(N,n)
#' # The information about the units in the sample is stored in an object called data
#' data <- Lucy[sam,]
#' attach(data)
#' names(data)
#' # The variable SPAM is a domain of interest
#' Doma <- Domains(SPAM)
#' Doma
#' # HT estimation of the absolute domain size for every category in the domain
#' # of interest
#' E.SI(N,n,Doma)
#'
#' ############
#' ## Example 3
#' ############
#' # Following with Example 2...
#' # The variables of interest are: Income, Employees and Taxes
#' # Estimate the population total of these variables for every
#' # category in the domain of interest SPAM
#' estima <- data.frame(Income, Employees, Taxes)
#' SPAM.no <- estima*Doma[,1]
#' SPAM.yes <- estima*Doma[,2]
#' E.SI(N,n,SPAM.no)
#' E.SI(N,n,SPAM.yes)

Domains<-function(y){
y<-as.factor(y)
d<-as.double(y)
n<-length(d)
Dom<-matrix(0,n,max(d))
colnames(Dom)<-levels(y)
for(k in 1: max(d)){
Dom[,k]<-as.double(d==k)}
Dom
Domains <- function(y) {
y <- as.factor(y)
d <- as.double(y)
n <- length(d)
Dom <- matrix(0, n, max(d))
colnames(Dom) <- levels(y)
for (k in 1:max(d)) {
Dom[, k] <- as.double(d == k)
}
Dom
}
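
The domain-indicator idea documented above can be illustrated without the package or the Lucy data. This is a hedged sketch in base R: the `income` vector is hypothetical, and `sapply()` stands in for the loop in `Domains()` rather than reproducing it.

```r
# Sketch only: base R stand-in for the indicator matrix built by Domains().
# Assumption: income is made-up data for nine sampled units.
x <- factor(c("yes", "yes", "yes", "no", "no", "no", "no", "yes", "yes"))
income <- c(10, 20, 30, 40, 50, 60, 70, 80, 90)

# One column per level of x; entry is 1 when the unit belongs to that domain
Dom <- sapply(levels(x), function(lv) as.double(x == lv))

# Number of sampled units in each domain
domain_sizes <- colSums(Dom)

# Element-wise product zeroes out units outside each domain,
# so column sums give the sample total of income per domain
domain_values <- income * Dom
domain_totals <- colSums(domain_values)
```

Each row of `Dom` contains exactly one 1, so the domain columns partition the sample. Passing a column of `domain_values` to an estimator in place of the original variable restricts the estimate to that domain, which is the pattern Examples 2 and 3 follow with `E.SI`.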
6 changes: 2 additions & 4 deletions R/E.1SI.R
@@ -62,7 +62,7 @@ E.1SI <- function(NI, nI, y, PSU) {

Total <- matrix(NA, nrow = 4, ncol = dim(y)[2])
rownames(Total) = c("Estimation", "Standard Error", "CVE",
"DEFF")
"DEFF")
colnames(Total) <- names(y)

fI <- nI/NI
@@ -79,6 +79,4 @@ E.1SI <- function(NI, nI, y, PSU) {
Total[, k] <- c(ty, sqrt(Vty), CVe, DEFF)
}
return(Total)
}


}