DataGen_rare_group_usage
Source: vignettes/DataGen_rare_group_usage.Rmd
Introduction
The DataGen_rare_group function generates synthetic data for rare-group analysis. It simulates structured datasets that can be used to test and validate algorithms. This vignette demonstrates how to call DataGen_rare_group with example inputs.
Generate Synthetic Data
Run DataGen_rare_group to generate the synthetic dataset:
# Simulation settings
seed <- 1
p <- 5
n1 <- 100
n2 <- 100
n.common <- 50
n.group <- 30
sigma.eps.1 <- 1
sigma.eps.2 <- 3
ratio.delta <- 0.05
network.k <- 5
rho.beta <- 0.5
rho.U0 <- 0.4
rho.delta <- 0.7
sigma.rare <- 10
n.rare <- 20
group.size <- 5

# Generate data (arguments are passed by position)
DataGen.out <- DataGen_rare_group(
  seed, p, n1, n2, n.common, n.group,
  sigma.eps.1, sigma.eps.2, ratio.delta, network.k,
  rho.beta, rho.U0, rho.delta, sigma.rare, n.rare, group.size
)
#> Warning: package 'MASS' was built under R version 4.4.1
#> Warning: package 'fastDummies' was built under R version 4.4.2
#> Warning: package 'rsvd' was built under R version 4.4.1
#> Warning: package 'Rcpp' was built under R version 4.4.2
#> Warning: package 'RcppArmadillo' was built under R version 4.4.3
#> Warning: package 'inline' was built under R version 4.4.3
#>
#> Attaching package: 'inline'
#> The following object is masked from 'package:Rcpp':
#>
#> registerPlugin
#> >> setting environment variables:
#> PKG_LIBS = $(SHLIB_OPENMP_CFLAGS) $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)
#> PKG_CPPFLAGS = -I../inst/include $(SHLIB_OPENMP_CFLAGS)
#>
#> >> LinkingTo : RcppArmadillo, Rcpp
#> CLINK_CPPFLAGS = -I"F:/R-4.4.0/library/RcppArmadillo/include" -I"F:/R-4.4.0/library/Rcpp/include"
#>
#> >> Program source :
#>
#> 1 :
#> 2 : // includes from the plugin
#> 3 : #include <RcppArmadillo.h>
#> 4 : #include <Rcpp.h>
#> 5 :
#> 6 :
#> 7 : #ifndef BEGIN_RCPP
#> 8 : #define BEGIN_RCPP
#> 9 : #endif
#> 10 :
#> 11 : #ifndef END_RCPP
#> 12 : #define END_RCPP
#> 13 : #endif
#> 14 :
#> 15 : using namespace Rcpp;
#> 16 :
#> 17 : // user includes
#> 18 :
#> 19 :
#> 20 : // declarations
#> 21 : extern "C" {
#> 22 : SEXP file5ed86024ff( SEXP n_, SEXP mu_, SEXP sigma_) ;
#> 23 : }
#> 24 :
#> 25 : // definition
#> 26 : SEXP file5ed86024ff(SEXP n_, SEXP mu_, SEXP sigma_) {
#> 27 : BEGIN_RCPP
#> 28 :
#> 29 : using namespace Rcpp;
#> 30 : int n = as<int>(n_);
#> 31 : arma::vec mu = as<arma::vec>(mu_);
#> 32 : arma::mat sigma = as<arma::mat>(sigma_);
#> 33 : int ncols = sigma.n_cols; // Corrected syntax
#> 34 : arma::mat Y = arma::randn(n, ncols);
#> 35 : return wrap(arma::repmat(mu, 1, n).t() + Y * arma::chol(sigma));
#> 36 :
#> 37 : END_RCPP
#> 38 : }
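The log above appears to be verbose output from compiling a small C++ helper at run time, and the warnings only report that the installed packages were built under a different R version. The helper shown in the log draws n samples from a multivariate normal distribution: it multiplies standard normal draws by the Cholesky factor of sigma and adds mu to each row. A rough base-R equivalent is sketched below for illustration only; the name rmvn_chol is hypothetical and is not part of the package.

```r
# Illustrative sketch: base-R analogue of the compiled sampler in the log.
# `rmvn_chol` is a hypothetical name, not a function exported by the package.
rmvn_chol <- function(n, mu, sigma) {
  p <- ncol(sigma)
  # n x p matrix of independent standard normal draws
  Y <- matrix(rnorm(n * p), nrow = n, ncol = p)
  # chol() returns the upper-triangular R with t(R) %*% R == sigma,
  # so the rows of Y %*% R have covariance sigma
  Z <- Y %*% chol(sigma)
  # add mu[j] to every entry of column j
  sweep(Z, 2, mu, "+")
}

set.seed(1)
sigma_toy <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)
X_draws <- rmvn_chol(1000, mu = c(1, 2), sigma = sigma_toy)
dim(X_draws)
colMeans(X_draws)
```

With enough draws, the column means and the sample covariance of X_draws should be close to mu and sigma_toy.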
Examine the Output
Explore the structure and key components of the generated dataset:
# View structure of the output
str(DataGen.out)
#> List of 12
#> $ delta1 : num [1:100, 1:5] 0 0 0 0 0 0 0 0 0 0 ...
#> $ delta2 : num [1:100, 1:5] 0 0 0 0 0 0 0 0 0 0 ...
#> $ u.1 : num [1:100, 1:5] 0.206 1.437 0.28 0.71 -0.543 ...
#> $ u.2 : num [1:100, 1:5] 0.468 1.595 -0.152 -1.13 -0.165 ...
#> $ S.1 : num [1:100, 1:100] 1.393 -0.529 2.842 1.97 0.438 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:100] "1" "2" "3" "4" ...
#> .. ..$ : chr [1:100] "1" "2" "3" "4" ...
#> $ S.2 : num [1:100, 1:100] 9.553 1.097 -6.412 7.909 0.147 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:100] "51" "52" "53" "54" ...
#> .. ..$ : chr [1:100] "51" "52" "53" "54" ...
#> $ S.1.0 : num [1:100, 1:100] 2.019 0.0913 2.4329 1.0762 -0.636 ...
#> $ S.2.0 : num [1:100, 1:100] 2.471 0.644 0.321 -1.221 -0.615 ...
#> $ X.group.source:'data.frame': 100 obs. of 30 variables:
#> ..$ .data_1 : int [1:100] 1 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_2 : int [1:100] 0 1 0 0 0 0 0 0 0 0 ...
#> ..$ .data_3 : int [1:100] 0 0 1 0 0 0 0 0 0 0 ...
#> ..$ .data_4 : int [1:100] 0 0 0 1 0 0 0 0 0 0 ...
#> ..$ .data_5 : int [1:100] 0 0 0 0 1 0 0 0 0 0 ...
#> ..$ .data_6 : int [1:100] 0 0 0 0 0 1 0 0 0 0 ...
#> ..$ .data_7 : int [1:100] 0 0 0 0 0 0 1 0 0 0 ...
#> ..$ .data_8 : int [1:100] 0 0 0 0 0 0 0 1 0 0 ...
#> ..$ .data_9 : int [1:100] 0 0 0 0 0 0 0 0 1 0 ...
#> ..$ .data_10: int [1:100] 0 0 0 0 0 0 0 0 0 1 ...
#> ..$ .data_11: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_12: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_13: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_14: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_15: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_16: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_17: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_18: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_19: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_20: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_21: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_22: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_23: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_24: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_25: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_26: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_27: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_28: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_29: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_30: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> $ X.group.target:'data.frame': 100 obs. of 30 variables:
#> ..$ .data_1 : int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_2 : int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_3 : int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_4 : int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_5 : int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_6 : int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_7 : int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_8 : int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_9 : int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_10: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_11: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_12: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_13: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_14: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_15: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_16: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_17: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_18: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_19: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_20: int [1:100] 0 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_21: int [1:100] 1 0 0 0 0 0 0 0 0 0 ...
#> ..$ .data_22: int [1:100] 0 1 0 0 0 0 0 0 0 0 ...
#> ..$ .data_23: int [1:100] 0 0 1 0 0 0 0 0 0 0 ...
#> ..$ .data_24: int [1:100] 0 0 0 1 0 0 0 0 0 0 ...
#> ..$ .data_25: int [1:100] 0 0 0 0 1 0 0 0 0 0 ...
#> ..$ .data_26: int [1:100] 0 0 0 0 0 1 0 0 0 0 ...
#> ..$ .data_27: int [1:100] 0 0 0 0 0 0 1 0 0 0 ...
#> ..$ .data_28: int [1:100] 0 0 0 0 0 0 0 1 0 0 ...
#> ..$ .data_29: int [1:100] 0 0 0 0 0 0 0 0 1 0 ...
#> ..$ .data_30: int [1:100] 0 0 0 0 0 0 0 0 0 1 ...
#> $ pairs.rel.CV :'data.frame': 305 obs. of 3 variables:
#> ..$ row : chr [1:305] "17" "116" "21" "81" ...
#> ..$ col : chr [1:305] "77" "146" "142" "113" ...
#> ..$ type: chr [1:305] "related" "related" "related" "related" ...
#> $ pairs.rel.EV :'data.frame': 305 obs. of 3 variables:
#> ..$ row : chr [1:305] "10" "1" "50" "42" ...
#> ..$ col : chr [1:305] "130" "92" "140" "71" ...
#> ..$ type: chr [1:305] "related" "related" "related" "related" ...
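In the structure above, X.group.source and X.group.target look like 0/1 indicator columns, one per group (30 columns, matching n.group). Assuming each row belongs to exactly one group, the group index can be recovered from such an indicator table as sketched below. The data frame X_toy is a small stand-in built for this example; replace it with DataGen.out$X.group.source to apply the same steps to the generated data.

```r
# Toy stand-in for an indicator data frame such as X.group.source
# (an assumption for illustration; not output from the package).
X_toy <- data.frame(
  .data_1 = c(1L, 0L, 0L, 1L),
  .data_2 = c(0L, 1L, 0L, 0L),
  .data_3 = c(0L, 0L, 1L, 0L)
)

M <- as.matrix(X_toy)

# The recovery below is only meaningful if every row has exactly one 1
stopifnot(all(rowSums(M) == 1))

# Column position of the 1 in each row, i.e. the group index
group_id <- max.col(M, ties.method = "first")
group_id

# Number of observations per group
group_counts <- table(group_id)
group_counts
```

The rowSums check guards against rows that are all zero, for which max.col would otherwise silently return the first column.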
# Print the first few rows and columns of the S.1 matrix
cat("\nFirst 5 rows and columns of S.1:\n")
#>
#> First 5 rows and columns of S.1:
print(DataGen.out$S.1[1:5, 1:5])
#> 1 2 3 4 5
#> 1 1.3925742 -0.5291006 2.842317 1.969923 0.4383936
#> 2 -0.5291006 12.0059956 7.980915 2.551015 -2.5642034
#> 3 2.8423166 7.9809147 8.981994 3.166662 -2.1011907
#> 4 1.9699229 2.5510151 3.166662 6.419417 -6.7080622
#> 5 0.4383936 -2.5642034 -2.101191 -6.708062 5.6919244
# Print the first few rows and columns of the S.2 matrix
cat("\nFirst 5 rows and columns of S.2:\n")
#>
#> First 5 rows and columns of S.2:
print(DataGen.out$S.2[1:5, 1:5])
#> 51 52 53 54 55
#> 51 9.5531329 1.0969313 -6.4123782 7.9090043 0.1468358
#> 52 1.0969313 3.6596876 -0.9416046 -2.6162838 -5.4793315
#> 53 -6.4123782 -0.9416046 -1.6856890 -0.7816355 -3.5461235
#> 54 7.9090043 -2.6162838 -0.7816355 9.0438723 -2.6619630
#> 55 0.1468358 -5.4793315 -3.5461235 -2.6619630 -0.3270468
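The printed blocks suggest that S.1 and S.2 are symmetric. As a quick sanity check, the sketch below re-enters the 5 x 5 block of S.1 exactly as printed above and measures the largest difference between the block and its transpose. Because the printed values are rounded to the displayed precision, a small tolerance is used; on the full object, the same check could be run on DataGen.out$S.1 directly.

```r
# 5 x 5 block of S.1, copied from the printed output (rounded as displayed)
S1_block <- matrix(c(
   1.3925742, -0.5291006,  2.842317,  1.969923,  0.4383936,
  -0.5291006, 12.0059956,  7.980915,  2.551015, -2.5642034,
   2.8423166,  7.9809147,  8.981994,  3.166662, -2.1011907,
   1.9699229,  2.5510151,  3.166662,  6.419417, -6.7080622,
   0.4383936, -2.5642034, -2.101191, -6.708062,  5.6919244
), nrow = 5, byrow = TRUE, dimnames = list(1:5, 1:5))

# Largest absolute difference between the block and its transpose
max_asym <- max(abs(S1_block - t(S1_block)))

# Symmetric up to the rounding of the printed values
is_sym <- max_asym < 1e-6
is_sym
```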