Mamadou Yauck, M; Moodie, E; Apelian, H; Peet, M; Lambert, G; Grace, D; Lachowsky, N; Hart, T; Cox, J. Epidemiologic Methods. 2020. De Gruyter.
Respondent-Driven Sampling (RDS) is a variant of link-tracing, a sampling technique for surveying hard-to-reach communities that takes advantage of community members’ social networks to reach potential participants. As a network-based sampling method, RDS faces the fundamental problem of sampling from population networks where features such as homophily (the tendency for individuals with similar traits to share social ties) and differential activity (the ratio of the average number of connections by attribute) are sensitive to the choice of a sampling method. Though not clearly described in the RDS literature, many simple methods exist to generate simulated RDS data with specific levels of network features, where the focus is on estimating simple estimands. However, it remains unclear how accurately these methods recover the targeted network features. This question is also motivated by recent findings that some population network parameters (e.g., homophily) cannot be consistently estimated from RDS data alone.
In this paper, we conduct a simulation study to assess how accurately existing RDS simulation methods generate RDS samples with the desired levels of two network parameters: homophily and differential activity. The results show that (1) homophily cannot be consistently estimated from simulated RDS samples and (2) differential activity estimates are more precise when groups, defined by traits, are equally active and equally represented in the population. We use this approach to mimic features of the Engage Study, an RDS sample of gay, bisexual and other men who have sex with men in Montreal.
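To make the two network parameters concrete, the following is a minimal sketch, not the simulation procedure used in the paper. It assumes a binary trait and a simple random network in which ties form with one probability within groups and another between groups. The function names (`simulate_network`, `differential_activity`, `within_group_tie_share`) and all parameter values are hypothetical. Differential activity follows the definition given above (ratio of mean degrees by group). The within-group tie share is only an illustrative proxy for homophily and may differ from the homophily measure the authors use.

```python
import random
from itertools import combinations


def simulate_network(n=200, p_group1=0.5, p_within=0.08, p_between=0.02, seed=1):
    """Simulate a population network with a binary trait (illustrative assumption).

    Ties form independently with probability p_within for same-group pairs
    and p_between for different-group pairs.
    """
    rng = random.Random(seed)
    group = [1 if rng.random() < p_group1 else 0 for _ in range(n)]
    edges = []
    for i, j in combinations(range(n), 2):
        p = p_within if group[i] == group[j] else p_between
        if rng.random() < p:
            edges.append((i, j))
    return group, edges


def differential_activity(group, edges):
    """Ratio of mean degree in group 1 to mean degree in group 0."""
    degree = [0] * len(group)
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    mean_degree = {}
    for g in (0, 1):
        degrees_g = [d for d, member in zip(degree, group) if member == g]
        mean_degree[g] = sum(degrees_g) / len(degrees_g)
    return mean_degree[1] / mean_degree[0]


def within_group_tie_share(group, edges):
    """Share of ties joining two members of the same group (homophily proxy)."""
    same = sum(1 for i, j in edges if group[i] == group[j])
    return same / len(edges)


if __name__ == "__main__":
    group, edges = simulate_network()
    print("differential activity:", differential_activity(group, edges))
    print("within-group tie share:", within_group_tie_share(group, edges))
```

These quantities are computed on the full simulated population network. The question the paper raises is whether an RDS sample drawn from such a network reproduces them, which this sketch does not address.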