Statswork

How to Estimate Sample Size to Examine the True Prevalence of COVID-19: Evidence-based Guidelines

Introduction

Sample size calculation, or justification, is the first step in designing a clinical study. It determines the number of patients or other units that must be included in the study to answer the research hypothesis. The objective of the calculation is to identify the number of units required to detect the unknown clinical parameters, treatment effects, or associations once the data have been collected.

If the sample size is too small, the researcher may be unable to answer the research question. On the other hand, the number of patients in many studies is limited by practical constraints such as cost, patient inconvenience, decisions not to proceed with an investigation, or an extended study time. Investigators should determine the optimum sample size before starting data collection, both to avoid errors caused by a sample that is too small and to avoid wasting money and time on a sample that is too large. Online statistical data collection and analysis services can provide relevant data for clinical analysis.

Beyond its role in the research project itself, the sample size calculation is a significant part of the study protocol submitted to ethics committees and some peer-reviewed journals. It is essential to determine the sample size according to the aim and design of the study. A wrong sample size calculation can lead to incorrect or inconclusive results.

Sample size calculation for a clinical study

There is a basic formula for sample size calculation for quantitative data, but in practice it can be hard to identify values for the assumptions the formula requires. In some cases, choosing appropriate values for these assumptions is complicated. The formula given here is the one used to calculate the sample size in a prevalence study.

In the formula, Z is the statistic corresponding to the level of confidence, n is the sample size, P is the expected prevalence (which can be obtained from studies in the same field or from a pilot study conducted by the researchers), and d is the precision (corresponding to effect size).

$$n = \frac{Z^{2} P (1 - P)}{d^{2}}$$

The confidence level is usually set at 95%; most researchers present their results with a 95% confidence interval (CI). Researchers who need more confidence can choose a 99% confidence interval instead.
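As a minimal sketch of how the confidence level enters the formula, the snippet below derives Z from a two-sided confidence level and then evaluates n = Z²P(1 − P)/d². The function names `z_for_confidence` and `prevalence_sample_size`, and the example values P = 0.2 and d = 0.04, are illustrative assumptions rather than part of any established tool.

```python
from statistics import NormalDist


def z_for_confidence(confidence: float) -> float:
    """Two-sided Z statistic for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def prevalence_sample_size(p: float, d: float, confidence: float = 0.95) -> float:
    """Evaluate n = Z^2 * P * (1 - P) / d^2 and return it unrounded."""
    z = z_for_confidence(confidence)
    return z ** 2 * p * (1 - p) / d ** 2


# Compare the two confidence levels mentioned in the text for one
# illustrative choice of prevalence and precision.
for confidence in (0.95, 0.99):
    z = z_for_confidence(confidence)
    n = prevalence_sample_size(p=0.2, d=0.04, confidence=confidence)
    print(f"confidence={confidence:.2f}  Z={z:.3f}  n={n:.1f}")
```

With these inputs, the required sample size comes out at about 384 for 95% confidence and about 663 for 99% confidence, which shows how much a higher confidence level costs in sample size.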

The assumed prevalence P in the formula is estimated from previous studies published in the same research domain or from a pilot study conducted on a small sample. The assumed P also matters for the choice of precision (d), which should be selected according to the size of P. Still, there is no firm guideline for selecting an appropriate value of d. Some authors recommend a precision of 5% if the prevalence of the disease is expected to lie between 10% and 90%. However, when the assumed prevalence is very small (below 10%), a precision of 5% is inappropriate, because the interval P ± d could then extend below zero. As an example, sample sizes for three different values of P and three levels of precision are given in the table below.

The researcher should choose a precision appropriate to the assumed P. The wrong precision leads to an incorrect sample size (too small or too large).

Sample size by precision (d) and assumed prevalence (P)

Precision (d)    P = 0.05    P = 0.2    P = 0.6
0.01             1825        6147       9220
0.04             114         384        576
0.10             18          61         92
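The table values can be reproduced with a short script. This is a sketch that assumes Z = 1.96 (the conventional value for 95% confidence) and rounding to the nearest whole number, which is what the tabulated figures are consistent with; the names `sample_size` and `build_table` are illustrative.

```python
Z_95 = 1.96  # assumed: conventional Z value for a 95% confidence level


def sample_size(p: float, d: float, z: float = Z_95) -> float:
    """Evaluate n = Z^2 * P * (1 - P) / d^2 without rounding."""
    return z ** 2 * p * (1 - p) / d ** 2


PRECISIONS = (0.01, 0.04, 0.10)
PREVALENCES = (0.05, 0.2, 0.6)


def build_table() -> dict:
    """Map each precision to the rounded sample sizes for every prevalence."""
    return {
        d: [round(sample_size(p, d)) for p in PREVALENCES]
        for d in PRECISIONS
    }


table = build_table()
print("d     " + "".join(f"P={p:<8}" for p in PREVALENCES))
for d, row in table.items():
    print(f"{d:<6}" + "".join(f"{n:<10}" for n in row))
```

Rounding up with `math.ceil` instead of `round` would be the more conservative choice, since it never gives fewer subjects than the formula asks for; it would change a few entries by one.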

Background

The number of COVID-19 cases divided by the total population size is used as a rough measure of the amount of disease in a population. However, this fraction depends heavily on the sampling intensity and on the varying test criteria used in different jurisdictions, and various sources indicate that a large fraction of cases tends to go undetected.

Methods

The true prevalence of COVID-19 within a population can be estimated by random sampling. Here, simulation is used to explore confidence intervals of the prevalence estimate under various sampling strategies, covering standard sample sizes and sample pooling at a range of prevalence levels. If you find data collection and analysis difficult, you can consult a statistical data collection and analysis service.
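The idea of using simulation to study the interval around a prevalence estimate can be sketched as follows. This is not the code of the cited study; it is a minimal illustration that assumes a perfect test and simple random sampling, and the function name `simulate_prevalence_interval` and all parameter values are hypothetical.

```python
import random


def simulate_prevalence_interval(true_prevalence, sample_size,
                                 n_sims=1000, level=0.95, seed=1):
    """Repeat a random sample many times and return the central interval
    that contains `level` of the simulated prevalence estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        # Each sampled individual is positive with probability true_prevalence.
        positives = sum(rng.random() < true_prevalence for _ in range(sample_size))
        estimates.append(positives / sample_size)
    estimates.sort()
    lower_idx = int((1 - level) / 2 * n_sims)
    upper_idx = int((1 + level) / 2 * n_sims) - 1
    return estimates[lower_idx], estimates[upper_idx]


for n in (100, 400, 1600):
    low, high = simulate_prevalence_interval(true_prevalence=0.05, sample_size=n)
    print(f"n={n:5d}  95% of estimates fall in [{low:.3f}, {high:.3f}]")
```

Running the loop for several sample sizes shows how the interval tightens as n grows, which is the trade-off the sample size formula is meant to capture.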

Sample pooling:

Sample pooling involves mixing several samples together into a "batch", or pooled sample, and testing the pool with a single diagnostic test. This approach increases the number of individuals tested with a small amount of resources. For example, four samples may be tested together using the resources required to test a single sample. However, because the samples are diluted, less viral genetic material is available to detect, so there is a greater chance of false-negative results, particularly if the pooling procedure has not been properly validated. Sample pooling works well when prevalence is low, that is, when more negative results are expected than positive ones.

Results

Sample pooling significantly reduces the total number of tests needed for prevalence estimation. In low-prevalence populations, it is theoretically possible to pool hundreds of samples with only a marginal loss of precision. Even when the true prevalence is as high as 10%, pools of up to 15 samples can be appropriate, but this comes at the cost of not knowing which patients were positive. Pooling is particularly beneficial when the test has imperfect specificity, in which case it can give a more accurate estimate of the prevalence than an equal number of individual-level tests.
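To give a feel for the comparison between pooled and individual testing at an equal number of tests, here is a small simulation sketch. It assumes a perfect test (so it does not reproduce the imperfect-specificity result), a true prevalence of 10%, and pools of 15; the function names, the number of tests, and the number of simulation runs are all hypothetical choices.

```python
import random
import statistics


def simulate_individual(true_prevalence, n_tests, rng):
    """Prevalence estimate from n_tests individual-level tests."""
    positives = sum(rng.random() < true_prevalence for _ in range(n_tests))
    return positives / n_tests


def simulate_pooled(true_prevalence, n_tests, pool_size, rng):
    """Prevalence estimate from n_tests pooled tests of pool_size samples each."""
    positive_pools = 0
    for _ in range(n_tests):
        # A pool is positive if any member is positive (perfect test assumed).
        if any(rng.random() < true_prevalence for _ in range(pool_size)):
            positive_pools += 1
    share_positive = positive_pools / n_tests
    return 1 - (1 - share_positive) ** (1 / pool_size)


rng = random.Random(42)
N_SIMS, N_TESTS, POOL_SIZE, PREVALENCE = 300, 200, 15, 0.10

individual = [simulate_individual(PREVALENCE, N_TESTS, rng) for _ in range(N_SIMS)]
pooled = [simulate_pooled(PREVALENCE, N_TESTS, POOL_SIZE, rng) for _ in range(N_SIMS)]

print(f"individual: mean={statistics.mean(individual):.4f} "
      f"sd={statistics.stdev(individual):.4f}")
print(f"pooled:     mean={statistics.mean(pooled):.4f} "
      f"sd={statistics.stdev(pooled):.4f}")
```

With the same 200 tests, the pooled design covers 3,000 individuals instead of 200, so its estimates scatter less around the true value; what is lost is the identity of the positive individuals.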

Conclusion

Sample pooling should be considered in COVID-19 prevalence estimation efforts to obtain better results and analysis.

References

  1. Pourhoseingholi, M. A., Vahedi, M., & Rahimzadeh, M. (2013). Sample size calculation in medical studies. Gastroenterology and Hepatology from Bed to Bench, 6(1), 14.
  2. Brynildsrud, O. (2020). COVID-19 prevalence estimation by random sampling in population - optimal sample pooling under varying assumptions about true prevalence. BMC Medical Research Methodology, 20(1), 1-8.