Precision estimates and suggested sample sizes for length-frequency data

Fishery Bulletin,  Jan, 2007  by Hans D. Gerritsen,  David McGrath

Abstract--For most fisheries applications, the shape of a length-frequency distribution is much more important than its mean length or variance. This makes it difficult to evaluate at which point a sample size is adequate. By estimating the coefficient of variation of the counts in each length class and taking a weighted mean of these, a measure of precision was obtained that takes the precision in all length classes into account. The precision estimates were closely associated with the ratio of the sample size to the number of size classes in each sample. As a rule-of-thumb, a minimum sample size of 10 times the number of length classes in the sample is suggested because the precision deteriorates rapidly for smaller sample sizes. In the absence of such a rule-of-thumb, samplers have previously underestimated the required sample size for samples with large fish, while over-sampling small fish of the same species.

**********

Length measurements are fundamental to many aspects of fisheries science. However, there is little formal guidance on the appropriate size of a length sample. Such guidance is of particular relevance when the number of fish available exceeds the number that can be measured at a reasonable cost, and a subsample needs to be taken. Clearly, the required precision of a length sample depends on the purpose of sampling. In order to identify modes of individual year classes for a length-based assessment, the precision of the sample needs to be quite high. Sample sizes of more than 1000 are necessary to identify more than half the modes in a typical length distribution (Erzini, 1990). A sample size of at least 100 adult fish was recommended for age-based stock assessment purposes (Anderson and Neumann, 1996), although the authors did not mention how they arrived at this number.

Regardless of the type of assessment that is used, the shape of the length-frequency distribution is of interest, rather than simple summary statistics such as the mean or the variance. For this reason, it has proved difficult to quantify what constitutes a representative or adequately precise length distribution. Some studies have attempted to find minimum or optimum sample sizes by comparing samples to an expected distribution (e.g., Muller (1); Gomez-Buckley et al. (2); Vokoun et al., 2001). However, the true distribution is usually unknown, and dissimilarity from the expected distribution does not necessarily indicate an imprecise sample. In addition, these methods provide only indirect measures of precision that are difficult to evaluate objectively.

Thompson (1987) used the precision of a sample explicitly to establish an appropriate sample size. Thompson proved that a sample size of 510 is sufficient to be 95% confident that all estimated proportions in a multinomial distribution are no more than 5% from the true proportion. However, Thompson based this figure (n=510) on a worst-case scenario, which, in the present case, is a length-frequency distribution that is evenly apportioned over three size classes. Because this is not the typical shape of a length-frequency distribution used in fisheries science, Thompson's measure of precision is too conservative for the vast majority of cases.
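The worst-case figure can be checked with a short calculation. The sketch below is an illustration only, not necessarily Thompson's exact procedure: it assumes a normal approximation for each class proportion, splits the 5% error probability evenly over the m classes, and reads "5% from the true proportion" as an absolute distance of d = 0.05. The function name and its parameters are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def worst_case_sample_size(alpha=0.05, d=0.05, max_classes=20):
    """Return (n, m): the largest sample size required over all evenly
    apportioned distributions with m = 2..max_classes classes, and the
    number of classes m at which that maximum occurs.

    Assumption: each estimated proportion is approximately normal, and
    the overall error probability alpha is shared equally by m classes.
    """
    best_n, best_m = 0, None
    for m in range(2, max_classes + 1):
        p = 1.0 / m  # evenly apportioned classes
        # Two-sided critical value with alpha divided over m classes.
        z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * m))
        # Sample size needed so that z * sqrt(p(1-p)/n) <= d.
        n = ceil(z * z * p * (1.0 - p) / (d * d))
        if n > best_n:
            best_n, best_m = n, m
    return best_n, best_m


n, m = worst_case_sample_size()
print(n, m)
```

Under these assumptions the maximum occurs at three equal classes and gives n = 510, in agreement with the figure quoted above. Distributions spread unevenly over many length classes require fewer fish for the same criterion.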

For most fisheries applications, it would be more useful to define the precision of a length-frequency sample as the mean precision over the entire size range. However, it appears that this approach has not been used to establish an optimum sample size. Such mean precision estimates over the entire size range might be used to obtain a rule-of-thumb for sample sizes that are required in order to obtain a certain precision level of the catch at each location. In the present study we aim 1) to determine a rule-of-thumb for obtaining an appropriate sample size when the number of fish available in a particular sample exceeds the number that can be measured at a reasonable cost, and 2) to examine the sample sizes that have been taken in the past, in the absence of such guidance.

Materials and methods

Data were used from the Irish Groundfish survey, which was carried out on RV Celtic Explorer in the waters around Ireland during October and November 2005. The catch was sorted into species and, if appropriate, into size grades, each of which was treated as a separate length sample. Length measurements were taken from all fish and squid species that were caught. If the number of individuals in a sample was large, a subsample was taken by repeatedly transferring the sample from each fish box into two other boxes and discarding one of these. This method ensures that the entire catch is represented uniformly in the subsample. At the time of the survey, the samplers did not have any particular guidance on the appropriate size for a subsample; they used their own judgment to decide on the sample size.

The precision of the number of observations in each length class of a random sample can be estimated by assuming a multinomial distribution (Smith and Maguire, 1983). If the precision in each length class is expressed in the form of a coefficient of variation (CV), an overall measure of precision can be obtained by weighting each CV by the number of fish in each length class. This mean weighted CV (MWCV) provides a description of the precision over the entire range of size classes in a length-frequency distribution.
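The MWCV calculation described above can be sketched as follows. This is a minimal illustration, assuming the standard multinomial variance for the count in length class i, n p_i (1 - p_i), where n is the sample size and p_i is the observed proportion in that class. The function name and the example counts are hypothetical and are not taken from the survey data.

```python
from math import sqrt


def mean_weighted_cv(counts):
    """Mean weighted CV of a length-frequency sample.

    counts: number of fish observed in each length class.
    Assumption: the counts follow a multinomial distribution, so the
    standard deviation of the count in class i is sqrt(n * p_i * (1 - p_i)).
    """
    n = sum(counts)
    if n <= 0:
        raise ValueError("the sample must contain at least one fish")
    weighted_sum = 0.0
    for n_i in counts:
        if n_i == 0:
            continue  # empty classes carry zero weight; their CV is undefined
        p_i = n_i / n
        sd_i = sqrt(n * p_i * (1.0 - p_i))  # multinomial SD of the count
        cv_i = sd_i / n_i                   # CV of the count in this class
        weighted_sum += cv_i * n_i          # weight the CV by the class count
    return weighted_sum / n


# Hypothetical sample: 100 fish spread over five length classes.
example_counts = [10, 20, 40, 20, 10]
print(round(mean_weighted_cv(example_counts), 3))
```

For the hypothetical sample above the result is about 0.189, that is, a mean weighted CV of roughly 19% for 100 fish in five length classes. Because each CV is weighted by its own class count, the MWCV reduces to the sum of the class standard deviations divided by the sample size.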