# the ages of a group of randomly selected adult females have a standard deviation of years. assume that the ages of female statistics students have less variation than ages of females in the general population, so let years for the sample size calculation. how many female statistics student ages must be obtained in order to estimate the mean age of all female statistics students? assume that we want % confidence that the sample mean is within one-half year of the population mean. does it seem reasonable to assume that the ages of female statistics students have less variation than ages of females in the general population?

### James

Guys, does anyone know the answer?


## PLEASE HELP WITH MY STATISTICS

Start with the formula for Z:

Z = (x̄ - µ)/(σ/√n)

We want the sample mean to be within one-half year of the population mean, so we set x̄ - µ = 0.5. We are looking for a 99% confidence interval, so we set Z = 2.5758. We are told to use σ = 18.1. Plugging those values into the formula, we get:

2.5758 = 0.5/(18.1/√n)

We can rearrange to solve for n:

((2.5758 × 18.1)/0.5)^{2} = n

Plugging that into our calculator, we get n ≈ 8694.44. Since we can't have a fraction of a person in our sample, and rounding down would leave the margin of error slightly above one-half year, we round up to n = 8695.

## Find the sample size required to estimate the population mean

Find the sample size required to estimate the population mean. The ages of 147 randomly selected adult females have a standard deviation of 17.7 years. Assume that ages of female statistics students have less variation than ages of females in the general population, so let $\sigma$ = 17.7 years for the sample size calculation. How many female statistics student ages must be obtained in order to estimate the mean age of all female statistics students? Assume that we want 95% confidence that the sample mean is within one-half year of the population mean. Does it seem reasonable to assume that ages of female statistics students have less variation than ages of females in the general population?
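The calculation this question asks for can be sketched in a few lines of Python. This is only an illustration under the usual textbook assumption that the sample mean is approximately normal with known σ; the function names `z_critical` and `sample_size_for_mean` are made up for this sketch and do not come from any textbook or library.

```python
import math
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value of the standard normal for a confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def sample_size_for_mean(sigma: float, margin: float, z: float) -> int:
    """Smallest n with z * sigma / sqrt(n) <= margin, i.e. ceil((z * sigma / margin)^2)."""
    return math.ceil((z * sigma / margin) ** 2)


# Values from the question: sigma = 17.7 years, margin = 0.5 year, 95% confidence.
n_table = sample_size_for_mean(sigma=17.7, margin=0.5, z=1.96)              # rounded table z
n_exact = sample_size_for_mean(sigma=17.7, margin=0.5, z=z_critical(0.95))  # unrounded z
print(n_table, n_exact)
```

With the rounded table value z = 1.96 the result is 4815; with the unrounded critical value it is 4814. The two differ by one only because (z·σ/E)² lands very close to a whole number, so it is worth stating which z was used.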

## 3.2.2 Probability sampling

Statistics: Power from Data! is a web resource that was created in 2001 to assist secondary students and teachers of Mathematics and Information Studies in getting the most from statistics. Over the past 20 years, this product has become one of Statistics Canada's most popular references for students, teachers, and many other members of the general population. This product was last updated in 2021.

Probability sampling refers to the selection of a sample from a population, when this selection is based on the principle of randomization, that is, random selection or chance. Probability sampling is more complex, more time-consuming and usually more costly than non-probability sampling. However, because units from the population are randomly selected and each unit’s selection probability can be calculated, reliable estimates can be produced and statistical inferences can be made about the population.

There are several different ways in which a probability sample can be selected.

When choosing a probability sample design, the goal is to minimize the sampling error of the estimates for the most important survey variables, while simultaneously minimizing the time and cost of conducting the survey. Some operational constraints can also have an impact on that choice, such as characteristics of the survey frame.

In the present section, each of these methods will be described briefly and illustrated with examples.

## Simple random sampling

To draw a simple random sample from a telephone book, each entry would need to be numbered sequentially. If there were 10,000 entries in the telephone book and if the sample size was 2,000, then 2,000 numbers between 1 and 10,000 would need to be randomly generated by a computer. All numbers would have the same chance of being generated by the computer. The 2,000 telephone entries corresponding to the 2,000 computer-generated random numbers would make up the sample.

Simple random sampling (SRS) can be done with or without replacement. In an SRS with replacement, a sampled telephone entry may be selected twice or more. Usually, SRS is conducted without replacement because it is more convenient and gives more precise results. In the rest of the text, SRS will be used to refer to SRS without replacement, unless stated otherwise.
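As a rough sketch of the telephone-book example, the two variants can be drawn with Python's standard `random` module. The seed and variable names are illustrative choices, not part of the original text.

```python
import random

rng = random.Random(42)  # fixed seed only so the sketch is reproducible

population_size = 10_000  # telephone-book entries numbered 1..10,000
sample_size = 2_000
entries = range(1, population_size + 1)

# SRS without replacement: each entry can appear at most once.
srs_without = rng.sample(entries, k=sample_size)

# SRS with replacement: the same entry may be drawn more than once.
srs_with = rng.choices(entries, k=sample_size)

# Under SRS without replacement every entry has the same inclusion probability.
inclusion_probability = sample_size / population_size
print(len(set(srs_without)), len(set(srs_with)), inclusion_probability)
```

The first count is always 2,000 because no entry repeats, while the second is typically smaller because some entries are drawn more than once.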

SRS is the most commonly used method. The advantage of this technique is that it does not require any information on the survey frame other than the complete list of units of the survey population along with contact information. Also, since SRS is a simple method and the theory behind it is well established, standard formulas exist to determine the sample size, the estimates and so on, and these formulas are easy to use.

On the other hand, this technique necessitates a list of all units of the population. If such a list doesn’t already exist and the target population is large, it can be very expensive or unrealistic to create one. If a list already exists and includes auxiliary information on the units, then the SRS is not taking advantage of information that allows other methods to be more efficient (like stratified sampling, for example). If collection has to be made in-person, SRS could give a sample that is too spread out across multiple regions, which could increase costs and duration of the survey.

Imagine that you own a movie theatre and you are offering a special horror movie film festival next month. To decide which horror movies to show, you survey moviegoers to ask them which of the listed movies are their favorites. To create the list of movies needed for your survey, you decide to sample 10 of the 100 best horror movies of all time. One way of selecting a sample would be to write all of the movie titles on slips of paper and place them in an empty box. Then, draw out 10 titles and you will have your sample. By using this approach, you will have ensured that each movie had an equal probability of selection. You could even calculate this probability of selection by dividing the sample size (n=10) by the population size of the 100 best horror movies of all time (N=100). This probability would be 0.10 (10/100) or 1 in 10.

## Systematic sampling

In the example above, you can see that there are only four possible samples that can be selected, corresponding to the four possible random starts:

1, 5, 9, 13 … 393, 397

2, 6, 10, 14 … 394, 398

3, 7, 11, 15 … 395, 399

4, 8, 12, 16 … 396, 400
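The four listings above are consistent with a population of 400 numbered units and a sampling interval of 4; those two values are inferred from the listings rather than stated in the text. Under that assumption, a minimal sketch of the selection rule looks like this (`systematic_sample` is a name chosen for this illustration):

```python
import random


def systematic_sample(population_size: int, interval: int, start: int) -> list[int]:
    """Unit numbers start, start + interval, ... not exceeding population_size."""
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    return list(range(start, population_size + 1, interval))


# The four possible samples for 400 units and an interval of 4 (assumed values).
all_samples = {start: systematic_sample(400, 4, start) for start in range(1, 5)}

# In practice only one of them is drawn, by picking the start at random.
chosen = systematic_sample(400, 4, random.randint(1, 4))
print(chosen[:4], chosen[-2:])
```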

This method is often used in industry, where an item is selected for testing from a production line to ensure that machines and equipment are of a standard quality. For example, a tester in a manufacturing plant might perform a quality check on every 20th product in an assembly line. The tester might choose a random start between the numbers 1 and 20. This will determine the first product to be tested; every 20th product will be tested thereafter.

Interviewers can use this sampling technique when questioning people for a sample survey. The market researcher might select, for example, every 10th person who enters a particular store, after selecting the first person at random. The surveyor may interview the occupants of every fifth house on a street, after randomly selecting one of the first five houses.

The main advantage of systematic sampling is that sample selection could not be easier: you only need one random number, the random start, and the rest of the sample follows automatically. The biggest drawback is that if the population list has some periodic feature that coincides in some way with the sampling interval, the possible samples may not be representative of the population. This can be seen in the following example:

Suppose you run a large grocery store and have a list of the employees in each section. The grocery store is divided into the following 10 sections: deli counter, bakery, cashiers, stock, meat counter, produce, pharmacy, photo shop, flower shop and dry cleaning. Each section has 10 employees, including a manager (making 100 employees in total). Your list is ordered by section, with the manager listed first and then, the other employees by descending order of seniority.

If you wanted to survey your employees about their thoughts on their work environment, you might choose a small sample to answer your questions. If you use a systematic sampling approach and your sampling interval is 10, then you could end up selecting only managers or only the newest employees in each section. This type of sample would not give you a complete or appropriate picture of your employees’ thoughts.
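The grocery-store problem is easy to reproduce. The sketch below builds the employee list exactly as described (10 sections, manager first, then staff by descending seniority) and applies an interval of 10; the helper name and the tuple representation are illustrative assumptions.

```python
SECTIONS = ["deli counter", "bakery", "cashiers", "stock", "meat counter",
            "produce", "pharmacy", "photo shop", "flower shop", "dry cleaning"]

# Position 1 in each section is the manager; positions 2..10 are the other
# employees in descending order of seniority, so position 10 is the newest.
employees = [(section, position) for section in SECTIONS for position in range(1, 11)]


def systematic_sample(units, interval, start):
    """Every `interval`-th unit of the list, beginning with the `start`-th (1-based)."""
    return units[start - 1::interval]


managers_only = systematic_sample(employees, interval=10, start=1)
newest_only = systematic_sample(employees, interval=10, start=10)
print(managers_only[:2], newest_only[:2])
```

Every possible start selects the same position in all 10 sections, so none of the 10 possible samples mixes seniority levels. An interval that does not line up with the section size, or a reshuffled list, would avoid this.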

## Sampling with probability proportional to size

## Stratified sampling

Why create strata? There are many reasons, the main one being that it can make the sampling strategy more efficient. It was mentioned in the previous section that in order to obtain an estimate of a given precision, a larger sample size is needed for a characteristic that varies greatly from one unit to another than for a characteristic with smaller variability. For example, if every person in a population had the same salary, then a sample of one individual would be enough to get a precise estimate of the average salary.

Another advantage is that stratified sampling ensures an adequate sample size for subgroups of interest in the population. When a population is stratified, each stratum becomes an independent population and a sample size is calculated for each of them.

Suppose you want to estimate how many high school students have part-time jobs at the national level and provincial level. If you were to select a simple random sample of 25,000 people from a list of all high school students in Canada (assuming such a list was available for selection), you would end up with just a little over 100 people from Prince Edward Island, since they account for less than 0.5% of the Canadian population. This sample would probably not be large enough for the kind of detailed analysis you were planning for. Stratifying your list by province and then determining a sample size needed in each province would allow you to get the required level of precision for Prince Edward Island and for each of the other provinces as well.
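A small sketch shows why stratifying by province guarantees the sample size in a small province. The frame below is entirely hypothetical (100,000 students, 0.4% of them in a stratum labelled "PE"), and the stratum sample sizes are arbitrary illustrative choices.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, sizes, rng):
    """Draw an independent SRS (without replacement) inside each stratum."""
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)
    return {name: rng.sample(units, k=sizes[name]) for name, units in strata.items()}


rng = random.Random(0)

# Hypothetical frame of (student_id, province); 400 of 100,000 students are in "PE".
frame = [(i, "PE" if i < 400 else "Other") for i in range(100_000)]

# Expected number of "PE" students in a plain SRS of 2,500: only about 10.
expected_pe_in_srs = 2_500 * 400 / 100_000

# Stratified design: the "PE" sample size is fixed in advance instead.
sample = stratified_sample(frame, stratum_of=lambda unit: unit[1],
                           sizes={"PE": 200, "Other": 2_300}, rng=rng)
print(expected_pe_in_srs, len(sample["PE"]), len(sample["Other"]))
```

Because the units in "PE" are now sampled at a higher rate than the others, estimates for the whole population would need weights that reflect each stratum's sampling fraction.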

Stratification is most useful when the stratifying variables are

## Cluster sampling

Cluster sampling divides the population into groups or clusters. A number of clusters are selected randomly to represent the total population, and then all units within selected clusters are included in the sample. No units from non-selected clusters are included in the sample. They are represented by those from selected clusters. This differs from stratified sampling, where some units are selected from each stratum. Examples of clusters are factories, schools and geographic areas such as electoral subdivisions.

Suppose you are a representative from an athletic organization wishing to find out which sports Grade 11 (or secondary 4) students are participating in across Canada. It would be too costly and lengthy to survey every Canadian in Grade 11, or even a couple of students from every Grade 11 class in Canada. Instead, 100 schools are randomly selected from all over Canada. These 100 schools are the sampled clusters. Then all Grade 11 students in all 100 clusters are surveyed.

Cluster sampling creates “pockets” of sampled units instead of spreading the sample over the whole territory, which allows for cost reduction in collection operations. Another reason to use cluster sampling is that sometimes a list of all units in the population is not available, while a list of all clusters is either available or easy to create.

A drawback of cluster sampling is that you do not have total control over the final sample size. Since not all schools have the same number of Grade 11 students and you must interview every student in your sample, the final size may be larger or smaller than you expected.
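The school example can be sketched as follows. The frame is hypothetical: 1,000 schools whose Grade 11 enrolment varies between 40 and 120, numbers chosen only to make the uncontrolled final sample size visible.

```python
import random

rng = random.Random(1)

# Hypothetical frame of clusters: school name -> list of Grade 11 student ids.
schools = {f"school_{i}": [f"s{i}_{j}" for j in range(rng.randint(40, 120))]
           for i in range(1000)}


def cluster_sample(clusters, n_clusters, rng):
    """Select clusters by SRS, then keep every unit in the selected clusters."""
    chosen = rng.sample(sorted(clusters), k=n_clusters)
    return {name: clusters[name] for name in chosen}


sample = cluster_sample(schools, n_clusters=100, rng=rng)
total_students = sum(len(students) for students in sample.values())
print(len(sample), total_students)
```

The number of schools is fixed at 100, but `total_students` depends on which schools happen to be drawn, so it changes from one draw to the next.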

## Multi-stage sampling

Multi-stage sampling is like cluster sampling, except that it involves selecting a sample within each selected cluster, rather than including all units from the selected clusters. This type of sampling requires at least two stages. In the first stage, large clusters are identified and selected. In the second stage, units are selected from within the selected clusters using any of the probability sampling methods. In this context, the clusters are referred to as primary sampling units (PSUs) and units within clusters are referred to as secondary sampling units (SSUs). When there are more than two stages, tertiary sampling units (TSUs) are selected within SSUs, and the process continues until there is a final sample.

In Example 5, a cluster sample would choose 100 schools and then interview every Grade 11 student from those schools. Instead, you could select more schools, get a list of all Grade 11 students from these selected schools and select a random sample of Grade 11 students from each school. This would be a two-stage sampling design. Schools would be the PSU and students the SSU.

You could also get a list of all Grade 11 classes in the selected schools, pick a random sample of classes from each of those schools, get a list of all the students in the selected classes and finally select a random sample of students from each selected class. This would be a three-stage sampling design. Schools would be the PSU, classes would be the SSU and students would be the TSU. Each time a stage is added, the process becomes more complex.

Now imagine that each school has on average 80 Grade 11 students. Cluster sampling would then give your organization a sample of about 8,000 students (100 schools x 80 students). If you wanted a bigger sample, you could select schools with more students. For a smaller sample, you could select schools with fewer students. One way to control the sample size would be to stratify the schools into large, medium and small sizes (in terms of the number of Grade 11 students) and select a sample of schools from each stratum. This is called **stratified cluster sampling**.

As an alternative method, you could use a three-stage design. You would select a sample of 400 schools, then select two Grade 11 classes per school and finally, select 10 students per class. This way, you still end up with a sample of about 8,000 students (400 schools x 2 classes x 10 students), but the sample would be more spread out.

You can see from this example that with multistage sampling, you still have the benefit of a more concentrated sample for cost reduction. However, the sample is not as concentrated as cluster sampling and the sample size needed to obtain a given level of precision would still be bigger than for an SRS because the method is less efficient. Nonetheless, multistage sampling could still save a large amount of time and effort compared to SRS because you would not need to have a list of all Grade 11 students. All you would need is a list of the classes from the 400 schools and a list of the students from the 800 classes.
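The three-stage design described above (400 schools, 2 classes per school, 10 students per class) can be sketched like this. The frame is hypothetical: 2,000 schools with 3 Grade 11 classes of 27 students each, and simple random sampling is assumed at every stage.

```python
import random


def three_stage_sample(schools, n_schools, n_classes, n_students, rng):
    """schools maps school -> {class -> [students]}; SRS is used at each stage."""
    sample = []
    for school in rng.sample(sorted(schools), k=n_schools):        # stage 1: PSUs
        classes = schools[school]
        for cls in rng.sample(sorted(classes), k=n_classes):       # stage 2: SSUs
            sample.extend(rng.sample(classes[cls], k=n_students))  # stage 3: TSUs
    return sample


rng = random.Random(7)

# Hypothetical frame; each student is identified by (school, class, seat).
frame = {f"school_{s}": {f"class_{c}": [(s, c, k) for k in range(27)] for c in range(3)}
         for s in range(2000)}

sample = three_stage_sample(frame, n_schools=400, n_classes=2, n_students=10, rng=rng)
print(len(sample), len({school for school, _, _ in sample}))
```

Unlike the cluster design, the final size is fixed by construction at 400 × 2 × 10 = 8,000, and student lists are only needed for the 800 selected classes.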

## Multi-phase sampling

Multi-phase sampling is quite different from multistage sampling, despite the similarity of their names. Although multi-phase sampling also involves taking two or more samples, all samples are drawn from the same frame. Selection of a unit in the second phase is conditional on its selection in the first phase. A unit not selected in the first phase will not be part of the second-phase sample. As with multistage sampling, the more phases used, the more complex the sample design and estimation.

Multi-phase sampling is useful when the sampling frame lacks auxiliary information that could be used to stratify the population or to screen out part of the population.

Suppose that an organization needs information about cattle farmers in Alberta, but the survey frame lists all types of farms—cattle, dairy, grain, hog, poultry and produce. To complicate matters, the survey frame does not provide any auxiliary information for the farms listed there.

A simple survey whose only question would be “Is part or all of your farm devoted to cattle farming?” could be conducted. With only one question, this survey should have a low cost per interview (especially if done by telephone) and, consequently, the organization should be able to draw a large sample. Once the first sample has been drawn, a second, smaller sample can be extracted from among the cattle farmers and more detailed questions asked of these farmers. Using this method, the organization avoids the expense of surveying units that are not in this specific scope (i.e. non-cattle farmers).
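The two phases can be sketched as below. Everything numeric here is a made-up assumption (20,000 farms on the frame, a first-phase sample of 5,000, a second-phase sample of at most 300), and the `true_type` dictionary stands in for the answer a farm would give to the screening question.

```python
import random

rng = random.Random(3)
FARM_TYPES = ["cattle", "dairy", "grain", "hog", "poultry", "produce"]

# Hypothetical population: the farm type is NOT on the frame and is only
# learned by asking the screening question in phase 1.
true_type = {farm_id: rng.choice(FARM_TYPES) for farm_id in range(20_000)}
frame = list(true_type)

# Phase 1: large, cheap sample that answers the single screening question.
phase1 = rng.sample(frame, k=5_000)
cattle_farms = [farm for farm in phase1 if true_type[farm] == "cattle"]

# Phase 2: smaller sample for the detailed questionnaire, drawn only from
# phase-1 units that reported cattle farming.
phase2 = rng.sample(cattle_farms, k=min(300, len(cattle_farms)))
print(len(phase1), len(cattle_farms), len(phase2))
```

Both samples come from the same frame, and a farm can reach phase 2 only by first being selected in phase 1.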

In Example 7, the data collected in the first phase were used to exclude units that are not part of the target population. In another context, these data could have been used to improve the efficiency of the second phase, by creating strata, for example. Multi-phase sampling can also be used to reduce response burden or when there are very different costs associated with different questions of a survey, as illustrated in the next example.

In a health survey, participants are asked some basic questions about their diet, smoking habits, exercise routines and alcohol consumption. In addition, the survey requires that respondents subject themselves to some direct physical tests, such as running on a treadmill or having their blood pressure and cholesterol levels measured.

Filling out questionnaires or interviewing participants are relatively inexpensive procedures, but the medical tests require the supervision and assistance of a trained health practitioner, as well as the use of an equipped laboratory, both of which can be quite costly. The best way to conduct this survey would be to use a two-phase sample approach. In the first phase, the interviews are performed on an appropriately sized sample. From this sample, a smaller sample is drawn. Only participants selected in the second sample would take part in the medical tests.
