CSIR study materials - Research Methodology and Aptitude Part-2

Sampling: Sampling is the process of selecting units (e.g., people) from a population of interest so that by studying the sample we may fairly generalize our results back to the population from which they were chosen. A response is a specific measurement value that a sampling unit supplies. If you measure the entire population and calculate a value such as the mean, that value is called a parameter of the population. The distribution of a statistic (such as the mean) over an infinite number of samples of the same size as the sample in your study is known as the sampling distribution.

In sampling contexts, the standard error is called the sampling error. Sampling error gives us some idea of the precision of our statistical estimate. A low sampling error means that there is relatively little variability or range in the sampling distribution. How do we calculate sampling error? We base the calculation on the standard deviation of our sample. The greater the sample standard deviation, the greater the standard error (and the sampling error). The standard error is also related to the sample size: the greater your sample size, the smaller the standard error, because a larger sample is closer to the actual population itself. If you take a sample that consists of the entire population, you actually have no sampling error because you don't have a sample; you have the entire population. In that case, the mean you calculate is the parameter.
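As a rough illustration of how the standard error depends on both the sample standard deviation and the sample size, the sketch below assumes the commonly used estimate for the standard error of a mean, s / sqrt(n). The text does not state this formula explicitly, and the scores are invented for the example.

```python
import math
import statistics


def standard_error(sample):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


# Hypothetical scores, invented purely for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
se_small = standard_error(scores)

# Repeating the same values four times keeps the spread similar but
# quadruples n, so the estimated standard error shrinks.
se_large = standard_error(scores * 4)

print(se_small, se_large)
```

With a similar spread, the larger sample gives the smaller standard error, which is the relationship described above.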

A probability sampling method is any method of sampling that uses some form of random selection, such as picking a name out of a hat or choosing the short straw.

The simplest form of random sampling is called simple random sampling. Simple random sampling is simple to accomplish and is easy to explain to others. Because simple random sampling is a fair way to select a sample, it is reasonable to generalize the results from the sample back to the population. However, simple random sampling is not the most statistically efficient method of sampling, and you may, just by the luck of the draw, not get a good representation of subgroups in the population.
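A minimal sketch of simple random sampling, assuming the population can be listed as a sampling frame. The frame of 1000 numbered units and the function name are hypothetical; Python's standard random.sample draws without replacement so that every unit is equally likely to be chosen.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement; every unit is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical sampling frame of 1000 numbered units.
frame = list(range(1, 1001))
sample = simple_random_sample(frame, 50, seed=42)

print(len(sample), len(set(sample)))
```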

Stratified Random Sampling, also sometimes called proportional or quota random sampling, involves dividing your population into homogeneous subgroups and then taking a simple random sample in each subgroup. It has two advantages. First, it assures that you will be able to represent not only the overall population, but also key subgroups of the population, especially small minority groups. Second, stratified random sampling will generally have more statistical precision than simple random sampling. This will only be true if the strata or groups are homogeneous.
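The idea can be sketched as follows, assuming a proportionate design in which the same sampling fraction is applied to every stratum. The population, the group labels "A" and "B", and the function name are all made up for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Take a simple random sample of the same fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # divide into subgroups
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(members, k)  # simple random sample in the subgroup
    return sample


# Hypothetical population: 900 units in group "A", 100 in minority group "B".
population = [("A", i) for i in range(900)] + [("B", i) for i in range(100)]
result = stratified_sample(population, lambda u: u[0], 0.1, seed=1)

print({name: len(members) for name, members in result.items()})
```

Because each subgroup is sampled separately, the small group "B" is guaranteed a place in the sample, which a single simple random sample of the whole population would not guarantee.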

The problem with random sampling methods when we have to sample a population that's dispersed across a wide geographic region is that you will have to cover a lot of ground geographically in order to get to each of the units you sampled. It is for precisely this problem that cluster or area random sampling was invented. In cluster sampling, we follow these steps:
1. divide population into clusters (usually along geographic boundaries)
2. randomly sample clusters
3. measure all units within sampled clusters
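
The three steps above can be sketched as follows. The district names and unit labels are hypothetical, and the population is assumed to be already divided into clusters (step 1).

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # step 2: sample clusters
    units = [unit for name in chosen for unit in clusters[name]]  # step 3: all units
    return chosen, units


# Step 1 (hypothetical): the population divided into five districts.
districts = {
    "north": ["n1", "n2", "n3"],
    "south": ["s1", "s2"],
    "east": ["e1", "e2", "e3", "e4"],
    "west": ["w1"],
    "central": ["c1", "c2"],
}
chosen, units = cluster_sample(districts, 2, seed=7)

print(chosen, units)
```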

Non-probability sampling. The difference between nonprobability and probability sampling is that nonprobability sampling does not involve random selection and probability sampling does. We can divide nonprobability sampling methods into two broad types: accidental and purposive. In accidental sampling, the sample is chosen accidentally or by convenience, and we have no evidence that it is representative of the population we're interested in generalizing to; in many cases we would clearly suspect that it is not. An example is the use of college students in psychological surveys. In purposive sampling, we sample with a purpose in mind. We usually have one or more specific predefined groups we are seeking. For instance, have you ever run into people in a mall or on the street who are carrying a clipboard and who are stopping various people and asking if they could interview them? Most likely they are conducting a purposive sample. Purposive sampling can be very useful for situations where you need to reach a targeted sample quickly and where sampling for proportionality is not the primary concern. With a purposive sample, you are likely to get the opinions of your target population, but you are also likely to overweight subgroups in your population that are more readily accessible.

One type of purposive sampling is quota sampling. In quota sampling, you select people nonrandomly according to some fixed quota. There are two types of quota sampling: proportional and non-proportional. In proportional quota sampling you want to represent the major characteristics of the population by sampling a proportional amount of each. For example, if 40% of a population of, say, 1000 is female, you keep sampling until 40% of your sample is female.
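A minimal sketch of proportional quota sampling, assuming people are taken nonrandomly in the order they happen to arrive and are accepted only while their group's quota is still open. The stream of passers-by, the group codes "F" and "M", and the total of 100 are invented for the example.

```python
def proportional_quota_sample(stream, group_of, proportions, total):
    """Accept people in arrival order until each group's quota is full."""
    # Quotas mirror the population shares; rounding may shift the total slightly.
    quotas = {group: round(total * share) for group, share in proportions.items()}
    counts = {group: 0 for group in quotas}
    accepted = []
    for person in stream:
        group = group_of(person)
        if counts.get(group, 0) < quotas.get(group, 0):
            accepted.append(person)
            counts[group] += 1
        if counts == quotas:  # every quota filled, stop sampling
            break
    return accepted


# Hypothetical stream of passers-by in which every third person is female.
passers_by = [("F", i) if i % 3 == 0 else ("M", i) for i in range(1000)]
quota_sample = proportional_quota_sample(
    passers_by, lambda p: p[0], {"F": 0.4, "M": 0.6}, total=100
)

print(len(quota_sample))
```

No random selection is involved here: whoever arrives first fills the quota, which is why the result is a nonprobability sample even though its group proportions match the population.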

Then there is snowball sampling. In snowball sampling, you begin by identifying someone who meets the criteria for inclusion in your study. You then ask them to recommend others who they may know who also meet the criteria.
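
The referral process can be sketched as a walk over a network of contacts. The names, the referral table, and the eligibility rule below are all hypothetical; the sketch simply follows recommendations outward from the first eligible person and keeps those who meet the criteria.

```python
from collections import deque


def snowball_sample(seeds, referrals, meets_criteria, max_size):
    """Follow referrals outward from the seeds, keeping eligible people."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        if not meets_criteria(person):
            continue  # ineligible people are not sampled or asked for referrals
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # never approach the same person twice
                seen.add(contact)
                queue.append(contact)
    return sample


# Hypothetical referral network: who recommends whom.
referrals = {
    "ana": ["ben", "cy"],
    "ben": ["dee"],
    "cy": ["ana", "eli"],
    "dee": [],
    "eli": ["fay"],
}
eligible = {"ana", "ben", "cy", "eli", "fay"}
snowball = snowball_sample(["ana"], referrals, lambda p: p in eligible, max_size=10)

print(snowball)
```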

Research Design


Research design provides the glue that holds the research project together. A design is used to structure the research, to show how all of the major parts of the research project (the samples or groups, measures, treatments or programs, and methods of assignment) work together to try to address the central research questions. A design can be either experimental or non-experimental.

Data analysis is the last part of the research. In most social research the data analysis involves three major steps, done in roughly this order:

1. Cleaning and organizing the data for analysis (Data Preparation)
2. Describing the data (Descriptive Statistics)
3. Testing Hypotheses and Models (Inferential Statistics)

Data Preparation involves checking or logging the data in; checking the data for accuracy; entering the data into the computer; transforming the data; and developing and documenting a database structure that integrates the various measures.

Descriptive Statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple graphics analysis, they form the basis of virtually every quantitative analysis of data. With descriptive statistics you are simply describing what is, what the data shows.

Inferential Statistics investigate questions, models and hypotheses. In many cases, the conclusions from inferential statistics extend beyond the immediate data alone. For instance, we use inferential statistics to try to infer from the sample data what the population thinks. Or, we use inferential statistics to make judgments of the probability that an observed difference between groups is a dependable one or one that might have happened by chance in this study. Thus, we use inferential statistics to make inferences from our data to more general conditions; we use descriptive statistics simply to describe what's going on in our data.
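
To make the contrast concrete, the sketch below first summarizes two groups (descriptive) and then estimates how often chance alone would produce a difference in means at least as large as the one observed (inferential). The permutation test is just one illustrative technique chosen for this example, not a method named in these notes, and the scores are invented.

```python
import random
import statistics


def describe(values):
    """Descriptive statistics: summarize the data at hand."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Inferential statistics: share of random regroupings whose difference
    in means is at least as large as the observed difference."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical scores for two groups, invented for illustration.
treatment = [12, 15, 14, 16, 13, 17, 15, 18]
control = [10, 11, 9, 12, 10, 13, 11, 12]

print(describe(treatment))
print(describe(control))
p_value = permutation_p_value(treatment, control)
print(p_value)
```

A small value suggests the observed difference is unlikely to have happened by chance in this study; the summaries from describe only report what the data show.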