Sample Survey Vs. Census Survey
Why do a sample survey? Sampling is usually
done to achieve some efficiency in time, resources, and expense in an
Organizational Survey project. Contrary to client expectations, survey
sampling may result in a more complex and more costly survey process.
We cannot determine at this point which sampling approach
is best for your company. Much will depend on an examination of the
organizational structure and the utility associated with decisions that
will be made with the survey results. For now, we can only counsel client
organizations to be flexible in their thinking about survey sampling.
Survey sampling is both a science and an
art. The experience of the survey consultant will allow the wedding
of common sense to methodological rigor. The purpose of sampling is
to achieve efficiency, representation, and minimal disruption.
We assume that you elected to use a sample survey to save
time and expense and to minimize the disruption of the staff, while still
maintaining statistical representation during the project.
There are some serious drawbacks to sampling.
For example, a very complicated stratified random sample might put an
increased resource burden on the client's staff. It may require
greater use of consultants, client time, and computer time to test
and manage the sampling model. This added cost may offset any
savings from administering the survey to a sample. Also, requiring
representation in many subgroups can mean surveying most
of the population anyway.
Sampling resources: You must identify resources
within your organization that can supply computed or case-wise data
from the personnel resources database(s). In a two-step process, your
survey consultant will:
- Conduct a discovery process to understand the makeup of your
organization's population at the person level (e.g., demographics,
title, job level) and at the unit level (e.g., region, function,
field office);
- Specify, in detail, a sampling plan for data collection that will
yield representative results with acceptable levels of reliability
and precision.
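
As a rough illustration of what a simple sampling plan could look like once the discovery step has produced a personnel extract, the sketch below draws an equal-size random sample from each unit. The record layout (`id`, `unit`), the function name `stratified_sample`, and the group sizes are all hypothetical; a real plan would be specified from your own data.

```python
import random
from collections import defaultdict


def stratified_sample(records, stratum_key, n_per_stratum, seed=0):
    """Draw a simple random sample of up to n_per_stratum records per stratum."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    strata = defaultdict(list)
    for rec in records:
        strata[rec[stratum_key]].append(rec)
    sample = {}
    for name, members in strata.items():
        k = min(n_per_stratum, len(members))  # never ask for more than exist
        sample[name] = rng.sample(members, k)  # sampling without replacement
    return sample


# Hypothetical personnel extract: 40 field staff and 160 headquarters staff.
personnel = (
    [{"id": i, "unit": "field"} for i in range(40)]
    + [{"id": 100 + i, "unit": "headquarters"} for i in range(160)]
)

drawn = stratified_sample(personnel, "unit", n_per_stratum=20)
for unit, people in drawn.items():
    print(unit, len(people))
```

Sampling without replacement within each unit keeps any one person from being drawn twice, and the fixed seed lets the consultant and the client reproduce the same list.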
The efficiency with which a survey sample
is developed will depend on your company's internal skills in data
query and reporting, how consolidated or dispersed the relevant databases are,
data accuracy, and the extent to which there are multiple files with
multiple data structures.
Your survey consultant may expect to download
computed and case-wise data for their own tabulation and analysis, in
addition to computations and analyses provided by your organization's
resources. Your survey consultant should understand that they will not have
free access to your personnel data systems, but that you will provide
the data they regard as necessary. These data may include e-mail addresses,
office telephone numbers, and possibly fax numbers.
Classical Vs. Bayesian Sampling
We advise a sampling approach
that yields the smallest sample size possible, consistent with the decision-making
needs of the client. Most survey consultants take a classical approach
to survey sampling. We take a Bayesian approach. In most applications,
a classical sampling model will yield much larger sample sizes than
a Bayesian sampling model. The reason is that a Bayesian sampling technique
lets us make assumptions about the utility of the decisions that the
client will make with the survey data. When trigger points for decisions
are very close, and when the profits or losses associated with decisions are
very high, classical and Bayesian sampling tend to yield the same sample
sizes.
For example, suppose you assume
that field office number 1 is average in satisfaction (60 percent favorable)
and you would not take action unless the new survey figure were at least
30 points lower. Your trigger point is then very far away.
If you observed a result of 28 percent favorable, and your response would be
only to watch the field office for a while rather than take immediate steps,
then there is little cost to your decision. As a result, sample sizes
can be quite small. This is the advantage of a Bayesian approach over
a classical approach.
On the other hand, if you expect
to take action if the results are only two points lower (a very close
trigger point), and your decision will be to fire the regional director,
institute retraining for everyone, and reassign all the staff, then
the costs of your decisions are quite high. As a result, sample sizes
will be rather large. Classical and Bayesian sampling approaches would
tend to yield the same sample sizes in this example.
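
The effect of trigger-point distance on sample size can be illustrated with a rough sketch. The code below does not implement the Bayesian utility model, whose details are not given here. It uses the standard classical formula for estimating a proportion, n = z^2 p(1 - p) / d^2, and treats the distance to the trigger point as the margin d. The function name `classical_n` and the 95 percent confidence level (z = 1.96) are assumptions made for illustration.

```python
import math


def classical_n(p, margin, z=1.96):
    """Classical sample size to estimate a proportion near p within +/- margin.

    Assumes simple random sampling from a large population (no finite
    population correction) and z = 1.96 for roughly 95 percent confidence.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


baseline = 0.60  # field office assumed average: 60 percent favorable

n_far = classical_n(baseline, 0.30)    # trigger point 30 points away
n_close = classical_n(baseline, 0.02)  # trigger point 2 points away

print("far trigger:", n_far)
print("close trigger:", n_close)
```

With a 30-point trigger distance the formula asks for 11 respondents; with a 2-point distance it asks for 2,305. Even before any utilities are considered, a distant trigger point needs far fewer respondents than a close one.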
In our experience with typical
Organizational Surveys, we can usually reduce the sample sizes from
a classical approach by half. Depending on the circumstances, the sample
can be significantly smaller still.
Another issue in sampling is
proportional representation. It is often assumed that an overall sampling
rate of, say, 10 percent should be applied within each of the subgroups.
So if we sample 10 percent of the field population (200 people), then,
it is assumed, we should also sample 10 percent of the headquarters
staff (800 people). This assumption is false.
If we are to compare
field vs. headquarters with equal reliability, then we need equal sample
sizes. For a given standard deviation, the Standard Error of the Mean (SEM)
depends on only one thing, the sample size, n. This is evident in the formula
$\mathrm{SEM} = s / \sqrt{n}$, where s is the standard deviation and n is the
sample size. Therefore, assuming similar standard deviations in the two groups,
we only need 200 headquarters staff to give results as reliable
as 200 field people. This is another technique, often overlooked by
survey consultants, that can keep the sample size down to a minimum.
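
A minimal numerical check of this point, assuming the two groups share the same standard deviation (the value 1.0 below is an arbitrary placeholder, not survey data):

```python
import math


def sem(s, n):
    """Standard error of the mean: s / sqrt(n)."""
    return s / math.sqrt(n)


s = 1.0  # assumed common standard deviation for both groups

sem_field = sem(s, 200)            # 200 field people
sem_hq_equal = sem(s, 200)         # 200 headquarters staff
sem_hq_proportional = sem(s, 800)  # 800 headquarters staff (10 percent rule)

print("field, n=200:", round(sem_field, 4))
print("headquarters, n=200:", round(sem_hq_equal, 4))
print("headquarters, n=800:", round(sem_hq_proportional, 4))
```

Under this assumption, 200 headquarters respondents give the same SEM as 200 field respondents. Quadrupling the headquarters sample to 800 only halves its SEM, and the comparison with the field group is still limited by the field group's 200.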
ANOVA Design
Conventional wisdom holds that cell sizes
in an ANOVA design should be at least 10, and certainly no lower than 5. In
a complex design with many factors, the number of cells could
be very large, and the sample yield would have to be very large as
well. For example, let's look at a 2 x 3 x 4 x 5 design. This might
be gender (2) by job (3) by region (4) by field office (5). Such a design
has 120 cells. At 10 respondents per cell, the sample yield would be
1,200, and the total number of people approached might be as high as
2,400 (for example, at a 50 percent response rate).
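
The arithmetic behind these figures can be sketched as follows. The 50 percent response rate is an assumption inferred from the ratio of 2,400 approached to 1,200 returned; it is not a measured figure.

```python
import math

# Number of levels for each factor in the 2 x 3 x 4 x 5 design.
levels = {"gender": 2, "job": 3, "region": 4, "field_office": 5}

cells = math.prod(levels.values())    # one cell per combination of levels
min_cell_size = 10                    # conventional minimum per cell
sample_yield = cells * min_cell_size  # completed surveys needed

response_rate = 0.5                   # assumed, see the note above
approached = math.ceil(sample_yield / response_rate)

print("cells:", cells)
print("sample yield:", sample_yield)
print("people approached:", approached)
```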
There is a way to reduce this sample size
dramatically. The recommended approach requires a cell size of 1 and
a total sample yield of 120 people. At first blush this may seem nonsensical,
because error variance is normally estimated from within-cell variance. A cell
of size 1 has no within-cell variance and so is of no
use in estimating error variance. The solution is to use another source
of variance as your estimate of the error variance. To do this, you examine
the ratio of the variances of the three-way and four-way interaction terms. If the
ratio is not significant, then pool the variances and use them as an
estimate of the error variance.
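
The pooling step can be sketched in code under stated assumptions. The sketch reads the interaction terms in question as the three-way and four-way interactions of the 2 x 3 x 4 x 5 design, and it uses simulated scores, not survey data. All function and variable names are hypothetical. It computes the F ratio and the pooled error term but leaves the significance decision to an F table or a statistics package.

```python
import itertools
import math
import random

LEVELS = (2, 3, 4, 5)  # gender x job x region x field office


def margin_means(data, axes):
    """Mean response for each combination of levels on the given axes."""
    sums, counts = {}, {}
    for cell, y in data.items():
        key = tuple(cell[a] for a in axes)
        sums[key] = sums.get(key, 0.0) + y
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


def effect_ss(data, axes):
    """Sum of squares for the effect on `axes` (balanced, one score per cell)."""
    subsets = [t for r in range(len(axes) + 1)
               for t in itertools.combinations(axes, r)]
    means = {t: margin_means(data, t) for t in subsets}
    total = 0.0
    for cell in data:
        # Inclusion-exclusion over lower-order margins isolates this effect.
        effect = sum(
            (-1) ** (len(axes) - len(t)) * means[t][tuple(cell[a] for a in t)]
            for t in subsets
        )
        total += effect ** 2
    return total


def effect_df(axes):
    """Degrees of freedom: product of (levels - 1) over the factors involved."""
    return math.prod(LEVELS[a] - 1 for a in axes)


# Simulated data: one score per cell, a gender main effect plus random noise.
rng = random.Random(1)
data = {cell: 60 + 5 * cell[0] + rng.gauss(0, 10)
        for cell in itertools.product(*(range(k) for k in LEVELS))}

three_way = list(itertools.combinations(range(4), 3))
four_way = (0, 1, 2, 3)

ss3 = sum(effect_ss(data, axes) for axes in three_way)
df3 = sum(effect_df(axes) for axes in three_way)
ss4 = effect_ss(data, four_way)
df4 = effect_df(four_way)

# Ratio of the three-way to the four-way mean square; compare with an F table.
f_ratio = (ss3 / df3) / (ss4 / df4)

# If the ratio is not significant, pool both sources as the error estimate.
pooled_error_df = df3 + df4
pooled_error_ms = (ss3 + ss4) / pooled_error_df

# Example use: test the gender main effect against the pooled error term.
f_gender = (effect_ss(data, (0,)) / effect_df((0,))) / pooled_error_ms

print("F (three-way vs four-way):", round(f_ratio, 3), "df:", df3, df4)
print("pooled error MS:", round(pooled_error_ms, 3), "df:", pooled_error_df)
print("F (gender vs pooled error):", round(f_gender, 3))
```

In this design the three-way interactions carry 50 degrees of freedom and the four-way interaction carries 24, so the pooled error term has 74 degrees of freedom from only 120 respondents.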
There is one major
technical flaw with this technique: you are changing your model contingent
upon your data. This would be a serious matter in a design requiring
great precision and strict adherence to experimental method. However,
for hypothesis generation, pilot studies, and interim observations, where
the focus is on main effects and there is minimal concern about interactions,
it can be very efficient and can minimize the demand on the survey consultant's
resources and on the client organization.