Determine the scope, method, modality, and population of interest by acknowledging and refining your attitude

Er = Wpa + Ipi: The summation of panoramic wisdom and piercing insight.


1. Acknowledge and refine your attitude.

We all experience reality from within a basic human framework. However, this basic framework allows for a diverse set of (at least at times) widely variant worldviews, which are derived from the interactions of a number of known factors. When doing research, phenomenologists pay particular attention to one such factor: one's own attitude. While they tend to focus particularly on one’s attitude at the onset of the experience, they also generally continue to note changes in attitude evoked during the experience, as well as its subsequent recollection, representation, analysis, and interpretation. While we encourage you to explore the specialized meaning that phenomenologists apply to the term attitude, here we define it generally as the paradigm or mindset motivating the desire to understand a person (or group of people) in some manner, and for some purpose.

It is important to explicitly acknowledge one’s initial attitude at the outset of a project, because this attitude determines one’s preconceived tendencies in terms of:
  • the perceived phenomenon of study, including the types of questions one asks, experiences one examines, or stimuli one creates;
  • the perceived context of the phenomenon, including that which surrounds its experience, interpretation, and the application of its results;
  • the types and aspects of potential data that one perceives to be focally important, and therefore attends to for greater periods of time;
  • the methods that one employs to analyze, essentialize, or evaluate one’s data;
  • the types and classes or states of variables that one selects to examine;
  • the natural language sample one acquires to represent or reveal the experience of the phenomenon.
For the same reasons, it is also important to let one’s attitude remain flexible, such that it can accommodate the experience itself throughout the project, and serve to elaborate on the revealed or evoked content of that experience in its apparently original form, rather than inadvertently forcing it into a theoretical, disciplinary, or other Procrustean bed.

Before turning to the specific actions involved in selecting an attitude, we note that the attitude that Raven’s Eye employs is described in the General tendencies and attitudes section of these Technicals. It is this attitude that both facilitates your research, and enables you to avoid some of the attitude-borne assumptions and limitations programmed into other natural language processing and qualitative data analysis programs.

The actions involved in acknowledging and refining an attitude now follow.

1.1 Select or construct a phenomenon.

Decide on the stimulus, experience, situation, or other phenomenon that you would like to understand more fully. Define it in terms of its apparent structure, function, or both.

1.2 Determine your lens.

Identify the disciplinary and theoretical lens by which you are operating, the extant and relevant information or related scholarship on the phenomenon, and the outcomes desired from your project. Also examine the practicalities of your particular situation, and the way in which the data will be put to use.

1.3 Determine your focus.

Given the pre-existing knowledge base and current context informing your project, identify aspects of the phenomenon that would be particularly useful to understand more fully, and the specific ways in which an additional project could realistically complement or extend the knowledge already present.

1.4 Determine your procedures and adjunct analyses.

Based on your lens and focus, determine those procedures that should be followed, as well as any additional analyses that should be performed, in order to understand the apparent form or function of your phenomenon. In this process, specifically note how you plan to acquire and analyze the types of quantitative and qualitative data necessary for your project.

1.5 Select your variables and their respective classes or states.

According to the procedures and analyses that you have selected, identify specific variables that both appear to be associated with the phenomenon, and are amenable to the types of analyses you expect to perform. As part of this, identify the appropriate scale or level of measurement for each variable, and delineate the states or classes in which you expect each variable to exist.

1.6 Define and acquire your natural language sample.

Once you have performed the preceding substeps, you are ready to define and acquire a natural language sample. To define your sample, consider the most appropriate means of revealing your phenomenon in natural language, and the group of people or textual records that best represent or reveal it. Having done so, define your natural language sample according to these bounds, and design your procedures such that you can most effectively acquire it. Note any practical limitations arising from your design, as well as any circumstances or events occurring during data acquisition that might influence results.

1.6.1 Sampling for generalization.

If you are operating from a quantitative or logical positivistic perspective in which generalization is of concern, you may use random and representative sampling procedures to select your sample, and then make scientifically sound claims about your confidence in the ability to generalize your results to a given population and stimulus. Calculating the relationship between sample size and confidence in generalization does, however, involve a different procedure than is typically found in most social science research methods handbooks.

This is because naturally produced language results in word proportions that are neither normally distributed nor independent. Fortuitously, when combined with the results produced by our algorithm, the systematically predictable nature of word relationships often means that fewer cases (or participants) are required for scientifically sound generalization than typical statistical sample size calculations would demand.

Our results include an Overrepresentation score for each word in a given dataset. This score is depicted along the vertical axis in the main chart, and also presented in the rightmost column of the main table. As described in the Understanding your results page of our Practicals, this score represents the proportion of each word in the response set (or column) as compared to that same word's proportion in your selected language corpus. For instance, an Overrepresentation score of 10 means that the word is found in the responses or cases at 10 times its typical rate of use in the background corpus. Except in very small samples, such an Overrepresentation score would generally indicate a moderate degree of association between the word and the stimulus or question leading to the production of the natural language response. The word is, therefore, rather particular to the group and the stimulus. If the word is also relatively frequent, it is likely a popular means of expressing an idea that is central to the themes in the responses.
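As a rough illustration of the ratio described above, the following Python sketch divides each word's proportion in a set of responses by its proportion in a background corpus. It is not the Raven's Eye algorithm, whose details are not given here. The whitespace tokenization, the function name overrepresentation, and the small corpus_proportions table are all assumptions made for the example.

```python
from collections import Counter

def overrepresentation(responses, corpus_proportions):
    """Ratio of each word's proportion in the responses to its
    proportion in a background corpus (hypothetical lookup table)."""
    # Naive tokenization (assumption): lowercase and split on whitespace.
    words = [w for text in responses for w in text.lower().split()]
    counts = Counter(words)
    total = len(words)
    scores = {}
    for word, count in counts.items():
        background = corpus_proportions.get(word)
        if background:  # skip words missing from the background table
            scores[word] = (count / total) / background
    return scores

# Invented data, for illustration only.
responses = ["the trail was steep", "steep but beautiful trail"]
corpus_proportions = {"the": 0.05, "trail": 0.0125, "steep": 0.025, "was": 0.0125}

scores = overrepresentation(responses, corpus_proportions)
for word, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{word}: {score:.1f}")
```

In this toy data, "steep" makes up a quarter of the response words against a background proportion of 0.025, so its score is about 10, matching the example in the text.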

Estimating confidence in already acquired datasets. The Overrepresentation score also serves as an estimate of confidence in the ability to make general claims about the word's particularity to the stimulus (or survey question) and population. All else being equal, if the sample producing the Overrepresentation score of 10 from the previous example consisted of 500 responses, then for the word's proportion to fall back to its background rate (that is, for the result to reflect chance rather than a particular relation to the stimulus) you would need to acquire an additional 4,500 responses ((score - 1) x sample size: 9 x 500 = 4,500), none of which could mention that word even once. If the word appears in the most frequent 50% of your data and your sample is of sufficient size, it is highly improbable that such an Overrepresentation score is produced by chance alone. Instead, you can be confident that the word is predictably related to your stimulus and sample.
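The arithmetic in this paragraph can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes, as the "all else being equal" condition implies, that the additional responses are similar in length to those already collected.

```python
def silent_responses_to_erase(score, sample_size):
    """Additional responses, none mentioning the word, needed to dilute
    the word's proportion back to its background rate (a score of 1).
    Assumes the extra responses match the existing ones in length."""
    return (score - 1) * sample_size

extra = silent_responses_to_erase(score=10, sample_size=500)
print(extra)  # 4500
```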

Calculating sample size needs prior to acquiring data. A sample size can be calculated before collecting data. It depends on the size of the population to which you intend to generalize your results, the Overrepresentation score threshold that you decide is sufficient to warrant a word's inclusion in your results, and the amount of confidence you require in your results. Continuing the example, suppose that you have a population of 10,000 people, and want to be sure that words with Overrepresentation scores of 10 or greater in your sample can be confidently applied to it, such that even if no participant outside your sample ever mentions the word once*, the word still remains overrepresented at a proportion that is 1.5 times its typical background rate. Being 50% more likely to be expressed in response to your stimulus than in your selected language corpus generally, this word would still be somewhat particular to your stimulus. To have this degree of confidence in your results, you would need a sample of 1,500 cases (10,000 / 10 = 1,000; 1,000 x 1.5 = 1,500).

Those words with Overrepresentation scores higher than your selected threshold of 10 would of course also remain proportionally overrepresented, even if they were also never mentioned again in any of the subsequent responses*. In our example of a population of 10,000 people and an Overrepresentation threshold of 10:1.5 (provided no one else ever mentions the overrepresented words again), a word with an original Overrepresentation score of 20 would remain overrepresented at 3 times its rate in the background corpus. Similarly, a word with an Overrepresentation score of 100 would remain overrepresented at 15 times its background rate.
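The sample size and residual score arithmetic above can be checked with a short Python sketch. Both function names and the parameter name floor_score are hypothetical, and the sketch assumes that every response is of similar length and that no one outside the sample ever mentions the word.

```python
def required_sample_size(population, threshold_score, floor_score):
    """Cases needed so that a word scoring `threshold_score` in the sample
    still scores `floor_score` once diluted across the whole population."""
    return population / threshold_score * floor_score

def residual_score(score, sample_size, population):
    """Score remaining if no one outside the sample mentions the word."""
    return score * sample_size / population

n = required_sample_size(population=10_000, threshold_score=10, floor_score=1.5)
print(n)  # 1500.0

for score in (10, 20, 100):
    print(score, residual_score(score, sample_size=n, population=10_000))
```

With a sample of 1,500 cases, the loop reproduces the residual rates of 1.5, 3, and 15 times the background rate quoted in the text.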

Provided a discrete stimulus (such as an open-ended survey or interview question on one topic that takes a few paragraphs or less of text to answer), the algorithm produces Overrepresentation scores that often range from the 50s to the 1,000s for words related to the stimulus. For such stimuli, then, sample sizes of around 100 cases may produce results that can be confidently generalized to populations of 10,000 and more. For comparison, were the stimulus posed as a binary (yes/no) question and one wanted 95% confidence at a +/- 5% interval, 370 cases would be required to generalize results to a population of 10,000 people (and approximately 2,100 cases would be required for 99% confidence at a +/- 2.5% interval).
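The comparison figures for a binary question are consistent with the standard sample size formula for a proportion, adjusted with a finite population correction. The text does not name the formula used, so treating it as Cochran's formula is an assumption. The function name and the z-values of 1.96 and 2.576 (the usual values for 95% and 99% confidence) are likewise choices made for this sketch.

```python
import math

def binary_sample_size(population, z, margin, p=0.5):
    """Sample size for estimating a proportion, using Cochran's formula
    with a finite population correction. p=0.5 is the most conservative
    (largest sample) assumption about the true proportion."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2      # infinite-population size
    return math.ceil(n0 / (1 + (n0 - 1) / population))

print(binary_sample_size(10_000, z=1.96, margin=0.05))    # 370
print(binary_sample_size(10_000, z=2.576, margin=0.025))  # 2098
```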

*As noted previously, this would be highly unlikely if the word is in the most frequent 50% of your data.

Describing your attitude.

We encourage you to explicitly describe in writing the specific concepts, processes, and contexts involved during the acknowledgement and refinement of your attitude. Doing so not only helps others to understand the peoples, purposes, and procedures involved in your project, and therefore the scope of your results, but also serves as a source for later complementary investigation.