Purpose: Remove dimensions unnecessary for the research question from the textual structures
While the textual structures are intended to represent substantially all of the discussion in the text, only certain parts are likely to be relevant to a particular question (much as only certain accounting measures are likely to be relevant to a particular research question). An integral stage is therefore determining the desired dimensions and either ignoring or removing those superfluous to the research question.
For example, dimensions such as the original text or date of employment are unlikely to be useful for many research questions, and removing them may make the textual structures easier to view and work with. Other properties, such as PERSON_NAME and COMPANY_NAME, may not themselves provide the basis for comparison, but they may be necessary for meaningful comparisons (e.g., seeing recharacterizations at specific companies).
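As a minimal sketch of this step, assume each textual structure is held as a simple mapping from dimension name to value. The records, the helper `keep_dimensions`, and the dimension names `ORIGINAL_TEXT`, `EMPLOYMENT_DATE`, and `FUNCTION` are hypothetical and chosen only for illustration; only PERSON_NAME and COMPANY_NAME come from the discussion above.

```python
# Hypothetical records: each textual structure is modelled as a dict
# mapping a dimension name to its value. All values are illustrative.
records = [
    {
        "ORIGINAL_TEXT": "VP of Marketing",
        "EMPLOYMENT_DATE": "2001",
        "PERSON_NAME": "A. Smith",
        "COMPANY_NAME": "Acme",
        "FUNCTION": "MARKETING",
    },
    {
        "ORIGINAL_TEXT": "Director, Advertising",
        "EMPLOYMENT_DATE": "2003",
        "PERSON_NAME": "B. Jones",
        "COMPANY_NAME": "Acme",
        "FUNCTION": "ADVERTISING",
    },
]


def keep_dimensions(rows, wanted):
    """Return copies of the rows restricted to the wanted dimensions.

    Dimensions absent from a row are skipped rather than raising an error.
    """
    return [{name: row[name] for name in wanted if name in row} for row in rows]


# Keep the identifying properties plus the dimension under study,
# dropping the original text and the employment date.
trimmed = keep_dimensions(records, ["PERSON_NAME", "COMPANY_NAME", "FUNCTION"])
```

The identifying properties are retained here even though they are not themselves compared, so that later comparisons can still be grouped by person or company.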
Purpose: Adjust any classifications based on the nature of the research question
While the textual structures are intended to capture the dimensions of the text as noted earlier, at the very specific property level concepts can be ‘fuzzy’ (e.g., Murphy, 2002), and differences between concepts such as MARKETING and ADVERTISING may or may not be desired. Thus, despite continuing effort to ensure a basis for classifications and to validate the levels used, it is anticipated that a degree of recoding may sometimes be required. The textual structures facilitate recoding of dimensions, with the ability to combine labels or further split concepts as desired. Because the textual structures provide the basis for a majority of dimensions, this task is much simplified; it is far easier to recode a limited number of functional labels (such as designating that MARKETING and ADVERTISING should be combined) than to manually code the tens of thousands of permutations in the raw titles.
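Combining labels in this way can be sketched as a small lookup table applied to the existing functional labels. This is an illustration under stated assumptions, not the procedure used here: the mapping `RECODE`, the helper `recode_labels`, and the sample labels (including `FINANCE`) are hypothetical.

```python
# Hypothetical recode map: treat ADVERTISING as part of MARKETING.
# Labels that do not appear in the map are left unchanged.
RECODE = {"ADVERTISING": "MARKETING"}


def recode_labels(labels, mapping):
    """Return a new list with each label replaced by its recoded value."""
    return [mapping.get(label, label) for label in labels]


# Illustrative functional labels taken from already-coded structures.
labels = ["MARKETING", "ADVERTISING", "FINANCE", "ADVERTISING"]
recoded = recode_labels(labels, RECODE)
```

A single entry in the map recodes every occurrence of the label, which is why working at the level of functional labels is far less effort than recoding raw titles one by one. Splitting a concept, by contrast, cannot be done from the label alone and would need some further information to decide which sub-label applies.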