The CHRR Database Investigator
11.0 Extracting Variables
We are now going to briefly demonstrate how to extract variables using the CHRR Database Investigator software. A more detailed explanation of the software's capabilities can be found in the software manual, CHRR Database Investigator User's Manual. This manual is on your CD; link to it under the subdirectory "Documents" in the Contents window. Extracting variables prepares the raw data from the NLS for use in one of the statistical packages, defines the output file type, writes documentation for the extracted set, writes the codebook, etc. To extract variables, you must have 'tagged' variables in a tagset. Variables are always extracted in ascending reference number order, so when an extracted data set is read into SAS, SPSS, Stata, etc., R00001 will always be first (if selected), followed by the next highest reference number, etc.
11.1 Tagging Variables
Let's extract the variables on 'age'. We have already investigated the Age variables and know that they are part of the KEYVARS area of interest. We could go ahead and extract them at this time. But remember, you should probably always include the 'case identification' variable in all extracts, because it may help you resolve inconsistencies in a particular case. By including the case identification code, you can go to any 'outlying' case, look at the particular datum for an anomaly, and decide whether to keep the case in the extract or to throw it out because of the conflicting data. We have already investigated the 'CaseID' variable (R0000100 IDENTIFICATION CODE 79 INT) and know that it is part of the COMMON area of interest. You should probably also include two other COMMON variables, the 'race' (R0214700 R'S RACIAL/ETHNIC COHORT FROM SCREENER 79 INT) and 'sex' (R0214800 SEX OF R 79 INT) of the respondent.
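The ordering rule from 11.0, together with the advice to always carry the case identification, race, and sex variables, can be sketched in a few lines of Python. This is only an illustration and not part of the Database Investigator software: the function name is made up, and it assumes every reference number is one letter followed by seven digits, so that plain string sorting gives ascending reference number order.

```python
# Reference numbers quoted in the text above (case ID, race, sex).
ALWAYS_INCLUDE = ["R0000100", "R0214700", "R0214800"]


def extraction_order(tagged):
    """Return tagged reference numbers in ascending order, without duplicates.

    Assumption: every reference number is one letter followed by seven
    digits, so string sorting matches ascending reference-number order.
    """
    return sorted(set(tagged) | set(ALWAYS_INCLUDE))


# The order in which variables were tagged does not matter;
# the result is always ascending.
order = extraction_order(["R0214800", "R0000100"])
print(order)
```

Whatever order the variables are tagged in, the case identification variable ends up first, which is what a program reading the extracted file should expect.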
Now we need to select the 'Age' variables from the KEYVARS area of interest. (An alternate approach to finding all the variables on 'Age' would be to open the Any Word in Context index [Contents window, double-click on 'Any Word in Context'] and look for the word 'age' in it. If found, open the group [double-click on 'age'] in the Variables window, sort the variables by description [left-click on the Description heading] and then select the appropriate sub-group of variables from there.) But we know we want "AGE OF R AT INTERVIEW DATE ..." and that sub-group of variables is in KEYVARS, which will make our job a little easier.
You now want to select the 'block' of variables that begin with the word 'Age...'. To select a block of variables and mark them for extraction,
When the block is highlighted as in Figure 12 below,
After choosing 'Tag Selected', the variables in the block should all have a checkmark by them and your screen should look like Figure 12 above (without the tagging pop-up menu).
11.2 Reviewing Tagged Variables
Now you will want to verify that the variables you have selected are all together in a 'tag set'. To do this you will need to 'review' the tagged variables.
When you choose Review Tagged Variables, the screen shown in Figure 13 below should appear. From this screen you can do a number of things: review your selections, un-check individual variables if you'd like, accept the tagset and save it as an autonomous group, or 'Extract Tagged Variables'. If you choose the last of these, 'Extract Tagged Variables' from the 'Extract' menu (at the top of the screen), without first saving the tagset, you will automatically be asked to save the tagset and give it a name. You must name the file before extraction because, during the extraction, the software automatically creates related files (documentation, statistical package files, codebook, etc.) and builds their names by attaching the appropriate extension to the tagset's file name.
When saving a tagset the standard Windows dialogue box will appear and you will have to give the file a name. If you name this file 'age', you need only type in the word 'age' without an extension because the software will automatically add the extension '.ythpub'. All output files derived from this tagset will use this file name and attach the appropriate extension. For sample extensions, see 11.4 Extract Selections and Output Files below.
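The naming convention described above can be sketched as follows. This is a hypothetical illustration, not the software's own code: the function names are invented, and the output extension used in the example (".ext1") is a placeholder, since the real extensions appear only in the Extract Reports.

```python
from pathlib import Path

TAGSET_EXT = ".ythpub"  # extension named in the text


def tagset_file(name, directory="."):
    """Path of the saved tagset; the extension is appended only if missing."""
    path = Path(directory) / name
    if path.suffix != TAGSET_EXT:
        path = path.with_name(path.name + TAGSET_EXT)
    return path


def derived_file(tagset, extension):
    """Path of an output file sharing the tagset's base name and directory.

    `extension` is supplied by the caller; the value used below is a
    placeholder, not one of the software's documented extensions.
    """
    return tagset.with_suffix(extension)


tagset = tagset_file("age")
print(tagset.name)                        # age.ythpub
print(derived_file(tagset, ".ext1").name)  # age.ext1 (hypothetical extension)
```

Because every output file shares the tagset's base name, choosing a distinct name for each tagset keeps one extract from overwriting another in the same directory.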
11.3 Running an Extract
If, after review, you wish to continue and extract the variables, you may proceed in two ways: 1. You may save the tagset, then run the extract; or 2. You may run the extract and be required to name and save the tagset. To perform the extract, you must choose this command from the Extract drop-down menu.
When you choose 'Extract Tagged Variables...', the dialogue box shown in Figure 14 will appear. From this dialogue box you may make a series of selections concerning the output files: you may limit the sample (universe) by using Boolean logic (left half of the dialogue box), choose the extract data file type, and/or write out the codebook for the selected variables in the tagset. The "Write Codebook" button writes out the codebook for the tagset without running the extract. The "Extract Codebook File" check box, when checked, causes the software to write out the codebook for the tagset during execution of the extract. Figure 14 below displays the default selections that will appear each time you choose 'Extract Tagged Variables...' from the Extract menu. If you were to run the extract with the defaults, you would get the Extract Report shown in Figure 14 on the right.
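Limiting the universe with Boolean logic amounts to keeping only the cases for which a condition is true. The Python sketch below illustrates the idea outside the software. The records and the coded values are invented for illustration and are not taken from the NLS codebook; only the three reference numbers come from the text.

```python
# Hypothetical records keyed by reference number; the coded values are
# invented for illustration and are not taken from the NLS codebook.
records = [
    {"R0000100": 1, "R0214700": 3, "R0214800": 2},
    {"R0000100": 2, "R0214700": 1, "R0214800": 1},
    {"R0000100": 3, "R0214700": 3, "R0214800": 1},
]


def in_universe(rec):
    """Boolean condition: race code 3 AND sex code 1 (hypothetical codes)."""
    return rec["R0214700"] == 3 and rec["R0214800"] == 1


# Keep only the cases that satisfy the condition.
universe = [rec for rec in records if in_universe(rec)]
print([rec["R0000100"] for rec in universe])
```

Only cases that satisfy every part of an AND condition remain in the extract, so a restrictive condition can shrink the sample considerably.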
Shown below are several combinations of selections made in the Extract dialogue box and the resulting Extract Reports. Note which extract data file types produce which output files.
11.4 Extract Selections and Output Files
Formatted ASCII
If you make the following selections in the Extract dialogue box (figure on the left) before you run the extract, the output and final Extract Report will look like the figure on the right. The extracted files will be placed in the same directory as the original tagged variables (*.ythpub) file. You may open any of the files in a text editor (notepad.exe, wordpad.exe, etc.) to view the file format.
Delimited ASCII
If you make the following selections in the Extract dialogue box (figure on the left) before you run the extract, the output and final Extract Report will look like the figure on the right. The extracted files will be placed in the same directory as the original tagged variables (*.ythpub) file. You may open any of the files in a text editor (notepad.exe, wordpad.exe, etc.) to view the file format.
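A delimited extract can also be read by a short script, which makes it easy to use the case identification code to look up an 'outlying' case, as suggested in 11.1. The Python sketch below is an illustration under stated assumptions: it assumes a comma delimiter and a header row of reference numbers, neither of which is confirmed here, and it reads invented sample values from memory rather than a real extract file.

```python
import csv
import io

# Stand-in for the contents of a delimited extract. Assumptions: comma
# delimiter, a header row of reference numbers, and invented values.
sample = io.StringIO(
    "R0000100,R0214700,R0214800\n"
    "1,3,2\n"
    "2,1,1\n"
)

rows = list(csv.DictReader(sample))


def find_case(rows, case_id):
    """Return the row whose case identification code matches, else None."""
    for row in rows:
        if row["R0000100"] == str(case_id):
            return row
    return None


print(find_case(rows, 2))
```

To work with a real extract, replace the in-memory sample with the opened data file and adjust the delimiter and header handling to match what a text editor shows.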
DBASE3
If you make the following selections in the Extract dialogue box (figure on the left) before you run the extract, the output and final Extract Report will look like the figure on the right. The extracted files will be placed in the same directory as the original tagged variables (*.ythpub) file. You may open any of the files in a text editor (notepad.exe, wordpad.exe, etc.) to view the file format.
Stata Dictionary
If you make the following selections in the Extract dialogue box (figure on the left) before you run the extract, the output and final Extract Report will look like the figure on the right. The extracted files will be placed in the same directory as the original tagged variables (*.ythpub) file. You may open any of the files in a text editor (notepad.exe, wordpad.exe, etc.) to view the file format.