INSTRUCTIONS FOR RUNNING NewNexisFormat.pl

DOWNLOADING FROM NEXIS

Run the appropriate search in LexisNexis, and then use the "Download
Documents" option (the icon that quaintly looks like a disk -- remember
those, before we had flash drives?). Use "Select Items" in the "Document
Range" box to download a maximum of 500 stories at a time. In addition,
the program assumes the following options have been selected for
downloading:

    Format:         text
    Document View:  Full document
    Font:           Courier

No additional options should be checked. You can increase efficiency by
eliminating sports stories and news summaries before downloading: the
program skips these anyway. Note that the program is currently set up
to filter only Agence France Presse stories.

RUNNING THE FILTER

1. Put all of the Nexis files you intend to filter and the following
   two programs in a folder:

       NewNexisFormat.pl
       nexisreverse.pl

2. In the Terminal (command line), move to that folder.

3. Assuming your Nexis downloads have file names of the form

       Agence_France_Presse_-_English2007-09-14_16-31.TXT

   enter the command

       ls Agence_Fr* > format.files

   Alternatively, just use the command

       ls > format.files

   and manually edit out every entry that is not a downloaded file.

4. Enter the command

       perl NewNexisFormat.pl <prefix>

   where <prefix> is the prefix for the formatted file. For example

       perl NewNexisFormat.pl AFPLVT

5. The program should run, with the dates and headlines of the various
   stories scrolling past as they are processed. If the program stops
   working -- crashes or stops responding -- the last story displayed
   (or the one following it) is probably the cause, so just eliminate
   that story and try running the program again.

6. When the program has finished, enter the command

       ls <prefix>* > filelist

   where <prefix> is the prefix you entered earlier. For example

       ls AFPLVT* > filelist

7. Enter the command

       perl nexisreverse.pl

8. The resulting TABARI input is in the file

       reverse.output

   which can be renamed at this point. A summary of the number of
   entries in this file can be found in

       filelist.summary

Note that, as currently configured, nexisreverse.pl gets only the first
sentence of each story.

Programmer: Philip A. Schrodt (schrodt@ku.edu)
Last Update: 22 March 2008
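
OPTIONAL: BUILDING format.files WITHOUT MANUAL EDITING

Step 3 can be sketched as a small shell function that lists only regular
files whose names start with the download prefix, so nothing has to be
edited out by hand afterwards. This is an illustration only, not part of
the original distribution: the function name build_file_list and its two
optional arguments are hypothetical, and it assumes the downloads are
named Agence_Fr* as in step 3.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with NewNexisFormat.pl.
# Writes the names of the Nexis downloads in the current folder to a
# list file, one name per line, mirroring "ls Agence_Fr* > format.files".
build_file_list() {
    prefix=${1:-Agence_Fr}       # assumed leading part of the download names
    out=${2:-format.files}       # list file name used in step 3
    : > "$out"                   # start with an empty list
    for f in "$prefix"*; do
        [ -f "$f" ] || continue  # skip directories and an unmatched pattern
        printf '%s\n' "$f" >> "$out"
    done
    count=$(wc -l < "$out" | tr -d ' ')
    echo "$count file(s) listed in $out"
}
```

Run build_file_list from the folder that holds the downloads. The count it
prints should equal the number of files you downloaded from Nexis; if it
does not, check the file names before going on to step 4.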