This node reads in a data file and lets you set parameters that control how the file is read.
- The File Name section lets you choose the input data file to read.
- The File Delimiter section lets you specify the field separator used in the input data file. You can choose Comma, Space, Tab or Other. With Other, you can specify a single character to use as the field separator.
NOTE: If Comma is specified, the input file must conform to Microsoft's CSV (Comma Separated Value) format, that is:
- If a data value contains a comma, that data value must be enclosed in quotes (e.g. $10,000 => "$10,000")
- If a data value contains a quote, that data value must be enclosed in quotes and each quote must be preceded by a quote (e.g. The "Apple" was green => "The ""Apple"" was green")
- The first data value on any given row must begin at the beginning of the row. That is, no space characters are allowed before the first character of the actual data value. If the data value requires quotes, there must not be any leading spaces before the opening quote.
- The above rule also applies to every subsequent data value: it must begin immediately after the comma delimiter that ends the previous data value.
- The same rules apply to the header, if the Has Header option is selected.
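The quoting rules above can be illustrated with a short Python sketch. It uses Python's standard csv module, not DataManager itself, and the helper names to_csv_line and from_csv_line are hypothetical. The last call shows how Python's reader treats a space before an opening quote; DataManager's handling of such input may differ.

```python
import csv
import io

def to_csv_line(values):
    """Format one row, quoting only the values that need it."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(values)
    return buffer.getvalue().rstrip("\n")

def from_csv_line(line):
    """Parse one CSV line back into a list of values."""
    return next(csv.reader([line]))

row = ["$10,000", 'The "Apple" was green', "plain"]
line = to_csv_line(row)
print(line)  # "$10,000","The ""Apple"" was green",plain

# Round trip: the parsed values match the originals.
print(from_csv_line(line) == row)  # True

# A space before the opening quote stops the quotes from being recognised,
# so Python's reader splits the value at the embedded comma.
print(from_csv_line('x, "a,b"'))  # ['x', ' "a', 'b"']
```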
- The File Options section lets you select how the input data file should be interpreted: whether the data file has a header, and whether the file is in UNIX format (otherwise it is assumed to be in DOS format). You can also choose to remove leading and trailing spaces from the data elements read in, and to remove quotes surrounding a data element; quotes are removed only if both the first and last characters of the data element are quotes. If both options are selected, Strip Space is performed before Strip Quotes. You can also specify a start and end row to select the rows you wish to read in from the file.
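The order of the Strip Space and Strip Quotes options matters, as the following minimal Python sketch suggests. It is not DataManager's code: the function name clean_value is hypothetical, and it assumes that "spaces" means the space character only and that "quotes" means double quotes.

```python
def clean_value(value, strip_space=True, strip_quotes=True, quote='"'):
    """Apply Strip Space first, then Strip Quotes, to one data element."""
    if strip_space:
        # Assumption: only space characters are removed, not tabs.
        value = value.strip(" ")
    # Quotes are removed only when BOTH the first and last characters are quotes.
    if strip_quotes and len(value) >= 2 and value[0] == quote and value[-1] == quote:
        value = value[1:-1]
    return value

print(repr(clean_value('  "abc"  ')))                     # 'abc'
print(repr(clean_value('  "abc"  ', strip_space=False)))  # '  "abc"  '
print(repr(clean_value('"abc')))                          # '"abc'
```

In the second call the quotes survive because, without Strip Space, the first and last characters of the element are spaces rather than quotes.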
- The Output Data Columns section lets you analyse the input data file and change the column types. Clicking the Analyse button reads at most 100 lines from the input data file and uses them to determine the column names (if a header is specified) and the data type of each column. If Has Header is not selected, a default header is assigned (C0, C1, C2, etc.). Each column type is either continuous (e.g. 123) or discrete (e.g. ABC). Once the file has been analysed, you can change a column's type by clicking on the column and then clicking the Change Type button.
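As a rough illustration of what an Analyse step might do, the Python sketch below samples at most 100 lines, assigns default names C0, C1, ... when there is no header, and guesses each column's type. The inference rule used here (a column is continuous if every sampled value parses as a number, otherwise discrete) is an assumption, as is counting the header line among the 100 sampled lines; DataManager's actual rules are not documented here. The names analyse and is_number are hypothetical.

```python
import csv
from itertools import islice

MAX_SAMPLE_LINES = 100  # Analyse reads at most 100 lines

def is_number(text):
    """Return True if the text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def analyse(lines, has_header, delimiter=","):
    """Return a list of (column name, column type) pairs from a sample of lines."""
    rows = list(csv.reader(islice(lines, MAX_SAMPLE_LINES), delimiter=delimiter))
    if not rows:
        return []
    if has_header:
        names, data = rows[0], rows[1:]
    else:
        # Default header when the file has none: C0, C1, C2, ...
        names, data = [f"C{i}" for i in range(len(rows[0]))], rows
    columns = []
    for index, name in enumerate(names):
        values = [row[index] for row in data if index < len(row)]
        # Assumed rule: all-numeric sample means continuous, anything else discrete.
        numeric = bool(values) and all(is_number(v) for v in values)
        columns.append((name, "continuous" if numeric else "discrete"))
    return columns

print(analyse(["123,ABC", "456,DEF"], has_header=False))
# [('C0', 'continuous'), ('C1', 'discrete')]
```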
- File Format: The input file can be in DOS or UNIX format. The file must be delimited with the specified delimiter, most likely a comma. Your data file may also contain missing values, which are indicated by the '?' symbol. If DataManager encounters the '?' symbol in any data value, that value is converted to a missing value. Discrete data values must therefore not contain the '?' symbol, otherwise they will be interpreted as missing values. You can also specify that your data has a header, which must be the first line in the data file. You must tell DataManager about the header, otherwise it will interpret that line as data values. If you do not have a header, DataManager will create one for you.
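The missing-value rule can be sketched in Python as follows. This is an illustration under stated assumptions, not DataManager's implementation: it reads the rule as "any value containing '?' is missing", represents a missing value as None, and splits on the delimiter without handling quoted values. The names to_missing and parse_line are hypothetical.

```python
MISSING_SYMBOL = "?"

def to_missing(value):
    """Return None (missing) if the value contains the '?' symbol."""
    return None if MISSING_SYMBOL in value else value

def parse_line(line, delimiter=","):
    """Split one line into values, accepting DOS (CRLF) or UNIX (LF) endings."""
    fields = line.rstrip("\r\n").split(delimiter)
    return [to_missing(field) for field in fields]

print(parse_line("5.1,?,red\r\n"))    # ['5.1', None, 'red']
print(parse_line("5.1,why?,red\n"))   # ['5.1', None, 'red']
```

The second call shows why discrete values must not contain '?': under this reading, the value why? is lost and treated as missing.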