EXTRACTOR Configuration Syntax

The Prognosis File Extractor function provides a means for data to be extracted from nominated files and specific command outputs into Prognosis records by way of user-defined extractor scripts. The EXTRACTOR Configuration is used to define the location of the script files and the data sources (i.e. log files, data files or command output). It also defines the collection interval and the optional limitation on the number of records to buffer in memory. Details about File Extractor can be found in the Application Development Interface chapters of the User Guides.

The syntax of the EXTRACTOR Configuration is detailed below:

SUBSYS EXTRACTOR
EXTRACT FILE    (<extractor-name>, <extractor-script>, <input-filename>, <collection-interval>[, <max-records>[, <prestart[=n]>]])
EXTRACT LOG     (<extractor-name>, <extractor-script>, <input-filename>, <collection-interval>[, <max-records>[, <prestart[=n]>]])
EXTRACT COMMAND (<extractor-name>, <extractor-script>, <command-to-run>, <collection-interval>[, <max-records>[, <command-output-log>][, <prestart[=n]>]])
PARAM-SP        (<param-separator>)

EXTRACT Type Definition

The following extraction types are available: FILE, LOG and COMMAND.

  • For FILE collection, each collection reads the whole file.
  • For LOG collection, only data added to the file since the last collection is read.
  • For COMMAND collection, the command (given as <command-to-run>, in the position that <input-filename> occupies for FILE and LOG) is run and its entire output is read.
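
The difference between FILE and LOG collection can be illustrated with a short Python sketch. This is not the Extractor's implementation; the names read_file_collection, LogCollection and demo are hypothetical, and starting the first LOG collection at offset zero is an assumption made for the illustration.

```python
import os
import tempfile


def read_file_collection(path):
    """FILE-style collection: read the whole file every time."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read().splitlines()


class LogCollection:
    """LOG-style collection: read only data appended since the last call."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # assumption: the first collection starts at the top

    def collect(self):
        with open(self.path, "rb") as handle:
            handle.seek(self.offset)
            data = handle.read()
            self.offset = handle.tell()  # remember where this collection ended
        return data.decode("utf-8").splitlines()


def demo():
    """Write two batches of lines and collect after each batch."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sample.log")
        log = LogCollection(path)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("a\nb\n")
        first = (read_file_collection(path), log.collect())
        with open(path, "a", encoding="utf-8") as handle:
            handle.write("c\n")
        second = (read_file_collection(path), log.collect())
    return first, second
```

After the second batch is written, the FILE-style read returns all three lines again, while the LOG-style read returns only the newly appended line.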

Syntax Elements

EXTRACT {FILE | LOG | COMMAND}

<extractor-name>

The extractor instance symbolic name of 1 to 24 characters that identifies the extractor process to use. This is the SERVER named in the DEFSSRV file. FSCOL is the standard predefined name.

Some users may wish to run a second instance of the extractor. To do this please contact Prognosis Technical Support.

<extractor-script>

The filename of the file extractor script to be processed by the extractor.

<input-filename>

The filename of the input file for the extractor script (i.e. the file to extract data from). 

<command-to-run>

The name of the command to be run together with any required arguments.

The following optional command line parameters can be added as needed:

%p

Substitutes the 'Subsystem Configuration' parameters, specified in the 'Data View Definition', into the command. If there is more than one parameter specified then the parameters will be separated by the | symbol. The separator symbol can be overridden using the PARAM-SP option.

The %p parameter applies only when the <collection-interval> parameter is set to '0' in the EXTRACTOR Configuration. With a setting of '0', the Requestor 'Collection Interval' and the Extractor script <collection-interval> are aligned, so the parameters can be passed from one to the other. When the intervals are not aligned, this is not possible.

Example:

If PARAM-SP was set to "," then the COMMAND "FUP INFO (%p )" may be expanded to "FUP INFO ($SYSTEM.PRGNOSIS.WVLOG, $DATA.PRGNOSIS.WVLOG )" when it is executed. 

%s

Substitutes the collection interval (in seconds) into the command. See the description for the <collection-interval> parameter for details on how the collection interval is determined.

Example:

If the collection interval is 10 seconds then the COMMAND "getfiles.bat %s" will be expanded to "getfiles.bat 10" as it is executed.

EXTRACT COMMAND (FSCOL, TEST.EXE, getfiles.bat %s, 0)

%u

Substitutes a unique incrementing number into the command on each request.

Example:

In this example, each execution of the command outputs the next number to the console: '1', then '2', and so on.

EXTRACT COMMAND (FSCOL, TEST.EXE, ECHO %u, 0)
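
The three substitutions can be modelled with a minimal Python sketch. It is an illustration only, not the Extractor's own code: make_expander and expand are hypothetical names, the counter is assumed to start at 1 (matching the ECHO example), and the parameters are joined with the separator alone, so any whitespace that the real product places around the separator is not reproduced.

```python
import itertools


def make_expander(separator="|"):
    """Return a function that expands %p, %s and %u in a command string."""
    counter = itertools.count(1)  # %u: assumed to start at 1

    def expand(command, params, interval_seconds):
        # %p: parameters joined with the separator ('|' unless overridden)
        result = command.replace("%p", separator.join(params))
        # %s: the collection interval in seconds
        result = result.replace("%s", str(interval_seconds))
        # %u: a new number is taken on every request
        result = result.replace("%u", str(next(counter)))
        return result

    return expand
```

For example, make_expander()("getfiles.bat %s", [], 10) yields "getfiles.bat 10", and passing "," as the separator models the effect of the PARAM-SP option.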

<collection-interval>

The collection interval for this file extractor instance, i.e. the file extractor script is run once every <collection-interval>. It is specified in seconds. A non-zero value must be between 1 and 1000000 seconds; a value of zero has a special meaning, as described in the list that follows.

  • When a non-zero value is specified, data from the Extractor is collected only on those intervals. For example, if the <collection-interval> is set to 30 seconds and the Display (or Database Collection, etc.) has a 'Collection Interval' of 10 seconds, then the Display will show the same data for three consecutive intervals.
  • If this is set to zero, then the Extractor script is executed using the same 'Collection Interval' specified in the Display (or Database Collection, etc.); however, there are some caveats that need to be considered:
    • The Display's 'Collection Interval' should not be less than 10 seconds, and the best practice is to use one of the standard values of 10, 30, 60 or 300 seconds.
    • To ensure consistent results, the 'Collection Interval' should be at least the time taken to execute the Extractor script plus 5 seconds. If the script takes a long time to run, it may be better to set the <collection-interval> to a longer interval that covers the processing time of the Extractor script.
    • Some non-Prognosis commands have their own queuing/aggregation mechanisms that run on a specific interval to present data. Setting the <collection-interval> to '0' has no effect on these and can reduce the reliability of the data. Again, it may be better to set the <collection-interval> to a longer interval that allows the external command to complete its data processing.
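
The interaction between the two intervals can be sketched in Python as follows. This is a simplified model rather than the product's scheduling logic: the function names are hypothetical, and the sketch assumes that the first requestor tick always triggers a run and that both timers are aligned at time zero.

```python
def effective_interval(configured_seconds, requestor_seconds):
    """Interval at which the script runs; zero means follow the requestor."""
    if configured_seconds == 0:
        return requestor_seconds
    return configured_seconds


def fresh_data_ticks(configured_seconds, requestor_seconds, ticks):
    """For each requestor tick, report whether new Extractor data is due."""
    interval = effective_interval(configured_seconds, requestor_seconds)
    fresh = []
    last_run = None
    for tick in range(ticks):
        now = tick * requestor_seconds
        if last_run is None or now - last_run >= interval:
            last_run = now
            fresh.append(True)   # script runs, new data is shown
        else:
            fresh.append(False)  # previous data is shown again
    return fresh


def minimum_recommended_interval(script_seconds):
    """Guideline from the text: script execution time plus 5 seconds."""
    return script_seconds + 5
```

With a 30 second <collection-interval> and a 10 second Display interval, fresh_data_ticks(30, 10, 6) marks only every third tick as fresh, which corresponds to the Display showing the same data for three consecutive intervals.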

<max-records>

This is the maximum number of records to be retained from a collection. If omitted, or specified as zero, then there is no limit on the number of records collected. If specified as nonzero, then records are stored in a circular buffer that can contain at most <max-records> records, and new records overwrite older records in the buffer when it wraps around. This is useful for log files, because a history of <max-records> log file records is retained and new records replace older ones. This option is typically specified for LOG extraction and not for the other types.
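
The circular buffer behavior can be illustrated with Python's collections.deque, which discards the oldest entry once its maxlen is reached. This is a sketch of the concept only; retain_records is a hypothetical name and does not reflect how the Extractor stores records internally.

```python
from collections import deque


def retain_records(records, max_records=0):
    """Keep at most max_records of the newest records; zero means no limit."""
    limit = max_records if max_records > 0 else None
    buffer = deque(maxlen=limit)
    for record in records:
        buffer.append(record)  # once full, the oldest record is dropped
    return list(buffer)
```

With five incoming records and a limit of three, only the three most recent records remain; with a limit of zero, all five are kept.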

<command-output-log>

The filename in which the command places its output for COMMAND collections. This is useful if the command outputs to a fixed filename rather than to the standard output. If this parameter is omitted, the output is assumed to come from the standard output.

<prestart[=n]>

This case-insensitive option specifies that the command has to be started not when the data delivery for the first interval is due but rather when a data requestor (e.g. online display or database collection) is started.

The optional <n> argument specifies the duration of the do-not-disturb period in seconds. It means that after the command startup, the Extractor will not be running the script (in order to avoid disturbing the command with prompts) for the duration of the do-not-disturb period.

Note that the Extractor will restart the command if it fails to produce any output for the duration of the timeframe specified (implicitly or explicitly) by the <timeout> argument of the SendPrompt script instruction. The do-not-disturb period applies not only to the initial command’s startup but also to each subsequent restart (if any).

A deadlock is likely to occur if the no-data timeout specified by the SendPrompt instruction is too short (so that the Extractor has little ‘patience’ with commands that fail to produce any output) while the do-not-disturb period specified in the PRESTART option is too long. Many commands do not produce any output until prompted, and prompting does not happen until the do-not-disturb period expires. The command therefore stays silent, which triggers its restart after the no-data timeout; the restart imposes a new do-not-disturb period, and the cycle repeats in an endless loop.
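
The condition described above can be expressed as a simple check. The following Python sketch is an interpretation of the text, not documented product behavior: restart_loop_likely is a hypothetical name, and the sketch assumes that the command stays silent until prompted and that both the no-data timeout and the do-not-disturb period are measured from command startup.

```python
def restart_loop_likely(do_not_disturb_seconds, no_data_timeout_seconds):
    """Return True when the restart fires before the first prompt can be sent.

    Assumption: a silent command is restarted once the no-data timeout
    expires, and no prompt is sent until the do-not-disturb period ends.
    Treating equal values as risky is a conservative choice.
    """
    return no_data_timeout_seconds <= do_not_disturb_seconds
```

Under these assumptions, a do-not-disturb period of 60 seconds combined with a no-data timeout of 30 seconds would be flagged, whereas a 10 second do-not-disturb period with the same timeout would not.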

PARAM-SP (<param-separator>)

<param-separator>

This optional field specifies a character that overrides the default '|' character used to separate multiple parameters when the %p option is used in COMMAND collections.