Using the Map Operator

This topic explains how the Map operator works and the actions you can take on its Properties view.

Introduction

The Map operator accepts a single input stream and creates a single output stream with an arbitrary number of output tuple fields, based on the evaluation of expressions. The core idea is one-to-one expression execution: the expressions are evaluated once for each input tuple to produce one output tuple. A Map operator is often used to add, modify, or drop fields in a data stream.

In the single output stream produced by Map, there can be one or more output fields, named anything, with any computable value. The set of output fields may include all, some, or none of the input fields, as well as any desired additional fields. The Map operator is not order sensitive.

For example, an input stream might contain an item's price in U.S. dollars (USD). A Map operator could be used to convert the prices to Euros (EUR) by applying the current conversion rate. The output stream could contain both the USD input price and the EUR equivalent (by passing the input field and adding the output field), or it could just contain the EUR price.
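
As a rough illustration of the currency example, the following Python sketch models tuples as dictionaries and applies one expression per output field. The field names (`item`, `price_usd`, `price_eur`), the function names, and the rate of 0.9 are illustrative assumptions, not part of StreamBase.

```python
# Hypothetical conversion rate, chosen only for illustration.
USD_TO_EUR = 0.9

def map_keep_both(tup):
    """Pass the USD price through and add the EUR equivalent."""
    return {**tup, "price_eur": tup["price_usd"] * USD_TO_EUR}

def map_eur_only(tup):
    """Emit only the converted price, dropping the USD field."""
    return {"item": tup["item"], "price_eur": tup["price_usd"] * USD_TO_EUR}

input_stream = [
    {"item": "widget", "price_usd": 10.0},
    {"item": "gadget", "price_usd": 20.0},
]

# One output tuple is produced for each input tuple.
both = [map_keep_both(t) for t in input_stream]
eur_only = [map_eur_only(t) for t in input_stream]
```

In both variants the number of output tuples equals the number of input tuples; only the set of fields differs.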

"Map" is a common name for this kind of operation in programming languages: applying an expression to each element of a sequence to create a new sequence. The concept and name originate with Lisp and exist in many languages, including Perl and Python. In languages that lack a built-in operation by this name (for example, C), the concept generally has no direct built-in counterpart.

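For readers who know the programming-language sense of the term, here is a minimal Python example of the built-in `map` function. It only illustrates the general concept; the list `prices` and the doubling expression are arbitrary choices.

```python
prices = [1, 2, 3]

# map applies the function to each element and yields a new sequence.
doubled = list(map(lambda p: p * 2, prices))

# The original sequence is left unchanged.
print(doubled)  # prints [2, 4, 6]
```
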
The following sections describe the actions you can take on each tab of the Map operator's Properties view.

General Tab

Name: Every application component must have a unique name. Use this field to specify or change the component's name. The name must contain only alphabetic characters, numbers, and underscores, and no hyphens or other special characters. The first character must be alphabetic or an underscore.
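
The character rules for names can be approximated with a regular expression. This is a sketch only: it assumes "alphabetic" means ASCII letters, the function name `is_valid_component_name` is hypothetical, and it does not check that the name is unique within the application.

```python
import re

# Assumes "alphabetic" means ASCII letters; this is an illustrative check,
# not the validation that StreamBase itself performs.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_valid_component_name(name):
    """Return True if name starts with a letter or underscore and
    contains only letters, digits, and underscores."""
    return NAME_PATTERN.fullmatch(name) is not None
```

Under these assumptions, `Map_1` and `_convert` pass the check, while `1Map` and `price-map` do not.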

Enable Error Output Port: Check this box to add an Error Port to this component. In the EventFlow canvas, the Error Port shows as a red output port, always the last port for the component. See Using Error Ports and Error Streams to learn about Error Ports.

Description: Optionally, enter a description to briefly describe the component's purpose and function. In the EventFlow canvas, you can see the description by pressing Ctrl while the component's tooltip is displayed.

Output Settings Tab

The Output Settings tab allows you to specify the field names and values this Map operator will output.

The value for each output field is computed by its corresponding expression.

Specify output fields using one of the two Output options:

  • Choose the All Input Fields with Changes option to preload existing fields in the input stream. By default, properties are set to pass all input fields through to the output stream; you only have to specify the changes that you want to make.

  • Choose Explicitly Specified Fields to add the output fields manually.

All Input Fields with Changes

The All Input Fields with Changes option lets you specify how the output should differ from the current input stream. You add one or more fields from the current input stream to the Output Fields table. You then specify the changes you want to make by applying one of the following actions to each field:

  • Add inserts the input field in the output stream.

  • Replace changes the input field and includes it in the output stream.

  • Remove omits the input field from the output stream.
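
The effect of the three actions can be pictured with a small Python sketch that models a tuple as a dictionary. It assumes that Add introduces a field whose value comes from its expression, and that every expression is evaluated against the input tuple; the function `apply_changes`, the field names, and the 0.9 rate are hypothetical.

```python
def apply_changes(tup, changes):
    """Start from all input fields, then apply each (action, name, expr) row.

    expr is a function of the input tuple; it is ignored for "Remove".
    """
    out = dict(tup)  # default: every input field passes through
    for action, name, expr in changes:
        if action == "Remove":
            out.pop(name, None)
        elif action in ("Add", "Replace"):
            # Assumption: expressions see the input tuple, not the partial output.
            out[name] = expr(tup)
        else:
            raise ValueError(f"unknown action: {action}")
    return out

changes = [
    ("Add", "price_eur", lambda t: t["price_usd"] * 0.9),
    ("Replace", "item", lambda t: t["item"].upper()),
    ("Remove", "internal_id", None),
]

result = apply_changes(
    {"item": "widget", "price_usd": 10.0, "internal_id": 7}, changes
)
```

Here `price_usd` reaches the output untouched because no row mentions it, which mirrors the pass-through default of this option.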

To complete the output settings:

  1. Add the fields you want to the Output Fields table by using one of the following methods:

    • Use the Add and Remove buttons. In each new row, the Add action is preset; you must specify the Field Name and Expression.

    • Click the Fill All Rows button to automatically fill the Output Fields table with all of the fields from the input stream. Alternatively, use the arrow to the right of the button to specify individual fields.

      The following screen shows an Output Fields table that has been populated using the Fill All Rows button. The third row has been selected, showing the drop-down Action control (we will discuss the Action control later):

  2. In the Output Fields table, edit the row for each field.

    • If you used the Add button:

      1. Accept the default Add action if you want an existing input field to be included in the output. Otherwise, click the Action control and choose Remove or Replace.

      2. In the Field Name column, enter the name of the input field that you want to add.

      3. If the action is Add or Replace, enter an expression that resolves to the value you want the mapped output field to have.

    • If you used the Fill All Rows button to load all input fields:

      1. Accept the default Replace action if you want to pass the field through to the output stream. Otherwise, click the Action control to change the action.

      2. The Field Name is automatically loaded with the input stream field name. This field is editable, so you can change the input field being replaced.

      3. The Expression is automatically loaded with the input field name. This setting passes the input field through unchanged. Edit the expression to change the value of the output field.

    • If you used the arrow to the right of the Fill All Rows button to specify an individual field:

      1. Accept the default Add action to add a new output field, or click the Action control to change it.

      2. In the Field Name column, enter the name of the input field that you want to add.

      3. In the Expression column, enter an expression that resolves to the value you want the mapped output field to have.

Explicitly Specified Fields

With the explicit Output type selected, you define each output field, instead of indicating differences between the inputs and outputs.

  1. Click the Add button to add a row for each field that you want in the output stream.

    Alternatively, click the Pass All button to load the table with some or all of the fields from the Map operator's input stream.

  2. For each row, specify the Output Field Name and an expression that resolves to the value you want.

    The following screen shows an Output Fields table that has been populated using the explicit option.
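
As a sketch of the explicit option, the Python fragment below builds each output tuple only from the rows that are listed, so any input field without a row is absent from the output. The helper `map_explicit`, the field names, and the 0.9 rate are assumptions made for illustration.

```python
def map_explicit(tup, output_fields):
    """Build the output tuple from the listed (name, expression) rows only."""
    return {name: expr(tup) for name, expr in output_fields}

output_fields = [
    ("item", lambda t: t["item"]),                   # passed through unchanged
    ("price_eur", lambda t: t["price_usd"] * 0.9),   # computed; 0.9 is a made-up rate
]

explicit_result = map_explicit(
    {"item": "widget", "price_usd": 10.0, "internal_id": 7}, output_fields
)
```

In this example `price_usd` and `internal_id` do not appear in `explicit_result`, because no row defines them.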

Dynamic Variables Tab

The Dynamic Variables tab allows you to define variables for this operator that can then be used in one of its expressions. A dynamic variable can be updated by any input stream or output stream in your application. For more information, see Using Dynamic Variables.

Concurrency Tab

Run this component in a separate thread

This option causes the server to process the component's requests concurrently with other processing in the application. The processing of the resulting threads can then be distributed automatically across multiple processors on an SMP machine.

If this is a compute-intensive component and you know that it can run without data dependencies on other components in the StreamBase application, you may be able to improve performance by enabling this option.

Caution

These features are not suitable for every application. For details, see Execution Order, Concurrency, and Parallelism, which includes important guidelines for using these features.

Run in parallel threads

If you checked the first option, you can also choose this option, which causes the server to run multiple instances of this component, each in its own thread. At run time, tuples are dispatched to particular instances based on the Key Expression value (which must evaluate to an int).
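
One way to picture key-based dispatch is the Python sketch below. It assumes a simple modulo over the integer key, which may not be the rule the server actually uses; `NUM_INSTANCES`, `choose_instance`, and the `customer_id` field are hypothetical.

```python
NUM_INSTANCES = 4  # hypothetical number of parallel instances

def choose_instance(tup, key_expression, num_instances=NUM_INSTANCES):
    """Pick an instance index from the integer key of a tuple."""
    key = key_expression(tup)
    if not isinstance(key, int):
        raise TypeError("key expression must evaluate to an int")
    # Assumption: a simple modulo; the server's real dispatch rule may differ.
    return key % num_instances

tuples = [{"customer_id": i, "amount": 10 * i} for i in range(8)]
assignments = [choose_instance(t, lambda t: t["customer_id"]) for t in tuples]
```

In this sketch, tuples with the same key value always land on the same instance, which is the property that matters when an expression depends on per-key state.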

Null Values

  • In an operation that performs sorting, any tuple with a null value in the ordering field or in a Boolean expression will be ignored.

  • If the evaluation of a predicate results in a NullValueException error, the tuple will be dropped.

  • If this component contains a Group Options tab, any null value in a Group By expression will be grouped.

    For more information, see Using Nulls.