The Map to Leaf Fields option appears in two places in StreamBase Studio:
- In the Data File Options dialog called from the Feed Simulation editor, when specifying options for a CSV data file used as input for a feed simulation.
- In the StreamBase Test editor, when specifying options for a CSV data file used as a data validation file for a unit test.
Use the Map to Leaf Fields option to specify that the fields of a flat CSV file are to be mapped to the subfields of tuple fields, not to the tuple fields themselves. This feature lets Studio read flat CSV files generated manually or generated by other applications, and apply them to schemas that have nested schemas.
Do not enable this option when reading a CSV file whose fields already contain subfields, quoted and delimited according to the CSV standard.
Do not enable this option when reading hierarchical CSV files generated by StreamBase Studio, or by a StreamBase adapter such as the CSV File Writer Output adapter. For example, suppose StreamBase generates a CSV file to capture data sent to an output stream whose schema includes nested schemas. In this case, the generated CSV file already reflects the nested schema structure, and needs no further processing to be recognized as such.
However, CSV files generated by third-party applications, such as Microsoft Excel, generally have a flat structure, with each field following the next, each field separated by a comma, tab, space, or other character. Despite the flat structure, if the fields of a CSV file are ordered correctly, you can use the Map to Leaf Fields option to feed or validate a stream with nested schemas.
The examples in this section will clarify this feature.
Let's say we have an input stream that has a two-field schema, where both fields are of type tuple. Field T is a tuple with three int fields, n1, n2, and n3. Field W is a tuple with two double fields, d1 and d2.
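As a rough illustration of what "leaf fields" means for this schema, the sketch below models the two tuple fields in Python and lists their subfields in schema order. The dictionary layout and the `leaf_fields` helper are hypothetical and are not part of StreamBase; they only mirror the schema described above.

```python
# Hypothetical model of the example schema (not a StreamBase API):
# each top-level tuple field maps to its ordered (subfield, type) pairs.
SCHEMA = {
    "T": [("n1", int), ("n2", int), ("n3", int)],
    "W": [("d1", float), ("d2", float)],
}


def leaf_fields(schema):
    """Return dotted leaf-field names in schema order."""
    return [f"{field}.{sub}" for field, subs in schema.items() for sub, _ in subs]


print(leaf_fields(SCHEMA))
# ['T.n1', 'T.n2', 'T.n3', 'W.d1', 'W.d2']
```

A flat CSV file intended for this schema would therefore need five columns per row, in that order.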
There are two ways to create a CSV file that contains fields that correctly map to this schema:
- Create a CSV file that contains the expected hierarchy, separated and quoted according to CSV standards.
- Create a flat CSV file that contains the correct number of fields in the right order, then tell StreamBase to interpret this file by mapping to the subfields of the two tuples.
The following example shows a hierarchical CSV file that can be used with the schema shown above. In this file, each line maps to two fields, and each field contains subfields. To use a CSV file like this example as a feed simulation data file or a unit test validation file in Studio, do not enable the Map to Leaf Fields option.
'1,1,1','2.1,2.2'
'3,3,3','4.2,4.4'
'5,5,5','6.3,6.6'
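The difference between the two interpretations can be sketched in Python. This is only an approximation of the idea, not StreamBase's parser: the `SCHEMA` dictionary and both helper functions are hypothetical, the quote character is taken from the example above, and the flat row `1,1,1,2.1,2.2` is an assumed flat equivalent of the first hierarchical row.

```python
import csv

# Hypothetical model of the example schema (not a StreamBase API).
SCHEMA = {
    "T": [("n1", int), ("n2", int), ("n3", int)],
    "W": [("d1", float), ("d2", float)],
}


def parse_hierarchical(line, schema, quotechar="'"):
    """One quoted CSV column per tuple field; subfields are comma-separated inside it."""
    outer = next(csv.reader([line], quotechar=quotechar))
    if len(outer) != len(schema):
        raise ValueError(f"expected {len(schema)} columns, got {len(outer)}")
    result = {}
    for (field, subs), cell in zip(schema.items(), outer):
        inner = cell.split(",")
        if len(inner) != len(subs):
            raise ValueError(f"field {field}: expected {len(subs)} subfields")
        result[field] = {name: typ(v) for (name, typ), v in zip(subs, inner)}
    return result


def parse_flat(line, schema):
    """One CSV column per leaf field, consumed in schema order."""
    values = next(csv.reader([line]))
    leaf_count = sum(len(subs) for subs in schema.values())
    if len(values) != leaf_count:
        raise ValueError(f"expected {leaf_count} columns, got {len(values)}")
    it = iter(values)
    return {
        field: {name: typ(next(it)) for name, typ in subs}
        for field, subs in schema.items()
    }


nested = parse_hierarchical("'1,1,1','2.1,2.2'", SCHEMA)
flat = parse_flat("1,1,1,2.1,2.2", SCHEMA)
print(nested == flat)  # True
```

Both calls yield the same nested value, which is the behavior the Map to Leaf Fields option is meant to provide for flat files.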
When specifying a CSV file to use with a feed simulation, the Data File Options dialog shows you graphically how the CSV file will be interpreted. However, StreamBase does not fully validate the CSV file against the input port's schema until the application is run.
When using the Map to Leaf Fields option for a feed simulation, use the following steps to make sure your CSV file is validated as expected:
- Run the application.
- Start the feed simulation that uses the CSV file.
- Examine the resulting tuples in the Application Input view.
For the example CSV files in the previous section, the following Application Input view shows that the tuples fed to the input stream were interpreted as expected, and all subfields were filled with data:
[Figure: Application Input view with all subfields of each tuple populated.]
The same results are obtained in both cases:
- Using the hierarchical CSV file with the Map to Leaf Fields option disabled.
- Using the flat CSV file with the Map to Leaf Fields option enabled.
If you see several fields interpreted as null, this indicates that the Map to Leaf Fields option is enabled for the wrong file type, or that the fields in the CSV file do not line up field-for-field with the schema of the input port you are feeding.
The following shows an example of an incorrect result. In this case, only the first subfield of each tuple received input.
[Figure: Application Input view in which only the first subfield of each tuple is populated.]
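A simple column count outside Studio can hint at which setting a given file needs before you run the application. The sketch below is a hypothetical helper, not a StreamBase tool: it assumes a comma delimiter and the quote character used in the example above, assumes the schema is nested so that the two counts differ, and only counts columns rather than reproducing StreamBase's validation.

```python
import csv


def column_counts(lines, quotechar="'"):
    """Return the set of distinct column counts across non-empty lines."""
    rows = csv.reader((ln for ln in lines if ln.strip()), quotechar=quotechar)
    return {len(row) for row in rows}


def suggest_map_to_leaf(lines, top_level_count, leaf_count, quotechar="'"):
    """Guess the option setting from column counts.

    Returns True (enable), False (disable), or None when the rows match
    neither the leaf-field count nor the top-level field count.
    """
    counts = column_counts(lines, quotechar)
    if counts == {leaf_count}:
        return True
    if counts == {top_level_count}:
        return False
    return None


# Example schema from this section: 2 top-level tuple fields, 5 leaf fields.
flat_lines = ["1,1,1,2.1,2.2", "3,3,3,4.2,4.4"]
hier_lines = ["'1,1,1','2.1,2.2'", "'3,3,3','4.2,4.4'"]
print(suggest_map_to_leaf(flat_lines, 2, 5))  # True
print(suggest_map_to_leaf(hier_lines, 2, 5))  # False
```

A result of `None` suggests that the file does not line up field-for-field with the schema under either setting.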
