Using the Query Table Data Construct

Throughout this topic, remember that with StreamBase Developer Edition, you can only use memory-based Query Tables. You must be running Enterprise Edition to use disk-based Query Tables.

This topic explains how the Query Table data construct interacts with one or more Query operators, and describes the Query Table's options on its Properties View.

Introduction

Query Tables are containers for sharing data and maintaining state within a StreamBase application. They accomplish this by storing tuple values outside of any stream. At runtime, a Query Table can have its values modified by one or more Query operators that write to (populate), or read from, their associated Query Table.

Query Tables can be associated with multiple Query operators, but each Query operator can be associated only with one Query Table.

Query Tables can be declared to reside in-memory, or on-disk. For a memory-resident table, any data stored at runtime is not saved after the StreamBase Server is shut down. For disk-based Query Tables (as with external data sources), the data can persist indefinitely. For example, you could store data for a StreamBase application that runs only during a particular time each day, and requires data from the previous run during initialization. Another possible use of persisent tables is to maintain state information for a StreamBase application.

In the following figure, the selected Query Table has been declared to be in-memory. Also notice that the Query Table is connected to multiple Query Operators:

In the case of memory-resident Query Tables, data stored in the table at runtime is not saved after the StreamBase Server is shut down. In the case of disk-based Query Tables, the data stored in the tables may persist between StreamBase Server sessions.

Note

As shown in the sample screen above, the use of disk-based Query Tables require an extra license agreement with StreamBase Systems; however, it does not require a different license file. For details, please contact your StreamBase Systems representative.

To make disk-based Query Table data persist between StreamBase Server sessions, follow these steps:

  1. In the Table Settings tab of the Query Table's properties, select On disk as the Type.

  2. In the Table Settings tab of the Query Table's properties, ensure that Truncate on startup is false.

  3. Set a datadir value for the StreamBase application, using one of the following techniques:

    • On the sbd command line, use the --datadir option.

    • In the STREAMBASE_DATA environment variable's value.

    • In the StreamBase Server's *.sbconf configuration file used by the application, define a datadir parameter in the server section. For example:

      <server>
        <param name="datadir" value="/path/myValue"/>
      </server>
      

    If a data directory is not specified or if the special value +TEMP+ is used, a temporary directory is created on server startup, and deleted on shutdown.

StreamBase also provides options for transactional updates of disk-based Query Tables, adding new configuration parameters in the server section of an sbconf file. For example:

<server>
  <param name="datadir" value="C:\myapps\streambase\datadir"/>
  <param name="disk-querytable-transaction_mode" value="1"/>
  <param name="disk-querytable-flush-interval-ms" value="5000"/>    
</server>

For details about the configuration options, please see Server Configuration Options in the Administration Guide.

Also, if your application uses disk-based Query Tables, you may be able to improve performance by setting a value for the following parameter in the server section of the sbconf file configuration file:

<param name="disk-querytable-cache" value="n">

The value="n" attribute sets the amount of main memory that will be used to cache data from disk-based Query Table operations. The units are in MB of main memory and must be a power of 2. When unspecified, the default value is 1 MB. Use caution when setting this parameter, because too high a value may consume more memory than needed and could negatively impact other resources that require memory during the execution of the StreamBase application or other applications. As with any resource setting, you should establish baseline performance metrics and then test the effect of increasing or decreasing values under normal and peak loads.

The remainder of this topic describes the actions you can take on each tab of the Query Table's Properties View.

General Tab

Name: Every application component must have a unique name. Use this field to specify or change the component's name. The name must contain only alphabetic characters, numbers, and underscores, and no hyphens or other special characters. The first character must be alphabetic or an underscore.

Description: Optionally, enter a description to briefly describe the component's purpose and function. In the EventFlow canvas, you can see the description by pressing Ctrl while the component's tooltip is displayed.

Table Settings Tab

Use the Table Settings tab to specify where the Query Table resides (its type) and, if it resides on disk, how the table should behave when the StreamBase Server starts. Also use this tab to control the table's access level and use it in a module, as described in this section.

Type

One of the following:

in memory

Data in the Query Table will not available after server shutdown.

on disk

Data in a Query Table will be available for the application the next time the server starts, as long as you follow the instructions outlined in the introduction to this topic. (That section also contains important information about licensing agreement and configuration options related to the use of disk-based Query Tables.)

Truncate on startup

The Truncate On Startup value can be selected only when you have already selected a Type of "on disk" for the Query Table. If you then set the Truncate On Startup value to "Yes":

  • If the table already exists, its records are deleted (the table is initialized).

  • If the table does not exist, it is created.

Or if you then set Truncate On Startup to No:

  • If the table already exists, it is opened, and any records in it remain there.

  • If the table does not exist, it is created.

Access Level

Use the Access Level options to control the visibility of tables across application modules. That is, when this Query Table is part of an inner module that is referenced in another, outer module.

Private

Restricts the table's visibility to the module containing the table.

Shared

Makes a table accessible by a Query Operator in an outer module. When a Module Reference contains a shared table, its icon displays a grey data port on its top edge, similar to the data port on top of a Query Table icon. You can connect a query operator in the outer module to the shared table through this data port (as described in Using the Query Operator).

Placeholder

Marks a table as dependent on data from outside the module. Set this option when you want to substitute the table with another table in the outer module. When a Module Reference contains a placeholder table, its icon displays a grey data port on its bottom edge, similar to the data port of a Query Operator icon. You can connect a query table in the outer module to this data port (as described in Defining Module Reference Properties).

See Working with Shared Query Tables for more details.

Edit Schema Tab

Use this tab to define the Query Table's schema:

  1. Use the control at the top of the Edit Schema tab to choose the schema type:

    Private schema

    Populate the schema fields using one of these methods:

    • To define the fields manually, use the Add button at the top of the tab to add a row for each schema field. Enter values for the Field Name and Type, and if necessary, the Size. The Description is optional. For example:

      Field Name Type Size Description
      symbol string 10 Stock symbol
      quantity int 4 Number of shares

      The name must begin with an alphabetic character or an underscore, can contain only alphabetic characters, numbers, and underscores, and cannot contain hyphens or other special characters.

      The type must be one of the supported StreamBase Data Types. The editor fills in the size for data types with fixed sizes, but you must enter a valid size for strings and blobs, which are variable-length.

    • You can reuse an existing schema whose fields are appropriate for this component and your application. It can be a schema that you previously saved, one that you imported from your file system (as described in Importing and Exporting Resources), or one that exists in another application. To do so, click the Copy Schema button. If prompted to save the application at this point, click Yes to continue.

      In the Copy Schema dialog, identify the schema you want. See Copying Schemas for details.

      The schema fields are loaded in your Fields table.

      Tip

      Alternatively, you can drag a saved schema from the Saved Schemas view and drop it onto the schema table in the Edit Schema tab. If you already have entries in the table, StreamBase prompts for permission to replace them with the contents of the schema you are copying.

    In addition to the Add button, you can use the Remove, Move Up, and Move Down buttons to edit and order your schema fields. If you have copied a schema, remember that this is a local copy: Any changes you make here do not affect the original schema that you copied.

    Named schema

    If you have previously defined any named schemas for this application, you can choose one from the list. (If no named schemas exist, only the Private Schema option is available.)

    When you choose a named schema, its fields are loaded into the Fields table. The text is grey and cannot be edited here.

    Tip

    You can change a named schema by editing it in the Named Schemas tab of your application's EventFlow Editor.

  2. Optionally, document your schema in the Schema Description field.

Primary Index Tab

Query Tables must have a primary index to be valid. The Primary Index tab allows you to select which fields will be used to look up values in the Query Table:

In the Available Fields list, double-click each field that you want to add to the index. (Alternatively, use the arrow buttons.)

You can also choose how keys are indexed for table read operations by using the Index Type control:

Unordered, no ranges (hash):

Keys are unsorted, and they are evenly distributed (hashed) across the index. A hash index is used for accessing keys based on equality, and are generally best for doing simple lookups.

Ordered, with ranges (btree)

Keys are sorted. A btree index is used when output ordering and range queries are desired. Note that the sort depends on the order of the fields in the index keys.

The relative performance of hash and btree methods varies depending on many factors, including the distribution of keys in your dataset: we recommend trying both methods if you are in doubt which to use. Also consider that StreamBase Studio allows you to specify a key range and sort order using btrees, but not using hash access.

Secondary Indices Tab

The Secondary Indices tab allows you to create one or more secondary indices, whose fields are used to look up values in the Query Table. Query Tables do not need to have secondary indices, but if they do, no secondary index can be the same as the primary index.

For each secondary index:

  1. Click Add to display the Edit Secondary Index dialog.

  2. In the Available Fields list, double-click each field that you want to add to the index.

  3. Click the Add to Index List button.

Fields are indexed using the hash method by default. Just as with primary indexes, you can also use the btree method. To change the index type, click in the field's Index Type column and select either hash or btree from the drop-down list. You can intermix hash and btree indexes in the same Query Table. For example the primary index could be hash and a secondary index could be btree. Note that secondary indices make write and delete operations slower, because each secondary index in the query table must be updated with any changes to the Query Table.

Null Values

If the key field for a tuple record being written to a Query Table is null, the tuple is stored. In a Query Table with a sorted (btree) index, the null-keyed stored records are evaluated as less (in value) than other non-null records. On a subsequent read (lookup) operation, the null-keyed tuples can be located. For more information, see Using Nulls.