Developers: StreamBase Application Programming


Frequently Asked Questions

StreamBase Application Programming

Note: Differences between the Developer and Enterprise Editions are listed here.

When the Query operator's Query Settings tab specifies a limit number of output rows value, does the operation always return n rows, or at most n rows?

A top n query returns at most n rows, and may return fewer. The Operation Settings tab does let you supply alternate values under the option, If the read fails because no row is found, substitute these values. However, those values are used in an output tuple only when the query returns no rows at all. They are not used to fill in missing rows when the query returns more than 0 but fewer than n rows. To put it another way, the limit parameter in a Query operator is an at-most directive, not a guarantee of how many rows you will get.

Your application may need to guarantee that queries with top n operations return exactly n rows, even if some or all contain null values. For example, suppose a top n query operation feeds its output into an aggregate that computes values over a window of size n. If you cannot be sure that reads always return n rows, the logic for deciding when the window starts and stops becomes much harder. The logic is easier if you always return n rows, using nulls in rows that had no data.

Here is a solution that relies upon the deterministic nature of StreamBase after Version 3.0.

The map creates the tuple that flushes the aggregate. This works better than using a Heartbeat operator, because it always flushes the aggregate immediately after each query.
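
The goal described above, always handing the aggregate exactly n rows, can be illustrated outside StreamBase. The following is a minimal Python sketch of the padding idea only. It is not StreamBase code and does not reproduce the flush technique mentioned above; the function name pad_to_n and the dictionary row layout are assumptions made for illustration.

```python
def pad_to_n(rows, n, fields):
    """Return exactly n rows: up to n query results, then all-null rows.

    rows   -- list of dicts, one per row returned by the (hypothetical) query
    n      -- the row limit configured for the query
    fields -- field names of the query's output schema
    """
    limited = rows[:n]  # the limit is an at-most directive
    null_row = {name: None for name in fields}
    # Add one all-null row for every row the query did not return.
    padding = [dict(null_row) for _ in range(n - len(limited))]
    return limited + padding


# One real row comes back, but the downstream window expects three.
results = [{"symbol": "IBM", "price": 81.5}]
padded = pad_to_n(results, 3, ["symbol", "price"])
```

With this approach the consumer can count on a fixed window size and treat all-null rows as "no data".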

Applicable To: StreamBase 3.5, 3.7
September 22, 2006


 

How do I set up more than one workspace directory for StreamBase Studio?

StreamBase Studio looks for projects in the folder named by the STREAMBASE_WORKSPACE environment variable. To use more than one workspace folder, set STREAMBASE_WORKSPACE to the path and folder you prefer. The next time you open StreamBase Studio, that folder will be used as the current workspace, and new projects will be created there.

Applicable To: StreamBase 3.5, 3.7
September 22, 2006


 

If I query for tuples on more than one index, in what order are the indexes processed?

The primary index is used first to filter records in a query. The secondary indexes are then applied in the order in which they are entered in the Query operator.

For example, to get all the tuples where symbol=shoe and the timestamp is the oldest in the table, the query table should list the symbol field as the first of the secondary indexes, and the timestamp field as the second. Tuple output would be limited to one, and the results would be ordered on the timestamp field.
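
The effect of that example query (filter on symbol, order on timestamp, limit output to one row) can be sketched in plain Python. This only models the result of the query, not how StreamBase evaluates indexes; the table contents and the function name oldest_for_symbol are made up for illustration.

```python
# Hypothetical table contents; field names follow the example above.
rows = [
    {"symbol": "shoe", "timestamp": 30, "qty": 5},
    {"symbol": "hat", "timestamp": 10, "qty": 2},
    {"symbol": "shoe", "timestamp": 20, "qty": 7},
]


def oldest_for_symbol(table, symbol):
    """Filter on symbol, order by timestamp, and keep at most one row."""
    matches = [r for r in table if r["symbol"] == symbol]
    matches.sort(key=lambda r: r["timestamp"])  # oldest first
    return matches[:1]  # limit of one row, so at most one result


oldest = oldest_for_symbol(rows, "shoe")
```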

Applicable To: StreamBase 3.5, 3.7
September 22, 2006


 

Is it possible to gain access to all the tuples in a window instead of just the first and last tuple?

Various StreamBase expressions (such as firstval, openval, lastval, and closeval) let you see the outer values in a window. To see all the tuples in a window you can use either of these two methods:

  • Split the stream with a Split operator, which duplicates the tuples and sends them onto two different streams. Downstream, one of the split outputs can do the aggregation and the other can process each tuple.
  • Use the emit feature of the Aggregate operator, which gives you an output on each input tuple.
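
The second method can be pictured with a small Python sketch of an aggregate that produces one output per input tuple instead of one output per closed window. This models the general idea only; the actual emit behavior of the Aggregate operator is governed by its own settings, and the tumbling window, output fields, and function name used here are assumptions.

```python
def aggregate_emit_each(values, window_size):
    """Emit a running count and sum for every input, per tumbling window."""
    outputs = []
    window = []
    for v in values:
        window.append(v)
        # One output per input tuple, so every tuple in the window is visible.
        outputs.append({"value": v, "count": len(window), "sum": sum(window)})
        if len(window) == window_size:
            window = []  # window closes; the next input opens a new one
    return outputs


emitted = aggregate_emit_each([1, 2, 3, 4], window_size=3)
```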

Applicable To: StreamBase 3.5, 3.7
September 22, 2006


 

Where is the monitor snapshot interval configured?

In the sbmonitor element within the sbd.sbconf file, there is an entry like this:

<streambase-configuration/sbmonitor/param name="period-ms"

For details, see the StreamBase Help topic, StreamBase Server Configuration.

Applicable To: StreamBase 3.5, 3.7
September 22, 2006


 

How do I detect when a Query operation finishes?

The figure below illustrates a method that can work with StreamBase 3.0 and higher:

 

In this solution, the top Map operator passes through tuples with an endFlag field set to false, while the bottom Map operator sets the endFlag field to true. Because the Split operator guarantees that the top branch flows to completion first, the query results are output first. When that flow ends, the endFlag=true branch runs, signaling the completion of the query run.

Note that in addition to setting the endFlag to true, the bottom branch's Map operator must also explicitly add fields to match the Query's output schema (filling values in with nulls is probably best), so that the Union doesn't complain about schema mismatches.
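
The shape of the resulting stream can be sketched in Python. This models only the output of the Union described above: the query results with endFlag set to false, followed by a single marker tuple whose query fields are null and whose endFlag is true. The function name and the dictionary tuple layout are assumptions for illustration.

```python
def query_with_end_marker(results, fields):
    """Return the query results followed by one end-of-results marker."""
    stream = []
    # Top branch: pass the query results through with endFlag=false.
    for row in results:
        stream.append({**row, "endFlag": False})
    # Bottom branch: same schema, null values, endFlag=true.
    marker = {name: None for name in fields}
    marker["endFlag"] = True
    stream.append(marker)
    return stream


stream = query_with_end_marker([{"symbol": "IBM"}], ["symbol"])
```

A consumer can then read tuples until it sees endFlag set to true, which also covers the case where the query returned no rows.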

For synchronous queries, you can't always guarantee there will be any input tuples before you want to see query results. Briefly, here are some common approaches:

  1. Send through two tuples: the real tuple, followed by a marker tuple. The marker tuple triggers a query that returns null table values (that is, a query on a key that doesn't exist).
  2. Use a metronome to continually produce end-of-result-set tuples, most of which probably don't correspond to real queries.

Option 1 is trickier to code, but tends to produce less load and less latency when you run the application.

Option 1 is also good for non-synchronous queries where you only want to mark results. If you prefer to avoid the second query, or if the semantics of the query do not accommodate certain operations you want (for example, Read All Rows), there are various ways to use a sequence number field to ensure ordering of the query result and marker tuples in the resulting streams. Of course, all such methods require more operators and can slow the application down.

Applicable To: StreamBase 3.5, 3.7
July 14, 2006


 

Should I use Lock and Unlock after StreamBase 3.0?

Before StreamBase 3.0, Lock and Unlock were sometimes used to protect data integrity, by processing only one tuple at a time from a particular sender or to a particular destination. This kept concurrent access to the data by different senders or destinations safe.

After StreamBase Version 3.0, you can still use Lock and Unlock in your applications, but in most cases, you do not need to. The new scheduling rules guarantee that the server processes a tuple to completion before processing any subsequent tuples from the same stream. Reasons to continue using Lock and Unlock include:

  • Multiple input streams, with data dependencies between them, are used to feed the same critical section.
  • Operators or modules are marked as potentially concurrent along a path.
  • A shared state exists between custom operators that might require locking.

Applicable To: StreamBase 3.5, 3.7
July 14, 2006


 

How do I efficiently read query tables?

Avoid scanning an entire query table when you only need to scan a single row. Reading a single row is usually more efficient.

Sometimes two reads are better than one. For example, two read operations, each reading one row, may be faster than one read operation reading both rows.

Applicable To: StreamBase 3.5, 3.7
May 23, 2006


 

Should I process upstream for the best application performance?

Yes. To ensure that your StreamBase applications run as quickly as possible, push processing upstream as much as possible. We recommend the principle, "filter on the left, process on the right", to guide your application development. For more information on application tuning, see the Tuning Guide in StreamBase Help.

Applicable To: StreamBase 3.5, 3.7
May 23, 2006


 

What's the best way to handle null values?

Here are some guidelines for using null values in StreamBase expressions.

  • To test for null, use isnull() and notnull().
    You cannot use relational operators to test for null, because the results are UNDEFINED.
  • Using the == operator on a null value results in a null, not true or false.
  • A null value for any predicate subexpression results in null being the value of the entire predicate expression.
  • If a field can be null, you must test that condition before comparing it with non-null values. For example:

        if isnull(i) then false else if i>10 then true else false

and not: 

        if i>10 then true else false

In the preceding expression, i>10 produces unpredictable results if i is null.
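
The guidelines above can be modeled in Python, using None to stand in for null. This is a sketch of the null-propagation idea, not of the StreamBase expression evaluator: the FAQ only says that relational comparisons on null are undefined, so returning None from sb_gt is a modeling assumption, and all function names here are hypothetical.

```python
def sb_eq(a, b):
    """Model of ==: a null operand yields null, not true or false."""
    if a is None or b is None:
        return None
    return a == b


def sb_gt(a, b):
    """Model of >: treated here as null when either operand is null."""
    if a is None or b is None:
        return None
    return a > b


def guarded_gt_10(i):
    """Mirrors: if isnull(i) then false else if i>10 then true else false"""
    if i is None:  # test for null first
        return False
    return True if sb_gt(i, 10) else False
```

The guarded form always yields true or false, whereas the unguarded comparison can yield neither when i is null.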

Applicable To: StreamBase 3.5, 3.7
May 23, 2006


 

How do I increase the tuple size for large incoming streams?

To increase the tuple size in your application for large incoming streams of data, set the maximum tuple size in the page-pool element of the StreamBase sbd.sbconf configuration file, as follows:

<param name="page-size" value="n"/>

The default maximum size is 4096 on Windows and 8192 on SPARC.

Applicable To: StreamBase 3.5, 3.7
May 23, 2006