Developers: Understanding How Windows Open and Close

Understanding How Windows Open and Close

Author: Simon Keen
Contributor: Dr. John Lifter
StreamBase Systems
1-May-2007

Introduction 

When developing an EventFlow application, you use an aggregate operator to compute aggregations over moving windows of tuple values. The window is defined through entries you make on the Edit Dimension window, which is accessed through the Dimensions tab of the aggregate operator's Properties view. This article discusses how the entries you make in the Opening policy and Window size groups affect the behavior of the window.

Figure: Edit Dimensions Window

Behavior Derived From the Open and Close Settings 

The settings within the Opening policy group determine the conditions under which a new window is opened; for example, every tenth tuple or every minute. The settings within the Window size group determine when a window closes; for example, after ten tuples have entered the window or after one minute has elapsed. While it is possible to enter a closing (size) value without specifying an opening (advance) value, the reverse is not a valid configuration; specifying when a window opens only makes sense if you have also defined when a window closes. If you specify only a closing condition, then, by default, a new window opens after the preceding window closes. As you will learn in this article, this is not equivalent to entering the same value for both the Open per entry of the Opening policy group and the Close and emit after entry of the Window size group.

When both settings are specified, the opening and closing of each window are relative to a numbering space that is anchored to the window itself, whereas when only a close setting is specified, the numbering space is anchored to the first tuple in each window. Let's use some examples to understand what this means.

Time Based Windows

When both settings are specified, all windows will be anchored to the system time when the aggregate operator opens the first window. For example, if the first tuple enters the aggregate operator at 16:05:28.000, and the open setting is five seconds while the close setting is ten seconds, the aggregate operator will set the opening time of the first window to 16:05:25.000. Each subsequent window's opening and closing time will then be anchored to this opening time. Since the windows are configured to close after ten seconds, this first window will close at 16:05:35.000. Similarly, since the windows are configured to open at five second intervals, the second window will open at 16:05:30.000. While it is possible that some windows will not process any tuples, and therefore not emit any tuples, each window's opening and closing time will be a multiple of five or ten second intervals from the anchor time 16:05:25.000. Note that the anchor time is dependent on the open setting. In this example, if the open setting had been ten seconds, and the first tuple arrived at 16:05:28.000, the anchor time would have been set to 16:05:20.000.
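The anchoring arithmetic in this example can be sketched in Python. This is only an illustration of the numbers quoted above, not StreamBase code: the function names anchor_time and window_schedule are hypothetical, and rounding down relative to midnight is an assumption that happens to reproduce the article's 16:05:25.000 and 16:05:20.000 anchor times.

```python
from datetime import datetime, timedelta

def anchor_time(first_arrival, advance_s):
    """Round the first arrival down to a multiple of the open setting.

    Assumption: multiples are counted in seconds since midnight.
    """
    midnight = first_arrival.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = (first_arrival - midnight).total_seconds()
    return midnight + timedelta(seconds=(elapsed // advance_s) * advance_s)

def window_schedule(first_arrival, advance_s, size_s, count):
    """Return (open, close) times for the first `count` windows."""
    anchor = anchor_time(first_arrival, advance_s)
    schedule = []
    for i in range(count):
        opened = anchor + timedelta(seconds=i * advance_s)
        schedule.append((opened, opened + timedelta(seconds=size_s)))
    return schedule

first = datetime(2007, 5, 1, 16, 5, 28)
for opened, closed in window_schedule(first, advance_s=5, size_s=10, count=3):
    print(opened.time(), "->", closed.time())
# 16:05:25 -> 16:05:35
# 16:05:30 -> 16:05:40
# 16:05:35 -> 16:05:45
```

Changing advance_s to 10 moves the anchor to 16:05:20, matching the last sentence of the paragraph above.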

When only the close setting is specified, the aggregate operator will not open a window until the first tuple that will be contained in that window arrives. For example, when the very first tuple processed by the application arrives, perhaps at 16:05:20.000, the aggregate operator opens the first window, which will remain open for the period of time specified in the close setting. If the close setting is ten seconds, this window will close at 16:05:30.000. However, since there is no open setting, the aggregate operator does not know how frequently to open a new window. Consequently, it waits for the first tuple to arrive after the preceding window has closed before it opens a subsequent window. In this example, suppose that after 16:05:30.000 the next tuple does not arrive until 16:06:15.000. This tuple will close the current window and the opening time of the second window will be 16:06:15.000. Note that the second window does not open at a multiple of a ten second interval from the opening time of the first window. Each window's opening time is anchored to the arrival time of the first tuple in that window.
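A minimal Python sketch of this close-only behavior follows. The function close_only_windows is hypothetical and simply models the description above; treating a tuple that arrives exactly at the closing time as the start of a new window is an assumption. The arrival at 16:05:24 is added for illustration and does not appear in the example above.

```python
from datetime import datetime, timedelta

def close_only_windows(arrivals, size):
    """Group arrival times into windows anchored at each window's first tuple."""
    windows = []
    for t in sorted(arrivals):
        # Open a new window only when a tuple arrives after the last one closed.
        if not windows or t >= windows[-1]["close"]:
            windows.append({"open": t, "close": t + size, "tuples": []})
        windows[-1]["tuples"].append(t)
    return windows

arrivals = [
    datetime(2007, 5, 1, 16, 5, 20),
    datetime(2007, 5, 1, 16, 5, 24),  # illustrative extra tuple
    datetime(2007, 5, 1, 16, 6, 15),
]
for w in close_only_windows(arrivals, timedelta(seconds=10)):
    print(w["open"].time(), w["close"].time(), len(w["tuples"]))
# 16:05:20 16:05:30 2
# 16:06:15 16:06:25 1
```

The second window opens 55 seconds after the first, which is not a multiple of the ten second close setting.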

Field Based Windows

The behavior of a Field based window exactly parallels the behavior of a Time based window. If the field used to define the window is of type Timestamp, then the window will show the same behavior as the Time based window except that the field value, rather than the system time, is used to establish the anchor time. For example, if the field value is 16:09:12.000 and both the open and close settings are ten seconds, the aggregate operator will set the opening time of the first window to 16:09:10.000, which will become the anchor time. Subsequent opening and closing times will be multiples of ten seconds from this time.

When only the close setting is specified, the aggregate operator does not open a window until the first tuple contained in that window arrives, and the window's opening time is set to the value in that tuple's Timestamp field. The window will close when a tuple arrives containing a Timestamp field value larger than the opening time plus the close setting. Each subsequent window's opening time is set to the value of the Timestamp field in its first tuple.

When the field used to define the window is a numeric type, a similar behavior is observed. If both the open and close settings are specified, then the first window's opening value is zero and its closing value is equal to the close setting. Subsequent windows open when the value in the numeric field increases by a multiple of the open setting and close when the numeric field value increases by the close setting.
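As a rough illustration of the numeric case with both settings specified, the Python sketch below lists the windows that would contain a given field value when windows open at multiples of the open setting starting from zero. The function windows_containing is hypothetical, and treating each window as the half-open range from its opening value up to, but not including, its closing value is an assumption.

```python
def windows_containing(value, advance, size):
    """Return (open, close) pairs of windows [k*advance, k*advance + size) holding value.

    Assumes value >= 0 and that the first window opens at zero.
    """
    result = []
    k = int(value // advance)  # latest window opened at or before value
    while k >= 0 and k * advance + size > value:
        result.append((k * advance, k * advance + size))
        k -= 1
    return sorted(result)

print(windows_containing(12, advance=5, size=10))   # [(5, 15), (10, 20)]
print(windows_containing(12, advance=10, size=10))  # [(10, 20)]
```

With an open setting smaller than the close setting, a single value falls into more than one window; with equal settings it falls into exactly one.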

When only the close setting is specified, the opening value of each window is the value in the numeric field of the first tuple in the window and the close value is equal to the opening value plus the close setting.

Tuple Based Windows

With Tuple based windows, differences in behavior are difficult to discern. When you provide only a close setting, the behavior of the window is indistinguishable from the situation in which you provide the same entries for both the open and close settings. When you provide an open setting that is smaller than the close setting, you observe the anticipated overlapping of multiple windows.
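The tuple based case can be sketched in a few lines of Python. The function tuple_windows is hypothetical; it only emits windows that have filled completely, and it treats a missing open setting as equal to the close setting, which is the equivalence described above.

```python
def tuple_windows(items, size, advance=None):
    """Split items into count-based windows; a missing advance defaults to size."""
    if advance is None:
        advance = size
    return [items[start:start + size]
            for start in range(0, len(items) - size + 1, advance)]

tuples = list(range(1, 9))  # eight tuples numbered 1..8

print(tuple_windows(tuples, size=4))
# [[1, 2, 3, 4], [5, 6, 7, 8]]
print(tuple_windows(tuples, size=4, advance=4))
# [[1, 2, 3, 4], [5, 6, 7, 8]]
print(tuple_windows(tuples, size=4, advance=2))
# [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]]
```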

When to Exploit These Behavioral Differences 

If your application receives tuples at a rapid and/or consistent rate, there will be little difference whether your windows open at evenly spaced intervals or open with the arrival of a tuple. In both scenarios, each window will contain the same number, or nearly the same number, of tuples. However, if your application is receiving tuples at a slower and/or irregular rate, your aggregate expressions may produce more accurate results when window opening events are linked to the arrival of a tuple. For example, if a sensor periodically produces a burst of readings, opening the window as the first reading arrives will allow you to capture all the readings in a single window and avoid the situation in which the readings are spread across multiple windows.
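The sensor burst scenario can be illustrated with a small Python comparison. Both functions are hypothetical models rather than StreamBase behavior: grid_windows places readings into evenly spaced windows, while arrival_windows opens each window when its first reading arrives. The reading times are invented for the illustration.

```python
def grid_windows(times, size):
    """Bucket times into evenly spaced windows [k*size, (k+1)*size)."""
    buckets = {}
    for t in times:
        start = (t // size) * size
        buckets.setdefault(start, []).append(t)
    return [buckets[s] for s in sorted(buckets)]

def arrival_windows(times, size):
    """Open each window at the arrival of its first reading."""
    windows, close = [], None
    for t in sorted(times):
        if close is None or t >= close:
            windows.append([])
            close = t + size
        windows[-1].append(t)
    return windows

burst = [8, 9, 11, 12]  # seconds at which a sensor emits a burst of readings

print(grid_windows(burst, size=10))     # [[8, 9], [11, 12]]
print(arrival_windows(burst, size=10))  # [[8, 9, 11, 12]]
```

The evenly spaced windows split the burst at the ten second boundary, while the arrival-anchored window captures all four readings together.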

StreamSQL Applications 

When developing StreamBase applications using the StreamSQL text based paradigm, a window definition without a specific entry for ADVANCE is interpreted as if the value assigned to ADVANCE is the same as the value assigned to the SIZE entry. Therefore, the following two window specifications are equivalent.

  [SIZE # ...]
  [SIZE # ADVANCE # ...]

Consequently, the behaviors demonstrated by an EventFlow window are not available in an application developed using the StreamSQL text based approach.
