HTTP Reader Input Adapter

Introduction

Note

The adapter uses the libpcap (Linux) and WinPcap (Windows) libraries to tap into the IP packet stream. Before using this adapter with realtime data, you must install and configure these libraries on the system that will host the HTTP Reader adapter. Locate libpcap at http://www.tcpdump.org/ and WinPcap at http://www.winpcap.org.

The HTTP Reader input adapter reads Internet Protocol (IP) packets, either from the IP stack of a running system or from an archived capture file, and emits a tuple for each HTTP message found in the TCP data stream. Dedicated tuple fields hold the request and response messages, the HTTP headers (Authorization, From, Referer, Content-Length, User-Agent, and so on), and the request and response bodies. When reading IP traffic on a running system, you can configure a filter string to limit the IP traffic being read.
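As an illustration of such a filter string, the following Python sketch assembles an expression in the standard libpcap filter syntax. The helper name build_capture_filter and the example address are hypothetical and are not part of the adapter. The sketch only shows what a typical filter string looks like.

```python
def build_capture_filter(port=80, host=None):
    """Return a libpcap-style filter expression (hypothetical helper)."""
    if not 0 < port < 65536:
        raise ValueError("port must be in the range 1..65535")
    # Restrict the capture to TCP traffic on the HTTP server port.
    parts = [f"tcp port {port}"]
    # Optionally restrict the capture to a single host address.
    if host:
        parts.append(f"host {host}")
    return " and ".join(parts)


print(build_capture_filter())
print(build_capture_filter(8080, "192.0.2.10"))
```

The first call yields a filter for all HTTP traffic on port 80. The second narrows the capture to one host on a non-default port.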

Note

This adapter is not provided in the StreamBase kit, but is made available in a separate installation kit. Please contact StreamBase Systems if you are interested in the kit.

Properties

Property: Description
Data Source: Select Realtime Data or Capture File to read HTTP traffic from the IP stack of a running system or from an archived capture file, respectively.
Network Interface IP Address: The IP address of the network interface from which to read HTTP traffic when Data Source is set to Realtime Data. Leave empty to use any available interface.
Capture File Name: The fully qualified name of the capture file holding the IP packets to process. The capture file must be in a libpcap- or WinPcap-compatible format. This property is ignored when Data Source is set to Realtime Data.
TCP Port Number: The well-known TCP port number of the HTTP server. The default is 80.
Capture Filter String: A libpcap- or WinPcap-compatible filter string that can be used to limit the IP traffic being read.
Schema (schema): The schema to output, which must include the following fields. The adapter creates these fields automatically. You can override the length of any string field by adding that field in Studio's Edit Schema tab:
  • ClientAddr (string): IP address of the client application

  • ClientPort (string): TCP port of the client application

  • ServerAddr (string): IP address of the server application

  • ServerPort (string): TCP port of the server application

  • Authorization (string): Contents of the Authorization HTTP header

  • From (string): Contents of the From HTTP header

  • IfModifiedSince (string): Contents of the If-Modified-Since HTTP header

  • Referer (string): Contents of the Referer HTTP header

  • RequestContentLength (string): Contents of the Content-Length HTTP request header

  • UserAgent (string): Contents of the User-Agent HTTP header

  • Accept (string): Contents of the Accept HTTP header

  • AcceptCharset (string): Contents of the Accept-Charset HTTP header

  • AcceptEncoding (string): Contents of the Accept-Encoding HTTP header

  • AcceptLanguage (string): Contents of the Accept-Language HTTP header

  • URI (string): Request URI from the HTTP request line

  • AcceptRanges (string): Contents of the Accept-Ranges HTTP header

  • Connection (string): Contents of the Connection HTTP header

  • Date (string): Contents of the Date HTTP header

  • Host (string): Contents of the Host HTTP header

  • Version (string): HTTP version from the HTTP request line

  • Method (string): Request method from the HTTP request line

  • Allow (string): Contents of the Allow HTTP header

  • ContentEncoding (string): Contents of the Content-Encoding HTTP header

  • ResponseContentLength (string): Contents of the Content-Length HTTP response header

  • ContentType (string): Contents of the Content-Type HTTP header

  • Expires (string): Contents of the Expires HTTP header

  • LastModified (string): Contents of the Last-Modified HTTP header

  • Location (string): Contents of the Location HTTP header

  • Pragma (string): Contents of the Pragma HTTP header

  • Server (string): Contents of the Server HTTP header

  • WWWAuthenticate (string): Contents of the WWW-Authenticate HTTP header

  • RetryAfter (string): Contents of the Retry-After HTTP header

  • ContentLanguage (string): Contents of the Content-Language HTTP header

  • Link (string): Contents of the Link HTTP header

  • MIMEVersion (string): Contents of the MIME-Version HTTP header

  • Title (string): Contents of the Title HTTP header

  • ResponseVersion (string): HTTP version from the response status line

  • StatusCode (string): Status code from the response status line

  • StatusReason (string): Reason phrase from the response status line

  • ResponseAcceptRanges (string): Contents of the Accept-Ranges HTTP response header

  • Age (string): Contents of the Age HTTP header

  • CacheControl (string): Contents of the Cache-Control HTTP header

  • ContentDisposition (string): Contents of the Content-Disposition HTTP header

  • ContentMD5 (string): Contents of the Content-MD5 HTTP header

  • TransferEncoding (string): Contents of the Transfer-Encoding HTTP header

  • RequestBody (string): Contents of the request message body

  • ResponseBody (string): Contents of the response message body

  • TruncatedFields (string): Comma-separated list of the names of the fields above that were truncated. For example, if the length of the ResponseBody field is set to 256 and an HTTP message with a longer response body is processed, the response body is truncated and this field includes the string ResponseBody. Note that this field is itself truncated if it is not large enough to hold the names of all fields that were truncated for a specific message.
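The truncation behavior described for TruncatedFields can be sketched in Python as follows. This is an illustrative model, not the adapter's implementation. The function name truncate_fields, its parameters, and the exact separator (a comma with no following space) are assumptions.

```python
def truncate_fields(values, max_lengths, truncated_max):
    """Cut each string to its configured length and record which were cut.

    values:        field name -> string value (hypothetical input)
    max_lengths:   field name -> maximum length configured in the schema
    truncated_max: maximum length of the TruncatedFields field itself
    """
    out = {}
    truncated = []
    for name, value in values.items():
        limit = max_lengths.get(name)
        if limit is not None and len(value) > limit:
            out[name] = value[:limit]
            truncated.append(name)
        else:
            out[name] = value
    # The list of names is itself limited in length, so it can be cut too.
    out["TruncatedFields"] = ",".join(truncated)[:truncated_max]
    return out


values = {"Host": "example.com", "ResponseBody": "x" * 300}
limits = {"Host": 64, "ResponseBody": 256}
result = truncate_fields(values, limits, truncated_max=64)
print(len(result["ResponseBody"]), result["TruncatedFields"])
```

With a ResponseBody limit of 256, the 300-character body is cut to 256 characters and TruncatedFields contains ResponseBody. A small truncated_max would cut the list of names as well.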

Typechecking and Error Handling

A typecheck error is thrown if a field entered in Studio's Edit Schema tab is not listed above, or has a type other than string or blob.

A tuple is emitted only when a complete HTTP request message and its response message are detected in the TCP stream. If one or more IP packets are missing and the TCP stream cannot be reconstructed, no tuple is emitted for the corresponding HTTP message.
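The emit-only-when-complete rule can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the functions parse_message and build_tuple are hypothetical, the messages are assumed to have well-formed start lines, and completeness is judged only by the blank line that ends the headers and by Content-Length. The adapter's actual TCP reassembly is not shown.

```python
def parse_message(raw):
    """Split a raw HTTP message into (start line, headers, body).

    Returns None when the message is incomplete: the blank line ending the
    headers is missing, or the body is shorter than Content-Length.
    """
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        return None
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    expected = int(headers.get("content-length", "0"))
    if len(body) < expected:
        return None
    return lines[0], headers, body[:expected]


def build_tuple(raw_request, raw_response):
    """Return a dict of output fields, or None if either message is incomplete."""
    request = parse_message(raw_request)
    response = parse_message(raw_response)
    if request is None or response is None:
        return None  # incomplete exchange: nothing is emitted
    method, uri, version = request[0].split(" ", 2)
    # The reason phrase may be absent, so pad the status line parts.
    resp_version, status, reason = (response[0].split(" ", 2) + [""])[:3]
    return {
        "Method": method,
        "URI": uri,
        "Version": version,
        "Host": request[1].get("host", ""),
        "UserAgent": request[1].get("user-agent", ""),
        "ResponseVersion": resp_version,
        "StatusCode": status,
        "StatusReason": reason,
        "ResponseContentLength": response[1].get("content-length", ""),
        "ResponseBody": response[2].decode("iso-8859-1"),
    }


req = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\nUser-Agent: demo\r\n\r\n"
resp = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
print(build_tuple(req, resp))
print(build_tuple(req, resp[:-2]))  # body cut short, so no tuple
```

The second call models a response whose final bytes were lost. Because the body is shorter than its Content-Length, no tuple is produced.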

Suspend/Resume Behavior

When suspended, the adapter stops processing IP packets. Note that doing so may cause the libpcap or WinPcap libraries to discard IP packets, resulting in lost tuples when the adapter is eventually resumed.
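The reason packets can be lost during suspension is that captured packets wait in a buffer of finite size. The following Python sketch is a toy model of that effect. The CaptureBuffer class is hypothetical and does not use or represent the libpcap or WinPcap API, and which packets a real capture library discards depends on the platform.

```python
from collections import deque


class CaptureBuffer:
    """Toy stand-in for a fixed-size capture buffer (not the libpcap API)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()
        self.dropped = 0

    def on_packet(self, packet):
        # When the buffer is full, this model discards the new packet.
        if len(self.packets) >= self.capacity:
            self.dropped += 1
        else:
            self.packets.append(packet)

    def drain(self):
        # Called by the reader when it resumes processing.
        drained = list(self.packets)
        self.packets.clear()
        return drained


buf = CaptureBuffer(capacity=3)
for n in range(5):  # five packets arrive while the reader is suspended
    buf.on_packet(n)
print(buf.drain(), buf.dropped)
```

In this model, two of the five packets never reach the reader. Any HTTP message that depended on them could not be reconstructed, so no tuple would be emitted for it.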
