About Clipen

Clipen (Clickstream Processing Engine) delivers comprehensive, high-quality clickstream data, fully processed and sessionized in near real time. It does not rely on server log files, which can be inaccurate, and it does not require the website modifications that client-side collection methods make unavoidable.

Clipen is quick and easy to deploy within your operating environment, and it copes effectively with complex server infrastructures as well as with multiple websites.

Designed specifically for high performance and scalability, the engine lets you add hardware resources incrementally and cost-effectively. With Generic Content Tracking and customizable processing, Clipen is built to meet the clickstream-specific demands of web analytics and data warehousing projects.

Product Objectives

The volume of clickstream data and the rate at which it is generated in real time are the major technological problems in clickstream data processing. High-volume websites receive hundreds of millions of hits per day, all of which have to be processed quickly and efficiently. To handle such demands, a processing engine has to provide high performance and scalability, which are achieved through parallel processing techniques. Because the volume of data transferred over the Internet is constantly increasing, there is a need for technology that can meet these challenges over the long term.

Clipen does just that. It is a high-performance, scalable clickstream processing engine designed to produce comprehensive, high-quality clickstream data, especially for clickstream data warehousing and web analytics. It is a production technology for the infrastructure of web analytics systems and applications. The technology is completely non-invasive: deploying it requires no modifications to the tracked website.

Clipen obtains its input clickstream data by sniffing the TCP network traffic between the tracked website's HTTP server(s) and the client applications of the website's users. The sniffed data is then processed and stored in a relational database, where it is post-processed by the RDBMS and made available for extraction. The most important part of the processing is session identification, which relies on a session tracking mechanism. Correspondingly, the most important part of the post-processing is sessionization, which ties the identified session data together into sessions and aggregates session-level statistics and indicators.
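
To make the sessionization step more concrete, here is a minimal sketch in Python. It is not Clipen's implementation: Clipen performs this post-processing inside the RDBMS, while the sketch works on an in-memory list for illustration. It assumes each hit already carries a session identifier assigned during session identification. The names Hit, SessionSummary and sessionize, and the particular statistics computed, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Hit:
    """One processed HTTP request (hypothetical record layout)."""
    session_id: str   # assumed to be assigned by session identification
    timestamp: float  # seconds since the epoch
    url: str


@dataclass(frozen=True)
class SessionSummary:
    """Aggregated statistics for one session (illustrative choice of fields)."""
    session_id: str
    hit_count: int
    start: float
    end: float
    duration: float
    entry_url: str
    exit_url: str


def sessionize(hits: Iterable[Hit]) -> List[SessionSummary]:
    """Group hits by session identifier and aggregate per-session statistics."""
    by_session: Dict[str, List[Hit]] = defaultdict(list)
    for hit in hits:
        by_session[hit.session_id].append(hit)

    summaries = []
    for session_id, session_hits in by_session.items():
        # Order the hits in time so the first and last request are well defined.
        ordered = sorted(session_hits, key=lambda h: h.timestamp)
        first, last = ordered[0], ordered[-1]
        summaries.append(
            SessionSummary(
                session_id=session_id,
                hit_count=len(ordered),
                start=first.timestamp,
                end=last.timestamp,
                duration=last.timestamp - first.timestamp,
                entry_url=first.url,
                exit_url=last.url,
            )
        )
    # Return the sessions ordered by start time.
    return sorted(summaries, key=lambda s: s.start)
```

In a relational database, the same aggregation would typically be a grouped query over the hit table keyed on the session identifier. The source does not describe the schema Clipen uses, so the fields above are placeholders.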