This repository has been archived by the owner on Jul 22, 2024. It is now read-only.
Mark Sturdevant edited this page Apr 13, 2018 · 7 revisions

Short Name

Ingest and analyze event data streams for timely insights

Short Description

Visualize statistics about taxi rides while the event data is streamed from an external program

Offering Type

Analytics

Introduction

This code pattern is for anyone who wants to analyze event data while it is streaming. It uses a Jupyter notebook, Spark SQL, and matplotlib to show taxi trip statistics as the events arrive. A Java program streams the data into IBM Db2 Event Store, which is optimized for event-driven data processing and analytics.

Author

By Jacques Roy and Mark Sturdevant

Code

Demo

  • n/a

Video

Overview

In this code pattern, a Java program runs as a daemon and submits events to IBM Db2 Event Store. A Jupyter notebook shows how to interact with the event store using Python. An animated matplotlib chart visualizes the changing data while the events are streaming. Taxi trip data serves as the event stream. The average trip duration for each start time is continuously updated, and the chart also shows the trip count to help visualize the growing database of taxi trips.
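As a rough illustration of the statistics described above, the following sketch keeps a running trip count and an average trip duration per start time. It is plain Python, not the notebook's Spark SQL code. Grouping by pickup hour, the `TripStats` class, and the sample trips are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


class TripStats:
    """Running trip count and average duration per pickup hour (hypothetical helper)."""

    def __init__(self):
        self.total_seconds = defaultdict(float)  # pickup hour -> summed duration
        self.trips = defaultdict(int)            # pickup hour -> number of trips

    def add(self, pickup, dropoff):
        """Fold one trip event into the running totals."""
        hour = pickup.hour
        self.total_seconds[hour] += (dropoff - pickup).total_seconds()
        self.trips[hour] += 1

    def average_minutes(self):
        """Return {pickup hour: average trip duration in minutes}."""
        return {h: self.total_seconds[h] / self.trips[h] / 60 for h in sorted(self.trips)}

    def trip_count(self):
        """Return the total number of trips seen so far."""
        return sum(self.trips.values())


stats = TripStats()
fmt = "%Y-%m-%d %H:%M:%S"
sample_events = [  # made-up trips, not taken from the pattern's CSV file
    ("2018-01-01 08:05:00", "2018-01-01 08:15:00"),
    ("2018-01-01 08:30:00", "2018-01-01 08:50:00"),
    ("2018-01-01 17:00:00", "2018-01-01 17:30:00"),
]
for start, end in sample_events:
    stats.add(datetime.strptime(start, fmt), datetime.strptime(end, fmt))
    # Each new event changes the numbers a live chart would redraw.
    print(stats.trip_count(), stats.average_minutes())
```

Keeping only sums and counts means each new event updates the averages without rescanning earlier trips. In the pattern itself, the notebook gets comparable numbers by querying the event store.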

We chose taxi data from a CSV file so that you can run this code pattern without signing up for an external data feed. The code pattern is nonetheless designed to demonstrate event-driven data processing and analytics that can scale to massive amounts of data, and it can be modified to work with your own event stream. Our data has timestamps, which makes it easy to see simple statistics on all the data, including the latest events. With your own events, you can use the notebook to experiment with charts and show how your events are trending with up-to-the-minute statistics.
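To make the idea of replaying a CSV file as an event stream concrete, here is a minimal Python sketch. The pattern's actual producer is a Java program, and this is not a port of it. The column names, the batch size, the delay, and the `stream_events` helper are hypothetical.

```python
import csv
import io
import time

# Made-up sample data; the column names are assumptions, not the pattern's schema.
SAMPLE_CSV = """pickup_time,dropoff_time,passenger_count
2018-01-01 08:05:00,2018-01-01 08:15:00,1
2018-01-01 08:30:00,2018-01-01 08:50:00,2
2018-01-01 17:00:00,2018-01-01 17:30:00,1
"""


def stream_events(csv_file, batch_size=2, delay_seconds=0.0):
    """Yield CSV rows as small batches of events, pausing between batches."""
    batch = []
    for row in csv.DictReader(csv_file):
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
            time.sleep(delay_seconds)  # simulate events arriving over time
    if batch:
        yield batch  # emit any leftover rows as a final, smaller batch


batches = list(stream_events(io.StringIO(SAMPLE_CSV)))
for batch in batches:
    print(len(batch), "events, first pickup:", batch[0]["pickup_time"])
```

To experiment with your own events, one option is to replace the CSV reader with a generator that yields dictionaries from your feed and leave the consuming code unchanged.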

After completing this code pattern, you will understand how to:

  • Install IBM Db2 Event Store developer edition
  • Interact with Db2 Event Store using Python and a Jupyter notebook
  • Use a Java program to insert into IBM Db2 Event Store
  • Query the database while inserts are in progress
  • Show live updates with an animated chart

Flow

(Architecture diagram)

  1. The user runs a Jupyter notebook in DSX Local.
  2. The notebook connects to Db2 Event Store to analyze the live event stream.
  3. An external Java program sends live events.
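The flow above, with queries running while inserts are still arriving, can be sketched with a stand-in database. This example uses Python's built-in `sqlite3` module only so that it runs anywhere. It does not use the Db2 Event Store API, and the `trips` table, its columns, and the sample batches are assumptions.

```python
import sqlite3

# sqlite3 is only a stand-in here; it is NOT the Db2 Event Store API.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (pickup_hour INTEGER, duration_min REAL)")

incoming = [  # made-up batches standing in for the external event producer
    [(8, 10.0), (8, 20.0)],
    [(17, 30.0)],
    [(8, 15.0), (17, 20.0)],
]

# Aggregate query of the kind a notebook might rerun to refresh a chart.
QUERY = """
    SELECT pickup_hour, COUNT(*) AS trips, AVG(duration_min) AS avg_min
    FROM trips
    GROUP BY pickup_hour
    ORDER BY pickup_hour
"""

snapshots = []
for batch in incoming:
    # Step 3 of the flow: new events are inserted.
    conn.executemany("INSERT INTO trips VALUES (?, ?)", batch)
    conn.commit()
    # Step 2 of the flow: the same query is rerun against the growing table.
    snapshot = conn.execute(QUERY).fetchall()
    snapshots.append(snapshot)
    print(snapshot)
```

Each snapshot reflects every event inserted so far, so the trip counts grow and the averages shift between queries. In the pattern, the inserts and the queries come from separate programs instead of one loop.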

Included components

  • IBM Db2 Event Store: In-memory database optimized for event-driven data processing and analysis.
  • IBM Data Science Experience: Analyze data using RStudio, Jupyter, and Python in a configured, collaborative environment that includes IBM value-adds, such as managed Spark.
  • Jupyter Notebook: An open source web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text.
  • Python: A programming language that lets you work more quickly and integrate your systems more effectively.
  • Java: A secure, object-oriented programming language for creating applications.

Featured technologies

  • Databases: Repository for storing and managing collections of data.
  • Analytics: Analytics delivers the value of data for the enterprise.
  • Data Science: Systems and scientific methods to analyze structured and unstructured data in order to extract knowledge and insights.

Blog

Links