Integrating Media Analytics & Reporting For Direct Response Campaigns
Diray Media is a leading DRTV Direct Response Advertising Agency focusing on Integrated Direct Response (IDR), combining TV and Digital to create an integrated, customized campaign that maximizes ROI for its customers.
Diray enables its customers to get the power of TV, online, mobile, and social media working together to tell more stories, activate more responses, and drive more sales. Their proprietary software platforms, MediaSync and MediaInsights, ensure all media channels are working in unison to achieve optimum results. This technology integrates media buys and syncs with additional analytics and reporting to provide the best data allowing them to hyper target specific audiences to ensure their customer’s ad dollar is used effectively to maximize online and retail sales.
Diray Media needed the ability to ingest, analyze, and report on large amounts of both new and updated data from numerous sources. Then they needed the ability to process the data into a Redshift Data Warehouse in a timely and efficient manner. The intraday ingestion of data contains not only new records sets, but also updates to previous records. Although Redshift is a perfect solution for large summarization queries over records spanning multiple years, updating the intraday record set in Redshift is neither efficient nor best practice.
Columnar Database Management Systems like Redshift are not optimal for multi-intraday data loads since many of those records, in Diray’s case, were updates. As updates occur, the system would become unreliable at the current resource node allocation level. While it was possible to add additional nodes to aid loading during those processing windows, the additional resources became underutilized when outside of those update windows.
In order to achieve an optimal architectural configuration centered on best practice, a better approach was needed for the loading and processing of data into Redshift. Additional architectural objectives include meeting the time window allocated for intraday data processing, mitigating and controlling costs, and enabling Diray to have the capability of reporting on the latest data.
The solution was designed to accommodate daily data feeds consisting of a sizable amount of new records, in addition to updates to existing records. We established an OLTP RDBMS (Postgres RDS) to handle intraday activity. The core analytical Data Warehouse remained on Redshift, and was loaded from the Postgres RDS during off peak hours. All supplemental tables, including control tables, were moved to the RDS, eliminating superfluous data in Redshift which left only the data warehouse and legacy reporting tables in the cluster.
Not sure how to begin your journey adopting AWS? Let WCI be your guide.
Our hybrid approach and utilization of the RDS service allows us to perform inserts and updates in an efficient and timely manner which enabled intraday reporting on the most current data elements. Once the data has been reconciled and validated, it is extracted and loaded into the Redshift cluster. Since the data being loaded into Redshift contains only new data records, the load process is efficient, as data is only appended to the current historical record set being maintained. Once the data has been loaded into Redshift, the inserted data is then deleted from the RDS instance.
Diray’s reporting needs require queries against both daily transactional data (RDS) as well as non-transactional analytical data (Redshift). To meet these needs, pgbouncer-rr (an open source PostgreSQL connection pooler) was incorporated. Addition of pgbouncer-rr provided two new significant features:
- Routing: it can intelligently send queries to different database servers from one client connection which allows it to partition or load balance across multiple servers/clusters.
- Rewrite: pgbouncer-rr provides the ability to intercept and programmatically change client queries before they are sent to the server. This enables us to use it to optimize or otherwise alter queries without modifying the source application itself.
These features were utilized to route queries either to the RDS instance or the Redshift cluster, depending on the subject matter and timeframe of the queries.
Reporting on all non-transactional daily loads and the historical information utilizes Redshift’s ability to run massive summation queries over large data sets, while the RDS instance is designated to handle the summarization queries on transactional data.
Results & Benefits
With the guidance and support of WCI Data Solutions, Diray Media was able to utilize several services inside the AWS platform to meet their business needs and better serve their customers, while reducing their overall spend. After implementing this solution, Diray was able to reduce the Redshift cluster by half without having to provision and spend unnecessarily on underutilized resources. Diray is still able to leverage Redshift’s query performance while providing near real-time reporting through the RDS. As a result, Diray Media has been able to enjoy substantial cost savings and a marked increase in system performance.