May 04 2020
Engineering

Find Out What Kinds of Challenges We Tackle Every Day

VideoAmp Engineering faces interesting and complex problems across a number of diverse fronts. We deal with incredible scale, insane data complexity, diverse methodologies, hybrid cloud and data center infrastructure, while designing services and products for a fast shifting and highly competitive industry. 

How do we tackle such an immense set of tasks? By dividing and conquering across many teams: infrastructure engineering, front-end engineering, data engineering, systems engineering, high-frequency engineering and data science. Read on for a deeper dive into our engineering disciplines and how each solves for the big challenges they face every day.

Infrastructure Engineering Team: Building Newer and Better Products Faster

As the Infrastructure Engineering Team, our primary focus is to enable engineers to create powerful, simple and efficient tools and processes. To accomplish this goal, we treat operations, infrastructure, security, reliability and CICD as software problems and provide automated solutions. Our culture consists of engineers across the development and operations spectrum who work closely to break down silos between teams. We do this by building self-service tools and incorporating the right technologies and standards to get the job done. We move fast, break things, learn, improve, and then we do it all over again.

We move fast, break things, learn, improve and then we do it all over again. — Hector Sahagun, Engineering Manager

WHAT CUSTOMER CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

Shipping quality software fast. To accomplish this, we need to enable our engineers to build solutions at an ever-improving level of quality and efficiency. That means delivering pipelines, infrastructure, security and monitoring that promote ease of adoption, provide quick feedback loops and work reliably, whether they’re POCs or hardened production systems.

We solve these problems by providing solutions as code. Infrastructure, security and CICD pipelines are all managed as software solutions using a lot of the same tools our developers use. Building solutions this way allows us to quickly iterate to improve and maintain the tools and infrastructure our developers depend on.

WHAT TECHNICAL CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

One of the bigger challenges we face is the sheer number of technologies and services we manage and support. The sprawl can sometimes be difficult to manage. To combat this, the team is constantly learning, cross training, and automating. We favor solutions that require low to no operational support and if we need to do something manually more than once, we automate. We celebrate our failures as much as our successes, taking away key learnings each time that help us build better solutions in the future.  

Front-End Engineering Team: Building Beautiful, Scalable User Interfaces

We provide the infrastructure necessary to build beautiful, scalable, and performant user interfaces that deliver information to our customers in optimal and intuitive ways, so they can plan and execute quickly with confidence. We do this by supporting the platform teams with our centralized applications: authorization and authentication, organization administration, UI component library (PreAmp), and other various packages that support the front-end infrastructure.

WHAT CUSTOMER CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

Our customers face daunting applications with poorly designed user interfaces that provide a one-size-fits-all approach, which oftentimes will not work for everyone. We work very closely with our UX design and UX research teams to establish paradigms tailored for various user groups. Prototyping helps us achieve that. The intent of creating a prototype is user testing, which tells us how usable and valuable our product is to the end-user. We gain inputs and insights about how real users would actually use the product and what we can improve to address their concerns. The result is either a new UI component that needs to be built or new application workflows that need to be architected.

WHAT TECHNICAL CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

As projects become more complex and more teams depend on our Front-End infrastructure, it becomes harder to make changes without breaking existing functionality. With this in mind, we are constantly streamlining the development workflow around our applications.

To deploy with confidence, we employ additional tooling on top of our current test methodologies with automated visual UI testing (Percy and Cypress) to ensure that our applications always look exactly as intended. This is coupled with pre-merge capabilities, where stakeholders can interact with the application that reflects the engineer’s work without it being deployed into a production environment.

We use industry best practices and enforce standards when developing our apps with shared configurations and shared libraries for reusable functionality. When scaling across the engineering teams, we maintain UI/UX consistency with our own bespoke design system, PreAmp. This allows us to iterate fast.

We consider all these projects “internally open-sourced” and encourage our engineers to contribute to these projects as a way of sharing their knowledge across team boundaries. 

It’s important to have people whose sole purpose is to think about how everything fits together. That’s why we work closely with other engineers, UI designers, and product leaders. — David Ung, Director of Engineering

The most important role these engineers have is to share context and act as a relay between engineering, product and design. It’s important to have people whose sole purpose is to think about how everything fits together. Without proper ownership and oversight, shared infrastructure tends to become a complex mess of patches that feels more like a bottleneck than a helpful tool. The importance of communication cannot be overstated.

PreAmp is VideoAmp’s open-source design system for building high-quality, consistent user experiences.


Data Engineering Team: Extracting the Most Value From Data

We build measurement solutions for advertisers who wish to optimize their linear, digital, OTT and cross-screen campaigns. Our platform measurement solutions allow advertisers to quickly identify media investments that are underperforming, so they can shift budget to get a higher ROI on their campaigns. This requires processing millions of rows of data from numerous disparate third-party datasets and then aggregating them into our visualization dashboards to provide meaningful insights. 

We build measurement solutions for advertisers who wish to optimize their linear, digital, OTT and cross-screen campaigns. — Austin Guthals, VP of Engineering

WHAT CUSTOMER CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

Customers want information that is representative of the entire U.S. population. This is challenging because we only get a subset of data based on our independent panels. We must model and extrapolate the data in our subset to scale numbers to be representative of the U.S. population. This requires the implementation of machine learning and significant computing resources. Another challenge is keeping pace with the adtech landscape. Privacy laws such as GDPR and CCPA constantly challenge us to find elegant solutions.

WHAT TECHNICAL CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

Since we are working with large datasets that scale up to the U.S. census, we routinely have to solve big issues with the scale of data. How do you aggregate over two million rows and return an immediate response to a user on a SaaS platform? This scale of data is usually processed offline in batch reports; however, we want to build a SaaS platform that gives our customers the ability to instantly gain insights. How do we do this? No matter how hard we try, some data must still get pre-processed in an offline mode to prepare it for ingestion into our SaaS platform. We use databases like Hive, PostgreSQL and Snowflake to gain blazing performance. Offline reporting is achieved in our Spark clusters running Python and Scala with a NodeJS API layer running in Kubernetes to pipe the data back to the user. 

Our architecture allows our users to pivot, sort and search two-million-row reports in seconds, as opposed to minutes or hours. This was achieved by using Snowflake in an unconventional way. Snowflake is traditionally used as a data warehouse. But it is, at heart, a distributed columnar database that allows compute power and storage to scale independently and almost without limit. We were one of the first customers to point our NodeJS APIs directly at Snowflake for real-time reporting on large datasets. We received early beta access to materialized views, and through rigorous tuning and performance testing, we have achieved some amazing results. We continue to fine-tune our databases to ensure optimal performance by using the right cluster keys, partitions, indexes, materialized views, user-defined table functions (UDTFs) and efficient queries.
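To make the idea behind a materialized view concrete, here is a minimal sketch in plain Python, not Snowflake SQL and not our production code. It assumes a tiny, made-up set of report rows and two hypothetical helpers, `build_rollup` and `total_for`. The point is only that a rollup computed once can answer pivot-style questions without rescanning the raw rows.

```python
from collections import defaultdict

# Made-up raw report rows; real datasets have millions of these.
rows = [
    {"network": "NET_A", "daypart": "prime", "impressions": 120},
    {"network": "NET_A", "daypart": "daytime", "impressions": 80},
    {"network": "NET_B", "daypart": "prime", "impressions": 200},
    {"network": "NET_A", "daypart": "prime", "impressions": 30},
    {"network": "NET_B", "daypart": "daytime", "impressions": 50},
]


def build_rollup(rows, dimensions, measure):
    """Pre-aggregate the measure once per unique combination of dimensions."""
    rollup = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        rollup[key] += row[measure]
    return dict(rollup)


def total_for(rollup, dimensions, **filters):
    """Answer a query from the rollup instead of rescanning the raw rows."""
    total = 0
    for key, value in rollup.items():
        labeled = dict(zip(dimensions, key))
        if all(labeled[name] == wanted for name, wanted in filters.items()):
            total += value
    return total


DIMENSIONS = ("network", "daypart")
rollup = build_rollup(rows, DIMENSIONS, "impressions")

prime_total = total_for(rollup, DIMENSIONS, daypart="prime")
net_a_total = total_for(rollup, DIMENSIONS, network="NET_A")
print(prime_total, net_a_total)
```

A database-managed materialized view goes further by keeping the rollup in sync as new rows arrive, which this sketch does not attempt.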

Central Data Team: Delivering the Best TV Data Possible 

Data quality and data engineering go hand-in-hand here at VideoAmp, and we strive to deliver the best TV data possible to our customers. — Joel Normandin, Sr Director of Engineering

VideoAmp solutions enable marketers and media owners to optimize their entire portfolio of linear TV, OTT and digital video inventory by measuring how advertising performed against in-market sales prospects. TV and premium video advertising is being reinvented through big data, predictive analytics and other audience ontology tools that were previously only available in the digital world. 

We exist to solve the big data and data engineering problem: how to collect and measure advertisements, linear TV viewership, user segments and much more. Whether it’s making high-volume data available for internal application teams, enabling deep-dive investigations for data science modeling or producing custom reports for customer-focused teams, providing the data is what we do. Abstracting the complexities of all our upstream source data into a cohesive and performant data structure empowers the business to be successful.

WHAT CUSTOMER CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

At VideoAmp, we employ a number of practices to ensure that only the best TV data reaches our production environment. Our engineering teams have heavily invested in building data pipelines that make it easy to cleanse, enrich and audit our datasets at scale. Each day, we process hundreds of millions of records from our upstream data providers. During the cleanse and enrichment steps, we transform the disparate data into one canonical format, providing an ability to slice the data to better understand relationships and outcomes. As part of our auditing framework, we collect thousands of metrics on top of the incoming data, which allows us to automatically audit the data.

After a dataset has passed our audit checks, it then moves onto the automated acceptance testing phase. This provides a means of automatically performing the quality inspections and reporting needed to vet the quality of the data.
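The metric-based audit described above can be sketched roughly as follows. This is an illustration only: the record fields, the `AuditRule` class and the thresholds are all hypothetical, and a real framework would track thousands of metrics rather than three.

```python
from dataclasses import dataclass


@dataclass
class AuditRule:
    """Hypothetical rule: a metric name and the range it must fall within."""
    metric: str
    minimum: float
    maximum: float


def collect_metrics(records):
    """Compute a few illustrative metrics over a batch of record dicts."""
    total = len(records)
    missing_network = sum(1 for r in records if not r.get("network"))
    durations = [
        r["seconds_viewed"] for r in records
        if r.get("seconds_viewed") is not None
    ]
    return {
        "row_count": float(total),
        "missing_network_rate": missing_network / total if total else 0.0,
        "mean_seconds_viewed": sum(durations) / len(durations) if durations else 0.0,
    }


def audit(metrics, rules):
    """Return the rules the batch violates; an empty list means it passes."""
    return [
        rule for rule in rules
        if not (rule.minimum <= metrics[rule.metric] <= rule.maximum)
    ]


# Made-up thresholds for the example.
rules = [
    AuditRule("row_count", minimum=3, maximum=1_000_000),
    AuditRule("missing_network_rate", minimum=0.0, maximum=0.30),
    AuditRule("mean_seconds_viewed", minimum=1.0, maximum=86_400.0),
]

batch = [
    {"network": "NET_A", "seconds_viewed": 120},
    {"network": "NET_B", "seconds_viewed": 300},
    {"network": None, "seconds_viewed": 60},
    {"network": "NET_A", "seconds_viewed": 900},
]

metrics = collect_metrics(batch)
failures = audit(metrics, rules)
print("PASS" if not failures else f"FAIL: {[f.metric for f in failures]}")
```

Expressing each check as data (a metric name plus an allowed range) is one way to let new checks be added without changing the audit code itself.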

WHAT TECHNICAL CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

We ingest billions of records, store terabytes of data and process thousands of CPU hours each day. We focus on technical excellence and industry-leading technologies to accomplish these tasks efficiently and repeatably.

The engineering and operation teams have built flexible and powerful Kubernetes Spark clusters and distributed data warehousing with technologies such as Snowflake and BigQuery. These technologies, along with HDFS and Hive, provide the infrastructure needed to scale processing and make this data available. Utilizing the automatic auditing system, we are able to validate functional changes to ensure backwards compatibility and functional validity while maintaining an impressive team velocity.

Systems Engineering Team: Making Sure Systems Work Well Together

Our main goal is to architect, deploy, operate and maintain large-scale systems and network infrastructure. We focus on achieving consistency, security and scalability in these areas. We work hard to improve efficiencies, minimize risk and reduce operational costs through automation.

We work hard to improve efficiencies, minimize risk and reduce operational costs through automation. — Eric Lakich, VP of Engineering

WHAT CUSTOMER CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

One challenge we face is delivering high-performance and highly available solutions for both external and internal customers. Distributed computing, redundant systems, fault-tolerant network hardware and sophisticated solutions for monitoring and automated remediation are essential to providing availability, performance and change propagation on live systems.

WHAT TECHNICAL CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

Rapid growth, fluid requirements and the complexity of our services require continuous evolution of systems and network infrastructure. Implementing new methodologies and researching and trying new technologies are important for maintaining operational excellence and availability, and for pushing the boundaries of performance.

High-Frequency Engineering Team: Building for Speed, Scale and Perfection

Advertisers need a sophisticated platform to connect with users in a customized and intimate way. We know that content is consumed in a variety of ways and reaching those users is a complex task. Our real-time system allows advertisers to reach users on multiple devices including OTT, connected TV and mobile devices in real time, while the user is watching the content. The system is also connected to our VideoAmp data platform, which allows advertisers to target audiences from linear TV through a digital platform. Our team helps connect advertisers and users in real time through content combined with intelligent data and algorithms. 

Our team helps connect advertisers and users in real time through content combined with intelligent data and algorithms. — Jayant Kumar, Director of Engineering

WHAT CUSTOMER CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

Advertisers like to use our linear datasets to optimize their digital advertising spend. They want to reach audiences that have seen certain programs or networks that align with their brand needs. Our buying platform ingests linear viewership audience data into our real-time bidding platform for digital ad buying.

Campaign managers need to access campaign performance in real time, with 99.9% uptime, so they can optimize their campaigns as needed. They also need the ability to access on-demand reports for their clients. So we introduced a hybrid design consisting of ETL and ELT to build a data warehouse, which fuels both the reporting and analytics UIs. This new cloud-based Snowflake warehouse offers an on-demand reporting capability with near real-time performance.

Advertisers need to pace their campaign spend in the desired way. They need capabilities like smooth pacing, live-stream pacing and programmatic guaranteed pacing. For this reason, we built a PID Pacer based on multiple research papers that can recover the spend automatically based on its historical performance and offers customized pacing for live-stream traffic.
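For readers unfamiliar with PID control, the sketch below shows the textbook form applied to smooth pacing. It is not the production pacer, which draws on research papers not reproduced here; the `PIDPacer` class, the gains, the budget and the simulated outage are all assumptions chosen for illustration. The controller compares cumulative target spend with actual spend and adjusts the planned spend for the next interval, so that spend lost during an outage is recovered later in the day.

```python
class PIDPacer:
    """Textbook PID controller; the gains are illustrative, not tuned values."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.previous_error = 0.0

    def update(self, target_spend, actual_spend, dt=1.0):
        """Return a spend adjustment from how far behind or ahead we are."""
        error = target_spend - actual_spend
        self.integral += error * dt
        derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


budget = 2400.0          # hypothetical daily budget
intervals = 24           # one pacing decision per hour
per_interval = budget / intervals
outage = {5, 6, 7}       # hours in which no spend is possible (made up)

pacer = PIDPacer(kp=0.5, ki=0.1, kd=0.1)
spent = 0.0
for hour in range(1, intervals + 1):
    target_so_far = per_interval * (hour - 1)
    adjustment = pacer.update(target_so_far, spent)
    # Never plan negative spend, and never plan past the remaining budget.
    planned = min(max(0.0, per_interval + adjustment), budget - spent)
    actual = 0.0 if hour in outage else planned
    spent += actual

print(f"spent {spent:.2f} of {budget:.2f}")
```

Without the controller, the three-hour outage in this simulation would leave 300 of the budget unspent. A production pacer would also need safeguards this sketch omits, such as limiting integral windup during long outages.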

WHAT TECHNICAL CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

The real-time bidding system processes more than one million requests per second with latency of 30 ms or less. The system consists of 15+ applications, and all of them have to work in concert with one another. That requires a detailed, holistic monitoring and alerting system, which is why we introduced Datadog. It allows us to customize alerts and separate business KPI-related alerts from engineering KPI-related ones.

We built the buying platform from scratch and have introduced new technologies to the company, like Flume and Aerospike, to meet the required low-latency SLA. We also introduced a new language at VideoAmp, Go (Golang), for all of our backend applications so that we can build highly optimized applications.

Every millisecond counts when dealing with tens of billions of requests each day, so we place a lot of importance on using algorithms and data structures to write highly optimized and performance-centric code. You can routinely overhear conversations between engineers about code performance in terms of memory and time complexity with Big O notation.

We need a way to massage and aggregate the data in real time for the real-time bidding system. We use Kafka, Apache Flume, Apache Airflow and Spark Jobs (built in-house) to offer such capability.

An end-to-end (E2E) testing system is essential for such a complex, distributed system. That’s why we developed an in-house E2E test suite, which runs regularly in all environments: development, staging and production.

Data Science Team: Instilling Confidence in Data

Even some of the richest and most complex data in the industry can be rendered useless or even detrimental without skillful interpretation. Our primary objective is to develop rigorous methodologies to ensure all of our data is leveraged in mathematically robust ways and empower our engineering and product teams to build state-of-the-art tools — ultimately allowing data to drive our clients’ business decisions with the utmost confidence.

WHAT CUSTOMER CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

Customers are often entrenched in traditional heuristics, relying on antiquated vanity metrics that may not accurately reflect the reality of the advertising landscape. By combining our customers’ industry knowledge with our experience in machine learning, statistical methods and mathematical modeling, we develop metrics that matter for the business outcomes they care about, supercharging their advertising performance. 

WHAT TECHNICAL CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

VideoAmp constructs a full view of consumers’ journeys from their TV and digital exposures all the way to the final consumer conversion. We leverage this unique perspective to develop methodologies that provide our customers with a full pipeline to plan, allocate, measure and optimize for business outcomes. We deeply explore the rich individual-level data, constructing the most meaningful combinations of features for use in the training and testing of bespoke machine learning models. Uncovering actionable insights amidst the sheer scale of our dataset poses the biggest difficulty in providing these tools. 

Even with 18 million households in our panel, we must take extra precautions to ensure that our data represent the true shape of the U.S. population. Our novel skew correction protocol corrects sampling bias by accounting for each household’s demographics and TV viewership. With our unbiased panel, we can forecast the total number of ad impressions seen by any set of target demographics by constructing fast and flexible tree-based models with handcrafted features and custom-built audiences.
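The skew correction protocol itself is not described here, so the sketch below substitutes a standard technique, post-stratification weighting, to show the general idea of correcting sampling bias. It uses demographics only (no viewership signal), and the strata, population shares and panel are all made up.

```python
from collections import Counter


def post_stratification_weights(panel_strata, population_share):
    """Weight each stratum by (population share) / (panel share)."""
    counts = Counter(panel_strata)
    panel_size = len(panel_strata)
    return {
        stratum: population_share[stratum] / (count / panel_size)
        for stratum, count in counts.items()
    }


# Hypothetical panel: younger households are over-represented.
panel = (
    [("18-34", True)] * 3 + [("18-34", False)] * 3
    + [("35+", True)] * 1 + [("35+", False)] * 3
)
population_share = {"18-34": 0.3, "35+": 0.7}  # assumed population shares

weights = post_stratification_weights([s for s, _ in panel], population_share)

raw_rate = sum(watched for _, watched in panel) / len(panel)
weighted_rate = (
    sum(weights[s] for s, watched in panel if watched)
    / sum(weights[s] for s, _ in panel)
)
print(f"raw: {raw_rate:.3f}, weighted: {weighted_rate:.3f}")
```

In this toy panel the raw viewing rate overstates the population rate because the over-represented group watches more; the weights pull the estimate back toward the population mix.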

Many advertisers care less about total ratings than deduplicated reach. Accurately forecasting the number of unique households reached is impossible without building a holistic picture of the media landscape. Our taxonomy classifies viewers into “clusters” that can be understood and predicted independently, dramatically increasing our ability to forecast how many unique households will be exposed to a given ad campaign. 
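The difference between total impressions and deduplicated reach is easy to see with sets. The household IDs and airings below are made up, and the sketch covers only measurement of reach after the fact, not the cluster-based forecasting described above.

```python
# Hypothetical airings, each with the set of households exposed to it.
airings = {
    "spot_a": {"hh1", "hh2", "hh3"},
    "spot_b": {"hh2", "hh3", "hh4"},
    "spot_c": {"hh3"},
}

# Total impressions count every exposure, including repeats.
impressions = sum(len(households) for households in airings.values())

# Deduplicated reach counts each household once, however often it was exposed.
reach = len(set().union(*airings.values()))

# Average frequency is the number of exposures per reached household.
frequency = impressions / reach

print(impressions, reach, frequency)
```

Because the same households often see several airings, adding a spot can raise impressions without raising reach at all, which is why reach cannot be forecast by summing per-spot numbers.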

Once we’ve forecasted the reach for all the ads in a customer’s campaign, we search the enormous space of possible media plans to find the one that maximizes the reach at the lowest cost and with the fewest off-target impressions. Our genetic algorithm creates thousands of media plans that compete, reproduce and evolve until only the fittest plan survives. Then, by monitoring the performance of our customers’ campaigns in real time, we can employ in-flight methodologies to provide them with data-driven insight into how they should optimally reallocate ad inventory to meet their desired business goals.
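A genetic search of this kind can be sketched in a few dozen lines. Everything below is an assumption made for illustration: the six spots, their costs, the budget, the fitness function (reach within budget, ignoring off-target impressions) and the selection scheme are far simpler than a production optimizer.

```python
import random

# Made-up inventory: (cost, households reached) for each available spot.
SPOTS = [
    (40, {1, 2, 3}),
    (30, {3, 4}),
    (50, {5, 6, 7, 8}),
    (20, {1, 8}),
    (60, {2, 4, 6, 9, 10}),
    (10, {10}),
]
BUDGET = 100


def fitness(plan):
    """Deduplicated reach of the chosen spots, or 0 if the plan is over budget."""
    cost = sum(c for (c, _), chosen in zip(SPOTS, plan) if chosen)
    if cost > BUDGET:
        return 0
    reached = set()
    for (_, households), chosen in zip(SPOTS, plan):
        if chosen:
            reached |= households
    return len(reached)


def evolve(generations=30, population_size=20, mutation_rate=0.1, seed=0):
    """Keep the fitter half each generation and breed it to refill the population."""
    rng = random.Random(seed)
    population = [
        [rng.random() < 0.5 for _ in SPOTS] for _ in range(population_size)
    ]
    initial_best = max(fitness(p) for p in population)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(SPOTS))           # single-point crossover
            child = mother[:cut] + father[cut:]
            child = [
                (not gene) if rng.random() < mutation_rate else gene
                for gene in child
            ]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness), initial_best


best, initial_best = evolve()
print(best, fitness(best))
```

Because the fitter half always survives unchanged, the best plan can never get worse from one generation to the next; whether the search finds the true optimum depends on the population size, mutation rate and number of generations.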

With a mixture of machine learning and game theory, we combine our consumer-level viewership data with our customers’ business data to understand how ad exposures lead to actual business decisions. — Ali Beyram, Sr Director of Data Science

Ultimately at VideoAmp, we care about measuring what really matters: the impact that advertising campaigns truly have on our customers’ bottom lines. With a mixture of machine learning and game theory, we combine our consumer-level viewership data with our customers’ business data to understand how ad exposures lead to actual business decisions.

Quality Engineering Team: Ensuring High Standards are Always Met

We exist to ensure that our clients’ business requests and expectations are met with the highest quality on every deliverable. High-quality releases build trust and confidence with our clients and business counterparts, giving us the edge in rapidly delivering competitive marketing data.

Getting continuous delivery to its optimal velocity requires fast feedback loops in pre-commits and staging environments. Gaining confidence in a production build requires quality control in each release candidate, which we accomplish through our rigorous test suites. Our testing requirements are defined by product teams and SDETs, who work together to implement business rules and create automated scripts around the business logic. This formula empowers engineering teams to iterate quickly and identify risks through clear reporting. Confidence in the final build is gauged together, as every engineer and stakeholder is responsible for the overall quality.

WHAT CUSTOMER CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

With more than 12,000,000 lines of code in over 500 repos, a sound testing culture is essential to our success. In order to manage large sets of code, we chose scalable, enterprise software that’s agnostic to operating systems and browsers. This allows the quality engineers (QEs) more time to stay focused on the output, write more tests, and qualify builds quickly. By not spending time debugging frameworks or overcoming infrastructure and technology challenges, QEs are freed up to tackle the customer and company challenges.

With more than 12,000,000 lines of code in over 500 repos, a sound testing culture is essential to our success. — Joel Normandin, Sr Director of Engineering

WHAT TECHNICAL CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

Visibility and traceability are always difficult when it comes to centralizing output from disparate systems. We provide easily consumed stats and progress in a uniform dashboard through customization, without resorting to cost-prohibitive solutions. This raises the bar even higher in both the skill sets required and the availability of customizable GNU software.

Our approach is to use historically proven software like Jenkins and TestRail to open up our progress and test results. All of these can be displayed on a monitor or linked in a report, making them ideal solutions to the problem. Jenkins offers an option for percentage progress and pass/fail status called “Build Monitor View.” That view displays a clear progress bar of the branch being built along with its test statuses in a visually identifiable pass/fail color scheme.

Test results are not helpful without traceability. A stack trace is key to correcting errors; however, manually parsing log files is time-consuming and costly. We use TestRail for test case management, which gives the team visibility into individual test status and records error logs. More importantly, TestRail records test case failures with a trace specific to the test event. This precision in reporting buys back time for other efforts on projects.

Data Platform Team: Exchanging Value, Not Data

We are building VideoAmp’s next generation data platform. Privacy, security, and governance are our highest priorities. By focusing on these three things, we can unlock billions of dollars of latent value, locked up in datasets that can’t be shared.

Privacy, security, and governance are our highest priorities. — Drew Goya, Architect

We are building our infrastructure on Google Cloud Platform, leveraging Google’s near infinite scale and deep investments in security. Their products have allowed us to design data processing that happens “in place” without complex ETLs.

WHAT CUSTOMER CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

Our customers need a way to bring transparency and accountability to ad buys without violating user privacy. We are providing the technology infrastructure to do this.

WHAT TECHNICAL CHALLENGES DOES YOUR TEAM FACE, AND HOW DO YOU SOLVE THEM?

We needed to design a system that does not require our customers to trust us. To accomplish that, our systems are built on a chain of trust that asks our customers to trust two things:

  1. Google Cloud Platform products behave as described in their documentation.
  2. Google Cloud Platform audit logs faithfully represent what happened in a project.

Every claim we make about security, privacy, and governance is independently verifiable. 

We also need to protect user privacy in a dynamic data environment, where our customers are empowered to use their own data in queries and data processing jobs. This has required us to build differential privacy into the core of the data platform.
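As a small illustration of what differential privacy means in practice, the sketch below applies the Laplace mechanism to a count query. It is a textbook example under stated assumptions, not a description of the data platform: the `private_count` function, the epsilon value and the count are hypothetical. Noise drawn from a Laplace distribution with scale sensitivity/epsilon is added to the true answer, so the presence or absence of any one user changes the output distribution only slightly.

```python
import random


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Return the count plus Laplace noise with scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at zero with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


rng = random.Random(42)  # seeded only to make the example repeatable
noisy = private_count(true_count=1000, epsilon=0.5, rng=rng)
print(round(noisy, 2))
```

A smaller epsilon means more noise and stronger privacy. A real system also has to track the total privacy budget spent across queries, which this sketch leaves out.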