Most of what I know about databases and data management I taught myself while working with (or thinking about the problems of) political campaigns. The rest I learned in grad school while thinking about thinking about politics. Either way, most of my professional life has been spent organizing data and trying to figure out what, if anything, we could actually know from looking at data. If nothing else, I’ve figured out one thing for sure: organizing the data is inextricably linked to learning anything from it.
I have always been frustrated by the lack of careful thought about the relationship between how data is stored and modeled and how it is meant to be used. You can think of this relationship as the organizational posture of data: the state of your data when you are not looking at it determines how easy it is to understand when the time comes. Awareness of posture matters just as much in business as it does in academic research and political campaigns.
Most small organizations get their data posture wrong because getting data management right used to be expensive and difficult. Worse still, getting it wrong could turn trivial access to mission-critical data into a nightmare. Those circumstances created some bad habits.
Now data management infrastructure has matured to the point where tools for collecting and managing transactional data are readily and cheaply available. Between browser-based database applications, e-commerce tools, point-of-sale customer identification, email marketing services, QR codes in printed communications, and mobile apps, we have plentiful ways to measure the mechanisms of audience engagement.
Likewise, tools and techniques for understanding and analyzing data have reached the point where high-value insights are available for the simple cost of collecting good data and managing it well. Business Intelligence (BI) tools—built on the premise that ad hoc data analysis should be easy to implement without hand-coding and complicated programming—have broken out of the Enterprise-only realm and are accessible to organizations of even modest scale.
Inexpensive transactional data management coupled with accessible BI tools may sound like data paradise, but there is a missing piece of the puzzle: the tools for collecting and managing transactional data don't necessarily store and present data optimally for analytical tools—and they probably shouldn't.
In technical jargon this is the difference between OLTP (online transaction processing) and OLAP (online analytical processing). Most small-to-medium organizations (especially undercapitalized ones) try to get by running analytical workloads inside their transactional databases. That tactic is manageable at small volumes, but it does not scale. The tipping point comes when non-mission-critical data becomes analytically valuable enough that it makes sense for the organization to start collecting it. The value of that analytical data is a given; the problems begin when the organization pools it, in what is likely the only data management tool available, with mission-critical transactional data.
Two problems, in fact. First, as live transactional data becomes a smaller and smaller proportion of the overall dataset, transactional performance suffers: processing power (or worse, users) must sort through irrelevant data to get to current customer information. Second, analytic power suffers, because the schema is built to optimize the insertion of individual transactions rather than the extraction and summarization required for large-scale analytics.
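The separation is easy to sketch. Here is a minimal, purely illustrative Python example (using SQLite for both stores; every table and column name is an invention of mine, not a standard) that keeps row-at-a-time transactions in one database and periodically rolls them up into a summary table in another:

```python
import sqlite3

# Hypothetical sketch: keep live transactions (OLTP) in one store and
# periodically roll them up into a separate analytics store (OLAP).

oltp = sqlite3.connect(":memory:")   # transactional store
olap = sqlite3.connect(":memory:")   # analytical store

# OLTP schema: optimized for inserting one donation at a time.
oltp.execute("""CREATE TABLE donations (
    id INTEGER PRIMARY KEY, donor TEXT, amount REAL, donated_on TEXT)""")
oltp.executemany(
    "INSERT INTO donations (donor, amount, donated_on) VALUES (?, ?, ?)",
    [("alice", 50.0, "2023-01-05"), ("bob", 20.0, "2023-01-09"),
     ("alice", 25.0, "2023-02-02")])
oltp.commit()

# OLAP schema: pre-summarized for large-scale questions
# ("how much did we raise per month?"), not row-at-a-time inserts.
olap.execute("""CREATE TABLE donations_by_month (
    month TEXT PRIMARY KEY, total REAL, gifts INTEGER)""")

def etl():
    """Extract from the transactional store, summarize, load into analytics."""
    rows = oltp.execute(
        """SELECT substr(donated_on, 1, 7), SUM(amount), COUNT(*)
           FROM donations GROUP BY 1""").fetchall()
    olap.executemany(
        "INSERT OR REPLACE INTO donations_by_month VALUES (?, ?, ?)", rows)
    olap.commit()

etl()
print(olap.execute(
    "SELECT * FROM donations_by_month ORDER BY month").fetchall())
# → [('2023-01', 70.0, 2), ('2023-02', 25.0, 1)]
```

The point is the ETL step: the analytical store answers "how much per month?" from a pre-summarized table, so analysts never need to scan, or lock, the live transactional rows.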
The art of striking the right organizational data posture is to recognize when you are at the point of realizing serious value from data analytics. There are two things you want to avoid: (1) you don’t want to be playing catch-up later against competitors who got it right and (2) you don’t want to have to disentangle an intertwined mess of OLTP/OLAP databases.
Most people interested in politics have at least some exposure to statistical inference at a very basic level–at the very least, most would accept that it works and know of specialists more adept at it than themselves. Polls are the stock-in-trade of political news coverage, and so every entry-level political junkie quickly develops some facility with the jargon necessary for trading suppositions based on horse-race polling.
Of course, a little experience teaches the ever-so-slightly-seasoned political operative that the real value of polling lies not in measuring where we are but in guiding resource allocation. The most important inferences (educated guesses about facts that cannot be directly observed) for a campaign concern the differential effects that issue messages, or various framings of contestable issues, have on persuadable voters—and the key information is which messages move just enough voters at the lowest cost. In a campaign with scarce resources (and I've never seen any other kind), investing in information that helps you spend your communications budget wisely is almost always a good choice.
Micro-targeting is a deeper use of statistical inference that is becoming increasingly available to campaigns. It started with the largest (nationwide, presidential) campaigns, but the method has become more accessible to smaller (statewide, nationally-targeted congressional, and statewide-targeted legislative) campaigns. The practice is revolutionary for campaigns and works like this:
1. Start with a voter file.
2. Improve that voter file with commercially available data about consumer behavior (especially behavior reasonably linked to political interests).
3. Draw a large sample from the improved voter file.
4. Poll that sample on a battery of political-interest questions.
5. Regress the poll results on the sample to derive a model that relates political interests to consumer behavior.
6. Apply that model to the broader voter file to predict electoral behavior.
7. Use those predictions to constrain persuasion and mobilization communication expenditures and reduce wasted communication.
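As a sketch under loose assumptions (a toy voter file, invented consumer features like "owns_hybrid", simulated poll answers, and a hand-rolled logistic fit standing in for real statistical tooling), the pipeline might look like this:

```python
import math
import random

# Toy sketch of the micro-targeting pipeline. All data, feature names,
# and the 0.5 targeting cutoff are invented for illustration.

random.seed(0)

# Steps 1-2: a voter file enhanced with consumer-behavior flags.
voter_file = [{"id": i,
               "subscribes_outdoor_mag": random.random() < 0.5,
               "owns_hybrid": random.random() < 0.3}
              for i in range(1000)]

# Steps 3-4: poll a large sample; record support for our candidate (1/0).
# Simulated here so that hybrid owners lean toward support.
sample = voter_file[:200]
poll = [1 if (v["owns_hybrid"] and random.random() < 0.8)
              or random.random() < 0.3 else 0
        for v in sample]

# Step 5: regress poll results on the sample (logistic fit, gradient ascent).
def features(v):
    return [1.0, float(v["subscribes_outdoor_mag"]), float(v["owns_hybrid"])]

def predict(w, v):
    return 1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w, features(v)))))

w = [0.0, 0.0, 0.0]
for _ in range(1000):
    grad = [0.0, 0.0, 0.0]
    for v, y in zip(sample, poll):
        p = predict(w, v)
        for j, xj in enumerate(features(v)):
            grad[j] += (y - p) * xj
    w = [wi + 0.5 * g / len(sample) for wi, g in zip(w, grad)]

# Steps 6-7: score the whole file; spend persuasion dollars only on
# voters the model rates as likely supporters.
targets = [v for v in voter_file if predict(w, v) > 0.5]
print(f"targeting {len(targets)} of {len(voter_file)} voters")
```

The payoff is the last line: instead of contacting all 1,000 voters, the campaign buys communication only with the segment the model scores above the cutoff.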
Now, the casual observer might notice quite a few steps of inference and supposition in that chain of reasoning, but each step is sensible when performed by someone who understands the statistical pitfalls involved.
A deeper way to apply statistical inference to improve targeting and resource allocation is to combine micro-targeting predictions (and/or traditional polling results) with tactical field campaign results to further pare down target universes or to highlight mobilization segments.
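A minimal illustration of that combination, assuming a hypothetical five-point canvass ID scale (1 = strong supporter through 5 = strong opponent) and invented model scores:

```python
# Hypothetical sketch: pare a persuasion universe by combining model
# scores with field canvass IDs. Scale values and cutoffs are invented.

# support_score: model-predicted probability of supporting our candidate.
# canvass_id: 1 = strong us ... 5 = strong them; None = never contacted.
universe = [
    {"id": 1, "support_score": 0.55, "canvass_id": None},  # persuadable, uncontacted
    {"id": 2, "support_score": 0.52, "canvass_id": 5},     # model wrong: hard opponent
    {"id": 3, "support_score": 0.90, "canvass_id": 1},     # already with us: mobilize
    {"id": 4, "support_score": 0.48, "canvass_id": 3},     # confirmed undecided
]

def segment(v):
    """Let a direct field ID override the statistical prediction."""
    if v["canvass_id"] == 1:
        return "mobilize"   # confirmed supporter: GOTV, not persuasion
    if v["canvass_id"] == 5:
        return "skip"       # confirmed opponent: don't spend here
    if v["canvass_id"] == 3 or 0.4 <= v["support_score"] <= 0.6:
        return "persuade"   # undecided by field ID or by model score
    return "skip"

plan = {v["id"]: segment(v) for v in universe}
print(plan)
# → {1: 'persuade', 2: 'skip', 3: 'mobilize', 4: 'persuade'}
```

The design choice worth noting is that a direct observation from the field trumps a statistical prediction (voter 2), while the model fills in for the vast majority of voters the field program never reaches (voter 1).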
Micro-targeting is revolutionary as a first application of large datasets to drive inferences about electoral behavior. But the state of the art has already begun to pass it by. The obvious innovation of combining micro-targeting predictions with tactical field data suggests a more powerful methodological move. Field data represents the tip of the iceberg of measurable campaign activity. Generally, field efforts in well-organized campaigns collect two dimensions of data (turnout intent and candidate preference), each coded on a five- to seven-point scale. The development of statewide shared voter files through browser-based data applications has fostered the spread of best practices, improved coding, and made it possible to collect, store, and accumulate such data across election cycles.
But vote intent and vote choice, while highly useful variables, offer only a narrow view of political behavior. We now have a wide array of tools for delivering calls-to-action across many media, and a corresponding set of tools for measuring response; together these give us the capacity to build call-and-response filters that define multi-dimensional segmentation of political audiences. That is an opportunity—if we will take it—to walk non-voters through a succession of calls-to-action and measured, iterative communication to determine which paths lead which sub-segments to convert to voters, to become our voters, or to adopt a shared identity.
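One hypothetical way to represent such segmentation: record each voter's responses to a sequence of calls-to-action as a path, then group voters by path to see which sequences precede conversion. The actions and events below are invented for illustration:

```python
from collections import defaultdict

# Illustrative sketch: each voter's responses to successive calls-to-action
# form a path; paths define the multi-dimensional segments.

# (voter_id, action, responded) events from email, SMS, events, etc.
events = [
    (1, "open_email", True), (1, "attend_event", True), (1, "register", True),
    (2, "open_email", True), (2, "attend_event", False),
    (3, "open_email", False),
]

# Rebuild each voter's ordered call-and-response path.
paths = defaultdict(list)
for voter, action, responded in events:
    paths[voter].append((action, responded))

# A segment is the ordered tuple of (action, response) pairs.
segments = defaultdict(list)
for voter, path in paths.items():
    segments[tuple(path)].append(voter)

# Which paths ended in a successful "register" call-to-action?
converting = [p for p in segments if p and p[-1] == ("register", True)]
print(converting)
# → [(('open_email', True), ('attend_event', True), ('register', True))]
```

With real volumes, comparing the converting paths against the stalled ones (voter 2's path ends at a failed event invitation) is what tells you which next call-to-action to try on which sub-segment.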
If you are interested in learning how to implement these techniques in your campaign, we'd love to hear from you.