Building a F2P Liveops "Playbook" ft. Matt Emery
Matt Emery "The Mobile Games Consultant" shares his secrets to driving meaningful improvements in F2P liveops
Back in November of 2010, during the early days of F2P gaming, Zynga sued Playdom for stealing its trade secrets, specifically Zynga’s Liveops “playbook.”
Last fall Zynga sued rival gaming developer Playdom (recently acquired by Disney) for an array of issues including misappropriation of trade secrets, breach of contract, and breach of the duty of loyalty.
In short Zynga accused Playdom of stealing its confidential ‘Zynga Playbook‘.
Source: TechCrunch 2010
Today, many F2P game studios guard their Liveops best practices like state secrets. While most of these secrets are overblown in their ability to drive game performance, some tips & tricks can make all the difference. This is especially true when some games’ profitability may hang on a knife’s edge of slim margins.
Enter Matt Emery, aka “The Mobile Games Consultant,” who speaks today about his own Liveops playbook. He has built it over many years, drawing on experience working on game projects for companies like Blizzard, Boss Fight, PeopleFun, Glu, D3, and many others.
🎧 Listen on Spotify, Apple Podcasts, or Anchor
Speakers:
Joseph Kim. CEO at LILA Games
Matt Emery. Owner, Product Manager, Consultant at Turbine Games Consulting
Make sure to watch or listen to the full episode, but I include key takeaways from my conversation with Matt below:
How do you deal with low KPI metrics?
Matt Emery: We certainly look at a game’s KPIs and look for problem areas that could give us ideas for optimizations, but we don't focus a terrible amount on benchmarking a game against other games.
We haven't found that to be as useful as just looking at the game itself and looking for low-hanging fruit, looking for easy wins to improve any KPI that we think is moveable with reasonable effort. We don't try to keep up with the Joneses per se because a lot of games are very, very different.
They might be acquiring from different user bases. There are so many differences that benchmarking is a little bit less actionable to us than just focusing on this: how do we drive your game's KPIs upward with minimum effort?
We take what I would call, pretty intuitively, an ROI-first approach. In practice, this means we keep a backlog, which you can imagine as basically a spreadsheet with all of our initiative ideas for a particular game.
It holds all of the things we think could be worth doing that could ultimately drive LTV, which is what we're looking for when we're focused on the product side. We produce an impact estimate that rolls up to LTV for each potential idea or initiative. Then, in this hypothetical spreadsheet, we have another column where we enter our predicted effort on a 1 to 10 scale.
So you can imagine this spreadsheet: a list of initiatives, a column with all of your predicted LTV lifts, and a column with your predicted effort numbers from 1 to 10. You divide the LTV column by the effort column, and that gives you a relative ROI score that you can use to sort the list in a very basic way.
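As a rough illustration of the spreadsheet Matt describes, here is a minimal Python sketch. The initiative names, lift estimates, and effort scores are made-up placeholders, and the `Initiative` class and `prioritize` function are hypothetical names, not part of Turbine's actual tooling. The only logic taken from the conversation is the score itself: predicted LTV lift divided by predicted effort on a 1 to 10 scale, then sorted from highest to lowest.

```python
from dataclasses import dataclass


@dataclass
class Initiative:
    name: str
    predicted_ltv_lift_pct: float  # estimated LTV lift, in percent (assumed unit)
    effort: int                    # predicted effort on a 1 to 10 scale

    def __post_init__(self) -> None:
        # Guard against scores outside the 1-10 scale described in the text.
        if not 1 <= self.effort <= 10:
            raise ValueError("effort must be between 1 and 10")

    @property
    def roi_score(self) -> float:
        # Relative ROI: predicted LTV lift divided by predicted effort.
        return self.predicted_ltv_lift_pct / self.effort


def prioritize(backlog: list[Initiative]) -> list[Initiative]:
    # Highest relative ROI first.
    return sorted(backlog, key=lambda item: item.roi_score, reverse=True)


# Placeholder backlog; every number here is invented for illustration.
backlog = [
    Initiative("Big new social feature", predicted_ltv_lift_pct=12.0, effort=9),
    Initiative("Difficulty tuning test", predicted_ltv_lift_pct=5.0, effort=2),
    Initiative("Store icon test", predicted_ltv_lift_pct=3.0, effort=1),
]

ranked = prioritize(backlog)
for item in ranked:
    print(f"{item.name}: ROI score {item.roi_score:.2f}")
```

With these placeholder numbers, the two small initiatives sort above the large feature even though the large feature has the biggest predicted lift, which is the "low-hanging fruit" effect the sort is meant to surface.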
There are several benefits to prioritizing that way. First and foremost, sorting by ROI surfaces low-hanging fruit, which is a term that you'll hear me abuse over and over again, because that's what we love to do: find low-hanging fruit.
And those will be, of course, modestly scoped initiatives, things that would normally get swept under the rug or often neglected and passed over in favor of larger sexier initiatives and huge game features. I'm sure people can relate to that.
Another benefit to this approach, we've found, is that you build a discipline of forcing yourself to produce LTV impact estimates. If you split test, you can then check your predictions against the actual results, and you start to build your prediction muscles. That's probably the only way to build prediction muscles.
Finally, when you take this approach after working on enough games, you're building a vault of past initiatives and their impacts that can be usefully applied to future games. So at Turbine, we have a vault with hundreds and hundreds of split test results by this point from our favorite initiatives. Every new test we run feeds our flywheel. It makes us a little bit smarter and it helps us be better equipped to help our clients place the right bets on which initiatives can drive the maximum impact.
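One simple way to check predictions against split test results is to track the average gap between the two. The sketch below uses mean absolute error, which is my choice of metric and not something named in the conversation, and the lift numbers are invented placeholders.

```python
def mean_absolute_error(predicted: list[float], actual: list[float]) -> float:
    # Average absolute gap between predicted and measured lifts.
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Placeholder data: predicted vs. measured LTV lift (percent) for three tests.
predicted_lifts = [5.0, 10.0, 2.0]
actual_lifts = [3.0, 12.0, 2.0]

error = mean_absolute_error(predicted_lifts, actual_lifts)
print(f"Average prediction miss: {error:.2f} percentage points")
```

Watching this number shrink over successive tests is one hedged way to see whether the "prediction muscles" are actually getting stronger.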
What needs to go into a Liveops Playbook?
Matt Emery: In our process, there's a core loop and a meta loop, if you will.
The core loop starts with an idea for a specific initiative and it ends with a live split test of that initiative in most cases. And there are four major steps to get us from here to there.
So the first step is to prioritize and choose which initiative to work on next. And again, we take that ROI-first approach based on KPI estimates and heavily informed by our historical vault of past tests and results. The goal here is, wherever possible, to always be working on the thing with the best possible ratio of impact to effort.
So once you've identified the initiative, the next step is the design phase which is pretty straightforward. We create a design doc for each initiative, as well as a design for the parameters of the split test because that's just as important. And at this phase, we also have a design library full of templates and wireframes, and specs that we've used in the past.
Then the next step is building the feature or initiative. And we have some process in our playbook here for how to reduce error in translating the designs into production code and art.
And then the last step is a big one, which is split testing. It's an area of particular focus in our playbook. As anyone who's run split tests can probably attest, there are a million and one ways for them to go wrong and only a few ways for them to go right. So we built a process and a lot of guardrails for making split testing as efficient, repeatable, and fail-proof as possible.
So that's the core loop. And on top of that, we have a meta structure, which is basically just a roadmap for how our teams collaborate to turbo through as many of those core loops as possible over four to six months or more, while also deploying and realizing any gains from split testing along the way.
Okay. So, from a wide aperture, that's what our playbook looks like. And of course, it continues to evolve all the time.
Biggest mistakes to avoid?
Matt Emery: The first big one will sound pretty obvious: just not taking an ROI-first approach.
If you don't estimate ROI and prioritize accordingly, there's this tendency to neglect low-hanging fruit, and instead just build big juicy features. Big juicy features are great and they certainly have their place. You need them, but if you do neglect low-hanging fruit, then you're just not getting the best bang for your buck.
And that puts you at a significant competitive disadvantage, which you can't afford to have these days in this market. Don't neglect low-hanging fruit. That's the first mistake.
Another surprisingly common mistake is, and this one is pretty relatable to a lot of people… I've been surprised how many product managers and designers don't play enough top-grossing games, right?
They don't regularly play their competitors' games as well as top-grossing games from other categories with a critical eye. As a result, there's this tendency to reinvent the wheel or unnecessarily invent untested novel solutions to design problems that aren't novel. There are a lot of mobile games out there.
A lot of smart people have worked on a lot of things. And so almost any design problem you can think of… Like how do I get players to come back more days of the week? Or how do I increase the average purchase price? These are like red ocean design spaces where tons of things have been tried by a lot of smart people and mature, heavily battle-tested solutions have emerged to become best practice. So, you probably don't want or need to be in the business of showing up late to the party and just throwing a new dart at that dart board, right? Use your creativity elsewhere.
I rail about that a little bit, on a soapbox, in some of my articles, but that's one of the ways we help our clients too: just by having a broad perspective and having played and worked on a lot of games from different categories.
And so we frequently will cross-pollinate and import mechanics from one category to another. For example, from casual to core, or from idle to arcade, or in and out of casino. I definitely encourage PMs and designers to play a lot of games. If it's top-grossing, you should be familiar with it.
It pays such dividends to see what other smart people are releasing, typically after split testing. It's just a fountain of good ideas that can be put to good use.
3 Easy tips for PMs to try to drive KPI improvements
Matt Emery: So one area we love to explore is difficulty and economy tuning. We've found that aggressive tinkering or balancing here can have pretty profound results with only modest effort.
What we love to find is these low-hanging fruit type initiatives. Specifically, we've seen LTV lifts as high as like 20% for basic sources and sinks tuning. Similarly, we've had LTV lifts of up to 10% for just difficulty tuning. A lot of teams have done some difficulty tuning, but when we show up and repeat it, we tend to be a little more aggressive than people have a baseline tendency to.
We like to take big swings in our split tests and see what happens. 20% LTV lift, 10% LTV lift. These are the kinds of lifts that you would hope to get if you're lucky from a huge live ops feature that costs over 10 times more to develop. So these are the kind of things that we like to look at.
Anything that's a tuning test that can have a big impact is gonna be very high ROI if it works. You just have to be careful about downstream impact.
Another juicy target area for us is app store asset testing. This is something many teams are aware of, and many teams do, but probably not aggressively enough.
So that means testing new icons, new screenshots, and Google Play feature banners, which is a little bit lesser known [tip]. Each of these requires basically zero engineering because it's just art asset development and testing. But if you're testing with Google Play experiments or Apple's new equivalent feature and you find a new winner, each type of asset can lift your install rate by 2, 3, 4%. We've seen 10% and even higher.
What we suggest is that at minimum teams test five icons, five screenshots, five feature banner concepts every month, and basically never stop. The ROI here is so profound that when a winner is found, it's such a no-brainer that it's really hard to over-invest in these areas. So, we really like ASO testing and many teams obviously are already focused there, but again, most don't do it enough, not as much as we recommend.
And then a third area of focus is IAP merchandising. This is just another area where you're leaning a bit harder on art than on engineering in most cases. And yet you can still drive 10 to 40% or even higher lifts in LTV.
So, anywhere where you can focus on leaning into art resources and a little bit lighter on engineering resources, that is very friendly to the resource constraints of most teams. And so, any wins you can find in those arenas are really, really nice.
Biggest lesson learned?
Matt Emery: One experience I had as a product manager that was particularly enlightening, maybe formative, was running UA campaigns as a PM. I got to do that for a few clients and then for Glu Mobile on a few games there, and that experience taught me two big lessons that I think kind of seep into all of my thought around my articles and our conversation today.
One, profitable user acquisition is really, really hard to do. And two, CPIs are always going up, so you can't afford to stand still and expect to stay profitable.
Maybe it's worth taking a moment, if we have time, to dig into the mechanics a little bit of why CPI goes up.
It's not necessarily something that a lot of developers think about or have at the forefront of their consciousness. My sense is that on the development side (and I was in this boat) developers tend to assume that UA is just kind of happening happily in the background. And it's probably going fine 'cause I'm not hearing anything.
But the reality is, on the front lines, UA campaigns are usually just hanging on a knife's edge of profitability and not certain to pay back even after a year or so. And of course, CPI going up all the time makes things that much harder.
And so why does that happen? Why does CPI creep upward for a particular game? The first reason is what you could probably call a golden cohort, which people may have heard of, or maybe a reverse selection effect. Which is that if you run a bunch of ads this month, then of the users who see your ads, the ones who are most interested in installing your game probably will.
And what does that leave behind? A slightly smaller audience for your ads next month, who are also, statistically speaking, less interested on average. In other words, every month you skim the cream off the top, and the users left over, who have already seen your ads and chosen not to install yet, are self-selecting. They're harder to get.
So that means every month you have to work a little bit harder to get those next installs. Working harder means, practically, that you have to deliver and pay for more ad impressions to get your next click, and that means higher CPI. So your target audience, generally speaking, gets lower quality and more diluted every month as you continue advertising.
So that's the first factor.
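To make the arithmetic behind that first factor concrete, here is a small Python sketch. It assumes ads are bought at a fixed price per thousand impressions (CPM), which is a simplification I am introducing and not something stated in the conversation. Under that assumption, CPI is just spend divided by installs, so if a smaller share of impressions converts to installs each month, CPI rises even when the ad price stays flat. All numbers are invented placeholders.

```python
def cost_per_install(cpm_usd: float, installs_per_impression: float) -> float:
    # Spend per impression is CPM / 1000; dividing by the share of impressions
    # that end in an install gives the average cost of one install.
    if installs_per_impression <= 0:
        raise ValueError("installs_per_impression must be positive")
    return (cpm_usd / 1000.0) / installs_per_impression


# Placeholder scenario: flat $10 CPM, while the remaining audience converts
# a little worse each month as the most interested users have already installed.
cpm_usd = 10.0
monthly_install_rates = [0.005, 0.004, 0.0025]

monthly_cpi = [cost_per_install(cpm_usd, rate) for rate in monthly_install_rates]
for month, cpi in enumerate(monthly_cpi, start=1):
    print(f"Month {month}: CPI ${cpi:.2f}")
```

In this made-up scenario the ad price never changes, yet CPI doubles over three months purely because more impressions are needed per install.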
The second factor, probably a little more obvious, is that every month, 500 new games hit the iOS App Store, right? So chances are at least a few of these are playing in your pond, right? They're running UA, targeting audiences that overlap yours. And so basic supply and demand says your CPI's gonna go up.
And then third, your competitors hopefully are also doing continuous KPI improvement. By competitors, I mean the people that are advertising to the same space, the same users that you are trying to advertise to. So if they're doing continuous KPI improvement and they're successful, then some of them will probably be able to afford higher CPI themselves relative to last month. And that means that you too will need to bid higher. And so your CPI goes up.
So those three factors and probably others that I didn't mention, or maybe I'm not even aware of, but those three factors certainly conspire to put upward pressure on CPI. And because of that you can't afford to stand still whether you like it or not.
I don't mean to sound dire, but you're locked in an LTV and CPI arms race with your competitors. And so you need to continually up your game just to stay above the waterline. So these are things that really only became clear to me after running UA myself.
I highly recommend any product manager try to get your hands dirty running UA campaigns so that you can experience, like viscerally experience, the front half, the CPI half, of the business.