Product Discovery Backstory

Back in the fall of 2017, I had an internal conversation with the product team at GitLab on Product Discovery. Since we’re transparent by default, the session was recorded and posted to YouTube. The original video and a loose transcription are below. Enjoy!

Mark: Well, this was an item that rolled over from last week, but it came up a couple of times and it’s actually been coming up a lot lately: the idea of Product Discovery versus whatever it is we call what we do today. I’ve been thinking about how to approach this. I was even tempted to organize a workshop or something where I can describe what it is that I mean by Product Discovery, at least. I’ve also been tempted to do an experiment with a team or two to do a Product Discovery sprint, or more than one or whatever, and slowly introduce it. But what I’ll do today is just talk about it at a meta level a little bit: what it is, why I think it’s important for us to understand what Product Discovery means, and why we should consider doing an experiment; and then see what you all think about it and whether we want to go forward or not.

So I did include a link to a bad blog post I wrote at some point. It’s not very long or detailed, but it’s the extent of my notes on the subject, unfortunately.

This is actually derived very much from work I did at Heroku, just to be upfront, and how we did Product Discovery there. And it’s a process that we discovered, so to speak, sort of accidentally. The reason I say that is because it also looks very similar to Lean Startup, but we didn’t know that until I read the book later, months later. But to be fair, a lot of it was also probably inspired by one of the founders of Heroku, who had been mentoring me on stuff and had gone through YC. So I don’t know, there’s probably some relationship between YC and Lean Startup and so it’s all related, but the point is, it sounds a lot like Lean Startup, but I hadn’t actually read the book yet.

Actually, originally it was derived from an interview. At Heroku, we did week-long starter projects and I came up with this really compressed schedule because I wanted something for a candidate where the work was nearly guaranteed to succeed and I was just interviewing the candidate on what it was like to work with them. And so I came up with this really aggressive thing: let’s make a tiny, tiny little sliver of a feature that we can guarantee we will ship within a week. In fact, to make sure we guarantee we ship it, let’s guarantee we ship it in two days. And then we’ve got the rest of the week; in case we screw up radically, we can still fix it in the three days left. It turned out, though, that we actually shipped an amazing amount of stuff in a week. And it was awesome and it was an eye-opening experience for me, which is one of the reasons I want to share it with you.

So we stumbled on it during this one starter project and then slowly it sort of calcified into this process that I’m going to describe. And it turned out that anytime we ended up challenging the process, we basically said, no, this process works exactly as it is. If there was a challenge, it was about an interpretation of how we were supposed to do things; the process itself actually really worked. Which is also especially funny, because we started this whole thing because our old process wasn’t working and we threw out process entirely. And so all of this was generated from scratch by a couple of people who basically said, “scrum sucks; all this planning poker and all these other things are just a total waste of time, let’s throw out all process altogether and let’s just talk”. And then we ended up coming up with this very rigid process, which is just ironic if nothing else, but it really, really works because it was based on better principles. I’m not saying scrum wasn’t based on good principles, but it wasn’t working for us.

Also, to give a little more emphasis on this, I was literally working with a team of eight people and we were doing scrum, with weekly sprints, and every sprint we’d try to squeeze out a few percentage points of improvement in our process, but things were still just horribly bad. We spent a whole bunch of time like “okay, let’s iterate on making sure that we’ve got really good high fidelity mockups”, so the engineers have something better to work with than the low fidelity mockups. And then we’d still have problems where the engineers would implement something wrong, whatever. So then we got even higher fidelity mockups. Then we got like, “well, let’s annotate the mockups with the exact dimensions of things”, or “let’s talk about the things we care about”. Because back then we were using Photoshop too, and gradients didn’t exactly translate from Photoshop to CSS. And so people would always interpret things a little bit differently and one of the struggles was the designer was like, “I want to just say, I don’t really care about this. I want it to look like this. Don’t follow my Photoshop mockup exactly because it’s not CSS, but this one I do care about; this really has to be five pixels, don’t make it three or seven, it’s got to be five”.

And so we ended up literally mocking this up like an engineering architecture diagram, and we had this really, really rigid thing. So Product and Design would work really, really closely. And then we’d still throw it over the fence to Engineering. And so the best we got to was this: we’d spend a week mocking up something, and then we’d give Engineering two weeks before they’d pick it up. And that was based on a recommendation from Marty Cagan’s book. And then Engineering would pick it up and they’d work on it for a week and then they’d show it at the end of the week on demo day, and then we’d see it and be like, “okay, well that looks pretty good”. And we’d come up with some immediate things, but then after we went to use it, we’d come up with more. So we’d come up with all the bug fixes over the next week. We’d do feature assurance as Product. And then the following week Engineering would pick up all those bugs and then deliver it. So, net, we’re talking about, I don’t know, five weeks of calendar time with a pretty large team.

When we ended up doing this product discovery process, we basically delivered that same kind of thing in one week with half the people. And so, roughly speaking, I say that this was like a 10x improvement, not in terms of lines of code, but in terms of wall clock time from idea of a feature to delivery of the feature: one week. Also, the quality was much better because we didn’t have these stupid misinterpretations between Engineering and Design. And worse is the feature assurance part. A lot of times you design a static mockup with the ideal lorem ipsum lines of text, and you say, “okay, let’s put three lines of text” and it turns out that in fact, people don’t use three lines of lorem ipsum. They use weird bullets and they use 20 lines in fact and they use all these other things, and designing with real data is really, really hard. So a key thing about this is that within 24 hours, we’re actually working with real data, we can see whether it really works or not, and we can iterate the next day, or in the next four hours in fact, to make changes based on that. Also I banned lorem ipsum; we will never use lorem ipsum again. I always used at least realistic data, but still that wasn’t enough. You had to have actual real data.

So anyway, all sorts of things learned. I’m going to try and speed this up just a little bit; I know we’ve got an hour, but I don’t want to use up all of it. So anyway, the idea here… key things are: Product did not fully spec things out before passing off to the team. Things were just spec’d enough to know that we could deliver in a week, that it was viable to deliver in a week at least, and we had at least some belief that it was an important feature, and we had a hypothesis that if we do X, then this will improve something. What we would then do is take that sort of nub of a topic and we’d do this Think Big session, and the Think Big session would literally be a couple of hours long and we’d really open it up to brainstorming, but also… There are some subtleties here that are really hard to express, but… One of the big things about Think Big and the brainstorming in general is not the actual brainstorming per se; it’s about getting the whole team on the same page so that all the ideas are now shared, instead of having three different people all having different parts of the stuff in their heads and talking about it on an issue where they only talk about some small intersection, but not really actually being on the same page. And you do the Think Big session and, invariably, we found people were just on the same page afterwards. Product and Engineering and Design and everybody knew, basically, the entire scope of what we could possibly be talking about here.

Also an important thing about the Think Big was: totally disregard limitations, like whether something was feasible, whether you could do it in a week or not. Even though of course the ultimate goal is to deliver in a week. In the Think Big session, it was like: let’s radically rethink everything we could possibly rethink on this topic. I know that’s kind of broad and scary, but again, it’s only two hours. And it helped you open up ideas that you had never thought about before, like what if we didn’t just do this, but what if we threw that whole thing out and did something else instead. And sometimes you ended up being like, “Yeah, actually that sounds like a really good point. This idea that we thought was really worthwhile actually isn’t, let’s do something else.” It doesn’t happen very often, hopefully, but you know it sometimes does. But more often, it just gives you a bigger perspective of what’s going on; again, being on the same page.

Anyway, then a really crucial part was the Think Small. And this was again the accidental thing, because I was just trying to make sure that the starter project worked, but we really said: what can you ship in 24 hours? In practice we hardly ever shipped in 24 hours, so it’s really more like 48 hours. Although at some points, we got really good and got it down to 24 hours. But the net idea is you’re starting on a Monday, and you’ve got the first iteration ready by Wednesday lunchtime, not Wednesday end of day. And you’ve got the first iteration and it’s going to be ugly and it’s going to be disappointing and it’s going to be whatever, but it’s going to functionally be there. And I may as well just go along with the story now. The first time we did this, we had the Heroku status site. It told the customers whether Heroku was experiencing an issue or not. And it was all manual entry kind of stuff. We had completely revved that site. It’s now dead, so if you look at it, it’s not that site, but it lasted for a good several years.

Well, in the top right hand corner was a little place where we wanted you… Actually, while I was there, it was a place to subscribe to stuff. But before that existed, there was no place to subscribe. The idea was… We said, “okay, if Heroku is down, it’s great that I can discover there’s a problem and then go and check the status site, but why don’t you just tell me that the damn thing’s down, or even better, in the command line, tell me if things are out or whatever, but just tell me proactively that something’s down instead of waiting for me to notice.” Especially since you don’t necessarily discover things until your customers are complaining about stuff. And this was back before any of this stuff was automated. So we said, let’s let people subscribe to notifications on the status site, and that was it. To flesh out the story a little bit more, I went in as Product Management thinking a couple of things were key. I thought people are going to want to be able to subscribe by email, but they’re probably also really gonna want to get text messages, because they’re not sitting there by the computer all the time and I want to be alerted in the middle of the night if my critically awesome important website is about to go to the VCs and Heroku is down. I need to know about it. So be able to text me, maybe even phone me, I don’t know, but at least somehow communicate on my device.

And then I thought, as a user, you already have my login. So don’t make me create a new account. Don’t make me log in again. Or maybe make me log in, but then get my email address from my login. And I thought that was a critical piece. Those are the only things that I really thought about. Beyond that, I said, okay, I don’t know what else there is.

The awesome thing is during the Think Small session, one of the developers was like, “Well, making you log in first and everything else, that’s actually really hard. What if we just make you type in your email address again?” And I was like, “Well, that makes no sense. You already know my email.” But if I make you log in, you’re gonna have to type in your email address anyway, to log in. This way we’re at least not asking you to type in a password; let’s just type in your email address. And then I was like, “Well, how do you unsubscribe? How do I go to my settings to unsubscribe?” He’s like, “Well, you get the email. Click on the link to unsubscribe. You don’t need an account management system. You just need an email and an unsubscribe mechanism.” And that simplification, which was awesome, came from a developer and scoped it down. And I will honestly say, in the entire history of Heroku, we never tied the login systems together. This thing that I, as Product Management, thought was critical, was not. And in fact, it was awesome to not do it. Because of course, there’s this obvious downside: if Heroku goes down, how the hell do you log in to go and do anything on the status site? That’s kinda dumb; decoupling these things makes a lot of sense. Just a good little lesson learned there: this is why this stuff’s important and why it’s important not just to have Product involved, because Product can be dumb sometimes and think that something’s important when it’s not.

So anyway, Think Big, Think Small, you get it down to what you can ship in the next 24 hours. And then you actually go and work on it. And again, the first version we had, we showed it to a bunch of people including one of the founders of Heroku. And he’s kind of like “Meh, whatever. Fine.” Not the experience we were looking for. We were hoping he would be like, “Oh yeah, this is cool. I can see that it’s ugly, but I can see the potential.” But he’s like, “no, whatever,” just didn’t care. But then we iterated on it and at some point… We started off with, roughly speaking, Rails scaffolding. I don’t think it was actually Rails scaffolding; I’m sure it used our own CSS, but it was still roughly speaking scaffolding. And then we said, “okay, it’s gotta be integrated a little bit better.” And so we ended up putting it in the toolbar, or not in the toolbar, there was no toolbar, but we put it in the top right hand corner. And so there’s a little subscribe link and we made it all a single form, popup, whatever. So a bunch of GUI optimizations. And we did a whole bunch of other stuff too. We actually implemented an API on the backend and we also implemented a Twitter feed. And so we said, “when you go to subscribe, we’ll give you all these mechanisms”: you can subscribe by email, you can subscribe by text message. Here’s the API, so you can do whatever the hell you want with it. Here’s the Twitter feed. And then, I don’t know, there’s five different things you could do. So it was a little bit of scope creep, but it was awesome because we got to it and we said, well, we actually can do these things. We scrambled and made sure that all these things actually worked. And then we basically iterated, and by the next day, by Thursday, we had this thing that was pretty complete and pretty polished. And by the time I showed the founders then, they’re like, “Holy crap, this is awesome. Like, this is pretty much everything we believe in.”

And the reason he said that was because we made a bunch of interesting other choices like, when you subscribed by email, you would get all the updates on a message. So you’d get an alert. And then you’d also get when the alert was updated and when it was closed, and everything else. But if you got a text message, we only alerted you when the thing started. Because if you were woken up in the middle of the night, it’s awesome that you now know Heroku’s down, but do you really need to know that, “Oh, we’re still working on it” every 10 minutes. Do you need to stay awake for that? We figured if you got the text message, if you really wanted to, you would then go to your computer and then follow along by email.
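As a rough illustration of that rule, here is a minimal sketch. The names (Channel, Event, should_notify) are hypothetical and chosen only for this example; this is not the code Heroku actually ran.

```python
from enum import Enum


class Channel(Enum):
    EMAIL = "email"
    SMS = "sms"


class Event(Enum):
    OPENED = "opened"    # the incident starts
    UPDATED = "updated"  # a progress update is posted
    CLOSED = "closed"    # the incident is resolved


def should_notify(channel: Channel, event: Event) -> bool:
    """Email subscribers get every event; SMS subscribers only get the first alert."""
    if channel is Channel.EMAIL:
        return True
    # Assumption: for text messages, only the start of an incident is worth waking someone up.
    return event is Event.OPENED
```

The point of the sketch is that there is no per-user settings table at all: the channel alone decides what gets sent.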

But part of the thing is that we’d have these huge debates about: do we need to let people choose? Because maybe I do want to know when it’s closed, I want to know all this stuff. Do we need to give you options and choices? And we said, “screw that, no options and choices, let’s just think really hard about what the right answer is, and we will give you that.” So you had an option, of course, of email or text, but that’s it. And then we made the rest of the choices for you. That’s one of the reasons that the founders really loved it. Because it was like we made it really, really simple.

So anyway, we iterated, and then by day four, Thursday, we actually launched it to our private beta list, with metrics, because we said, “okay, what does success look like?” And that’s a thing I didn’t really mention, but in the very first iteration we even said, “what does success look like?” before we even started coding. And we came up with what our metrics were and we said, “well, if 10% of the beta list subscribes and stays on there and doesn’t opt out after they get the first few alerts, then that’s success.”

So anyway by Thursday afternoon, we had launched it and so we actually had numbers within four hours. We knew how many people subscribed and we got a bunch of feedback and everything else. So by Friday, by the time - the starter project was basically done at noon - we already had measured response rates. We knew that 20% of the people had actually signed up and stayed there; that this was actually valuable. We had the anecdata responses of people saying they loved it, but we had the numbers to back it up. And we were fully done. Now there were a few little bugs we fixed over the next week or two, but basically we shipped an entire feature, from start to end in a week.
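To make that success check concrete, here is a minimal sketch. The function names and the sample counts are made up for illustration; only the 10% threshold and the 20% outcome come from the story above.

```python
def retention_rate(beta_list_size: int, subscribed: int, opted_out: int) -> float:
    """Share of the beta list that subscribed and did not opt out afterwards."""
    if beta_list_size <= 0:
        raise ValueError("beta_list_size must be positive")
    return (subscribed - opted_out) / beta_list_size


def is_success(rate: float, threshold: float = 0.10) -> bool:
    """The hypothesis agreed on before coding: at least 10% subscribe and stay."""
    return rate >= threshold


# Hypothetical counts: a beta list of 500, 110 sign-ups, 10 later opt-outs.
rate = retention_rate(beta_list_size=500, subscribed=110, opted_out=10)
print(f"{rate:.0%} retained, success: {is_success(rate)}")
```

The useful part is less the arithmetic than the ordering: the threshold is written down before any code ships, so the numbers on Friday can only confirm or refute it.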

The reason I really love this and the reason I think we should start caring about this is that this is really valuable for high-uncertainty projects. We didn’t know for sure what subscriptions should look like. We didn’t know what many of our features at Heroku should look like. And we really wanted to get feedback from the beta list early on too. And so part of that was a two-way conversation with the beta folks. Whereas currently a lot of what we do is… We work on something for a month and then we ship it. And actually, in fact, we feature freeze and then wait several weeks before we get any feedback about it. But by then, we’re already working on the next iteration of it. And so we’d better be right at some level. I’ve always said this before, though: speed does sort of solve all our problems. And the thing that GitLab really has going for it is that we ship quickly. And so yeah, we ship in a month and even if we’re totally wrong, who cares, it’s just a month of effort we throw away. More realistically, most of the time we’re right and we just need to tweak it, in which case we’ll tweak it really quickly.

It still does mean, though, that in calendar time it can take three months before we actually have a polished version of something. And, boy, I’d really love to get that cycle time down. Right. This is part of what we pitch. Right. We pitch our cycle time as meaning something. And it’s awesome what we ship in three months. And it’s awesome, the sheer quantity of things we ship. So the bandwidth is high, but the latency, so to speak, is high as well, which is not a good thing. And product discovery can really get that latency down. It can mean that we ship things quicker and, with higher uncertainty, we can discover (that’s the whole reason we call it Product Discovery) what the feature is supposed to be.

The reason this all came up is because we were talking about: Do we design things ahead of time? Do we have things fully spec’d out before we hand it off to Engineering? And that works for some types of problems and it’s really bad for some other types of problems. If you just sit there and design a static mockup and then say, “great, go and ship it,” but then it turns out to just not really satisfy anybody’s needs, then you haven’t really done anything. Or at least it’s two months before you realize how you’ve made a mistake and then you can go forward.

Bringing a little bit of Lean Startup thought to this, and I’ll try and wrap this up quickly: the difference for me between the way Product Discovery works and the way that Scrum works is that Scrum can tolerate being wrong. The idea is you move quickly, and you can react to your customer telling you how you made mistakes, et cetera. Whereas Lean assumes you are wrong, you just don’t know where you’re wrong. And so you want to get it out as quickly as possible so you can learn where you made your mistakes. And it’s not a problem that you made mistakes; this is a high-uncertainty situation. So get something out there in front of target users, beta users, whatever, as quickly as possible so that you can find out where you are wrong.

And I think that’s, in a lot of ways, different from our approach with GitLab too. We are tolerant of being wrong. We have this issue tracker. We have public contributions. We have all this kind of stuff, but still we’re basically marching forward assuming we are right. We’re doing an issue and we have the next issue and we have the next issue planned out. And we’re assuming that less and less; I mean, we used to assign five issues’ worth and we know that that’s not right. But still, take the cross-project pipelines we’re doing right now: we’ve basically planned three releases’ worth. And we’re not listening to any feedback in the middle of that or any part of it really; we’re still just assuming we’re right and we’re delivering it. And we’re tolerant of making mistakes. And I think the big difference, again, with Product Discovery is that, assuming we are wrong, we want to get there as quickly as possible. We want to learn. It also does tend to mean that we have a more polished product more quickly.

So anyway, that is, I think enough of my time.

Fabio: So, Mark, just a question for you about your story: were you and your team just focusing on that specific feature during that week or so of work, or were you also doing something else during that time?

Mark: That is an excellent question. And that is probably the biggest downside of this process.

Actually, there’s two big downsides. One is that it is basically a hundred percent focused. As the product manager, I would be doing other things, but the developers, the designers, they’re working a hundred percent on this one thing, which is great but means you’re doing no bug fixes, you’re doing no maintenance, you’re doing nothing else, whatever. So Marty Cagan, who a lot of this stuff was inspired by… and by the way, he’s got this really great book, but he’s got a better blog. Because his book is old and basically he doesn’t believe in everything he wrote in his book anymore, but he’s actually a really smart guy and his blog is right on target. If you read the book like I did and think, “well, this makes no sense. This is dumb.” It is dumb, but his new stuff’s awesome.

Anyway, in his book he proposed basically having two teams: you’ve got your product discovery team, and that’s mostly about product and design, and you’d have like one or two engineers allocated to doing the discovery. And then when that discovery is done, you’d pass it off to the delivery team. And so you’d have a discovery team and a delivery team. And then of course the delivery team can be working on anything else in the meantime. I personally never did a discovery team and a delivery team separately, so I’m biased here. I’m not even sure I think that’s a good idea. His arguments were things like, well, if you do discovery, it’s still just a proof of concept. You’ll move faster if you’re not trying to write production code. But then pass it off to a delivery team that writes good production code. But from my perspective, I just had one team that wrote good enough code right away and we would ship it. And maybe we’d refactor it later on. But I don’t know, I didn’t do that. But the other aspect is that potentially, if you have that delivery team, then that delivery team can be working on bugs. Instead, what we would do is every once in a while, we would not do a discovery sprint and we would do a bunch of bug fixing sprints or whatever, or you’d wrap up a bunch of bug fixing into some other topic that you would then do a discovery sprint on or something.

But also I’ll just say in practice, we just shipped a lot in this team and it wasn’t, in practice, a problem, but I know that we do things differently here, and we might very well need more maintenance and whatever.

The other big downside to this is the synchronous communication. We had at least four hours of overlap with every member of the team. In fact, in the early days it was seven or eight hours. Everybody was in the same building. We would discuss something in the morning at our stand-up, we would sync back up four hours later, and then we’d sync up again four hours later.

And there was a lot of communication. As a product manager, it took at least half of my time just working with the discovery stuff. So massively high bandwidth requirements from Product. The positive side, of course, and this is where a lot of this stuff was inspired from, was that Engineering would be like, “Oh, I tried to implement this thing. And it’s a problem. What do we want to do?” Or “this spec was unclear.” And so instead of just getting stuck, putting in a comment on an issue and then going off and doing something else, you would say, “okay, get the product manager and the designer together right then.” And we would make a decision within five minutes.

That was actually before we had the product discovery sprint. That was the precursor to all this. If you’ve got a problem, basically pull the red line on the Japanese Kanban kind of thing. Production stops, you solve the problem, then move on. And that actually was incredibly powerful, frankly.

It meant that our calendar time got really reduced. I really don’t know how the hell that’s going to work in an asynchronous, globally distributed world, which is the big reason that I haven’t been pushing this more. I think it can, and it might mean that you do something like: you’ve got your delivery team that does the normal stuff asynchronously, and then you’ve got a different team that at least has four hours of overlap. It might mean that you can’t have a globally distributed team, but you can have a continentally distributed one. I don’t know, just throwing this out there.