TTWCP-880-02 Technical Math, Cathy O’Neil of Mathbabe.org: Cathy O’Neil, Data Scientist and Author of “Weapons of Math Destruction” and Math Babe
On This Episode…
Cathy O’Neil received her Ph.D. in mathematics from Harvard University and, after a brief stint in academia, landed a position as a quant on Wall Street. She was eager to put her math skills to use, predicting movements in the market. But when she realized that the hedge fund she was working for was betting against people’s retirement funds, she became deeply disillusioned. Math was being used in a way she felt was immoral. She left Wall Street and became the financial advisor to the Occupy Movement, bringing their message to audiences from NPR’s Morning Edition to the WGA “Best Documentary” award-winning Frontline episode called “Money, Power, and Wall Street.” She then became a data scientist for a New York start-up.
Now, she is an evangelist for the cause that is at the heart of her book, writing about these ideas and much more on her blog Math Babe. O’Neil is uniquely situated to talk about the social and political implications of this kind of math given her deep knowledge of modeling techniques and an insider’s understanding of how companies are using them.
Below is a rush transcript of this segment; it might contain errors.
Cathy O’Neil – MathBabe
Airing Date: September 24, 2016
Craig Peterson: Welcome back to Tech Talk with Craig Peterson. We’re gonna talk right now about some mathematics that really should interest all of us. It’s affecting everything in our lives, from the jobs that we’re able to get and how the applications for those jobs are gonna be looked at, through policing, through politics. And we’re gonna talk right now with the author of a book called Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, Cathy O’Neil. Now I’ve read through this book and I was just nodding my head again and again because so much of what Cathy is saying in this book is stuff that we’ve talked about here on the radio show for the last 15 or 20 years. She has gone through a lot of trouble here to put it all together, presented in a way that really makes a lot of sense and that can really help change our democracy and our lives for the better. Cathy, welcome.
Cathy O’Neil: Thank you so much. So glad to be here.
Craig: Good book. Very interesting read. It’s gonna be at the Harvard Bookstore on October 3rd… excuse me, you are gonna be at the Harvard Bookstore on October 3rd this year for a book signing. I found it fascinating as I went through it. I was saying yes, yes, yes again and again through the book. Everything from even the FICO Score that they use for looking at the so-called creditworthiness and how that’s all put together. I was just nodding my head. You know, I’m someone who hasn’t had a credit card in 17 years now. And when I look at my credit report, you know, even though I pay in cash for everything and run a multi-million dollar business, I can maybe get a $500 secured credit card online. It’s just incredible. What brought you to this point? I saw in the beginning of the book, you’re talking about how as a kid, you would sit in a car and figure out the prime factors of people’s license plate numbers. You’ve been a nerd a long time.
Cathy: I have. I come by my nerd properties very honestly. You know, I was just such a nerd kid and I loved mathematics. Like, I decided to become a mathematician by going to a math camp when I was in high school at Hampshire College. You know, I was like a hippie nerd. And I was a really naïve hippie nerd. So I decided to become a mathematician. I decided to become a math professor when I was basically turning 15. And 20 years later, I was turning 35 and, you know, I got into Harvard, I got a PhD in Number Theory, I went to MIT as a postdoc. I ended up at Barnard College in New York City. And I just fell in love with the city. I wanted to be part of the city. But I was still extremely naïve and I was like, well, all I know how to do is math, what kind of job can I get? This is in 2006. And you know, I applied for and got the only job I knew how to get, which was a quant at a hedge fund.
Craig: A quant being?
Cathy: You know, quantitative analyst. One of those math nerds that sort of scrapes the data off the market in order to make money from it. I really didn’t know what I was getting into and I started in June 2007 and almost immediately after that, the financial crisis erupted. So I went in thinking, oh, I’m gonna be able to use mathematics to sort of improve the efficiency of the market, blah blah blah. And almost immediately I was confronted by a very dark realization, which is that math within finance was actually being used, rather than to clarify, which is my experience, to obfuscate, to shield corrupt practices in the mortgage market. So I eventually got disgusted with my experiences in finance. I left and I started a blog called mathbabe.org where the goal of the blog was to sort of expose the math, the hypocrisy and the way math was being abused within finance. But in the meantime I needed a day job. So I became a data scientist and I worked at a startup in ad tech in New York City. And that means basically I was deciding which kind of ads to show to people depending on their profile. And I know a lot of people understand this as the tailored ads that they see. I was working in the world of travel. But what I realized relatively quickly, and what the people around me who hadn’t had the experience of finance that I had did not realize, was that we were making many of the same mistakes that I’d seen in finance. We’re making the same mistakes in the world of big data.
Craig: It’s interesting you brought up that company. In fact it’s a company that has been on my radio show before. And the whole idea was hey why don’t we present ads to people that they will appreciate, right? It’s if you’re buying a car, why would I wanna see car ads 24/7 for 5 years before I look into buying the car. And a lot of these big data, a lot of the analysis, you know it seems to make sense at the surface. It’s something that’s gonna provide a real benefit.
Cathy: Yeah, if you don’t mind I’m gonna tell you the story of like what made me realize that I needed to write this book.
Cathy: Which was a venture capitalist came into our company because we were looking for a new round of funding. A typical thing that happens in startup world. And the whole company was sort of gathered to listen to the wisdom of the venture capitalist. And what he said was that he was looking forward to the day when ads were so smart they were so tailored that he would only get, he would only be exposed to things like offers for trips to Aruba or jet skis. And that he would never again have to see another University of Phoenix ad. Because University of Phoenix ads weren’t for people like him.
Craig: They have a great football team.
Cathy: The thing is that everyone in the company kind of giggled. And I looked around saying, what? You know, I thought, you know, the internet was a democratizing force where you all got to have an even playing field. But this exposed to me kind of the sort of evil master plan, and I’m not saying it’s actually a conspiracy. But it’s like the underbelly of, you know, oh we get to see the kinds of things we wanna buy, is that on the far end of the spectrum, essentially for poor people, we’re preying upon them. And I think of for-profit colleges, ’cause I looked into it, as a very predatory industry.
Craig: Yeah. You’ve got a lot of interesting stuff about the for-profit colleges, again in the book Weapons of Math Destruction, where it’s going in showing how they’re advertising, who they’re targeting, how they’re targeting them and paying… these schools are paying, what was it? 150 bucks a pop for some of the leads to get a student into the school?
Cathy: Yeah. And unfortunately, in the sort of dynamics of the market for advertising, poor people are much more valuable to, like, for-profit colleges which, by the way, don’t get money from those poor people. They get money from the federal aid system if they can enroll those students into their schools. They’re much more valuable from the perspective of a predatory industry like payday loans or for-profit colleges than they are for, you know, some consumer purchase. So, where I might see a lamp that I looked at, you know, on a website following me around on the internet, other companies, other demographics of people will not have the same experience. And that’s one of the things I realized was actually kinda more pernicious about big data than had been true of financial manipulation of mathematics, which is that it’s actually silent in the sense that the people like me, the technologists, the very well educated, we do not see the damage that’s being done for the most part. Because for the most part, we are exposed to a different environment. An environment of opportunities and not of predatory behavior.
Craig: Well I’m in New Hampshire and I can remember very, very clearly one of my clients who was a chain of hospitals. And they had some new IT staff that had come in. And they were praising how New Hampshire had gotten rid of its predatory lending rules that had been in place where you could at that point now charge basically whatever interest you wanted to. There’s no more usury here in the state of New Hampshire. And that really, that blew my mind about how excited they were. Not because, hey, listen, now anyone can get a loan if they really, really need the money won’t it be great? It was, wow we can charge these people anything and that frankly turned my stomach just to see that aspect of it. And I didn’t really realize the depth of it until I read that chapter in your book.
Cathy: Yeah. I’m glad you brought that up. It’s generally speaking true of the algorithms that I worry about, and I’ll get into exactly which ones I worry about. I certainly don’t worry about all of them. I have a triage of, like, which algorithms to worry about. But it’s certainly true that many of the most destructive algorithms are successful in the narrow sense. And the narrow sense is typically profit. So from the perspective of those deploying that algorithm, it’s actually a very successful algorithm. But it has sort of unintended side effects on the population. And that’s what I worry about.
Craig: So instead of getting something that’s more kind of the democratization that we hoped to get from the internet, where everyone can learn, everyone’s equal, where the SAT scores, for instance, are going to be a great equalizing factor, where you can go to U.S. News & World Report and find out which are the best colleges, these algorithms, these computer programs that are behind many of these decisions and are used by people just blindly, are causing a whole lot of problems. We’re speaking with Cathy O’Neil. She’s a data scientist and author of a new book called Weapons of Math Destruction. You can go down, in fact just this week here, if you go down on Monday morning to Harvard here, you’ll be able to see her at the Harvard Bookstore, October 3rd, and talk to her a little bit and get a copy of her book and hopefully get that signed. Cathy O’Neil, again, this is something that’s disturbing ’cause people don’t really realize the assumptions that are behind so much of this data.
Cathy: You know I’m glad you brought up the word democracy because one of the things, and I talked about that near the end of the book, is the way that political campaigns are using data. You know, to make a short story out of it, political campaigns have profiles on every single voter and they don’t care about every single voter. They typically only seem to care about swing states and within those states they only care about certain kinds of voters. But nevertheless they have certain scores for each voter like how likely are you to be persuaded on this issue and which issues you care about and the end result is and most of the stuff happens on Facebook actually. Facebook has an enormous amount of money coming in for political ads. But what happens is that different people literally get different messages from the same politician. And going back to this idea like is this a successful model? It is a successful model from the point of view of the campaign because the campaign can send you the exact message that they want you to hear. And the example that I can give is you know, I, I’m a progressive. But Ron Paul, the politician and I agree about financial reform. Right? Not much, but if he knew me, if he could’ve profiled me, he could’ve sent me that message. Like I agree with you, we have to break up the big banks or whatever it is.
Cathy: I would have been highly motivated to vote for him if I wasn’t also aware of his other… his entire platform.
Cathy: And the problem is that it’s getting increasingly difficult to become, sort of, informed. Because even if you go to the website of a given candidate, they will have tracked you. Your profile will come with you to that website and they will show you exactly what they want you to see.
Craig: What you wanna see frankly. Or at least what they want you to think that you wanted to see. In the first… no, we’re getting into a loop here. Feedback loop.
Cathy: You know, exactly. The point being that what is efficient for the campaign is actually inefficient for democracy as a whole. The most efficient thing for a democracy, which campaigns absolutely hate, is literally public debates where we get to know what someone stands for on a whole host of issues. That’s increasingly being replaced by the specific messages that are going to us as individuals. And we don’t know what other people are seeing. And by the way, journalists can’t track it either because it’s very difficult to track. You have to have many, many different personas to see what other people are seeing.
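As a rough illustration of the per-voter scoring described in this exchange, here is a minimal Python sketch. It is not any campaign's actual system: the voter names, issue labels, scores, and the `pick_message` helper are all hypothetical, and the only assumption is that each voter gets the message for the issue on which their profile scores highest.

```python
# Hypothetical per-voter issue scores: how persuadable the campaign
# believes each voter is on each issue. All names and numbers are made up.
voter_scores = {
    "voter_a": {"financial_reform": 0.9, "other_issue": 0.2},
    "voter_b": {"financial_reform": 0.1, "other_issue": 0.7},
}

# One message per issue; the first echoes the example given in the interview.
messages = {
    "financial_reform": "We have to break up the big banks.",
    "other_issue": "A message about a different plank of the platform.",
}


def pick_message(scores, messages):
    """Return the message for the issue on which this voter scores highest."""
    top_issue = max(scores, key=scores.get)
    return messages[top_issue]


# Each voter sees only the message chosen for them, not the whole platform.
targeted = {
    voter: pick_message(scores, messages)
    for voter, scores in voter_scores.items()
}

for voter, message in targeted.items():
    print(voter, "->", message)
```

In this toy setup the two voters receive different messages from the same candidate, and neither sees what the other was shown, which is the tracking problem raised above.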
Craig: Well then, in fact you mentioned Facebook. They’re very, very conscientious about making sure that you can’t have multiple personas. They want you to use your real name. Same thing with Quora and these other sites online. So they can track you a little better and for very many years now Cathy. I kinda started out my technology career, my first was in banking and the second big job that I had was in direct mail marketing back in the 1970s. And what was interesting to me is how much data we had at the time. So I’ve been conscientious since then to make sure that I use fake names, addresses, answers to questions, played around with different sets of cookies on my browsers as time went on, to make myself harder to track and confuse things. But they’re now trying to make sure that that just plain doesn’t happen.
Cathy: It’s a really good point, and I was actually trying to work with a group here at Columbia to understand how Google search works and the extent to which, you know, they show different advertisements depending on your profile.
Cathy: But the problem is that they have a rule against having more than a few emails, even though you’re allowed to have a couple fake emails, you know, on Gmail. But you’re not allowed to have a sort of arbitrary number of them. Even if you’re a researcher. So that’s one of the things I call for from these companies that have this much power. And I should talk about which algorithms I worry about. But let me just say that Google, Facebook, the really obviously powerful ones need to have more transparency, especially for researchers.
Craig: Yeah. It’s a difficult, difficult world from that end, let me tell you. Well, let’s get into the kind of the conclusion here. You’re concerned about a number of different types of algorithms that are being used, and policing was one that you had brought up in here. That there are some programs out there that are designed, at least intended, to try and find the high crime areas so the police can pay more attention to that. But there’s, again, a negative feedback loop.
Cathy: Yeah. Great example. Like the thought experiment I like to present is, imagine that after the credit crisis, after the financial crisis, all the police had been told, go down to Wall Street and arrest the bankers that put us into this mess. If we had done that, which we didn’t of course, we would have had a lot of data about arrests and criminals on Wall Street. And then that data would have fed into these algorithms and, after all, algorithms, all they do is find patterns then predict that patterns will continue. They would have sent the police back down to Wall Street to look for more criminals in 2016. But what actually happened is instead we’ve had uneven policing in black communities and other poor communities. And so these algorithms simply propagate uneven policing but under the guise of scientific objectivity.
Craig: Sure. If you have a lot of police concentrated in one area that you believe may be a high crime area. Those police are going to observe crimes, mostly very small crimes at some places. And now you have more reports of more crime in that area. And so there would be more police, more observation and then it gets into just a terrible loop.
Cathy: It does and one of the reasons… I mean the biggest reason it gets into that loop is this practice of broken windows policing where we’re not actually focusing on the crimes that we should be focusing on. We’re not focusing on violent crimes which, really, nobody could argue against trying to prevent and deter. We’re instead, like, arresting people for nuisance crimes. Things like vagrancy and resisting arrest, and typically resisting arrest comes with basically harassing people and stop-and-frisks and all that kind of thing that the police… they basically are asked to do 2 things. They’re asked to look for violent crimes, which is a great thing for them to be asked to do. But they’re also asked to kinda, you know, harass citizens, and that is the part of the data that creates a toxic feedback loop. And I’d just like to jump in and say that the algorithms we’ve talked about are all what I call weapons of math destruction. And those have 3 characteristics. As I said, I don’t care about all algorithms, just, by nature, I only care about the ones that have the following 3 characteristics. The first is that they are widespread and they have high impact on people’s lives, like the ones we talked about. Like you’re getting arrested. There’s another set of algorithms that decide how long you go to jail. We talked about whether you get to go to college, whether you get a loan, whether you get a job. That kind of thing. The second characteristic is that they’re secret in some sense, and often that means that the people who are targeted by these algorithms don’t know how their scores are calculated; they’re typically scores. And finally, that they’re actually destructive. The people who are considered the losers of these systems actually have their lives affected and they suffer, and there’s actually, typically alongside that, a destructive feedback loop that ensues as well so that they undermine their original purposes.
So for example, the predictive policing that we were just talking about, where you send police into a neighborhood to deter crime. It actually has the opposite effect. It doesn’t actually, it doesn’t slow down crime at all. In fact it kinda propagates that kind of crime.
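The feedback loop discussed in this exchange can be sketched with a toy simulation in Python. This is a minimal illustration under stated assumptions, not a model of any real predictive policing product: the function name `simulate_patrols`, the two-neighborhood setup, and all of the numbers are hypothetical. The assumptions are that recorded arrests scale with how many patrols are present, and that the next round of patrols is allocated in proportion to recorded arrests.

```python
def simulate_patrols(true_crime_rate, initial_patrols, rounds=5, total_patrols=100):
    """Reallocate patrols each round in proportion to recorded arrests.

    true_crime_rate: hypothetical arrests recorded per patrol in each neighborhood.
    initial_patrols: the starting (possibly uneven) allocation of patrols.
    Returns the list of allocations, starting with the initial one.
    """
    patrols = list(initial_patrols)
    history = [patrols[:]]
    for _ in range(rounds):
        # Recorded arrests depend on how many officers are watching,
        # not only on the underlying crime rate.
        arrests = [rate * p for rate, p in zip(true_crime_rate, patrols)]
        total = sum(arrests)
        # The "algorithm": find the pattern in the arrest data and
        # predict that it will continue.
        patrols = [total_patrols * a / total for a in arrests]
        history.append(patrols[:])
    return history


# Two neighborhoods with the SAME underlying rate, but uneven starting patrols.
history = simulate_patrols([0.1, 0.1], [60, 40], rounds=5)
for round_number, allocation in enumerate(history):
    print(round_number, [round(p, 1) for p in allocation])
```

In this toy setup the two neighborhoods have identical underlying rates, yet the allocation never moves toward an even split: the initial uneven policing is reproduced round after round, because the arrest data reflects where the police already were.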
Craig: Very interesting. A lot of good points in the book. It’s well worth the read. And if you can head down on Monday morning here down to the Harvard Bookstore, October 3rd, you will get to meet Cathy O’Neil, data scientist and author. Have a look at her website, in fact there are more details about the visit to the Harvard Bookstore there on the website. Go to mathbabe.org. M-A-T-H-B-A-B-E.org, that’s also her Twitter handle, mathbabe.org. Cathy O’Neil has been our guest. And check out the book online as well. Weapons of Math Destruction by Cathy O’Neil. How Big Data Increases Inequality and Threatens Democracy. Cathy, anything else you’d like to add?
Cathy: I’m looking forward to seeing people on Monday. Thank you.
Craig: Alright. Thanks again. Take Care. Thanks for being with us.