Transcript (Podcast 4): Recorded Future and Investing Using Social Media

This transcript was taken from a recent episode of Tradestreaming Radio, which can be found here.


Hi! I’m Zack Miller, author of the recent book TradeStream Your Way to Profits: Building a Killer Portfolio in the Age of Social Media, and you’re listening to Tradestreaming Radio, our home in the internet radio space. This is our place to discuss how technology is helping investors to become better, smarter, and more accurate at what they do.

You can find the Tradestreaming podcast on iTunes. You can also find lots of other material relating to this podcast, as well as archives of our programs, at my website. There’s lots of other great content there as well, and I recommend you check it out.

We’ve got a great interview for today. We have Evan Sparks, who, after several years of developing quantitative equity strategies on the buy side, is a product engineer at Boston-based Recorded Future. Recorded Future uses linguistic analysis to harness the predictive power of the web for credit and equity research.

I’ll let Evan introduce himself and his firm next.

Sparks: Absolutely. By way of background, as you know, I went to Dartmouth. Before Recorded Future, I spent several years as a quantitative analyst at a mid-sized US asset manager, building actively managed strategies in the US equity space: portfolios designed to beat their benchmark, which was typically the Russell 3000, or hedge strategies of various kinds…

I joined Recorded Future last summer, really with the focus of taking this platform that we’ve built and finding really solid uses for it, and building out towards the financial services space, particularly focusing on quant finance, but also to some degree on more traditional discretionary financial research.

Miller: Evan, let’s first talk about the platform that Recorded Future has developed, and then let’s drill down further to discuss the applications that some of your work and technology has for the investing field.

Sparks: Sure thing. I guess to begin, sort of our central tenet at Recorded Future, our core belief is that we think the content of the web has predictive power. We think that by being able to break down and quantify what’s going on in the web in clever ways we can find some interesting things that we can make predictions about. This might be financial market data, this might also be, we have some customers in the government intelligence space, as well as people doing things like brand management with the product.

Miller: There have been a lot of companies online that have focused on using semantic analysis for predictive capabilities, particularly in the investment field. This has always been seen as the holy grail of investing. I wanted to see what sets Recorded Future apart.

Sparks: Right. We take kind of a different approach. We do linguistic analysis of the text content, this massive repository of unstructured text content that’s on the web, and we try to apply structure to it. We do things like entity extraction, so for any reference to a company, or a product, or a person, or a place, we extract the fact that it occurred in a particular document.

We also do event extraction. If two companies are involved in an acquisition, or a potential acquisition, we extract the fact that there was this acquisition event, or discussion of an acquisition, and which two companies it was between, for instance.

We have things like capital markets events, as well as product releases, and also natural disasters. We have about 100-150 different event types that we capture at this point.

Miller: Once you capture that data, how do you turn it into useful information? How do you then process it so that you can begin predicting future events?

Sparks: The third piece that we capture from all of that is also any time references within the content of that text. We try to be very sophisticated about how we capture references to time. We capture when was an article published, when was it downloaded by our system, but we also capture within the text itself references to things like, “next week,” or, “on July 22nd,” or, “in 2012, this may happen.”
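
[Editor's illustration: the kind of in-text time reference Sparks describes can be sketched with a few regular expressions. This is only a toy example built around the three phrases he mentions; the patterns and the function name extract_time_references are assumptions made for illustration, and Recorded Future's actual linguistic analysis is certainly more sophisticated and is not described in the interview.]

```python
import re

# Hypothetical patterns covering only the three styles of phrase quoted above.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
TIME_PATTERNS = [
    r"\bnext (?:week|month|quarter|year)\b",               # relative: "next week"
    r"\bon (?:" + MONTHS + r") \d{1,2}(?:st|nd|rd|th)?\b",  # dated: "on July 22nd"
    r"\bin (?:19|20)\d{2}\b",                               # year: "in 2012"
]
TIME_RE = re.compile("|".join(TIME_PATTERNS), re.IGNORECASE)


def extract_time_references(text):
    """Return every matched time phrase, in the order it appears in the text."""
    return [match.group(0) for match in TIME_RE.finditer(text)]


if __name__ == "__main__":
    sample = "The deal may close next week, or on July 22nd; in 2012, this may happen."
    print(extract_time_references(sample))
```

A production system would also need to resolve each phrase to an actual date relative to the article's publication time, which this sketch does not attempt.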

By being intelligent about how you capture references to future events you can start to say, “Hey, give me back any references that have been to future M&A activity in the pharmaceuticals industry over the last year.” You can start to see patterns emerge from that data.

So, some more concrete examples, I guess, of things that we’ve done include looking at whether negative sentiment around the S&P 500 index is a leading indicator of next month’s volatility in the S&P 500. Do we see a pattern there? We have seen some strong statistical relationships in the space of volatility, dollar volume for trading, as well as in abnormal returns, in some cases.

Miller: Is the next step for Recorded Future to actually create an investment strategy around some of the data that you guys are bubbling up? Or, is your role as sort of a content provider, content mixer, just to sort of crunch the numbers then hand it off to your clients for future processing?

Sparks: For the most part we’ve been working closely with customers at hedge funds and banks, as well as smaller trading shops, to help them take their ideas and implement them with Recorded Future data. We think we have a data source that’s pretty orthogonal to traditional financial data sources. Normally people look at quarterly filings, and analyst estimates, and of course market and price data, but we think that we provide a different channel, a different set of metrics that you can judge a trading strategy based on.

Miller: From what I’ve seen, Evan, the data and information coming out of Recorded Future seems really valuable. I hope as you get the word out you’ll grow your client base, and they will devise profitable trading strategies based off of this type of analysis.

The question always arises in something like this, why not just raise some money, close off this black box and invest in a proprietary basis? Start your own hedge fund based upon some of the analysis that you’re doing?

Sparks: To answer the question, “Why don’t we start our own hedge fund with this stuff?” I think really what we’re building and the key value that we’re delivering is an ability and a flexible platform that allows anybody to answer complicated questions about the world.

Our expertise, while we think we’re pretty good at forming those kinds of questions around this data, certainly people are going to have their own ideas about what they really want to look at and where they think those values are derived. So, what are certain experts saying about this company or that company? We think maybe we can help you identify who the experts are, but it’s up to the user for how they want to interpret the results and make their investment decisions.

I think we provide a great data platform, and a great analytic platform, but the hard work is really in the analysis, I guess.

Miller: In some sense, and this may be a poor analogy, you’re selling tools in the gold rush that is sort of what’s going on in quantitative research right now. Is that a fair analogy?

Sparks: Yeah, I’m not sure that it’s necessarily a gold rush. Certainly we’re selling tools to all types of investors, not just quants. Via our web-based UI, we’ve certainly had a lot of discretionary researchers show tremendous interest in taming this big massive data store that is the web, and getting the slices that they need out of it. It’s applicable to lots of different areas in finance.

Miller: That’s very interesting. One of the outputs I saw of the research that you produce, and I believe you were the author of the article, was some of the findings your firm had around the crowded hedge fund trade. I guess there is no more crowded trade than Apple these days. Can you tell us a little bit about that research, what it means, and maybe sort of allude to some of the directions you’re going to be taking with your research in the future?

Sparks: The idea here was how do we quantify the level of discussion, or the level of crowdedness around a particular trade based on online media? We wanted to get a measure that was completely orthogonal from market data, something that’s not baked into prices, baked into flows, that kind of thing. Something that you can’t get anywhere else but from looking at online media.

What we did was construct a pretty simple basket of words based on relative frequencies of the words and phrases in academic and business articles about momentum investing, so the momentum trade, sort of the classic papers as well as some of the newer ones. When people talk about it in blog posts, or The Wall Street Journal, or whatever, what are the words they use there that they don’t use in other kinds of articles?

Based on these word counts and frequencies we then look, over time, throughout our entire repository of business and finance articles. Per day we kind of take an average of this metric we’ve developed based on the usage of these words and phrases.
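
[Editor's illustration: a rough sketch of the approach described above, under several assumptions. The basket is taken to be words that are relatively more frequent in articles about the topic than in background articles, and the daily metric is taken to be the average share of basket words per article. The function names, the add-one smoothing, and the min_ratio threshold are hypothetical choices, not details given in the interview.]

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_basket(topic_docs, background_docs, min_ratio=2.0):
    """Keep words whose relative frequency in topic_docs is at least
    min_ratio times their (smoothed) relative frequency in background_docs."""
    topic = Counter(w for doc in topic_docs for w in tokenize(doc))
    background = Counter(w for doc in background_docs for w in tokenize(doc))
    topic_total = sum(topic.values())
    background_total = sum(background.values())
    basket = set()
    for word, count in topic.items():
        topic_freq = count / topic_total
        # Add-one smoothing (an assumption) avoids dividing by zero
        # for words that never occur in the background articles.
        background_freq = (background[word] + 1) / (background_total + 1)
        if topic_freq / background_freq >= min_ratio:
            basket.add(word)
    return basket


def daily_scores(dated_articles, basket):
    """Average, per day, the share of each article's tokens found in the basket.

    dated_articles is an iterable of (day, text) pairs.
    """
    per_day = defaultdict(list)
    for day, text in dated_articles:
        tokens = tokenize(text)
        if tokens:
            hits = sum(1 for token in tokens if token in basket)
            per_day[day].append(hits / len(tokens))
    return {day: sum(scores) / len(scores) for day, scores in per_day.items()}
```

With a basket built this way, daily_scores yields one number per day that can be tracked over time, which is the shape of the metric discussed in the interview.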

What you see in the chart and the article is we’ve plotted this over time. We took a look at how chatter around this concept of momentum investing has changed over time. What we saw when we plotted the performance of this metric against the performance of a mutual fund that follows a momentum investing strategy, according to its prospectus, was an inverse correlation, particularly over the last year, between our metric and the fund’s performance.
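
[Editor's illustration: an inverse correlation between two time series is commonly measured with the Pearson correlation coefficient, where values near -1 indicate that one series tends to fall as the other rises. The interview does not say which statistic was used, so the choice of Pearson here is an assumption, and the numbers in the usage example are made-up toy values rather than real chatter or fund data.]

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values only: chatter rises while the fund series falls.
    chatter_metric = [1.0, 2.0, 3.0, 4.0]
    fund_performance = [8.0, 6.0, 4.0, 2.0]
    print(pearson(chatter_metric, fund_performance))
```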

This violated our prior. We certainly thought there would be a positive correlation between people talking about the trade and the performance of the trade. Thinking the logic would be people talk about it more, so they’re buying into it more, driving the price up.

But this negative correlation was pretty interesting when you think about it in kind of an ecological context, around this idea of crowded trade; more people fighting for the same pennies, and they get harder and harder to pick up, and maybe it’s tougher to perform in that kind of scenario.

That’s our current intuition around why this particular trade works. Certainly some future steps would be to dig in a little more, maybe look at fund flows in and out of momentum funds, maybe look at other types of investment strategies. Certainly value is one that people talk about. There are other trades that you can get into. If you can find sufficient data online around the discussion of these trades, maybe there are some interesting signals there.

Miller: I guess what struck me about the article was exactly what you pointed out, sort of its counterintuitiveness. I started thinking a little bit deeper. Your take from sort of the ecological perspective, about it being harder and harder to get those falling pennies, certainly is one way to explain it.

I was thinking, just recently, it could just also be baked into sort of the methodology at that particular fund that you looked at, right? Maybe it’s not truly a momentum fund, or something like that. That certainly seems to me plausible, not necessarily explanatory, but plausible.

Sparks: Yes. We definitely have thought about that a little bit. We have seen this pattern, this was the one mutual fund that I could find with a sufficient history for comparison against our metric, but over the last year other momentum funds show a very similar pattern.

It seemed to be somewhat persistent. We definitely don’t have enough data points there to make a clear statement there. I think definitely a very interesting area worth pursuing.

Miller: I then asked Evan how curious investors could interface with Recorded Future, and get a feel for what they have to offer, and maybe access some of their services.

Sparks: The first thing you can do if you want to get an idea of what we offer for free: we offer free futures alerts, which are a way that you can set up a query in our system and it will send you an email alert for any new results that come online. If you’re interested in M&A rumors in the pharmaceutical space, you could set up a futures alert for acquisitions, pharmaceutical industry, any time in the future. You would get references that come up online to just that: acquisitions in pharmaceuticals any time in the future.

If that’s the kind of thing you like, if you’re interested in the results that are coming back, we encourage you to sign up for our premium product, which is $149 a month. It gives you a much, much richer experience. Several visualizations of the data that come back, timeline view, and network view of what are the entities mentioned together typically in online media, and how has that pattern changed over time, as well as a few other views of the data.

This is the platform that lets you really sort of dig into what’s going on in online media, how is it changing, and how is it changing with respect to this crucial dimension of time that we think is so important.

Miller: Evan, is there a plan, and I know this is a sensitive question, to maybe syndicate some of these tools, some of the findings that you have within the system, into other platforms, into other systems that investors are using, like Yahoo Finance or Bloomberg, where they can encounter your tools and research there, as opposed to having them come to your website?

Sparks: Everything we have via the web platform is embeddable by default. Anything you see, any visualization you generate with the product, you can embed that in your blog, or your website. We’re adding some social media tools, currently. Definitely we want to get people interested in the platform and what this data can bring to whatever their investment process, or research requirements are.

Miller: Evan, thanks so much for participating in this podcast. This has been really educational for me. I hope it’s been instructional to our listeners, and our readers. Recorded Future sounds quite interesting. I’m going to keep an eye out for it in the future. It’s been a pleasure reading some of the findings you’ve produced on your blog. I’ll link to all this material from my blog, so that my readers can have access to it.

Thanks again.

Sparks: Great. My pleasure. Thanks a lot, Zack. Great talking to you.

Miller: That was Evan Sparks, engineer at Recorded Future, resident genius. It was just a really interesting conversation with a company that I think is sort of breaking out and making really usable and accessible some of these linguistic analytical tools for investors.

We know, clearly, that there’s a tremendous amount of information residing online, at both the micro and macro level. Obviously drilling down, and looking at 13F filings from an insider, from a hedge fund, and mimicking those are some of the things I talk about in my book, and on my blog. Piggyback investing is important.

But, on a macro level there’s a lot of noise going on. Tools like Recorded Future are helping investors sort of provide an analytical layer to try to make sense of some of those things. The next step is to then take those and devise a strategy around them, back test them, and start predicting events into the future. Again, that just opens up a lot of doors for both individual and professional investors going forward.

Thanks again for tuning in to the Tradestreaming podcast. I always appreciate your listening. Head to the blog, where I’ll have some additional information. I hope you tune in again soon. Thanks a lot.