What is it exactly that quants do?
This is the fourth post in a series of curated interviews from Reddit’s IAmA subreddit, a place where individuals of various professions and backgrounds invite the community to “ask me anything”. There have been a lot of new readers on the blog recently, so I strongly recommend that they start with my first post on investment bankers.
I have selected questions and responses that, in my opinion, comprise some of the best quant/algorithmic trading/high frequency trading interviews on Reddit.
- What sort of algorithms does the bot use? What is its strategy?
The bot uses reversion to the mean strategies and is fully automated (it’s running as we speak) and completely non-discretionary. It requires no intervention – I could die right now and it wouldn’t even notice. It would keep trading and shut itself down at the close. I do extensive backtesting using GA optimization and validate using Monte Carlo and walk forward techniques. The bot runs 100% off live data, but is optimized using historical data. I get my historical end of day and intraday data from IQFeed and real time data via my broker’s (Interactive Brokers) API.
My system is reversion to the mean; it doesn’t trade trends. I also don’t use moving averages or any other technical analysis indicators. It’s all statistically based. I’d agree backtesting is worthless if you aren’t careful how you do it. It’s certainly very easy to uncover fool’s gold. Note that I don’t use the off the shelf dogshit most retail traders attempt to optimize parameters with, plus I test everything 100% out of sample. EVERYONE who trades does some level of backtesting, even discretionary traders. Unless you’re wiping your brain clean each day, your trade decisions are all made based on the positive or negative outcomes of your prior trades. We’re all curve fitting, some of us are just more optimal than others about how we do it.
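For readers unfamiliar with walk forward testing, the idea can be sketched roughly as follows. This is only an illustration, not the interviewee’s code: the function name `walk_forward_splits` and its parameters are made up here, and the window sizes are arbitrary. Each step fits on one window of history and evaluates on the data that immediately follows it, so every evaluation is out of sample.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_obs: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges that roll forward through time.

    Hypothetical helper: the test window always starts right after the
    training window, so the model never sees the data it is scored on.
    """
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        # Slide forward by one test window so test windows never overlap.
        start += test_size


if __name__ == "__main__":
    for train, test in walk_forward_splits(n_obs=10, train_size=4, test_size=2):
        print(f"fit on {train.start}..{train.stop - 1}, "
              f"evaluate on {test.start}..{test.stop - 1}")
```

In practice the parameters would be re-optimized on each training range and the out-of-sample results stitched together into a single equity curve.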
- Any books you could suggest? Websites you frequent or did frequent to get your feet wet?
I’ve found the best trading books are those from the academic Computational Finance community. Stay away from the Wade Cook garbage at Barnes and Noble. Here’s some favorites:
- Biologically Inspired Algorithms for Financial Modelling (Natural Computing Series)
- Computational Intelligence in Economics and Finance (Advanced Information Processing) Volume 1 and II
- Evidence-Based Technical Analysis: Applying the Scientific Method and Statistical Inference to Trading Signals
- How to Solve It: Modern Heuristics
- Way of the Turtle: The Secret Methods that Turned Ordinary People into Legendary Traders
- The Black Swan: The Impact of the Highly Improbable
In terms of websites, I hang around the Elite Trader forums quite a bit. They’re mean as hell there, so put on your thickest flame jacket before you post, but there’s some nuggets of useful info buried amongst the endless sophomoric dick slamming. Look at the posts by lescor and acrary. lescor (my personal hero) is a former fireman turned prop trader who pulls $50K+/month out of the markets and his equity curve is nearly a straight line.
- Is there a specific broker that you can suggest with lower fees?
I’d suggest Interactive Brokers (they are who I use). They charge $0.005/share but you can get that down slightly with rebates if you add liquidity. There are other brokers out there, but IB is really the best option for retail traders. You can trade stocks, options, futures, and/or FX through IB. And they have the largest selection of markets to trade, including international markets like the DAX, KOSPI, FTSE, etc. It’s hard to find a market they don’t offer, really.
- How much does the latency of the data feed play into high frequency trading?
Latency is extremely important if the signals you trade are short term (up to hours). Most exchanges have “colocation” facilities in which firms can rent space to run computers with their trading software. Every millisecond is important; firms spend millions of dollars to get an edge over the competitors. A lot of automated trading systems are not high frequency, so being the fastest isn’t that important. Most high frequency trading strategies are based on market microstructure (e.g., there are more shares on the best bid than the best offer, so the price is more likely to go up). It’s a strategic game that’s separated from supply and demand considerations. It is definitely the case that different dollar-amount stocks behave differently in microstructure and in short term movements.
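The microstructure example in that answer (more shares on the best bid than on the best offer) is often expressed as a normalized imbalance. Here is a minimal sketch under that assumption; the function name `book_imbalance` and the normalization are illustrative choices, not something the interviewee described.

```python
def book_imbalance(bid_size: float, ask_size: float) -> float:
    """Return (bid - ask) / (bid + ask) for the sizes at the best quotes.

    The result lies in [-1, 1]. Positive values mean more size is resting
    on the best bid than on the best offer. Hypothetical helper.
    """
    total = bid_size + ask_size
    if total <= 0:
        return 0.0  # empty book on both sides: no information
    return (bid_size - ask_size) / total


if __name__ == "__main__":
    # 300 shares bid vs. 100 shares offered leans toward upward pressure.
    print(book_imbalance(300, 100))
```

A real signal would need to be fitted and tested against subsequent price moves before anyone trusted it.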
- What percentage of the total daily volume on SP500 index futures do you need to trade in order for HF trading to work? Do you trade any illiquid equities, stocks with a small float? What aspect of statistics do you use in your daily job? Do you back-test your strategies on historical data? If yes, how far back do you test?
- You can trade a very small amount (1 share per day) and still qualify as high frequency if the signals you are trading are based on high-frequency market data.
- Yes, we trade illiquid stocks. Any strategy tailored for liquid names obviously has to be rethought for illiquid names, but it is possible and profitable to trade them.
- I do a lot of regressions and similar testing to fit models, etc.
- Yes, we backtest strategies. If it is a simple code change, we might just test a few months; for a serious risk management strategy change, it could be a couple of years.
- What does high frequency mean? What would be an average or target duration for the length a trade stays open?
High frequency strategies are generally characterized by:
* High cancellation ratio – most limit orders are cancelled before they are filled.
* Low holding period – on the order of hours or less.
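Both characteristics are easy to measure from your own order history. The sketch below assumes a hypothetical log format (a list of outcome labels and a list of entry/exit timestamps in seconds); neither the labels nor the function names come from the interview.

```python
from typing import List, Tuple


def cancellation_ratio(order_outcomes: List[str]) -> float:
    """Fraction of orders whose final outcome is 'cancelled'.

    'cancelled' and 'filled' are hypothetical labels for this sketch.
    """
    if not order_outcomes:
        return 0.0
    cancelled = sum(1 for outcome in order_outcomes if outcome == "cancelled")
    return cancelled / len(order_outcomes)


def mean_holding_seconds(round_trips: List[Tuple[float, float]]) -> float:
    """Average time in seconds between entry and exit.

    Each round trip is an (entry_ts, exit_ts) pair in seconds.
    """
    if not round_trips:
        return 0.0
    total = sum(exit_ts - entry_ts for entry_ts, exit_ts in round_trips)
    return total / len(round_trips)
```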
In general, high frequency traders don’t try to buy and immediately flip for a higher price (there are statarb strategies that do this). Instead, you usually have ‘signals’ that give you an indication of where things are going based on factors like market microstructure. Your signals indicate that, if you put out a buy limit order and you get filled, then you are in expectation making money. That doesn’t mean you should sell right away when you get filled; you again should generally sell only when you think you are making money by selling.
- Is it as simple as “If X happens, does Y?” or is it much more complicated? Back when automated trading was a semi-serious hobby for me, people would say they trade “on book”, not on price. If that sounds familiar to you, could you give an example?
Examples (suppose A and B are correlated):
Trading on price: if stock A moves up and stock B moves down, predict they will converge in price (sell A and buy B).
Trading on book: if there are more bids than offers for stock A and more bids than offers for stock B, predict they are both going up (buy/post bids for both, or at least cancel offers at the front of the queue.)
Some strategies have very simple ideas and others are very mathematically complex.
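To make the contrast concrete, the two examples can be written as toy decision rules. Treat this as a rough sketch only: the function names, the `threshold` parameter, and the mirrored case in `price_signal` (A down, B up) are assumptions added here, and a real strategy would weigh far more than two numbers.

```python
from typing import Dict

HOLD = {"A": "hold", "B": "hold"}


def price_signal(ret_a: float, ret_b: float, threshold: float = 0.0) -> Dict[str, str]:
    """Trading on price: bet that two correlated stocks converge again.

    ret_a and ret_b are recent returns. The second branch (A down, B up)
    is an assumed mirror image of the example in the text.
    """
    if ret_a > threshold and ret_b < -threshold:
        return {"A": "sell", "B": "buy"}
    if ret_a < -threshold and ret_b > threshold:
        return {"A": "buy", "B": "sell"}
    return dict(HOLD)


def book_signal(bids_a: int, offers_a: int, bids_b: int, offers_b: int) -> Dict[str, str]:
    """Trading on book: both order books lean toward the bid, so expect both up."""
    if bids_a > offers_a and bids_b > offers_b:
        return {"A": "buy", "B": "buy"}
    return dict(HOLD)
```

The price rule only looks at where the two stocks have traded, while the book rule only looks at resting orders, which is the distinction the answer is drawing.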
- Could you also give some insight into the technology stack you use (programming languages, OS, open source tools, etc)?
I’m going to assume that you, like most hobbyists, are trying to dump your data directly from the feed into the database. Your generic databases aren’t designed to be used like this. Take your feed data, process it however you need to, and then just append it to a binary log file, flushing every X amount of bytes. At the end of day, clean or process the data, and then insert it into your RDBMS.
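A minimal sketch of the append-and-flush approach described here might look like the following. The record layout (`TICK`), the class name `TickLog`, and the 64 KiB default are all assumptions made for illustration; the answer itself only says to append to a binary log and flush every X bytes.

```python
import struct
from typing import List, Tuple

# Hypothetical fixed-width record, little-endian with no padding:
# timestamp in ns (int64), symbol id (uint32), price (float64), size (uint32).
TICK = struct.Struct("<qIdI")


class TickLog:
    """Append-only binary log that flushes after a set number of bytes."""

    def __init__(self, path: str, flush_bytes: int = 64 * 1024) -> None:
        self._file = open(path, "ab")
        self._flush_bytes = flush_bytes
        self._pending = 0  # bytes written since the last flush

    def append(self, ts_ns: int, symbol_id: int, price: float, size: int) -> None:
        self._file.write(TICK.pack(ts_ns, symbol_id, price, size))
        self._pending += TICK.size
        if self._pending >= self._flush_bytes:
            self._file.flush()
            self._pending = 0

    def close(self) -> None:
        self._file.flush()
        self._file.close()


def read_ticks(path: str) -> List[Tuple[int, int, float, int]]:
    """Read every complete record back; a truncated tail is ignored."""
    with open(path, "rb") as f:
        data = f.read()
    last_start = len(data) - TICK.size
    return [TICK.unpack_from(data, off) for off in range(0, last_start + 1, TICK.size)]
```

The end-of-day job would then read the log with something like `read_ticks`, clean the records, and bulk insert them into the database.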
If you need to query lots of data (a billion and up), your RDBMS won’t be optimal. You can try selectively loading sets of data you want to work with from your binary log files into the RDBMS, but this will limit your working set. You can try to use one of the NoSQL databases if your access patterns are a good match.
If you want something that will perform very well, simply emulate the way kdb+ or Vertica store data. Sort your binary log files by keys, index the file, and memory map for fast searching. On top of that you can splay the file, add block level compression, etc.
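As a rough illustration of the sort-then-memory-map idea (and not of how kdb+ or Vertica actually work internally), the sketch below writes fixed-width records sorted by key and binary searches the memory-mapped file. The record layout `REC` and both function names are hypothetical. Because every record has the same width, the record number acts as the index, so no separate index file is built here.

```python
import mmap
import struct
from typing import Iterable, Optional, Tuple

# Hypothetical record: key (int64, e.g. a timestamp) and value (float64).
REC = struct.Struct("<qd")


def write_sorted(path: str, rows: Iterable[Tuple[int, float]]) -> None:
    """Write rows to disk sorted by key, one fixed-width record each."""
    with open(path, "wb") as f:
        for key, value in sorted(rows):
            f.write(REC.pack(key, value))


def find_first(path: str, key: int) -> Optional[Tuple[int, float]]:
    """Return the first record whose key is >= key, or None if there is none.

    Assumes the file is non-empty (mmap cannot map a zero-length file).
    """
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        n_records = len(mm) // REC.size
        lo, hi = 0, n_records
        while lo < hi:
            mid = (lo + hi) // 2
            mid_key = REC.unpack_from(mm, mid * REC.size)[0]
            if mid_key < key:
                lo = mid + 1
            else:
                hi = mid
        if lo == n_records:
            return None
        return REC.unpack_from(mm, lo * REC.size)
```

Only the pages touched by the search are read from disk, which is what makes this workable on files much larger than memory.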
We work with several technologies.
- C/C++ and Java with a sprinkling of some lesser-known languages like Clojure and J
- Lots of open source and we do give back
- Some proprietary third party software/hardware. A lot of our infrastructure was custom built earlier last decade and has been/is being phased out.
- Could you talk a little more about general infrastructure and what parts are key to operating HFT strategies? How do you decide what programming language to choose or rather what is your development process?
I think individuals can compete if they know their shit, put in the time, focus on a niche, and have capital. In my opinion, most people are living in a pipe dream if they think they can go from novice to ATM operator printing money. That is not how it works. There are things larger firms will not operate in because it simply isn’t worth it. You have to consider that there are fixed costs and we need X amount of profit to make deploying something worthwhile. These niche markets are probably less common than one may think. Carve out a strategy where you have a competitive advantage; don’t try to beat me at my own game.
You need low latency infrastructure and fast execution. What specifically do you want to know? If we need to develop and implement something, we define constraints and requirements and pick a language from there. It’s very much a best-tool-for-the-job process: generally, high performance components are written in C++, enterprise components in Java, prototyping/fast tooling is done in some scripting language, and text processing in awk.