Tonight, two leagues had long delays with their auction drafts. I want to write about what happened, the full context around it, and how it will be fixed going forward. I think it is very important to be transparent about these issues and to show how ottoneu is going to improve in response.
At 3:07pm ET, I received this tweet:
@ottoneu went to start a draft and got a message saying the servers are at capacity. Help?
— Mark Bracero (@AdamMcDougal) February 23, 2014
I am currently in Copenhagen, Denmark on vacation from my day job, and my data plan is wonky to say the least. I received tweets regarding this issue for the next hour:
@ottoneu @nivshah @chadyoung this is a paid service, really need help guys.
— Mike Dugent (@MikeDugent) February 23, 2014
I also received a number of emails during this time period.
I happened to see this last tweet when opening the Twitter client on my phone. Up to this point in the trip, I had received push notifications for new tweets, but this time I didn’t see anything until I opened the client myself. As soon as I saw these tweets, I rushed back to the apartment I am staying in and spent about 2 minutes debugging and resolving the issue.
Draft issues should be fixed. Much delayed, I know. Will be writing a post-mortem on http://t.co/0vuoTpEPLj very soon.
— ottoneu (@ottoneu) February 23, 2014
There are a number of questions about this scenario:
- Why was the issue not acknowledged earlier?
- How was it fixed so quickly when it was open for so long?
- Why are there so many issues with auction drafts on ottoneu?
I will address each of these questions in order. Of course, if you have other concerns, I’m more than happy to address them over email or in the comments.
1. Why was the issue not acknowledged earlier?
I am in Denmark right now, on my first vacation since July of last year. Unlike last July or the previous vacation in November 2012, I did not keep my laptop with me throughout the trip, and I did not put someone else in charge of any issues in my absence. This is also my first trip during the peak ottoneu months, February and March, when all the auction drafts occur. Finally, while I expected Twitter push notifications on my phone, I did not receive any this evening.
Solution:
While ottoneu does not make very much money at all, what little I do make this year will go towards a small laptop that I can keep on my person throughout February and March. I will also be more careful about scheduling vacations during this period, and I will stay much closer to a computer, until ottoneu makes enough money to warrant a second employee. There is no excuse for this not being addressed faster during such a sensitive time.
2. How was it fixed so quickly when it was open for so long?
Plenty of ottoneu issues are actually quite simple, and only come up when code I wrote hastily is pushed into production. In this case, I did a big rewrite of the auction draft in the offseason to try to improve performance. Part of this was introducing Redis to the ottoneu technology stack. Redis comes highly recommended at the aforementioned day job, and I have some experience with it, but I made a couple of fairly simple mistakes. These cropped up quickly under production load, and I was able to sort them out and resolve them quickly.
Solution:
See the above: a faster response time will almost always mean a faster resolution. Longer term, the plan includes a better test environment and more robust testing, but honestly that is a luxury right now.
3. Why are there so many issues with auction drafts on ottoneu?
While no one has straight-up asked me this question, this is a question I ask myself often. There are basically two competing interests:
1) auctions are hard to schedule and when they are scheduled, everyone wants to run their auction.
2) auctions are computationally difficult to keep real-time, and they are also very sensitive to errors, so there should be 100% confidence in live auction drafts when they are run.
There are two solutions within my reach. The first is rewriting large portions of the auction draft code to use more Redis and less database. Database bad, cache good. The second is to invest in more servers. I plan on doing the former extensively, since a few colleagues have told me that this will increase capacity considerably. I’ve already done this a bit, and I hope to do this more.
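To make the "more Redis, less database" idea concrete, here is a minimal sketch of a read-through, write-through cache in front of auction state. This is not the actual ottoneu code: the class names, the `auction:<id>` key format, and the shape of the state dictionary are all assumptions made up for illustration, and a small in-memory class stands in for a Redis client so the example runs without a server.

```python
import json
from typing import Callable, Dict, Optional


class InMemoryCache:
    """Hypothetical stand-in for a Redis client (string get/set only)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class AuctionStateStore:
    """Serves auction state from the cache, touching the database only on a miss."""

    def __init__(
        self,
        cache: InMemoryCache,
        load_from_db: Callable[[int], dict],
        save_to_db: Callable[[int, dict], None],
    ) -> None:
        self._cache = cache
        self._load_from_db = load_from_db
        self._save_to_db = save_to_db
        self.db_reads = 0  # counted only to make the effect of caching visible

    @staticmethod
    def _key(auction_id: int) -> str:
        # Assumed key format, chosen for this sketch.
        return f"auction:{auction_id}"

    def get_state(self, auction_id: int) -> dict:
        cached = self._cache.get(self._key(auction_id))
        if cached is not None:
            return json.loads(cached)  # cache hit: no database query
        self.db_reads += 1
        state = self._load_from_db(auction_id)
        self._cache.set(self._key(auction_id), json.dumps(state))
        return state

    def place_bid(self, auction_id: int, team: str, amount: int) -> bool:
        state = self.get_state(auction_id)
        if amount <= state["high_bid"]:
            return False  # reject bids that do not beat the current high bid
        new_state = {**state, "high_bid": amount, "high_bidder": team}
        # Write-through: persist first, then refresh the cache so readers
        # never see a bid that failed to reach the database.
        self._save_to_db(auction_id, new_state)
        self._cache.set(self._key(auction_id), json.dumps(new_state))
        return True
```

With a store like this, every client polling `get_state` during a live auction is served from the cache, and the database is read once per auction and written once per accepted bid. A real version would also need to handle concurrent bids atomically, which this sketch does not attempt.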
I’ve already increased the number of auction drafts that can run at a single time by 50% over last year. I’m hoping to bump that up to a full 100% and then start exploring more server capacity. So this is a “stay tuned”, but it is also a catch-22, because like I said earlier, ottoneu really doesn’t make much money at all (it was a net loss the last two years, even without any full-time employees). Until ottoneu truly has enough users to afford more server capacity, more efficient code is the best way forward, and I will continue to work towards that.
That is the full situation around tonight’s issue and the overall auction draft problem. I’m back home in 2 days and will be vigilant about any draft issues through April, when load drops considerably and we return to the predictable, boring, wonderful grind of a new baseball season. I hope this has been helpful, and please let me know if you have any further questions or concerns.