Sneak peek at early decisions at HAI

I recently launched a site that lays out the basic vision behind HAI.

http://hai.io

But the tech stack is where the rubber meets the road. I’ve been coding for about two months now. At the very beginning I spent a fair amount of time weighing options and selected a backend language based on a number of factors. Among the languages I knew, C++, Go, PHP, Python, Java/Scala, and Node.js were on the table. Python and Java were the top two contenders, and I ended up going with Python.

So far I’ve been really happy with Python for the flexibility of the language, the available libraries for both web and machine learning, and the developer community. Ruby / Rails has an amazing community and a great web stack, but given my own lack of familiarity with it and the smaller amount of machine learning work being done there, it didn’t make my list.

Then I started evaluating the open source projects that would make up the platform. There are 132 on the list below (I looked at at least 4x that many). It’s been amazing getting up to speed on the projects that are open source. Although Google, IBM, Amazon and others are clearly going to lead in the machine learning space for the foreseeable future, the open source community is catching up.

Open source is a moving target, and there’s no one-size-fits-all answer when you are piecing together something new. So I’ve been using the awesome ZeroMQ library to connect services across libraries and languages.

Finally, thanks to everyone who has provided feedback so far. Can’t wait to get what I’m working on out into the world.

Soliciting Advice: highly concurrent, available, non-blocking server

I’m seeking feedback on a language or platform for a highly reliable, low-latency web service / application.

Assumptions

Bottlenecks in a web service are usually related to data retrieval and storage, and eventually bandwidth and latency. 
Highly concurrent, lightweight threads provide options for reliability, load distribution, and perceived performance that would otherwise not be available. 
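
To make the second assumption concrete, here is a minimal sketch using Python's standard asyncio library. It is only an illustration, not any of the candidate platforms listed below: the function names (fetch, handle_request, timed_run) and the delays are made up, and asyncio.sleep stands in for real data retrieval. The point is that when the waiting is on I/O, lightweight tasks let the waits overlap, so the request takes about as long as the slowest lookup rather than the sum of all of them.

```python
import asyncio
import time


async def fetch(name, delay):
    # Hypothetical stand-in for a database or network call. Awaiting the
    # sleep hands control back to the event loop instead of blocking.
    await asyncio.sleep(delay)
    return f"{name} done"


async def handle_request():
    # Start all three simulated lookups at once and wait for them together.
    # gather() returns results in the order the coroutines were passed in.
    return await asyncio.gather(
        fetch("cache", 0.10),
        fetch("database", 0.20),
        fetch("search", 0.15),
    )


def timed_run():
    # Run one simulated request and report how long it took in seconds.
    start = time.perf_counter()
    results = asyncio.run(handle_request())
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_run()
    print(results, f"{elapsed:.2f}s")
```

Run sequentially, the three simulated lookups would take about 0.45 seconds; run as concurrent tasks they finish in roughly the time of the slowest one.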

General Requirements 

  • Easy to use (build, deploy, monitor)
  • Plentiful external, stable, pre-integrated libraries
  • Use case: distributed, non-blocking web services
  • Quite a bit of message and job queueing
  • Multiple databases and caching
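
As a rough picture of the "message and job queueing" requirement, the sketch below uses an in-process asyncio.Queue with a small pool of workers. This is an assumption-laden toy: a real deployment would more likely use an external broker or message library, and the names (worker, run_jobs) and the squaring "job" are invented purely for illustration.

```python
import asyncio


async def worker(name, queue, results):
    # Pull jobs forever; the caller cancels the worker once the queue drains.
    while True:
        job = await queue.get()
        try:
            await asyncio.sleep(0.01)  # simulated non-blocking I/O
            results.append((name, job, job * job))
        finally:
            # Mark the job finished so queue.join() can return.
            queue.task_done()


async def run_jobs(jobs, worker_count=3):
    queue = asyncio.Queue()
    results = []
    for job in jobs:
        queue.put_nowait(job)

    workers = [
        asyncio.create_task(worker(f"worker-{i}", queue, results))
        for i in range(worker_count)
    ]

    await queue.join()  # wait until every queued job has been processed
    for task in workers:
        task.cancel()  # workers are idle in queue.get(); stop them
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    for name, job, value in asyncio.run(run_jobs(range(6))):
        print(name, job, value)
```

The order in which jobs complete is not guaranteed, which is why each result carries the worker name and the job it came from.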

Top candidates

A very incomplete list of pros and cons, highlighting some of my thoughts.

  • Scala
    • pros
    • cons
      • new language syntax, paradigm learning curve
      • doubts about JVM memory efficiency and stability as resources are constrained
  • Server-side JavaScript – via Node.js
    • pros
      • redis integration for caching
      • fast, lightweight, easy language
      • some custom JS would be portable to browsers (coolness)
    • cons
      • very new
      • performance
      • not as many external libs?
  • Tornado
    • pros
      • Python
        • as many libs as Scala
    • cons
      • narrower use cases
      • performance

I would love comments; a more complete list is presented in a survey: