There are currently innumerable data languages that can be used for a wide range of analytic projects, and that number will surely grow as new languages are developed.
Before models can be placed into scoring engines and then into production, custom code has to be written for each model. This process is labor-intensive and often error-prone. After the model’s custom code is written, Data Scientists have to hand the model over to IT.
This year’s StrangeLoop conference is less than a week away and I’m psyched. This meeting with an odd name lies at the intersection of an odd blend of topics, including distributed systems, languages, and data science. It would be a natural place for me to talk about PFA, which covers all three, but instead I decided to talk about something new: a language of histogram aggregation called Histo·grammar.
On Monday, I outlined my view of what makes “an analytic” and the jargon that goes along with it. I ended the post wondering whether “an analytic” is just another name for the “business rules” that organizations tend to follow today. In my view, the main difference between the two, which might be no difference at all in a given situation, is that “analytics” are generally more complex mathematically and operate on a more general “feature space”. This generality gives rigorously developed techniques from statistics and applied mathematics a chance of being applied to a messy real-world problem.
Today I am happy to share a significant step forward for Open Data Group. Stu Bailey and Pete Foley will join ODG as CTO and CEO, respectively. I will continue as Chairman and will also assume the role of Chief Data Scientist.