News 2008
Open Cloud Consortium. Open Data Group is participating
in the newly formed Open Cloud Consortium (OCC). The purpose of the
OCC is to support the advancement of research in cloud computing, to
develop open standards in cloud computing, to develop open source
software for cloud computing, to manage testbeds for cloud computing,
to run meetings, workshops and other events related to cloud
computing, and in general to advance the state of the art in cloud
computing. Robert Grossman is the initial Chair of the OCC.
Augustus Version 0.2.6.6 was released on SourceForge.
Augustus is an open source infrastructure for building and
deploying data mining and statistical models for large data sets and
high volume data streams. Augustus is compliant with the Predictive
Model Markup Language.
News 2007
Award. Robert Grossman led the Angle Project,
which won First Place in the 2007 Analytics Challenge at the ACM/IEEE
International Conference for High Performance Computing and
Communications 2007 (SC07).
The title of the project was "Angle: Detecting Anomalies and Emergent
Behavior from Distributed Data in Near Real Time."
SIGKDD Award: Robert Grossman was awarded the ACM Special Interest Group
on Knowledge Discovery and Data Mining (SIGKDD)
Service Award for his "role in the development of open and
scalable architectures and standards for the SIGKDD and Global KDD
Communities."
Best Paper Award: The paper "Data Quality Models for High
Volume Transaction Streams: A Case Study" by Joseph Bugajski, Robert
Grossman, Chris Curry, David Locke and Steve Vejcik won the second
annual Data Mining Practice Prize at KDD 2007. The prize is awarded
each year "for work that has had a significant and quantitative impact
in the application in which it was applied."
Augustus Version 0.2.6.5 was released on SourceForge.
Augustus is an open source infrastructure for building and deploying
data mining and statistical models for large data sets and high volume
data streams. Augustus is compliant with the Predictive Model Markup
Language.
DM-SSP 07 Workshop. Robert Grossman organized the Workshop
on Data Mining Standards, Services and Platforms (DM-SSP 07), at
KDD-2007 in San Jose on August 12, 2007. The workshop highlighted
recent progress on developing standards-based services for data mining
and data intensive computing. This year's focus was on cloud
computing.
PMML Version 3.2. The Predictive Model Markup Language
(PMML), Version 3.2 was released in May 2007. Open Data Group
participated in the development of this standard.
New Methodology Introduced. A practical way to improve predictive
analytics as the amount of data increases is to build an analytic
infrastructure that automatically builds many predictive models,
instead of taking the more traditional approach of building one (or a
few) manually. Robert Grossman recently gave a lecture on this:
Modeling Highly Large, Heterogeneous Data Sets: Towards
a Billion Models, DIMACS Workshop on Recent Advances in Mathematics
and Information Sciences for Analysis and Understanding of Massive and
Diverse Sources of Data, Rutgers University, New Brunswick, May 15,
2007. How this idea was applied to analyze transactional data from
Visa is described in two papers at
KDD 2007: Robert Grossman, Joseph
Bugajski, Chris Curry, David Locke, and Steve Vejcik, Detecting
Changes in Large Data Sets of Payment Card Data: A Case Study, and
Joseph Bugajski, Chris Curry, Robert Grossman, David Locke, Steve
Vejcik, Data Quality Models for High Volume Transaction Streams at the
KDD Data Mining Case Studies Workshop.
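The many-models idea described above can be illustrated with a minimal sketch. It is not the method from the papers and does not use the Augustus API; the segment names, the (mean, standard deviation) baseline, and the threshold of 3.0 are all assumptions chosen for illustration. It builds one simple baseline model per segment automatically, then scores each new value against its own segment's model.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_segment_models(records):
    """Build one baseline model (mean, std dev) per segment key.

    `records` is an iterable of (segment, value) pairs. The segment
    names are hypothetical and used only for illustration.
    """
    by_segment = defaultdict(list)
    for segment, value in records:
        by_segment[segment].append(value)
    return {seg: (mean(vals), pstdev(vals)) for seg, vals in by_segment.items()}


def is_anomalous(models, segment, value, threshold=3.0):
    """Compare a new value with the baseline of its own segment."""
    mu, sigma = models[segment]
    if sigma == 0:
        # A constant baseline: any different value counts as a change.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical data: two segments with very different typical values.
records = [
    ("merchant_a", 10.0), ("merchant_a", 12.0), ("merchant_a", 11.0),
    ("merchant_b", 500.0), ("merchant_b", 520.0), ("merchant_b", 510.0),
]
models = build_segment_models(records)
```

With these assumed records, a value of 500.0 is flagged for merchant_a but not for merchant_b, which is the kind of distinction a single global model would tend to blur.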
News 2006
DM-SSP 06 Workshop. Robert Grossman organized the Workshop on
Data Mining Standards, Services and Platforms (DM-SSP 06), at KDD-2006
in Philadelphia on August 20, 2006. The workshop highlighted recent
progress on developing standards-based services for data mining and
data intensive computing.
Augustus Version 0.2.4 was released on SourceForge.
Augustus is an open source infrastructure for building and deploying
data mining and statistical models for large data sets and high volume
data streams.
News 2005
Augustus Version 0.2.1 was released on SourceForge. Augustus is
an open source infrastructure for building and deploying data mining
and statistical models for large data sets and high volume data
streams. Augustus is compliant with the Predictive Model Markup
Language (PMML). Augustus supports vectorized operations and is
designed for data sets that are too large for existing open source
data mining systems.
Augustus can be downloaded from SourceForge:
www.sourceforge.net/projects/augustus.
PMML Version 3.1. The Predictive Model Markup Language
(PMML), Version 3.1 was released in December 2005. Open Data Group
participated in the development of this standard.
Industry related. Robert Grossman was elected to the six-
member executive board for the ACM Special Interest Group on Knowledge
Discovery in Data (ACM SIGKDD) for the term 2005-2009.
Industry related. Robert Grossman was the general chair of
KDD-2005, the Eleventh ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining, which
took place on August 21-24, 2005 in Chicago.