The Evolution from PMML and PFA to Agnostic Scoring Engines

Ginger Phelps July 18, 2017

Before models can be placed into scoring engines and then into production, custom code has to be written for each model. This process is labor-intensive and often error-prone. After the model’s custom code is written, data scientists have to hand the model over to IT.

The process of moving a model from data scientists to IT often involves a lot of back-and-forth configuring and editing. The two teams needed, and still need, a way to move models from one environment to another, such as from the data scientist’s laptop to the server farm that serves customer requests, without reimplementing the model every time it is transmitted and executed.

To help deploy models into scoring engines, a model interchange format called PMML was specified. PMML (Predictive Model Markup Language) is an XML-based format intended to help with the safety and scalability of scoring engines.

While PMML was a worthwhile attempt to assist scoring engines, its structure brought challenges of its own, such as:

  • It could only support the models defined in the standard.
  • Models outside the standard required a lot of effort, if they were possible at all.
  • The XML tags had to be changed in order to add new models.
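
To make the first and last points concrete, here is a minimal Python sketch of how a PMML consumer might work. It is an illustration under stated assumptions, not a conforming PMML implementation: the XML fragment is simplified (a real PMML file also carries a namespace, a DataDictionary, and a MiningSchema), and the `HANDLERS` table and the `score` function are hypothetical names made up for this example.

```python
import xml.etree.ElementTree as ET

# Simplified PMML-style fragment. Assumption: the namespace, DataDictionary
# and MiningSchema that a real PMML file contains are omitted for brevity.
PMML_DOC = """
<PMML version="4.3">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="1.5">
      <NumericPredictor name="age" coefficient="0.5"/>
      <NumericPredictor name="income" coefficient="0.001"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def score_regression(model, record):
    """Compute intercept + sum(coefficient * field value) from a RegressionTable."""
    table = model.find("RegressionTable")
    total = float(table.get("intercept", "0"))
    for predictor in table.findall("NumericPredictor"):
        total += float(predictor.get("coefficient")) * record[predictor.get("name")]
    return total


# Hypothetical dispatch table: one handler per model element this consumer knows.
HANDLERS = {"RegressionModel": score_regression}


def score(pmml_text, record):
    """Score one record with the first model element that has a handler."""
    root = ET.fromstring(pmml_text)
    for element in root:
        handler = HANDLERS.get(element.tag)
        if handler is not None:
            return handler(element, record)
    # Anything the consumer has no handler for cannot be scored at all.
    raise ValueError("no supported model element found")


result = score(PMML_DOC, {"age": 40, "income": 50000})
```

Because scoring is driven by a fixed set of element names, a model type that the format does not define has nowhere to go until both the format and every consumer are extended.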

To combat the challenges that arose with PMML, a new model interchange format, PFA, was defined and implemented. PFA (Portable Format for Analytics) is a common language that helps smooth the transition from development to production.

PFA was built to cover all of the main functions of PMML while also filling the gaps in PMML’s functionality. It also provided a safe environment for its users: PFA defined a new language for encoding models that prevents a model from interacting with the operating system in ways that could be “dangerous”.

PFA could also express pre- and post-processing of data more generally than PMML, which supports only a limited form. With PMML, a companion code fragment or script was often required to complement the model by processing input data before the model was applied. With PFA, the entire scoring flow is represented in a standardized manner, making operationalization of models much easier.
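
The two ideas above, a whole scoring flow expressed as data and a language that cannot reach the operating system, can be sketched with a toy evaluator in Python. This is not a PFA implementation: the document only borrows PFA's general shape (`input`, `output`, `action`, and function calls written as `{"name": [args]}`), and the small `FUNCTIONS` table, the `evaluate` and `score_flow` helpers, and the numbers in the model are all assumptions made for illustration.

```python
import json
import math

# Whitelist of pure functions. The names follow PFA's style, but this table
# is a toy subset chosen for illustration.
FUNCTIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "m.exp": lambda a: math.exp(a),
}


def evaluate(expr, datum):
    """Evaluate a nested {"function": [args]} expression against one input."""
    if isinstance(expr, (int, float)):
        return float(expr)
    if expr == "input":
        return datum
    if isinstance(expr, dict) and len(expr) == 1:
        (name, args), = expr.items()
        if name not in FUNCTIONS:
            # Nothing outside the whitelist can run, so the document has no
            # way to open files or start processes.
            raise ValueError(f"function not allowed: {name}")
        return FUNCTIONS[name](*(evaluate(arg, datum) for arg in args))
    raise ValueError(f"unsupported expression: {expr!r}")


# Pre-processing, model and post-processing live in one document.
scaled = {"/": [{"-": ["input", 1.0]}, 2.0]}   # pre-processing: (x - 1) / 2
linear = {"+": [{"*": [2.0, scaled]}, -2.0]}   # model: 2 * z - 2
probability = {"/": [1, {"+": [1, {"m.exp": [{"-": [0, linear]}]}]}]}  # logistic
DOCUMENT = {"input": "double", "output": "double", "action": probability}

# The flow is plain data, so it survives a round trip through JSON text.
DOCUMENT = json.loads(json.dumps(DOCUMENT))


def score_flow(document, datum):
    """Run the whole flow for one input value."""
    return evaluate(document["action"], datum)


flow_score = score_flow(DOCUMENT, 3.0)
```

Since the only callable names are those in the whitelist, a document that asks for anything else is rejected before it runs, which is the kind of safety the new language was designed to provide.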

Although PFA was an alternative to PMML, the language had complications of its own that made it difficult to use. A few of these challenges were:

  • Translating models is a laborious task.
  • The different languages used by data scientists and IT complicated the translation of models.
  • Although PFA is guaranteed to be stable, it does not provide the “contained crashes” that containers do, which limit the impact of a crash.

Both PFA and PMML were built to ease the process of getting models into scoring engines. While both model interchange formats were a step toward an easier process, the preference to avoid converting models into yet another language has outweighed the appeal of PFA and PMML.

Recently, agnostic scoring engines have begun to be used to tackle these problems. An agnostic scoring engine can essentially take any model, no matter what language it is written in, and produce a score without restrictions on the model or changes to its configuration. This kind of engine arose from the desire to run any model without having to worry about constraints.

Using an agnostic scoring engine removes the restrictions that PMML and PFA placed on the model, and in turn brings benefits to scoring that neither PMML nor PFA could achieve without excessive effort. With an agnostic scoring engine, data scientists and engineers are able to:

  • Place any model into the engine, no matter what language it is written in.
  • Run the model in its native language without extra configuration.
  • Score and scale more quickly and easily.

Agnostic scoring engines are just the beginning of faster and easier scoring. Our scoring engine, FastScore, has a simple step-by-step process for producing scores. The model is simply placed into FastScore and given an input stream, and the engine writes the resulting scores to an output stream. All of this is achieved without the hassle of reconfiguring the model to fit a certain language or tool. To learn more about how agnostic scoring engines work, check out our website!
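
For readers who want a feel for the model, input stream, and output stream flow, here is a minimal Python sketch. It is not FastScore's API: `run_engine` and `example_model` are hypothetical names, the streams are in-memory stand-ins, and the one-JSON-record-per-line format is an assumption. A real engine would also run each model in its own native runtime rather than as an in-process function.

```python
import io
import json


def run_engine(model, input_stream, output_stream):
    """Read one JSON record per line, score it, write one JSON score per line."""
    count = 0
    for line in input_stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        score = model(json.loads(line))
        output_stream.write(json.dumps(score) + "\n")
        count += 1
    return count


def example_model(record):
    # Stand-in for a model in its native form; the engine never inspects it,
    # it only calls it once per record.
    return 0.5 * record["x"] + 1.0


# In-memory streams stand in for whatever transport feeds a real engine.
input_stream = io.StringIO('{"x": 2.0}\n{"x": 4.0}\n')
output_stream = io.StringIO()
processed = run_engine(example_model, input_stream, output_stream)
```

The engine only needs to know how to hand a record to the model and where to send the result, so the model itself is never rewritten into an intermediate format.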
