Rodrigo Hernández Mota
whoami
Data Engineer / Scientist at LeadGenius



Predictive Model Markup Language (PMML) is:
"an XML-based language that provides a way for applications to define statistical and data-mining models as well as to share models between PMML-compliant applications."
PMML integrates with the most popular ML frameworks via JPMML.
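To make the "XML-based" part concrete, here is a minimal sketch of what a PMML document can look like and how its contents map to a prediction. The PMML string is hand-written for illustration (a one-variable linear regression, y = 1.0 + 2.0 * x), and `PmmlSketch` and `score` are hypothetical names. A real application would hand the file to a JPMML evaluator rather than read coefficients by hand; the JDK XML parser is used here only to keep the sketch dependency-free.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

object PmmlSketch {
  // Hand-written, illustrative PMML document describing y = 1.0 + 2.0 * x.
  val pmml: String =
    """<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
      |  <Header description="illustrative linear model"/>
      |  <DataDictionary>
      |    <DataField name="x" optype="continuous" dataType="double"/>
      |    <DataField name="y" optype="continuous" dataType="double"/>
      |  </DataDictionary>
      |  <RegressionModel functionName="regression">
      |    <MiningSchema>
      |      <MiningField name="y" usageType="target"/>
      |      <MiningField name="x"/>
      |    </MiningSchema>
      |    <RegressionTable intercept="1.0">
      |      <NumericPredictor name="x" coefficient="2.0"/>
      |    </RegressionTable>
      |  </RegressionModel>
      |</PMML>""".stripMargin

  // Reads the intercept and the numeric predictors from the document,
  // then applies them to the supplied input values.
  def score(inputs: Map[String, Double]): Double = {
    val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
      .parse(new ByteArrayInputStream(pmml.getBytes("UTF-8")))
    val table = doc.getElementsByTagName("RegressionTable").item(0).asInstanceOf[Element]
    val predictors = doc.getElementsByTagName("NumericPredictor")
    val terms = (0 until predictors.getLength).map { i =>
      val p = predictors.item(i).asInstanceOf[Element]
      p.getAttribute("coefficient").toDouble * inputs(p.getAttribute("name"))
    }
    table.getAttribute("intercept").toDouble + terms.sum
  }

  def main(args: Array[String]): Unit =
    println(score(Map("x" -> 3.0))) // 1.0 + 2.0 * 3.0
}
```

The point of the format is that the same file could be produced by one framework and scored by another, since everything needed for a prediction is stated in the document itself.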

We can perform model scoring either with a stream-processing engine or a stream-processing library.

We can use Akka Streams, a stream-processing library built on Akka Actors.
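A minimal sketch of the Akka Streams syntax, assuming Akka 2.6 or later and the `akka-stream` dependency on the classpath. The `score` function is a hypothetical stand-in for a real PMML evaluator, and the object and system names are illustrative.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object AkkaScoringSketch {
  // Hypothetical stand-in for a PMML evaluator: y = 1.0 + 2.0 * x.
  def score(x: Double): Double = 1.0 + 2.0 * x

  def main(args: Array[String]): Unit = {
    // Since Akka 2.6 the implicit ActorSystem also supplies the stream materializer.
    implicit val system: ActorSystem = ActorSystem("scoring")

    // Source -> Flow -> Sink: each element is scored as it passes through.
    val scoring = Flow[Double].map(score)
    val done = Source(List(0.0, 1.0, 2.0))
      .via(scoring)
      .runWith(Sink.foreach[Double](y => println(y)))

    Await.result(done, 10.seconds)
    system.terminate()
  }
}
```

In a real deployment the `Source` would typically be a message queue or socket rather than an in-memory list; the scoring `Flow` stays the same.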



According to their website,
"Apache Spark is a unified analytics engine for large-scale data processing."
Spark ML is a practical and scalable machine learning library built on the Dataset abstraction.
Dataset[A].map(fn: A => B): Dataset[B]
Dataset[A].flatMap(fn: A => Dataset[B]): Dataset[B]
Dataset[A].filter(fn: A => Boolean): Dataset[A]

Dataset[Row]
Transformer
Estimator
Pipeline

val pmmlBuilder = new PMMLBuilder(schema, pipelineModel)
pmmlBuilder.build()

See the official jpmml-sparkml GitHub repo for a complete list of supported PipelineStage types.
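The pieces above can be put together as follows. This is a sketch that assumes Spark ML and jpmml-sparkml are on the classpath; the column names, the toy training data, and the choice of `VectorAssembler` plus `LinearRegression` are illustrative assumptions, not taken from the original slides.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession
import org.jpmml.sparkml.PMMLBuilder

object SparkPmmlSketch {
  // A Pipeline chains a Transformer (VectorAssembler) and an Estimator (LinearRegression).
  def pipeline(): Pipeline = {
    val assembler = new VectorAssembler()
      .setInputCols(Array("x1", "x2"))
      .setOutputCol("features")
    val regression = new LinearRegression()
      .setFeaturesCol("features")
      .setLabelCol("y")
    new Pipeline().setStages(Array(assembler, regression))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pmml-export")
      .getOrCreate()
    import spark.implicits._

    // Toy training data (illustrative only).
    val training = Seq((1.0, 2.0, 5.0), (2.0, 1.0, 4.0), (3.0, 3.0, 9.0), (4.0, 1.0, 6.0))
      .toDF("x1", "x2", "y")

    // Fitting the Pipeline yields a PipelineModel, which PMMLBuilder converts to PMML.
    val pipelineModel = pipeline().fit(training)
    val pmmlBuilder = new PMMLBuilder(training.schema, pipelineModel)
    val pmml = pmmlBuilder.build()

    println(pmml.getModels.size())
    spark.stop()
  }
}
```

The schema passed to `PMMLBuilder` is that of the training Dataset, so the exported model knows the names and types of its input fields.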
We can use Openscoring, a Java-based REST web service, as the scoring engine for the resulting PMML model.
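As a sketch of how a client might talk to Openscoring, the example below builds the two requests involved: a PUT that deploys a PMML file under a model id, and a POST that sends one record for evaluation. The base URL, the model id `demo`, the file name `model.pmml`, and the helper names are assumptions. The endpoint layout and JSON shape follow the Openscoring README as commonly documented and should be checked against the version in use. It uses the JDK 11+ `java.net.http` client.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.nio.file.Paths

object OpenscoringSketch {
  // Assumed base URL of a locally running Openscoring server.
  val baseUrl = "http://localhost:8080/openscoring"

  // PUT the PMML file to deploy it under the given model id.
  def deployRequest(modelId: String, pmmlPath: String): HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$baseUrl/model/$modelId"))
      .header("Content-Type", "text/xml")
      .PUT(HttpRequest.BodyPublishers.ofFile(Paths.get(pmmlPath)))
      .build()

  // Builds the JSON evaluation request by hand to avoid a JSON dependency.
  def evaluationBody(requestId: String, arguments: Map[String, Double]): String = {
    val fields = arguments.map { case (name, value) => s""""$name": $value""" }.mkString(", ")
    s"""{"id": "$requestId", "arguments": {$fields}}"""
  }

  // POST a single record to the deployed model for scoring.
  def evaluateRequest(modelId: String, body: String): HttpRequest =
    HttpRequest.newBuilder(URI.create(s"$baseUrl/model/$modelId"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(body))
      .build()

  def main(args: Array[String]): Unit = {
    val client = HttpClient.newHttpClient()
    val asString = HttpResponse.BodyHandlers.ofString()

    val deployed = client.send(deployRequest("demo", "model.pmml"), asString)
    println(deployed.statusCode())

    val body = evaluationBody("record-001", Map("x" -> 3.0))
    val scored = client.send(evaluateRequest("demo", body), asString)
    println(scored.body())
  }
}
```

Running `main` requires a live Openscoring server and an existing `model.pmml` file; the request builders can be inspected without either.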