Feature Engineering, Data Pipelines & Nothing More.
What is Scribble Enrich?
Scribble Enrich is a feature engineering platform. It sits behind our customers’ firewalls, takes data from a lake or other store, and turns it into features. It is built to scale and includes guardrails that help data science teams work faster, whether they are training models or deploying them.
Enrich streamlines the most laborious parts of ML model training and productionization, with a focus on auditability, reproducibility, and high per-core compute efficiency.
Design Principles Behind the Enrich Platform
Audit framework to increase trust
Scale & Performance
Prepare to scale models with new use cases and new data
Application framework to cut down development time for each model
Rapid iteration on feature and model development
Components & Architecture
A lightweight data catalog to continuously document what is in the data store
Generate labeled datasets or extend master data for richer features
Search interface to understand lineage of every dataset
A programmable health check monitor of data flowing into the data store
Versioned auditable feature computation pipelines
Discover features being computed by the system (for status & reuse)
Extend data by linking with third-party datasets
Monitor model performance
Filter and export datasets
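To make two of the components above concrete, here is a minimal Python sketch of how a programmable health check and a versioned, auditable feature computation pipeline might fit together. It is an illustration only and not the Enrich API: the FeaturePipeline class, the check and transform functions, and the audit record fields are all hypothetical names chosen for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass
class FeaturePipeline:
    """Hypothetical versioned pipeline: run checks, apply transforms, emit an audit record."""
    name: str
    version: str
    checks: List[Callable[[List[Row]], bool]] = field(default_factory=list)
    transforms: List[Callable[[Row], Row]] = field(default_factory=list)

    def run(self, rows: List[Row]) -> dict:
        # Health checks run first; any failure stops the pipeline.
        failed = [check.__name__ for check in self.checks if not check(rows)]
        if failed:
            raise ValueError(f"health checks failed: {failed}")

        # Apply each transform in order to a copy of every row.
        features = []
        for row in rows:
            new_row = dict(row)
            for transform in self.transforms:
                new_row = transform(new_row)
            features.append(new_row)

        # The audit record ties the output to the pipeline version and exact input.
        digest = hashlib.sha256(json.dumps(rows, sort_keys=True).encode()).hexdigest()
        audit = {
            "pipeline": self.name,
            "version": self.version,
            "input_sha256": digest,
            "rows_in": len(rows),
            "rows_out": len(features),
            "transforms": [t.__name__ for t in self.transforms],
        }
        return {"features": features, "audit": audit}


def no_missing_amount(rows: List[Row]) -> bool:
    # Example health check: every row must carry an "amount" field.
    return all("amount" in row for row in rows)


def add_amount_squared(row: Row) -> Row:
    # Example feature: the square of the transaction amount.
    row = dict(row)
    row["amount_sq"] = row["amount"] ** 2
    return row


pipeline = FeaturePipeline(
    name="txn_features",
    version="1.0.0",
    checks=[no_missing_amount],
    transforms=[add_amount_squared],
)
result = pipeline.run([{"amount": 2.0}, {"amount": 3.0}])
```

Because the audit record stores the pipeline version alongside a hash of the input, rerunning the same version on the same data can be verified to start from identical inputs, which is one simple way to support reproducibility.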