SCRIBBLE

ENRICH

Feature Engineering, Data Pipelines
& Nothing More.
What is Scribble Enrich?

Enrich is a customizable Feature Store, built for data science teams that need efficiency and trust in their datasets.

 

Enrich ensures that each feature, and so each dataset built using these features, is reproducible, versioned, quality-checked, and searchable.

As a result, data science teams can deploy models faster and with more confidence in the underlying data. Retraining and debugging get quicker too, which shortens the turnaround for each new model version.

 

 

Enrich:

  • Is highly customizable, and gives ML engineers an SDK

  • Streamlines data transforms and manages complexity using versioned pipelines to ease retraining and debugging

  • Eases collaboration across data teams via its feature marketplace

  • Provides target datasets that feed APIs for applications

  • Helps address emerging requirements of explainability, provenance, and auditability

CAPABILITIES & BENEFITS


Data Transforms

Benefit

 

  • Stress-tested, reusable modules

  • Fast development

Modules with parameter validation, input/output validation, and documentation

01

Feature Marketplace

Benefit

 

  • Reuse of features, better models 

App to discover computed datasets and their statistical attributes.

04

Metadata Tracking

Benefit

 

  • No dataset without metadata

  • Build lineage & other applications

Data reads and writes, processes, quality metrics, and state changes

02

Built-in Apps

Benefit

 

  • Needs across the lifecycle addressed

Multiple apps, including a lightweight catalog, lineage search, and a simple labeller.

05

Controlled Deployment

Benefit

 

  • Link all datasets to code commits

Deploy from GitHub, with online upgrades

03

Meta-Computation

Benefit

 

  • Enable monitoring

Compute over data and metadata, such as lineage, drift, and expectations.

06
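
To make the Meta-Computation capability above more concrete, here is a minimal sketch of one way drift between two samples of a feature could be measured. It uses the Population Stability Index (PSI), a common drift measure; the page does not say which measure Enrich uses, and the function name, bin edges, and sample values are all assumptions made for illustration. It is not Enrich SDK code.

```python
import math


def population_stability_index(baseline, current, edges):
    """PSI between two samples, computed over shared bin edges.

    Hypothetical helper for illustration only; not part of any SDK.
    """

    def shares(values):
        counts = [0] * (len(edges) - 1)
        for v in values:
            for i in range(len(edges) - 1):
                is_last_bin = i == len(edges) - 2
                # Bins are half-open, except the last one, which
                # also includes its upper edge.
                if edges[i] <= v < edges[i + 1] or (is_last_bin and v == edges[-1]):
                    counts[i] += 1
                    break
        total = sum(counts)
        if total == 0:
            raise ValueError("no values fall inside the bin edges")
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / total, 1e-6) for c in counts]

    b, c = shares(baseline), shares(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


baseline = [1, 2, 3, 4, 5, 6, 7, 8]
shifted = [5, 6, 7, 8, 8, 8, 7, 6]
edges = [0, 4, 8]

psi_same = population_stability_index(baseline, baseline, edges)
psi_shifted = population_stability_index(baseline, shifted, edges)
print(psi_same, psi_shifted)
```

Identical samples give a PSI of zero, and the value grows as the two distributions diverge, so a monitoring job could compare it against a threshold chosen by the team.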

Components & Architecture

Use cases

Track utilization of the features along with ownership

Implement

SDK and other services to rapidly implement feature engineering modules

Operate

Administer versioned, auditable, parameterized pipelines, each generating multiple data sets.

Audit

Check provenance of datasets by name or other attributes, and compare runs

Access

Discover datasets via a marketplace for features, along with a search interface to build cohorts for analysis

Monitor

Check drift and access other custom usage monitoring services

How it Works

Enrich handles the complexity of computation and data semantics by providing a Python SDK to develop, document, and test feature engineering modules (transforms, pipelines, scheduling, etc.), along with controlled execution on the server side.
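
As a rough illustration of what a documented, testable feature engineering module can look like, here is a minimal sketch in plain Python. It does not use the Enrich SDK, and the class name, parameters, and column names (`SpendPerVisit`, `min_visits`, `customer_id`, `total_spend`, `visits`) are hypothetical. It only shows the pattern of validating parameters, input, and output around a transform.

```python
from dataclasses import dataclass


@dataclass
class SpendPerVisit:
    """Hypothetical transform: adds a `spend_per_visit` feature to each row.

    Not the Enrich SDK API; an illustration of a module that validates
    its parameters, its input, and its output.
    """

    min_visits: int = 1
    version: str = "0.1.0"
    required_columns: tuple = ("customer_id", "total_spend", "visits")

    def __post_init__(self):
        # Parameter validation: a threshold below 1 would allow division by zero.
        if self.min_visits < 1:
            raise ValueError("min_visits must be >= 1")

    def validate_input(self, rows):
        # Input validation: every row must carry the required columns.
        for i, row in enumerate(rows):
            missing = [c for c in self.required_columns if c not in row]
            if missing:
                raise ValueError(f"row {i} is missing columns: {missing}")

    def validate_output(self, rows):
        # Output validation: the computed feature must be non-negative.
        for row in rows:
            if row["spend_per_visit"] < 0:
                raise ValueError("spend_per_visit must be non-negative")

    def run(self, rows):
        self.validate_input(rows)
        # Rows below the visit threshold are dropped, not divided.
        out = [
            {**row, "spend_per_visit": row["total_spend"] / row["visits"]}
            for row in rows
            if row["visits"] >= self.min_visits
        ]
        self.validate_output(out)
        return out


rows = [
    {"customer_id": "a", "total_spend": 120.0, "visits": 4},
    {"customer_id": "b", "total_spend": 30.0, "visits": 0},
]
features = SpendPerVisit(min_visits=1).run(rows)
print(features)
```

Keeping the checks inside the module means every pipeline that reuses it gets the same guarantees, which is the idea behind stress-tested, reusable modules.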

 

The server provides an interface to discover, operate and audit the resulting features or datasets.

Hooks at either end of Enrich allow for understanding (cataloguing) input data stores, and surfacing features at any frequency through APIs for downstream consumption, by defining data contracts and integration points. 
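
One simple way to picture a data contract at an integration point is as an agreed set of field names and types that each surfaced record is checked against. The sketch below assumes that shape; the page does not describe how Enrich defines contracts, and `CONTRACT` and `contract_violations` are hypothetical names, not part of the Enrich SDK.

```python
# Hypothetical contract: field name -> expected Python type.
CONTRACT = {
    "customer_id": str,
    "spend_per_visit": float,
}


def contract_violations(record, contract):
    """Return a list of readable violations; an empty list means the record conforms."""
    problems = []
    for name, expected_type in contract.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(
                f"{name}: expected {expected_type.__name__}, "
                f"got {type(record[name]).__name__}"
            )
    return problems


good = {"customer_id": "a", "spend_per_visit": 30.0}
bad = {"customer_id": "a", "spend_per_visit": "30"}

good_result = contract_violations(good, CONTRACT)
bad_result = contract_violations(bad, CONTRACT)
print(good_result)  # []
print(bad_result)   # ['spend_per_visit: expected float, got str']
```

A downstream consumer can run a check like this before accepting features from an API, so contract breaks surface at the boundary instead of inside a model.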

 

For data scientists, the Enrich feature store experience simplifies, standardizes, and speeds up the model development process, and gives them confidence in the performance of the resulting models.


ADVANCED DEVELOPER ACCESS

Configurable and interoperable via Python SDK

