Resources
Blog posts

Fine-tuning Large Language Models: Complete Optimization Guide
Let’s say you buy a high-performance sports car, fresh off the production line. It’s capable, versatile, and ready to take on most driving conditions with ease. But what if you have a specific goal in mind – let’s say, winning a championship in off-road rally racing? The sports car, for all its inherent capabilities, would […]
Read More
Understanding Prompt Engineering: Introduction, Techniques and Future Perspective
Prompt engineering is a fascinating new frontier in the world of AI that is rapidly gaining momentum as the world at large awakens to the potential of LLMs. Research in the field of prompt engineering has exponentially ramped up in the last couple of years since consumer applications such as ChatGPT have taken the Internet […]
Read More
Large Language Models 101: History, Evolution and Future
Imagine walking into the Library of Alexandria, one of the largest and most important libraries of the ancient world, filled with countless scrolls and books representing the accumulated knowledge of the entire human race. It’s like being transported into a world of endless learning, where you could spend entire lifetimes poring over the insights of […]
Read More
Foundation Models 101: A step-by-step guide for beginners
The emergence of foundation models represents a seismic shift in the world of artificial intelligence. Foundation models are like digital polymaths, capable of mastering everything from language to vision to creativity. Have you ever wanted to know what a refined, gentlemanly Shiba Inu might look like on a European vacation? Of course you have. Well, […]
Read More
Managing The Organizational Impact of Bad Data
Big data is an indispensable part of our modern existence, powering several real-world applications such as personalized marketing, healthcare diagnostics, fraud prevention and many more that have transformed the way we live, work, and communicate with each other. However, since big data has become such a critical component of organizational decision-making, it is imperative to […]
Read More
How Data Products Can Help Overcome Data Consumption Challenges
Data has grown in importance as a commercial asset, with many companies investing considerably in data collection and transformation. Nevertheless, data collection is not the biggest challenge; what businesses do with it is. In the age of big data, another crucial difficulty is guaranteeing quality. Moreover, firms frequently face data management difficulties such as inefficient […]
Read More
Data Product Lifecycle: Evolution and Best Practices
Data products have exploded in popularity over the last few years. As an industry, we are where the automobile industry was around the turn of the 20th century. We are slowly transitioning from building hand-crafted, exclusive products for Big Tech customers to widespread commoditization. Soon, efficiency, maintenance, standards, and assembly lines are going to be […]
Read More
4 Advanced Analytics Techniques to Improve Decision-Making
In today’s data-driven business landscape, organizations are constantly pressured to make faster, more informed decisions that drive better outcomes. According to Forbes, 53% of companies use big data analytics to inform business decisions. An HBR study points out that companies that use data-driven decision-making are 6% more profitable than those that don’t. However, with […]
Read More
What are Data Products?
“There were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every two days.” – Eric Schmidt, Google
Human beings are now, on a semi-daily basis, generating and collecting data that equals the volume of the total collective knowledge of our species till around the […]
Read More
Case Studies

Finding Conversion Anomalies at a Large E-Commerce Firm
Learn how a leading multi-billion dollar e-commerce company used Enrich to identify anomalies in their conversion rates and to find out their causal factors.
Read More
Streamlined data insights and agile data preparation for Terrapay
Learn how Terrapay, a leading cross-border payment infrastructure solution provider, built the Terrapay Intelligence Platform (TIP) with Enrich to achieve operational efficiency through use cases such as Forecasting, Partner Performance Analytics, Customer Journey Analytics, and more.
Read More
How Mars Took Steps to Evaluate the Potential Impact of the “Great Resignation”
Learn how Mars, a Fortune 100 CPG company, collaborated with Scribble Data to assign a “probability of attrition” through data and ML modeling.
Read More
Accelerated ML Engineering for a Leading E-Commerce Brand
Learn how a leading e-commerce brand selling children’s apparel built its data intelligence platform on Scribble Data, supporting the rapid development and deployment of use cases such as Product Listing Optimization and Re-ordering.
Read More
Understanding Shopping Paths at a National Mall Chain
Learn how a nationwide mall chain used Scribble Data’s Enrich platform to identify patterns of shopper footfalls, determine the timing and location of ads, and achieve a significant month-over-month increase in revenue.
Read More
A National Level Retail Store Chain
A national-level retail chain in India leverages Scribble Data Enrich to develop an accurate, fine-grained understanding of its buyer personas, their distribution, demand, and context in order to address multiple operational use cases.
Read More
Videos
Lifecycle of a Data Product with Dr. Venkata Pingali
Watch this session where Dr. Venkata Pingali, Founder & CEO of Scribble Data, shares his perspective on Data Products, their types, and their lifecycle with the Data Heroes community.
Watch Now
Customer Testimonial: Cloudphysician
Dileep Raman, Cloudphysician’s Co-founder and Chief of Healthcare, talks about how Scribble Data enabled them to rapidly build pipelines to transform their data and get daily updated feature sets as well as trustworthy models – all in less than 4 weeks!
Watch Now
Customer Testimonial: Mars, Inc.
Dr. Vidyotham Reddi of Mars, Incorporated, a leading US-based multinational CPG manufacturer of confectionery, pet food, and other food products and a provider of animal care services, talks about his experience working with Scribble Data. Learn how his team at Mars was able to assign a “probability of attrition” to employees, calculated based on which […]
Watch Now
What’s the deal with sentient AI? – Achint Thomas
Sentience in AI has always been the holy grail for computer science. What qualifies as AI sentience, and what is just another case of a model mimicking the data it’s trained on?
Watch Now
Anatomy of a production ML feature engineering platform – Venkata Pingali
This talk draws upon Scribble’s experience in building and evolving a production feature engineering platform, and the many conversations we have had with user data scientists. The talk will focus on the learnings, and not on the Scribble product itself, and expand on the talk from Fifth Elephant Mumbai in Jan 2019 on reducing […]
Watch Now
Accelerating ML using Production Feature Engineering Platform by Venkata Pingali
Anecdotally, only 2% of the models developed are productionized, i.e., used day to day to improve business outcomes. Part of the reason is the high cost and complexity of productionizing models, which is estimated to account for anywhere from 40 to 80% of the overall work.
Watch Now
Global Feature Store Meetup #13 – Scribble Data
Feature stores have been traditionally designed for complex ML applications (Big-ML) that normally assume clear and high value propositions, long lead times, skilled staff, and advanced methods. Sub-ML is a space of mid-complexity ML applications where there is higher uncertainty in terms of value, methods used, available staffing, and speed is critical. Sub-ML is interesting […]
Watch Now
Operationalizing responsible Machine Learning
ML models have to be both economically viable and FAccT (Fair, Accountable, Transparent). The terminology is new, but the need to defend models or to attest that they can be trusted is not. Such requirements have applied to credit scoring models since the 1970s. What has changed is the scale and scope.
Watch Now
Experimentation in Data Science
The ‘science’ in Data Science refers to the process of developing a systematic understanding of the world through observation and experimentation. This science is happening in the context of fast-moving organizations, in near real time, and is carried out by people with varied backgrounds. The most familiar form of experimentation is A/B testing.
Watch Now