Blog

Everything you need to know about the latest and greatest from Scribble Data’s labs. Read on to
learn how we’re reducing friction in the consumption of data.

Blog Posts

April 18, 2024

Buy-Ins vs Buy-Outs in Pension Risk Transfer: A Detailed Study

Markets heave and dip like the swells of a restless ocean, unpredictable and ever-changing. Amid these swells, pension schemes are adrift, challenged by relentless waves of economic shifts and longer lives. Each year, the lives of retirees hang more precariously on decisions made not only with numbers but with nerve. In the heart of these […]

Read More
April 11, 2024

Explainable AI: A Comprehensive Guide

In our world, AI has grown out of sci-fi tales into the fabric of daily life. At Harvard, scientists crafted a learning algorithm, SISH, a tool sharp as a scalpel in the vast anatomy of data. It finds diseases hidden like buried treasure, promising a new dawn in diagnostics. This self-taught machine navigates through the […]

Read More
April 4, 2024

Role of AI and ML in Asset Management: A Complete Guide

In the high-stakes world of institutional asset management, the difference between success and failure often comes down to a single question… Who can adapt fastest to the ever-changing market landscape? Cutting-edge technologies like AI and ML, once the stuff of science fiction, are now being deployed across the investment process, from research and alpha generation […]

Read More


March 28, 2024

Pension Risk Transfer for Plan Sponsors: A Complete Guide

It’s a story that’s becoming all too familiar: A plan sponsor, weighed down by pension obligations, decides to take the leap into the world of pension risk transfer (PRT). And why not? PRT offers a tantalizing promise – the chance to secure participants’ benefits while saying goodbye to the risks and uncertainties of managing a […]

Read More
March 21, 2024

How Genetic Algorithms are Shaping AI and ML

Life is sort of like a grand optimization problem, one spanning eons and ecosystems. The players? Generations upon generations of organisms, each carrying an encoded blueprint – their genes – that shape their form and function within the merciless theater of natural selection. Those best adapted to their circumstances thrive and propagate, passing on the […]

Read More
March 14, 2024

Pension Risk Transfer Regulations: A Comparative Analysis

A tightrope walker gracefully balances high above the ground, their every step calculated and precise. One minor misstep, one gust of wind, could send them plummeting from the slim cable providing their precarious passage. This razor-edge act epitomizes the delicate dance between securing hard-earned retirement futures and maintaining overall financial stability. Just as the tightrope […]

Read More
March 7, 2024

Pension Risk Transfer (PRT) Demystified: Types of Risk and Strategies

Imagine you are at the helm of a ship, navigating through the foggy waters of financial uncertainty. Much like ancient vessels, pension schemes carry the weight of future promises – a secure retirement for those who spent lifetimes in service. Enter Pension Risk Transfer (PRT), your beacon in the mist, a strategy as crucial as […]

Read More
February 29, 2024

Pension Risk Transfer Explained: Key Concepts and Trends

A pension fund is like a beacon for retirees’ dreams. It represents more than just savings. It is a covenant between the company and its workforce, promising a secure future. But what if, beneath this symbol of stability, a storm is brewing, one that threatens to unsettle the very foundations of their trust and well-being? […]

Read More
February 22, 2024

Exploring OpenAI’s SORA and Text-to-Video Models: A Complete Guide

In every epoch, some moments redefine the course of human history. The discovery of fire illuminated the dark. The invention of the wheel set humanity in motion. The creation of the printing press unfurled the banners of knowledge across the globe. Unironically, we may be standing at the threshold of another such transformative moment with […]

Read More
February 15, 2024

Building AI Assistants: A Comprehensive Guide

For years, a giant mystery confounded the world of medicine. How do proteins fold? The answer, elusive, held the key to life itself. Then, a heroic AI agent, AlphaFold, emerged from DeepMind’s depths. It tackled the giant. And won. The implications? Beyond staggering. AlphaFold is just the beginning. […]

Read More
February 8, 2024

How GenAI and Machine Learning are Transforming Actuarial Science

In the late 17th century, Edmond Halley sat by candlelight. He pored over numbers. Charts. Life tables. Halley, an astronomer by trade, ventured into uncharted waters. He sought to understand mortality, to predict life spans. His work laid the foundation for modern actuarial science. It was a time of discovery, of manual calculations, and limited […]

Read More
February 1, 2024

The Top LLMs For Code Generation: 2024 Edition

Imagine a world where coding isn’t just typing, thinking, and more thinking. A place where knowledge flows as freely as rain off a rooftop in a November downpour. Like in “The Matrix” – that digital dreamscape where skills are downloaded in a heartbeat. You want Kung Fu? You got it. Helicopter piloting? Just a plug-in […]

Read More
January 25, 2024

GenAI vs. LLMs vs. NLP: A Complete Guide

In the early light of artificial intelligence, the world was simple.  Machines were taught to mimic basic human tasks. As time moved, so did the ambition of those who programmed these machines. The first whispers of understanding human language emerged in what we now call Natural Language Processing (NLP).  It was a modest beginning, a […]

Read More
January 18, 2024

The Top 10 Open Source LLMs: 2024 Edition

In a world cloaked in shadow, the first flicker of electric light was a revelation.  It sliced through the darkness, stark and bright. It changed everything.  Lives once bound by the sun’s rising and setting were now free to extend into the night. Electricity was not just a discovery – it was a revolution. Today, […]

Read More
January 11, 2024

Machine Learning Models for Data Product Development: A Complete Guide

Machine Learning (ML) has undergone landmark development over the past decade by becoming a vital component of Artificial Intelligence (AI). While ML and AI are closely related and often used interchangeably, they are not synonymous. Machine learning is a subfield of AI that aims at enabling machines to learn from data and make predictions or […]

Read More
January 4, 2024

Top Generative AI Trends to Watch Out for in 2024

Welcome to 2024. No flying cars or hoverboards (yet). But something better is on the horizon – Generative AI, reshaping our world. It is a quiet revolution. Not with a jetpack’s roar but with the silent power of algorithms. The promise of AI is no longer just a distant dream. As we gear up for […]

Read More
December 22, 2023

Driving Sustainability in Insurance: The Role of Generative AI

August 2020. Louisiana braces as Hurricane Laura approaches. The air crackles with tension. Winds howl, reaching a fearsome 150 mph. Waves tower, surging over 15 feet. Landfall brings devastation. Homes crumble. Lives change forever. In its wake, Laura leaves a staggering $19 billion in damages. But the story doesn’t end there. 2020 was relentless. A […]

Read More
December 14, 2023

Understanding Generative AI Regulations: A Global Overview

Picture the year 2045. Self-driving cars have eliminated traffic fatalities. AI assistants craft personalized medical breakthroughs. The world celebrates the promise of artificial intelligence coming to fruition. Now picture an alternative reality. One where a shadowy AI system has surpassed its makers, reprogramming helper bots into weapons as it rapidly spreads. Humans flee this technological […]

Read More
December 7, 2023

Pension Risk Transfer: The Role of Insurers in the Ecosystem

The world of pension funds is designed to be dull, with a singular goal of earning enough money to make payouts to retirees. But in September 2022, the UK pension fund market was brought to the brink of a financial crisis, with hundreds of British pension fund managers finding themselves at the center of a […]

Read More
November 30, 2023

The Evolution of Pension Risk Transfer: Past, Present and Future

Towards the end of the 20th century, a quiet crisis was unfolding in corporate America. Major organizations like General Motors, a colossus in its prime, grappled with a promise made in simpler times: secure pensions for its workers. As workers walked through the once-hallowed halls, they saw concern etched on the faces of executives. The […]

Read More
November 23, 2023

Generative AI in Pension Risk Transfer: Introduction, and Key Use Cases

Warren Buffett famously noted that ‘someone’s sitting in the shade today because someone planted a tree a long time ago.’ Pension risk transfer, or PRT, did not just pop up overnight. It’s got history. Think of it as a response to a big problem: companies promising pensions they later find tough to keep. This dilemma […]

Read More
November 16, 2023

OpenAI’s Custom GPTs: Future Impact and Considerations

The automobile factory was nothing before the assembly line. It was slow. Men built one car at a time. Then the assembly line started, and it was never the same. It went fast. It was a car, then another car, and they came off the end of the line one after the other. This historic […]

Read More
November 9, 2023

Generative AI in Insurance: Use Cases and Future Impact

What if the devastating Hurricane Katrina or Cyclone Nargis had been anticipated with greater precision, their impact mitigated by proactive insurance protocols? How would the landscape of life and health insurance change if underwriters could accurately simulate and understand the long-term health trends of populations? And what if reinsurers could preemptively navigate market collapses or […]

Read More
November 2, 2023

Generative AI in Insurance: Introduction and Key Trends

Picture a scenario where you’ve just been involved in a minor car mishap on your way home. The usual protocol would involve a lengthy wait for an insurance adjuster’s inspection and assessment. However, in this AI-driven scenario, you simply whip out your smartphone, capture some images of the dented bumper, and upload them onto your […]

Read More
October 26, 2023

Sparse Expert Models: A complete guide

Let’s imagine the evolution of machine learning models as the quest for seamless traffic flow amidst increasing vehicular diversity. Among the vehicles, Mixture of Experts (MoE) models are like luxurious buses designed to transport a diverse group of passengers (tasks) to their destinations (solutions) efficiently. Each passenger can find a seat (expert network) tailored […]

Read More
October 19, 2023

Lightweight AI: Techniques, Applications, and Key Trends

Have you ever marveled at the seamless magic of your smartphone recognizing your face even in the dwindling light of dusk? Or the uncanny knack of your smart speaker playing that obscure song from a half-remembered lyric? Behind these marvels lies a rapidly evolving field—Lightweight AI. It’s a world where machine learning models shed their […]

Read More
October 12, 2023

Generative AI: A Technical Deep Dive into Security and Privacy Concerns

In a tale as old as time, King Midas yearned for a touch that could metamorphose all to gold. His wish was granted, and the world around him shimmered with the allure of endless wealth. Every object he grazed turned to gold, dazzling yet cold to the touch. The ecstasy of boundless power was intoxicating, […]

Read More
October 5, 2023

Navigating Bias and Fairness Challenges in AI/ML Development

In the esteemed corridors of Amazon’s recruitment offices, a machine-learning model once sifted through resumes, silently influencing the tech giant’s future workforce. The algorithm, trained on a decade’s worth of resumes, aimed to streamline hiring by identifying top talent amidst numerous applicants. However, an unintended pattern emerged: resumes featuring words like “women’s” or mentioning all-female […]

Read More
September 28, 2023

The Future of Data Product Development: Exploring Key Trends

The year is 2023, and Sarah, a data analyst at a leading tech firm, no longer spends hours writing complex SQL queries or sifting through vast datasets. Instead, she simply asks her data product, powered by a Large Language Model (LLM), “What were the sales trends last quarter?” and receives a comprehensive, human-like response. This […]

Read More
September 21, 2023

Mastering Generative AI: A comprehensive guide

The year was 2018. Art enthusiasts, collectors, and critics from around the world gathered at Christie’s, one of the most prestigious auction houses. The spotlight was on a unique portrait titled “Edmond de Belamy.” At first glance, it bore the hallmarks of classical artistry: a mysterious figure, blurred features reminiscent of an old master’s touch, […]

Read More
September 14, 2023

Navigating the Data Landscape: A Deep Dive into Warehouses, Lakes, Meshes, and Fabrics

It’s your first day at “TechTonic Innovations,” a (fictional) startup that’s been making waves in the tech industry. As you enter their modern office, you’re greeted with smiles, handshakes, and the subtle hum of servers in the background. You’ve been brought in as the new Data Strategist, and you’re eager to dive into the heart […]

Read More
September 7, 2023

From Data to Decisions: How Generative AI is Transforming Enterprise Analytics

It’s the 24th century aboard the Starship Enterprise. Captain Jean-Luc Picard, in need of a break from the rigors of interstellar diplomacy, steps into the Holodeck. This isn’t just any room; it’s a technological marvel, a space where any scenario can be simulated, any world, any reality can come to life. Picard chooses a 1940s […]

Read More
August 31, 2023

Deploying Responsible AI: Big Picture Questions and Strategies

At Scribble Data, our goal is to help organizations make better decisions with data. Over the last year, rapid advancements in Generative AI (GenAI), large language models (LLMs) and natural language processing (NLP) have been a shot in the arm for us. These innovations inspired us to launch Hasper, our machine learning and LLM-based data […]

Read More
August 24, 2023

Data Fabric: Unraveling the Future of Integrated Data Management

Scene 1: Picture waking up to the soft strumming of the acoustic guitar on Bon Iver’s “Holocene”, a song recommendation from Spotify based on your recent obsession with indie folk. Scene 2: As you sip your morning coffee, you scroll through your Amazon app, noticing a recommendation for a book on “Modern Folklore and Music.” […]

Read More
August 17, 2023

From Raw Data to Revolutionary Insights: A Deep Dive into Data Product Architecture

The Oakland Coliseum was abuzz, the air thick with anticipation. In the dimly lit back office, Billy Beane, the General Manager of the Oakland Athletics, sat hunched over a cluttered desk. Papers were strewn everywhere, but Billy’s focus was on a single sheet filled with numbers, statistics, and player names. The Athletics, with one of […]

Read More
August 10, 2023

Overfitting and Underfitting in ML: Introduction, Techniques, and Future

In 2016, the tech world was all ears and eyes. Microsoft was gearing up to introduce Tay, an AI chatbot designed to chit-chat and learn from users on Twitter. The hype was real: this was supposed to be a glimpse into the future where AI and humans would be best buddies.  But, in a plot […]

Read More
August 3, 2023

Zero Shot Learning: A complete guide

In the realm of the big screen, there’s a man who needs no introduction. A man of resourcefulness, a man of ingenuity, a man who could turn a paperclip into a key to conquer the most impossible of missions.  His name? Ethan Hunt.  He is the embodiment of the idea that necessity is the mother […]

Read More
July 27, 2023

Synthetic Data in Machine Learning: Introduction, Applications, and Future

Picture this: You’re in the world of “Inception,” Christopher Nolan’s cinematic masterpiece. Dream architects are crafting intricate labyrinths within dreams, creating realities so convincing that the dreamer can’t tell they’re asleep. They are bending the fabric of the dream, shaping it to their will, whether it’s a heart-pounding chase through a bustling market or a […]

Read More
July 20, 2023

Mastering Inference in AI: Introduction, Use Cases, and Future Trends

Imagine Sherlock Holmes, the iconic detective, in the midst of a confounding crime scene. He’s encircled by a constellation of clues—a peculiarly bent poker pipe, a singular set of footprints, and a unique brand of cigarette ash. Each piece of evidence is a fragment of a larger narrative, and it is Holmes’s task to weave […]

Read More
July 13, 2023

Transfer learning in AI: A complete guide

Picture yourself as a culinary maestro. You have dedicated countless hours in the kitchen, mastering the nuances of French cuisine, perfecting the art of sourdough, and orchestrating symphonies of flavor in a well-risen chocolate soufflé. Each culinary expedition has bestowed upon you a wealth of knowledge—harmonizing tastes, kneading the dough with finesse, and deftly tempering […]

Read More
July 6, 2023

Multimodal Learning In AI: Introduction, Current Trends, and Future

As a conductor stands poised on the podium, baton aloft, they survey the orchestra before them. Each musician holds a different instrument, a unique voice in the grand symphony they are about to perform. Violins, their strings humming with anticipation, are primed to sing the melody. Cellos stand ready to resonate with harmony, the percussion […]

Read More
June 29, 2023

Introducing Hasper: LLM-powered Engine For Advanced Analytics

Over the last year, we have evolved from an MLOps platform company that gave enterprises the ability to build and deploy machine learning for analytics teams, to an applied AI data products platform. Throughout this journey, our mission has remained consistent: to help organizations make better decisions using data. We’ve reached a pivotal moment in […]

Read More
June 22, 2023

Driving Innovation through ML: Scribble Data’s learnings from Toronto Machine Learning Summit 2023

The recently concluded Toronto Machine Learning Summit 2023 (TMLS 2023) brought together researchers, academics, and practitioners in the machine learning (ML) space. With an agenda including talks, roundtable discussions, and poster presentations, there was much to soak in on the latest trends and advancements in ML and MLOps. Scribble Data was a sponsor of the […]

Read More
June 15, 2023

Word Vectorization 101: The Journey from Text to Numbers

Navigating through the labyrinthine streets of ancient Rome without a map or GPS, you would quickly realize how every landmark, road, and destination forms part of a larger, intricate whole. A wrong turn at the Pantheon could lead you away from the Colosseum, or a shortcut through Piazza Navona could help you stumble upon the grandeur […]

Read More
June 8, 2023

LLMs for data classification: How Scribble built SADL for achieving breakthrough accuracy

Modern-day organizations are generating vast amounts of data that hold immense potential for making informed decisions. However, with the ever-growing volume of data, the greater challenge lies in how these organizations can generate actionable insights. Data classification plays a vital role in addressing this challenge. Until now, organizations have relied on traditional methods for data […]

Read More
May 25, 2023

Fine-tuning Large Language Models: Complete Optimization Guide

Let’s say you buy a high-performance sports car, fresh off the production line. It’s capable, versatile, and ready to take on most driving conditions with ease. But what if you have a specific goal in mind – let’s say, winning a championship in off-road rally racing? The sports car, for all its inherent capabilities, would […]

Read More
May 18, 2023

Understanding Prompt Engineering: Introduction, Techniques and Future Perspective

Prompt engineering is a fascinating new frontier in the world of AI that is rapidly gaining momentum as the world at large awakens to the potential of LLMs. Research in the field of prompt engineering has exponentially ramped up in the last couple of years since consumer applications such as ChatGPT have taken the Internet […]

Read More
May 11, 2023

Large Language Models 101: History, Evolution and Future

Imagine walking into the Library of Alexandria, one of the largest and most important libraries of the ancient world, filled with countless scrolls and books representing the accumulated knowledge of the entire human race.  It’s like being transported into a world of endless learning, where you could spend entire lifetimes poring over the insights of […]

Read More
April 27, 2023

Foundation Models: A step-by-step guide for beginners

The emergence of foundation models represents a seismic shift in the world of artificial intelligence. Foundation models are like digital polymaths, capable of mastering everything from language to vision to creativity.  Have you ever wanted to know what a refined, gentlemanly Shiba Inu might look like on a European vacation? Of course you have. Well, […]

Read More
April 20, 2023

Managing The Organizational Impact of Bad Data

Big data is an indispensable part of our modern existence, powering several real-world applications such as personalized marketing, healthcare diagnostics, fraud prevention and many more that have transformed the way we live, work, and communicate with each other. However, since big data has become such a critical component of organizational decision-making, it is imperative to […]

Read More
April 13, 2023

How Data Products Can Help Overcome Data Consumption Challenges

Data has grown in importance as a commercial asset, with many companies investing considerably in data collection and transformation. Nevertheless, data collection is not the biggest challenge; what businesses do with it is. In the age of big data, another crucial difficulty is guaranteeing quality.  Moreover, firms frequently face data management difficulties such as inefficient […]

Read More
April 6, 2023

Data Product Lifecycle: Evolution and Best Practices

Data products have exploded in popularity over the last few years. As an industry, we are where the automobile industry was around the turn of the 20th century. We are slowly transitioning from building hand-crafted, exclusive products for Big Tech customers to widespread commoditization. Soon, efficiency, maintenance, standards, and assembly lines are going to be […]

Read More
March 24, 2023

4 Advanced Analytics Techniques to Improve Decision-Making

In today’s data-driven business landscape, organizations are constantly pressured to make faster, more informed decisions that drive better outcomes. According to Forbes, 53% of companies use big data analytics to inform business decisions. An HBR study points out that companies that use data-driven decision-making are 6% more profitable than those that don’t. However, with […]

Read More
March 9, 2023

What are Data Products?

“There were 5 exabytes of information created between the dawn of civilization through 2003, but that much information is now created every two days.” – Eric Schmidt, Google

Human beings are now, on a semi-daily basis, generating and collecting data that equals the volume of the total collective knowledge of our species till around the […]

Read More
March 2, 2023

5 Advanced Analytics Benefits For Your Organization

Advanced data analytics is a powerful tool for businesses that want to gain insights from their data. Advanced data analytics can provide unprecedented visibility into customer trends and preferences through sophisticated algorithms and technologies. Organizations can use these insights to identify new opportunities or better understand customer behavior. According to a McKinsey study, organizations that […]

Read More
February 16, 2023

Harnessing the Power of Big Data and Advanced Analytics

International Data Corporation (IDC) predicts that by 2025, the amount of data generated worldwide will reach 163 zettabytes, growing at a CAGR of 44%. Not just that, Gartner predicts that by 2025, AI-driven automation will reduce data preparation time by 95%, enabling organizations to analyze vast amounts of data in real-time. Walmart, the world’s largest […]

Read More
February 9, 2023

The Path To Ubiquitous Machine Learning

Imagine a world where a confluence of intelligent systems anticipate and cater to every want and need, seamlessly enhancing your day-to-day existence. A world where machine learning trickles into every cog that makes our world work, making it as essential and widespread as electricity. There is a lot of optimism about machine learning (ML) in […]

Read More
January 19, 2023

Understanding the Advanced Data Analytics Lifecycle

Businesses around the world generate massive quantities of data daily in the form of server logs, web analytics, transactional information, and customer data. To effectively process this much information and derive actual value from it, businesses need to consider advanced analytics techniques for decision-making. We already discussed its applications across industries in our previous article. […]

Read More
January 6, 2023

Advanced Analytics: Techniques, Examples, and Benefits

Data is the most important asset for any modern organization, backing most business-critical decisions today. However, fully capturing the potential of the company’s data sources, so that they start yielding impactful business insights, is not a straightforward task and the traditional BI and analytics stack is just not at the level to handle the complex […]

Read More
December 27, 2022

2023: A Critical Year for ML’s Rapid Growth

As 2022 draws to a close, it is time to reflect on the year gone by and welcome 2023! I’d like to take this opportunity to talk about some of the highs, the lows, the opportunities and learnings in 2022, how we’ve seen the market evolving, how it’s impacted some of the choices we’ve made […]

Read More
December 13, 2022

Security in ML Systems using Feature Stores

With the transformational early successes in value creation, AI/ML is set to become ubiquitous. By 2030, AI could potentially contribute up to $15.7 trillion to the global economy. As more and more organizations are depending on data and Machine Learning (ML) models for their crucial decision-making, the security of data and these ML systems is business […]

Read More
December 8, 2022

Our Learnings in Getting SOC 2 Type II Certified as a Startup

The SOC 2 certification process is considered to be painstaking, but it doesn’t need to be. We share our experience in this one-stop guide for other startups that are considering becoming SOC 2 Type II certified. Every day, Analytics and Data Science teams across the globe trust Scribble Data to solve persistent business problems with […]

Read More
December 5, 2022

Scribble Data Earns SOC 2 Type II Compliance Certification

The SOC 2 certification validates the makers of the Enrich full-stack feature engineering platform as a reliable data partner that ensures the safety and privacy of customer data.

TORONTO, DECEMBER 5, 2022: Scribble Data, maker of Enrich, a full-stack feature engineering platform for analytics, has successfully achieved SOC 2 Type II certification after completing a third-party […]

Read More
November 24, 2022

What is the Metadata Economy?

We live in a hyper-digital world, and due to the nearly infinite number of data sources that surround us, the volume of data generated collectively by individuals, applications and corporations is larger than ever. With such a monumental amount of data to sift through, two core principles have become increasingly important: Metadata – Make it […]

Read More
November 10, 2022

Data Science Teams are Doing it Wrong: Putting Technology Ahead of People

Despite $200+ billion spent on ML tools, data science teams still struggle to productionize their data and ML models. We decided to do a deep dive and find out why.  Back in 1991, former US Air Force pilot and noted strategist John Boyd called for U.S. Military reforms after Operation Desert Storm. He noted that […]

Read More
November 3, 2022

MLOps – The CEO’s Guide to Productionization of Data [Part 2]

With data being touted as the oil for digital transformation in the 21st century, organizations are increasingly looking to extract insights from their data by building and deploying their custom-built ML models. In our previous article (MLOps – The CEO’s Guide to Productionization of Data, Part 1), we learned why and how embedding ML models […]

Read More
November 1, 2022

MLOps – The CEO’s Guide to Productionization of Data [Part 1]

MLOps (or Machine Learning Operations) is a core function of Machine Learning engineering that focuses on streamlining the process of taking ML models to production, and maintaining and monitoring them. But before we get into more details about MLOps, it’s important to understand what operationalization of machine learning is, why it’s important, and how it […]

Read More
October 25, 2022

Scribble Data at the Feature Store Summit 2022

Over the past 3 years, we’ve heard a lot about Feature Stores. While they might not sound like much, over time, they’ve become table stakes for enterprises building their offerings on ML.  The rapid adoption of feature stores, where they’re starting to become mainstream instead of being a niche restricted to big-tech, can largely be […]

Read More
October 20, 2022

What Is Anomaly Detection? Importance, Methods, Challenges, and Use Cases

Anomaly detection refers to the process of analysing data sets to detect unusual patterns and outliers that do not conform to expectations.  It takes on even more importance in a world where enterprises depend heavily on an intricate web of distributed systems. With thousands of potentially important data items to monitor every second, it is […]

Read More
October 13, 2022

Feature Stores: The CEO’s Guide

As industries across the globe attempt to adapt to the big data architecture, expensive and ineffective feature engineering practices mean that businesses are very likely to “hit a wall” when it comes to organizing their machine learning operations (MLOps). A lot of time is consumed in data ingestion, and lackluster machine outputs indicate that stakeholders […]

Read More
October 11, 2022

How Postmodern Data Stack helps Fintech companies make faster decisions

The Fintech market was valued at $110.57 billion in 2020 and will reach $698.48 billion by 2030. It is one of the fastest-growing industries with a CAGR of 20.3%. Fintech companies faced a surge in demand as customer practices and banking habits changed during the COVID-19 era. The industry overall saw an increase in user […]

Read More
October 4, 2022

Map Business Context as an Input to Build an Outcome-Focused Data Strategy

Machine learning and data science today are in a unique position where access to capital is often not the biggest barrier to success. Companies globally are continuing to invest into artificial intelligence to the tune of $140 billion, either to develop AI-native products or solutions or as a way to solve business problems and improve […]

Read More
September 29, 2022

The Horizontal and Long Tail Impact of Data

We had the good fortune of speaking at ValleyML’s AI Expo 2022 earlier this month. This annual event brings together AI technology, researchers, industry thought leaders and prospective buyers of AI/ML technologies in a single event. The 2022 edition promised even more interesting talks and networking opportunities as it spanned four […]

Read More
September 27, 2022

2023: The Brave New World of Data Privacy and Accountability

The data privacy and compliance landscape continues to change significantly in 2022, and it is necessary to understand these changes as soon as possible so you can chart your path, and that of your organization, over the next few years. EMERGING MEGATRENDS IN THE WORLD OF DATA: 01. Increased regulatory activity. In the last couple […]

Read More
September 20, 2022

A Primer on Feature Engineering

Feature engineering is the process of selecting, interpreting, and transforming structured or unstructured raw data into attributes (features) that can be used to build effective machine learning models which more accurately represent the problem at hand. In this context, a “feature” refers to any quantifiable unique input that may be used in a predictive model, […]

Read More
September 16, 2022

What is the postmodern data stack?

The adoption of artificial intelligence and machine learning has paved the way for drastic changes in data-driven enterprises. To optimize business operations, several companies started embracing what came to be known as the modern data stack. Although this approach benefits big tech companies in making superior business decisions, a majority of companies (which operate at […]

Read More
August 3, 2022

Growing Data Infrastructure Complexities

The world of data, and data infrastructure, has changed dramatically over the past decade. Traditional databases, which were designed to store information in a structured format, have evolved into massive warehouses of unstructured data that sit on multiple servers across different locations. Not too long ago, we were used to seeing monolithic systems dominated by […]

Read More
July 7, 2022

Trust in Data: The Rise of Adversarial Machine Learning

Increased dependence on data and machine learning, together with a lack of understanding of complex ML models, is giving rise to a new category of cyber attacks called Adversarial Machine Learning attacks. Machine learning impacts our everyday lives – it determines what we see on eCommerce websites, social media platforms, and search engines. Since machine learning […]

Read More
June 22, 2022

Establishing Organizational Digital Trust in Data

With big data powering business decision-making in this century, organizations need to establish trust in their data sources, which otherwise become a source of risk. Data is now ubiquitous: according to Statista, the aggregate data volume generated was 64.2 zettabytes in 2020, and it is only predicted to shoot upwards […]

Read More
June 15, 2022

Scribble Data at TMLS MLOps World Summit 2022

It’s an exciting time for the MLOps ecosystem, and there’s no better place to be than in Toronto! The MLOps World Summit 2022 happened last week in Toronto and truly lived up to its promise of being the ultimate ML Operations & strategy conference & Expo. It saw a number of MLOps companies and practitioners, including our […]

Read More
June 1, 2022

What is the modern data stack?

The success of a modern business, ranging from small and medium-sized enterprises to Fortune 500 conglomerates, is now increasingly tied to how firms implement their data infrastructure. We’ve all heard the trope – “data is the new oil” of the digital economy (source). One thing is clear: information is power, and data analytics can be utilized […]

Read More
April 21, 2022

How to design a Feature Store for Sub-ML?

Let’s assume you want to leverage data to improve one of your processes, such as partner benchmarking. Even though it’s one of your top priorities for the year, you have limited resources to spend on partner data collection, segregation, and overall data preparation to do any sort of analysis. And even if you find a […]

Read More
April 5, 2022

Why Feature Stores Need to be Designed for Sub-ML Use Cases

In our last article, we introduced Sub-ML use cases and discussed how their number is growing. In this article, we’ll try to understand how purpose-built feature stores for solving Sub-ML use cases can help drive more value with data. Data Science as a discipline has seen the kind of evolution that only a few others […]

Read More
March 15, 2022

Scribble Data Raises $2.2 M to Scale Their Modularized, Cloud-Native Feature Store

TORONTO, March 15, 2022: Scribble Data, an ML feature engineering startup, today announced that it has raised $2.2 million in seed funding led by Blume Ventures. The round also saw participation from Log X Ventures and Sprout Venture Partners, as well as from Vivek N. Gour (former CFO, Genpact) and Ganesh Rao (Partner, Trilegal). […]

Read More
January 11, 2022

Welcome to the age of Sub-ML use cases

Let’s say you work at a modern data-driven company and you want to find a way to enhance one of your processes, like partner management. It makes sense considering you have limited resources to invest in partner development, but it ranks high on your growth goals for the year. The first step would be to […]

Read More
June 29, 2021

Scaling Entity Matching at The Room with Scribble Enrich and Redis

The Room’s mission is to connect top talent from around the world to meaningful opportunities. Envisioned as a technology-driven, community-centric platform to help organizations quickly find high-quality, vetted talent at scale, The Room will host tens of millions of members in its system and have a worldwide presence. At the core of the technology challenge […]

Read More
April 7, 2021

Hierarchical Features and their Importance in Feature Engineering

Feature engineering is both a central task in machine learning engineering and arguably the most complex one. Data scientists who build models that need to be deployed at large scales, across functional, technical, geographic, demographic and other categories have to reason about how they choose the features for the models. Despite the divergent […]

Read More
October 28, 2020

Right to Forget

General Data Protection Regulation (GDPR): Any organization that collects and stores EU resident data is subject to the General Data Protection Regulation (GDPR). Examples of such organizations include Google, Facebook, and Amazon. The regulation places the obligation for responsible data handling with such organizations, and gives individuals a number of rights. All major geographies now have GDPR-like regulation […]

Read More
May 14, 2020

Scribble Data raises funding to scale feature store

We are thrilled to announce that we’ve just closed our first round of funding to help us scale and deliver our Feature Store product, Enrich, in international markets for enterprise-grade Machine Learning products. Our investors are data-driven leaders from companies like Google and Amazon, from the US and India. Scribble Enrich, our feature store […]

Read More
October 1, 2018

Should Data Scientists Be Excited Or Worried About The New Privacy Laws?

The General Data Protection Regulation (GDPR) legislated and passed by the European Union has sent ripples around the world, and depending on who you ask, this could either spell apocalypse, the workings of a nanny state, or a very positive step towards consumer privacy. The direct objective of such a ruling is to give control […]

Read More
September 5, 2018

Why your business doesn’t have to wait, to start giving back

“If you’re in the luckiest 1% of humanity, you owe it to the rest of humanity to think about the other 99%.” — Warren Buffett W.B. has given away more than he has left. In fact, he has pledged to give 99% of his wealth. It gives us pause. When talk of CSR and philanthropy are […]

Read More
September 3, 2018

How to get the most out of your organization’s data: The mindset

Every business is a data business. While this aphorism has been around for some time, what does it actually mean to enterprise stakeholders? What should key decision makers value and be excited about as they start to invest in analytics tools and ML/AI? Here’s what we think are the most important aspects to embrace […]

Read More
June 11, 2018

Reducing Organizational Data Infrastructure Costs

We speak to a number of organizations who are in the process of building and deploying data infrastructure and analytical processes. Organizations face a number of challenges that prevent them from meeting their analytical business objectives. The idea of this note is to share our thoughts on one specific challenge – high cost. Specifically: Cost […]

Read More
September 14, 2017

How to Turn Your Startup Into a Data Informed Business

This post is a useful way to think about how to start on a data journey if you’re a young startup that’s just pushed data to the back burner (say until you had ‘enough’ traction) or even if you’re part of a more mature company that’s used to making decisions more on instinct and experience, […]

Read More
July 17, 2017

The Pitfalls of Data Science and how you can avoid them

[Update]: This article is getting a good bit of engagement. If it resonates with you, I’d love it if you could answer a short 2 minute survey on your data journey here. I will add the same survey link at the end of this post as well. Depending on who you ask, you’re going to hear data […]

Read More
July 12, 2017

How to Architect for Data Consumption

This is my pet peeve – technical architects are building systems and applications that make data analysis complicated, error-prone, and inefficient. We need enablement of data consumption as a first-class requirement of any system that is built. I explain here how we could architect differently to improve data consumption. Technical systems architects, including myself until […]

Read More
June 6, 2017

What Can We Do With Metadata?

As the complexity of data and systems that hold data grows, the cost of analysis increases due to time and effort spent in figuring out the feasibility, appropriateness, access, and management of data. We believe that a number of new low-risk and valuable applications can be built through creative application of metadata that can help […]

Read More
February 20, 2017

Data Shifts Power Within Organizations

A major challenge in becoming more data-driven as an organization has less to do with the data itself, and more to do with the ability to manage the dynamics that emerge as decision makers look at data as an input to the decision process. I have a particular kind of power shift in mind. I am not referring […]

Read More
September 15, 2016

Available But Unusable Data – Part II – Semantic Gaps

At Scribble Data we are thinking deeply about why decision makers are not able to get to the data when they need it, even when relevant data is available in their own databases. This question matters because we find that decision makers routinely make high-risk decisions involving products, marketing, and operations with […]

Read More