AI-powered search & chat for Data / Computer Science Students

Turning Up the Heat: The Mechanics of Model Distillation

 Towards Data Science

When I first read this paper, I was struck by twin impulses. The first was that I should absolutely write a post explaining it, because so many of its ideas are elegant and compelling, from its…

Read more at Towards Data Science

What is Knowledge Distillation?

 Towards Data Science

Knowledge distillation is a fascinating concept; we’ll briefly cover why we need it and how it works.

Read more at Towards Data Science

On DINO, Self-Distillation with no labels

 Towards Data Science

It has been clear for some time that Transformers had arrived in the field of computer vision to amaze, but hardly anyone could have imagined such astonishing results from a Vision Transformer in…

Read more at Towards Data Science

Using Distillation to Protect Your Neural Networks

 Towards Data Science

Distillation is a hot research area. For distillation, you first train a deep learning model, the teacher network, to solve your task. Then, you train a student network, which can be any model. While…

Read more at Towards Data Science
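
The teaser above describes the standard two-stage recipe: train the teacher on the task, then train a student to mimic its outputs. A minimal sketch of that recipe in PyTorch (not the article's code; the architectures, temperature, and data loader below are illustrative assumptions):

```python
# Minimal sketch of the two-stage teacher/student setup described above.
# Layer sizes, temperature T, and the data loader are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(784, 1200), nn.ReLU(), nn.Linear(1200, 10))
student = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))

def train_teacher(loader, epochs=5):
    opt = torch.optim.Adam(teacher.parameters())
    for _ in range(epochs):
        for x, y in loader:
            loss = F.cross_entropy(teacher(x), y)  # ordinary hard-label training
            opt.zero_grad()
            loss.backward()
            opt.step()

def train_student(loader, T=4.0, epochs=5):
    opt = torch.optim.Adam(student.parameters())
    teacher.eval()
    for _ in range(epochs):
        for x, _ in loader:
            with torch.no_grad():
                # Softened teacher outputs serve as the training target.
                soft_targets = F.softmax(teacher(x) / T, dim=-1)
            log_probs = F.log_softmax(student(x) / T, dim=-1)
            # Student is trained to match the teacher's output distribution.
            loss = F.kl_div(log_probs, soft_targets, reduction="batchmean") * T * T
            opt.zero_grad()
            loss.backward()
            opt.step()
```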

Distill Hiatus

 Distill

Over the past five years, Distill has supported authors in publishing artifacts that push beyond the traditional expectations of scientific papers. From Gabriel Goh’s interactive exposition of momentum…

Read more at Distill

Knowledge Distillation: Simplified

 Towards Data Science

Neural models in recent years have been successful in almost every field including extremely complex problem statements. However, these models are huge in size, with millions (and billions) of…

Read more at Towards Data Science

Distilling Step-by-Step: Paper Review

 Towards AI

Exploring one of the most recent and innovative methods in LLM compression.

Read more at Towards AI

Patient Knowledge Distillation

 Towards Data Science

With the advent of deep learning, newer and more complex models are constantly improving performance on a variety of tasks. However, this improvement comes at the cost of computational and storage…

Read more at Towards Data Science

Distill Update 2018

 Distill

Things that Worked Well · Interfaces for Ideas · Engagement as a Spectrum · Software Engineering Best Practices for Scientific Publishing · Challenges & Improvements · The Distill Prize · A Small Community Revie…

Read more at Distill

TernaryBERT: Quantization Meets Distillation

 Towards Data Science

The ongoing trend of building ever larger models like BERT and GPT-3 has been accompanied by a complementary effort to reduce their size at little or no cost in accuracy. Effective models are built…

Read more at Towards Data Science
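
As a rough illustration of the "quantization" half of the title, here is a generic threshold-based ternarization sketch (weights mapped to {-α, 0, +α}). It is in the spirit of ternary weight networks and is not necessarily the exact procedure TernaryBERT uses:

```python
import torch

def ternarize(weight: torch.Tensor) -> torch.Tensor:
    """Map a float weight tensor to {-alpha, 0, +alpha} using a simple
    magnitude threshold. Generic sketch, not the paper's exact scheme."""
    delta = 0.7 * weight.abs().mean()                 # magnitude threshold
    mask = (weight.abs() > delta).float()             # keep only large weights
    alpha = (weight.abs() * mask).sum() / mask.sum().clamp(min=1.0)  # scale factor
    return alpha * torch.sign(weight) * mask

w = torch.randn(768, 768)
w_ternary = ternarize(w)
print(torch.unique(w_ternary).numel())  # 3 distinct values: -alpha, 0, +alpha
```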

Knowledge Distillation — A Survey Through Time

 Towards Data Science

In 2012, AlexNet outperformed all the existing models on the ImageNet data. Neural networks were about to see major adoption. By 2015, many state-of-the-art results had been surpassed. The trend was to use neural…

Read more at Towards Data Science

Extract Cooling for Espresso

 Towards Data Science

Coffee Data Science: Applying modern techniques to hack flavor. Espresso is generally brewed hot, but one of the downsides is that aroma continues to be lost to evaporation. Generally, there is a notion…

Read more at Towards Data Science

Espresso Preparation: Grinding, Distribution, and Tamping

 Towards Data Science

Previously, we looked at pre-infusion, pressure, and temperature with a few data sources to understand what provided the best extraction. Now, let’s look at grinding, distribution, and tamping. The…

Read more at Towards Data Science

Comparing Methods for Measuring Extraction Yield in Espresso

 Towards Data Science

When brewing coffee, the main quantitative quality metric is Extraction Yield (EY). EY is determined in two ways: by drying the spent coffee grounds, or by measuring Total Dissolved Solids (TDS) and computing EY from the reading. I…

Read more at Towards Data Science
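
For reference, the TDS-based calculation is essentially a one-liner; a small sketch of the commonly used formula (the function and variable names here are mine, and the article may use a variant):

```python
def extraction_yield(tds_percent: float, beverage_mass_g: float, dose_g: float) -> float:
    """Commonly used formula for Extraction Yield from a TDS reading:
    EY (%) = TDS (%) * beverage mass / dry coffee dose."""
    return tds_percent * beverage_mass_g / dose_g

# e.g. a 36 g espresso at 10% TDS pulled from an 18 g dose:
print(extraction_yield(10.0, 36.0, 18.0))  # -> 20.0 (% extraction yield)
```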

Distillation of Knowledge in Neural Networks

 Towards Data Science

Currently, especially in NLP, very large-scale models are being trained. A large portion of those can’t even fit on an average person’s hardware. Plus, due to the law of diminishing returns, a great…

Read more at Towards Data Science

The Duality of Coffee: An Extract and a Filter

 Towards Data Science

Previously, I experimented with trying to turbocharge my espresso shot by feeding espresso back into the puck to extract more coffee. Instead, I found that used or spent coffee grounds filters…

Read more at Towards Data Science

Peeling Back the Mystery of Espresso Extraction in Staccato

 Towards Data Science

I have examined extraction by layer for staccato and staccato tamped espresso before, but those shots were usually 3 to 1 (output to input). Most of my shots are split between the 1:1 and the rest of…

Read more at Towards Data Science

Espresso Extraction by Layer

 Towards Data Science

While developing the staccato shot, I noticed the spent coffee layers stayed separated after the shot. I decided to dry out some of these pucks to take a look at coffee extraction per layer. I was…

Read more at Towards Data Science

The Power of Knowledge Distillation in Modern AI: Bridging the Gap between Powerful and Compact…

 Towards AI

What is Knowledge Distillation? At its core, knowledge distillation is about transferring knowledge from a large, complex model (often called the teacher) to a smaller, simpler model (the student)…

Read more at Towards AI
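
To make that transfer work, the teacher's logits are typically softened with a temperature before the student is trained to match them. A tiny illustration, assuming PyTorch and made-up logits:

```python
import torch
import torch.nn.functional as F

# Made-up teacher logits for a single example (four classes).
logits = torch.tensor([8.0, 3.0, 1.0, -2.0])

for T in (1.0, 4.0):
    probs = F.softmax(logits / T, dim=-1)
    print(f"T={T}:", [round(p, 3) for p in probs.tolist()])
# At T=1 the distribution is nearly one-hot; at T=4 the relative probabilities
# of the non-top classes become visible, which is the extra signal the
# student learns from.
```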

Dissecting Coffee Extraction: Part 2

 Towards Data Science

Coffee Data Science: Splitting extraction by layer. Previously, I looked at some extraction for dried coffee pucks, and some of the shots had a paper filter in between layers. I used this paper filter…

Read more at Towards Data Science

Staccato Tamping: Improving Espresso without a Sifter

 Towards Data Science

Most of my coffee experiments are done casually throughout the day. Because of the shelter-in-place orders, I moved my coffee bar to my house, and I’ve had a fair amount of lattes and non-staccato…

Read more at Towards Data Science

Decrease Neural Network Size and Maintain Accuracy: Knowledge Distillation

 Towards Data Science

Practical machine learning is all about tradeoffs. We can get better accuracy from neural networks by making them bigger, but in real life, large neural nets are hard to use. Specifically, the…

Read more at Towards Data Science

The Diminishing Returns of Tamping for Espresso

 Towards Data Science

This experiment started from a Facebook discussion in an espresso group about tamping vs tapping. The thinking was that tamping compresses grounds at the top and the bottom more than the middle, and this…

Read more at Towards Data Science

Extraction over the Life of the Coffee Bean

 Towards Data Science

I’ve been collecting data on my shots for almost 2 years. I wanted to look through that information to see if I could spot any trends, particularly with the age of a roast. The main caveats are… So I…

Read more at Towards Data Science