This Week I Learned - Week #5 2025

* What is DeepSeek? Aravind Srinivas of Perplexity explains:

DeepSeek R1 is an AI model. An AI model is a bunch of matrices with floating point numbers (referred to as weights) where you feed in an input (a sequence of characters embedded as a vector of floating point numbers) and get an output sequence.

DeepSeek is a mobile app (same name as the company) that lets you interact with that AI model through a chat interface. When you use their app, your data (prompts) go to their servers.

The company has also open sourced (basically uploaded all those matrices) the weights of the AI model for free use by anyone.

When you download those weights and bring them up yourself on your own server, you get to control the inference of the AI model and that way any user request sent to this new server doesn’t go to China as long as the servers are hosted in the US.

The weights are just a bunch of numbers organized as matrices executed with sequential matrix multiplies - so no computation needs to leave the server in order to compute the next word in a sequence.

That way, another company can download the weights, host them on their servers, and let users interact with them in a chat frontend, and customize the AI model further to do more things like searching the web or using tools like code execution, Wolfram, etc.
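To make the "weights are just matrices" point concrete, here is a minimal sketch in plain Python. The vocabulary, the matrix values, and the single-layer shape are all made up for illustration; a real model such as DeepSeek R1 has vastly more parameters and a different architecture. The point is only that predicting the next token is local arithmetic on stored numbers, with no network call involved.

```python
import math

# Toy vocabulary and hand-picked weights; every number here is invented for illustration.
VOCAB = ["the", "cat", "sat", "mat"]

# Embedding matrix: one row (vector) per token.
EMBED = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.5, -0.5],
]

# A single hidden layer (2x2) and an output projection (2x4).
W_HIDDEN = [
    [0.5, -1.0],
    [1.5, 0.5],
]
W_OUT = [
    [0.1, 0.2, 2.0, 0.3],
    [0.4, 0.1, 0.5, 1.0],
]


def matvec(vec, matrix):
    """Multiply a row vector by a matrix given as a list of rows."""
    cols = len(matrix[0])
    return [sum(vec[i] * matrix[i][j] for i in range(len(vec))) for j in range(cols)]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def next_token(token):
    """Predict the next token using only local matrix multiplies."""
    x = EMBED[VOCAB.index(token)]                        # input token as a vector
    hidden = [max(0.0, h) for h in matvec(x, W_HIDDEN)]  # matrix multiply + ReLU
    probs = softmax(matvec(hidden, W_OUT))               # distribution over the vocabulary
    best = max(range(len(VOCAB)), key=lambda i: probs[i])
    return VOCAB[best], probs
```

With these invented weights, `next_token("cat")` returns `"sat"` as the most likely next token. Everything needed to compute it lives in the three matrices above, which is why self-hosted weights keep prompts on the hosting server.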

* DeepSeek was started in May 2023 as a side-project by Liang Wenfeng, who also founded and runs a hedge fund called High-Flyer. Its headquarters are in Hangzhou.

* OpenAI co-founder Sam Altman was dismissive about a challenger emerging to something like ChatGPT. The Chinese-built large language model DeepSeek-R1 is now emerging as a cost-effective, open-source rival to advanced AI systems like OpenAI’s o1.

* DeepSeek-R1 is a 671B-parameter mixture-of-experts large language model, with about 37B parameters active per token, designed for reasoning tasks such as math, coding, and logic.
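As a rough sense of scale for the 671B figure, the sketch below estimates how much storage the weights alone would need at a few common numeric precisions. The precision table is an assumption chosen for illustration, and the estimate ignores everything else a running server needs (activations, caches, runtime overhead).

```python
PARAMS = 671e9  # parameter count quoted in the bullet above

# Bytes per weight for a few common storage formats (illustrative choices).
BYTES_PER_WEIGHT = {"fp32": 4, "fp16/bf16": 2, "fp8/int8": 1, "4-bit": 0.5}


def weight_memory_gb(params, bytes_per_weight):
    """Approximate size of the weights alone, in decimal gigabytes."""
    return params * bytes_per_weight / 1e9


for name, size in BYTES_PER_WEIGHT.items():
    print(f"{name:>10}: {weight_memory_gb(PARAMS, size):,.0f} GB")
```

Even at one byte per weight the model is several hundred gigabytes, which is why self-hosting the full model typically means a multi-GPU server rather than a laptop.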

* Marc Andreessen, co-founder of the marquee venture capital firm Andreessen Horowitz and an advisor to US President Donald Trump, described DeepSeek’s accomplishment as “AI’s Sputnik moment,” making a reference to a period of anxiety among Western nations about a possible technological gap between the US and the Soviet Union when the latter launched the Sputnik satellite.

* Yishan, former CEO of Reddit - "Deepseek moment is not really the Sputnik moment, but more like the Google moment...Sputnik showed that the Soviets could do something the US couldn't ("a new fearsome power"). They didn't subsequently publish all the technical details and half the blueprints. They only showed that it could be done...Deepseek is MUCH more like the Google moment, because Google essentially described what it did and told everyone else how they could do it too. "

* DeepSeek R1 has figured out RL fine-tuning. Their paper describes DeepSeek-R1-Zero, a model trained with reinforcement learning alone, with no Supervised Fine-Tuning (SFT). For R1 itself, they then combined RL with some SFT to add domain knowledge, using good rejection sampling (aka filtering). The main reason it’s so good is that it learned reasoning from scratch rather than imitating humans or other models.
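The "rejection sampling (aka filtering)" idea mentioned above can be sketched in a few lines: sample several candidate answers, check each one, and keep only the ones that pass. Everything below is a toy under stated assumptions: `toy_generate` is a hypothetical stand-in for a model, the task is simple addition so that answers can be checked by rule, and none of this is DeepSeek's actual pipeline.

```python
import random


def toy_generate(question, rng):
    """Hypothetical stand-in for a model: returns a sometimes-wrong answer to a + b."""
    a, b = question
    noise = rng.choice([0, 0, 1, -1])  # wrong about half the time
    return a + b + noise


def is_correct(question, answer):
    """Rule-based checker, possible here because the task has a verifiable answer."""
    a, b = question
    return answer == a + b


def rejection_sample(questions, samples_per_question=8, seed=0):
    """Sample several answers per question and keep only those the checker accepts."""
    rng = random.Random(seed)
    kept = []
    for q in questions:
        for _ in range(samples_per_question):
            answer = toy_generate(q, rng)
            if is_correct(q, answer):
                kept.append((q, answer))
    return kept


dataset = rejection_sample([(2, 3), (10, 7), (4, 4)])
```

The filtered `dataset` contains only verified question and answer pairs, which is the kind of data one could then use for a supervised fine-tuning step.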

* DeepSeek employs a technique known as distillation, where a smaller "student" model is trained on the outputs of a larger, already-trained "teacher" model rather than only on raw data from the Internet.
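One textbook way to express distillation is as a loss that pushes a student model's output distribution toward a teacher's. The sketch below shows that classic soft-label formulation with a temperature parameter; it is a generic illustration, and it is an assumption rather than a description of how DeepSeek built its distilled models. The function names and the default temperature are chosen here for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

The loss is zero when the student's logits match the teacher's and grows as the two distributions diverge, so minimizing it during training moves the student toward the teacher's behavior.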

* DeepSeek’s success lies in its use of advanced techniques like test-time scaling, which improves answers by spending more compute during inference (for example, on longer reasoning chains), and its strategic use of Nvidia’s export-compliant GPUs tailored for the Chinese market.

* By relying on Nvidia H800 GPUs, an export-compliant variant of the H100 with reduced chip-to-chip bandwidth, instead of the H100 itself, DeepSeek stayed within US semiconductor export restrictions, demonstrating that necessity drives invention.

* DeepSeek-R1-Zero (no supervised fine-tuning) shows human-like reasoning skills in natural language purely by virtue of reinforcement learning (RL).

Chart: Aimee Picchi, CBS News | Source: CapitalIQ

* A roughly $600 billion drop in Nvidia’s market value, alongside steep declines for other AI and energy stocks, is the largest single-day loss for any company in U.S. stock market history.

* Meta has created several war rooms where employees are reverse engineering DeepSeek's technology.

* DeepSeek R1 is now live on Perplexity, Azure AI Foundry, GitHub, Krutrim Cloud, Amazon Bedrock, and SageMaker AI.

* All Perplexity Pro users now get 500 daily DeepSeek R1 queries (uncensored, and with prompts not going to China) with web search grounding and reasoning traces. Free users get 5 daily queries.

* Azure AI Foundry has everything you need to customize, host, run, and manage AI-driven applications built in GitHub, Visual Studio, and Copilot Studio, with APIs for all your needs. Get started quickly with over 200 enterprise-ready Azure services along with more than 1,800 models for your next AI app.

* With Azure AI Content Safety, built-in content filtering is available by default, with opt-out options for flexibility. Additionally, the Safety Evaluation System allows customers to efficiently test their applications before deployment. These safeguards help Azure AI Foundry

* GitHub Models is a catalog and playground of AI models that makes it easy for every developer to build AI features and products on GitHub. If you have API keys, you can evaluate models with side-by-side comparisons.

* Kaggle Models provides a way to discover, use, and share models for machine learning and generative AI applications. It is a repository of pre-trained models that are deeply integrated with Kaggle's platform, making them easy to use in Kaggle Competitions and Notebooks. Like Datasets, Kaggle Models organizes community activity that enriches the models' usefulness: every model page contains discussions, public notebooks, and usage statistics such as downloads and upvotes.

* DeepSeek on Humans (via @adonis_singh) -

"Humans instinctively convert selfish desires into cooperative systems by collectively pretending abstract rules (money, laws, rights) are real. These shared hallucinations act as "games" where competition is secretly redirected to benefit the group, turning conflict into society's fuel."

* ResponsiveVoice offers an HTML5 text-to-speech API.

* Small-cap stocks that are less reliant on imports are positively impacted by the hike in US tariffs.

* Exports to the US accounted for 98 percent of India's $1.44 billion in solar module exports in 2024.

* An estimated 68% of US households have a pet. New York City’s council is set to consider legislation that would allow New Yorkers to take paid sick leave to care for their pets.

* Blindness is a spectrum. It is not a binary condition or simply a matter of being "blind" or "sighted." It encompasses a wide range of visual impairments, from mild to severe. - via Blink

* Guillain-Barré Syndrome (GBS) is an autoimmune disease in which the body’s own immune system attacks the nerves. It usually occurs a few weeks after a viral or bacterial infection. Diagnosis is based on the patient history and on neurological, electrophysiological, and cerebrospinal fluid (CSF) examinations. Because GBS is a post-infection phenomenon, the emphasis should be on identifying its causes.

* If you find yourself lost in a desert or at sea, drinking urine for survival is not advisable, as it contains salt and urea, which will dehydrate you further. Seawater is even worse. Additionally, digesting protein requires more water than digesting other foods, so it is best to avoid it. Without water, death occurs after about three days in the desert, as the body dries out quickly; at sea, people can survive six to seven days. - The Essentials of Sea Survival by F Golden and M Tipton (2002)

* There are about 250mn feature phone users in India.

* The rate of rejection for health insurance claims increased by 19.1% in FY 2023-24 – altogether ₹26,000cr worth of claims denied.

* India has about 15 crore senior citizens at present.

* According to the 2021 data from the Ministry of Social Justice, the government supported 551 NGO-run old age homes across India.

* "The only place success comes before work is in the dictionary." - Vince Lombardi
