
Scaling up machine learning models – in terms of model size, dataset size, and compute – has led to dramatic improvements in performance on language and vision tasks.

Scaling Supervision Instead of the Architecture

Depth Anything doesn’t try to win monocular depth with a clever new backbone. It asks a simpler question: if you can cheaply manufacture supervision at scale, by letting a strong teacher pseudo-label a huge pool of unlabeled images, how far can a standard architecture go?
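A minimal sketch of that supervision-scaling loop, with hypothetical `student`, `teacher`, `labeled_batch`, and `unlabeled_images` objects standing in for the real training setup: a frozen teacher manufactures pseudo depth labels for unlabeled images, and the student learns from real and manufactured supervision together.

```python
import torch
import torch.nn.functional as F

def self_training_step(student, teacher, labeled_batch, unlabeled_images, optimizer):
    """One training step on real labels plus teacher-generated pseudo-labels.

    `student` and `teacher` are depth networks mapping images -> depth maps;
    `labeled_batch` is an (images, depths) pair. All names are illustrative.
    """
    images, depths = labeled_batch

    # Manufacture supervision: the frozen teacher pseudo-labels unlabeled images.
    with torch.no_grad():
        pseudo_depths = teacher(unlabeled_images)

    optimizer.zero_grad()
    loss_labeled = F.l1_loss(student(images), depths)
    loss_pseudo = F.l1_loss(student(unlabeled_images), pseudo_depths)
    loss = loss_labeled + loss_pseudo
    loss.backward()
    optimizer.step()
    return loss.item()
```

The paper additionally perturbs the unlabeled branch with strong augmentations so the student cannot simply copy the teacher; the sketch above omits that detail.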

Robots in unstructured spaces, such as homes, have struggled to generalize across unpredictable settings. In this context, a few projects stand out from the rest.

Let’s talk about DINOv2, a paper that takes a major leap forward in the quest for general-purpose visual features in computer vision. Inspired by the success of large-scale self-supervised models
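To make “general-purpose visual features” concrete, here is a small sketch that loads a released DINOv2 backbone through `torch.hub` and trains nothing but a linear head on its frozen embeddings; the hub entry point name, the dummy data, and the 10-class probe are illustrative assumptions about one common way to use the released models, not steps prescribed by the paper.

```python
import torch

# Load a pretrained DINOv2 ViT-S/14 backbone (assumes the torch.hub entry
# point published in the facebookresearch/dinov2 repository).
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
backbone.eval()

# Frozen features: no fine-tuning, just a forward pass.
images = torch.randn(4, 3, 224, 224)  # dummy batch; H and W must be multiples of 14
with torch.no_grad():
    features = backbone(images)       # (4, 384) CLS embeddings for ViT-S/14

# A lightweight task head (e.g. a linear probe) is trained on top of the
# frozen features -- the backbone itself is never updated.
probe = torch.nn.Linear(features.shape[1], 10)
logits = probe(features)
print(logits.shape)  # torch.Size([4, 10])
```

Linear probing on frozen features like this is also how the paper evaluates the backbone on many of its benchmarks.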

DeepSeek-V2 introduces a major architectural innovation that enhances its efficiency as a language model – Multi-head Latent Attention (MLA). MLA stands out as a game-changing technique because it significantly reduces the memory footprint of the key-value (KV) cache during inference: instead of caching full per-head keys and values, each token is compressed into a small latent vector from which keys and values are reconstructed.
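A minimal, single-head sketch of that latent-KV idea. The class name, the dimensions, and the absence of rotary embeddings, multiple heads, and query compression are all simplifications for illustration; only the low-rank KV compression and the latent-only cache reflect the technique described above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentKVAttention(nn.Module):
    """Single-head sketch of latent-compressed KV caching (dimensions are illustrative)."""

    def __init__(self, d_model=512, d_latent=64):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.kv_down = nn.Linear(d_model, d_latent)  # compress each token into a small latent
        self.k_up = nn.Linear(d_latent, d_model)     # reconstruct keys from the latent
        self.v_up = nn.Linear(d_latent, d_model)     # reconstruct values from the latent
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x, latent_cache=None):
        # x: (batch, new_tokens, d_model); latent_cache: (batch, past_tokens, d_latent) or None
        new_latents = self.kv_down(x)
        cache = new_latents if latent_cache is None else torch.cat([latent_cache, new_latents], dim=1)

        q = self.q_proj(x)                 # queries for the new tokens only
        k = self.k_up(cache)               # keys/values for ALL cached positions,
        v = self.v_up(cache)               # rebuilt on the fly from the latents

        # NOTE: causal masking is omitted for brevity; it only matters when new_tokens > 1.
        attn = F.softmax(q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5, dim=-1)
        out = self.out_proj(attn @ v)

        # Only the small latents are carried between decoding steps, which is
        # what shrinks the KV-cache memory footprint.
        return out, cache


attn_layer = LatentKVAttention()
prompt = torch.randn(2, 5, 512)                        # prefill with 5 prompt tokens
y, cache = attn_layer(prompt)
next_token = torch.randn(2, 1, 512)                    # one token per step during decoding
y, cache = attn_layer(next_token, latent_cache=cache)  # cache grows by only d_latent floats per token
```

The real MLA adds pieces this sketch leaves out (multiple heads, low-rank query compression, a decoupled rotary-embedding path), but the memory saving comes from the same move: caching one small latent per token instead of full keys and values for every head.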