
Galbot’s humanoid robots do the boring stuff, and that’s the point

Written by 36Kr English

With a focus on generalization and reliability, Galbot is building robots that skip the spectacle and get the job done.

What else can humanoid robots do besides dancing and somersaulting?

If anyone is qualified to answer this pointed question raised by Zhu Xiaohu, it’s Wang He, an assistant professor at Peking University and founder and CTO of Galbot. The startup is a key player in China’s embodied artificial intelligence sector, with a particular focus on developing the “brain” of robots.

Since launching in May 2023, Galbot has released a single hardware product: the humanoid robot Galbot G1. But it has introduced several embodied AI models. The company directs most of its resources into model development, aiming to enhance robotic generalization and adaptability.

Wang believes the industry’s fixation on refining robot bodies is misplaced. That focus has pushed robots into price wars, often sold for little more than the cost of raw steel. The real value, he argues, lies in advancing intelligence rather than building cheaper shells.

Developing general-purpose embodied models is a new frontier. Despite this ambition, Wang remains pragmatic. “I strongly advise against hyping up embodied artificial general intelligence (AGI). Many companies dream of leapfrogging into embodied AGI, and I just don’t agree with that,” he told 36Kr.

“These models are still immature. We’re likely five to ten years away from robots that can do everything,” he added. “There has been plenty of academic progress, but no scalable product has yet landed in the market.”

Many of Galbot’s competitors showcase flashy demos: robots folding clothes, shaving, or zipping up jackets. Meanwhile, Galbot concentrates on less flashy but essential tasks: moving, picking, and placing. Its flagship model is the modestly named GraspVLA.

“We’re also training robots to hang clothes on a hanger,” Wang said. “But that’s just academic research for now. It’s far from a commercial product.”

Currently, the most deployable robotic functionality involves mobile pick-and-place tasks. Galbot is working to apply these in environments such as pharmacies and retail stores.

The company has rolled out what it describes as the world’s first humanoid-powered smart retail solution. In Beijing, it operates nearly ten unmanned pharmacies that function round-the-clock, with robots sorting and delivering medications to couriers.

The plan is to scale to 100 locations across Beijing, Shanghai, and Shenzhen by year’s end. This retail deployment has already reached commercialization and is projected to generate nearly RMB 100 million (USD 14 million) in revenue this year.

At the recent BAAI Conference, Galbot gave a live demo. On command from Wang, the G1 robot located a beverage on a shelf and delivered it, completing the task autonomously without remote operation or prior scene mapping.

Wang is transparent about the challenges: even basic deployment requires extensive data preparation. Galbot is beginning with shelving environments in retail and gradually expanding its scope.

Mastering generalization for core tasks like pick-and-place would represent a significant milestone in robotics. Wang suggests that if robots scoring a perfect 100 represent total general-purpose capability, then reliable pick-and-place skills might merit a score of ten. Consistent deployment in retail shelving? Just a one.

Galbot, Wang said, has moved from zero to one on that scale and is now pushing toward general-purpose embodied intelligence.

Photo of Wang He, founder and CTO of Galbot. Photo and header photo source: Galbot via 36Kr.

The following transcript has been edited and consolidated for brevity and clarity.

36Kr: How many employees does Galbot have right now?

Wang He (WH): Just over 100.

36Kr: That’s smaller than other companies in your tier.

WH: We’re still focused on product and engineering. Galbot has released just one humanoid robot, the Galbot G1, designed for use in industrial, retail, and service scenarios. Its primary functions are movement, picking, and placing.

We believe concentrating on a single skill set that can be applied across multiple settings is more valuable than pursuing scattered capabilities or trying to build robots for every possible scenario. That approach would require a much larger team.

36Kr: Galbot has only released one robot body, but multiple AI models. Are you prioritizing models over hardware?

WH: Surprisingly, we actually employ more hardware engineers than software engineers. People assume one product means less need for hardware support, but our design standards are quite different.

Many companies build robots that only need to perform for a short demo. But there’s a big gap between a demo bot and one that operates round-the-clock in a real pharmacy. If a hardware glitch requires a technician, that adds cost. From day one, we’ve aimed to meet or even exceed automotive-grade reliability standards.

36Kr: What about R&D spending?

WH: Our biggest investment is still in model development. But this doesn’t mean hiring a ton of people to train models. It’s about building a complete pipeline that goes from data infrastructure to training and testing. Compute costs are a major part of that. In fact, the top model engineers are always rare, no matter the company.

36Kr: Galbot is known for using synthetic data. But other companies say they do this too, blending it with simulation, real-world video, and robot data. What sets you apart?

WH: Synthetic data is only useful if you know how to use it. Some companies dismiss it, but we’ve used it to cut training costs and improve generalization. Our proprietary data generation system is core to our success.

Yes, anyone can download YouTube videos. We also use some teleoperation data from our retail deployments, but synthetic data makes up the bulk. To do this well, you need deep expertise in graphics, physics simulation, rendering, action pipelines, and validation. That infrastructure takes years to build. It’s why our models generalize better.

36Kr: Your robots use wheeled bases. Is that because Galbot emphasizes upper-body tasks?

WH: It comes down to market demand. Most customers want robots that can move, pick, and place in retail or factory environments. Wheels are better for that. Bipedal robots are noisy and have short battery life. Ours can operate for six to eight hours on a single charge.

We do research across the full embodied stack, including bipedal systems. But those aren’t yet ready for large-scale use.

36Kr: Hospitality robots, like those used for greeting or performing, have gained traction this year. Why hasn’t Galbot entered that market?

WH: I think those use cases are a flash in the pan. Real market value doesn’t come from viral moments. It comes from consistent user experience.

Take lobby greeters. They exist, but they are mostly ornamental. We’re developing the next generation of concierge robots, machines people will actually want to use because they are helpful. Once we get there, the market will follow.

We’re not ignoring this space. We’re building toward it. Right now, we’re connecting technical dots and gradually expanding into broader applications.

36Kr: Do your investors pressure you to commercialize quickly?

WH: Our investors have been highly supportive, not just with funding but also with strategic resources. We already have deployments generating revenue, so we don’t feel intense pressure.

36Kr: What about the education and research markets? Are you targeting them?

WH: That’s a matter of priorities. The education market may not be as profitable as it seems. How many units can you actually sell?

There’s already a swarm of bipedal robot companies chasing that segment. At Galbot, we want to solve real market pain points, not commoditize robot bodies as if they are just steel parts.

36Kr: Are you seeing signs of a price war?

WH: Absolutely. Prices have dropped fast. Some robots are selling for just tens of thousands of RMB, and prices will fall further. That’s not all bad: faster hardware cycles bring down costs, which helps us too.

But the key question is: what problems are these low-cost robots solving? We focus on high-value tasks. Our robots cost six figures in RMB, and customers are happy to pay because they offset labor costs, especially where three-shift coverage is required. That’s why we expect to bring in an eight-figure RMB sum this year.

36Kr: Why are customers willing to pay a premium?

WH: Expectations are different. If someone buys a robot for cheap, they don’t expect much. But if you want a robot that can be deployed onsite, working around the clock with no failures for a full month, that’s what we offer.

36Kr: Some critics say that pick-and-place tasks are too limited. What’s your take?

WH: In retail, warehousing, and automotive logistics, tens of thousands of people do exactly that: move, pick, and place. If someone thinks that’s a niche market, they probably haven’t looked closely. I see demand for hundreds of thousands of such robots, potentially more than the global output of industrial robots today.

36Kr: So why haven’t these robots seen wider deployment?

WH: Because the technology isn’t mature enough. Even leading players like Google DeepMind’s RT series haven’t achieved full deployment.

At the BAAI Conference, we were the only ones who dared to show our robot autonomously retrieving and delivering goods live on stage. No remote control, no pre-scripting. I haven’t seen anyone else do that yet.

36Kr: Other companies show robots doing more complex tasks like shaving or folding clothes. Isn’t that more impressive?

WH: Those are academic demonstrations, not products. Folding clothes sounds great, but can anyone deliver that at the speed, neatness, and generalization required for commercial use?

We’re also working on complex skills, like hanging clothes on a hanger. Our synthetic data library includes millions of virtual garments. But no one has turned that into a scalable, reliable product yet.

36Kr: Of your deployments, which are market-ready and which are still in testing?

WH: Our pharmacy and retail applications are fully commercialized. That’s where most of our revenue comes from.

Factory deployments are still at the proof-of-concept (POC) stage. These environments demand high precision and uptime. On an electric vehicle line, even a minute of downtime is costly. Tesla, Figure AI, and others are also in the POC phase.

Still, we’ve delivered several industry-first POC projects globally, like scanning and plate sorting (SPS) for a global carmaker, bin and sunroof transport for Mercedes-Benz, and material handling for Zeekr. We’re moving fast, but full integration into production lines will still take time.

36Kr: Are any of those automakers investors?

WH: No, none of them have invested in us. But they have strong automation needs, which is why we’ve built partnerships with them.

36Kr: You’ve released several models. Beyond GraspVLA, are other models like TrackVLA commercialized?

WH: TrackVLA targets consumer-facing uses, like industrial inspection or transporting items in-store. We’re working with Unitree Robotics and others to deploy it.

Navigation is easier to generalize than manipulation. Our models can work across various robotic dog platforms, aiding cross-hardware scalability.

36Kr: Agibot and Astribot are working with Physical Intelligence (PI). Does teaming up with a top model provider help with commercialization?

WH: I don’t know the specifics of their partnerships. What I do know is that PI gathers a wide variety of real-world robot data from different manufacturers. From a data quality standpoint, I don’t support the approach of aggregating data from many robot types. Mixing data across platforms often leads to low-quality training data.

36Kr: If we compare embodied intelligence to foundation models in AI, where do things stand?

WH: The comparison doesn’t really apply. Embodied intelligence has far more dimensions.

With autonomous driving, for example, there are five levels, and that’s just for driving. Embodied intelligence spans a wide range of tasks. A robot might master pick-and-place but still be unable to lift a child or help someone stand up.

Each product needs its own metric. To be called intelligent, a robot should reach at least Level 4: autonomous operation, not just functioning as a tool.

Unlike the rapid leap seen with ChatGPT, embodied intelligence is progressing slowly and incrementally.

36Kr: So the “ChatGPT moment” for embodied AI is still far off?

WH: Yes. ChatGPT works for general Q&A. Embodied AI faces hurdles in hardware, sensors, and data collection. Many components are still immature. We’re five to ten years away.

Humans rely on more than vision, language, and action. We use hearing, smell, touch, and temperature. Vision-language-action (VLA) models are just the beginning. To achieve human-level embodied intelligence, we’ll need to integrate more modalities.

Right now, the most viable path is perfecting mobile pick-and-place in predictable environments like supermarkets and factory lines. If we can scale that, it will be a huge milestone akin to building a fully automated dark factory.

36Kr: Are others in the industry aiming for that same milestone, or pursuing different breakthroughs?

WH: Honestly, not many are doing the hard work. Some just sell hardware or platforms without ensuring end users can solve problems.

Even among those focused on models, most are still doing academic work. Very few are building practical products for real deployment. That lack of responsibility and realism is why the field feels fragmented.

36Kr: What needs to be done to deploy mobile pick-and-place robots in services like pharmacies and convenience stores?

WH: Every new deployment requires data prep. Whether it’s synthetic data, small-scale real-world data, or even reinforcement learning in a specific setting, some kind of groundwork is essential for building a product that performs consistently.

We’re not trying to solve every problem at once. We’re starting with grocery and retail shelving and making sure our robots can generalize well in those specific scenarios. Only after that will we scale to more varied environments. It’s a tougher path than most people realize.

36Kr: Beyond pick-and-place, what’s Galbot’s next frontier?

WH: Our team of researchers is working on legged robots, dexterous hands, and more. Dexterous manipulation is a personal focus—I’ve won several awards in this area. It’s the ultimate challenge in robotic control.

Our strategy is to stay ahead of the curve. Galbot’s mission is to create general-purpose robots that can serve every industry and household.

KrASIA Connection features translated and adapted content that was originally published by 36Kr. This article was written by Wang Fangyu for 36Kr.
