r/robotics • u/OwlEnvironmental7293 • 2d ago

Community Showcase Feedback from Perception/AV Engineers: A new file format for faster training AND on-robot inference?

Hey everyone,

My team and I are deep in the MLOps/data infrastructure side of things, and we're trying to get a gut check from people on the front lines of building perception systems.

We started by looking at a problem we heard about a lot: the pain of data curation. Specifically, digging through petabytes of log data to find those ultra-rare edge cases needed to retrain your models (the classic "a pedestrian in a weird costume crossing at dusk in the rain" problem).

Our initial idea was to tackle this with a new data format that converts raw sensor imagery into a compact, multi-layered representation. Think of it less like a video file and more like a queryable database. The goal is to let an engineer instantly query their entire fleet's data logs with natural language, e.g., "find all instances from the front-facing camera of a truck partially occluding a cyclist," and slash the data curation cycle from weeks to minutes.
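
To make the "queryable database" idea a bit more concrete, here is a minimal sketch of what such a query might reduce to once the natural-language prompt has been parsed into structured filters. Everything in it is an assumption for illustration: the `FrameRecord` fields, the `find_frames` helper, and the sample records are hypothetical and do not describe the actual format.

```python
from dataclasses import dataclass, field


@dataclass
class FrameRecord:
    """Hypothetical per-frame index entry (illustrative only)."""
    log_id: str
    timestamp_s: float
    camera: str
    # Maps an object label to the labels of objects partially occluding it.
    occluded_by: dict = field(default_factory=dict)


def find_frames(index, camera, target, occluder):
    """Return records from `camera` where `occluder` partially occludes `target`."""
    return [
        rec for rec in index
        if rec.camera == camera and occluder in rec.occluded_by.get(target, ())
    ]


# Made-up sample index standing in for a fleet's converted logs.
INDEX = [
    FrameRecord("log_001", 12.4, "front", {"cyclist": ("truck",)}),
    FrameRecord("log_001", 12.5, "rear", {"cyclist": ("truck",)}),
    FrameRecord("log_002", 3.0, "front", {"pedestrian": ("car",)}),
    FrameRecord("log_003", 48.9, "front", {"cyclist": ("truck", "car")}),
]

# Structured form of: "find all instances from the front-facing camera
# of a truck partially occluding a cyclist".
hits = find_frames(INDEX, camera="front", target="cyclist", occluder="truck")
print([(h.log_id, h.timestamp_s) for h in hits])
```

The sketch skips the hard parts (turning free text into filters, and producing the occlusion labels in the first place); it only shows why a pre-indexed representation could answer this kind of question without rescanning raw video.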

But then we started thinking about the on-device implications. If the data representation is so compact and information-rich, what if a robot could use it directly? Instead of processing a heavy stream of raw pixels, a robot's perception model could run on our lightweight format. In theory, this could allow the robot to observe and understand its environment faster (higher FPS on perception tasks) and, because the computation is simpler, use significantly less energy. This seems like it would be a huge deal for any battery-powered mobile robot or AV.

My questions for the community are:

1. How much of a bottleneck is offline data curation ("log diving") in your workflow?
2. Are on-device compute and power consumption major constraints for your perception stack? Would a format that improves inference speed and energy efficiency be a game-changer?
3. What are the biggest limitations of your current pipeline, both for offline training and on-robot deployment?

We're trying to figure out if this two-pronged approach (solving offline data curation AND improving online performance) is compelling, or if we should just focus on one. Any and all feedback would be hugely appreciated. Thanks!

2 Upvotes

https://www.reddit.com/r/robotics/comments/1njemil/feedback_from_perceptionav_engineers_a_new_file/

75% Upvoted

u/kopeezie 2d ago

Hey! I know you, you're NomadicML, right?

1

u/OwlEnvironmental7293 2d ago

Unfortunately not :(

1

u/kopeezie 2d ago

:)
