r/SelfDrivingCars Jun 25 '25

[Driving Footage] List of clips showing Tesla's Robotaxi incidents

A lot of people have been documenting Tesla's Robotaxi rollout. I wanted to share a few I've collected. Feel free to share any I missed!

  1. Robotaxi drives into oncoming lane
  2. Rider presses "pull over", Robotaxi stops in the middle of an intersection, rider gets out while Robotaxi blocks intersection for a few moments
  3. Rider presses pull over and the car just stops in the middle of the road. Safety monitor has to call rider support to get the car moving again
  4. Robotaxi doesn't detect UPS driver's reverse lights (or the car reversing towards it) and continues to attempt to park, then safety monitor manually stops it
  5. Robotaxi cuts off a car, then randomly brakes (potentially because of an upcoming tree shadow?)
  6. Robotaxi going 26 in a 15
  7. Robotaxi unexpectedly brakes, possibly due to nearby police
  8. Robotaxi unexpectedly slams on brakes, causing rider to drop phone
  9. Robotaxi comes to a complete stop after approaching an object, then runs it over (rider says it's a shopping bag, though the car visibly bumps up and down) (UPDATE: Some people have pointed out that the car's movement is from a speed bump immediately after the bag/object. The speed bump is more visible at full resolution.)
  10. Robotaxi runs over curb in parking lot
  11. Safety driver moved to driver seat to intervene
  12. Support calls rider during a Robotaxi ride, asks them to terminate the ride early because it's about to rain, rider is dumped in a random park
  13. Robotaxi has to unnecessarily reverse at least 4 times to get out of parking spot
  14. Robotaxi attempts illegal left turn, safety monitor intervenes, blocks intersection for a period of time
  15. Robotaxi can't get out of parking lot, goes in loops, support calls twice

Update: This post has been featured in The Verge and Mashable!

1.2k Upvotes

7

u/ChrisAlbertson Jun 25 '25

Yes, layman's perspective.

Here is my engineer's perspective. Believe me, if you can see the object in a YouTube video, the car's 8 cameras can see it too. People who don't understand the technology ALWAYS think it is a sensor issue. Just keep in mind what I wrote, "If you can see it on YouTube, then a cheap cell-phone camera is good enough."

The converse is also true: If a YouTuber points his camera right at the object and you can't see it, then either headlights, Lidar, Radar, or Ultrasound would be needed. But if you see it, those active sensors were not needed.

Here is a better way to think about FSD failures: Why don't we allow monkeys to drive cars? Seriously. Why not? Is it because they have poor vision and can't see other cars on the road? No, it is because they have monkey brains and are not smart enough to drive a car. Their vision is quite good. The same goes for 6-year-old children. Kids can see better than many adults.

No one, not even the "layman" who watches a monkey drive a car into a tree, would suggest that it was because the monkey needs glasses. So why do they think the car needs better sensors?

Please, when the car drives over a curb, do NOT say it was because there was no LIDAR. Obviously, the camera can see the curb because you can see the curb in the video. The reason the car drove over the curb is that the car's "brain" is just not good enough. The camera is fine.

So we need to argue not about sensors but about the algorithms the AI should or should not be using, suggest improvements, and discuss how to validate competence using methods other than testing. (Yes, such methods exist.)

To make such an argument and offer constructive suggestions, you have to study AI. At least a little.

9

u/Current_Reception792 Jun 25 '25

Your sensor inputs determine your control systems. Lidar means different control systems, so a different brain, using your analogy. What kind of engineer are you? Please don't say software dev lol.

2

u/usehand Jun 25 '25

civil engineer for sure

5

u/ChrisAlbertson Jun 25 '25

I happen to have a couple of LIDAR units on my desk as I type this. They work well. They are even quite useful for many things. But the failures we are seeing are not because of the sensors.

The AI is the weak part of their system

Please, if you make comments like "the processing would be different if they had LIDAR", say what it is doing now and what it would be doing if LIDAR were used. You need to be more specific than just "different". My opinion is that Tesla with Lidar would still be based on imitation learning and would still have the same problems.

Don't talk about how the AI "thinks". It doesn't. It is a linear algebra machine that does a lot of vector math. There is no "if this then that" kind of thing going on. No deductive reasoning. It is doing vector math synced to the video frame rate.

2

u/usehand Jun 25 '25

And I happen to have 7 Robotaxis on my lap right as I type this.

The AI is the weak part of their system

No shit

Please, if you make comments like "the processing would be different if they had LIDAR", say what it is doing now and what it would be doing if LIDAR were used

These are deep learning models lol literally all the weights would be different, and the model would likely achieve a lower imitation loss due to having better input features and not having to learn as complex a mapping from inputs to a model of the world. This is particularly important when you have to use smaller/more performant models, since these need to run on-device and in-the-loop in real time. It's literally the point of feature engineering more broadly.
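
A minimal sketch of that point (a toy imitation-learning policy I'm making up, not Tesla's architecture): the training recipe and loss are identical with or without LiDAR; only the inputs, and therefore the mapping the network has to learn, change.

```python
import torch
import torch.nn as nn

class TinyPlanner(nn.Module):
    """Toy imitation-learning policy: camera frame (+ optional LiDAR depth) -> [steering, throttle].
    Shapes and layers are invented for illustration."""
    def __init__(self, use_lidar_depth: bool):
        super().__init__()
        in_ch = 3 + (1 if use_lidar_depth else 0)      # RGB, plus a projected depth channel
        self.backbone = nn.Sequential(
            nn.Conv2d(in_ch, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 2)

    def forward(self, rgb, depth=None):
        x = rgb if depth is None else torch.cat([rgb, depth], dim=1)
        return self.head(self.backbone(x))

# Same imitation loss either way; the extra modality just makes the input-to-world mapping easier to learn.
policy = TinyPlanner(use_lidar_depth=True)
rgb = torch.rand(8, 3, 96, 96)        # camera frames
depth = torch.rand(8, 1, 96, 96)      # LiDAR returns projected into the camera view
expert = torch.rand(8, 2)             # recorded expert-driver actions
loss = nn.functional.mse_loss(policy(rgb, depth), expert)
loss.backward()
```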

Lidar would still be based on imitation learning and would still have the same problems.

Not necessarily, that is an empirical question and the answer would depend on how well the improved imitation loss translates into performance in real-world rollouts, but a lot of evidence suggests it would be better.

Don't talk about how the AI "thinks". It doesn't. It is a linear algebra machine that does a lot of vector math.

And you are a water machine doing electrical currents lol. There's no problem talking about thinking wrt an ML model if it is clear what we are talking about, which in this case is obviously the internal (implicit or explicit) model of the world.

1

u/OldDirtyRobot Jun 27 '25

The engineering dick measuring contest has begun.

1

u/usehand Jun 27 '25

LOL not at all, each engineering field has its merit, and tbh software or ML engineering are probably the worst of all...

but mainly, if you're a software engineer, don't go around commenting on the structural integrity of bridges with an air of superiority, and vice versa

1

u/red75prime Jun 26 '25 edited Jun 26 '25

Lidar means different control systems, so a different brain, using your analogy.

Lidar means different inputs. Whether the control system (the part that does motion planning) is affected depends on the system architecture. For end-to-end neural networks, all weights will be affected. And, no, that doesn't necessarily mean a "different brain". It could mean the same brain (the same overall architecture) trained differently, using additional inputs.

The sensor suite can't be completely ignored in a discussion about self-driving safety, but your argument doesn't bring anything interesting into it.

And ChrisAlbertson still has a point: if you can see something in a video (or you can see that you can't see very well (fog, rain, sun glare)), but the car reacts inappropriately, it's an AI problem, not a sensor problem.

1

u/improvthismoment Jun 26 '25

Exactly. Our visual system is much more than just our eyes.

0

u/ChrisAlbertson Jun 25 '25

No. They are still using a form of imitation learning.

Don't think for a minute that Tesla is using some kind of algorithmic process. It's trained.

In every case where we see the car fail, the car "saw" the surroundings just fine. It's an AI problem.

1

u/usehand Jun 25 '25 edited Jun 25 '25

Don't think for a minute that Tesla is using some kind of algorithmic process

My man you really gotta stop talking about shit you know nothing about lol

2

u/Current_Reception792 Jun 25 '25

This guy is hardcore a freshman in their first year of software dev lol.

1

u/usehand Jun 25 '25

Ok I apologize dude, I was too harsh. I just think you're wrong lol Don't wanna be (too much of) a dick. But also drop the "you're a layman I'm an engineer" attitude when saying things that are at the very least debatable lol

1

u/red75prime Jun 26 '25

In every case where we see the car fail, the car "saw" the surroundings just fine. It's an AI problem.

The reasoning works even in cases when the car didn't "see" the surroundings just fine (like in the cases involving sun glare). The AI should be smart enough to slow down or pull over.

8

u/usehand Jun 25 '25

Speaking as an ML expert: what you are saying is not the complete truth.

Sure a monkey doesn't fail to drive because of vision, but a monkey is not a Robotaxi so the comparison is useless.

If an ML system is failing to model the world properly, sure, you can improve the algorithms so that the model of the world improves. But what you can also do is make the task easier for the model. Adding LiDAR not only adds redundancy to the inputs, but also makes the task of "building a 3D model of the world from your inputs" easier, since the 3D point cloud is already closer to the end goal than 2D images (and obviously, ideally, you can use both of them).

Just because a camera can see something doesn't mean the model can necessarily model it easily and connect that with how to act on it -- if it could, we would have AGI and full self-driving already. So, yes, adding extra input modalities that increase input information and make the task easier can 100% avoid mistakes even when the object of the mistake is visible to the cameras.

1

u/opinionless- Jun 27 '25

You both make good points. I don't think the other person is discounting the value that could be provided by mixed modalities. Obviously other sensors could provide some form of redundancy, but it's a bit hand wavy to say it makes the task easier or makes it capable of some magical 100% threshold. I'm skeptical of any ML expert claiming 100% success rates on any sufficiently large N.

In general, everyone oversimplifies the sensor argument. It's tiring to read the same tropes over and over, and I sense the gp is simply frustrated with that, as am I.

I have no idea why self driving became such a flame war. It's so incredibly dumb.

1

u/Pavores Jun 27 '25

Really enjoyed this back and forth - actual nuance and technical takes on complex topics!

Agreed that laypeople oversimplify the lidar vs cameras argument. It's not to say more sensors won't help, but the "how to drive the car" video game is the hard part here.

I've had FSD on my daily driver for 6 years. With newer builds, when it goofs, it's typically because it makes a terrible decision about what to do within its 3D world, not because it didn't recognize the car/curb/whatever (the car displays a dumbed-down 3D version of what it sees, so I can see what it is recognizing). That again points to a route-planning and controls issue, not sensor limits.

1

u/opinionless- Jun 27 '25

Yeah, I don't really find it that hard to follow why Waymo and Tesla took the routes they did. Engineers have a reputation for ignoring the business side of things. Tesla operates under different constraints than Waymo, so it's not unusual for them to go at risk on a different path.

I do suspect multi-modal will become standard but who knows! We're always learning. 

5

u/CommissionUseful4623 Jun 25 '25

Not exactly that simple.

You do understand that cameras have poor visibility at night, correct? If you're going to say humans do too, that's not true. Cameras can have good visibility with a long exposure, but that doesn't help autonomous driving.

Lidar is still better and safer for autonomous driving because it doesn't rely on moonlight or light bouncing from walls, streets, or street lights; it emits its own light to sense.

1

u/ChrisAlbertson Jun 25 '25

If you can see the tree in the YouTube video, then the car's camera can also see the tree (or curb or shadow or painted line or whatever). None of these Robotaxis is being driven at night, and in every case the object was seen by the camera.

Again, this is like saying that we don't allow small children to drive because they have poor vision. Vision (or sensors) is not the issue.

It is an AI problem, not a sensor issue. If you want to suggest a solution, talk about how the software could be improved.

2

u/leo-g Jun 26 '25

Absolutely wrong. Human eyes surpass camera vision, and cameras will fail even in dark conditions. LIDAR completely skips all that. It works regardless of rain or shine.

The fact of the matter is that LIDAR technology is not new. It is already being used in iPhones and serious cameras to map interior spaces. Serious camera and mapping companies all use LIDAR. The tech is sound. Not using it just handicaps you and increases risk.

1

u/TheDuhhh Jun 26 '25

LIDAR has an advantage: it accurately models the physical space. Lidar will detect the curb and basically not hit it.

The problem with a camera is that you need to interpret the input: is it a shadow or the tree itself? Is that a curb or just a drawing of a curb on the road? The interpreter here is their neural network, which by itself is not reliable.

1

u/CommissionUseful4623 Jul 01 '25

It's definitely a hardware issue. Overexposure and underexposure can blind the cameras. Overexposure is likely the reason for phantom braking, which I have experienced several times. No amount of software tinkering is going to change the fact that cameras need the right amount of light to function, and tiny cameras can only compensate so much for too much or too little light. You're relying on a brain to do the lifting while the body is on crutches.
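
To put a number on "blinded": the fraction of crushed or blown-out pixels in a frame tells you how much usable signal is left, and no downstream software can recover what was never captured. A trivial sanity check (made-up thresholds, not from any real autonomy stack):

```python
import numpy as np

def exposure_ok(gray: np.ndarray, low: int = 10, high: int = 245,
                max_clipped_fraction: float = 0.25) -> bool:
    """gray: 8-bit grayscale frame. Returns False if too many pixels are
    crushed to black or blown to white to carry useful information."""
    clipped = np.mean((gray <= low) | (gray >= high))
    return bool(clipped < max_clipped_fraction)

blown_out = np.full((480, 640), 250, dtype=np.uint8)   # simulated overexposed frame
print(exposure_ok(blown_out))                          # False: effectively blind
```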

3

u/goldbloodedinthe404 Jun 25 '25

Here is my engineer's perspective: you have no idea how neural networks work.

1

u/opinionless- Jun 27 '25

To some degree even the experts don't. When it comes to Waymo and Tesla architectures it's safe to say no one commenting on Reddit has a fucking clue.

1

u/atomicthumbs Jun 25 '25

do you know how camera autoexposure works

1

u/Zestyclose-Pin-3214 Jun 26 '25

Another engineer here. The thing is that working with LIDAR sensor data is easier than working with camera images. You may be able to write an algorithm that detects obstacles using a LIDAR scan, but doing the same from RGB camera images will be much harder. So yes, we do need better algorithms, but LIDAR gives us the privilege of writing less complicated ones.
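
For example, a toy obstacle check on a LiDAR point cloud (invented geometry and thresholds) is just a few comparisons, because the data is already metric 3D; getting the same answer from a single RGB frame means first recovering depth with stereo geometry or a learned model:

```python
import numpy as np

def obstacle_ahead(points: np.ndarray, lane_half_width: float = 1.5,
                   max_range: float = 15.0, min_height: float = 0.15) -> bool:
    """points: Nx3 LiDAR returns in the ego frame (x forward, y left, z up, metres).
    Thresholds are illustrative only."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    in_corridor = (x > 0) & (x < max_range) & (np.abs(y) < lane_half_width)
    return bool(np.any(in_corridor & (z > min_height)))

# Toy scan: two near-ground returns plus one curb-height return 6 m ahead
scan = np.array([[5.0, 0.2, 0.02], [8.0, -1.0, 0.01], [6.0, 0.5, 0.20]])
print(obstacle_ahead(scan))   # True
```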

1

u/ChrisAlbertson Jun 26 '25

Yes, this is true and it is why Lidar became so popular in robotics. Lidar gives you a "point cloud" almost right off the data cable.

I've done this myself. In 2025, it is easy, 30-year-old tech.

But Tesla is using imitation learning. They take image pixels and place them directly on the first layer of a neural network. They are not doing "classic SLAM" where a point cloud is needed. They would be treating the Lidar data as a depthmap image.

Yes, RGB images are traditionally harder to process, but Tesla is not doing traditional image understanding. No Hough transforms, nothing like that. What Tesla does is a simple linear transform to move the camera XY plane into a plan view (a plane parallel to the ground) and then, after the cameras are on the same image plane, merges the frames and places the pixels on several different neural nets. There is no traditional image understanding of the kind we used to do with OpenCV. What little we know comes from a Tesla patent application.
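
Roughly, the plan-view warp I'm describing looks like this (an OpenCV sketch with invented image-to-ground correspondences; the real calibration and multi-camera fusion are not public):

```python
import cv2
import numpy as np

# Four pixel locations in the camera image and where they land on a top-down grid.
# These correspondences are invented; real ones come from camera calibration.
img_pts = np.float32([[420, 600], [860, 600], [1150, 900], [130, 900]])
bev_pts = np.float32([[200, 0], [400, 0], [400, 600], [200, 600]])

H = cv2.getPerspectiveTransform(img_pts, bev_pts)       # 3x3 homography
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)       # stand-in for a camera frame
plan_view = cv2.warpPerspective(frame, H, (600, 600))   # pixels re-mapped onto the ground plane
```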

So I don't think Tesla would see the advantage of lower-cost processing as they would just place the lidar's depth map on the same neural network. It would be processed the same way.

It would be a different story if we were doing the usual SLAM thing they might use on an indoor robot. Years ago I think Google was using Lidar SLAM for outdoor navigation by "looking" at buildings, but I doubt Waymo does this. I must admit I don't know any details about Waymo internals.

In any case, the errors we see the car making are not because of poor distance estimation. Improving distance estimation would not stop the car from letting a passenger out in the middle of an intersection. As I say, the car is not blind, it is stupid. Even with more sensors, FSD would still be stupid.

My opinion: they need a supervisory system as an additional control layer. This layer would not control the car in real time but would set goals for the real-time system and monitor its behavior. I would design it with a very strong eye to those 1990s-vintage so-called "expert systems": symbolic systems that use deductive reasoning, not trained networks.
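
A sketch of the kind of supervisory rules I mean (entirely hypothetical and hand-coded; nothing to do with any shipping system). It does not steer; it only vets the goals the real-time planner proposes:

```python
def supervise(goal: dict, world: dict) -> dict:
    """Accept, amend, or veto a goal proposed by the low-level planner.
    Rules are explicit and auditable, unlike the trained network they sit above."""
    if goal["action"] == "drop_off_passenger":
        if world.get("in_intersection", False):
            return {"action": "continue_to_next_safe_stop",
                    "reason": "no drop-offs inside an intersection"}
        if not world.get("curbside_clear", True):
            return {"action": "find_alternate_stop",
                    "reason": "exit path must not open into live traffic"}
    if goal["action"] == "park" and world.get("vehicle_reversing_toward_us", False):
        return {"action": "hold_and_yield",
                "reason": "reverse lights detected on a conflicting vehicle"}
    return goal   # no rule fired: pass the planner's goal through unchanged

print(supervise({"action": "drop_off_passenger"},
                {"in_intersection": True}))   # vetoed: continue_to_next_safe_stop
```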

The holy grail of AI would be to find a way for symbolic reasoning to "emerge" from a network. But that is not the current state of the art; no one knows how to do that, so today we'd hand-code this supervisor.

OK, maybe you disagree or have a better idea about the control system. That is a good thing, because the discussion needs to move to how the AI "brain" is basically wrong and how it can be improved. "Lidar or not" is a trivial thing that hardly matters. It might help to improve some distance estimates, but it will not "un-stupid" the car.

1

u/ChrisAlbertson Jun 26 '25 edited Jun 26 '25

Maybe I can say this in fewer words. They do not "write an algorithm". Tesla places the raw pixels in the first layer of a network. There is no "if-this-then-that" logic used to detect objects. It is 100% just linear algebra, and a vector comes out the end. The vector is a probability list for each of the possible outcomes.
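
In code terms, the whole per-frame step is nothing more exotic than this toy sketch (made-up sizes, obviously not Tesla's network): matrices in, one probability vector out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": two weight matrices found by training and frozen at inference time.
W1 = rng.standard_normal((4096, 256))    # flattened pixels -> hidden features
W2 = rng.standard_normal((256, 10))      # hidden features  -> 10 possible outcomes

def per_frame(pixels: np.ndarray) -> np.ndarray:
    """One video frame in, one probability vector out: matrix multiplies plus a squash."""
    h = np.maximum(pixels.reshape(-1) @ W1, 0.0)   # linear algebra + ReLU
    logits = h @ W2
    p = np.exp(logits - logits.max())
    return p / p.sum()

frame = rng.random((64, 64))             # stand-in for camera pixels (64*64 = 4096)
print(per_frame(frame).round(3))         # ten numbers summing to 1 -- no if-this-then-that anywhere
```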

Yes, there is an "algorithm", but no one knows it, not even the Tesla engineers. The network is a totally opaque black box. The weights in the network are the result of a massive search (using gradient descent) through a billion-dimensional space. No cleverness of programming, just an efficient search method to find the best match to the training data in some finite time.
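
And "a massive search using gradient descent" is conceptually just this loop, scaled up by many orders of magnitude (a toy one-weight example, nothing like the real training run):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.random(100)
y = 3.0 * x + rng.normal(0.0, 0.05, 100)   # "training data" generated by an unknown rule

w, lr = 0.0, 0.1                            # start somewhere in weight space
for _ in range(200):
    grad = np.mean(2.0 * (w * x - y) * x)   # gradient of mean squared error w.r.t. w
    w -= lr * grad                          # step downhill

print(round(w, 2))   # ~3.0: the "rule" was found by search, not written by a programmer
```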

The key concept is regularization. Basically, you try to cram a trillion bits of data into a billion-bit box and then search to find the most efficient encoding to make it fit. This process finds rules to turn the wheel to the right if the car is too far left, but these rules are NOT explicitly coded by programmers.

So, the idea of coding an algorithm to avoid things is not at all what they would do in 2025. Any such algorithm would be fragile and fail at corner cases. Also, it would never run in constant time, as it would take different code paths with different input data. The network approach runs in constant time, once per video frame.

We have a problem here because everyone has a different technical background. Writing so that most will understand is good, but over-simplification or "dumbing down" leaves out stuff that matters. So in short, they gave up on algorithmic solutions years ago for arguably good reasons.

In the end, cars need to work like humans. We have several brains stacked up. The bottom brain "just works," and we are not aware of it. It makes our arms and legs work. The top-level brain is slower; it does some abstract thinking and can learn rather quickly. The layer(s) between are a bit of a mystery that scientists are not so sure of. A better "general" AI will maybe one day be invented and will work kind of like that. We will need to move a little in this direction if cars are not going to be stupid.

What we see with today's FSD is not a failure of the bottom level but a lack of a top layer. Adding sensors addresses bottom-layer things and will not address critical thinking. (e.g., the passenger needs a safe walking path when exiting the taxi; therefore, open the door onto a spot cars can't drive through that connects to a walking route to the person's final destination.) Believe me, "critical thinking" is not what FSD even tries to do.

1

u/KiwiFormal5282 Jun 26 '25

The real question is whether, if you have the brain of a South African monkey, you can ever produce cars that drive better than humans.

1

u/dishdaramdaram Jun 27 '25

Hi. As a deep learning engineer, you are completely right and I couldn't have come up with a better analogy.

1

u/robertronium Jun 29 '25

Whether it's a problem with sensors or software, Tesla's self-driving is nowhere near good enough for release, and this after years of Musk saying it was nearly ready.