Computing Workflows, Data Science, and such


Autonomous Car Model Review

I didn’t have much specifically planned for this week, so I ended up reviewing some work I did over the weekend. This was a mix of transfer learning, rebuilding other models in Keras, and processing more data.

As mentioned before, the SqueezeNet transfer learning wasn’t paying off, so I moved on to replicating the results of DeepDrive in Keras. Craig from DeepDrive was kind enough to share his model weights in a binary format. I worked out how to map them to a Keras AlexNet-based model with a little reshaping. But the model seems to list to the left. It might be that Caffe does something in its AlexNet implementation that I’m not accounting for, or that I’ve got the color channels screwed up. Either way, I wasn’t having great luck with this on the first 100 samples of the DeepDrive data. But it does work in Caffe, so there must be something here.
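Roughly, the reshaping comes down to transposing the weight arrays, since Caffe and Keras order the kernel dimensions differently. Here’s a sketch; the layer name and parsed arrays are placeholders rather than the actual format of the shared weights:

```python
import numpy as np

def caffe_conv_to_keras(w):
    # Caffe stores conv kernels as (out, in, h, w); Keras with a
    # TensorFlow backend expects (h, w, in, out).
    return np.transpose(w, (2, 3, 1, 0))

def caffe_dense_to_keras(w):
    # Caffe dense weights are (out, in); Keras wants (in, out).
    return w.T

# Hypothetical usage once the binary file is parsed into arrays:
# model.get_layer("conv1").set_weights([caffe_conv_to_keras(conv1_w), conv1_b])
```

If the color channels are the culprit, Caffe models are usually trained on BGR images, so flipping the input-channel axis of the first converted conv kernel (`w[:, :, ::-1, :]`) is one cheap thing to try.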

Getting frustrated, I turned back to my own models. The 0.03 MSE left a lot of room for improvement, so I looked around for inspiration. Nvidia recently released a TensorFlow implementation of one of their models, and it had a relatively reasonable number of parameters. I didn’t want to just copy it, but I did try 5 convolutional layers with an increasing number of kernels in each layer. With a modest amount of training, this started to perform decently, with about 0.01 MSE. Finally, a win.
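For the record, the topology is along these lines; the exact filter counts, kernel sizes, and input shape below are illustrative stand-ins rather than the precise values from the stream:

```python
from tensorflow.keras import layers, models

def build_steering_model(input_shape=(120, 160, 3)):
    """Five conv layers with increasing filter counts, then a small
    dense head predicting a single steering value."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(24, (5, 5), strides=2, activation="relu"),
        layers.Conv2D(36, (5, 5), strides=2, activation="relu"),
        layers.Conv2D(48, (5, 5), strides=2, activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.Flatten(),
        layers.Dense(100, activation="relu"),
        layers.Dense(1),  # steering angle (regression, MSE loss)
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```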

This comes with a caveat, however, that I also processed some more DeepDrive data, about doubling the amount available. I strongly suspect this new data is a little more well-behaved generally, so that might color the effectiveness of this model.

What’s worse, during the stream I discovered a typo in this “good model”. Turns out I was using one fewer convolutional layer than I thought! Fixing that and rerunning gives a new model that’s a little more effective still. It pays to double-check for easy-to-make (and easy-to-fix) mistakes.

Transfer Learning

After last week’s troubles, this past stream went over well. Introducing transfer learning, though it hasn’t yet produced smashing results, at least worked quickly. To briefly review, transfer learning is using part of an existing model (topology and weights) to kick-start your own model. In this case, it means using several of the input layers of the SqueezeNet architecture to extract interesting image features, and then using those as inputs to the self-driving car model.
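In Keras, the pattern looks roughly like the following. SqueezeNet doesn’t ship with keras.applications, so VGG16 stands in here as the pretrained feature extractor, and the dense head is a placeholder:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

# Freeze the pretrained convolutional layers and train only the new head.
base = VGG16(weights="imagenet", include_top=False, input_shape=(160, 320, 3))
base.trainable = False

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1),  # steering angle
])
model.compile(optimizer="adam", loss="mse")
```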

So far, the results haven’t been stunning. Off-stream, I trained for about 10 epochs, but the validation loss never dropped below 0.02 (and then I ran into some weird memory problem). This is all with the DeepDrive data, since that’s a bit easier to chew on and I feel has a better mix of turns in the small subset of data I’ve retained here.

One solution might be to jack up the number of neurons in the dense layer. This will supply more interesting combinations of the previous convolutional and real-valued inputs. It will also slow down the model, so I need to be careful how far I’m willing to push it.
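With the functional API, that widening looks something like this; the feature dimensions and the 256-unit width are arbitrary placeholders:

```python
from tensorflow.keras import layers, models

# Placeholder shapes: image features from the frozen convolutional base,
# plus a real-valued telemetry input (e.g. speed).
img_features = layers.Input(shape=(512,), name="image_features")
telemetry = layers.Input(shape=(1,), name="telemetry")

x = layers.concatenate([img_features, telemetry])
x = layers.Dense(256, activation="relu")(x)  # the widened dense layer
steering = layers.Dense(1)(x)

model = models.Model(inputs=[img_features, telemetry], outputs=steering)
model.compile(optimizer="adam", loss="mse")
```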

Comma.ai Model

I must apologize for the short stream this week. As I was processing the data, I read it in a second time and used up all available memory. This ground the computer (and the stream) to a halt.

But, in the 30 minutes of DanDoesData, I created a simple model based on the comma.ai data. Over the weekend, I had fixed up the data shrinking and scaling, but I suspect that I’ll have more of that yet to do. The steering is dominated by a few large turns in the data, with small lane changes at other times. Or maybe it’s just the visualization that I need to change.
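For what it’s worth, a quick log-scaled histogram is one way to check whether the big turns really dominate or whether the plot was just hiding the small corrections. The file name and key below are my reading of the comma.ai log layout, so treat them as assumptions:

```python
import h5py
import matplotlib.pyplot as plt

# "log.h5" and "steering_angle" are assumptions about the comma.ai layout.
with h5py.File("log.h5", "r") as log:
    angles = log["steering_angle"][:]

plt.hist(angles, bins=100)
plt.yscale("log")  # rare big turns next to the mass of near-zero corrections
plt.xlabel("steering angle")
plt.ylabel("count (log scale)")
plt.show()
```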

The dataset I compressed will be posted here, hopefully by the time you read this post. I may cut it down further, or at least remove some of the outlier shots (like the car backing out of a driveway), but it’s usable right now.

Comma.ai data

The race over the weekend went…ok, based on what I heard. I was unable to attend myself, but I believe Otto got on the track and promptly listed right. The track was more or less a large right turn, so that’s something, right?

This week, I wanted to take a look at a different data set that came out recently, comma.ai. This is 7.25 hours of highway driving put out by a startup that wants to add after-market self-driving capabilities to regular cars. I think that’s an ambitious idea, but it may well be the way to bring autonomous vehicles to the masses, so why not have a look.

The dataset is a full 45 GB, but I only wanted to work with a small subset. So I picked one of the small image files (3 GB or so) and went with that. When I have a compressed minimal dataset ready, I’ll post that.

The data itself is 160x320 RGB images and a whole mess of logging information, including the precious speed, acceleration, and steering angle. The minimal data in question is about 20,000 data points, although the first 1,000 or so are the vehicle backing out of a garage.

This episode was spent just poking through the data and trying to reformat it. One catch is that there are 20 images per second, but the logging records 100 times per second. What’s worse is that it’s not exactly those numbers, so you can’t just take every 5th logging point to tie to the image. The logging does provide a pointer to the appropriate image for each log point, but it took some mangling to get that in the form I wanted. Once that was done, it was mostly smooth sailing resizing the image data. From 3 GB down to about 256 MB. Not bad.
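The alignment and resizing went roughly like the sketch below. The file names and HDF5 keys ("X", "cam1_ptr", "steering_angle") reflect my reading of the comma.ai layout, and the output size is illustrative:

```python
import h5py
import numpy as np
from PIL import Image

cam = h5py.File("camera.h5", "r")   # ~20 frames per second
log = h5py.File("log.h5", "r")      # ~100 log entries per second

frames = cam["X"]                    # stored as (N, 3, 160, 320)
ptr = log["cam1_ptr"][:]             # image index for each log entry
steering = log["steering_angle"][:]

# Keep one log entry per frame: the first log row pointing at that frame.
first_hit = np.searchsorted(ptr, np.arange(len(frames)))
steering_per_frame = steering[first_hit]

# Shrink each frame to bring the subset down from gigabytes to megabytes.
small = np.stack([
    np.array(Image.fromarray(f.transpose(1, 2, 0).astype(np.uint8)).resize((160, 80)))
    for f in frames
])
np.savez_compressed("comma_small.npz", X=small, y=steering_per_frame)
```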

The other parameters still need to be rescaled. That should be pretty straightforward, but one known issue is that most of the driving is straight ahead, with only slight adjustments. That’s reflective of real highway driving, of course, but Otto will be doing a lot of turning.
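The rescaling itself is nothing fancy, something like dividing each channel by its largest absolute value so that steering lands in roughly [-1, 1]:

```python
import numpy as np

def scale_to_unit(x):
    # Divide by the largest absolute value so the channel spans roughly [-1, 1].
    return x / np.max(np.abs(x))

# e.g. applied to the per-frame steering from the earlier sketch:
# steering_scaled = scale_to_unit(steering_per_frame)
```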

Self-driving Control Script

The first real race is this weekend, so the push this past week was to finalize the model and write a control script to communicate with the car. On the model front, I decided to drop trying to model gas, preferring to have a decent model of steering instead. This also meant reducing the regularization and removing the dropout; I think the data I have may not be sufficient for these kinds of techniques. The final Mean Squared Error I got was about 0.028. Not great, but ok.

Isaac Gerg was kind enough to send me a model he cooked up using the same data. His model converts the images to grayscale, but is similar in style. The other big difference is that he goes for a wider model with one fewer layer. The computation takes longer, but this model got an MSE of about 0.02 on stream. Not half bad.

To make the model drive the car, we have to communicate over a serial port with a FUBARino. First we read in the acceleration (as well as yaw, pitch, and roll). Then we combine this with webcam imagery (via Pygame) to make a steering prediction. This prediction gets rescaled and sent back down the serial port to direct the car. Step 3: Profit.
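A sketch of that loop is below. The port name, baud rate, telemetry format, steering scale, and the "S<angle>" command are all illustrative assumptions rather than the real FUBARino protocol, and the model is assumed to take both the image and the telemetry as inputs:

```python
import numpy as np
import pygame
import pygame.camera
import serial
from tensorflow.keras.models import load_model

# Port, baud rate, and the "S<angle>" command are assumptions for illustration.
port = serial.Serial("/dev/ttyUSB0", 115200, timeout=0.1)
model = load_model("steering_model.h5")

pygame.camera.init()
cam = pygame.camera.Camera("/dev/video0", (320, 240))
cam.start()

while True:
    # 1. Read acceleration, yaw, pitch, and roll from the board.
    line = port.readline().decode(errors="ignore").strip()
    try:
        values = [float(v) for v in line.split(",")]
    except ValueError:
        values = []
    telemetry = np.array(values if len(values) == 6 else [0.0] * 6)

    # 2. Grab a webcam frame and predict a steering angle.
    frame = pygame.surfarray.array3d(cam.get_image())  # (width, height, 3)
    frame = np.transpose(frame, (1, 0, 2))[np.newaxis] / 255.0
    steering = float(model.predict([frame, telemetry[np.newaxis]], verbose=0)[0, 0])

    # 3. Rescale to the servo range and send it back down the wire.
    command = int(90 + 90 * np.clip(steering, -1, 1))
    port.write(f"S{command}\n".encode())
```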
