Speeding up Donkey Car training with Colab (TensorFlow GPU)

colab

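This notebook helps you quickly train a model for your Donkey Car or other self-driving model car.

It is based on, and improves upon, @sachindroid8's notebook. Credit to the original author!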

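First, the pros and cons of training on Colab

  1. Efficiency

Using Colab requires a VPN and means handing your data to Google. Upload time varies from person to person; mine took about 2-3 minutes. After that, all that remains is running the code and training. The first time you run the code you will want to understand what each cell does; on later runs it takes almost no time. As for training time on the GPU: with about 6k images, training stopped early after roughly 30-40 epochs and took less than 1 minute. Each epoch needs only 1-2 seconds.

For comparison, I then trained on my MacBook Pro. Because it has no GPU, each epoch took 30-50 seconds, and a complete training run took roughly 40 minutes to 1 hour.

  2. Convenience

Without a VPN, Colab is of course not an option. I know Baidu also offers some servers with a limited free tier, which I may test later, but if you do have a VPN, Google Colab is the best choice. The rest of this post walks you through training.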
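Before starting, here is a quick sanity check of the per-epoch timings quoted under Efficiency. This is only illustrative arithmetic on the ranges given in the text (1-2 seconds per epoch on the Colab GPU, 30-50 seconds on the MacBook Pro), not a new measurement, and `speedup_range` is a name made up for this sketch.

```python
def speedup_range(cpu_seconds, gpu_seconds):
    """Return (min, max) per-epoch speed-up given (low, high) seconds per epoch."""
    cpu_low, cpu_high = cpu_seconds
    gpu_low, gpu_high = gpu_seconds
    # Smallest speed-up: fastest CPU epoch against slowest GPU epoch.
    # Largest speed-up: slowest CPU epoch against fastest GPU epoch.
    return cpu_low / gpu_high, cpu_high / gpu_low


low, high = speedup_range(cpu_seconds=(30, 50), gpu_seconds=(1, 2))
print("GPU is roughly {:.0f}x to {:.0f}x faster per epoch".format(low, high))
```

With the figures from the text this gives a per-epoch speed-up somewhere between 15x and 50x.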

Load my notebook

Open the notebook "使用Google_Colab来Donkey_Car".

Install TensorFlow 1.14.0

TensorFlow 2.x still has compatibility problems with Donkey Car 3.x, so it is not recommended for now (as of 2019-08-17).

In [1]:

!pip install tensorflow-gpu==1.14.0

Collecting tensorflow-gpu==1.14.0
  Downloading https://files.pythonhosted.org/packages/76/04/43153bfdfcf6c9a4c38ecdb971ca9a75b9a791bb69a764d652c359aca504/tensorflow_gpu-1.14.0-cp36-cp36m-manylinux1_x86_64.whl (377.0MB)
     |████████████████████████████████| 377.0MB 86kB/s 
Requirement already satisfied: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.15.0)
Requirement already satisfied: absl-py>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (0.7.1)
Requirement already satisfied: tensorflow-estimator<1.15.0rc0,>=1.14.0rc0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.14.0)
Requirement already satisfied: wrapt>=1.11.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.11.2)
Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.1.0)
Requirement already satisfied: astor>=0.6.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (0.8.0)
Requirement already satisfied: keras-applications>=1.0.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.0.8)
Requirement already satisfied: tensorboard<1.15.0,>=1.14.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.14.0)
Requirement already satisfied: wheel>=0.26 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (0.33.4)
Requirement already satisfied: protobuf>=3.6.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (3.7.1)
Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.12.0)
Requirement already satisfied: keras-preprocessing>=1.0.5 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.1.0)
Requirement already satisfied: google-pasta>=0.1.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (0.1.7)
Requirement already satisfied: gast>=0.2.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (0.2.2)
Requirement already satisfied: numpy<2.0,>=1.14.5 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.16.4)
Requirement already satisfied: h5py in /usr/local/lib/python3.6/dist-packages (from keras-applications>=1.0.6->tensorflow-gpu==1.14.0) (2.8.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow-gpu==1.14.0) (3.1.1)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow-gpu==1.14.0) (0.15.5)
Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow-gpu==1.14.0) (41.0.1)
Installing collected packages: tensorflow-gpu
Successfully installed tensorflow-gpu-1.14.0
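Since the rest of the notebook depends on a TensorFlow 1.x release, it can be worth confirming which version is actually importable before going further. The following is a minimal sketch; the helper names `is_tf1` and `installed_tf_is_1x` are made up for this example and are not part of TensorFlow or Donkey Car.

```python
def is_tf1(version):
    """Return True if a version string such as '1.14.0' is a 1.x release."""
    return version.split(".")[0] == "1"


def installed_tf_is_1x():
    """Check the TensorFlow that is importable in the current runtime."""
    # Imported lazily so that is_tf1() can be used without TensorFlow installed.
    import tensorflow as tf
    return is_tf1(tf.__version__)


print(is_tf1("1.14.0"))  # the version pinned in the install cell
print(is_tf1("2.0.0"))
```

On Colab, calling `installed_tf_is_1x()` after the install cell should return True. If it does not, restarting the runtime so that the newly installed package is picked up is usually the first thing to try.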

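Check that the GPU is available

If the output is "Found GPU at: /device:GPU:0", the GPU is working.

If you do not see that output, check that the GPU hardware accelerator is selected under Runtime (runtime type).

In [2]:

import tensorflow as tf

# Returns an empty string when no GPU is visible to TensorFlow
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))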

Found GPU at: /device:GPU:0

Clone the Donkey repository

In [3]:

!git clone https://github.com/autorope/donkeycar.git donkey

Cloning into 'donkey'...
remote: Enumerating objects: 120, done.
remote: Counting objects: 100% (120/120), done.
remote: Compressing objects: 100% (76/76), done.
remote: Total 10558 (delta 65), reused 73 (delta 31), pack-reused 10438
Receiving objects: 100% (10558/10558), 58.74 MiB | 46.70 MiB/s, done.
Resolving deltas: 100% (6528/6528), done.

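Install Donkey Car

The install command itself did not survive in this copy of the notebook. Judging by the output below ("Obtaining file:///content/donkey", "Running setup.py develop for donkeycar"), it was an editable pip install of the cloned repository, presumably something like the following.

In [4]:

!pip install -e /content/donkey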

Obtaining file:///content/donkey
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (1.16.4)
Requirement already satisfied: pillow in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (4.3.0)
Requirement already satisfied: docopt in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (0.6.2)
Requirement already satisfied: tornado in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (4.5.3)
Requirement already satisfied: requests in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (2.21.0)
Requirement already satisfied: h5py in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (2.8.0)
Requirement already satisfied: moviepy in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (0.2.3.5)
Requirement already satisfied: pandas in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (0.24.2)
Requirement already satisfied: PrettyTable in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (0.7.2)
Collecting paho-mqtt (from donkeycar==3.1.0)
  Downloading https://files.pythonhosted.org/packages/25/63/db25e62979c2a716a74950c9ed658dce431b5cb01fde29eb6cba9489a904/paho-mqtt-1.4.0.tar.gz (88kB)
     |████████████████████████████████| 92kB 4.2MB/s 
Requirement already satisfied: olefile in /usr/local/lib/python3.6/dist-packages (from pillow->donkeycar==3.1.0) (0.46)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests->donkeycar==3.1.0) (1.24.3)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests->donkeycar==3.1.0) (2019.6.16)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests->donkeycar==3.1.0) (3.0.4)
Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests->donkeycar==3.1.0) (2.8)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from h5py->donkeycar==3.1.0) (1.12.0)
Requirement already satisfied: decorator<5.0,>=4.0.2 in /usr/local/lib/python3.6/dist-packages (from moviepy->donkeycar==3.1.0) (4.4.0)
Requirement already satisfied: tqdm<5.0,>=4.11.2 in /usr/local/lib/python3.6/dist-packages (from moviepy->donkeycar==3.1.0) (4.28.1)
Requirement already satisfied: imageio<3.0,>=2.1.2 in /usr/local/lib/python3.6/dist-packages (from moviepy->donkeycar==3.1.0) (2.4.1)
Requirement already satisfied: pytz>=2011k in /usr/local/lib/python3.6/dist-packages (from pandas->donkeycar==3.1.0) (2018.9)
Requirement already satisfied: python-dateutil>=2.5.0 in /usr/local/lib/python3.6/dist-packages (from pandas->donkeycar==3.1.0) (2.5.3)
Building wheels for collected packages: paho-mqtt
  Building wheel for paho-mqtt (setup.py) ... done
  Created wheel for paho-mqtt: filename=paho_mqtt-1.4.0-cp36-none-any.whl size=48333 sha256=9a67d0c95fae2b9495c20980895d9569ef53859cbc22bb57fc2466d7a581af7b
  Stored in directory: /root/.cache/pip/wheels/82/e5/de/d90d0f397648a1b58ffeea1b5742ac8c77f71fd43b550fa5a5
Successfully built paho-mqtt
Installing collected packages: paho-mqtt, donkeycar
  Running setup.py develop for donkeycar
Successfully installed donkeycar paho-mqtt-1.4.0

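Create the project

I used mycar as the project name. You can choose another name, but then you must update the code in the later cells to match.

In [5]: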

!donkey createcar --path /content/mycar

using donkey v3.1.0 ...
Creating car folder: /content/mycar
making dir  /content/mycar
Creating data & model folders.
making dir  /content/mycar/models
making dir  /content/mycar/data
making dir  /content/mycar/logs
Copying car application template: complete
Copying car config defaults. Adjust these before starting your car.
Copying train script. Adjust these before starting your car.
Copying my car config overrides
Donkey setup complete.
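Because the project name appears in several later cells, one option is to define it once and derive the other paths from it. This is only a sketch: `car_paths` is a hypothetical helper, and the directory layout is simply the one printed by `donkey createcar` above.

```python
import os


def car_paths(project="mycar", root="/content"):
    """Build the paths used later in this notebook for a given project name."""
    car_dir = os.path.join(root, project)
    return {
        "car": car_dir,
        "data": os.path.join(car_dir, "data"),
        "models": os.path.join(car_dir, "models"),
        "manage": os.path.join(car_dir, "manage.py"),
    }


paths = car_paths("mycar")
print(paths["data"])
print(paths["models"])
```

If you rename the project, only the argument to `car_paths` changes. Note that the `!` shell lines in the notebook would still need the new name substituted by hand or through IPython's `{variable}` interpolation.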

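Prepare the data: upload data.zip and unzip it

Now pack the tub directory you want to train on, from the data directory recorded on the Pi, into a file named data.zip.

On the Pi, in the data directory, run:

$ zip -r data.zip tub_3_19-08-17/

Then copy data.zip to your computer, ready to upload to Colab.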
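If the `zip` command is not installed on the Pi, the same archive can be produced with Python's standard library. A minimal sketch, assuming it is run on the Pi with the path of the data directory; `zip_tub` is a hypothetical helper name.

```python
import os
import shutil


def zip_tub(data_dir, tub_name, archive_base="data"):
    """Pack data_dir/tub_name into data_dir/archive_base.zip and return its path."""
    # root_dir/base_dir make the archive entries start with 'tub_name/',
    # which matches what `zip -r data.zip tub_name/` produces.
    return shutil.make_archive(
        os.path.join(data_dir, archive_base),
        "zip",
        root_dir=data_dir,
        base_dir=tub_name,
    )


# Example (run from the data directory on the Pi):
# zip_tub(".", "tub_3_19-08-17")
```

Keeping the tub folder name inside the archive matters, because the training step later expects a tub directory under the project's data directory.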

Upload data.zip to Colab

Run the code below. An upload button will appear; click it and upload the data.zip you just created.

In [7]:

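import os
from google.colab import files

# Remove any archive left over from a previous run
if os.path.exists("/content/data.zip"):
    os.remove("/content/data.zip")
if os.path.exists("/content/mycar/data/data.zip"):
    os.remove("/content/mycar/data/data.zip")

# Shows an upload button; the file is saved to the current directory (/content)
uploaded = files.upload()

# Make sure the destination exists, then move the archive into it
WORK_FOLDER = "/content/mycar/data/"
if not os.path.exists(WORK_FOLDER):
    os.makedirs(WORK_FOLDER)
!mv /content/data.zip /content/mycar/data/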
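The section title mentions unzipping, but the cell above only moves the archive into place. One way to extract it is sketched here with the standard library `zipfile` module. `extract_tub_archive` is a hypothetical helper, and the paths are the ones already used in this notebook.

```python
import os
import zipfile


def extract_tub_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the top-level names it contained."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        # Top-level entries, for example ['tub_3_19-08-17']
        return sorted({name.split("/")[0] for name in archive.namelist()})


# Only runs when the uploaded archive is actually present (i.e. on Colab).
if os.path.exists("/content/mycar/data/data.zip"):
    print(extract_tub_archive("/content/mycar/data/data.zip", "/content/mycar/data"))
```

The returned list makes it easy to see whether the archive really contained a tub directory at its top level.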

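Clean up the uploaded file

Make sure that /content/mycar/data contains the tub directory, and that the tub directory holds the images and their corresponding JSON files.

data.zip itself no longer needs to be kept.

In [0]: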

!rm /content/mycar/data/data.zip
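A quick way to confirm that the tub looks right is to count the images and JSON files it contains. A minimal sketch: `summarize_tub` is a hypothetical helper that only looks at file extensions and does not validate the contents of the records.

```python
import glob
import os


def summarize_tub(tub_dir):
    """Count the .jpg images and .json files directly inside a tub directory."""
    images = glob.glob(os.path.join(tub_dir, "*.jpg"))
    records = glob.glob(os.path.join(tub_dir, "*.json"))
    return {"images": len(images), "json": len(records)}


# Prints one line per tub found under the project's data directory.
for tub in sorted(glob.glob("/content/mycar/data/tub_*")):
    print(tub, summarize_tub(tub))
```

If a tub reports zero images or zero JSON files, the archive was probably packed from the wrong directory level.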

Train the model

In [21]:

!python /content/mycar/manage.py train --model /content/mycar/models/mypilot.h5

using donkey v3.1.0 ...
loading config file: /content/mycar/config.py
loading personal config over-rides

config loaded
WARNING: Logging before flag parsing goes to stderr.
W0818 04:53:15.944086 140399927523200 deprecation_wrapper.py:119] From /content/donkey/donkeycar/parts/keras.py:18: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

W0818 04:53:15.944333 140399927523200 deprecation_wrapper.py:119] From /content/donkey/donkeycar/parts/keras.py:18: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2019-08-18 04:53:15.954857: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-08-18 04:53:15.959505: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-08-18 04:53:16.083690: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.084227: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2455100 executing computations on platform CUDA. Devices:
2019-08-18 04:53:16.084262: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Tesla T4, Compute Capability 7.5
2019-08-18 04:53:16.086194: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-08-18 04:53:16.086460: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x65e6380 executing computations on platform Host. Devices:
2019-08-18 04:53:16.086508: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2019-08-18 04:53:16.086686: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.087039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:04.0
2019-08-18 04:53:16.087321: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-18 04:53:16.088548: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-08-18 04:53:16.089621: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-08-18 04:53:16.089935: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-08-18 04:53:16.091399: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-08-18 04:53:16.092394: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-08-18 04:53:16.095470: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-08-18 04:53:16.095606: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.096009: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.096350: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-08-18 04:53:16.096401: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-18 04:53:16.097166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-08-18 04:53:16.097190: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2019-08-18 04:53:16.097201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2019-08-18 04:53:16.097482: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.097878: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.098226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14089 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
"get_model_by_type" model Type is: linear
W0818 04:53:16.444941 140399927523200 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
training with model type <class 'donkeycar.parts.keras.KerasLinear'>
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
img_in (InputLayer)             [(None, 120, 160, 3) 0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 58, 78, 24)   1824        img_in[0][0]                     
__________________________________________________________________________________________________
dropout (Dropout)               (None, 58, 78, 24)   0           conv2d_1[0][0]                   
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 27, 37, 32)   19232       dropout[0][0]                    
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 27, 37, 32)   0           conv2d_2[0][0]                   
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 12, 17, 64)   51264       dropout_1[0][0]                  
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, 12, 17, 64)   0           conv2d_3[0][0]                   
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 10, 15, 64)   36928       dropout_2[0][0]                  
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, 10, 15, 64)   0           conv2d_4[0][0]                   
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 8, 13, 64)    36928       dropout_3[0][0]                  
__________________________________________________________________________________________________
dropout_4 (Dropout)             (None, 8, 13, 64)    0           conv2d_5[0][0]                   
__________________________________________________________________________________________________
flattened (Flatten)             (None, 6656)         0           dropout_4[0][0]                  
__________________________________________________________________________________________________
dense (Dense)                   (None, 100)          665700      flattened[0][0]                  
__________________________________________________________________________________________________
dropout_5 (Dropout)             (None, 100)          0           dense[0][0]                      
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 50)           5050        dropout_5[0][0]                  
__________________________________________________________________________________________________
dropout_6 (Dropout)             (None, 50)           0           dense_1[0][0]                    
__________________________________________________________________________________________________
n_outputs0 (Dense)              (None, 1)            51          dropout_6[0][0]                  
__________________________________________________________________________________________________
n_outputs1 (Dense)              (None, 1)            51          dropout_6[0][0]                  
==================================================================================================
Total params: 817,028
Trainable params: 817,028
Non-trainable params: 0
__________________________________________________________________________________________________
None
found 0 pickles writing json records and images in tub /content/mycar/data/tub_3_19-08-17
/content/mycar/data/tub_3_19-08-17
collating 5799 records ...
train: 4639, val: 1160
total records: 5799
steps_per_epoch 36
Epoch 1/100
2019-08-18 04:53:19.663907: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-08-18 04:53:19.984999: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
35/36 [============================>.] - ETA: 0s - loss: 0.1681 - n_outputs0_loss: 0.1555 - n_outputs1_loss: 0.0126
Epoch 00001: val_loss improved from inf to 0.14482, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 8s 213ms/step - loss: 0.1677 - n_outputs0_loss: 0.1553 - n_outputs1_loss: 0.0124 - val_loss: 0.1448 - val_n_outputs0_loss: 0.1409 - val_n_outputs1_loss: 0.0039
Epoch 2/100
34/36 [===========================>..] - ETA: 0s - loss: 0.1278 - n_outputs0_loss: 0.1234 - n_outputs1_loss: 0.0044
Epoch 00002: val_loss improved from 0.14482 to 0.08814, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 53ms/step - loss: 0.1257 - n_outputs0_loss: 0.1214 - n_outputs1_loss: 0.0043 - val_loss: 0.0881 - val_n_outputs0_loss: 0.0870 - val_n_outputs1_loss: 0.0012
Epoch 3/100
35/36 [============================>.] - ETA: 0s - loss: 0.0943 - n_outputs0_loss: 0.0910 - n_outputs1_loss: 0.0033
Epoch 00003: val_loss improved from 0.08814 to 0.07490, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0938 - n_outputs0_loss: 0.0905 - n_outputs1_loss: 0.0033 - val_loss: 0.0749 - val_n_outputs0_loss: 0.0738 - val_n_outputs1_loss: 0.0011
Epoch 4/100
35/36 [============================>.] - ETA: 0s - loss: 0.0840 - n_outputs0_loss: 0.0813 - n_outputs1_loss: 0.0027
Epoch 00004: val_loss improved from 0.07490 to 0.07108, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 51ms/step - loss: 0.0835 - n_outputs0_loss: 0.0808 - n_outputs1_loss: 0.0027 - val_loss: 0.0711 - val_n_outputs0_loss: 0.0702 - val_n_outputs1_loss: 8.9668e-04
Epoch 5/100
35/36 [============================>.] - ETA: 0s - loss: 0.0767 - n_outputs0_loss: 0.0741 - n_outputs1_loss: 0.0026
Epoch 00005: val_loss did not improve from 0.07108
36/36 [==============================] - 2s 50ms/step - loss: 0.0765 - n_outputs0_loss: 0.0739 - n_outputs1_loss: 0.0026 - val_loss: 0.0722 - val_n_outputs0_loss: 0.0717 - val_n_outputs1_loss: 5.4629e-04
Epoch 6/100
35/36 [============================>.] - ETA: 0s - loss: 0.0750 - n_outputs0_loss: 0.0727 - n_outputs1_loss: 0.0023
Epoch 00006: val_loss improved from 0.07108 to 0.06483, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0749 - n_outputs0_loss: 0.0725 - n_outputs1_loss: 0.0024 - val_loss: 0.0648 - val_n_outputs0_loss: 0.0641 - val_n_outputs1_loss: 7.5336e-04
Epoch 7/100
35/36 [============================>.] - ETA: 0s - loss: 0.0711 - n_outputs0_loss: 0.0689 - n_outputs1_loss: 0.0022
Epoch 00007: val_loss improved from 0.06483 to 0.06324, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0711 - n_outputs0_loss: 0.0689 - n_outputs1_loss: 0.0021 - val_loss: 0.0632 - val_n_outputs0_loss: 0.0626 - val_n_outputs1_loss: 5.9058e-04
Epoch 8/100
35/36 [============================>.] - ETA: 0s - loss: 0.0693 - n_outputs0_loss: 0.0675 - n_outputs1_loss: 0.0018
Epoch 00008: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 51ms/step - loss: 0.0690 - n_outputs0_loss: 0.0672 - n_outputs1_loss: 0.0018 - val_loss: 0.0644 - val_n_outputs0_loss: 0.0637 - val_n_outputs1_loss: 7.8261e-04
Epoch 9/100
35/36 [============================>.] - ETA: 0s - loss: 0.0693 - n_outputs0_loss: 0.0675 - n_outputs1_loss: 0.0018
Epoch 00009: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 50ms/step - loss: 0.0690 - n_outputs0_loss: 0.0672 - n_outputs1_loss: 0.0018 - val_loss: 0.0653 - val_n_outputs0_loss: 0.0646 - val_n_outputs1_loss: 6.6657e-04
Epoch 10/100
34/36 [===========================>..] - ETA: 0s - loss: 0.0651 - n_outputs0_loss: 0.0633 - n_outputs1_loss: 0.0018
Epoch 00010: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 51ms/step - loss: 0.0657 - n_outputs0_loss: 0.0639 - n_outputs1_loss: 0.0018 - val_loss: 0.0699 - val_n_outputs0_loss: 0.0692 - val_n_outputs1_loss: 6.9729e-04
Epoch 11/100
35/36 [============================>.] - ETA: 0s - loss: 0.0630 - n_outputs0_loss: 0.0612 - n_outputs1_loss: 0.0018
Epoch 00011: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 51ms/step - loss: 0.0630 - n_outputs0_loss: 0.0612 - n_outputs1_loss: 0.0018 - val_loss: 0.0633 - val_n_outputs0_loss: 0.0619 - val_n_outputs1_loss: 0.0014
Epoch 12/100
35/36 [============================>.] - ETA: 0s - loss: 0.0643 - n_outputs0_loss: 0.0626 - n_outputs1_loss: 0.0016
Epoch 00012: val_loss improved from 0.06324 to 0.06034, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0644 - n_outputs0_loss: 0.0628 - n_outputs1_loss: 0.0017 - val_loss: 0.0603 - val_n_outputs0_loss: 0.0598 - val_n_outputs1_loss: 4.9525e-04
Epoch 13/100
35/36 [============================>.] - ETA: 0s - loss: 0.0611 - n_outputs0_loss: 0.0597 - n_outputs1_loss: 0.0014
Epoch 00013: val_loss improved from 0.06034 to 0.05636, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0613 - n_outputs0_loss: 0.0599 - n_outputs1_loss: 0.0014 - val_loss: 0.0564 - val_n_outputs0_loss: 0.0557 - val_n_outputs1_loss: 6.3083e-04
Epoch 14/100
34/36 [===========================>..] - ETA: 0s - loss: 0.0593 - n_outputs0_loss: 0.0580 - n_outputs1_loss: 0.0013
Epoch 00014: val_loss did not improve from 0.05636
36/36 [==============================] - 2s 51ms/step - loss: 0.0590 - n_outputs0_loss: 0.0577 - n_outputs1_loss: 0.0013 - val_loss: 0.0580 - val_n_outputs0_loss: 0.0575 - val_n_outputs1_loss: 5.4033e-04
Epoch 15/100
34/36 [===========================>..] - ETA: 0s - loss: 0.0552 - n_outputs0_loss: 0.0540 - n_outputs1_loss: 0.0012
Epoch 00015: val_loss did not improve from 0.05636
36/36 [==============================] - 2s 51ms/step - loss: 0.0555 - n_outputs0_loss: 0.0542 - n_outputs1_loss: 0.0012 - val_loss: 0.0565 - val_n_outputs0_loss: 0.0559 - val_n_outputs1_loss: 5.6679e-04
Epoch 16/100
35/36 [============================>.] - ETA: 0s - loss: 0.0538 - n_outputs0_loss: 0.0527 - n_outputs1_loss: 0.0012
Epoch 00016: val_loss did not improve from 0.05636
36/36 [==============================] - 2s 50ms/step - loss: 0.0538 - n_outputs0_loss: 0.0526 - n_outputs1_loss: 0.0012 - val_loss: 0.0624 - val_n_outputs0_loss: 0.0619 - val_n_outputs1_loss: 5.0014e-04
Epoch 17/100
35/36 [============================>.] - ETA: 0s - loss: 0.0536 - n_outputs0_loss: 0.0525 - n_outputs1_loss: 0.0011
Epoch 00017: val_loss improved from 0.05636 to 0.05615, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 53ms/step - loss: 0.0534 - n_outputs0_loss: 0.0523 - n_outputs1_loss: 0.0011 - val_loss: 0.0561 - val_n_outputs0_loss: 0.0556 - val_n_outputs1_loss: 5.2067e-04
Epoch 18/100
35/36 [============================>.] - ETA: 0s - loss: 0.0502 - n_outputs0_loss: 0.0492 - n_outputs1_loss: 0.0011
Epoch 00018: val_loss improved from 0.05615 to 0.05594, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0501 - n_outputs0_loss: 0.0490 - n_outputs1_loss: 0.0011 - val_loss: 0.0559 - val_n_outputs0_loss: 0.0555 - val_n_outputs1_loss: 4.4089e-04
Epoch 00018: early stopping
Training completed in 0:00:40.


----------- Best Eval Loss :0.055939 ---------
<Figure size 640x480 with 1 Axes>
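Two things in the log above are easier to follow with a little arithmetic: where "train: 4639, val: 1160" and "steps_per_epoch 36" come from, and why training stopped at epoch 18 even though the checkpoint messages for epochs 17 and 18 say "improved". The sketch below is not Donkey Car's code. The 80/20 split, batch size of 128, patience of 5 and minimum delta of 0.0005 are assumptions chosen because they reproduce this log; check config.py in your project for the values actually in use. The stopping rule imitates the usual early-stopping logic, where an epoch only counts as an improvement if it beats the best value by more than the minimum delta.

```python
def split_and_steps(total, train_frac=0.8, batch_size=128):
    """Reproduce the 'train/val' and 'steps_per_epoch' lines of the log."""
    train = int(total * train_frac)
    val = total - train
    return train, val, train // batch_size


def early_stop_epoch(val_losses, patience=5, min_delta=0.0005):
    """Return the 1-based epoch at which patience runs out, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss + min_delta < best:  # better than best by more than min_delta
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# val_loss per epoch, copied from the log above
VAL_LOSSES = [0.14482, 0.08814, 0.07490, 0.07108, 0.0722, 0.06483, 0.06324,
              0.0644, 0.0653, 0.0699, 0.0633, 0.06034, 0.05636, 0.0580,
              0.0565, 0.0624, 0.05615, 0.05594]

print(split_and_steps(5799))
print(early_stop_epoch(VAL_LOSSES))
```

Under these assumptions the last epoch that counts as a real improvement is epoch 13 (0.05636). Epochs 17 and 18 are lower, but by less than 0.0005, so they are saved as new best checkpoints without resetting the patience counter, and the fifth such epoch is epoch 18.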

Display the loss curve from the training run

In [26]:

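import glob
import os

import cv2
import matplotlib.pyplot as plt

# Pick the most recently created loss plot saved next to the model
list_of_png = glob.glob('/content/mycar/models/*png')
latest_png = max(list_of_png, key=os.path.getctime)
print(latest_png)

# cv2 loads images as BGR; convert to RGB so matplotlib shows the right colours
image = cv2.imread(latest_png)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))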

/content/mycar/models/mypilot.h5_loss_acc_0.055939.png

Out[26]:

<matplotlib.image.AxesImage at 0x7f867c896be0>


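Put the trained model back on the Donkey Car

Once training has finished, you can find the trained model in mycar/models.

  1. Download the mypilot.h5 file to your PC or Mac

  2. Then copy the model from your PC or Mac to your Pi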
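The two steps above can be sketched in Python as follows. `download_from_colab` relies on the `google.colab.files` module, which is only available inside a Colab runtime, so it is defined here but not called. `scp_command` is a hypothetical helper that merely builds the copy command as a string; the user, host name and destination directory in the example are placeholders, not values taken from the original post.

```python
import shlex


def download_from_colab(model_path="/content/mycar/models/mypilot.h5"):
    """Ask the browser to download the model; only works inside Colab."""
    from google.colab import files  # this module exists only on Colab
    files.download(model_path)


def scp_command(local_model, pi_address, remote_dir):
    """Build the scp command that copies the model from the PC or Mac to the Pi."""
    return "scp {} {}:{}".format(
        shlex.quote(local_model), pi_address, shlex.quote(remote_dir)
    )


# Hypothetical user, host name and destination; replace them with your own.
print(scp_command("mypilot.h5", "pi@raspberrypi.local", "mycar/models/"))
```

Run the printed command in a terminal on the PC or Mac, from the folder the model was downloaded to.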

Finally, drive the Donkey Car in Autopilot mode.

Enjoy!

Requirement already satisfied: python-dateutil>=2.5.0 in /usr/local/lib/python3.6/dist-packages (from pandas->donkeycar==3.1.0) (2.5.3)
Building wheels for collected packages: paho-mqtt
  Building wheel for paho-mqtt (setup.py) ... done
  Created wheel for paho-mqtt: filename=paho_mqtt-1.4.0-cp36-none-any.whl size=48333 sha256=9a67d0c95fae2b9495c20980895d9569ef53859cbc22bb57fc2466d7a581af7b
  Stored in directory: /root/.cache/pip/wheels/82/e5/de/d90d0f397648a1b58ffeea1b5742ac8c77f71fd43b550fa5a5
Successfully built paho-mqtt
Installing collected packages: paho-mqtt, donkeycar
  Running setup.py develop for donkeycar
Successfully installed donkeycar paho-mqtt-1.4.0

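Create a project

I used mycar as the project name. You can choose another name, but then you must update the paths in the later cells to match.

In [5]: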

!donkey createcar --path /content/mycar

using donkey v3.1.0 ...
Creating car folder: /content/mycar
making dir  /content/mycar
Creating data & model folders.
making dir  /content/mycar/models
making dir  /content/mycar/data
making dir  /content/mycar/logs
Copying car application template: complete
Copying car config defaults. Adjust these before starting your car.
Copying train script. Adjust these before starting your car.
Copying my car config overrides
Donkey setup complete.

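Prepare the data: upload data.zip and unzip it

First, take the tub directory you want to train on, from the data directory collected on the Pi, and package it as data.zip.

In the data directory on the Pi, run: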

$ zip -r data.zip tub_3_19-08-17/
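If the zip command is not installed on the Pi, the same archive can be built from Python's standard library. This is a minimal sketch, not part of the original notebook; the function name pack_tub and the example path are hypothetical.

```python
import os
import shutil


def pack_tub(data_dir, tub_name, archive_base="data"):
    """Zip data_dir/tub_name into data_dir/<archive_base>.zip and return its path.

    Paths inside the archive start with tub_name/, as with `zip -r`.
    """
    base = os.path.join(data_dir, archive_base)
    return shutil.make_archive(base, "zip", root_dir=data_dir, base_dir=tub_name)


# Example call (hypothetical location of the data directory on the Pi):
# pack_tub("/home/pi/mycar/data", "tub_3_19-08-17")
```

Only the named tub directory is walked, so the archive being written next to it is not included in itself.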

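Then copy the archive to your computer, ready to upload to Colab.

Upload data.zip to Colab

Run the code below. An upload button appears; click it and select the data.zip you just created.

In [7]: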

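import os
from google.colab import files

# Remove any archive left over from a previous run
if os.path.exists("/content/data.zip"):
    os.remove("/content/data.zip")
if os.path.exists("/content/mycar/data/data.zip"):
    os.remove("/content/mycar/data/data.zip")

# Opens the upload widget; the uploaded file is written to the working directory (/content)
uploaded = files.upload()

WORK_FOLDER = "/content/mycar/data/"
# The body of this check was lost in the page; creating the folder is an assumed reconstruction
if not os.path.exists(WORK_FOLDER):
    os.makedirs(WORK_FOLDER)

!mv /content/data.zip /content/mycar/data/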
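The heading mentions unzipping, but the cell above only moves the archive. One way to extract it is sketched below with the standard zipfile module; the function name extract_archive is hypothetical, and the paths in the example call are the ones used in the cells above.

```python
import os
import zipfile


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted top-level names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        top_level = {name.split("/")[0] for name in archive.namelist()}
    return sorted(top_level)


# In the notebook this could be called as:
# extract_archive("/content/mycar/data/data.zip", "/content/mycar/data")
```

The returned list should contain the tub directory name, for example tub_3_19-08-17.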

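Clean up the uploaded file

Make sure /content/mycar/data contains the tub directory, and that the tub directory contains the images and their matching JSON files.

data.zip is no longer needed.

In [0]: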

!rm /content/mycar/data/data.zip
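Before training, it can help to confirm that the tub really contains images and matching JSON records. The sketch below counts both. It assumes records are named record_*.json and images end in .jpg, which is how tubs from this Donkey version appear to be laid out; adjust the patterns if your files differ. The function names are hypothetical.

```python
import glob
import os


def summarize_tub(tub_dir):
    """Count the JSON records and JPEG images in a tub directory."""
    records = glob.glob(os.path.join(tub_dir, "record_*.json"))
    images = glob.glob(os.path.join(tub_dir, "*.jpg"))
    return {"records": len(records), "images": len(images)}


def tub_looks_complete(tub_dir):
    """True when the tub has at least one record and one image per record."""
    counts = summarize_tub(tub_dir)
    return counts["records"] > 0 and counts["records"] == counts["images"]


# Example call with the tub used in this post:
# summarize_tub("/content/mycar/data/tub_3_19-08-17")
```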

Train the model

In [21]:

!python /content/mycar/manage.py train --model /content/mycar/models/mypilot.h5

using donkey v3.1.0 ...
loading config file: /content/mycar/config.py
loading personal config over-rides

config loaded
WARNING: Logging before flag parsing goes to stderr.
W0818 04:53:15.944086 140399927523200 deprecation_wrapper.py:119] From /content/donkey/donkeycar/parts/keras.py:18: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

W0818 04:53:15.944333 140399927523200 deprecation_wrapper.py:119] From /content/donkey/donkeycar/parts/keras.py:18: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2019-08-18 04:53:15.954857: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-08-18 04:53:15.959505: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-08-18 04:53:16.083690: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.084227: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2455100 executing computations on platform CUDA. Devices:
2019-08-18 04:53:16.084262: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Tesla T4, Compute Capability 7.5
2019-08-18 04:53:16.086194: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-08-18 04:53:16.086460: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x65e6380 executing computations on platform Host. Devices:
2019-08-18 04:53:16.086508: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2019-08-18 04:53:16.086686: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.087039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:04.0
2019-08-18 04:53:16.087321: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-18 04:53:16.088548: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-08-18 04:53:16.089621: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-08-18 04:53:16.089935: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-08-18 04:53:16.091399: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-08-18 04:53:16.092394: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-08-18 04:53:16.095470: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-08-18 04:53:16.095606: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.096009: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.096350: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-08-18 04:53:16.096401: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-18 04:53:16.097166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-08-18 04:53:16.097190: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2019-08-18 04:53:16.097201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2019-08-18 04:53:16.097482: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.097878: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.098226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14089 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
"get_model_by_type" model Type is: linear
W0818 04:53:16.444941 140399927523200 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
training with model type <class 'donkeycar.parts.keras.KerasLinear'>
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
img_in (InputLayer)             [(None, 120, 160, 3) 0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 58, 78, 24)   1824        img_in[0][0]                     
__________________________________________________________________________________________________
dropout (Dropout)               (None, 58, 78, 24)   0           conv2d_1[0][0]                   
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 27, 37, 32)   19232       dropout[0][0]                    
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 27, 37, 32)   0           conv2d_2[0][0]                   
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 12, 17, 64)   51264       dropout_1[0][0]                  
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, 12, 17, 64)   0           conv2d_3[0][0]                   
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 10, 15, 64)   36928       dropout_2[0][0]                  
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, 10, 15, 64)   0           conv2d_4[0][0]                   
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 8, 13, 64)    36928       dropout_3[0][0]                  
__________________________________________________________________________________________________
dropout_4 (Dropout)             (None, 8, 13, 64)    0           conv2d_5[0][0]                   
__________________________________________________________________________________________________
flattened (Flatten)             (None, 6656)         0           dropout_4[0][0]                  
__________________________________________________________________________________________________
dense (Dense)                   (None, 100)          665700      flattened[0][0]                  
__________________________________________________________________________________________________
dropout_5 (Dropout)             (None, 100)          0           dense[0][0]                      
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 50)           5050        dropout_5[0][0]                  
__________________________________________________________________________________________________
dropout_6 (Dropout)             (None, 50)           0           dense_1[0][0]                    
__________________________________________________________________________________________________
n_outputs0 (Dense)              (None, 1)            51          dropout_6[0][0]                  
__________________________________________________________________________________________________
n_outputs1 (Dense)              (None, 1)            51          dropout_6[0][0]                  
==================================================================================================
Total params: 817,028
Trainable params: 817,028
Non-trainable params: 0
__________________________________________________________________________________________________
None
found 0 pickles writing json records and images in tub /content/mycar/data/tub_3_19-08-17
/content/mycar/data/tub_3_19-08-17
collating 5799 records ...
train: 4639, val: 1160
total records: 5799
steps_per_epoch 36
Epoch 1/100
2019-08-18 04:53:19.663907: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-08-18 04:53:19.984999: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
35/36 [============================>.] - ETA: 0s - loss: 0.1681 - n_outputs0_loss: 0.1555 - n_outputs1_loss: 0.0126
Epoch 00001: val_loss improved from inf to 0.14482, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 8s 213ms/step - loss: 0.1677 - n_outputs0_loss: 0.1553 - n_outputs1_loss: 0.0124 - val_loss: 0.1448 - val_n_outputs0_loss: 0.1409 - val_n_outputs1_loss: 0.0039
Epoch 2/100
34/36 [===========================>..] - ETA: 0s - loss: 0.1278 - n_outputs0_loss: 0.1234 - n_outputs1_loss: 0.0044
Epoch 00002: val_loss improved from 0.14482 to 0.08814, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 53ms/step - loss: 0.1257 - n_outputs0_loss: 0.1214 - n_outputs1_loss: 0.0043 - val_loss: 0.0881 - val_n_outputs0_loss: 0.0870 - val_n_outputs1_loss: 0.0012
Epoch 3/100
35/36 [============================>.] - ETA: 0s - loss: 0.0943 - n_outputs0_loss: 0.0910 - n_outputs1_loss: 0.0033
Epoch 00003: val_loss improved from 0.08814 to 0.07490, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0938 - n_outputs0_loss: 0.0905 - n_outputs1_loss: 0.0033 - val_loss: 0.0749 - val_n_outputs0_loss: 0.0738 - val_n_outputs1_loss: 0.0011
Epoch 4/100
35/36 [============================>.] - ETA: 0s - loss: 0.0840 - n_outputs0_loss: 0.0813 - n_outputs1_loss: 0.0027
Epoch 00004: val_loss improved from 0.07490 to 0.07108, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 51ms/step - loss: 0.0835 - n_outputs0_loss: 0.0808 - n_outputs1_loss: 0.0027 - val_loss: 0.0711 - val_n_outputs0_loss: 0.0702 - val_n_outputs1_loss: 8.9668e-04
Epoch 5/100
35/36 [============================>.] - ETA: 0s - loss: 0.0767 - n_outputs0_loss: 0.0741 - n_outputs1_loss: 0.0026
Epoch 00005: val_loss did not improve from 0.07108
36/36 [==============================] - 2s 50ms/step - loss: 0.0765 - n_outputs0_loss: 0.0739 - n_outputs1_loss: 0.0026 - val_loss: 0.0722 - val_n_outputs0_loss: 0.0717 - val_n_outputs1_loss: 5.4629e-04
Epoch 6/100
35/36 [============================>.] - ETA: 0s - loss: 0.0750 - n_outputs0_loss: 0.0727 - n_outputs1_loss: 0.0023
Epoch 00006: val_loss improved from 0.07108 to 0.06483, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0749 - n_outputs0_loss: 0.0725 - n_outputs1_loss: 0.0024 - val_loss: 0.0648 - val_n_outputs0_loss: 0.0641 - val_n_outputs1_loss: 7.5336e-04
Epoch 7/100
35/36 [============================>.] - ETA: 0s - loss: 0.0711 - n_outputs0_loss: 0.0689 - n_outputs1_loss: 0.0022
Epoch 00007: val_loss improved from 0.06483 to 0.06324, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0711 - n_outputs0_loss: 0.0689 - n_outputs1_loss: 0.0021 - val_loss: 0.0632 - val_n_outputs0_loss: 0.0626 - val_n_outputs1_loss: 5.9058e-04
Epoch 8/100
35/36 [============================>.] - ETA: 0s - loss: 0.0693 - n_outputs0_loss: 0.0675 - n_outputs1_loss: 0.0018
Epoch 00008: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 51ms/step - loss: 0.0690 - n_outputs0_loss: 0.0672 - n_outputs1_loss: 0.0018 - val_loss: 0.0644 - val_n_outputs0_loss: 0.0637 - val_n_outputs1_loss: 7.8261e-04
Epoch 9/100
35/36 [============================>.] - ETA: 0s - loss: 0.0693 - n_outputs0_loss: 0.0675 - n_outputs1_loss: 0.0018
Epoch 00009: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 50ms/step - loss: 0.0690 - n_outputs0_loss: 0.0672 - n_outputs1_loss: 0.0018 - val_loss: 0.0653 - val_n_outputs0_loss: 0.0646 - val_n_outputs1_loss: 6.6657e-04
Epoch 10/100
34/36 [===========================>..] - ETA: 0s - loss: 0.0651 - n_outputs0_loss: 0.0633 - n_outputs1_loss: 0.0018
Epoch 00010: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 51ms/step - loss: 0.0657 - n_outputs0_loss: 0.0639 - n_outputs1_loss: 0.0018 - val_loss: 0.0699 - val_n_outputs0_loss: 0.0692 - val_n_outputs1_loss: 6.9729e-04
Epoch 11/100
35/36 [============================>.] - ETA: 0s - loss: 0.0630 - n_outputs0_loss: 0.0612 - n_outputs1_loss: 0.0018
Epoch 00011: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 51ms/step - loss: 0.0630 - n_outputs0_loss: 0.0612 - n_outputs1_loss: 0.0018 - val_loss: 0.0633 - val_n_outputs0_loss: 0.0619 - val_n_outputs1_loss: 0.0014
Epoch 12/100
35/36 [============================>.] - ETA: 0s - loss: 0.0643 - n_outputs0_loss: 0.0626 - n_outputs1_loss: 0.0016
Epoch 00012: val_loss improved from 0.06324 to 0.06034, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0644 - n_outputs0_loss: 0.0628 - n_outputs1_loss: 0.0017 - val_loss: 0.0603 - val_n_outputs0_loss: 0.0598 - val_n_outputs1_loss: 4.9525e-04
Epoch 13/100
35/36 [============================>.] - ETA: 0s - loss: 0.0611 - n_outputs0_loss: 0.0597 - n_outputs1_loss: 0.0014
Epoch 00013: val_loss improved from 0.06034 to 0.05636, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0613 - n_outputs0_loss: 0.0599 - n_outputs1_loss: 0.0014 - val_loss: 0.0564 - val_n_outputs0_loss: 0.0557 - val_n_outputs1_loss: 6.3083e-04
Epoch 14/100
34/36 [===========================>..] - ETA: 0s - loss: 0.0593 - n_outputs0_loss: 0.0580 - n_outputs1_loss: 0.0013
Epoch 00014: val_loss did not improve from 0.05636
36/36 [==============================] - 2s 51ms/step - loss: 0.0590 - n_outputs0_loss: 0.0577 - n_outputs1_loss: 0.0013 - val_loss: 0.0580 - val_n_outputs0_loss: 0.0575 - val_n_outputs1_loss: 5.4033e-04
Epoch 15/100
34/36 [===========================>..] - ETA: 0s - loss: 0.0552 - n_outputs0_loss: 0.0540 - n_outputs1_loss: 0.0012
Epoch 00015: val_loss did not improve from 0.05636
36/36 [==============================] - 2s 51ms/step - loss: 0.0555 - n_outputs0_loss: 0.0542 - n_outputs1_loss: 0.0012 - val_loss: 0.0565 - val_n_outputs0_loss: 0.0559 - val_n_outputs1_loss: 5.6679e-04
Epoch 16/100
35/36 [============================>.] - ETA: 0s - loss: 0.0538 - n_outputs0_loss: 0.0527 - n_outputs1_loss: 0.0012
Epoch 00016: val_loss did not improve from 0.05636
36/36 [==============================] - 2s 50ms/step - loss: 0.0538 - n_outputs0_loss: 0.0526 - n_outputs1_loss: 0.0012 - val_loss: 0.0624 - val_n_outputs0_loss: 0.0619 - val_n_outputs1_loss: 5.0014e-04
Epoch 17/100
35/36 [============================>.] - ETA: 0s - loss: 0.0536 - n_outputs0_loss: 0.0525 - n_outputs1_loss: 0.0011
Epoch 00017: val_loss improved from 0.05636 to 0.05615, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 53ms/step - loss: 0.0534 - n_outputs0_loss: 0.0523 - n_outputs1_loss: 0.0011 - val_loss: 0.0561 - val_n_outputs0_loss: 0.0556 - val_n_outputs1_loss: 5.2067e-04
Epoch 18/100
35/36 [============================>.] - ETA: 0s - loss: 0.0502 - n_outputs0_loss: 0.0492 - n_outputs1_loss: 0.0011
Epoch 00018: val_loss improved from 0.05615 to 0.05594, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0501 - n_outputs0_loss: 0.0490 - n_outputs1_loss: 0.0011 - val_loss: 0.0559 - val_n_outputs0_loss: 0.0555 - val_n_outputs1_loss: 4.4089e-04
Epoch 00018: early stopping
Training completed in 0:00:40.


----------- Best Eval Loss :0.055939 ---------
<Figure size 640x480 with 1 Axes>
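Two details of the log above can be checked with a little arithmetic: the record split with its steps_per_epoch, and why training stopped at epoch 18. The sketch below is an illustration only. The 0.8 train fraction, batch size 128, patience 5 and min_delta 0.0005 are assumed values (they are not shown in this post, though they are consistent with the numbers in the log), and both function names are hypothetical.

```python
def split_and_steps(total_records, train_frac=0.8, batch_size=128):
    """Return (train, val, steps_per_epoch) for a simple fractional split."""
    train = int(total_records * train_frac)
    val = total_records - train
    return train, val, train // batch_size


def early_stop_epoch(val_losses, patience=5, min_delta=0.0005):
    """Return the 1-based epoch at which training stops, or None.

    An epoch counts as an improvement only if it beats the best loss
    so far by more than min_delta; `patience` epochs in a row without
    such an improvement end the run.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# val_loss per epoch, copied from the log above
VAL_LOSS = [0.14482, 0.08814, 0.07490, 0.07108, 0.0722, 0.06483,
            0.06324, 0.0644, 0.0653, 0.0699, 0.0633, 0.06034,
            0.05636, 0.0580, 0.0565, 0.0624, 0.05615, 0.05594]

print(split_and_steps(5799))       # (4639, 1160, 36)
print(early_stop_epoch(VAL_LOSS))  # 18
```

Under these assumptions, epochs 17 and 18 do lower val_loss, but by less than min_delta, so they do not reset the patience counter that started after epoch 13. With min_delta set to 0 the same losses would not trigger a stop by epoch 18.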

Display the loss curve from training

In [26]:

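import glob
import os

import cv2
import matplotlib.pyplot as plt

# Pick the most recently created PNG in the models folder
list_of_png = glob.glob('/content/mycar/models/*png')
latest_png = max(list_of_png, key=os.path.getctime)
print(latest_png)

# OpenCV loads images as BGR; convert to RGB before plotting
image = cv2.imread(latest_png)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))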

/content/mycar/models/mypilot.h5_loss_acc_0.055939.png

Out[26]:

<matplotlib.image.AxesImage at 0x7f867c896be0>

[Figure: loss curve plot, mypilot.h5_loss_acc_0.055939.png]

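Put the trained model back on the Donkey Car

Once training has finished, you can find the trained model in mycar/models.

  1. Download the mypilot.h5 model file to your PC or Mac

  2. Copy the model from your PC or Mac to your Pi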
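For the download step, Colab's files.download helper can send the model to your browser. The sketch below is one way to wire it up; find_model and download_model are hypothetical helper names, and the google.colab import only works inside a Colab runtime.

```python
import os


def find_model(models_dir, name="mypilot.h5"):
    """Return the full path of the trained model, or None if it is missing."""
    path = os.path.join(models_dir, name)
    return path if os.path.isfile(path) else None


def download_model(models_dir="/content/mycar/models"):
    """Trigger a browser download of the trained model from Colab."""
    path = find_model(models_dir)
    if path is None:
        raise FileNotFoundError("no mypilot.h5 in " + models_dir)
    from google.colab import files  # available only inside Colab
    files.download(path)
```

Calling download_model() in a notebook cell starts the download; copying the file on to the Pi is then an ordinary file transfer from your PC or Mac.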

Finally, drive the Donkey Car in Autopilot mode.

Enjoy!

Use Colab to Accelerate Donkey Car Training (Tensorflow GPU)


This notebook helps you quickly train your Donkey car or autonomous car model.

It builds on and improves @sachindroid8's notebook; thanks for the earlier work!

The Pros and Cons of Using Colab for Training

  1. Efficiency

Using Colab requires a VPN, and you have to upload your data to Google. Upload time varies from person to person; mine took about 2-3 minutes. The rest is running the code and training. The first run takes longer because you need to understand what each step does, but later runs take almost no time. As for training on the GPU: with about 6k images, training stopped early after 30-40 epochs and took less than 1 minute, at only 1-2 seconds per epoch.

Training on my MacBook Pro, which has no GPU, takes 30-50 seconds per epoch, so a full run takes roughly 40 minutes to 1 hour.

  2. Convenience

Without a VPN, Colab is not an option. I know Baidu also offers some servers with a limited free tier, which I may test later. With a VPN, though, Google Colab is the best choice. The steps below show how to start training.

Load My Notebook

Open the notebook: Using Google_Colab to Donkey_Car

Install TensorFlow 1.14.0

TensorFlow 2.x and Donkey Car 3.x still have compatibility issues, so TensorFlow 2.x is not recommended for now (as of 2019-08-17).

In [1]:

!pip install tensorflow-gpu==1.14.0

Collecting tensorflow-gpu==1.14.0
  Downloading https://files.pythonhosted.org/packages/76/04/43153bfdfcf6c9a4c38ecdb971ca9a75b9a791bb69a764d652c359aca504/tensorflow_gpu-1.14.0-cp36-cp36m-manylinux1_x86_64.whl (377.0MB)
     |████████████████████████████████| 377.0MB 86kB/s 
Requirement already satisfied: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.15.0)
Requirement already satisfied: absl-py>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (0.7.1)
Requirement already satisfied: tensorflow-estimator<1.15.0rc0,>=1.14.0rc0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.14.0)
Requirement already satisfied: wrapt>=1.11.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.11.2)
Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.1.0)
Requirement already satisfied: astor>=0.6.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (0.8.0)
Requirement already satisfied: keras-applications>=1.0.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.0.8)
Requirement already satisfied: tensorboard<1.15.0,>=1.14.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.14.0)
Requirement already satisfied: wheel>=0.26 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (0.33.4)
Requirement already satisfied: protobuf>=3.6.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (3.7.1)
Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.12.0)
Requirement already satisfied: keras-preprocessing>=1.0.5 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.1.0)
Requirement already satisfied: google-pasta>=0.1.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (0.1.7)
Requirement already satisfied: gast>=0.2.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (0.2.2)
Requirement already satisfied: numpy<2.0,>=1.14.5 in /usr/local/lib/python3.6/dist-packages (from tensorflow-gpu==1.14.0) (1.16.4)
Requirement already satisfied: h5py in /usr/local/lib/python3.6/dist-packages (from keras-applications>=1.0.6->tensorflow-gpu==1.14.0) (2.8.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow-gpu==1.14.0) (3.1.1)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow-gpu==1.14.0) (0.15.5)
Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.15.0,>=1.14.0->tensorflow-gpu==1.14.0) (41.0.1)
Installing collected packages: tensorflow-gpu
Successfully installed tensorflow-gpu-1.14.0

Check if GPU is working

If the output shows "Found GPU at: /device:GPU:0", the GPU is working.

If you don't see that output, check that the GPU hardware accelerator is selected under Runtime (runtime type).

In [2]:

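import tensorflow as tf

# Name of the first GPU visible to TensorFlow; an empty string if there is none
device_name = tf.test.gpu_device_name()

if device_name != '/device:GPU:0':
    raise SystemError('GPU device not found')

print('Found GPU at: {}'.format(device_name))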

Found GPU at: /device:GPU:0

Clone the Donkey repository

In [3]:

!git clone https://github.com/autorope/donkeycar.git donkey

Cloning into 'donkey'...
remote: Enumerating objects: 120, done.
remote: Counting objects: 100% (120/120), done.
remote: Compressing objects: 100% (76/76), done.
remote: Total 10558 (delta 65), reused 73 (delta 31), pack-reused 10438
Receiving objects: 100% (10558/10558), 58.74 MiB | 46.70 MiB/s, done.
Resolving deltas: 100% (6528/6528), done.

Install Donkey car

In [4]:

Obtaining file:///content/donkey
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (1.16.4)
Requirement already satisfied: pillow in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (4.3.0)
Requirement already satisfied: docopt in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (0.6.2)
Requirement already satisfied: tornado in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (4.5.3)
Requirement already satisfied: requests in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (2.21.0)
Requirement already satisfied: h5py in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (2.8.0)
Requirement already satisfied: moviepy in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (0.2.3.5)
Requirement already satisfied: pandas in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (0.24.2)
Requirement already satisfied: PrettyTable in /usr/local/lib/python3.6/dist-packages (from donkeycar==3.1.0) (0.7.2)
Collecting paho-mqtt (from donkeycar==3.1.0)
  Downloading https://files.pythonhosted.org/packages/25/63/db25e62979c2a716a74950c9ed658dce431b5cb01fde29eb6cba9489a904/paho-mqtt-1.4.0.tar.gz (88kB)
     |████████████████████████████████| 92kB 4.2MB/s 
Requirement already satisfied: olefile in /usr/local/lib/python3.6/dist-packages (from pillow->donkeycar==3.1.0) (0.46)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests->donkeycar==3.1.0) (1.24.3)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests->donkeycar==3.1.0) (2019.6.16)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests->donkeycar==3.1.0) (3.0.4)
Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests->donkeycar==3.1.0) (2.8)
Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from h5py->donkeycar==3.1.0) (1.12.0)
Requirement already satisfied: decorator<5.0,>=4.0.2 in /usr/local/lib/python3.6/dist-packages (from moviepy->donkeycar==3.1.0) (4.4.0)
Requirement already satisfied: tqdm<5.0,>=4.11.2 in /usr/local/lib/python3.6/dist-packages (from moviepy->donkeycar==3.1.0) (4.28.1)
Requirement already satisfied: imageio<3.0,>=2.1.2 in /usr/local/lib/python3.6/dist-packages (from moviepy->donkeycar==3.1.0) (2.4.1)
Requirement already satisfied: pytz>=2011k in /usr/local/lib/python3.6/dist-packages (from pandas->donkeycar==3.1.0) (2018.9)
Requirement already satisfied: python-dateutil>=2.5.0 in /usr/local/lib/python3.6/dist-packages (from pandas->donkeycar==3.1.0) (2.5.3)
Building wheels for collected packages: paho-mqtt
  Building wheel for paho-mqtt (setup.py) ... done
  Created wheel for paho-mqtt: filename=paho_mqtt-1.4.0-cp36-none-any.whl size=48333 sha256=9a67d0c95fae2b9495c20980895d9569ef53859cbc22bb57fc2466d7a581af7b
  Stored in directory: /root/.cache/pip/wheels/82/e5/de/d90d0f397648a1b58ffeea1b5742ac8c77f71fd43b550fa5a5
Successfully built paho-mqtt
Installing collected packages: paho-mqtt, donkeycar
  Running setup.py develop for donkeycar
Successfully installed donkeycar paho-mqtt-1.4.0

Create a project

I used mycar as the project name. You can choose another name, but then you must update the paths in the later cells to match.

In [5]:

!donkey createcar --path /content/mycar

using donkey v3.1.0 ...
Creating car folder: /content/mycar
making dir  /content/mycar
Creating data & model folders.
making dir  /content/mycar/models
making dir  /content/mycar/data
making dir  /content/mycar/logs
Copying car application template: complete
Copying car config defaults. Adjust these before starting your car.
Copying train script. Adjust these before starting your car.
Copying my car config overrides
Donkey setup complete.

Prepare data: Upload data.zip and unzip

First, take the tub directory you want to train on, from the data directory collected on the Pi, and package it as data.zip.

In the data directory on the Pi, run:

$ zip -r data.zip tub_3_19-08-17/

Then copy the archive to your computer, ready to upload to Colab.

Upload data.zip to Colab

Run the code below. An upload button appears; click it and select the data.zip you just created.

In [7]:

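import os
from google.colab import files

# Remove any archive left over from a previous run
if os.path.exists("/content/data.zip"):
    os.remove("/content/data.zip")
if os.path.exists("/content/mycar/data/data.zip"):
    os.remove("/content/mycar/data/data.zip")

# Opens the upload widget; the uploaded file is written to the working directory (/content)
uploaded = files.upload()

WORK_FOLDER = "/content/mycar/data/"
# The body of this check was lost in the page; creating the folder is an assumed reconstruction
if not os.path.exists(WORK_FOLDER):
    os.makedirs(WORK_FOLDER)

!mv /content/data.zip /content/mycar/data/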

Clean up the uploaded files

Make sure that /content/mycar/data contains a tub directory, and that the tub directory holds the images and their matching JSON records.
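A quick way to check this from a notebook cell is to count the files in each tub. The sketch below is a minimal, hedged example: the helper names `summarize_tub` and `find_tubs` are hypothetical, and the file-name patterns (`record_*.json` for records, `*.jpg` for images) are assumptions about how Donkey 3.x names tub files, so adjust them if your tub looks different.

```python
import glob
import os


def summarize_tub(tub_path):
    """Count record JSON files and JPEG images in one tub directory.

    Assumption: records are named record_<n>.json and images end in .jpg.
    """
    records = glob.glob(os.path.join(tub_path, "record_*.json"))
    images = glob.glob(os.path.join(tub_path, "*.jpg"))
    return {
        "records": len(records),
        "images": len(images),
        # Each record is expected to point at one image, so the counts should match.
        "balanced": len(records) > 0 and len(records) == len(images),
    }


def find_tubs(data_dir):
    """Return the subdirectories of data_dir, which are the tub candidates."""
    if not os.path.isdir(data_dir):
        return []
    return sorted(
        os.path.join(data_dir, name)
        for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))
    )


if __name__ == "__main__":
    # Prints nothing if the data directory does not exist.
    for tub in find_tubs("/content/mycar/data"):
        print(tub, summarize_tub(tub))
```

If `balanced` comes back False, the archive was probably unpacked into the wrong place or only partly uploaded.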

data.zip is no longer needed, so delete it.

In [0]:

!rm /content/mycar/data/data.zip

Train the model

In [21]:

!python /content/mycar/manage.py train --model /content/mycar/models/mypilot.h5

using donkey v3.1.0 ...
loading config file: /content/mycar/config.py
loading personal config over-rides

config loaded
WARNING: Logging before flag parsing goes to stderr.
W0818 04:53:15.944086 140399927523200 deprecation_wrapper.py:119] From /content/donkey/donkeycar/parts/keras.py:18: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

W0818 04:53:15.944333 140399927523200 deprecation_wrapper.py:119] From /content/donkey/donkeycar/parts/keras.py:18: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2019-08-18 04:53:15.954857: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-08-18 04:53:15.959505: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-08-18 04:53:16.083690: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.084227: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2455100 executing computations on platform CUDA. Devices:
2019-08-18 04:53:16.084262: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Tesla T4, Compute Capability 7.5
2019-08-18 04:53:16.086194: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2300000000 Hz
2019-08-18 04:53:16.086460: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x65e6380 executing computations on platform Host. Devices:
2019-08-18 04:53:16.086508: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2019-08-18 04:53:16.086686: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.087039: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:04.0
2019-08-18 04:53:16.087321: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-18 04:53:16.088548: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-08-18 04:53:16.089621: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-08-18 04:53:16.089935: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-08-18 04:53:16.091399: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-08-18 04:53:16.092394: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-08-18 04:53:16.095470: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-08-18 04:53:16.095606: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.096009: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.096350: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-08-18 04:53:16.096401: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-18 04:53:16.097166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-08-18 04:53:16.097190: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2019-08-18 04:53:16.097201: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2019-08-18 04:53:16.097482: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.097878: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-08-18 04:53:16.098226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14089 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
"get_model_by_type" model Type is: linear
W0818 04:53:16.444941 140399927523200 deprecation.py:506] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
training with model type <class 'donkeycar.parts.keras.KerasLinear'>
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
img_in (InputLayer)             [(None, 120, 160, 3) 0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 58, 78, 24)   1824        img_in[0][0]                     
__________________________________________________________________________________________________
dropout (Dropout)               (None, 58, 78, 24)   0           conv2d_1[0][0]                   
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 27, 37, 32)   19232       dropout[0][0]                    
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 27, 37, 32)   0           conv2d_2[0][0]                   
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 12, 17, 64)   51264       dropout_1[0][0]                  
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, 12, 17, 64)   0           conv2d_3[0][0]                   
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 10, 15, 64)   36928       dropout_2[0][0]                  
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, 10, 15, 64)   0           conv2d_4[0][0]                   
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 8, 13, 64)    36928       dropout_3[0][0]                  
__________________________________________________________________________________________________
dropout_4 (Dropout)             (None, 8, 13, 64)    0           conv2d_5[0][0]                   
__________________________________________________________________________________________________
flattened (Flatten)             (None, 6656)         0           dropout_4[0][0]                  
__________________________________________________________________________________________________
dense (Dense)                   (None, 100)          665700      flattened[0][0]                  
__________________________________________________________________________________________________
dropout_5 (Dropout)             (None, 100)          0           dense[0][0]                      
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 50)           5050        dropout_5[0][0]                  
__________________________________________________________________________________________________
dropout_6 (Dropout)             (None, 50)           0           dense_1[0][0]                    
__________________________________________________________________________________________________
n_outputs0 (Dense)              (None, 1)            51          dropout_6[0][0]                  
__________________________________________________________________________________________________
n_outputs1 (Dense)              (None, 1)            51          dropout_6[0][0]                  
==================================================================================================
Total params: 817,028
Trainable params: 817,028
Non-trainable params: 0
__________________________________________________________________________________________________
None
found 0 pickles writing json records and images in tub /content/mycar/data/tub_3_19-08-17
/content/mycar/data/tub_3_19-08-17
collating 5799 records ...
train: 4639, val: 1160
total records: 5799
steps_per_epoch 36
Epoch 1/100
2019-08-18 04:53:19.663907: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-08-18 04:53:19.984999: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
35/36 [============================>.] - ETA: 0s - loss: 0.1681 - n_outputs0_loss: 0.1555 - n_outputs1_loss: 0.0126
Epoch 00001: val_loss improved from inf to 0.14482, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 8s 213ms/step - loss: 0.1677 - n_outputs0_loss: 0.1553 - n_outputs1_loss: 0.0124 - val_loss: 0.1448 - val_n_outputs0_loss: 0.1409 - val_n_outputs1_loss: 0.0039
Epoch 2/100
34/36 [===========================>..] - ETA: 0s - loss: 0.1278 - n_outputs0_loss: 0.1234 - n_outputs1_loss: 0.0044
Epoch 00002: val_loss improved from 0.14482 to 0.08814, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 53ms/step - loss: 0.1257 - n_outputs0_loss: 0.1214 - n_outputs1_loss: 0.0043 - val_loss: 0.0881 - val_n_outputs0_loss: 0.0870 - val_n_outputs1_loss: 0.0012
Epoch 3/100
35/36 [============================>.] - ETA: 0s - loss: 0.0943 - n_outputs0_loss: 0.0910 - n_outputs1_loss: 0.0033
Epoch 00003: val_loss improved from 0.08814 to 0.07490, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0938 - n_outputs0_loss: 0.0905 - n_outputs1_loss: 0.0033 - val_loss: 0.0749 - val_n_outputs0_loss: 0.0738 - val_n_outputs1_loss: 0.0011
Epoch 4/100
35/36 [============================>.] - ETA: 0s - loss: 0.0840 - n_outputs0_loss: 0.0813 - n_outputs1_loss: 0.0027
Epoch 00004: val_loss improved from 0.07490 to 0.07108, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 51ms/step - loss: 0.0835 - n_outputs0_loss: 0.0808 - n_outputs1_loss: 0.0027 - val_loss: 0.0711 - val_n_outputs0_loss: 0.0702 - val_n_outputs1_loss: 8.9668e-04
Epoch 5/100
35/36 [============================>.] - ETA: 0s - loss: 0.0767 - n_outputs0_loss: 0.0741 - n_outputs1_loss: 0.0026
Epoch 00005: val_loss did not improve from 0.07108
36/36 [==============================] - 2s 50ms/step - loss: 0.0765 - n_outputs0_loss: 0.0739 - n_outputs1_loss: 0.0026 - val_loss: 0.0722 - val_n_outputs0_loss: 0.0717 - val_n_outputs1_loss: 5.4629e-04
Epoch 6/100
35/36 [============================>.] - ETA: 0s - loss: 0.0750 - n_outputs0_loss: 0.0727 - n_outputs1_loss: 0.0023
Epoch 00006: val_loss improved from 0.07108 to 0.06483, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0749 - n_outputs0_loss: 0.0725 - n_outputs1_loss: 0.0024 - val_loss: 0.0648 - val_n_outputs0_loss: 0.0641 - val_n_outputs1_loss: 7.5336e-04
Epoch 7/100
35/36 [============================>.] - ETA: 0s - loss: 0.0711 - n_outputs0_loss: 0.0689 - n_outputs1_loss: 0.0022
Epoch 00007: val_loss improved from 0.06483 to 0.06324, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0711 - n_outputs0_loss: 0.0689 - n_outputs1_loss: 0.0021 - val_loss: 0.0632 - val_n_outputs0_loss: 0.0626 - val_n_outputs1_loss: 5.9058e-04
Epoch 8/100
35/36 [============================>.] - ETA: 0s - loss: 0.0693 - n_outputs0_loss: 0.0675 - n_outputs1_loss: 0.0018
Epoch 00008: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 51ms/step - loss: 0.0690 - n_outputs0_loss: 0.0672 - n_outputs1_loss: 0.0018 - val_loss: 0.0644 - val_n_outputs0_loss: 0.0637 - val_n_outputs1_loss: 7.8261e-04
Epoch 9/100
35/36 [============================>.] - ETA: 0s - loss: 0.0693 - n_outputs0_loss: 0.0675 - n_outputs1_loss: 0.0018
Epoch 00009: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 50ms/step - loss: 0.0690 - n_outputs0_loss: 0.0672 - n_outputs1_loss: 0.0018 - val_loss: 0.0653 - val_n_outputs0_loss: 0.0646 - val_n_outputs1_loss: 6.6657e-04
Epoch 10/100
34/36 [===========================>..] - ETA: 0s - loss: 0.0651 - n_outputs0_loss: 0.0633 - n_outputs1_loss: 0.0018
Epoch 00010: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 51ms/step - loss: 0.0657 - n_outputs0_loss: 0.0639 - n_outputs1_loss: 0.0018 - val_loss: 0.0699 - val_n_outputs0_loss: 0.0692 - val_n_outputs1_loss: 6.9729e-04
Epoch 11/100
35/36 [============================>.] - ETA: 0s - loss: 0.0630 - n_outputs0_loss: 0.0612 - n_outputs1_loss: 0.0018
Epoch 00011: val_loss did not improve from 0.06324
36/36 [==============================] - 2s 51ms/step - loss: 0.0630 - n_outputs0_loss: 0.0612 - n_outputs1_loss: 0.0018 - val_loss: 0.0633 - val_n_outputs0_loss: 0.0619 - val_n_outputs1_loss: 0.0014
Epoch 12/100
35/36 [============================>.] - ETA: 0s - loss: 0.0643 - n_outputs0_loss: 0.0626 - n_outputs1_loss: 0.0016
Epoch 00012: val_loss improved from 0.06324 to 0.06034, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0644 - n_outputs0_loss: 0.0628 - n_outputs1_loss: 0.0017 - val_loss: 0.0603 - val_n_outputs0_loss: 0.0598 - val_n_outputs1_loss: 4.9525e-04
Epoch 13/100
35/36 [============================>.] - ETA: 0s - loss: 0.0611 - n_outputs0_loss: 0.0597 - n_outputs1_loss: 0.0014
Epoch 00013: val_loss improved from 0.06034 to 0.05636, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0613 - n_outputs0_loss: 0.0599 - n_outputs1_loss: 0.0014 - val_loss: 0.0564 - val_n_outputs0_loss: 0.0557 - val_n_outputs1_loss: 6.3083e-04
Epoch 14/100
34/36 [===========================>..] - ETA: 0s - loss: 0.0593 - n_outputs0_loss: 0.0580 - n_outputs1_loss: 0.0013
Epoch 00014: val_loss did not improve from 0.05636
36/36 [==============================] - 2s 51ms/step - loss: 0.0590 - n_outputs0_loss: 0.0577 - n_outputs1_loss: 0.0013 - val_loss: 0.0580 - val_n_outputs0_loss: 0.0575 - val_n_outputs1_loss: 5.4033e-04
Epoch 15/100
34/36 [===========================>..] - ETA: 0s - loss: 0.0552 - n_outputs0_loss: 0.0540 - n_outputs1_loss: 0.0012
Epoch 00015: val_loss did not improve from 0.05636
36/36 [==============================] - 2s 51ms/step - loss: 0.0555 - n_outputs0_loss: 0.0542 - n_outputs1_loss: 0.0012 - val_loss: 0.0565 - val_n_outputs0_loss: 0.0559 - val_n_outputs1_loss: 5.6679e-04
Epoch 16/100
35/36 [============================>.] - ETA: 0s - loss: 0.0538 - n_outputs0_loss: 0.0527 - n_outputs1_loss: 0.0012
Epoch 00016: val_loss did not improve from 0.05636
36/36 [==============================] - 2s 50ms/step - loss: 0.0538 - n_outputs0_loss: 0.0526 - n_outputs1_loss: 0.0012 - val_loss: 0.0624 - val_n_outputs0_loss: 0.0619 - val_n_outputs1_loss: 5.0014e-04
Epoch 17/100
35/36 [============================>.] - ETA: 0s - loss: 0.0536 - n_outputs0_loss: 0.0525 - n_outputs1_loss: 0.0011
Epoch 00017: val_loss improved from 0.05636 to 0.05615, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 53ms/step - loss: 0.0534 - n_outputs0_loss: 0.0523 - n_outputs1_loss: 0.0011 - val_loss: 0.0561 - val_n_outputs0_loss: 0.0556 - val_n_outputs1_loss: 5.2067e-04
Epoch 18/100
35/36 [============================>.] - ETA: 0s - loss: 0.0502 - n_outputs0_loss: 0.0492 - n_outputs1_loss: 0.0011
Epoch 00018: val_loss improved from 0.05615 to 0.05594, saving model to /content/mycar/models/mypilot.h5
36/36 [==============================] - 2s 52ms/step - loss: 0.0501 - n_outputs0_loss: 0.0490 - n_outputs1_loss: 0.0011 - val_loss: 0.0559 - val_n_outputs0_loss: 0.0555 - val_n_outputs1_loss: 4.4089e-04
Epoch 00018: early stopping
Training completed in 0:00:40.


----------- Best Eval Loss :0.055939 ---------
<Figure size 640x480 with 1 Axes>
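The numbers in this log can be reproduced by hand. The sketch below is a hedged illustration, not Donkey Car's own code: it assumes an 80/20 train/validation split, a batch size of 128, and early stopping with a patience of 5 epochs and a minimum improvement of 0.0005. The function and parameter names are hypothetical, so check the actual values in your config.py. The validation losses are copied from the log above.

```python
def split_and_steps(total_records, train_fraction=0.8, batch_size=128):
    """Return (train, val, steps_per_epoch) for a dataset of the given size."""
    train = int(total_records * train_fraction)
    val = total_records - train
    return train, val, train // batch_size


def early_stop_epoch(val_losses, patience=5, min_delta=0.0005):
    """Return the 1-based epoch at which training stops, or None.

    An epoch only counts as an improvement if it beats the best loss so far
    by more than min_delta; training stops after `patience` epochs in a row
    without such an improvement.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss + min_delta < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


# val_loss per epoch, taken from the training log above.
VAL_LOSSES = [
    0.14482, 0.08814, 0.07490, 0.07108, 0.0722, 0.06483,
    0.06324, 0.0644, 0.0653, 0.0699, 0.0633, 0.06034,
    0.05636, 0.0580, 0.0565, 0.0624, 0.05615, 0.05594,
]

print(split_and_steps(5799))          # (4639, 1160, 36)
print(early_stop_epoch(VAL_LOSSES))   # 18
```

Under these assumptions, epochs 17 and 18 each lower val_loss slightly, which is why a checkpoint is still saved, but neither beats the epoch-13 value of 0.05636 by more than 0.0005, so the run stops at epoch 18.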

Display the Loss curve of the training result

In [26]:

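import glob
import os

import cv2
import matplotlib.pyplot as plt

# Pick the most recently created loss plot written by the training run.
list_of_png = glob.glob('/content/mycar/models/*png')
latest_png = max(list_of_png, key=os.path.getctime)
print(latest_png)

# OpenCV loads images as BGR; convert to RGB so matplotlib shows the right colors.
image = cv2.imread(latest_png)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))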

/content/mycar/models/mypilot.h5_loss_acc_0.055939.png

Out[26]:

<matplotlib.image.AxesImage at 0x7f867c896be0>

robocarstore/173807730625008222

Put the trained model back into Donkey Car

Once training finishes, the model is in the mycar/models directory.

  1. Download the mypilot.h5 file to your PC or Mac
  2. Copy the model from your PC or Mac to your Pi

Finally, use Autopilot mode to drive the Donkey Car.

Enjoy!

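Playing with JetBot Self-Driving (1): Preparing the DIY Parts List

robocarstore/173807773325122723

JetBot is a machine-learning self-driving car that is easy to get started with; in my experience it is easier for beginners than Donkey Car.
This article lists every electronic component, 3D-printed part, and piece of hardware I needed to assemble a JetBot.
To save you from detours, I also list the models and specifications one by one.

First, a "family photo" of the parts to give you a general overview:

robocarstore/173807775925330624

All the parts, with models and specifications, are listed below:

robocarstore/173807779624892425

During assembly I also used some tools: an M3 wrench, an M2 wrench, an M3 hex nut wrench, a Phillips screwdriver, a flat-head screwdriver, high-temperature tape, and a soldering iron (some parts need soldering).

Those are all the parts and tools I used to build this JetBot.

You can also refer to NVIDIA's official bill of materials on GitHub, which suggests alternatives for some of the parts:
https://github.com/NVIDIA-AI-IOT/jetbot/wiki/bill-of-materials

The next article walks through assembling the JetBot step by step.

to be continued…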

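Playing with JetBot Self-Driving (2): Assembling the JetBot Car

robocarstore/173807852624587444

This article describes the JetBot hardware assembly in detail and provides a video of the whole process.

The main camera malfunctioned during filming and its footage was unusable, so I could only use footage from the secondary camera. The picture quality is not great; please bear with it.

You can also consult the assembly guide in the official GitHub wiki to fill in any gaps (some of the pictures in this article come from there):
https://github.com/NVIDIA-AI-IOT/jetbot/wiki/hardware-setup

Tools you need

An M3 wrench, an M2 wrench, an M3 hex nut wrench, a Phillips screwdriver, a flat-head screwdriver, insulating tape, high-temperature tape, 3M double-sided tape, and a soldering iron (some parts need soldering).

Start assembling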

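Step – 1 Install the wireless network card

1. Remove the compute module (the whole module with the heatsink) from the Jetson Nano

Undo the 2 screws, then release the side latches; the module pops up at an angle. Slide it out slowly along that angle. robocarstore/173807821424761926 robocarstore/173807825424765927 robocarstore/173807827724901728 robocarstore/173807828824814429

2. Install the AC8265 WiFi module

Connect the antennas to the Intel AC8265 wireless card and wrap the connection with high-temperature tape so it cannot come loose, then trim away the tape covering the notch. Remove the screw from the Jetson Nano carrier board, insert the card into the card slot, and refit the screw to secure the AC8265 module. robocarstore/173807831124775830 robocarstore/173807831924893431 robocarstore/173807832725044032 robocarstore/173807836024830233

3. Refit the compute module to the Jetson Nano carrier board and secure the antennas

Insert the module back into its slot and press it flat; the side latches click into place automatically. Refit the 2 screws to secure the module, then tape the antennas to the heatsink with high-temperature tape. robocarstore/173807836924645634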

robocarstore/173807838324670435

robocarstore/173807839225296536

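Step – 2 Connect the power cable to the motor driver

4. Solder the motor driver module

Soldering is required here. The motor driver module ships with some accessories; you only need to solder the connector shown in figure 1 below. The finished result is shown in figure 4.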

robocarstore/173807841224729437

robocarstore/173807842224823038

robocarstore/173807843724698939

robocarstore/173807844433350340

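5. Expose the positive and negative wires of the MicroUSB power cable

As shown below, strip the positive and negative wires, and secure the other wires with insulating tape to prevent a short circuit. Red is usually positive and black negative; if you are unsure, refer to the MicroUSB cable wiring diagram below and check with a multimeter. You can also connect an LED and use the fact that a diode conducts in only one direction to test the polarity under power.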

robocarstore/173807845024671941

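Note: you must get the polarity right. Getting it wrong may at best burn out your Jetson Nano, and at worst cause the power bank to explode.

Get it right!

Get it right!!

Get it right!!!

Important things are worth saying three times! If you have no experience, be sure to ask someone experienced for help!

6. Connect the MicroUSB cable to the motor driver module's power terminals

First look at the motor driver module's connectors. The module is in fact a board that integrates a PCA9685 and a TB6612.

robocarstore/173807849325676742

Assembling the JetBot uses only a few of these connectors, though. This step connects only the MicroUSB cable, as the external power lead (3v3 to positive, GND to negative), as shown below: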

robocarstore/173807851424798543

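Step – 3 Install the TT motors

7. Install the TT motors

Prepare the 3D-printed chassis (chassis.stl) and the TT motors. Mount the TT motors on the chassis and secure them with M3 screws and M3 nuts. Be gentle: the 3D-printed chassis is not as sturdy as you might expect.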

robocarstore/173807852624587444

robocarstore/173807854224561745

robocarstore/173807855324679046

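Step – 4 Install the motor driver module

8. Mount the motor driver module on the chassis

robocarstore/173807856624662147

First mount the motor driver module on the chassis with M2*6 self-tapping screws (see figure 1).

Step – 5 Wire the TT motors to the motor driver module

9. Wire the TT motors to the motor driver module

Wire the TT motors to the driver module as shown below. (A mistake here is not serious: a later example lets you check the direction of rotation and adjust the wiring.)

robocarstore/173807858324621348

Note: in the assembly video the motor driver is mounted in a different orientation from the picture. Follow the orientation shown in the picture above.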

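Step – 6 Install the wheels and pre-attach the Dupont wires

10. Install the caster

Prepare the caster shroud (caster_shroud_60mm.stl), the caster base (caster_base_60mm.stl), and the polyoxymethylene ball (POM ball). Place them in the recess in the chassis in this order: caster shroud, POM ball, caster base. Then fasten them with M2*8 self-tapping screws. robocarstore/173807863324608149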

robocarstore/173807865324939550

robocarstore/173807866124643751

robocarstore/173807867024781552

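11. Pre-wire the motor driver module

Attach Dupont wires to four pins on the motor driver module in advance: 3.3v, GND, SDA, and SCL. Having them in place makes the later steps easier.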

robocarstore/173807868126285253

12. Install the left and right wheels

Fit the wheels carefully so that you do not crush the chassis by pressing too hard.

robocarstore/173807871924726254

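Step – 7 Wire the Jetson Nano, OLED, and motor driver module

13. Fix the Jetson Nano to the chassis

Use M2*6 self-tapping screws to fix the Jetson Nano to the chassis. robocarstore/173807872924554455 robocarstore/173807875224607056

14. Solder the OLED display before installing it

Before wiring, solder a long double-row right-angle pin header to the OLED display (you can also solder Dupont wires directly). robocarstore/173807876024663757 robocarstore/173807877324632258

15. Wire the OLED display to the motor driver

First, get to know the pins of the OLED display.

robocarstore/173807879224744959

Then wire the OLED display to the motor driver module as shown in the diagram below.

robocarstore/173807879924605060

16. Connect the OLED to the Jetson Nano

Plug the OLED onto the header pins shown below (they correspond one-to-one with the pins of the OLED display). See figure 2 for the orientation. robocarstore/173807881124578461 robocarstore/173807883024886862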

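Step – 8 Install the Pi V2 camera

17. Fix the camera to the chassis

First fix the Pi V2 camera to the camera mount (camera_mount.stl) with M2*6 self-tapping screws and connect the ribbon cable to the Jetson Nano. Then fix the camera mount to the chassis with M2*6 self-tapping screws. robocarstore/173807883824550163 robocarstore/173807885124829464 robocarstore/173807885824666465

Step – 9 Install the power bank

18. Put the power bank into the battery bay of the chassis

This is the last step and also the simplest. Put the power bank into the battery bay and secure it with tape or 3M double-sided tape.

robocarstore/173807886524635366


The next article explains how to flash the Jetson Nano system image onto a MicroSD card.

to be continued…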

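Playing with JetBot Self-Driving (3): System Installation and Configuration

robocarstore/173807986231205671

In the previous article we finished assembling the JetBot hardware. Now we install and configure its system: flash the JetBot SD card image, boot the Jetson Nano, and apply the settings JetBot needs to run correctly. Follow the steps below.

Flash the JetBot SD card image

1. Prepare a MicroSD card of 64GB or larger

2. Download the JetBot image (6.87GB):

Baidu Netdisk download:
Link: https://pan.baidu.com/s/1O8DVn28kY2-5-WBwUMZg9w Password: dydn

If you are told that the image is larger than your MicroSD card, try this image instead, which is 63GB after decompression:

Baidu Netdisk download:
Link: https://pan.baidu.com/s/1FqeTe4aHYhkEFKxCCEn7XQ Password: utvz

3. Download the SD card formatting tool "SD Memory Card Formatter"

It formats your MicroSD card correctly.
Official download: https://www.sdcard.org/downloads/formatter/

4. Download the SD card flashing tool "Etcher"

It writes the .img image file to the MicroSD card.
Official download: https://www.balena.io/etcher/

5. Start flashing

Connect the MicroSD card to your computer with a card reader. robocarstore/173807982431236168

robocarstore/173807983731180869

Format the SD card with SD Memory Card Formatter. If the capacity your computer reports matches the card's rated capacity, you can skip this step.

Open Etcher, select the JetBot image file you downloaded, select the MicroSD drive to flash, and click [Flash!] to start.

robocarstore/173807984631182970

If you are told that the image exceeds the size of the MicroSD card, use the 63GB image file.

This is a long wait. The whole flash took me more than 3 hours; it may be faster over USB 3.0 or on Windows.

6. Remove the SD card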

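Boot the Jetson Nano

7. Insert the MicroSD card into the Jetson Nano

Insert the freshly flashed MicroSD card into the Jetson Nano's MicroSD slot.

robocarstore/173807986231205671

8. Connect a monitor, keyboard, mouse, and power supply to the Jetson Nano

For this step, power the board from a wall adapter: an ordinary phone charger rated 5V 2A.

robocarstore/173807987531203572

I recommend booting the Jetson Nano for the first time without the PiOLED or motor driver module connected. That way you can confirm that the system boots correctly without worrying about problems caused by other hardware. After a normal boot and shutdown, reconnect the PiOLED and motor driver, double-check the wiring, and then power up again.

Connect JetBot to your local WiFi

9. Log in

After you connect power, the NVIDIA logo appears on the monitor. The system is Ubuntu; after a short wait you will see the login screen. The username and password are both: jetbot

10. Set up WiFi

Find the network icon in the top-right corner and connect to your WiFi network. From the next boot on, JetBot connects to that network automatically and shows its LAN IP address on the PiOLED screen.

11. Shut down

Once WiFi is set up, click the power icon in the top-right corner and choose "Shut Down".

12. Unplug JetBot's wall power, monitor, mouse, and keyboard

After making sure the system has shut down, disconnect everything you attached earlier: the wall power cable, the monitor cable, and the wireless mouse and keyboard.

13. Power JetBot from the power bank over MicroUSB cables

This time use two MicroUSB cables connected to the power bank: one powers JetBot and the other powers the motor driver module.

14. Wait about 2 minutes for JetBot to finish booting

15. Check the IP address shown on the PiOLED

After about 2 minutes, the PiOLED screen shows JetBot's current status, including the IP address and memory usage.

robocarstore/173807988731362373

16. Open http://<jetbot_ip_address>:8888 in a browser

For example, my JetBot's IP address is 192.168.199.142, so I enter http://192.168.199.142:8888 in my computer's browser. The page below appears; enter the password jetbot to log in.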

robocarstore/173807990431293974

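Set the power mode

To make sure the Jetson Nano does not draw more current than the battery pack can supply, put it in 5W mode with the following commands.

17. Connect to your JetBot in a browser at http://<jetbot_ip_address>:8888

18. Click + to open the launcher and start a terminal

robocarstore/173807991931385875

20. Select 5W mode with the following command

sudo nvpmodel -m1

robocarstore/173807993431444676

You are prompted for a password; it is: jetbot

21. Check that the setting took effect with the following command

sudo nvpmodel -q

robocarstore/173807994231326677

If you see "NV Power Mode: 5W", the mode is set.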

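Install the latest software (optional)

You can also skip the update and keep using the version that ships with the image.

22. Open a terminal

23. Download and install the latest version with the following commands

git clone https://github.com/NVIDIA-AI-IOT/jetbot

cd jetbot

sudo python3 setup.py install

cd ..

24. Overwrite the old notebooks with the following commands, run from the directory that contains the cloned jetbot folder

sudo apt-get install rsync

rsync -r jetbot/notebooks/ ~/Notebooks/

Everything is now ready. Next time we open the door to machine learning.

to be continued…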

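Playing with JetBot Self-Driving (4): Driving Your JetBot

robocarstore/173808120633923478

This article explains in detail how to control your JetBot from a browser with JupyterLab, and how to program JetBot in Python.

Getting to know the JupyterLab interface

In the previous article we already met JupyterLab in the browser and used a few of its features. We will keep using this tool from now on, so it is worth getting to know its interface. Knowing the names of the different areas will make the later steps easier.

robocarstore/173808120633923478

A brief overview:

  • Top menu: contains all JupyterLab operations, such as creating, saving, and shutting down running kernels
  • Launcher: a set of shortcuts for quickly creating a notebook, opening a Terminal (command line), and so on
  • Toolbar: shortcut buttons; from left to right they are "New Launcher", "New Folder", "Upload Files", and "Refresh"
  • Side tabs: open the "File Browser", "Running Kernels", "Commands", and "Tabs" panels

Next, we go through the Python statements in the notebook and explain what they mean and what they do.

The complete notebook can be viewed here, where it is formatted more nicely:

https://github.com/ling3ye/jetbot/blob/master/notebooks/basic_motion/basic_motion.ipynb

You can also download this notebook and overwrite your existing basic motion notebook.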

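Basic motion

Welcome to the JupyterLab-based JetBot programming interface.
This kind of document is called a "Jupyter Notebook": it combines text, code, and graphical output in one document, which is tidier and clearer than code with comments alone. If you are not familiar with Jupyter, I suggest opening the "Help" menu in the top menu bar, which has many references on using JupyterLab.

This notebook introduces the basics of JetBot programming and how to program your JetBot in Python.

Load the Robot class

Before we start programming JetBot, we need to import the Robot class. This class lets us control JetBot's motors easily. It is contained in the jetbot package.

If you are new to Python: a package is a folder that contains code files.
These code files are called modules.

To load the Robot class, highlight the cell below and press ctrl + enter or the play icon above. This executes the code in the cell.

from jetbot import Robot

Now that the Robot class is loaded, we can create an instance of it with the following statement.

robot = Robot()

Now that we have created a Robot instance named robot, we can use it to control our robot (JetBot). The following command makes JetBot spin counterclockwise at 30% of its maximum speed.

Note: this command makes the robot move. Make sure there is enough flat space for it to move without falling and getting damaged, or simply put it on the floor.

robot.left(speed=0.3)

Great, you should now see JetBot spinning counterclockwise!

If your robot did not turn left, one or both motors are wired incorrectly. Switch off the power, find the motor that turns the wrong way, and swap its positive and negative leads.

Reminder: always check the wiring carefully, and plug or unplug wires only with the power off.

Now run the stop method below to stop the robot.

robot.stop()

Sometimes we only want to move the robot for a set amount of time. For that we can use Python's time package. Run the following code to load time.

import time

This package defines the sleep function, which pauses execution for the given number of seconds before the next command runs. Try the following combination of commands to make the robot turn left for only half a second.

robot.left(0.3)
time.sleep(0.5)
robot.stop()

Very good. You should have seen JetBot turn left for a moment and then stop.

The Robot class also has right, forward, and backward methods. Try creating your own cell, based on the code above, that moves the robot forward at 50% speed for one second.

To create a new cell, click the highlight bar at the side and press "b", or click the "+" icon in the notebook toolbar. Then type the code that you think will move the robot forward at 50% speed for one second, and run it to check whether it is correct.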
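One possible answer to this exercise is sketched below. So that it runs without a JetBot attached, it uses a stand-in class, `FakeRobot`, which is hypothetical and only records the motor values that each call would set. The sign conventions in it (for example, `left` driving the left motor backward and the right motor forward) are my assumption about how the real `Robot` methods behave. On the real robot you would call `forward`, `time.sleep`, and `stop` on your own `robot` instance.

```python
import time


class FakeRobot:
    """Hypothetical stand-in for jetbot.Robot that only records motor values."""

    def __init__(self):
        self.left_value = 0.0
        self.right_value = 0.0
        self.history = []  # every (left, right) pair that was commanded

    def set_motors(self, left_speed, right_speed):
        self.left_value = left_speed
        self.right_value = right_speed
        self.history.append((left_speed, right_speed))

    # Assumed sign convention: positive values drive a wheel forward.
    def forward(self, speed=1.0):
        self.set_motors(speed, speed)

    def backward(self, speed=1.0):
        self.set_motors(-speed, -speed)

    def left(self, speed=1.0):
        self.set_motors(-speed, speed)

    def right(self, speed=1.0):
        self.set_motors(speed, -speed)

    def stop(self):
        self.set_motors(0.0, 0.0)


def move_forward_for(robot, speed, seconds, sleep=time.sleep):
    """Drive forward at `speed` for `seconds`, then stop."""
    robot.forward(speed)
    sleep(seconds)
    robot.stop()


robot = FakeRobot()
move_forward_for(robot, speed=0.5, seconds=1.0)
print(robot.history)  # [(0.5, 0.5), (0.0, 0.0)]
```

Always pairing the move with a `stop` call matters on real hardware: without it the motors keep running after the cell finishes.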

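Controlling the motors individually

Above we saw how to control JetBot with commands such as left and right. But what if we want to set each motor's speed individually? There are two ways to do this.

The first way is to call the set_motors method. For example, to turn left for one second along an arc, we can set the left motor to 30% and the right motor to 60%; different speed pairs give arcs of different radius.

robot.set_motors(0.3, 0.6)
time.sleep(1.0)
robot.stop()

Very good! You should have seen JetBot turn left. We can do the same thing in another way.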
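To see why unequal speeds give an arc, it helps to work through the geometry of a two-wheeled (differential-drive) robot. The sketch below is a simplified model under stated assumptions: wheel speed is taken to be proportional to the motor value, and the 0.12 m wheel spacing is a made-up figure, not a JetBot measurement. For wheel speeds v_l and v_r and wheel spacing b, the turning radius measured from the midpoint between the wheels is R = (b / 2) · (v_r + v_l) / (v_r − v_l).

```python
import math


def turning_radius(left, right, wheel_spacing):
    """Radius of the arc traced by the midpoint between the two wheels.

    Positive means the turn center is on the left, negative on the right,
    and math.inf means the robot drives in a straight line.
    """
    if left == right:
        return math.inf
    return (wheel_spacing / 2.0) * (right + left) / (right - left)


WHEEL_SPACING_M = 0.12  # hypothetical value, for illustration only

print(turning_radius(0.3, 0.6, WHEEL_SPACING_M))   # positive: arc to the left
print(turning_radius(-0.3, 0.3, WHEEL_SPACING_M))  # 0.0: spin in place
print(turning_radius(0.5, 0.5, WHEEL_SPACING_M))   # inf: straight line
```

In this model, the motor values 0.3 and 0.6 give a radius of 1.5 times the wheel spacing, whatever the spacing is, while equal and opposite values give the in-place spin that `left` and `right` produce.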

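The Robot class also has two attributes named left_motor and right_motor, which represent the left and right motors. These attributes are instances of the Motor class, and each instance has a value. When value changes, it triggers an event that reassigns the motor's speed.

So in the Motor class a function is attached that updates the motor command whenever the value changes. To do exactly what we did above, we can therefore run the following.

robot.left_motor.value = 0.3
robot.right_motor.value = 0.6
time.sleep(1.0)
robot.left_motor.value = 0.0
robot.right_motor.value = 0.0

You should see JetBot move in the same way!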

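Connecting motors to HTML widgets with the traitlets library

Next comes a very cool feature: Jupyter notebooks let us create small graphical controls (widgets) on the page, and traitlets can link those widgets to the motors. That lets us control the car with buttons on a web page, which is convenient and fun.

To show how to write the program, we first create and display two sliders for controlling the motors.

import ipywidgets.widgets as widgets
from IPython.display import display

# create two sliders with range [-1.0, 1.0]
left_slider = widgets.FloatSlider(description='left', min=-1.0, max=1.0, step=0.01, orientation='vertical')
right_slider = widgets.FloatSlider(description='right', min=-1.0, max=1.0, step=0.01, orientation='vertical')

# create a horizontal box container to place the sliders next to each other
slider_container = widgets.HBox([left_slider, right_slider])

# display the container in this cell's output
display(slider_container)

You should see two vertical sliders displayed above.

Tip: in JupyterLab you can pop a cell's output out into a separate window, for example these two sliders. Although it is in a different window, it is still connected to this notebook. To do this, right-click the cell output (for example, the sliders), choose "Create New View for Output", and drag the new window wherever you like.

Try clicking and dragging the sliders up and down; you will see the values change. Note that JetBot's motors do not respond yet when we move the sliders, because we have not connected the sliders to the motors. We do that with the link function from the traitlets package.

import traitlets

left_link = traitlets.link((left_slider, 'value'), (robot.left_motor, 'value'))
right_link = traitlets.link((right_slider, 'value'), (robot.right_motor, 'value'))

Now try dragging the sliders (slowly at first, so that your JetBot does not shoot off the edge and get damaged). You should see the corresponding motor turn!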

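The link function we called above actually creates a two-way link! That means that if we set the motor values somewhere else, the sliders update too. Try running the code block below:

robot.forward(0.3)
time.sleep(1.0)
robot.stop()

When you run the code above, you should see the sliders move in response to the motor speed values. To break the connection, call the unlink method of each link.

left_link.unlink()
right_link.unlink()

But what if we do not want a two-way link? Say we only want the sliders to display the motor speed values, not control them. For that we can use the dlink function. The first argument is the source and the second is the target (the data comes from the motor and is displayed on the slider).

left_link = traitlets.dlink((robot.left_motor, 'value'), (left_slider, 'value'))
right_link = traitlets.dlink((robot.right_motor, 'value'), (right_slider, 'value'))

Now move the sliders up and down: the robot's motors should not respond at all. But when we set the motor speed values in code, the sliders update to match.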
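The difference between the two kinds of link can be illustrated without any hardware. The sketch below is a toy model written for this article, not the traitlets implementation: `Observable`, `link`, and `dlink` are simplified stand-ins that only mimic the behavior described above (two-way updates versus source-to-target updates).

```python
class Observable:
    """Toy object with a `value` that notifies callbacks when it changes."""

    def __init__(self, value=0.0):
        self._value = value
        self._callbacks = []

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        if new == self._value:
            return  # no change, so no notification (this also stops loops)
        old, self._value = self._value, new
        for callback in list(self._callbacks):
            callback({"old": old, "new": new})

    def observe(self, callback):
        self._callbacks.append(callback)


def dlink(source, target):
    """One-way link: changes to source are copied to target."""
    target.value = source.value

    def copy(change):
        target.value = change["new"]

    source.observe(copy)


def link(a, b):
    """Two-way link: a change on either side is copied to the other."""
    dlink(a, b)
    dlink(b, a)


slider, motor = Observable(), Observable()
link(slider, motor)
slider.value = 0.3      # moving the slider drives the motor
motor.value = 0.6       # setting the motor moves the slider
print(slider.value, motor.value)  # 0.6 0.6

display_only, motor2 = Observable(), Observable()
dlink(motor2, display_only)
motor2.value = 0.4        # the display follows the motor
display_only.value = 0.9  # but the motor ignores the display
print(display_only.value, motor2.value)  # 0.9 0.4
```

The early return in the setter is what keeps the two-way link from bouncing updates back and forth forever.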

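Attaching functions to events

Another way to use traitlets is to attach functions (for example forward) to events. These functions are called whenever the object changes, and are passed some information about the change, such as the old value and the new value.

Let's first create some buttons for controlling the robot and display them in the notebook.

button_layout = widgets.Layout(width='100px', height='80px', align_self='center')
stop_button = widgets.Button(description='stop', button_style='danger', layout=button_layout)
forward_button = widgets.Button(description='forward', layout=button_layout)
backward_button = widgets.Button(description='backward', layout=button_layout)
left_button = widgets.Button(description='left', layout=button_layout)
right_button = widgets.Button(description='right', layout=button_layout)

middle_box = widgets.HBox([left_button, stop_button, right_button],
                          layout=widgets.Layout(align_self='center'))
controls_box = widgets.VBox([forward_button, middle_box, backward_button])
display(controls_box)

You should see a set of robot control buttons displayed above, but clicking them does nothing yet. To make them control the robot, we need to create some functions and attach them to the buttons' on_click events.

def stop(change):
    robot.stop()

# Each step handler moves briefly and then stops.
# The speeds and durations are example values; tune them for your robot.
def step_forward(change):
    robot.forward(0.4)
    time.sleep(0.5)
    robot.stop()

def step_backward(change):
    robot.backward(0.4)
    time.sleep(0.5)
    robot.stop()

def step_left(change):
    robot.left(0.3)
    time.sleep(0.5)
    robot.stop()

def step_right(change):
    robot.right(0.3)
    time.sleep(0.5)
    robot.stop()

Now that we have defined these functions, let's attach them to each button's on_click event.

# link buttons to actions
stop_button.on_click(stop)
forward_button.on_click(step_forward)
backward_button.on_click(step_backward)
left_button.on_click(step_left)
right_button.on_click(step_right)

After running the code above, each button you click should make JetBot move accordingly.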

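Heartbeat kill switch

Here we show how to use the heartbeat package to stop JetBot from moving. It is a simple way to detect whether the connection between JetBot and the browser is still alive. You can adjust the heartbeat period (in seconds) with the slider shown below. If a round trip between JetBot and the browser cannot be completed within two heartbeats, the heartbeat's status attribute is set to dead. As soon as the connection is restored, status is set back to alive.

from jetbot import Heartbeat

heartbeat = Heartbeat()

# this function will be called when heartbeat 'alive' status changes
def handle_heartbeat_status(change):
    if change['new'] == Heartbeat.Status.dead:
        robot.stop()

heartbeat.observe(handle_heartbeat_status, names='status')

period_slider = widgets.FloatSlider(description='period', min=0.001, max=0.5, step=0.01, value=0.5)
traitlets.dlink((period_slider, 'value'), (heartbeat, 'period'))

display(period_slider, heartbeat.pulseout)

Try running the code below to start the motors, then lower the slider and see what happens. You can also try switching off or disconnecting your robot or your computer.

robot.left(0.2)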
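The idea behind the heartbeat can be sketched as a small watchdog. This is a simplified illustration under my own assumptions, not the `jetbot.Heartbeat` implementation: the class name `Watchdog`, its methods, and the rule "dead when no reply has arrived for two periods" are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class Watchdog:
    """Marks the connection dead when replies stop arriving."""

    def __init__(self, period, on_dead, now=0.0):
        self.period = period          # seconds between heartbeats
        self.on_dead = on_dead        # called once when status becomes dead
        self.last_reply = now         # time of the most recent reply
        self.status = "alive"

    def reply_received(self, now):
        """Record a reply from the browser; this revives a dead connection."""
        self.last_reply = now
        self.status = "alive"

    def check(self, now):
        """Call regularly; flips status to dead after two silent periods."""
        silent_for = now - self.last_reply
        if self.status == "alive" and silent_for > 2 * self.period:
            self.status = "dead"
            self.on_dead()
        return self.status


events = []
dog = Watchdog(period=0.5, on_dead=lambda: events.append("stop motors"))

dog.reply_received(now=0.4)
print(dog.check(now=1.0))   # alive: only 0.6 s since the last reply
print(dog.check(now=1.5))   # dead: 1.1 s of silence exceeds 2 * 0.5 s
dog.reply_received(now=1.6)
print(dog.check(now=1.7))   # alive again
print(events)               # ['stop motors']
```

Shortening the period in this model makes the watchdog trip sooner, which mirrors what happens when you lower the period slider until the network can no longer keep up.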

Summary

This is a simple notebook example; I hope it gives you confidence in programming your JetBot.

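Fun with JetBot Autonomous Driving (5): Teaching JetBot to Recognize Danger by Collecting Data

If you managed to drive the JetBot around with the basic motion notebook, that is great.
Even better is having the JetBot move around all by itself.
This is a very hard task with many different approaches, but the whole problem can usually be broken down into easier sub-problems.
The most important sub-problem is keeping the JetBot out of dangerous situations, which we call collision avoidance.

In this set of notebooks, we will tackle the problem with deep learning and one very versatile sensor: the camera.
You will learn how to use a neural network, a camera, and the NVIDIA Jetson Nano to teach the JetBot a very useful behavior: collision avoidance.

Imagine a virtual safety bubble around the robot. Inside this bubble the robot can spin in a full circle without hitting any object (or running into other trouble, such as falling off a table).

Of course, the JetBot is limited by its field of view, so we cannot prevent problems such as an object being placed behind it. But we can prevent the JetBot from entering these places or scenarios.

Our approach is very simple. First, we manually place the JetBot in places or scenarios where its safety bubble is violated, take a picture of each one, and label it blocked. Second, we manually place the JetBot in places or scenarios where its safety bubble is intact, take a picture of each one, and label it free.

That is all we do in this notebook: data collection. Once we have plenty of images and labels, we upload the data to a GPU-enabled machine and _train_ a neural network that predicts, from the image the JetBot sees, whether the safety bubble is being violated. Finally, we use this neural network to implement a simple collision avoidance behavior.

Important note: when the JetBot spins in place, it actually rotates about the midpoint between its two wheels, not about the center of the chassis. Keep this detail in mind when you estimate whether the safety bubble is violated during a spin. Do not worry too much, though; it does not need to be exact. If in doubt, lean toward the cautious side (for example, imagine a larger safety bubble). We want to make sure the JetBot never gets into a situation where it cannot turn and cannot get out.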
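To see why the pivot point matters, here is a small geometric sketch. The chassis dimensions are made up for illustration and are not measurements of a real JetBot. The function computes the radius of the circle swept by the chassis corners when spinning about a given pivot.

```python
import math


def bubble_radius(corners, pivot=(0.0, 0.0)):
    """Radius of the circle swept when spinning in place about `pivot`."""
    return max(math.hypot(x - pivot[0], y - pivot[1]) for x, y in corners)


# Hypothetical chassis outline in cm. The origin is the midpoint between
# the two wheels, and the body extends further back than forward.
corners = [(-6.0, 3.0), (6.0, 3.0), (-6.0, -12.0), (6.0, -12.0)]

about_axle = bubble_radius(corners)                        # actual pivot
about_center = bubble_radius(corners, pivot=(0.0, -4.5))   # chassis center
print(round(about_axle, 2), round(about_center, 2))
```

With these assumed dimensions the bubble around the wheel midpoint is noticeably larger than one drawn around the chassis center, which is the reason to err on the cautious side when labeling.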

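Display the live camera feed

Let's get started. First, we initialize the camera and display what it sees, as we did in an earlier notebook.

Our neural network takes a 224×224 pixel image as input, so we set the camera to that size. This keeps each file small and therefore keeps the dataset small. (We have tested that this resolution works for this task.) In some scenarios it may be better to collect data at a larger image size and downscale to the required size later.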

import traitlets
import ipywidgets.widgets as widgets
from IPython.display import display
from jetbot import Camera, bgr8_to_jpeg

camera = Camera.instance(width=224, height=224)

image = widgets.Image(format='jpeg', width=224, height=224)  # this width and height doesn't necessarily have to match the camera

camera_link = traitlets.dlink((camera, 'value'), (image, 'value'), transform=bgr8_to_jpeg)

display(image)

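After running the code block above, you can see the live picture from the camera.

Next, let's create some directories to store the data. We will create a folder called dataset containing two subfolders, free and blocked, which hold the images for each class.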

import os

blocked_dir = 'dataset/blocked'
free_dir = 'dataset/free'

# we have this "try/except" statement because these next functions can throw an error if the directories exist already
try:
    os.makedirs(free_dir)
    os.makedirs(blocked_dir)
except FileExistsError:
    print('Directories not created because they already exist')
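After running the code block above, refresh the Jupyter file browser on the left and you should see these directories.

Create the labeling buttons

Next, we create some buttons for saving snapshots under each label. We also create text boxes that show how many images we have collected so far for each label. This matters because we want to collect as many free images as blocked images. It also helps us keep track of how many images we have collected overall.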

button_layout = widgets.Layout(width='128px', height='64px')

free_button = widgets.Button(description='add free', button_style='success', layout=button_layout)

blocked_button = widgets.Button(description='add blocked', button_style='danger', layout=button_layout)

free_count = widgets.IntText(layout=button_layout, value=len(os.listdir(free_dir)))

blocked_count = widgets.IntText(layout=button_layout, value=len(os.listdir(blocked_dir)))

display(widgets.HBox([free_count, free_button]))

display(widgets.HBox([blocked_count, blocked_button]))
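So far, those buttons do nothing. We need to attach an image-saving function to each button's on_click event. We save the value of the Image widget, which is already compressed JPEG data, into the folder for the matching class.

To make sure we never repeat a file name (even across different machines), we use Python's uuid package, which provides the uuid1 function for generating unique identifiers. The identifier is generated from the current machine address and the current time.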

from uuid import uuid1

def save_snapshot(directory):
    image_path = os.path.join(directory, str(uuid1()) + '.jpg')
    with open(image_path, 'wb') as f:
        f.write(image.value)

def save_free():
    global free_dir, free_count
    save_snapshot(free_dir)
    free_count.value = len(os.listdir(free_dir))

def save_blocked():
    global blocked_dir, blocked_count
    save_snapshot(blocked_dir)
    blocked_count.value = len(os.listdir(blocked_dir))

# attach the callbacks, we use a 'lambda' function to ignore the
# parameter that the on_click event would provide to our function
# because we don't need it.
free_button.on_click(lambda x: save_free())
blocked_button.on_click(lambda x: save_blocked())
Great, the buttons above should now save images into the free or blocked folder. You can use the Jupyter Lab file browser to look at these files.
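To try the naming and counting logic without a camera, here is a minimal standalone sketch. It writes placeholder bytes into a temporary directory instead of real camera frames. The `jpeg_bytes` parameter and the temporary paths are assumptions made for this example.

```python
import os
import tempfile
from uuid import uuid1


def save_snapshot(directory, jpeg_bytes):
    """Write the bytes to a uniquely named .jpg file and return its path."""
    os.makedirs(directory, exist_ok=True)
    image_path = os.path.join(directory, str(uuid1()) + '.jpg')
    with open(image_path, 'wb') as f:
        f.write(jpeg_bytes)
    return image_path


root = tempfile.mkdtemp()
free_dir = os.path.join(root, 'dataset', 'free')
fake_jpeg = b'\xff\xd8\xff\xd9'  # placeholder bytes, not a real camera frame

paths = [save_snapshot(free_dir, fake_jpeg) for _ in range(3)]
free_count = len(os.listdir(free_dir))
print(free_count)
```

Counting with `os.listdir` after every save, as the notebook does, keeps the displayed number correct even if images were already in the folder from an earlier session.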

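Now go ahead and collect some data

  1. Place the JetBot in a blocked scenario and press the add blocked button
  2. Place the JetBot in a free scenario and press the add free button
  3. Repeat steps 1 and 2

Tip: you can send the button widgets to a new window to make them easier to use. We will also run the code below to display them together.

Here are some tips for labeling data

  1. Try different orientations
  2. Try different lighting
  3. Try different object and collision types: walls, objects, and so on
  4. Try floors and objects with different textures: patterned, smooth, glass, and so on

Ultimately, the more real-world scenarios the JetBot has data for, the better its collision avoidance will be, so it is important to collect varied data and not just a lot of data. You probably need about 100 images per class (this is not a scientific rule, just a helpful hint). Do not worry about the amount; it goes quickly once you get started.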

display(widgets.HBox([free_count, free_button]))

display(widgets.HBox([blocked_count, blocked_button]))

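Next steps

Once you have collected enough data, we need to copy it to our GPU platform for training. First, we can call a _terminal_ (command line) command to pack and compress the data into a single *.zip file.

The ! prefix means the line is run as a shell command. The -r flag includes all files in all subfolders. The -q flag tells the zip command not to print any output.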

!zip -r -q dataset.zip dataset
You should see a file named dataset.zip in the Jupyter Lab file browser. Right-click the file to download it.
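If the zip command is not available, roughly the same archive can be built with Python's standard zipfile module. This is a sketch, not part of the original notebook. The `zip_folder` helper and the temporary demo files are assumptions for illustration, and unlike `zip -r` it stores only files, not separate directory entries.

```python
import os
import tempfile
import zipfile


def zip_folder(folder, archive_path):
    """Recursively add every file under `folder` to a zip archive."""
    parent = os.path.dirname(folder)
    with zipfile.ZipFile(archive_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for dirpath, _dirnames, filenames in os.walk(folder):
            for name in filenames:
                full_path = os.path.join(dirpath, name)
                # Store paths relative to the parent so they start with 'dataset/'
                zf.write(full_path, os.path.relpath(full_path, parent))


# Build a tiny demo dataset in a temporary directory
root = tempfile.mkdtemp()
for label in ('free', 'blocked'):
    os.makedirs(os.path.join(root, 'dataset', label))
    with open(os.path.join(root, 'dataset', label, 'example.jpg'), 'wb') as f:
        f.write(b'placeholder')

archive = os.path.join(root, 'dataset.zip')
zip_folder(os.path.join(root, 'dataset'), archive)

with zipfile.ZipFile(archive) as zf:
    names = sorted(zf.namelist())
print(names)
```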

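Next, we need to upload this data to our GPU platform or cloud machine to train the collision avoidance neural network.

Since the Jetson Nano itself has a GPU, we will train directly on the Jetson Nano. The next article in this series walks through it step by step, and it will be very simple.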

Fun with JetBot Autonomous Driving (6): Training JetBot to Recognize Danger on Its Own

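Before training, we recommend shutting down the JetBot and powering it from a 5V, 3A charger or a dedicated power supply. A power bank is not recommended, because training uses a lot of resources and drains power quickly. If that is too much trouble, at least charge the power bank fully so it does not run out partway through training and waste the effort.

Power on, then enter the following address in your browser:
http://<your JetBot's IP address>:8888

You will see the Jupyter Lab login prompt:

The default account and password are both: jetbot

In the Jupyter Lab file browser, find the collision_avoidance folder and open train_model.ipynb to start training.

Collision avoidance: training the model

The training approach is simple. We train an image classifier on two classes, free and blocked, and use the trained model to avoid collisions. To do this, we use a popular deep learning library, PyTorch.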

import torch

import torch.optim as optim

import torch.nn.functional as F

import torchvision.datasets as datasets

import torchvision.models as models

import torchvision.transforms as transforms

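Upload and extract the dataset

Last time, we collected a dataset and packed it into a .zip file. Here we call a shell command to extract (unzip) that dataset. If you are training directly on the Jetson Nano, you can skip this step.

You should see a folder named dataset appear in the file browser.

Create the dataset instance

Now we use the ImageFolder dataset class from the torchvision.datasets package. We attach transforms from the torchvision.transforms package to convert the data and get it ready for training.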

dataset = datasets.ImageFolder(
    'dataset',
    transforms.Compose([
        transforms.ColorJitter(0.1, 0.1, 0.1, 0.1),
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
)

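Split the dataset into training and test sets

Next, we split the dataset into a training set and a test set. The test set is used to verify the accuracy of the model we train.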

train_dataset, test_dataset = torch.utils.data.random_split(dataset, [len(dataset) - 50, 50])
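Conceptually, a random split shuffles the sample indices and holds out a fixed number of them. The plain-Python sketch below illustrates that idea only. It is not PyTorch's implementation, and the dataset size of 200 is an assumed figure. Note that the call above needs more than 50 images in the dataset, because the training set gets len(dataset) - 50 of them.

```python
import random


def split_indices(n_items, n_test, seed=None):
    """Shuffle indices 0..n_items-1 and hold out the last n_test of them."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return indices[:n_items - n_test], indices[n_items - n_test:]


# Assumed dataset size of 200 images; 50 are held out for testing
train_idx, test_idx = split_indices(200, 50, seed=0)
print(len(train_idx), len(test_idx))
```

Every image lands in exactly one of the two sets, so the test accuracy is measured on images the network never saw during training.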

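Create data loaders to load data in batches

We create two DataLoader instances, which provide utilities for shuffling the data, producing _batches_ of images, and loading samples in parallel with multiple workers.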

train_loader = torch.utils.data.DataLoader(

test_loader = torch.utils.data.DataLoader(

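Define the neural network

Now we define the neural network we are going to train. The torchvision package provides a collection of pre-trained models that we can use.

In a process called _transfer learning_, we repurpose a pre-trained model (trained on millions of images) for a new task that may have far less data available.

Important features learned during the original training of the pre-trained model can be reused for the new task. We will use the alexnet model.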

model = models.alexnet(pretrained=True)

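The alexnet model was originally trained on a dataset with 1000 class labels, but our dataset has only two. We replace the final layer with a new, untrained layer that has only two outputs.

model.classifier[6] = torch.nn.Linear(model.classifier[6].in_features, 2)

Finally, we transfer the model to the GPU for execution.

device = torch.device('cuda')

model = model.to(device)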

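Train the neural network

The code below trains the neural network and, after each epoch, saves the best-performing model so far.

One epoch is one full pass through all the data.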

NUM_EPOCHS = 30  # assumed value; the source does not show it
BEST_MODEL_PATH = 'best_model.pth'
best_accuracy = 0.0

optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for epoch in range(NUM_EPOCHS):

    # training pass
    for images, labels in iter(train_loader):
        images = images.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = F.cross_entropy(outputs, labels)
        loss.backward()
        optimizer.step()

    # evaluation pass on the held-out test set
    test_error_count = 0.0
    for images, labels in iter(test_loader):
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        test_error_count += float(torch.sum(torch.abs(labels - outputs.argmax(1))))

    test_accuracy = 1.0 - float(test_error_count) / float(len(test_dataset))
    print('%d: %f' % (epoch, test_accuracy))

    # keep only the weights with the best test accuracy so far
    if test_accuracy > best_accuracy:
        torch.save(model.state_dict(), BEST_MODEL_PATH)
        best_accuracy = test_accuracy

When training finishes, you should see a file named best_model.pth in the Jupyter Lab file browser. Right-click it and choose download to save the model to your own machine.
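The accuracy bookkeeping in the training loop can be checked by hand with a plain-Python sketch. The labels and per-epoch predictions below are made up for illustration. With two classes labeled 0 and 1, the absolute difference between label and prediction is 1 for a mistake and 0 otherwise, so summing it counts the errors.

```python
def accuracy(labels, predictions):
    """Fraction of correct predictions for 0/1 class labels."""
    error_count = sum(abs(l - p) for l, p in zip(labels, predictions))
    return 1.0 - error_count / len(labels)


labels = [0, 1, 1, 0, 1]

# Made-up predictions for three epochs, for illustration only
epoch_predictions = [
    [1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 1, 1],
]

best_accuracy = 0.0
best_epoch = None
for epoch, predictions in enumerate(epoch_predictions):
    test_accuracy = accuracy(labels, predictions)
    if test_accuracy > best_accuracy:
        # In the real loop this is where the model weights are saved
        best_accuracy = test_accuracy
        best_epoch = epoch

print(best_epoch, best_accuracy)
```

Because the comparison is strict, a later epoch that only ties the best accuracy does not overwrite the saved file.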

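You could say this file is what the JetBot has learned. In the next article, we use this knowledge to let the JetBot drive on its own and avoid obstacles.

Fun with JetBot Autonomous Driving (7): Automatic Collision Avoidance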

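This is the last of the seven tutorials in this introductory series. Together they take the JetBot all the way to avoiding obstacles in real time using its camera.

The JetBot now needs to be switched back to power bank supply.

After powering on, enter the following address in your browser:
http://<your JetBot's IP address>:8888

You will see the Jupyter Lab login prompt:

The default account and password are both: jetbot

In the Jupyter Lab file browser, find the collision_avoidance folder and open live_demo.ipynb to start live collision avoidance.

Live collision avoidance

In this notebook, we use the model we trained last time to detect whether the JetBot is free or blocked and respond accordingly.

Load the trained model

We assume you have already trained the model by following the training notebook and downloaded it to your workstation. Now you need to upload the model to the same directory as this notebook. The Jupyter Lab file browser has an upload button; click it to upload the file.

Before executing the code in the next cell, make sure the trained model has finished uploading.

Execute the code below to initialize the PyTorch model.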

import torch

import torchvision

model = torchvision.models.alexnet(pretrained=False)

model.classifier[6] = torch.nn.Linear(model.classifier[6].in_features, 2)

Next, load the trained weights from the `best_model.pth` file you uploaded.
```python
model.load_state_dict(torch.load('best_model.pth'))
```

At the moment, the model weights are in CPU memory. Execute the code below to move the model to the GPU.

device = torch.device('cuda')

model = model.to(device)

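Preprocessing function

We have loaded the model, but there is one small issue: the image format from our camera must exactly match the image format used when training the model. To achieve that, we need some preprocessing, in the following steps:

  1. Convert from BGR to RGB
  2. Convert from HWC layout to CHW layout
  3. Normalize using the same parameters as during training (our camera provides values in the [0, 255] range, while the training images were loaded in the [0, 1] range, so we need to scale by 255.0)
  4. Transfer the data from CPU memory to GPU memory
  5. Add a batch dimension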
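import cv2
import numpy as np

mean = 255.0 * np.array([0.485, 0.456, 0.406])
stdev = 255.0 * np.array([0.229, 0.224, 0.225])

normalize = torchvision.transforms.Normalize(mean, stdev)

def preprocess(camera_value):
    x = camera_value
    x = cv2.cvtColor(x, cv2.COLOR_BGR2RGB)   # step 1: BGR -> RGB
    x = x.transpose((2, 0, 1))               # step 2: HWC -> CHW
    x = torch.from_numpy(x).float()
    x = normalize(x)                         # step 3: same statistics as training
    x = x.to(device)                         # step 4: CPU -> GPU
    x = x[None, ...]                         # step 5: add a batch dimension
    return x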

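Very good! We have now defined our preprocessing function, which converts an image from the camera format to the format the neural network expects as input.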
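The reason for multiplying the mean and standard deviation by 255.0 can be verified with a small plain-Python sketch. The pixel value below is made up. The two functions are illustrations written for this example and are not part of the notebook.

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_unit_range(pixel):
    """Training style: scale an RGB pixel to [0, 1], then normalize."""
    return [(v / 255.0 - m) / s for v, m, s in zip(pixel, MEAN, STD)]


def normalize_byte_range(pixel):
    """Live-demo style: keep [0, 255] values and scale mean and stdev instead."""
    return [(v - 255.0 * m) / (255.0 * s) for v, m, s in zip(pixel, MEAN, STD)]


bgr_pixel = [30, 128, 255]     # made-up camera value in B, G, R order
rgb_pixel = bgr_pixel[::-1]    # BGR -> RGB

a = normalize_unit_range(rgb_pixel)
b = normalize_byte_range(rgb_pixel)
same = all(abs(x - y) < 1e-9 for x, y in zip(a, b))
print(same)
```

Both routes compute (v/255 - m)/s, just factored differently, so the network sees the same numbers it saw during training.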

Now let's display our camera. You should be very familiar with this by now. We also create a slider that shows the probability that the robot is blocked.

import traitlets
from IPython.display import display
import ipywidgets.widgets as widgets
from jetbot import Camera, bgr8_to_jpeg

camera = Camera.instance(width=224, height=224)

image = widgets.Image(format='jpeg', width=224, height=224)

blocked_slider = widgets.FloatSlider(description='blocked', min=0.0, max=1.0, orientation='vertical')

camera_link = traitlets.dlink((camera, 'value'), (image, 'value'), transform=bgr8_to_jpeg)

display(widgets.HBox([image, blocked_slider]))

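We also create the robot instance needed to drive the motors.

from jetbot import Robot

robot = Robot()

Next, we create a function that is called whenever the camera's value changes. This function performs the following steps:

  1. Preprocess the camera image
  2. Execute the neural network
  3. If the neural network output indicates that we are blocked, turn left; otherwise, keep going forward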

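import torch.nn.functional as F

def update(change):
    global blocked_slider, robot
    x = preprocess(change['new'])
    y = model(x)

    # we apply the `softmax` function to normalize the output vector so it sums to 1 (which makes it a probability distribution)
    y = F.softmax(y, dim=1)

    prob_blocked = float(y.flatten()[0])

    blocked_slider.value = prob_blocked

    # the 0.5 threshold and 0.4 speeds are assumed values; the source does not show them
    if prob_blocked < 0.5:
        robot.forward(0.4)
    else:
        robot.left(0.4)

update({'new': camera.value})  # we call the function once to initialize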
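The softmax step and the turn-or-forward decision inside the update function can be tried out in plain Python. The raw scores and the 0.5 threshold below are assumptions for illustration. Index 0 corresponds to blocked because ImageFolder assigns class indices in alphabetical order of the folder names.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def choose_action(logits, threshold=0.5):
    """Return ('left' or 'forward', probability of being blocked)."""
    prob_blocked = softmax(logits)[0]
    action = 'left' if prob_blocked >= threshold else 'forward'
    return action, prob_blocked


blocked_case = choose_action([2.0, 0.0])    # network favors 'blocked'
free_case = choose_action([-1.0, 1.0])      # network favors 'free'
print(blocked_case[0], free_case[0])
```

Raising the threshold makes the robot turn less often, and lowering it makes the robot more cautious.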
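Good! We have created the function that executes the neural network, but now we need to attach it to the camera so that it processes each frame.

We do this with the observe function.

Warning: this code will move the robot! Make sure your JetBot is in a safe place.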

camera.observe(update, names='value')  # this attaches the 'update' function to the 'value' traitlet of our camera

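Awesome! If you have run the code block above, it should now generate a new command for every frame the camera captures. You might start by placing the robot on the ground and watching how it reacts when it reaches an obstacle.

To stop this behavior, detach the callback by executing the code below.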

camera.unobserve(update, names='value')
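As a rough illustration of what attaching and detaching a callback does, here is a hypothetical minimal `Observable` class. It is loosely modeled on the observe and unobserve calls used above but is not traitlets itself, and it ignores the `names` argument.

```python
class Observable:
    """Hypothetical value holder that notifies callbacks when its value changes."""

    def __init__(self, value=None):
        self._value = value
        self._callbacks = []

    def observe(self, callback):
        self._callbacks.append(callback)

    def unobserve(self, callback):
        self._callbacks.remove(callback)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        old, self._value = self._value, new
        for callback in list(self._callbacks):
            callback({'old': old, 'new': new})


seen = []


def update(change):
    seen.append(change['new'])


camera = Observable()
camera.observe(update)
camera.value = 'frame 1'    # update runs
camera.unobserve(update)
camera.value = 'frame 2'    # update no longer runs
print(seen)
```

Note that detaching the callback only stops new commands from being issued; it does not by itself send a stop command to the motors.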

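Perhaps you want the JetBot to run without streaming video, which reduces its computational load. You can unlink the camera by executing the code below.
This only stops streaming to the browser; the camera on the JetBot keeps running.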

camera_link.unlink()  # don't stream to browser (will still run camera)

To resume showing the video in the browser, execute the code below.

camera_link.link()  # stream to browser (wont run camera)

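Summary

That was fun: your robot can now avoid obstacles intelligently!

If your robot does not avoid collisions very well, try to work out where it fails. The beauty is that we can collect more data for those failure cases and make the JetBot even better :)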