認識結果の表示ルーチン(display_objects_in_gui)

203行目
パラメータは

source_image：入力画像(表示画像)
filtered_objects：整理された認識結果

filtered_objects は最終認識結果数 × 6 の 2次元配列(list)
データ構成はこんな感じ。

filtered_objects[n][0] : クラスのラベル(認識結果の名称; 文字列)
filtered_objects[n][1] : X座標(pixel単位)
filtered_objects[n][2] : Y座標(pixel単位)
filtered_objects[n][3] : 幅(pixel単位)
filtered_objects[n][4] : 高さ(pixel単位)
filtered_objects[n][5] : スコア

def display_objects_in_gui(source_image, filtered_objects):

205行目
入力画像を表示用にdisplay_imageにコピーする。(もともと入力されたsource_imageは汚さない。)
source_image_width、source_image_heightは入力画像の幅と高さ。

    display_image = source_image.copy()
    source_image_width = source_image.shape[1]
    source_image_height = source_image.shape[0]

209行目
NETWORK_IMAGE_WIDTHとNETWORK_IMAGE_HEIGHT はニューラルネットに入力した画像サイズ(グローバル変数)。
どうせなら関数パラメータで渡した方がスマートだと思うが…
filtered_objectsの各データはこのサイズで定義されているので、表示用に変換するための比率をx_ratio、y_ratioとして得る。

    x_ratio = float(source_image_width) / NETWORK_IMAGE_WIDTH
    y_ratio = float(source_image_height) / NETWORK_IMAGE_HEIGHT

213行目
それぞれのバウンティングボックスに対してのループ。

    print('Found this many objects in the image: ' + str(len(filtered_objects)))
    for obj_index in range(len(filtered_objects)):

215行目
認識結果のX座標(中心)、Y座標(中心)、幅、高さを表示用画像のサイズに変換する。

        center_x = int(filtered_objects[obj_index][1] * x_ratio) 
        center_y = int(filtered_objects[obj_index][2] * y_ratio)
        half_width = int(filtered_objects[obj_index][3] * x_ratio)//2
        half_height = int(filtered_objects[obj_index][4] * y_ratio)//2

221行目
X座標(中心)、Y座標(中心)、幅、高さからX座標(左端)、Y座標(上端)、X座標(右端)、Y座標(右端)に変換。
表示画像の範囲からはみ出ないように制限処理を付けてある。

        box_left = max(center_x - half_width, 0)
        box_top = max(center_y - half_height, 0)
        box_right = min(center_x + half_width, source_image_width)
        box_bottom = min(center_y + half_height, source_image_height)

        print('box at index ' + str(obj_index) + ' is... left: ' + str(box_left) + ', top: ' + str(box_top) + ', right: ' + str(box_right) + ', bottom: ' + str(box_bottom))  

229行目
表示画像にバウンティングボックスの四角を描く。
色は緑、線幅は2。

        box_color = (0, 255, 0)  # green box
        box_thickness = 2
        cv2.rectangle(display_image, (box_left, box_top),(box_right, box_bottom), box_color, box_thickness)

234行目
表示画像に認識結果の名称とスコアを書く。
背景は暗い緑。文字色は白。
表示位置はバウンティングボックスの上20ピクセルの場所。
サイズは縦20ピクセル、横バウンティングボックスと同サイズ。
(バウンティングボックスの上端が20未満の時大丈夫なんだろか？表示が切れるだけ？)
(バウンティングボックスの右端より認識結果文字列が長いときも？)

        label_background_color = (70, 120, 70) # greyish green background for text
        label_text_color = (255, 255, 255)   # white text
        cv2.rectangle(display_image,(box_left, box_top-20),(box_right,box_top), label_background_color, -1)
        cv2.putText(display_image,filtered_objects[obj_index][0] + ' : %.2f' % filtered_objects[obj_index][5], (box_left+5,box_top-7), cv2.FONT_HERSHEY_SIMPLEX, 0.5, label_text_color, 1)

ループはここまで。

239行目
画像の表示

    window_name = 'TinyYolo (hit key to exit)'
    cv2.imshow(window_name, display_image)

242行目
キー入力待ち。待ち時間は1msec。
待ち時間内にキーが押されなければ-1が返ってくる。
キー入力はGUIで表示されたウィンドウにフォーカスが当たっているときのみ有効で、コンソール(ターミナルなど)で入力してもダメ。
64bitマシンでは、キーコードを使用する場合は値を& 0xffする必要があるが、入力なしを検出するだけなのでそのままでOK。

    while (True):
        raw_key = cv2.waitKey(1)

247行目
ウィンドウパラメータの取得。
ウィンドウが閉じられていれば-1.0が返る。表示状態ならウィンドウのアクセプト比が返る。
×ボタンでウィンドウを閉じたときの対策。

        prop_val = cv2.getWindowProperty(window_name, cv2.WND_PROP_ASPECT_RATIO)

247行目
キー入力があった、または、×ボタンでウィンドウが閉じられたら終了。

        if ((raw_key != -1) or (prop_val < 0.0)):
            # the user hit a key or closed the window (in that order)
            break