[TIL] Printing Korean text on bounding boxes after video detection


Hanna 한나 2023. 12. 6. 15:28

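After running action detection on a video, I wanted to print Korean text on the bounding boxes. I asked ChatGPT, which said to install opencv-contrib-python and use the freetype module, and offered the sample code below.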

import cv2
import numpy as np

# Load or create an image.
image = np.zeros((500, 500, 3), dtype=np.uint8)

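# Create the FreeType2 renderer (needs an OpenCV build that includes the contrib freetype module).
ft = cv2.freetype.createFreeType2()
# Path to a font file that contains Korean glyphs.
font_path = 'NanumGothic.ttf'
# Passed positionally: the keyword name of the face index (id vs. idx) differs between OpenCV versions.
ft.loadFontData(font_path, 0)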

# Draw the text on the image. The position, font height, and color can be set here.
text = '안녕하세요!'
ft.putText(img=image, 
           text=text, 
           org=(50, 100), 
           fontHeight=30, 
           color=(255, 255, 255), 
           thickness=-1, 
           line_type=cv2.LINE_AA, 
           bottomLeftOrigin=True)

# Show the resulting image.
cv2.imshow('Image with Korean Text', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

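However, the following error kept occurring.

AttributeError: module 'cv2.cv2' has no attribute 'freetype'

> opencv-contrib-python was not installed. Stack Overflow answers suggested the behavior depends heavily on the version, so I installed 4.5.1.48, which was reported to work, but the same error occurred.

> Stack Overflow and the OpenCV forum suggested checking which libraries OpenCV had been built with. Doing so confirmed that FreeType was not included in the build.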

import cv2
print(cv2.getBuildInformation())
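
The build summary is long, so it can be easier to check it programmatically. The sketch below is one possible way to do that. `modules_in_section` is a hypothetical helper name (not part of OpenCV), and it assumes the module list sits on a single `To be built:` line, as it does in OpenCV's configuration summary.

```python
import re


def modules_in_section(build_info: str, label: str = "To be built") -> set:
    """Return the module names listed after `label:` in an OpenCV build summary.

    Hypothetical helper: it assumes the list is on one line, for example
    `To be built:   core freetype imgproc`.
    """
    # Allow leading punctuation/spaces (such as "--   ") before the label.
    pattern = rf"^[^\w\n]*{re.escape(label)}:[ \t]*(.*)$"
    match = re.search(pattern, build_info, flags=re.MULTILINE)
    if match is None:
        return set()
    # A lone "-" is how the summary marks an empty list.
    return {name for name in match.group(1).split() if name != "-"}


if __name__ == "__main__":
    try:
        import cv2  # only needed for the live check
    except ImportError:
        print("cv2 is not installed in this environment")
    else:
        built = modules_in_section(cv2.getBuildInformation())
        print("freetype built:", "freetype" in built)
```

If `freetype` is missing from the result, `cv2.freetype` will not exist in that build, which matches the AttributeError above.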

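That led me to this blog post:

Printing Korean text on images with the OpenCV FreeType module

OpenCV has supported the FreeType module since version 3.2. The FreeType module is, as the name suggests, a class that lets OpenCV use the FreeType library, and with it English and Korean text in a variety of fonts ...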

kkokkal.tistory.com

In the end, I had to rebuild OpenCV myself. The post above was written for Windows, so with ChatGPT's help I did the build on Ubuntu 20.04.

1. Building OpenCV

1. Install the Ubuntu dependencies

sudo apt update
sudo apt install build-essential cmake git pkg-config libgtk-3-dev \
    libavcodec-dev libavformat-dev libswscale-dev libv4l-dev \
    libxvidcore-dev libx264-dev libjpeg-dev libpng-dev libtiff-dev \
    gfortran openexr libatlas-base-dev python3-dev python3-numpy \
    libtbb2 libtbb-dev libdc1394-22-dev libopenexr-dev \
    libgstreamer-plugins-base1.0-dev libgstreamer1.0-dev
sudo apt install libfreetype6-dev

2. Clone opencv and opencv_contrib from GitHub

cd ~
git clone https://github.com/opencv/opencv.git

cd ~
git clone https://github.com/opencv/opencv_contrib.git

3. Build

cd ~/opencv
mkdir build
cd build

cmake -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D INSTALL_C_EXAMPLES=ON \
-D INSTALL_PYTHON_EXAMPLES=ON \
-D OPENCV_GENERATE_PKGCONFIG=ON \
-D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules \
-D WITH_FREETYPE=ON \
-D WITH_TBB=OFF \
-D WITH_IPP=OFF \
-D WITH_1394=OFF \
-D BUILD_WITH_DEBUG_INFO=OFF \
-D BUILD_DOCS=OFF \
-D BUILD_EXAMPLES=ON \
-D BUILD_TESTS=OFF \
-D BUILD_PERF_TESTS=OFF \
-D WITH_QT=ON \
-D WITH_OPENGL=ON \
-D WITH_V4L=ON \
-D WITH_FFMPEG=ON \
-D WITH_XINE=ON \
-D BUILD_NEW_PYTHON_SUPPORT=ON \
..

make -j$(nproc)
sudo make install

-- General configuration for OpenCV 4.8.0-dev =====================================
--   Version control:               4.8.0-465-g60d7dbb647
--
--   Extra modules:
--     Location (extra):            /home/hanna/opencv_contrib/modules
--     Version control (extra):     4.8.1-49-g0bcbc73b
--
--   Platform:
--     Timestamp:                   2023-12-06T04:18:13Z
--     Host:                        Linux 5.15.133.1-microsoft-standard-WSL2 x86_64
--     CMake:                       3.16.3
--     CMake generator:             Unix Makefiles
--     CMake build tool:            /usr/bin/make
--     Configuration:               RELEASE
--
--   CPU/HW features:
--     Baseline:                    SSE SSE2 SSE3
--       requested:                 SSE3
--     Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
--       requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
--       SSE4_1 (16 files):         + SSSE3 SSE4_1
--       SSE4_2 (1 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
--       FP16 (0 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
--       AVX (8 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
--       AVX2 (36 files):           + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
--       AVX512_SKX (5 files):      + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX
--
--   C/C++:
--     Built as dynamic libs?:      YES
--     C++ standard:                11
--     C++ Compiler:                /usr/bin/c++  (ver 9.4.0)
--     C++ flags (Release):         -fsigned-char -W -Wall -Wreturn-type -Wnon-virtual-dtor -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
--     C++ flags (Debug):           -fsigned-char -W -Wall -Wreturn-type -Wnon-virtual-dtor -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
--     C Compiler:                  /usr/bin/cc
--     C flags (Release):           -fsigned-char -W -Wall -Wreturn-type -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
--     C flags (Debug):             -fsigned-char -W -Wall -Wreturn-type -Waddress -Wsequence-point -Wformat -Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -g  -O0 -DDEBUG -D_DEBUG
--     Linker flags (Release):      -Wl,--gc-sections -Wl,--as-needed -Wl,--no-undefined
--     Linker flags (Debug):        -Wl,--gc-sections -Wl,--as-needed -Wl,--no-undefined
--     ccache:                      NO
--     Precompiled headers:         NO
--     Extra dependencies:          dl m pthread rt
--     3rdparty dependencies:
--
--   OpenCV modules:
--     To be built:                 alphamat aruco bgsegm bioinspired calib3d ccalib core datasets dnn dnn_objdetect dnn_superres dpm face features2d flann freetype fuzzy gapi hdf hfs highgui img_hash imgcodecs imgproc intensity_transform java line_descriptor mcc ml objdetect optflow phase_unwrapping photo plot quality rapid reg rgbd saliency sfm shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab wechat_qrcode xfeatures2d ximgproc xobjdetect xphoto
--     Disabled:                    world
--     Disabled by dependency:      -
--     Unavailable:                 cannops cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev cvv julia matlab ovis python2 python3 viz
--     Applications:                examples apps
--     Documentation:               NO
--     Non-free algorithms:         NO
--
--   GUI:                           GTK3
--     QT:                          NO
--     GTK+:                        YES (ver 3.24.20)
--       GThread :                  YES (ver 2.64.6)
--       GtkGlExt:                  NO
--     OpenGL support:              NO
--     VTK support:                 NO
--
--   Media I/O:
--     ZLib:                        /usr/lib/x86_64-linux-gnu/libz.so (ver 1.2.11)
--     JPEG:                        /usr/lib/x86_64-linux-gnu/libjpeg.so (ver 80)
--     WEBP:                        build (ver encoder: 0x020f)
--     PNG:                         /usr/lib/x86_64-linux-gnu/libpng.so (ver 1.6.37)
--     TIFF:                        /usr/lib/x86_64-linux-gnu/libtiff.so (ver 42 / 4.1.0)
--     JPEG 2000:                   build (ver 2.5.0)
--     OpenEXR:                     /usr/lib/x86_64-linux-gnu/libImath.so /usr/lib/x86_64-linux-gnu/libIlmImf.so /usr/lib/x86_64-linux-gnu/libIex.so /usr/lib/x86_64-linux-gnu/libHalf.so /usr/lib/x86_64-linux-gnu/libIlmThread.so (ver 2_3)
--     HDR:                         YES
--     SUNRASTER:                   YES
--     PXM:                         YES
--     PFM:                         YES
--
--   Video I/O:
--     FFMPEG:                      YES
--       avcodec:                   YES (58.54.100)
--       avformat:                  YES (58.29.100)
--       avutil:                    YES (56.31.100)
--       swscale:                   YES (5.5.100)
--       avresample:                NO
--     GStreamer:                   YES (1.16.3)
--     v4l/v4l2:                    YES (linux/videodev2.h)
--     Xine:                        NO
--
--   Parallel framework:            pthreads
--
--   Trace:                         YES (with Intel ITT)
--
--   Other third-party libraries:
--     VA:                          YES
--     Lapack:                      NO
--     Eigen:                       YES (ver 3.3.7)
--     Custom HAL:                  NO
--     Protobuf:                    build (3.19.1)
--     Flatbuffers:                 builtin/3rdparty (23.5.9)
--
--   OpenCL:                        YES (INTELVA)
--     Include path:                /home/hanna/opencv/3rdparty/include/opencl/1.2
--     Link libraries:              Dynamic load
--
--   Python (for build):            /home/hanna/anaconda3/envs/hanna/bin/python3
--
--   Java:
--     ant:                         NO
--     Java:                        YES (ver 11.0.21)
--     JNI:                         /usr/lib/jvm/default-java/include /usr/lib/jvm/default-java/include/linux /usr/lib/jvm/default-java/include
--     Java wrappers:               YES (JAVA)
--     Java tests:                  NO
--
--   Install to:                    /usr/local
-- -----------------------------------------------------------------
--
-- Configuring done
-- Generating done
-- Build files have been written to: /home/hanna/opencv/build

4. Update the linker cache

sudo ldconfig
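
After installing, it is worth confirming which `cv2` the interpreter actually imports, since a pip-installed wheel inside a conda environment can take precedence over a source build. Note also that the configure summary above lists `python3` under `Unavailable`, which suggests the Python bindings were not generated in that run. The following is a minimal sketch of such a check. `describe_cv2` is a hypothetical helper name.

```python
from typing import Any, Dict


def describe_cv2(module: Any) -> Dict[str, Any]:
    """Summarise where a cv2-like module was loaded from and whether it exposes freetype."""
    return {
        "version": getattr(module, "__version__", None),
        "file": getattr(module, "__file__", None),
        "has_freetype": hasattr(module, "freetype"),
    }


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 could not be imported")
    else:
        for key, value in describe_cv2(cv2).items():
            print(f"{key}: {value}")
```

If the reported file path points into `site-packages` of a pip wheel rather than the install prefix of the source build, the interpreter is still using the old package.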

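2. Errors encountered while building OpenCV

1. isort module error

ModuleNotFoundError: No module named 'isort'

> The message points at the isort module, but running pip install isort did not make it go away. It turned out to be a pylint-related error, and the only fix was to reinstall or upgrade pylint.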

pip install --upgrade pylint

2. Eigen, gflags, glog, HDF5, and Tesseract library errors

# Eigen: a C++ template library distributed mostly as header files
sudo apt install libeigen3-dev

# gflags: a library for handling command-line flags
sudo apt install libgflags-dev

# glog: Google's logging library
sudo apt install libgoogle-glog-dev

# HDF5: a data model, library, and file format for storing large amounts of data
sudo apt install libhdf5-dev

# Tesseract: an open-source OCR (optical character recognition) engine
sudo apt install tesseract-ocr libtesseract-dev

# OpenBLAS and ATLAS: linear algebra routines
sudo apt install libopenblas-dev libatlas-base-dev

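3. The easy way: converting through the Pillow library

In fact, the build above ended badly because the linking never worked, so this is the approach I went with. It is simple: take the image produced by OpenCV, convert the NumPy array to a Pillow image, draw the text, and convert it back to an OpenCV image.

There are, however, two things to keep in mind:

1. OpenCV uses BGR channel order and Pillow uses RGB, so the image must be converted with cv2.cvtColor (BGR2RGB) before it becomes a Pillow image, and likewise converted back with RGB2BGR before it is used as an OpenCV array again.

2. OpenCV and Pillow font sizes do not mean the same thing, so the font size has to be chosen by experiment.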

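In OpenCV, the fontScale parameter sets a size ratio for the font. It does not specify the size in absolute pixels. Instead, it acts as a scaling factor applied to the font's base size, so its exact effect depends on which font is used and on that font's base size.
In Pillow, the font size passed to ImageFont.truetype() is given in pixels. It is the requested pixel size of the font (roughly its em height, not the exact height of any particular glyph).
Because of this difference, directly comparing or converting font sizes between the two libraries is complicated, and a suitable value usually has to be found by experiment. For example, after checking how a given text looks in Pillow, you may need to adjust fontScale to get a visually similar result in OpenCV.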
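
To make the first point concrete, here is a minimal sketch of the OpenCV to Pillow round trip. It reverses the channel axis with NumPy slicing, which for a 3-channel `uint8` image gives the same result as `cv2.cvtColor` with `COLOR_BGR2RGB` or `COLOR_RGB2BGR`, so the example runs without OpenCV. The function name `draw_text_rgb_roundtrip` is made up for illustration. It falls back to Pillow's built-in default font, which is not expected to contain Korean glyphs, so pass a real font (for example one loaded from a NanumGothic `.ttf`) for Korean text.

```python
from typing import Optional, Tuple

import numpy as np
from PIL import Image, ImageDraw, ImageFont


def draw_text_rgb_roundtrip(
    image_bgr: np.ndarray,
    text: str,
    org: Tuple[int, int],
    color_bgr: Tuple[int, int, int] = (255, 255, 255),
    font: Optional[ImageFont.FreeTypeFont] = None,
) -> np.ndarray:
    """Draw `text` on a BGR uint8 image via Pillow and return a new BGR image."""
    # BGR -> RGB: reverse the channel axis and make the memory contiguous.
    image_rgb = np.ascontiguousarray(image_bgr[..., ::-1])
    pillow_image = Image.fromarray(image_rgb)

    # Pillow expects the fill color in RGB order, so reverse it as well.
    fill_rgb = tuple(color_bgr[::-1])
    draw = ImageDraw.Draw(pillow_image)
    draw.text(org, text, font=font or ImageFont.load_default(), fill=fill_rgb)

    # RGB -> BGR so the result can go back to cv2.imshow or cv2.imwrite.
    return np.ascontiguousarray(np.array(pillow_image)[..., ::-1])
```

Forgetting either reversal does not raise an error. It only swaps red and blue in the output, which is easy to miss on grayscale-looking frames.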

I adapted the code from the YOLO-NAS open-source implementation. YOLO-NAS picks a font size based on the image proportions, then uses cv2.getTextSize to measure how large the text will be, and applies a fixed offset.

from typing import Tuple, List
import cv2
import numpy as np
import matplotlib.pyplot as plt

def draw_text_box(
    image: np.ndarray,
    text: str,
    x: int,
    y: int,
    font: int,
    font_size: float,
    background_color: Tuple[int, int, int],
    thickness: int = 1,
) -> np.ndarray:
    """Draw a text inside a box

    :param image:               The image on which to draw the text box.
    :param text:                The text to display in the text box.
    :param x:                   The x-coordinate of the top-left corner of the text box.
    :param y:                   The y-coordinate of the top-left corner of the text box.
    :param font:                The font to use for the text.
    :param font_size:           The size of the font to use.
    :param background_color:    The color of the text box and text as a tuple of three integers representing RGB values.
    :param thickness:           The thickness of the text.
    :return: Image with the text inside the box.
    """
    text_color = best_text_color(background_color)
    (text_width, text_height), baseline = cv2.getTextSize(text, font, font_size, thickness)
    text_left_offset = 7

    image = cv2.rectangle(image, (x, y), (x + text_width + text_left_offset, y - text_height - int(15 * font_size)), background_color, -1)
    image = cv2.putText(image, text, (x + text_left_offset, y - int(10 * font_size)), font, font_size, text_color, thickness, lineType=cv2.LINE_AA)
    return image

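I made the font size scale in proportion to the image size, wrote code that measures the text width with the PIL.ImageFont.FreeTypeFont.getlength method (the type returned by ImageFont.truetype), and applied it as follows.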

import cv2
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def draw_text_box(
    image,
    text,
    x,
    y,
    background_color,
    font_size=20,
    font_path="/usr/share/fonts/truetype/nanum/NanumGothicBold.ttf",
    bbox_offset=2,
):
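    # best_text_color is the helper called in the YOLO-NAS snippet above (not defined in this post).
    text_color = best_text_color(background_color)
    font = ImageFont.truetype(font_path, font_size)
    # Advance width of the text in pixels. The direction argument is omitted
    # because setting it requires Pillow to be built with libraqm.
    text_width = font.getlength(text)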

    # Draw the rectangle behind the text
    image = cv2.rectangle(
        image,
        (x, y),
        (int(x + text_width + bbox_offset * 2), int(y - font_size - bbox_offset * 2)),
        background_color,
        -1,
    )

    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    pillow_image = Image.fromarray(image_rgb)

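    # Draw the text with Pillow
    draw = ImageDraw.Draw(pillow_image)

    draw.text(
        (int(x + bbox_offset), int(y - (font_size + bbox_offset))),
        text,
        font=font,
        # The Pillow image is RGB; this assumes text_color is in BGR like background_color.
        fill=tuple(reversed(text_color)),
    )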

    image_with_text = cv2.cvtColor(np.array(pillow_image), cv2.COLOR_RGB2BGR)
    return image_with_text
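
Both versions of `draw_text_box` call `best_text_color`, which is not defined in this post. Purely as an assumption about what such a helper might look like (this is not the YOLO-NAS code), the sketch below picks black or white text from the background's perceived brightness. `scaled_font_size` is an equally hypothetical way to make the Pillow font size proportional to the image size. The luma weights, the 0.03 ratio, and the 12-pixel minimum are illustrative values to tune by experiment.

```python
from typing import Tuple


def best_text_color(background_color: Tuple[int, int, int]) -> Tuple[int, int, int]:
    """Return black or white, whichever should contrast better with the background.

    Assumes `background_color` is in BGR order, as passed to cv2.rectangle.
    """
    blue, green, red = background_color
    # Rec. 601 luma approximation of perceived brightness, in the range 0..255.
    brightness = 0.299 * red + 0.587 * green + 0.114 * blue
    return (0, 0, 0) if brightness > 127.5 else (255, 255, 255)


def scaled_font_size(image_shape: Tuple[int, ...], ratio: float = 0.03, minimum: int = 12) -> int:
    """Pick a pixel font size proportional to the shorter image side."""
    height, width = image_shape[:2]
    return max(minimum, int(min(height, width) * ratio))
```

Because black and white read the same in BGR and RGB, the color returned here can be passed to both `cv2.putText` and Pillow's `draw.text` without reordering. A call such as `scaled_font_size(frame.shape)` would then supply the `font_size` argument of `draw_text_box`.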


License: Attribution-NonCommercial-ShareAlike