How do I read an image from a path with Unicode characters?

Question:

The following code fails because it cannot read the file from disk: cv2.imread() always returns None.

# -*- coding: utf-8 -*-
import cv2
import numpy

bgrImage = cv2.imread(u'D:\\ö\\handschuh.jpg')

Note: my source file is already saved as UTF-8 with BOM; I verified this with Notepad++.

In Process Monitor, I see that Python is accessing the file at the wrong path:

[Screenshot: Process Monitor]

I have read about:

Answer #1:

It can be done by

  • opening the file with open(), which supports Unicode paths as in the linked answer,
  • reading the contents as a byte array,
  • converting the byte array to a NumPy array,
  • decoding the image with cv2.imdecode().
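# -*- coding: utf-8 -*-
import cv2
import numpy

# open() accepts Unicode paths; read the still-encoded file contents as raw bytes
with open(u'D:\\ö\\handschuh.jpg', "rb") as stream:
    data = bytearray(stream.read())
# View the bytes as a 1-D uint8 array, then decode that array into an image
numpyarray = numpy.asarray(data, dtype=numpy.uint8)
bgrImage = cv2.imdecode(numpyarray, cv2.IMREAD_UNCHANGED)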
Answered By: Thomas Weller

Answer #2:

Inspired by Thomas Weller’s answer, you can also use np.fromfile() to read the file into an ndarray and then use cv2.imdecode() to decode that array into a three-dimensional NumPy ndarray (assuming a color image without an alpha channel):

import cv2
import numpy as np

# img is in BGR format if the underlying image is a color image
img = cv2.imdecode(np.fromfile('????/test.jpg', dtype=np.uint8), cv2.IMREAD_UNCHANGED)

np.fromfile() reads the raw, still-encoded bytes of the file into a one-dimensional NumPy ndarray; it does not decode the image. cv2.imdecode() decodes that buffer into the usual image array, which is three-dimensional (height, width, channels) for a color image. cv2.IMREAD_UNCHANGED is a decoding flag; the complete list of flags can be found here.
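
To make the first step concrete, here is a minimal sketch showing that np.fromfile() just returns the file's bytes as a flat uint8 array, even when the path contains a non-ASCII character. The temporary directory, the file name and the 8-byte payload are stand-ins I made up for illustration; a real image file would simply yield a longer array.

```python
import os
import tempfile

import numpy as np

# Stand-in payload: the 8-byte signature that every PNG file starts with
payload = b"\x89PNG\r\n\x1a\n"

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical folder name containing a non-ASCII character
    folder = os.path.join(tmp, "ö")
    os.makedirs(folder)
    path = os.path.join(folder, "handschuh.bin")
    with open(path, "wb") as f:
        f.write(payload)

    # Read the raw bytes; no image decoding happens here
    raw = np.fromfile(path, dtype=np.uint8)

print(raw.ndim, raw.dtype, raw.size)  # 1 uint8 8
print(raw.tobytes() == payload)       # True
```

The array only becomes an image once it is passed to cv2.imdecode().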

PS: For how to write an image to a path with Unicode characters, see here.

Answered By: jdhao

Answer #3:

My problem is similar to yours; however, my program terminates at the
image = cv2.imread(filename) statement.

I solved this problem by first encoding the file name as UTF-8 and then decoding it again:

image = cv2.imread(filename.encode('utf-8', 'surrogateescape').decode('utf-8', 'surrogateescape'))
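
To see what that expression does to the file name itself, here is a small sketch that uses only the standard library. The two file names are examples I made up, and the sketch only shows the string transformation; it does not establish why the round trip helped in this particular case.

```python
def roundtrip(filename):
    """Apply the transformation from above: str -> UTF-8 bytes -> str."""
    encoded = filename.encode('utf-8', 'surrogateescape')
    return encoded.decode('utf-8', 'surrogateescape')


# An ordinary Unicode file name: 'ö' becomes the two bytes 0xC3 0xB6
name = 'D:\\ö\\handschuh.jpg'
print(name.encode('utf-8', 'surrogateescape'))
print(roundtrip(name) == name)

# A name decoded from bytes that are not valid UTF-8: the stray byte 0xE9
# is kept as the lone surrogate U+DCE9 instead of raising an error
odd = b'caf\xe9.jpg'.decode('utf-8', 'surrogateescape')
print(repr(odd))
print(roundtrip(odd) == odd)
```

The 'surrogateescape' error handler is what lets undecodable bytes survive the trip through str and back.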
Answered By: NarcissusInMirror

Answer #4:

bgrImage = cv2.imread(filename.encode('utf-8'))

Encode the full file path as UTF-8 before passing it to cv2.imread().

Answered By: mcolak
