Installing theano on Windows 8 with GPU enabled


Question :

I understand that Theano support for Windows 8.1 is only experimental, but I wonder whether anyone has managed to resolve the issues I am seeing. Depending on my configuration, I get three distinct types of errors. I assume that resolving any one of them would solve my problem.

I installed Python using the 32-bit WinPython distribution, together with MinGW as described here. The contents of my .theanorc file are as follows:

[global]
openmp=False
device = gpu

[nvcc]
flags=-LC:\TheanoPython\python-2.7.6\libs
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin

[blas]
ldflags = 

When I run import theano the error is as follows:

nvcc fatal   : nvcc cannot find a supported version of Microsoft Visual Studio.
Only the versions 2010, 2012, and 2013 are supported

['nvcc', '-shared', '-g', '-O3', '--compiler-bindir', 'C:\Program Files (x86)\
Microsoft Visual Studio 10.0\VC\bin# flags=-m32 # we have this hard coded for
now', '-Xlinker', '/DEBUG', '-m32', '-Xcompiler', '-DCUDA_NDARRAY_CUH=d67f7c8a21
306c67152a70a88a837011,/Zi,/MD', '-IC:\TheanoPython\python-2.7.6\lib\site-pa
ckages\theano\sandbox\cuda', '-IC:\TheanoPython\python-2.7.6\lib\site-pac
kages\numpy\core\include', '-IC:\TheanoPython\python-2.7.6\include', '-o',
 'C:\Users\Matej\AppData\Local\Theano\compiledir_Windows-8-6.2.9200-Intel6
4_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray
.pyd', 'mod.cu', '-LC:\TheanoPython\python-2.7.6\libs', '-LNone\lib', '-LNon
e\lib64', '-LC:\TheanoPython\python-2.7.6', '-lpython27', '-lcublas', '-lcuda
rt']
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: ('nvcc return st
atus', 1, 'for cmd', 'nvcc -shared -g -O3 --compiler-bindir C:\Program Files (x
86)\Microsoft Visual Studio 10.0\VC\bin# flags=-m32 # we have this hard coded
 for now -Xlinker /DEBUG -m32 -Xcompiler -DCUDA_NDARRAY_CUH=d67f7c8a21306c67152a
70a88a837011,/Zi,/MD -IC:\TheanoPython\python-2.7.6\lib\site-packages\thean
o\sandbox\cuda -IC:\TheanoPython\python-2.7.6\lib\site-packages\numpy\co
re\include -IC:\TheanoPython\python-2.7.6\include -o C:\Users\Matej\AppDa
ta\Local\Theano\compiledir_Windows-8-6.2.9200-Intel64_Family_6_Model_60_Stepp
ing_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd mod.cu -LC:\TheanoP
ython\python-2.7.6\libs -LNone\lib -LNone\lib64 -LC:\TheanoPython\python-2
.7.6 -lpython27 -lcublas -lcudart')
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not availabl
e

I have also tested it with Visual Studio 12.0, which is installed on my system, and got the following error:

mod.cu
nvlink fatal   : Could not open input file 'C:/Users/Matej/AppData/Local/Temp/tm
pxft_00001b70_00000000-28_mod.obj'

['nvcc', '-shared', '-g', '-O3', '--compiler-bindir', 'C:\Program Files (x86)\
Microsoft Visual Studio 12.0\VC\bin\', '-Xlinker', '/DEBUG', '-m32', '-Xcompi
ler', '-LC:\TheanoPython\python-2.7.6\libs,-DCUDA_NDARRAY_CUH=d67f7c8a21306c6
7152a70a88a837011,/Zi,/MD', '-IC:\TheanoPython\python-2.7.6\lib\site-package
s\theano\sandbox\cuda', '-IC:\TheanoPython\python-2.7.6\lib\site-packages
\numpy\core\include', '-IC:\TheanoPython\python-2.7.6\include', '-o', 'C:
Users\Matej\AppData\Local\Theano\compiledir_Windows-8-6.2.9200-Intel64_Fam
ily_6_Model_60_Stepping_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd'
, 'mod.cu', '-LC:\TheanoPython\python-2.7.6\libs', '-LNone\lib', '-LNone\li
b64', '-LC:\TheanoPython\python-2.7.6', '-lpython27', '-lcublas', '-lcudart']
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: ('nvcc return st
atus', 1, 'for cmd', 'nvcc -shared -g -O3 --compiler-bindir C:\Program Files (x
86)\Microsoft Visual Studio 12.0\VC\bin\ -Xlinker /DEBUG -m32 -Xcompiler -LC
:\TheanoPython\python-2.7.6\libs,-DCUDA_NDARRAY_CUH=d67f7c8a21306c67152a70a88
a837011,/Zi,/MD -IC:\TheanoPython\python-2.7.6\lib\site-packages\theano\sa
ndbox\cuda -IC:\TheanoPython\python-2.7.6\lib\site-packages\numpy\core\i
nclude -IC:\TheanoPython\python-2.7.6\include -o C:\Users\Matej\AppData\L
ocal\Theano\compiledir_Windows-8-6.2.9200-Intel64_Family_6_Model_60_Stepping_3
_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd mod.cu -LC:\TheanoPython
\python-2.7.6\libs -LNone\lib -LNone\lib64 -LC:\TheanoPython\python-2.7.6
-lpython27 -lcublas -lcudart')
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not availabl
e

With the latter setup, several pop-up windows ask how I would like to open a .res file before the error is thrown.

cl.exe is present in both folders (i.e. VS 2010 and VS 2013).
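The check that cl.exe is present can be scripted. The following is a minimal sketch, not part of Theano: `has_cl_exe` is a hypothetical helper name, and the two directories are the ones quoted in this question, so they are assumptions about your machine.

```python
import os


def has_cl_exe(compiler_bindir):
    """Return True if cl.exe sits directly inside compiler_bindir."""
    return os.path.isfile(os.path.join(compiler_bindir, "cl.exe"))


# Directories taken from the question; adjust them for your own installation.
candidates = [
    r"C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin",
    r"C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin",
]
for path in candidates:
    status = "cl.exe found" if has_cl_exe(path) else "cl.exe missing"
    print("%s -> %s" % (path, status))
```

This only confirms that the file exists; it does not tell you whether nvcc accepts that Visual Studio version.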

Finally, if I set VS 2013 in the environment path and set .theanorc contents as follows:

[global]
base_compiledir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin
openmp=False
floatX = float32
device = gpu

[nvcc]
flags=-LC:\TheanoPython\python-2.7.6\libs
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin

[blas]
ldflags = 

I get the following error:

c:\theanopython\python-2.7.6\include\pymath.h(22): warning: dllexport/dllimport conflict with "round"
c:\program files\nvidia gpu computing toolkit\cuda\v6.5\include\math_functions.h(2455): here; dllimport/dllexport dropped

mod.cu(954): warning: statement is unreachable

mod.cu(1114): error: namespace "std" has no member "min"

mod.cu(1145): error: namespace "std" has no member "min"

mod.cu(1173): error: namespace "std" has no member "min"

mod.cu(1174): error: namespace "std" has no member "min"

mod.cu(1317): error: namespace "std" has no member "min"

mod.cu(1318): error: namespace "std" has no member "min"

mod.cu(1442): error: namespace "std" has no member "min"

mod.cu(1443): error: namespace "std" has no member "min"

mod.cu(1742): error: namespace "std" has no member "min"

mod.cu(1777): error: namespace "std" has no member "min"

mod.cu(1781): error: namespace "std" has no member "min"

mod.cu(1814): error: namespace "std" has no member "min"

mod.cu(1821): error: namespace "std" has no member "min"

mod.cu(1853): error: namespace "std" has no member "min"

mod.cu(1861): error: namespace "std" has no member "min"

mod.cu(1898): error: namespace "std" has no member "min"

mod.cu(1905): error: namespace "std" has no member "min"

mod.cu(1946): error: namespace "std" has no member "min"

mod.cu(1960): error: namespace "std" has no member "min"

mod.cu(3750): error: namespace "std" has no member "min"

mod.cu(3752): error: namespace "std" has no member "min"

mod.cu(3784): error: namespace "std" has no member "min"

mod.cu(3786): error: namespace "std" has no member "min"

mod.cu(3789): error: namespace "std" has no member "min"

mod.cu(3791): error: namespace "std" has no member "min"

mod.cu(3794): error: namespace "std" has no member "min"

mod.cu(3795): error: namespace "std" has no member "min"

mod.cu(3836): error: namespace "std" has no member "min"

mod.cu(3838): error: namespace "std" has no member "min"

mod.cu(4602): error: namespace "std" has no member "min"

mod.cu(4604): error: namespace "std" has no member "min"

31 errors detected in the compilation of "C:/Users/Matej/AppData/Local/Temp/tmpxft_00001d84_00000000-10_mod.cpp1.ii".
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: ('nvcc return status', 2, 'for cmd', 'nvcc -shared -g -O3 -Xlinker /DEBUG -m32 -Xcompiler -DCUDA_NDARRAY_CUH=d67f7c8a21306c67152a70a88a837011,/Zi,/MD -IC:\TheanoPython\python-2.7.6\lib\site-packages\theano\sandbox\cuda -IC:\TheanoPython\python-2.7.6\lib\site-packages\numpy\core\include -IC:\TheanoPython\python-2.7.6\include -o C:\Users\Matej\AppData\Local\Theano\compiledir_Windows-8-6.2.9200-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd mod.cu -LC:\TheanoPython\python-2.7.6\libs -LNone\lib -LNone\lib64 -LC:\TheanoPython\python-2.7.6 -lpython27 -lcublas -lcudart')
ERROR:theano.sandbox.cuda:Failed to compile cuda_ndarray.cu: ('nvcc return status', 2, 'for cmd', 'nvcc -shared -g -O3 -Xlinker /DEBUG -m32 -Xcompiler -DCUDA_NDARRAY_CUH=d67f7c8a21306c67152a70a88a837011,/Zi,/MD -IC:\TheanoPython\python-2.7.6\lib\site-packages\theano\sandbox\cuda -IC:\TheanoPython\python-2.7.6\lib\site-packages\numpy\core\include -IC:\TheanoPython\python-2.7.6\include -o C:\Users\Matej\AppData\Local\Theano\compiledir_Windows-8-6.2.9200-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd mod.cu -LC:\TheanoPython\python-2.7.6\libs -LNone\lib -LNone\lib64 -LC:\TheanoPython\python-2.7.6 -lpython27 -lcublas -lcudart')
mod.cu

['nvcc', '-shared', '-g', '-O3', '-Xlinker', '/DEBUG', '-m32', '-Xcompiler', '-DCUDA_NDARRAY_CUH=d67f7c8a21306c67152a70a88a837011,/Zi,/MD', '-IC:\TheanoPython\python-2.7.6\lib\site-packages\theano\sandbox\cuda', '-IC:\TheanoPython\python-2.7.6\lib\site-packages\numpy\core\include', '-IC:\TheanoPython\python-2.7.6\include', '-o', 'C:\Users\Matej\AppData\Local\Theano\compiledir_Windows-8-6.2.9200-Intel64_Family_6_Model_60_Stepping_3_GenuineIntel-2.7.6-32\cuda_ndarray\cuda_ndarray.pyd', 'mod.cu', '-LC:\TheanoPython\python-2.7.6\libs', '-LNone\lib', '-LNone\lib64', '-LC:\TheanoPython\python-2.7.6', '-lpython27', '-lcublas', '-lcudart']

If I run import theano without the GPU option on, it runs without a problem. Also CUDA samples run without a problem.

Asked By: Matt


Answer #1:

Theano is a great tool for machine learning applications, yet I found that installing it on Windows is not trivial, especially for programming beginners like myself. In my case, my scripts run 5-6x faster on a GPU, so it was definitely worth the hassle.

I wrote this guide based on my own installation procedure. It is meant to be verbose and, hopefully, complete even for people with no prior experience of building programs on Windows. Most of it is based on these instructions, but I had to change some of the steps to make them work on my system. If anything I do is not optimal or does not work on your machine, please let me know and I will try to update this guide accordingly.

These are the steps (in order) I followed when installing Theano with GPU enabled on my Windows 8.1 machine:

CUDA Installation

CUDA can be downloaded from here. In my case, I chose the 64-bit Notebook version for my NVIDIA Optimus laptop with a GeForce 750M.

Verify that the installation was successful by launching deviceQuery from the command line. In my case it was located in the following folder: C:\ProgramData\NVIDIA Corporation\CUDA Samples\v6.5\bin\win64\Release. If successful, you should see PASS at the end of the test.
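If you prefer to script this check, here is a minimal sketch. It assumes deviceQuery ends its output with a line reading `Result = PASS`; the helper names `device_query_passed` and `run_device_query` are hypothetical, and the executable path is whatever applies on your machine.

```python
import subprocess


def device_query_passed(output):
    """Return True if deviceQuery's text output contains a passing result line."""
    return any(line.strip() == "Result = PASS" for line in output.splitlines())


def run_device_query(exe_path):
    """Run deviceQuery (exe_path is machine-specific) and report whether it passed."""
    output = subprocess.check_output([exe_path], universal_newlines=True)
    return device_query_passed(output)
```

For example, `run_device_query(r"C:\ProgramData\NVIDIA Corporation\CUDA Samples\v6.5\bin\win64\Release\deviceQuery.exe")` would return True on a working installation. If deviceQuery exits with an error code, `subprocess.check_output` raises `CalledProcessError` instead.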

Visual Studio 2010 Installation

I installed this via DreamSpark. If you are a student, you are entitled to a free version. If not, you can still install the Express version, which should work just as well. After the installation is complete, you should be able to launch Visual Studio Command Prompt 2010 from the Start menu.

Python Installation

At the time of writing, Theano on GPU only supports 32-bit floats and is primarily built for Python 2.7. Theano requires most of the basic scientific Python libraries, such as scipy and numpy. I found that the easiest way to install these was via WinPython. It installs all the dependencies in a self-contained folder, which makes reinstalling easy if something goes wrong during installation, and it also bundles useful tools such as IPython Notebook and the Spyder IDE. For ease of use, you might want to add the paths to python.exe and to your Scripts folder to the PATH environment variable.
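Compiled extension modules have to match the word size of the interpreter that loads them, so it helps to know whether your Python is a 32-bit or 64-bit build before choosing library directories. A minimal sketch using only the standard library (`interpreter_bits` is a hypothetical helper name):

```python
import struct
import sys


def interpreter_bits():
    """Return the pointer size of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8


print("Python %d.%d, %d-bit" % (sys.version_info[0], sys.version_info[1], interpreter_bits()))
```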

Git installation

Found here.

MinGW Installation

The setup file is here. I selected all the base installation packages during installation. MinGW is required if you run into the g++ error described below.

Cygwin installation

You can find it here. I used this utility only to extract the PyCUDA tar file; tar is already provided in the base install, so the installation should be straightforward.

Python distutils fix

Open msvc9compiler.py, located in the /lib/distutils/ directory of your Python installation. Line 641 in my case reads: ld_args.append ('/IMPLIB:' + implib_file). Add the following after this line (same indentation):

ld_args.append('/MANIFEST')
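The same edit can be applied programmatically. This is only a sketch under stated assumptions: `add_manifest_flag` is a hypothetical helper that works on the text of msvc9compiler.py, it looks for the `/IMPLIB:` line quoted above rather than relying on line 641, and you should back up the file before writing the result to it.

```python
def add_manifest_flag(source):
    """Insert ld_args.append('/MANIFEST') after the /IMPLIB line, keeping its indentation."""
    new_call = "ld_args.append('/MANIFEST')"
    if new_call in source:
        return source  # already patched; do nothing
    out = []
    for line in source.splitlines(True):
        out.append(line)
        if "ld_args.append" in line and "'/IMPLIB:'" in line:
            indent = line[:len(line) - len(line.lstrip())]
            eol = "\r\n" if line.endswith("\r\n") else "\n"
            if not line.endswith("\n"):
                out[-1] = line + eol  # the match was the last line of the file
            out.append(indent + new_call + eol)
    return "".join(out)
```

Read the file, pass its contents through `add_manifest_flag`, and write the result back. Running it twice leaves the file unchanged.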

PyCUDA installation

Source for PyCUDA is here.

Steps:

Open Cygwin, navigate to the PyCUDA folder (e.g. /cygdrive/c/etc/etc) and execute tar -xzf pycuda-2012.1.tar.gz.

Open Visual Studio Command Prompt 2010, navigate to the directory where the tarball was extracted and execute python configure.py.

Open ./siteconf.py and change the values so that it reads (for CUDA 6.5, for instance):

BOOST_INC_DIR = []
BOOST_LIB_DIR = []
BOOST_COMPILER = 'gcc43'
USE_SHIPPED_BOOST = True
BOOST_PYTHON_LIBNAME = ['boost_python']
BOOST_THREAD_LIBNAME = ['boost_thread']
CUDA_TRACE = False
CUDA_ROOT = 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v6.5'
CUDA_ENABLE_GL = False
CUDA_ENABLE_CURAND = True
CUDADRV_LIB_DIR = ['${CUDA_ROOT}/lib/Win32']
CUDADRV_LIBNAME = ['cuda']
CUDART_LIB_DIR = ['${CUDA_ROOT}/lib/Win32']
CUDART_LIBNAME = ['cudart']
CURAND_LIB_DIR = ['${CUDA_ROOT}/lib/Win32']
CURAND_LIBNAME = ['curand']
CXXFLAGS = ['/EHsc']
LDFLAGS = ['/FORCE']

Execute the following commands at the VS2010 command prompt:

set VS90COMNTOOLS=%VS100COMNTOOLS%
python setup.py build
python setup.py install

Create this python file and verify that you get a result:

# from: http://documen.tician.de/pycuda/tutorial.html
import pycuda.gpuarray as gpuarray
import pycuda.driver as cuda
import pycuda.autoinit
import numpy
a_gpu = gpuarray.to_gpu(numpy.random.randn(4,4).astype(numpy.float32))
a_doubled = (2*a_gpu).get()
print a_doubled
print a_gpu

Install Theano

Open git bash shell and choose a folder in which you want to place Theano installation files and execute:

git clone https://github.com/Theano/Theano.git
cd Theano
python setup.py install

Try opening python in the VS2010 command prompt and running import theano.

If you get a g++ related error, open MinGW msys.bat (in my case installed in C:\MinGW\msys\1.0) and try importing theano in the MinGW shell. Then retry importing theano from the VS2010 Command Prompt; it should now work.
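Before importing theano, it can help to see which build tools the current shell can find at all. The sketch below is an illustration, not part of Theano: `find_tools` is a hypothetical helper, and it uses `shutil.which`, which needs Python 3.3 or later. Under the Python 2.7 setup described in this guide, `distutils.spawn.find_executable` should be a close substitute.

```python
import shutil


def find_tools(names=("g++", "nvcc", "cl")):
    """Map each tool name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


for name, location in sorted(find_tools().items()):
    print("%-5s %s" % (name, location or "not found on PATH"))
```

Running this from both the MinGW shell and the VS2010 Command Prompt shows how the two environments differ.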

Create a file in WordPad (NOT Notepad!), name it .theanorc.txt and put it in C:\Users\Your_Name or wherever your user folder is located:

[global]
device = gpu
floatX = float32

[nvcc]
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin
# flags=-m32 # we have this hard coded for now

[blas]
ldflags =
# ldflags = -lopenblas # placeholder for openblas support
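As an alternative to creating the file in a text editor, it can be written from Python. This is a minimal sketch assuming the same sections and values as the file above, without the commented-out lines; `write_theanorc` is a hypothetical helper, and the directory and compiler path you pass in depend on your machine.

```python
import os

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2


def write_theanorc(directory, compiler_bindir, filename=".theanorc.txt"):
    """Write a GPU-enabled Theano config file into directory and return its path."""
    config = configparser.RawConfigParser()
    config.optionxform = str  # keep the capital X in floatX
    config.add_section("global")
    config.set("global", "device", "gpu")
    config.set("global", "floatX", "float32")
    config.add_section("nvcc")
    config.set("nvcc", "compiler_bindir", compiler_bindir)
    config.add_section("blas")
    config.set("blas", "ldflags", "")
    path = os.path.join(directory, filename)
    with open(path, "w") as handle:
        config.write(handle)
    return path
```

A call such as `write_theanorc(os.path.expanduser("~"), r"C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin")` places the file in your user folder.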

Create a test python script and run it:

from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print f.maker.fgraph.toposort()
t0 = time.time()
for i in xrange(iters):
    r = f()
t1 = time.time()
print 'Looping %d times took' % iters, t1 - t0, 'seconds'
print 'Result is', r
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print 'Used the cpu'
else:
    print 'Used the gpu'

Verify you got Used the gpu at the end and you’re done!

Answered By: Matt

Answer #2:

Here are my simple steps for installing theano on a
64-bit Windows 10 machine. I tested them on the code listed here.

(All installations use the default installation path.)

  • install the Anaconda Python 3.x distribution (it already includes numpy,
    scipy, matplotlib, etc.)
  • run 'conda install mingw libpython' on the command line
  • install theano by downloading it from the official website and running 'python setup.py install'
  • install the latest CUDA toolkit for 64-bit Windows 10 (7.5 at the time of writing)
  • install Visual Studio 2013 (free for Windows 10)
  • create a .theanorc.txt file under the %USERPROFILE% path with the
    following content to run theano with GPU

[global]

floatX = float32

device = gpu

[nvcc]

fastmath = True

compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\cl.exe

[cuda]

root = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5


Answered By: Matt

Answer #3:

Here’s a guide to installing theano with CUDA on 64-bit Windows.

It seems straightforward, but I have not actually tested it to ensure that it works.

http://pavel.surmenok.com/2014/05/31/installing-theano-with-gpu-on-windows-64-bit/

Answered By: tangkk

Answer #4:

Following the tutorial by Matt, I ran into issues with nvcc.
I needed to add the path to the VS2010 executables in nvcc.profile (you can find it in the CUDA bin folder):

"compiler-bindir = C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\amd64"

Answered By: brentlance

Answer #5:

In case you want to upgrade to MS Visual Studio 2012 and CUDA 7 on Windows 8.1 x64, check out this tutorial here:

http://machinelearning.berlin/?p=383

It should work as long as you stick to it exactly.
All the best

Christian

Answered By: super-truite

Answer #6:

I could compile the .cu files by adding the required dependencies to the nvcc profile located at "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\nvcc.profile".

I modified the include and lib paths and it started working.

INCLUDES += "-I$(TOP)/include" $(SPACE) "-IC:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/include" $(SPACE) "-IC:\Program Files (x86)\Microsoft SDKs\Windows\v7.1A\Include" $(SPACE)
LIBRARIES =+ $(SPACE) "/LIBPATH:$(TOP)/lib/$(_WIN_PLATFORM_)" $(SPACE) "/LIBPATH:C:/Program Files (x86)/Microsoft Visual Studio 12.0/VC/lib/amd64" $(SPACE) "/LIBPATH:C:\Program Files (x86)\Microsoft SDKs\Windows\v7.1A\Lib\x64" $(SPACE)

I have written up full documentation of the install; I hope it helps: https://planetanacreon.wordpress.com/2015/10/09/install-theano-on-windows-8-1-with-visual-studio-2013-cuda-7-5/

Answer #7:

I used this guide, and it was quite helpful.
What many Windows Theano guides mention only in passing (or not at all) is that you need to compile theano from the MinGW shell, not from your IDE.

I ran mingw-w64.bat, and from there "python" and "import theano". Only after that did importing it from PyCharm work.

Additionally, the official instructions on deeplearning.net are bad because they tell you to use CUDA 5.5, which won't work with newer video cards.

The comments are also quite helpful. If it complains about a missing crtdefs.h or basetsd.h, do what Sunando's answer says. If AFTER THAT it still complains that identifier "IUnknown" is undefined in objbase.h, put the following in the
C:\Program Files (x86)\Microsoft SDKs\Windows\v7.1A\Include\objbase.h file, on line 236:

#include <wtypes.h>
#include <unknwn.h>

I had to do this last part to make it work with the bleeding-edge install (required for parts of Keras).

I also wrote a list of things that worked for me, here:
http://acoupleofrobots.com/everything/?p=2238
This is for 64 bit version.

Answered By: Sunando
