
Does SparseConvNet have a Windows version? #128

Open
dengyuhk opened this issue on Oct 5, 2019 · 6 comments

Comments

The version you have here works on any OS as long as you have the libraries installed. If PyTorch works on Windows, this code will work on Windows. I have run it on a Windows system before without any issues.

Would anyone mind providing the instructions to install this on Windows?
Thanks.

There are tons of ways to install this on Windows.
One method would be to install Anaconda on Windows. Then use the shell/terminal and follow the install instructions.
I would generally create an environment within Conda that uses PyTorch 1.3, CUDA 10.0, and Python 3.7.
Since it uses CUDA 10.0, make sure you have the relevant CUDA drivers installed on Windows.
Then install all the libraries using:

```bash
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch  # see https://pytorch.org/get-started/locally/
git clone git@github.com:facebookresearch/SparseConvNet.git
cd SparseConvNet/
bash develop.sh
```

victoryc commented on Jul 22
edited

I presume you ran "bash develop.sh" within an Anaconda Prompt? When I did that, I got the following error:

```
develop.sh: line 10: python: command not found
```

Based on a comment I found while searching online, I changed "python" on line 10 of the script to "python3". But that resulted in other errors:

```
Traceback (most recent call last):
  File "setup.py", line 7, in <module>
    import torch, os
ModuleNotFoundError: No module named 'torch'
```

But I have already installed "pytorch" in this environment, and "conda list" indeed shows pytorch and torchvision as installed. It seems bash does not know about the packages in the conda environment, which is why it gave the "python: command not found" error in the first place.
Thanks.


victoryc commented on Aug 4
edited

I finally managed to get this library to build on my Windows 10, 64-bit machine. I am leaving notes on the main changes I needed, for the benefit of anyone else who wants to build this library on Windows. Maybe these notes will also help the authors, in case they want to make the library cross-platform.

- Added self.use_ninja = False in the __init__ function of the BuildExtension class in ...\site-packages\torch\utils\cpp_extension.py, to get more informative output about where the build errors are. Instead of hacking the code like that, there might be a way to pass an argument to disable the ninja build; I didn't explore that.
- For "Warning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified":
	- In each Anaconda prompt, before "cl" is invoked, run the appropriate Visual Studio Developer command file to set up the environment variables, e.g. call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Auxiliary\Build\vcvars64.bat". The exact batch file to execute depends on the architecture you are compiling for.
- I decided not to use develop.sh, because it leads to the "python: command not found" error (bash is not aware of the Anaconda environment). The first line in develop.sh just cleans up, which can be done manually; I chose to run the second line, "python setup.py develop", directly.
- Handling the <tr1/functional> file-not-found error:
	- "tr1" was a temporary namespace used before "functional" became part of the C++ standard.
	- In "sparseconfig.h" under "sparseconvnet\SCN\Metadata\sparsehash\internal", made the following changes:
		- Changed "#define HASH_FUN_H <tr1/functional>" to "#define HASH_FUN_H <functional>"
		- Changed "#define HASH_NAMESPACE std::tr1" to "#define HASH_NAMESPACE std"
- In cpp_extension.py, line 1398, set num_workers = 1.
- In cpp_extension.py, line 297, set self.use_ninja = False.
- There are several occurrences of "or" (instead of "||") and "and" (instead of "&&") in Metadata.cpp, NetworkInNetwork.cpp, etc. To build them with the Microsoft compiler, we need to pass the "/permissive-" option (note: though the '/Za' flag is supposed to fix them, it causes other errors). Updated setup.py to add that option.
- c:\blahblah\SparseConvNet\sparseconvnet\SCN\Metadata\sparsehash\internal\hashtable-common.h(166): error C2760: syntax error: unexpected token 'typedef', expected ';'
	- In hashtable-common.h, changed the definition of SPARSEHASH_COMPILE_ASSERT(expr, msg) to static_assert(expr, "message")
- c:\blahblah\SparseConvNet\sparseconvnet\SCN\CUDA/SparseToDense.cpp(29): error C2664: 'at::Tensor &at::Tensor::resize_(c10::IntArrayRef,c10::optional<c10::MemoryFormat>) const': cannot convert argument 1 from 'std::array<long,3>' to 'c10::IntArrayRef'
	- Changed "std::array<long, Dimension + 2> sz" to "std::array<int64_t, Dimension + 2> sz"
- c:\blahblah\SparseConvNet\sparseconvnet\SCN\CUDA/BatchNormalization.cu(83): error: calling a __host__ function("pow<float, double, (int)0> ") from a __global__ function("BatchNormalization_f_test<float, (int)16, (int)64> ") is not allowed
	- Changed "pow(_saveInvStd / nActive + eps, -0.5)" to "pow(double(_saveInvStd / nActive + eps), -0.5)". Otherwise the call signature is pow(float, double), which does not match any "pow" overload available in CUDA device code.
- Big one: After doing all of the above, I got the code to compile, but a mysterious link error appeared. The error said something like this: sparseconvnet_cuda.obj : error LNK2001: unresolved external symbol "public: long * __cdecl at::Tensor::data_ptr<long>(void)const " (??$data_ptr@J@Tensor@at@@QEBAPEAJXZ)
	- This was too mysterious. A knowledgeable poster on a CUDA forum offered a clue: code meant to be cross-platform should not use "long", which ends up 32 bits wide on 64-bit Windows machines but 64 bits wide on 64-bit Linux machines.
	- Replaced all occurrences of "long" by "int64_t" and the mysterious link error went away.

wjj31767 commented 7 days ago
edited

@victoryc


Hi, I've tried your method, but it didn't work.

I think the problem is that I don't understand how to do this step:

> There are several occurrences of "or" (instead of "||") and "and" (instead of "&&") in Metadata.cpp, NetworkInNetwork.cpp etc. To get them to build with Microsoft compiler, we need to pass the "/permissive-" option (Note: Though '/Za' flag is supposed to fix them, it causes other errors). Updated setup.py to add that option.

I've tried changing

```python
extra = {'cxx': ['-std=c++14', '-fopenmp'], 'nvcc': ['-Xcompiler', '-fopenmp']}
```

on line 17 of setup.py to

```python
extra = {'cxx': ['-std=c++14', '-fopenmp'], 'nvcc': ['-Xcompiler -fpermissive', '-fopenmp']}
# or
extra = {'cxx': ['-std=c++14', '-fopenmp'], 'nvcc': ['-Xcompiler', '-fpermissive']}
# or
extra = {'cxx': ['-std=c++14', '-fopenmp'], 'nvcc': ['-Xcompiler', '-fopenmp', '/permissive-']}
```

None of them worked.

If I don't change it, I get 64 errors, mostly like this:

```
d:\second.pytorch\second\sparseconvnet\sparseconvnet\scn\CUDA/AveragePooling.cu(126): error: expected a ")"
          detected during:
            instantiation of "void CopyFeaturesHelper_bp<T,NTX,NTY>(T *, T *, Int *, Int, Int) [with T=float, NTX=32, NTY=32]"
(144): here
            instantiation of "void cuda_CopyFeaturesHelper_BackwardPass(T *, T *, Int *, Int, Int) [with T=float]"
sparseconvnet/SCN/cuda.cu(140): here
```

Could you please tell me how to modify setup.py?

By the way, did you also try to configure PointPillars on Windows? If so, did you succeed?

Thanks
