Version compatibility between the OS and these packages is a nightmare for anyone new to TensorFlow. Here I record the procedure that successfully installed everything listed in the title of this note.
First of all, make sure you use exactly the same software versions at every step, or it may not work. This procedure is for Ubuntu 16.04, so you need that release installed. Run
lsb_release -a
to check your Ubuntu version. If you want to dual-boot Ubuntu alongside Windows 10, as I did, there is another note that explains how to set up the dual boot. The rest of this note assumes Ubuntu 16.04 is already installed correctly.
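The version check above can also be scripted. Below is a minimal sketch, assuming lsb_release -a prints "Key:<tab>value" lines as in the illustrative sample; the function names are my own and not part of any tool.

```python
import subprocess


def parse_lsb_release(text):
    """Turn the 'Key:<tab>value' lines printed by `lsb_release -a` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def is_ubuntu_1604(text):
    """Return True when the parsed output reports Ubuntu release 16.04."""
    info = parse_lsb_release(text)
    return info.get("Distributor ID") == "Ubuntu" and info.get("Release") == "16.04"


def read_lsb_release():
    """Run `lsb_release -a` and return its output (assumes the command is on PATH)."""
    return subprocess.check_output(["lsb_release", "-a"], universal_newlines=True)


# Illustrative sample of what the command prints on a 16.04 machine.
sample = (
    "Distributor ID:\tUbuntu\n"
    "Description:\tUbuntu 16.04.3 LTS\n"
    "Release:\t16.04\n"
    "Codename:\txenial\n"
)
print(is_ubuntu_1604(sample))
```

On a real machine, pass read_lsb_release() to is_ubuntu_1604 instead of the sample string.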
Install pycharm
ref: https://itsfoss.com/install-pycharm-ubuntu/
Open a terminal and use the following commands:
sudo add-apt-repository ppa:mystic-mirage/pycharm
sudo apt-get update
To install the community edition of PyCharm, use the command below. It will download data of around 120 MB.
sudo apt-get install pycharm-community
To install the professional edition of PyCharm, use the command below. It will download data of around 160 MB.
sudo apt-get install pycharm
Remove PyCharm
If you installed PyCharm with the method described above, use the command below to uninstall the community edition:
sudo apt-get remove pycharm-community
To remove the professional version, use the command below:
sudo apt-get remove pycharm
Afterward, use the command below to remove the PPA from the sources list:
sudo add-apt-repository --remove ppa:mystic-mirage/pycharm
That’s all you need to do. I hope this tutorial helped you to install PyCharm in Ubuntu Linux. Any questions or suggestions are always welcome.
Install Python2.7
ref: https://askubuntu.com/questions/101591/how-do-i-install-the-latest-python-2-7-x-or-3-x-on-ubuntu
sudo add-apt-repository ppa:fkrull/deadsnakes
sudo apt-get update
sudo apt-get install python2.7
Install pip
Open the PyCharm you just installed and create a project. Use Alt + F12 or View -> Tool Windows -> Terminal to open the terminal.
Then follow the instructions here: https://www.rosehosting.com/blog/how-to-install-pip-on-ubuntu-16-04/
1. Connect to SSH and Update your System Software
First of all, connect to your server via SSH and make sure that all your system software is up to date. Run the following command to update the package list and upgrade all your system software to the latest version available:
sudo apt-get update && sudo apt-get -y upgrade
2. Install Pip on Ubuntu 16.04
Once the upgrade is completed, you can move on and install Pip on your Ubuntu VPS. The installation of Pip is very simple. The only thing you need to do is to run the following command:
sudo apt-get install python-pip
3. Verify the Pip Installation on Ubuntu 16.04
The apt package manager will install Pip and all the dependencies required for the software to work optimally. Once the installation is completed you can verify that it was successful by using the following command:
pip -V
You should see something similar to the following:
# pip -V
pip 8.1.1 from /usr/lib/python2.7/dist-packages (python 2.7)
That means Pip has been successfully installed on your Ubuntu server and it is ready to use.
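Since this whole note depends on matching versions, it can help to check which Python the pip command belongs to. A minimal sketch, assuming the output has the same shape as the line shown above; parse_pip_version is a hypothetical helper name.

```python
import re


def parse_pip_version(line):
    """Extract (pip version, Python version) from one line of `pip -V` output."""
    match = re.search(r"pip (\S+) from .* \(python (\d+\.\d+)\)", line)
    if match is None:
        return None
    return match.group(1), match.group(2)


# The line quoted in the verification step above.
line = "pip 8.1.1 from /usr/lib/python2.7/dist-packages (python 2.7)"
print(parse_pip_version(line))
```

If the second value is not 2.7, that pip installs into a different interpreter than the one this note uses.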
Install numpy, tqdm, pillow, scipy etc...
Go to File -> Settings
In Project: -> Project Interpreter, make sure a Python 2.7 interpreter is selected. Click the green plus sign on the right, search for the package you need, and click the Install Package button at the bottom of the window that opens to install it.
Caution: to run TensorFlow on the GPU, it is better not to install it from PyCharm. See the UPDATE below for how to install it.
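To confirm that the selected interpreter can actually see the packages, something like the following can be run from the PyCharm terminal or as a script. It is only a sketch: the list of names is my assumption about which packages are meant, and missing_modules is a hypothetical helper.

```python
import importlib


def missing_modules(names):
    """Return the names from `names` that the current interpreter cannot import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Import names can differ from the package names shown in PyCharm
# (for example, the pillow package is imported as PIL).
wanted = ["numpy", "tqdm", "PIL", "scipy"]
print("missing: %s" % missing_modules(wanted))
```

An empty list means every module was importable by the interpreter that ran the script.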
Installing CUDA Toolkit 8.0 on Ubuntu 16.04
ref: http://www.pradeepadiga.me/blog/2017/03/22/installing-cuda-toolkit-8-0-on-ubuntu-16-04/
GCC
One of the prerequisites is to make sure that GCC is installed. We can confirm it by executing the following command.
gcc --version
Since I am using Ubuntu, GCC comes pre-installed and here is the output that I got.
build essentials
It is important to have the build-essential package installed. This is usually pre-installed on Ubuntu; if it is not, you can install it by executing the following command.
sudo apt-get install build-essential
On my laptop it was already installed hence I got the following output.
Download CUDA package from NVIDIA website
Navigate to https://developer.nvidia.com/cuda-downloads and download the appropriate package.
Once the package has been downloaded, we need to install it. First navigate to the folder where the package is located. In my case it is the ~/Downloads/CUDA folder. Then issue the following command, which installs the package.
sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-ga2_8.0.61-1_amd64.deb
Then update the package list from the repositories using the below command.
sudo apt-get update
Then install CUDA by executing the following command.
sudo apt-get install cuda
After a couple of minutes the installation should succeed and you should see a screen similar to the following.
One of the important post installation steps is to update the PATH variable to include the CUDA binaries folder. To update it, we need to edit the file /etc/environment. I use the nano text editor in this post, so the command would be
sudo nano /etc/environment
Once nano is open, edit the PATH variable to include the /usr/local/cuda-8.0/bin folder, that is, append :/usr/local/cuda-8.0/bin inside the quotes of the PATH line.
After editing this line press Ctrl + X to exit the editor and press Y when prompted whether you want to save it.
This method of editing the PATH variable usually requires a reboot to take effect. However executing the below command would update the PATH variable immediately.
source /etc/environment
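To check that the edit took effect in the current session, the PATH value can be inspected directly. A minimal sketch, assuming the CUDA 8.0 location used above; path_contains is a hypothetical helper.

```python
import os
import posixpath


def path_contains(path_value, directory):
    """Return True if `directory` is one of the entries of a colon-separated PATH string."""
    wanted = posixpath.normpath(directory)
    entries = [entry for entry in path_value.split(":") if entry]
    return any(posixpath.normpath(entry) == wanted for entry in entries)


# Folder added to /etc/environment in the step above.
cuda_bin = "/usr/local/cuda-8.0/bin"
print(path_contains(os.environ.get("PATH", ""), cuda_bin))
```

If this prints False in a fresh terminal, the edit has not been picked up yet and a reboot or re-login is probably still needed.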
Now we are ready to validate the CUDA installation. Just execute the following command in the terminal.
nvcc --version
If the installation was successful, we should see the CUDA compiler version as seen in this screenshot.
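The release number reported by nvcc is what has to match the cuDNN and TensorFlow versions later on. A minimal sketch for pulling it out of the output, assuming a line of the form shown in the sample; parse_nvcc_release is a hypothetical helper.

```python
import re


def parse_nvcc_release(text):
    """Return the CUDA release string (such as '8.0') found in `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", text)
    return match.group(1) if match else None


# Illustrative last line of `nvcc --version` for the package installed above.
sample = "Cuda compilation tools, release 8.0, V8.0.61"
print(parse_nvcc_release(sample))
```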
Alternatively you can also execute the following command in the terminal. This gives more detailed information about the drivers.
nvidia-smi
We are now ready to enjoy the goodness of CUDA and can continue with the installation of TensorFlow. Stay tuned for the installation instructions of TensorFlow.
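Note: if the files on the CUDA download page differ in size and version from those described here, NVIDIA has released an update. In that case, just follow the few commands listed at the bottom of the download page.
If everything is installed and running nvidia-smi gives this error message:
"NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver"
then reboot the computer and run nvidia-smi again.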
Install cudnn v5.1 (you may need cudnn v6.0 too, see the update below)
ref: https://medium.com/@ikekramer/installing-cuda-8-0-and-cudnn-5-1-on-ubuntu-16-04-6b9f284f6e77
Download cudnn from https://developer.nvidia.com/rdp/cudnn-download
If the CUDA you installed is no longer 8.0, that is fine. Register and log in at the address above; it lists which cudnn version goes with which CUDA version. Download the matching one.
My OS is Ubuntu 16.04 on amd64. The Debian archives offered for Ubuntu 16.04 are for Power8 and do not apply to my machine, so I had to download the tar file "cuDNN v5.1 Library for Linux". (This is still being tested. If it doesn't work, I will have to switch to Ubuntu 14.04, because Debian archives for amd64 exist for that version.)
Next you need to uncompress and copy cuDNN to the toolkit directory. The toolkit default install location is /usr/local/cuda
tar xvzf cudnn-8.0-linux-x64-v5.1-ga.tgz
sudo cp -P cuda/include/cudnn.h /usr/local/cuda/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
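After copying, it can be worth confirming which cuDNN version the header in the toolkit directory belongs to. The sketch below assumes cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (true for the v5.1 and v6.0 headers as far as I know); the function names and the patch level in the sample are illustrative.

```python
import re


def parse_cudnn_version(header_text):
    """Read CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL from the text of cudnn.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+%s\s+(\d+)" % name, header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def installed_cudnn_version(header_path="/usr/local/cuda/include/cudnn.h"):
    """Parse the copied header; returns None if the file cannot be read."""
    try:
        with open(header_path) as handle:
            return parse_cudnn_version(handle.read())
    except IOError:
        return None


# Illustrative excerpt of a v5.1 header.
sample = "#define CUDNN_MAJOR      5\n#define CUDNN_MINOR      1\n#define CUDNN_PATCHLEVEL 10\n"
print(parse_cudnn_version(sample))
print(installed_cudnn_version())
```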
Now you need to update your bash file
nano ~/.bashrc
With the text editor open, scroll to the bottom and put in these lines:
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
export CUDA_HOME=/usr/local/cuda
Save and close it.
UPDATE:
What to check?
To run TensorFlow on the GPU from the PyCharm IDE, here is a checklist:
1. Test whether CUDA and cudnn run correctly.
2. Test whether CUDA can pick up your GPU device.
3. Make sure TensorFlow run from PyCharm can see CUDA and cudnn.
Solutions:
For "1", install the correct versions of CUDA and cudnn as described above. Use the nvcc -V and nvidia-smi commands to check the installation and the driver.
Then run the sample code to test whether CUDA works correctly.
ref: http://xcat-docs.readthedocs.io/en/stable/advanced/gpu/nvidia/verify_cuda_install.html
-> Go to the CUDA samples directory (mine is /usr/local/cuda/samples).
-> Run make. It might take a few minutes to build the files.
-> Run deviceQuery and bandwidthTest. The paths below come from the referenced guide, which targets ppc64le; on an amd64 machine the architecture directory should be x86_64 instead:
./bin/ppc64le/linux/release/deviceQuery
./bin/ppc64le/linux/release/bandwidthTest
If both pass, CUDA should work.
For "2", run the test code provided on the official TensorFlow site:
ref: https://www.tensorflow.org/tutorials/using_gpu
Write the code below to a file and run it from the terminal:
import tensorflow as tf
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(c))
Save as a .py file and run python ./${file_full_path}
You should see the following output:
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/device:GPU:0
a: /job:localhost/replica:0/task:0/device:GPU:0
MatMul: /job:localhost/replica:0/task:0/device:GPU:0
[[ 22. 28.]
[ 49. 64.]]
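When the log gets longer than this, it is easy to miss an op that silently landed on the CPU. The sketch below scans placement lines of the plain "name: /job:..." form shown above and reports ops whose device does not mention a GPU. It assumes that exact line format (real logs may carry timestamps or other prefixes), and ops_not_on_gpu is a hypothetical helper.

```python
def ops_not_on_gpu(log_text):
    """Return the names of ops whose placement line does not mention a GPU device."""
    off_gpu = []
    for line in log_text.splitlines():
        name, sep, device = line.partition(": ")
        # Placement lines look like 'MatMul: /job:localhost/.../device:GPU:0'.
        if sep and device.startswith("/job:"):
            if "GPU" not in device.upper():
                off_gpu.append(name.strip())
    return off_gpu


# The device placement part of the expected output quoted above.
sample = (
    "Device mapping:\n"
    "/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K40c, pci bus\n"
    "id: 0000:05:00.0\n"
    "b: /job:localhost/replica:0/task:0/device:GPU:0\n"
    "a: /job:localhost/replica:0/task:0/device:GPU:0\n"
    "MatMul: /job:localhost/replica:0/task:0/device:GPU:0\n"
)
print(ops_not_on_gpu(sample))
```

An empty list means every op in the pasted log was placed on a GPU device.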
If it shows "Device mapping: no known devices", uninstall tensorflow and install tensorflow-gpu, as explained below.
The tensorflow package is most likely the CPU version. To run on the GPU through CUDA, you need to install tensorflow-gpu. However, since installing tensorflow-gpu reinstalls tensorflow automatically, it is wise to manually uninstall every tensorflow-related package first and then install tensorflow-gpu.
In my case:
pip uninstall tensorflow
was not enough, because reinstalling with:
pip install tensorflow-gpu
still gave me the CPU build of tensorflow, not the GPU one. So before installing tensorflow-gpu, I removed all the tensor-related folders in site-packages and uninstalled protobuf, and it worked!
In summary:
pip uninstall tensorflow
Remove all tensor folders in ~/Python35/Lib/site-packages
pip uninstall protobuf
pip install tensorflow-gpu
If the command pip uninstall tensorflow fails with an error, remove the package from PyCharm instead: go to File -> Settings -> Project Interpreter and click the minus sign to uninstall. Also check from the terminal whether the folders are still present in /usr/local/lib/python2.7/dist-packages/ and /usr/local/lib/python2.7/site-packages/. Folders that need to be removed:
tensorflow, tensorboard, anything else starting with tensor, and protobuf.
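Before deleting anything by hand, it may help to list what is actually left over. The sketch below only prints matching entries and deletes nothing; the two directories are the ones named above, and leftover_tensorflow_dirs is a hypothetical helper.

```python
import os


def leftover_tensorflow_dirs(site_packages):
    """List entries in `site_packages` whose names start with 'tensor' or 'protobuf'."""
    if not os.path.isdir(site_packages):
        return []
    prefixes = ("tensor", "protobuf")
    return sorted(
        os.path.join(site_packages, name)
        for name in os.listdir(site_packages)
        if name.lower().startswith(prefixes)
    )


# Directories mentioned above; adjust them if your interpreter lives elsewhere.
for directory in (
    "/usr/local/lib/python2.7/dist-packages",
    "/usr/local/lib/python2.7/site-packages",
):
    for path in leftover_tensorflow_dirs(directory):
        print(path)
```

No output means neither directory contains a matching entry.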
Open NVIDIA X Server Settings. On the GPU-0 (your GPU name) tab, watch the GPU Utilization. If it rises sharply within a short time while your code runs, TensorFlow is running on the GPU correctly.
If an error message like one of the following appears when you import tensorflow:
ImportError: libcudnn.so.5: cannot open shared object file: No such file or directory
or
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory
it means the matching version of the cudnn library could not be found. TensorFlow may be trying to load cudnn v6. If so, download cuDNN v6.0 (April 27, 2017), for CUDA 8.0, untar it and copy the files to the same locations as above (cudnn v5.1 and v6.0 can coexist, so you don't have to delete the files copied from v5.1), then run the TensorFlow code again. That should fix it.
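To see which cuDNN major version the error is asking for and which library files are actually present, a small check like this may help. It is a sketch that assumes the library directory used earlier in this note (/usr/local/cuda/lib64); both function names are hypothetical.

```python
import glob
import re


def required_cudnn_major(error_message):
    """Extract N from a 'libcudnn.so.N: cannot open shared object file' message."""
    match = re.search(r"libcudnn\.so\.(\d+)", error_message)
    return int(match.group(1)) if match else None


def installed_cudnn_libs(lib_dir="/usr/local/cuda/lib64"):
    """List the libcudnn.so.* files found in `lib_dir` (empty if none are there)."""
    return sorted(glob.glob(lib_dir + "/libcudnn.so.*"))


# One of the error messages quoted above.
message = "ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory"
print(required_cudnn_major(message))
print(installed_cudnn_libs())
```

If the number in the error does not appear among the listed files, that cuDNN version still needs to be copied in.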
For "3", everything may work when you run from the terminal but suddenly break when you run from PyCharm, even though you have already added LD_LIBRARY_PATH and CUDA_HOME to ~/.bashrc. This is because PyCharm uses its own environment variables. To run in PyCharm, CUDA_HOME and LD_LIBRARY_PATH need to be set again, probably for each individual project.
To do that, go to Run -> Edit Configurations and choose your project. Then click on Environment Variables and add entries for CUDA_HOME and LD_LIBRARY_PATH.
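A quick way to see what PyCharm actually passes to your script is to run a small check from inside the IDE and compare it with the same script run from the terminal. A minimal sketch, assuming the two library folders exported in ~/.bashrc earlier; missing_cuda_settings is a hypothetical helper.

```python
import os


def missing_cuda_settings(environ):
    """Return a list of problems with CUDA_HOME and LD_LIBRARY_PATH in `environ`."""
    problems = []
    if not environ.get("CUDA_HOME"):
        problems.append("CUDA_HOME is not set")
    raw = environ.get("LD_LIBRARY_PATH", "")
    library_dirs = [entry.rstrip("/") for entry in raw.split(":") if entry]
    for wanted in ("/usr/local/cuda/lib64", "/usr/local/cuda/extras/CUPTI/lib64"):
        if wanted not in library_dirs:
            problems.append("LD_LIBRARY_PATH lacks %s" % wanted)
    return problems


# Prints nothing when the environment of the running process looks complete.
for problem in missing_cuda_settings(os.environ):
    print(problem)
```

If the script is silent in the terminal but prints problems under PyCharm, the run configuration is what needs the extra variables.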
At this point your CUDA and TensorFlow configuration should be complete. Make sure your code is set up to use the GPU, and everything should work.