Alright! We have two R Markdown notebooks written in Python and R running PyTorch libraries. There are still many more things we can do to improve the accuracy of the model at recognizing handwritten digits. But before we continue improving the algorithm and the model, I want to pause briefly and show you something that really made me jump ship from Windows to Linux. It is related to R, Python and data science.
I have been doing data science and machine learning mostly in Windows, using Linux virtual machines to verify that R and Python packages work in most operating systems. That includes Unix in macOS. But I will focus this episode on Linux.
RSuite is multi-platform software; it runs under Windows, Unix, Linux and macOS. To me RSuite is a blessing because it not only makes it easier to run Python from R but also allows me to add an extra layer on top of projects and packages; a layer that lets me organize, run, clean up, transmit and receive scripts and packages. Call it, if you want, a supervising or orchestrating master project.
Virtual machines are fine, but when the job requires CPUs, GPUs and gigabytes of memory, the VM starts getting laggy. So I thought, “let’s try this amazing rsuite paradigm in Linux now and see how it goes.” I had an HP Zbook G2 laptop that I could not sell and had set up as a Windows office server, since I had the license anyway. So I installed Ubuntu 18.04 and off it went. Running Linux is tremendously satisfying because it feels like sailing with the wind in your favor. But that’s just me. I don’t run complex Windows-specific software anyway.
I tested rsuite in Linux by cloning an RPyStats project I had developed in Windows. It worked flawlessly!
Data Science should work anywhere, regardless of the operating system.
Here is what I will share with you:
- Cloning an RPyStats project developed in Windows from GitHub into a Linux virtual or physical machine
- Installing the R dependencies of the project in Linux
- Installing the Python dependencies
- Running the notebooks
We will do all this on an Ubuntu 18.04 Linux machine.
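Before starting, it can help to confirm that the command-line tools used in the steps below are on the PATH. This is only a minimal sketch: `check_tools` is a hypothetical helper name of my own, and the list of tools (`git`, `rsuite`, `Rscript`) is an assumption based on the commands and software mentioned in this post.

```shell
# Hypothetical helper: report which of the given tools are on the PATH.
# Returns 0 if all are found, 1 if at least one is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed tool list for this walkthrough; adjust to your own setup.
check_tools git rsuite Rscript || echo "Install the missing tools before continuing."
```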
Cloning the repository *rpystats-apollo11*:
git clone https://github.com/f0nzie/rpystats-apollo11.git
Change (cd) to the *rpystats-apollo11* folder and install the R dependencies:
cd rpystats-apollo11
rsuite proj depsinst
Installing the Python dependencies:
rsuite sysreqs install
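The clone and install steps above could be collected into one small script. The sketch below uses only the commands already shown; the `run` and `setup_project` helpers and the `DRY_RUN` variable are my own illustrative names, not part of rsuite. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch: the clone and dependency steps above as one script.
set -eu

REPO_URL="https://github.com/f0nzie/rpystats-apollo11.git"
PROJECT_DIR="rpystats-apollo11"

# Hypothetical helper: print the command, and execute it only when DRY_RUN=0.
# DRY_RUN defaults to 1, so a first run just shows the planned commands.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

setup_project() {
  run git clone "$REPO_URL"
  run cd "$PROJECT_DIR"
  run rsuite proj depsinst     # R dependencies
  run rsuite sysreqs install   # Python dependencies, as in the step above
}

setup_project
```

Saved under a name of your choice (say `setup.sh`, a hypothetical filename), `sh setup.sh` prints the four commands, and `DRY_RUN=0 sh setup.sh` actually runs them. Because of `set -eu`, the script stops at the first step that fails.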
Go to the master folder and run the RStudio project:
In RStudio navigate to the folder ./work/notebooks and open the Rmarkdown notebook *mnist_dgits_rstats.Rmd*:
Knit the notebook to HTML:
The notebook will start building (or knitting). You can follow the progress with the percentage numbers in the “R Markdown” pane.
It will take about 2 to 3 minutes because it is training the model.
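If you prefer a terminal over the Knit button, a rough command-line equivalent is to call `rmarkdown::render()` through `Rscript`. This is a sketch under several assumptions: `knit_notebook` is a hypothetical helper name, pandoc must be available outside RStudio (RStudio ships its own copy), and I am assuming that the project's R libraries are found when R is started from the shell, which depends on how the project is set up.

```shell
# Hypothetical helper: knit one R Markdown notebook to HTML from the shell.
knit_notebook() {
  nb="$1"
  if [ ! -f "$nb" ]; then
    echo "notebook not found: $nb" >&2
    return 1
  fi
  # The notebook path is passed as a trailing argument and read inside R
  # with commandArgs(), which avoids quoting problems in the R expression.
  Rscript -e 'rmarkdown::render(commandArgs(trailingOnly = TRUE)[1], output_format = "html_document")' "$nb"
}
```

From the folder that contains `./work/notebooks`, you would call it as `knit_notebook work/notebooks/mnist_dgits_rstats.Rmd`. Expect the same training time as when knitting inside RStudio.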
Then you get your HTML file in the RStudio browser:
Done!
You just built an RPyStats project using rsuite and RStudio in Linux!
Example repository
[https://github.com/f0nzie/rpystats-apollo11](https://github.com/f0nzie/rpystats-apollo11)
Links
- MNIST digits dataset: https://github.com/f0nzie/mnist_digits_png_full
Download links
- R 3.6.0: https://cran.r-project.org/bin/windows/base/old/3.6.0/
- RSuite client: https://rsuite.io/RSuite_Download.php
- RStudio: https://www.rstudio.com/
- Anaconda Python: https://www.anaconda.com/distribution/#download-section
- Git for Windows: https://git-scm.com/download/win