R in command line – Read Pythonic Data into R

There are a few ways to read data that has been cleaned up by Python into R. You can dump the Python result into a common interchange format such as JSON, or you can do some ETL work on either side so that the Python output is easy for R to read. However, there is already an R package called rPython that fills this gap seamlessly.
If you look at the documentation of this package, you will find only four functions on the contents page, but they make my life so much easier.
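For comparison, the JSON route mentioned above might look roughly like this on the Python side. This is only a sketch: the `cleaned` records and the `to_json` helper are hypothetical names, not part of any package, and on the R side you would still need a JSON reader (for example the jsonlite package) to parse the string.

```python
import json

# Hypothetical cleaned records; stand-ins for whatever the Python step produces.
cleaned = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "B"},
]


def to_json(records):
    """Serialize a list of dicts to a JSON string that another tool can parse."""
    return json.dumps(records, sort_keys=True)


payload = to_json(cleaned)
print(payload)
```

This works, but it leaves you managing the hand-off (a file or a pipe) yourself, which is the part rPython takes care of.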

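1. python.get("python expression") evaluates a Python expression and returns the result as an R object.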

library('rPython')
str_py_list <- "[1,'a', 'B']"
str(python.get(str_py_list))
List of 3
$ : num 1
$ : chr "a"
$ : chr "B"
str_py_tuple <- "(1,'a', 'B')"
str(python.get(str_py_tuple))
List of 3
$ : num 1
$ : chr "a"
$ : chr "B"
str_py_dict <- "{1:2, 'a':'A', 'B': 1+1}"
str(python.get(str_py_dict))
List of 3
$ a: chr "A"
$ 1: num 2
$ B: num 2

As you can see, python.get reads the string, evaluates it as a Python expression, and converts the resulting Python object into an R list. From there you can use data.frame to turn it into a data frame.
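The results above (a list and a tuple both arriving as the same R list, and the integer key 1 showing up as the name "1") are consistent with the values travelling between the two languages as JSON, which, as far as I understand, is how rPython transfers data. The following Python 3 sketch uses only the standard json module to show what that serialization does to the three example objects. It illustrates the JSON behaviour, not rPython's internals.

```python
import json

# The same three objects used in the R session above.
as_list = json.dumps([1, 'a', 'B'])
as_tuple = json.dumps((1, 'a', 'B'))
as_dict = json.dumps({1: 2, 'a': 'A', 'B': 1 + 1})

# A tuple has no JSON counterpart, so it is written as an array, like a list.
print(as_list == as_tuple)

# JSON object keys must be strings, so the integer key 1 becomes "1".
print(as_dict)
```

So if you need to tell tuples from lists, or keep non-string dictionary keys, you will have to encode that information yourself before handing the data over.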

2. python.assign(var.name, value) takes an R object and translates it into a Python object stored under the given variable name. Combined with python.exec("python command"), this lets you read R data into Python.

> data(iris)
> df <- iris
> python.assign('py_iris', df)
> python.exec("print len(py_iris)")
5
> python.exec("print py_iris.keys()")
[u'Petal.Length', u'Sepal.Length', u'Petal.Width', u'Sepal.Width', u'Species']
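The 5 above is the number of columns, not rows: judging by the keys, the data frame arrives in Python as a dictionary with one entry per column. Here is a small sketch of working with that shape in Python 3, using a hypothetical two-row stand-in for py_iris (the values are illustrative, not copied from the session above):

```python
# Hypothetical two-row stand-in for py_iris: one key per column,
# each holding that column's values.
py_iris = {
    "Sepal.Length": [5.1, 4.9],
    "Sepal.Width": [3.5, 3.0],
    "Petal.Length": [1.4, 1.4],
    "Petal.Width": [0.2, 0.2],
    "Species": ["setosa", "setosa"],
}

n_columns = len(py_iris)          # len() of a dict counts its keys
n_rows = len(py_iris["Species"])  # row count comes from any one column

# Turn the column-oriented dict into a list of row records.
rows = [dict(zip(py_iris.keys(), values)) for values in zip(*py_iris.values())]

print(n_columns, n_rows)
print(rows[0])
```

Row records like these are often easier to loop over in Python than the column-oriented dictionary.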

3. python.load("file") runs a Python script from a file.

$ cat datafireball.py
import urllib2, sys
sys.path.append('/Library/Python/2.7/site-packages/beautifulsoup4-4.2.1-py2.7.egg')
from bs4 import BeautifulSoup
stream = urllib2.urlopen('https://datafireball.com/')
soup = BeautifulSoup(stream)
print soup.find('div', {'class':'site-description'}).text.encode('utf-8')

Above is a very basic Python script that uses the urllib2 library to make an HTTP request to datafireball.com and then uses the BeautifulSoup package to parse the returned HTML. In the end, it prints the site description of datafireball.com to the screen.
Note: if you know how to capture the title into R, please leave a comment. In the meantime, this is what happens in R.

python.load('/tmp/datafireball.py')
a journey of a data guy
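One possible way to capture that text in R instead of only printing it is to have the script store the result in a top-level Python variable and then fetch that variable by name from R. The sketch below shows the Python half under a few assumptions: it is written for Python 3 (unlike the Python 2 script above), it swaps BeautifulSoup for the standard library's html.parser so that it has no dependencies, and it parses a hard-coded HTML string instead of making a request. DescriptionParser, html and site_description are hypothetical names.

```python
from html.parser import HTMLParser


class DescriptionParser(HTMLParser):
    """Collect the text inside the first <div class="site-description">."""

    def __init__(self):
        super().__init__()
        self._inside = False  # True while we are inside the target div
        self.done = False     # True once the first match has been closed
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "div" and not self.done and dict(attrs).get("class") == "site-description":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "div" and self._inside:
            self._inside = False
            self.done = True

    def handle_data(self, data):
        if self._inside:
            self.text += data


# Hypothetical stand-in for the page that the real script downloads.
html = '<html><body><div class="site-description">a journey of a data guy</div></body></html>'

parser = DescriptionParser()
parser.feed(html)

# Keep the result in a top-level variable rather than printing it.
site_description = parser.text.strip()
```

If a script along these lines were run with python.load(), then python.get("site_description") should hand the string back to R, since python.get evaluates an expression in the same Python session. I have not verified this against the session above.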
