Functional pipes in python like %>% from R’s magrittr

Question:

In R (thanks to magrittr) you can now perform operations with a more functional piping syntax via %>%. This means that instead of coding this:

> as.Date("2014-01-01")
> as.character(sqrt(12)^2)

You could also do this:

> "2014-01-01" %>% as.Date 
> 12 %>% sqrt %>% .^2 %>% as.character

To me this is more readable, and it extends to use cases beyond the data frame. Does the Python language have support for something similar?

Answer #1:

One possible way of doing this is by using a module called macropy. Macropy allows you to apply transformations to the code that you have written. Thus a | b can be transformed to b(a). This has a number of advantages and disadvantages.

Compared with the solution mentioned by Sylvain Leroux, the main advantage is that you do not need to create infix objects for the functions you want to use; you just mark the areas of code where the transformation should apply. Secondly, since the transformation is applied at compile time rather than at runtime, the transformed code incurs no runtime overhead: all the work is done when the bytecode is first produced from the source code.

The main disadvantage is that macropy must be activated in a particular way for it to work (described below). The price of the faster runtime is that parsing the source code is more computationally expensive, so the program takes longer to start. Finally, it adds a syntactic style that programmers unfamiliar with macropy may find harder to understand.

Example Code:

run.py

import macropy.activate
# Activates macropy. Modules that use macros must be imported after this
# statement, not before it.
import target
# Import the module that uses macropy.

target.py

from fpipe import macros, fpipe
from macropy.quick_lambda import macros, f
# The `from module import macros, ...` must be used for macropy to know which 
# macros it should apply to your code.
# Here two macros have been imported `fpipe`, which does what you want
# and `f` which provides a quicker way to write lambdas.

from math import sqrt

# Using the fpipe macro in a single expression.
# The code between the square brackets is interpreted as str(sqrt(12)).
print(fpipe[12 | sqrt | str]) # prints 3.46410161514 (3.4641016151377544 on Python 3)

# using a decorator
# All code within the function is examined for `x | y` constructs.
x = 1 # global variable
@fpipe
def sum_range_then_square():
    "expected value (1 + 2 + 3)**2 -> 36"
    y = 4 # local variable
    return range(x, y) | sum | f[_**2]
    # `f[_**2]` is macropy syntax for -- `lambda x: x**2`, which would also work here

print(sum_range_then_square()) # prints 36

# Using a with block.
# Same as the decorator, but limited to the enclosed block.
with fpipe:
    print(range(4) | sum) # prints 6
    print('a b c' | f[_.split()]) # prints ['a', 'b', 'c']

And finally, the module that does the hard work. I’ve called it fpipe, for functional pipe, as it’s emulating the shell syntax for passing output from one process to another.

fpipe.py

from macropy.core.macros import *
from macropy.core.quotes import macros, q, ast

macros = Macros()

@macros.decorator
@macros.block
@macros.expr
def fpipe(tree, **kw):

    @Walker
    def pipe_search(tree, stop, **kw):
        """Search code for bitwise or operators and transform `a | b` to `b(a)`."""
        if isinstance(tree, BinOp) and isinstance(tree.op, BitOr):
            operand = tree.left
            function = tree.right
            newtree = q[ast[function](ast[operand])]
            return newtree

    return pipe_search.recurse(tree)

Answered By: Dunes

Answer #2:

pandas has supported pipes through the DataFrame.pipe method since version 0.16.2.

Example:

import pandas as pd
from sklearn.datasets import load_iris

x = load_iris()
x = pd.DataFrame(x.data, columns=x.feature_names)

def remove_units(df):
    # Strip the " (cm)" suffix from every column name (modifies df in place).
    df.columns = [name.replace(" (cm)", "") for name in df.columns]
    return df

def length_times_width(df):
    df['sepal length*width'] = df['sepal length'] * df['sepal width']
    df['petal length*width'] = df['petal length'] * df['petal width']

x.pipe(remove_units).pipe(length_times_width)
x

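NB: The pandas version retains Python’s reference semantics. That’s why length_times_width doesn’t need a return value; it modifies x in place. Because it returns nothing, the chained expression itself evaluates to None, so a function written this way has to be the last step in the chain.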
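
For comparison, here is a rough sketch of the same idea written without in-place mutation, so that every step returns a new DataFrame and the chain can keep going. The tiny inline frame and the add_product helper are made up for illustration (they are not part of the answer above); rename, assign and pipe are regular pandas methods.

```python
import pandas as pd

# Small stand-in for the iris data, so the sketch needs no scikit-learn.
df = pd.DataFrame({
    "sepal length (cm)": [5.1, 4.9],
    "sepal width (cm)": [3.5, 3.0],
})

def remove_units(frame):
    # Return a renamed copy instead of mutating the caller's frame.
    return frame.rename(columns=lambda name: name.replace(" (cm)", ""))

def add_product(frame, left, right):
    # Hypothetical helper: adds a "left*right" column and returns a new frame.
    return frame.assign(**{left + "*" + right: frame[left] * frame[right]})

# Extra positional arguments after the function are forwarded by pipe.
result = df.pipe(remove_units).pipe(add_product, "sepal length", "sepal width")

print(list(result.columns))
print(list(df.columns))  # the original frame keeps its old column names
```

Whether you mutate in place or return copies is a design choice; returning a value from every step is what makes longer chains possible.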

Answered By: shadowtalker

Answer #3:

PyToolz allows arbitrarily composable pipes; they just aren’t defined with that pipe-operator syntax.

See the PyToolz documentation for the quickstart. There is also a video tutorial:
http://pyvideo.org/video/2858/functional-programming-in-python-with-pytoolz

In [1]: from toolz import pipe

In [2]: from math import sqrt

In [3]: pipe(12, sqrt, str)
Out[3]: '3.4641016151377544'
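
If you would rather not add a dependency, a function with the same calling convention is easy to sketch with the standard library. This is only an illustration of the idea using functools.reduce, not toolz's actual implementation:

```python
from functools import reduce
from math import sqrt

def pipe(value, *funcs):
    """Thread value through funcs left to right: pipe(x, f, g) == g(f(x))."""
    return reduce(lambda acc, func: func(acc), funcs, value)

print(pipe(12, sqrt, str))
```

With no functions given, pipe simply returns the value unchanged.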

Answered By: smci

Answer #4:

Does the python language have support for something similar?

Is “more functional piping syntax” really a more “functional” syntax? I would say it adds an “infix” syntax to R instead.

That being said, Python’s grammar does not have direct support for infix notation beyond the standard operators.

If you really need something like that, you could take this code from Tomer Filiba as a starting point to implement your own infix notation:

Code sample and comments by Tomer Filiba (http://tomerfiliba.com/blog/Infix-Operators/):

from functools import partial

class Infix(object):
    def __init__(self, func):
        self.func = func
    def __or__(self, other):
        return self.func(other)
    def __ror__(self, other):
        return Infix(partial(self.func, other))
    def __call__(self, v1, v2):
        return self.func(v1, v2)

Using instances of this peculiar class, we can now use a new “syntax”
for calling functions as infix operators:

>>> @Infix
... def add(x, y):
...     return x + y
...
>>> 5 |add| 6
11
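
The same operator-overloading trick can be bent into something closer to a pipe. The sketch below is my own variation, not part of Tomer Filiba's post: Pipe is a hypothetical wrapper whose __ror__ method calls the wrapped function with the value on the left of the | operator.

```python
from math import sqrt

class Pipe(object):
    """Wrap a callable so that `value | Pipe(func, *args)` calls func(value, *args)."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __ror__(self, value):
        # Called for `value | self` when value's own __or__ does not handle Pipe.
        return self.func(value, *self.args, **self.kwargs)

text = 12 | Pipe(sqrt) | Pipe(str)
ordered = [3, 1, 2] | Pipe(sorted, reverse=True)

print(text)
print(ordered)
```

One caveat: this relies on the left operand not handling | itself. Objects that define their own __or__ for arbitrary operands (NumPy arrays, for example) take precedence, and the pipe will not fire.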

Answered By: Sylvain Leroux

Answer #5:

If you just want this for personal scripting, you might want to consider using Coconut instead of Python.

Coconut is a superset of Python. You could therefore use Coconut’s pipe operator |>, while completely ignoring the rest of the Coconut language.

For example:

def addone(x):
    return x + 1

3 |> addone

compiles to

# lots of auto-generated header junk

# Compiled Coconut: -----------------------------------------------------------

def addone(x):
    return x + 1

(addone)(3)

Answered By: shadowtalker

Answer #6:

There is also the dfply module. You can find more information at:

https://github.com/kieferk/dfply

Some examples are:

from dfply import *
diamonds >> group_by('cut') >> row_slice(5)
diamonds >> distinct(X.color)
diamonds >> filter_by(X.cut == 'Ideal', X.color == 'E', X.table < 55, X.price < 500)
diamonds >> mutate(x_plus_y=X.x + X.y, y_div_z=(X.y / X.z)) >> select(columns_from('x')) >> head(3)

Answered By: BigDataScientist

Answer #7:

I missed the |> pipe operator from Elixir, so I created a simple function decorator (about 50 lines of code) that reinterprets Python's >> right-shift operator as a very Elixir-like pipe at compile time, using the ast module and compile/exec:

from pipeop import pipes

def add3(a, b, c):
    return a + b + c

def times(a, b):
    return a * b

@pipes
def calc():
    print(1 >> add3(2, 3) >> times(4))  # prints 24

All it’s doing is rewriting a >> b(...) as b(a, ...).
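
To give a feel for that rewrite, here is a minimal sketch using only the standard ast module. It is not the pipeop source: PipeRewriter and pipe_eval are names invented for this example, and it works on an expression string rather than on a decorated function.

```python
import ast

class PipeRewriter(ast.NodeTransformer):
    """Rewrite `a >> b(...)` into `b(a, ...)`."""

    def visit_BinOp(self, node):
        # Rewrite the operands first so chained pipes are handled inside-out.
        self.generic_visit(node)
        if isinstance(node.op, ast.RShift) and isinstance(node.right, ast.Call):
            call = node.right
            call.args.insert(0, node.left)  # the left operand becomes the first argument
            return call
        return node  # an ordinary shift such as `8 >> 1` is left alone

def pipe_eval(source, namespace):
    """Parse an expression, apply the rewrite, then compile and evaluate it."""
    tree = ast.parse(source, mode="eval")
    tree = ast.fix_missing_locations(PipeRewriter().visit(tree))
    return eval(compile(tree, "<pipe>", "eval"), namespace)

def add3(a, b, c):
    return a + b + c

def times(a, b):
    return a * b

result = pipe_eval("1 >> add3(2, 3) >> times(4)", {"add3": add3, "times": times})
print(result)  # prints 24
```

Because the rewrite only fires when the right-hand side is a call, plain integer shifts keep their usual meaning.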

https://pypi.org/project/pipeop/

https://github.com/robinhilliard/pipes

Answered By: Robin Hilliard

Answer #8:

You can use the sspipe library. It exposes two objects, p and px. Similar to x %>% f(y,z), you can write x | p(f, y, z), and similar to x %>% .^2, you can write x | px**2.

from sspipe import p, px
from math import sqrt

12 | p(sqrt) | px ** 2 | p(str)

Answered By: mhsekhavat
