I’m learning the Python pandas library. Coming from an R background, the indexing and selecting functions seem more complicated than they need to be. My understanding is that .loc() is only label based and .iloc() is only integer based.
Why should I ever use .loc() and .iloc() if .ix() is faster and supports integer and label access?
Please refer to the doc section Different Choices for Indexing. It states clearly when and why you should use .loc or .iloc over .ix; it comes down to being explicit about the use case:
.ix supports mixed integer and label based access. It is primarily
label based, but will fall back to integer positional access unless
the corresponding axis is of integer type. .ix is the most general and
will support any of the inputs in .loc and .iloc. .ix also supports
floating point label schemes. .ix is exceptionally useful when dealing
with mixed positional and label based hierarchical indexes.
However, when an axis is integer based, ONLY label based access and
not positional access is supported. Thus, in such cases, it’s usually
better to be explicit and use .iloc or .loc.
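To make the integer-axis case concrete, here is a minimal sketch. The Series and its values are made up for illustration; only .loc and .iloc are used, since they behave the same across pandas versions.

```python
import pandas as pd

# Hypothetical data: the integer labels are deliberately out of positional order.
s = pd.Series(["a", "b", "c"], index=[2, 0, 1])

by_label = s.loc[0]      # the row whose label is 0
by_position = s.iloc[0]  # the first row, whatever its label is

print(by_label, by_position)
```

With an integer index like this, the label 0 and the position 0 point at different rows, which is why spelling out .loc or .iloc removes the ambiguity.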
Hope this helps.
Update 22 Mar 2017
Thanks to a comment from @Alexander: pandas is deprecating
ix in 0.20; details are here.
One of the main reasons is that mixing positional and label indexing (which is effectively what
ix does) has been a significant source of problems for users.
You are expected to migrate to
loc and iloc instead; here is a link on how to convert code.
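As a rough sketch of what that conversion can look like when rows are chosen by position and the column by label (the DataFrame below is a made-up example, not from the docs):

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=list("abc"))

# Old mixed-style call this replaces: df.ix[[0, 2], "A"]

# Option 1: turn the positions into labels, then stay label based.
via_loc = df.loc[df.index[[0, 2]], "A"]

# Option 2: turn the column label into a position, then stay positional.
via_iloc = df.iloc[[0, 2], df.columns.get_loc("A")]

print(via_loc)
print(via_iloc)
```

Either form selects the same rows and column; which one reads better depends on whether the surrounding code is mostly label based or mostly positional.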