Pandas Dataframes Basics: Reshaping Data

[This article was first published on Python – Predictive Hacks, and kindly contributed to python-bloggers].

In this series of posts, we cover the basics of pandas DataFrames. pandas is one of the most widely used Python libraries for data science. This first post is about reshaping data.


pd.pivot: Spread rows into columns

Example

import pandas as pd

df = pd.DataFrame({
    "A": ['a', 'a', 'a', 'b', 'b', 'b'],
    "B": ['A', 'B', 'C', 'A', 'B', 'C'],
    "C": [4, 5, 6, 7, 8, 9]})

df
   A  B  C
0  a  A  4
1  a  B  5
2  a  C  6
3  b  A  7
4  b  B  8
5  b  C  9
df.pivot(columns='B',values='C',index='A')
B  A  B  C
A         
a  4  5  6
b  7  8  9
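
The example above works because every (A, B) pair appears exactly once. As a minimal sketch of what happens otherwise, the snippet below uses a made-up frame (`df_dup`, not part of the original example) with a repeated pair: `pivot` raises a `ValueError`, while `pivot_table` aggregates the duplicates. The choice of `sum` as the aggregation is an assumption for illustration.

```python
import pandas as pd

# Hypothetical data: the pair ('a', 'A') appears twice.
df_dup = pd.DataFrame({
    "A": ["a", "a", "a", "b"],
    "B": ["A", "A", "B", "A"],
    "C": [4, 10, 5, 7],
})

# pivot cannot decide which value belongs in the ('a', 'A') cell, so it raises.
try:
    df_dup.pivot(index="A", columns="B", values="C")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table resolves duplicates by aggregating them (here: sum).
# Cells with no observations, such as ('b', 'B'), come back as NaN.
wide = df_dup.pivot_table(index="A", columns="B", values="C", aggfunc="sum")

print(pivot_failed)
print(wide)
```

If the data may contain repeated index/column pairs, `pivot_table` with an explicit `aggfunc` is usually the safer choice.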

pd.melt: Gather columns into rows

Example

df=pd.DataFrame({'A': [4, 7], 'B': [5, 8], 'C': [6, 9]})
df
   A  B  C
0  4  5  6
1  7  8  9
df.melt()
  variable  value
0        A      4
1        A      7
2        B      5
3        B      8
4        C      6
5        C      9
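
Calling `melt()` with no arguments gathers every column. In practice you often want to keep one or more identifier columns fixed and name the output columns yourself. The sketch below assumes a hypothetical `id` column and illustrative output names (`column`, `val`); none of these are in the original example.

```python
import pandas as pd

# "id" is a hypothetical identifier column added for illustration.
df = pd.DataFrame({"id": ["x", "y"], "A": [4, 7], "B": [5, 8], "C": [6, 9]})

# id_vars stays as-is; all remaining columns are gathered into two columns
# whose names are set by var_name and value_name.
long_df = df.melt(id_vars="id", var_name="column", value_name="val")

print(long_df)
```

Each original row now contributes one output row per gathered column, so the 2 x 3 block of values becomes 6 rows.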

pd.concat: Combine DataFrames

Example

df1 = pd.DataFrame(
{"A" : [1 ,2, 3],
"B" : [4, 5, 6],
"C" : [7, 8, 9]})

df2 = pd.DataFrame(
{"A" : [10 ,11],
"B" : [12, 13],
"C" : [14, 15]})

print(df1)
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

print(df2)
    A   B   C
0  10  12  14
1  11  13  15
pd.concat([df1,df2])
    A   B   C
0   1   4   7
1   2   5   8
2   3   6   9
0  10  12  14
1  11  13  15
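
Note that the result above keeps the original row labels, so 0 and 1 each appear twice. The sketch below shows two common ways to deal with that: `ignore_index=True` renumbers the rows, and `keys=` labels each input frame instead. The key names `first` and `second` are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
df2 = pd.DataFrame({"A": [10, 11], "B": [12, 13], "C": [14, 15]})

# ignore_index=True discards the original labels and renumbers rows from 0.
renumbered = pd.concat([df1, df2], ignore_index=True)

# keys= keeps the original labels but adds an outer index level that
# records which input frame each row came from.
labeled = pd.concat([df1, df2], keys=["first", "second"])

print(renumbered)
print(labeled)
```

With `keys=`, the rows from one input can be selected again with `labeled.loc["second"]`.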

df.explode: Transform each element of a list-like into a row

Example

df = pd.DataFrame({'A': [[1, 2, 3], [4, 5, 6]]})
df
           A
0  [1, 2, 3]
1  [4, 5, 6]
df.explode('A')
   A
0  1
0  2
0  3
1  4
1  5
1  6
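
Two details of the output above are easy to miss: the index labels are repeated for each exploded element, and the exploded column has `object` dtype. A minimal sketch of handling both follows; the extra column `B` is hypothetical, and `ignore_index` assumes pandas 1.1 or later.

```python
import pandas as pd

# "B" is a hypothetical second column, added to show that the other
# columns are repeated for every exploded element.
df = pd.DataFrame({"A": [[1, 2, 3], [4, 5, 6]], "B": ["x", "y"]})

# ignore_index=True renumbers the rows instead of repeating 0 and 1.
exploded = df.explode("A", ignore_index=True)

# The exploded column is object dtype; cast it before numeric work.
exploded["A"] = exploded["A"].astype("int64")

print(exploded)
print(exploded.dtypes)
```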

Stack: Move column labels into the row index

Example

df = pd.DataFrame([[0, 1], [2, 3]],
                  index=['A', 'B'],
                  columns=['COL1', 'COL2'])
df
   COL1  COL2
A     0     1
B     2     3
df.stack()
A  COL1    0
   COL2    1
B  COL1    2
   COL2    3
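
The stacked result above is a Series with a two-level index rather than a DataFrame. As a sketch, it can be turned into an ordinary long-format table by naming the index levels and resetting the index. The names `row`, `column`, and `value` are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame([[0, 1], [2, 3]],
                  index=["A", "B"],
                  columns=["COL1", "COL2"])

# stack() returns a Series indexed by (row label, column label).
stacked = df.stack()

# Name the two index levels (hypothetical names), then move them into
# regular columns; the values go into a column called "value".
long_df = stacked.rename_axis(["row", "column"]).reset_index(name="value")

print(type(stacked).__name__)
print(long_df)
```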

Unstack: Move an index level into the columns

Example

import numpy as np

index = pd.MultiIndex.from_tuples([('A', 'col1'), ('A', 'col2'),
                                   ('B', 'col1'), ('B', 'col2')])
s = pd.Series(np.arange(1.0, 5.0), index=index)
s
A  col1    1.0
   col2    2.0
B  col1    3.0
   col2    4.0
s.unstack()
   col1  col2
A   1.0   2.0
B   3.0   4.0
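
By default, `unstack()` moves the innermost index level into the columns, which is what the example above does. The sketch below shows the `level` argument selecting the outer level instead; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", "col1"), ("A", "col2"), ("B", "col1"), ("B", "col2")]
)
s = pd.Series(np.arange(1.0, 5.0), index=index)

# Default: the innermost level (col1/col2) becomes the columns.
by_inner = s.unstack()

# level=0: the outer level (A/B) becomes the columns instead.
by_outer = s.unstack(level=0)

print(by_inner)
print(by_outer)
```

For a two-level index like this one, the two results are transposes of each other.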