Splitting list inside a Pandas Column into Separate Columns

309
June 01, 2017, at 4:16 PM

This is my "Feature" column in the pandas dataframe

Feature
Cricket:82379, Kabaddi:255, Reality:4751
Cricket:15640, Wildlife:730
LiveTV:13, Football:4129
TalkShow:658, Cricket:7690
Drama:5503, Cricket:3283, Reality:1345

and I want to make a column of Cricket and put the value 82379.

Similar to case mentioned in the below link Splitting dictionary/list inside a Pandas Column into Separate Columns

Answer 1

Suppose you have:

import pandas as pd
df = pd.DataFrame.from_dict({'Freature':[{"Cricket":82379, "Kabaddi":255, "Reality":4751},{"Cricket":15640, "Wildlife":730},{"LiveTV":13, "Football":4129},{"TalkShow":658, "Cricket":7690},{"Drama":5503, "Cricket":3283, "Reality":1345}]})
df
    Freature
0   {u'Cricket': 82379, u'Kabaddi': 255, u'Reality...
1   {u'Cricket': 15640, u'Wildlife': 730}
2   {u'LiveTV': 13, u'Football': 4129}
3   {u'TalkShow': 658, u'Cricket': 7690}
4   {u'Drama': 5503, u'Cricket': 3283, u'Reality':...

then try with:

df['Freature'].apply(pd.Series)

Output will be:

    Cricket Drama   Football    Kabaddi LiveTV  Reality TalkShow    Wildlife
0   82379.0 NaN     NaN         255.0   NaN     4751.0  NaN         NaN
1   15640.0 NaN     NaN         NaN     NaN     NaN     NaN         730.0
2   NaN     NaN     4129.0      NaN     13.0    NaN     NaN         NaN
3   7690.0  NaN     NaN         NaN     NaN     NaN     658.0       NaN
4   3283.0  5503.0  NaN         NaN     NaN     1345.0  NaN         NaN

Update:

Convert to dict:

new_df = df['Freature'].apply(pd.Series)
result = dict((column, list(new_df[column].dropna())) for column in new_df.columns)
result

Output of result will be a dict:

{'Cricket': [82379.0, 15640.0, 7690.0, 3283.0],
 'Drama': [5503.0],
 'Football': [4129.0],
 'Kabaddi': [255.0],
 'LiveTV': [13.0],
 'Reality': [4751.0, 1345.0],
 'TalkShow': [658.0],
 'Wildlife': [730.0]}

If the Freature content is string:

import pandas as pd
df = pd.DataFrame.from_dict({'Freature':["Cricket:82379, Kabaddi:255, Reality:4751","Cricket:15640, Wildlife:730","LiveTV:13, Football:4129","TalkShow:658, Cricket:7690","Drama:5503, Cricket:3283, Reality:1345"]})
df
    Freature
0   Cricket:82379, Kabaddi:255, Reality:4751
1   Cricket:15640, Wildlife:730
2   LiveTV:13, Football:4129
3   TalkShow:658, Cricket:7690
4   Drama:5503, Cricket:3283, Reality:1345

Then you can convert them to dict like this:

for i in range(len(df)):
    print(dict((e.strip().split(":")[0],int(e.strip().split(":")[1])) for e in df.iloc[i].Freature.split(",")))

It will print all converted dict:

{'Cricket': 82379, 'Kabaddi': 255, 'Reality': 4751}
{'Cricket': 15640, 'Wildlife': 730}
{'LiveTV': 13, 'Football': 4129}
{'TalkShow': 658, 'Cricket': 7690}
{'Drama': 5503, 'Cricket': 3283, 'Reality': 1345}
READ ALSO
read in csv and changing first value from 'ID' then write csv in python3

read in csv and changing first value from 'ID' then write csv in python3

I am trying to import a csv, change the first value in the file, and then write the file out to another csvI am doing this as excel opens the csv files as SYLK format files if 'ID' is in the first value

280
Insert Calendar using Google API (Python)

Insert Calendar using Google API (Python)

I would like to use the Google API to insert a secondary calendarI've used the Google explanation but can't seem to do it myself

427