I have a pandas data frame:
import numpy as np
import pandas as pd
X = pd.DataFrame({'col1': [1,2],
                  'col2': [4,5]})
I have a replacement dictionary:
dict_replace = {
'col1': {1:'a', 2:'b'},
'col2': {4:'c', 5:'d'}
}
I can easily replace the values in X using:
X = X.replace(dict_replace)
Resulting in:
X = pd.DataFrame({'col1': ['a','b'],
'col2': ['c','d']})
However, if a new value appears in X which is not in dict_replace (for the respective column), I want it replaced with np.nan.
For example, a data frame:
X = pd.DataFrame({'col1': [1,2,3],
'col2': [4,5,7]})
Should look like:
X = pd.DataFrame({'col1': ['a','b',np.nan],
'col2': ['c','d',np.nan]})
What are some ways I can do this without having to iterate?
You are looking for pandas.Series.map, which, though only available on columns, can be used on the whole dataframe with apply:
X = X.apply(lambda col: col.map(dict_replace[col.name]))
Output:
>>> X
col1 col2
0 a c
1 b d
2 NaN NaN
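One thing to watch: the lambda above looks up dict_replace[col.name] for every column, so a column with no entry in dict_replace would raise a KeyError. A minimal sketch of one way around that, assuming you want such columns left untouched (the helper name map_known_columns and the extra column col3 are made up for illustration):

```python
import numpy as np
import pandas as pd

X = pd.DataFrame({'col1': [1, 2, 3],
                  'col2': [4, 5, 7],
                  'col3': [8, 9, 10]})  # col3 has no entry in dict_replace

dict_replace = {
    'col1': {1: 'a', 2: 'b'},
    'col2': {4: 'c', 5: 'd'},
}

def map_known_columns(df, mapping):
    # Columns that have a mapping go through Series.map, which turns
    # values missing from the mapping into NaN; other columns are
    # returned unchanged.
    return df.apply(
        lambda col: col.map(mapping[col.name]) if col.name in mapping else col
    )

result = map_known_columns(X, dict_replace)
print(result)
```

Whether unmapped columns should pass through or be blanked out is a design choice; swap the else branch if you want different behaviour.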
Try with mask:
out = X.replace(dict_replace).mask(lambda x : x==X)
Out[215]:
col1 col2
0 a c
1 b d
2 NaN NaN
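To see why this works, here is the same one-liner split into steps (the intermediate names replaced and unchanged are just for illustration). After replace, any cell that still equals its original value was not covered by dict_replace, so mask blanks it out:

```python
import numpy as np
import pandas as pd

X = pd.DataFrame({'col1': [1, 2, 3],
                  'col2': [4, 5, 7]})

dict_replace = {
    'col1': {1: 'a', 2: 'b'},
    'col2': {4: 'c', 5: 'd'},
}

# Step 1: apply the replacements; unmapped values (3 and 7) stay as they were.
replaced = X.replace(dict_replace)

# Step 2: a cell that still equals the original was not replaced.
unchanged = replaced == X

# Step 3: mask sets those cells to NaN.
out = replaced.mask(unchanged)
print(out)
```

A caveat that follows from step 2: if dict_replace ever maps a value to itself (for example {1: 1}), that cell also compares equal to the original and would be set to NaN, so this approach assumes every replacement differs from the value it replaces.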