Pandas transforming chronological rows to columns

March 22, 2022, at 11:20 AM

I have a table of work experiences where each row represents a job in chronological order from the first job to the most recent job. For data science purposes I'm trying to create a new table based on this table that displays new job attributes and old job attributes on the same row. For example, the original table would be like:

uniqueID  personID  startdate  enddate  title    functions
1         A1        1/1/21     12/1/21  Analyst  data science
2         A1        1/1/22     12/1/22  Manager  admin

The new table would be something like this:

uniqueID  personID  new_title  new_function  old_title  old_function
1         A1        Analyst    data science  nan        nan
2         A1        Manager    admin         Analyst    data science

I tried to use some groupby variations but haven't been able to get this result.

Answer 1

If I understand correctly, you're looking for a per-person shift, which moves each person's previous job onto the row of their next job:

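cols = ['title', 'functions']
# Within each person, shift the job attributes down one row so each row sees the previous job
df[['old_' + c for c in cols]] = df.groupby('personID')[cols].shift(1)
# Drop the date columns and mark the current job's attributes as "new_"
df = df.drop(['startdate', 'enddate'], axis=1).rename({c: 'new_' + c for c in cols}, axis=1)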

Output:

>>> df
   uniqueID personID new_title new_functions old_title old_functions
0         1       A1   Analyst  data science       NaN           NaN
1         2       A1   Manager         admin   Analyst  data science
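
The shift only gives the right "old" values if each person's rows are already in chronological order. Here is a minimal self-contained sketch of the same idea that builds the sample table from the question and sorts it first. The sort step, the date format (month/day/two-digit year), and the deliberately shuffled input rows are assumptions added for illustration, not part of the answer above.

```python
import pandas as pd

# Sample rows from the question, deliberately given out of order (assumption for illustration)
df = pd.DataFrame({
    "uniqueID": [2, 1],
    "personID": ["A1", "A1"],
    "startdate": ["1/1/22", "1/1/21"],
    "enddate": ["12/1/22", "12/1/21"],
    "title": ["Manager", "Analyst"],
    "functions": ["admin", "data science"],
})

# Assumption: dates are month/day/two-digit-year strings
df["startdate"] = pd.to_datetime(df["startdate"], format="%m/%d/%y")

# Put each person's jobs in chronological order before shifting
df = df.sort_values(["personID", "startdate"]).reset_index(drop=True)

cols = ["title", "functions"]

# Previous job's attributes per person; the first job of each person gets NaN
old = df.groupby("personID")[cols].shift(1).add_prefix("old_")

result = (
    df.drop(columns=["startdate", "enddate"])
      .rename(columns={c: "new_" + c for c in cols})
      .join(old)
)
print(result)
```

Building the "old_" columns separately and joining them on the index gives the same table as the in-place assignment, while leaving the original columns untouched until the final step.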