@vshets - I already had a look at the link, but I could not use a similar approach: if I break the data into chunks, I will have to do the transformation manually, which is a lot more processing than this. But yes, I would definitely be interested if there is a way to avoid pivot and use an alternative.
@jeremy - The dataset is a text file in which each line contains one triplet (field1, field2, field3). Overall it has:
• 1,019,318 unique items in field1
• 384,546 unique items in field2
So a sample of the data would look like this:
abc s1 9
xyz s2 1
pqr s3 189
abc s2 3
And I need to transform it to the following:
       s1     s2     s3
abc    9      3      NaN
xyz    NaN    1      NaN
pqr    NaN    NaN    189
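To make the target shape concrete, here is a minimal sketch that runs the same transformation on just the four sample rows. It assumes the fields are whitespace-separated as shown in the sample (the real file may be tab-separated), and the in-memory string stands in for the actual file.

```python
import io

import pandas as pd

# The four sample rows from above; assumed whitespace-separated.
sample = "abc s1 9\nxyz s2 1\npqr s3 189\nabc s2 3\n"

df = pd.read_csv(
    io.StringIO(sample),
    sep=r"\s+",
    header=None,
    names=["field1", "field2", "field3"],
)

# One row per field1 value, one column per field2 value;
# (field1, field2) pairs that never occur become NaN.
M = df.pivot(index="field1", columns="field2", values="field3")
print(M)
```

Note that pandas sorts the row labels, so the rows may come out in a different order than in the hand-written table.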
The commands I am currently using are:
df = pd.read_table(path, header=None, names=('field1','field2', 'field3'), nrows=300000)
M = df.pivot_table(index='field1', columns='field2', values='field3')
I also tried M = df.pivot(index='field1', columns='field2'), and the result is the same.
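For scale: a dense result over the full data would have 1,019,318 x 384,546, or roughly 3.9e11, cells, which is about 3 TB as float64, so the dense pivot is probably the real problem. One alternative that might avoid it is to build a sparse matrix directly from the triplets. The following is only a sketch under assumptions: it uses scipy.sparse (not part of my current code), it is shown on the sample rows rather than the full file, and it has two behaviours that differ from pivot_table. Missing cells are implicit zeros rather than NaN, and duplicate (field1, field2) pairs are summed rather than averaged.

```python
import io

import pandas as pd
from scipy.sparse import coo_matrix

# Sample rows stand in for the real file; assumed whitespace-separated.
sample = "abc s1 9\nxyz s2 1\npqr s3 189\nabc s2 3\n"
df = pd.read_csv(
    io.StringIO(sample),
    sep=r"\s+",
    header=None,
    names=["field1", "field2", "field3"],
)

# Map each distinct label to an integer code (0..n-1).
rows = df["field1"].astype("category")
cols = df["field2"].astype("category")

# Build the matrix from (value, (row_code, col_code)) triplets.
# Only the cells that actually occur are stored.
S = coo_matrix(
    (
        df["field3"].to_numpy(),
        (rows.cat.codes.to_numpy(), cols.cat.codes.to_numpy()),
    ),
    shape=(len(rows.cat.categories), len(cols.cat.categories)),
).tocsr()

# Keep the labels so a (field1, field2) pair can be looked up by name.
row_labels = rows.cat.categories
col_labels = cols.cat.categories
print(S[row_labels.get_loc("abc"), col_labels.get_loc("s2")])
```

The row and column labels live outside the matrix here, so any later step that needs names has to go through row_labels and col_labels.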
Let me know if any other details are needed.