Looping through data in a CSV file in order to output '1' and '0' to a text file (Python)

January 31, 2018, at 5:54 PM

I have recently started learning Python and have run into a problem while formatting some data for a project I am working on. I have managed to read a CSV file as input, and I am now trying to go through that data and write '1's and '0's to a text file based on it.

I have the following code so far:

data = {} 
productIds = [] 
for row in reader:
    productIds.append(row['productCode']) 
    if row['basketID'] not in data:
        data[row['basketID']] = [row['productCode']]
    else:
        data[row['basketID']].append(row['productCode'])
productIds = sorted(set(productIds))
for item in productIds:
    txtFile.write("%s " % item)
txtFile.write('\n')
for key in data: # Will loop through each basket
    for value in data[key]: #Loop through each product in basket
        for i in productIds: # Go through list of available products
            if value == i: 
                txtFile.write('1 ')
            else:
                txtFile.write('0 ')
    txtFile.write('\n')

The result:

23 24 25 #Products 
1  0  0  0 1 0 0 0 1 #Basket 1
1  0  0              #Basket 2
1  0  0              #Basket 3
0  0  1              #Basket 4
0  1  0  0 0 1       #Basket 5

Expected result:

23 24 25 #Products
1  1  1  #Basket 1  
1  0  0  #Basket 2  
1  0  0  #Basket 3  
0  0  1  #Basket 4
0  1  1  #Basket 5

CSV File:

basketID productCode 
1        23  
1        24  
1        25  
2        23  
3        23  
4        25  
5        24  
5        25  

I believe it is going wrong when looping through the product list against the same product, but I am not sure how else to achieve this.

Answer 1

The nested loops write one full row of flags for every product in a basket, so a basket with three products produces nine values. Store each basket as a set, loop over the product list once per basket, and test membership instead:

data = {}
productIds = []
for row in reader:
    productIds.append(row['productCode'])
    if row['basketID'] not in data:
        # {x} builds a one-element set; set(x) would split the string into characters
        data[row['basketID']] = {row['productCode']}
    else:
        data[row['basketID']].add(row['productCode'])
productIds = sorted(set(productIds))
for item in productIds:
    txtFile.write("%s " % item)
txtFile.write('\n')
for key in data: # Will loop through each basket
    for i in productIds: # Go through list of available products once per basket
        if i in data[key]: # Is this product in the basket?
            txtFile.write('1 ')
        else:
            txtFile.write('0 ')
    txtFile.write('\n')
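
For a version that runs on its own, here is a minimal sketch of the same idea. It assumes the file is comma-separated with the header basketID,productCode (the sample in the question is shown whitespace-aligned, so the delimiter is a guess), and the names SAMPLE_CSV and build_rows are made up for illustration. It reads from an in-memory string rather than a real file so that it can be run as-is.

```python
import csv
import io

# Sample data from the question, ASSUMED here to be comma-separated.
SAMPLE_CSV = """basketID,productCode
1,23
1,24
1,25
2,23
3,23
4,25
5,24
5,25
"""


def build_rows(csv_file):
    """Return (product_ids, rows); each row holds '1'/'0' flags for one basket."""
    # basketID -> set of product codes; dicts keep insertion order (Python 3.7+)
    baskets = {}
    for row in csv.DictReader(csv_file):
        baskets.setdefault(row['basketID'], set()).add(row['productCode'])
    # Codes are compared as strings, so this sort is lexicographic.
    product_ids = sorted(set().union(*baskets.values()))
    # One pass over the product list per basket, testing set membership.
    rows = [['1' if code in products else '0' for code in product_ids]
            for products in baskets.values()]
    return product_ids, rows


product_ids, rows = build_rows(io.StringIO(SAMPLE_CSV))
lines = [' '.join(product_ids)] + [' '.join(row) for row in rows]
print('\n'.join(lines))
```

With the sample data this prints the header 23 24 25 followed by one row per basket (1 1 1, 1 0 0, 1 0 0, 0 0 1, 0 1 1), which matches the expected result in the question. To write to a file instead, open it with open(..., 'w') and write the joined lines rather than printing them.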