python - Find duplicates, add to variable and remove
I have a script that writes sales values as separate lines in a file, and the ultimate goal is to save the data to a database. The problem I'm running into is that there are duplicate entries for the same salesperson, date, product, price, and quantity.
My code has written this to the file:
    john 07-15-2016 tool belt $100 2
    sara 07-15-2016 hammer $100 3
    john 07-15-2016 tool belt $100 2
    john 07-15-2016 tool belt $100 2
    sara 07-15-2016 hammer $100 3
How do I remove the duplicates and add their quantities together? I.e., the output should be:
    john 07-15-2016 tool belt $100 6
    sara 07-15-2016 hammer $100 6
I've used Counter, but it doesn't catch multiple instances, nor can I find a way to add the two together.
Any help would be appreciated.
Script:
    # s, salesperson and fn are defined earlier in the script (not shown)
    for line in s:
        # strip dollar signs and thousands separators
        var = re.compile(r'(\$)', re.M)
        line = re.sub(var, "", line)
        var = re.compile(r'(\,)', re.M)
        line = re.sub(var, "", line)
        line = line.rstrip('\n')
        line = line.split("|")
        if line[0] != '':
            salesperson = str(salesperson)
            date = dt.now()
            t = line[0].split()
            print t
            t = str(t[0])
            # the time field may or may not include seconds
            try:
                s = dt.strptime(t, "%H:%M:%S")
            except ValueError:
                s = dt.strptime(t, "%H:%M")
            s = s.time()
            date = dt.combine(date, s)
            date = str(date)
            price = line[1]
            quantity = line[2]
            fn.write("%s %s %s %s \n" % (salesperson, date, price, quantity))
    fn.close()
sample.csv
    john 07-15-2016 tool belt $100 2
    sara 07-15-2016 hammer $100 3
    john 07-15-2016 tool belt $100 2
    john 07-15-2016 tool belt $100 2
    sara 07-15-2016 hammer $100 3
test.py
    with open("sample.csv") as inputs:
        mydict = dict()
        for line in inputs:
            elements = line.strip().split()
            # everything except the last field identifies the sale
            key = " ".join(elements[0: len(elements) - 1])
            # the last field is the quantity; add it to the running total
            mydict[key] = mydict.get(key, 0) + int(elements[-1])

        # iterate over the dictionary and print out the result
        for key, value in mydict.iteritems():
            print "{0} {1}".format(key, value)
I use a dictionary: split each line, use the first len(elements) - 1 elements as the key, and add the last element to that key's running total while iterating over the lines.
mydict.get(key, 0) returns the stored value if the key exists in the dictionary; otherwise it returns the default value 0.
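To make the role of the default value concrete, here is a minimal sketch of the same accumulation on a few in-memory lines. The list `lines` and the name `totals` are made up for illustration, and `rsplit(None, 1)` is used as a shorter way to separate the quantity from the rest of the line; it is not what the answer's code does, although it should produce the same key when fields are separated by single spaces, as in the sample.

```python
# Hypothetical in-memory lines standing in for the contents of sample.csv
lines = [
    "john 07-15-2016 tool belt $100 2",
    "sara 07-15-2016 hammer $100 3",
    "john 07-15-2016 tool belt $100 2",
]

totals = {}
for line in lines:
    # Split once from the right: everything before the last field is the key,
    # the last field is the quantity
    key, quantity = line.rsplit(None, 1)
    # get() falls back to 0 the first time a key is seen, so no KeyError occurs
    totals[key] = totals.get(key, 0) + int(quantity)

print(totals.get("john 07-15-2016 tool belt $100"))  # 4
print(totals.get("no such key", 0))                  # 0
```

Without the default, the first `totals[key]` lookup for a new key would raise a KeyError, which is why `get(key, 0)` is used instead of plain indexing.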
Result of running python2.7 test.py:
    sara 07-15-2016 hammer $100 6
    john 07-15-2016 tool belt $100 6
Therefore, in your case you need:
    elements = line.strip().split()
    key = " ".join(elements[0: len(elements) - 1])
    mydict[key] = mydict.get(key, 0) + int(elements[-1])
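Since the question mentions Counter, the same idea can probably be expressed with collections.Counter as well: instead of counting occurrences of whole lines, add each line's quantity to the counter entry for its key. This is only a sketch under that assumption; the function name `merge_duplicates` and the list `sample` are hypothetical, and the output is sorted purely to make the order predictable.

```python
from collections import Counter


def merge_duplicates(lines):
    """Sum the trailing quantity of lines that share every other field."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Key is everything before the last field; the last field is the quantity
        key, quantity = line.rsplit(None, 1)
        # Missing keys count as 0 in a Counter, so += works on first sight
        totals[key] += int(quantity)
    # Sort so the result does not depend on dictionary ordering
    return ["{0} {1}".format(key, total) for key, total in sorted(totals.items())]


# Hypothetical stand-in for the lines of sample.csv
sample = [
    "john 07-15-2016 tool belt $100 2",
    "sara 07-15-2016 hammer $100 3",
    "john 07-15-2016 tool belt $100 2",
    "john 07-15-2016 tool belt $100 2",
    "sara 07-15-2016 hammer $100 3",
]

for merged in merge_duplicates(sample):
    print(merged)
```

Passing an open file object instead of `sample` should work the same way, since the function only iterates over lines.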