Python Unique Lines
Hi, I have a text file in the following format:
Sam
John
Peter
Sam
Peter
John
I want to extract the unique records from the file using a REGULAR EXPRESSION, such as:
Sam
John
Peter
Solution 1:
Use set:
In [1]: name="""
...: Sam
...: John
...: Peter
...: Sam
...: Peter
...: John"""
In [2]: print(name)
Sam
John
Peter
Sam
Peter
John
In [3]: a=name.split()
In [4]: a
Out[4]: ['Sam', 'John', 'Peter', 'Sam', 'Peter', 'John']
In [5]: set(a)
Out[5]: {'John', 'Peter', 'Sam'}
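Note that a set does not keep the names in their original order. If the order of first appearance matters, one possible sketch is shown here. It assumes Python 3.7 or later, where dict preserves insertion order, and the variable name unique is only illustrative.

```python
# Sketch: de-duplicate while keeping first-appearance order.
# Assumes Python 3.7+, where dict preserves insertion order.
name = """
Sam
John
Peter
Sam
Peter
John"""

a = name.split()                 # ['Sam', 'John', 'Peter', 'Sam', 'Peter', 'John']
unique = list(dict.fromkeys(a))  # dict keys are unique and keep first-seen order
print(unique)                    # ['Sam', 'John', 'Peter']
```

Keep in mind that split() breaks on any whitespace, so a record such as "Mary Ann" would become two entries; name.splitlines() may be a better fit if records can contain spaces.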
Solution 2:
Don't listen to them!
Of course this can be done in Regex. Never mind that they have the correct, linear-time
solution that's readable and concise, or that any Regex solution will be at least quadratic-time and about as readable as a drunkard's scrawl.
What matters is that it's Regex, and Regex must be good. Here you go:
import re
re.findall(r"""(?ms)^([^\n]*)$(?!.*^\1$)""", target_string)
#>>> ['Sam', 'Peter', 'John']
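For anyone who wants to run this, here is a self-contained sketch. The sample target_string and the helper name unique_lines_regex are assumptions added for illustration; the pattern itself is the one from this answer. A line is kept only when no identical line appears after it, which is why the result follows the order of last occurrences rather than first.

```python
import re

# Pattern from the answer above:
#   (?ms)       MULTILINE (^ and $ match at line boundaries) and DOTALL (. matches \n)
#   ^([^\n]*)$  capture one whole line
#   (?!.*^\1$)  reject it if an identical whole line appears later in the text
PATTERN = re.compile(r"(?ms)^([^\n]*)$(?!.*^\1$)")

def unique_lines_regex(text):
    """Return the lines that have no identical line after them."""
    return PATTERN.findall(text)

# Hypothetical sample input matching the question (no trailing newline).
target_string = "Sam\nJohn\nPeter\nSam\nPeter\nJohn"
result = unique_lines_regex(target_string)
print(result)  # ['Sam', 'Peter', 'John']
```

Because the lookahead rescans the rest of the text for every line, the running time grows roughly quadratically with the number of lines, as the answer warns.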
Solution 3:
It seems like you want to create a list by splitting the input on newlines and then removing duplicates using set(). You can then convert the result back to a list using list(). It looks something like the code below. The strip() call is used to remove the newline characters.
# The with block closes the file once the lines have been read.
with open('names.txt') as f:
    names = list(set(line.strip() for line in f))
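The snippet above returns the names in arbitrary order and keeps a blank line as an empty string. If that matters, a minimal sketch along these lines may help. The helper name unique_lines is hypothetical, and io.StringIO is used only as a stand-in for names.txt so the example runs on its own.

```python
import io

def unique_lines(lines):
    """Yield each non-blank line once, in order of first appearance."""
    seen = set()
    for line in lines:
        name = line.strip()  # drop the trailing newline and surrounding spaces
        if name and name not in seen:
            seen.add(name)
            yield name

# io.StringIO stands in for the file here; with a real file you would
# pass the object returned by open('names.txt') instead.
fake_file = io.StringIO("Sam\nJohn\nPeter\nSam\nPeter\nJohn\n")
names = list(unique_lines(fake_file))
print(names)  # ['Sam', 'John', 'Peter']
```

Since the function reads one line at a time, it does not need to load the whole file into memory first.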