Python Find All File Names In Folder That Follows A Pattern
Solution 1:
import glob
glob.glob('index_[0-9]*.csv')
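This will match any filename that starts with index_ followed by at least one digit and ends with .csv; anything may appear between that first digit and the extension.
John's solution matches exactly 8 digits.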
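To make the difference between the two patterns concrete, here is a small sketch that applies them with fnmatch, the module glob relies on for wildcard matching. The file names are made up for illustration, and no directory is read.

```python
import fnmatch

# Hypothetical file names, used only for illustration.
names = [
    "index_20091101.csv",
    "index_2009.csv",
    "index_1abc.csv",
    "index_abc.csv",
]

# One digit after the prefix, then anything, then the extension.
loose = "index_[0-9]*.csv"
# Exactly eight digits between the prefix and the extension.
strict = "index_" + "[0-9]" * 8 + ".csv"

# fnmatchcase avoids platform-dependent case folding.
loose_hits = [n for n in names if fnmatch.fnmatchcase(n, loose)]
strict_hits = [n for n in names if fnmatch.fnmatchcase(n, strict)]

print(loose_hits)   # every name with at least one digit after "index_"
print(strict_hits)  # only the name with exactly eight digits
```

The loose pattern accepts index_1abc.csv because * matches any run of characters, not only digits.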
Solution 2:
If you want to match exactly 8 digits with glob, you need to write them all out like this:
import glob
glob.glob('index_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].csv')
Help on function glob in module glob:
glob(pathname) Return a list of paths matching a pathname pattern.
The pattern may contain simple shell-style wildcards a la fnmatch. However, unlike fnmatch, filenames starting with a dot are special cases that are not matched by '*' and '?' patterns.
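The note about leading dots can be checked with a short experiment. This is only a sketch: it creates two throwaway files in a temporary directory (the names are invented for the demonstration) and compares glob.glob with fnmatch.filter.

```python
import fnmatch
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Two empty files: one hidden (leading dot), one regular.
    for name in (".hidden.csv", "visible.csv"):
        open(os.path.join(tmp, name), "w").close()

    # glob.escape protects any wildcard characters in the temp path itself.
    pattern = os.path.join(glob.escape(tmp), "*.csv")
    via_glob = sorted(os.path.basename(p) for p in glob.glob(pattern))

    # fnmatch has no special rule for names that start with a dot.
    via_fnmatch = sorted(fnmatch.filter(os.listdir(tmp), "*.csv"))

print(via_glob)
print(via_fnmatch)
```

If hidden files matter for your use case, filtering os.listdir yourself is one way to include them.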
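If you want a real regex, use os.listdir and filter the result. Escape the dot and anchor the end of the pattern, otherwise names such as index_1xcsv.bak would also match. [0-9]+ requires at least one digit; use [0-9]{8} for exactly eight.
import os
import re
[x for x in os.listdir('.') if re.match(r'index_[0-9]+\.csv$', x)]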
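As a sketch of the regex route for the exactly-8-digits case, the pattern can be compiled once and applied with re.fullmatch, which requires the whole name to match. The helper name filter_index_files and the sample names are illustrative assumptions, not part of the original answer.

```python
import re

# Literal prefix, exactly eight ASCII digits, literal ".csv".
INDEX_RE = re.compile(r"index_[0-9]{8}\.csv")

def filter_index_files(names):
    """Return the names that consist of index_ + 8 digits + .csv and nothing else."""
    return [n for n in names if INDEX_RE.fullmatch(n)]

# Hypothetical names, used only for illustration.
sample = [
    "index_20091101.csv",
    "index_2009.csv",
    "index_20091101xcsv",
    "index_20091101.csv.bak",
]
matched = filter_index_files(sample)
print(matched)
```

With a real directory, you would pass os.listdir('.') instead of the sample list.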
Solution 3:
I would take the following approach. You can define a simple file filter factory.
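import os
import time

def make_time_filter(start, end, time_format, file_format='index_{time_format}.csv'):
    # Parse the inclusive range boundaries once, up front.
    t_start = time.strptime(start, time_format)
    t_end = time.strptime(end, time_format)
    # Embed the date format in the file name pattern, e.g. 'index_%Y%m%d.csv'.
    ft_fmt = file_format.format(time_format=time_format)
    def filt(fname):
        try:
            return t_start <= time.strptime(fname, ft_fmt) <= t_end
        except ValueError:
            # The name does not follow the pattern, so it is not in range.
            return False
    return filt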
Now you can simply make a predicate that keeps only the date range you want:
time_filt = make_time_filter('20091101', '20091201', '%Y%m%d')
Then pass this to filter (in Python 3, filter returns a lazy iterator, so wrap it in list() if you need a list):
list(filter(time_filt, os.listdir(your_dir)))
Or put it in a comprehension or generator expression:
(fname for fname in os.listdir(your_dir) if time_filt(fname))
A regex would be more general, but you don't need one in your case, since your file names all follow a simple pattern that you know must contain a date. For more on the time module, see the docs.
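The approach works because time.strptime accepts literal text in its format string, so the whole file name can be parsed in one call. Below is a minimal sketch of that idea on a hard-coded list; the helper names parse_index_date and in_range and the sample file names are hypothetical.

```python
import time

def parse_index_date(fname, pattern="index_%Y%m%d.csv"):
    """Parse the date embedded in fname, or return None if fname does not fit."""
    try:
        return time.strptime(fname, pattern)
    except ValueError:
        return None

start = time.strptime("20091101", "%Y%m%d")
end = time.strptime("20091201", "%Y%m%d")

def in_range(fname):
    parsed = parse_index_date(fname)
    # struct_time values compare like tuples, so chained comparison works.
    return parsed is not None and start <= parsed <= end

# Hypothetical names, used only for illustration.
names = [
    "index_20091031.csv",
    "index_20091115.csv",
    "index_20091201.csv",
    "notes.txt",
]
selected = [n for n in names if in_range(n)]
print(selected)
```

Both boundaries are inclusive here, so a file dated exactly on the end date is kept.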
Solution 4:
This will get you where you want to be and allows you to provide start and end dates:
import os
import re
import datetime

start_date = datetime.datetime.strptime('20071102', '%Y%m%d')
end_date = datetime.datetime.strptime('20071103', '%Y%m%d')

files = os.listdir('.')
files_in_range = []
for fl in files:
    # Match once and reuse the captured digits.
    m = re.match(r'index_(\d+)\.csv$', fl)
    if m:
        # Note: strptime raises ValueError if the digits are not a valid date.
        date = datetime.datetime.strptime(m.group(1), '%Y%m%d')
        if start_date <= date <= end_date:
            files_in_range.append(fl)
print(files_in_range)