Unable To Extract Tables
Beginner here. I'm having trouble extracting data from the second (Team Statistics) and third (Team Analytics 5-on-5) tables on this page: https://www.hockey-reference.c
Solution 1:
You get this error because read_html() returns a list with a single element, and that element is at position 0. Instead of
df = df_list[1]
use
df = df_list[0]
The site gives you one combined table for all teams, so to extract the second and third divisions, slice it with the loc[] accessor (label-based, with both endpoints included):
east_division = df.loc[9:17]
north_division = df.loc[18:25]
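As a minimal sketch of the two points above, the snippet below uses a made-up 26-row frame in place of the real standings table (the team values and the one-element df_list are placeholders, not data from the site). It shows that the list is indexed from 0 and that loc[] slices by label and includes both endpoints.

```python
import pandas as pd

# Hypothetical stand-in for the combined table returned by read_html():
# 26 rows with the default integer index 0..25.
df = pd.DataFrame({"team": [f"team_{i}" for i in range(26)]})

# read_html() returns a list of DataFrames; here we mimic a one-element list.
df_list = [df]

# The only table lives at position 0; df_list[1] would raise IndexError.
df = df_list[0]

# loc[] slices by index label, and both endpoints are included.
east_division = df.loc[9:17]
north_division = df.loc[18:25]

print(len(east_division), len(north_division))
```

If the index were not the default 0..N-1 range, iloc[] (position-based, end-exclusive) would probably be the safer choice.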
Solution 2:
Pass the URL directly to pandas.read_html, which returns a list of DataFrames:
import pandas as pd
df_list = pd.read_html('https://www.hockey-reference.com/leagues/NHL_2021.html')
Solution 3:
The tables are in fact in the HTML, but inside comments. Use BeautifulSoup to pull out the comments and parse those tables as well. The code below collects all tables, both commented and uncommented, into a list. It is then just a matter of pulling out the tables you want by index, in this case indices 1 and 2.
from io import StringIO
import pandas as pd
import requests
from bs4 import BeautifulSoup, Comment
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36'}
url = "https://www.hockey-reference.com/leagues/NHL_2021.html"
# Get all uncommented tables
tables = pd.read_html(url, header=1)
# Get the HTML source
response = requests.get(url, headers=headers)
# Create a soup object from the HTML
soup = BeautifulSoup(response.content, 'html.parser')
# Get the comments in the HTML
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
# Iterate through each comment, parse the table if one is found,
# and append it to the tables list
for each in comments:
    if 'table' in str(each):
        try:
            # Wrap the comment text in StringIO so read_html treats it as HTML content
            tables.append(pd.read_html(StringIO(str(each)), header=1)[0])
        except Exception:
            # Skip comments that do not contain a parsable table
            continue
# Any per-table cleanup (e.g. dropping repeated 'Rk' header rows or renaming
# 'Unnamed: 1' to 'Team') has to be applied to each DataFrame, not to the tables list.
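To see the comment technique in isolation, without a network request, here is a small sketch that uses the standard library's html.parser in place of BeautifulSoup (a deliberate substitution; the CommentCollector class and the inline HTML are made up for illustration). It shows that a table wrapped in a comment arrives as comment text rather than as a table element, which is why such tables have to be pulled out of the comments and parsed separately.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collect the text of every HTML comment in a document."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # Called once per <!-- ... --> block with the text between the markers.
        self.comments.append(data)


# Made-up page: one visible table, one table hidden in a comment, one plain note.
html = """
<div><table id="visible"><tr><td>1</td></tr></table></div>
<div><!--
<table id="hidden"><tr><td>2</td></tr></table>
--></div>
<!-- just a note -->
"""

collector = CommentCollector()
collector.feed(html)

# Keep only the comments that look like they contain a table.
table_comments = [c for c in collector.comments if "<table" in c]
print(len(collector.comments), len(table_comments))
```

Each string in table_comments is itself a fragment of HTML, so it can be handed to a table parser in the same way as the full page.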
for table in tables[1:3]:
    print(table)
Output:
Rk Unnamed: 1 AvAge GP W ... S S% SA SV% SO
0 1.0 New York Islanders 29.0 28 18 ... 841 9.8 767 0.920 5
1 2.0 Tampa Bay Lightning 28.3 26 19 ... 798 12.2 725 0.919 3
2 3.0 Florida Panthers 28.1 27 18 ... 918 10.0 840 0.910 0
3 4.0 Toronto Maple Leafs 28.9 29 19 ... 883 11.2 828 0.909 2
4 5.0 Carolina Hurricanes 27.2 26 19 ... 816 10.9 759 0.912 3
5 6.0 Washington Capitals 30.4 27 17 ... 768 12.0 808 0.895 0
6 7.0 Vegas Golden Knights 29.1 25 18 ... 752 11.0 691 0.920 4
7 8.0 Edmonton Oilers 28.4 30 18 ... 945 10.6 938 0.907 2
8 9.0 Winnipeg Jets 28.0 27 17 ... 795 11.4 856 0.910 1
9 10.0 Pittsburgh Penguins 28.1 27 17 ... 779 11.0 784 0.899 1
10 11.0 Chicago Blackhawks 27.2 29 14 ... 863 10.1 997 0.910 2
11 12.0 Minnesota Wild 28.8 25 16 ... 764 10.3 723 0.913 2
12 13.0 St. Louis Blues 28.2 28 14 ... 836 10.4 835 0.892 0
13 14.0 Boston Bruins 28.8 25 14 ... 772 8.8 665 0.913 2
14 15.0 Colorado Avalanche 26.8 25 15 ... 846 8.7 622 0.905 4
15 16.0 Montreal Canadiens 28.8 27 12 ... 890 9.7 782 0.909 0
16 17.0 Philadelphia Flyers 27.5 25 13 ... 699 11.7 753 0.892 3
17 18.0 Calgary Flames 28.0 28 13 ... 838 8.9 845 0.904 3
18 19.0 Los Angeles Kings 27.7 26 11 ... 748 10.3 814 0.910 2
19 20.0 Vancouver Canucks 27.7 31 13 ... 951 8.8 1035 0.903 1
20 21.0 Columbus Blue Jackets 27.0 29 11 ... 839 9.3 902 0.895 1
21 22.0 Arizona Coyotes 28.5 27 12 ... 689 9.7 851 0.907 1
22 23.0 San Jose Sharks 29.3 25 11 ... 749 9.5 800 0.890 1
23 24.0 New York Rangers 25.7 26 11 ... 773 9.2 746 0.906 2
24 25.0 Nashville Predators 28.9 28 11 ... 880 7.4 837 0.885 1
25 26.0 Anaheim Ducks 28.4 29 8 ... 804 7.7 852 0.891 3
26 27.0 Dallas Stars 28.3 23 8 ... 657 10.2 626 0.904 3
27 28.0 Detroit Red Wings 29.4 28 8 ... 785 8.0 870 0.891 0
28 29.0 Ottawa Senators 26.4 30 9 ... 942 8.2 960 0.874 0
29 30.0 New Jersey Devils 26.2 24 8 ... 708 8.5 741 0.896 2
30 31.0 Buffalo Sabres 27.4 26 6 ... 728 7.7 804 0.893 0
31 NaN League Average 28.1 27 13 ... 808 9.8 808 0.902 2
[32 rows x 32 columns]
Rk Unnamed: 1 S% SV% ... HDGF HDC% HDGA HDCO%
0 1 New York Islanders 8.3 0.931 ... 11 12.2 11 11.8
1 2 Tampa Bay Lightning 8.7 0.933 ... 11 14.9 6 6.3
2 3 Florida Panthers 7.9 0.926 ... 15 14.4 12 17.6
3 4 Toronto Maple Leafs 8.8 0.933 ... 16 13.4 8 11.1
4 5 Carolina Hurricanes 7.5 0.932 ... 12 12.8 7 9.3
5 6 Washington Capitals 9.8 0.919 ... 10 10.9 5 7.8
6 7 Vegas Golden Knights 9.3 0.927 ... 20 15.9 11 14.5
7 8 Edmonton Oilers 8.2 0.920 ... 9 11.3 13 9.8
8 9 Winnipeg Jets 8.5 0.926 ... 15 15.0 8 7.8
9 10 Pittsburgh Penguins 8.8 0.922 ... 10 14.5 15 13.5
10 11 Chicago Blackhawks 7.3 0.925 ... 10 10.5 14 15.1
11 12 Minnesota Wild 9.9 0.930 ... 16 14.2 8 11.9
12 13 St. Louis Blues 8.4 0.914 ... 15 18.1 15 15.8
13 14 Boston Bruins 6.6 0.922 ... 5 7.4 11 12.2
14 15 Colorado Avalanche 6.7 0.916 ... 8 8.1 8 13.3
15 16 Montreal Canadiens 7.8 0.935 ... 15 12.0 8 11.3
16 17 Philadelphia Flyers 10.1 0.907 ... 18 15.9 9 12.9
17 18 Calgary Flames 7.6 0.929 ... 6 6.9 8 9.2
18 19 Los Angeles Kings 7.5 0.925 ... 11 13.1 8 9.8
19 20 Vancouver Canucks 7.3 0.919 ... 17 13.2 20 17.4
20 21 Columbus Blue Jackets 8.1 0.918 ... 5 9.6 15 13.6
21 22 Arizona Coyotes 7.7 0.924 ... 11 14.7 14 12.8
22 23 San Jose Sharks 8.1 0.909 ... 12 14.6 16 14.0
23 24 New York Rangers 7.8 0.921 ... 17 14.0 8 12.7
24 25 Nashville Predators 5.7 0.918 ... 5 10.6 11 13.4
25 26 Anaheim Ducks 7.4 0.909 ... 12 13.3 25 16.8
26 27 Dallas Stars 7.4 0.929 ... 11 13.3 5 12.8
27 28 Detroit Red Wings 7.5 0.923 ... 13 15.3 12 16.7
28 29 Ottawa Senators 7.1 0.894 ... 7 8.6 20 14.3
29 30 New Jersey Devils 7.2 0.923 ... 10 14.3 12 13.2
30 31 Buffalo Sabres 5.8 0.911 ... 6 8.2 16 14.0
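The printed tables still show the placeholder column name Unnamed: 1. A small helper along these lines could be applied to each DataFrame in the list to rename that column and drop any repeated header rows. This is only a sketch: the name clean_table and the sample frame are illustrative assumptions, not part of the original answer or real data from the site.

```python
import pandas as pd


def clean_table(df):
    """Rename the placeholder team column and drop repeated header rows."""
    # The blank header cell above the team names becomes 'Unnamed: 1'.
    df = df.rename(columns={"Unnamed: 1": "Team"})
    # A repeated header row carries the literal string 'Rk' in the Rk column.
    if "Rk" in df.columns:
        df = df[df["Rk"].ne("Rk")]
    return df.reset_index(drop=True)


# Made-up sample with one repeated header row in the middle.
sample = pd.DataFrame({
    "Rk": ["1", "2", "Rk", "3"],
    "Unnamed: 1": ["Team A", "Team B", "Unnamed: 1", "Team C"],
    "GP": ["28", "26", "GP", "27"],
})

cleaned = clean_table(sample)
print(cleaned)
```

Assuming tables is the list built by the scraping code, the helper could be applied with a list comprehension such as tables = [clean_table(t) for t in tables].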