Unable To Extract Tables

Beginner here. I'm having trouble extracting data from the second (Team Statistics) and third (Team Analytics 5-on-5) tables on this page: https://www.hockey-reference.c

Solution 1:

You get this error because read_html() returns a list with a single element, and that element is at position 0.

Instead of

df = df_list[1]

use

df = df_list[0]
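
To make the list behaviour concrete, here is a minimal sketch that parses a small inline HTML string instead of the real page. The HTML snippet is a made-up stand-in, and the example assumes a parser backend for read_html (such as lxml) is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical one-table page standing in for the real site.
html = """
<table>
  <thead>
    <tr><th>Team</th><th>GP</th></tr>
  </thead>
  <tbody>
    <tr><td>Boston Bruins</td><td>25</td></tr>
    <tr><td>Buffalo Sabres</td><td>26</td></tr>
  </tbody>
</table>
"""

# read_html returns a list of DataFrames, one per <table> it finds.
df_list = pd.read_html(StringIO(html))
print(len(df_list))  # prints 1

# With a single table in the list, position 0 is the only valid index;
# df_list[1] would raise IndexError.
df = df_list[0]
print(df)
```

Checking len(df_list) before indexing is a quick way to see how many tables were actually parsed.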

This gives you a single table that combines all teams on the page, so to extract the rows for the second and third divisions, use the loc[] accessor:

east_division = df.loc[9:17]
north_division = df.loc[18:25]
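
Note that loc[] slices by index label and includes both endpoints, unlike ordinary Python slicing. A small sketch with a made-up 30-row frame (the team names here are placeholders, and the row ranges on the real page may differ from those above):

```python
import pandas as pd

# Hypothetical stand-in for the combined standings table.
df = pd.DataFrame({"Team": [f"Team {i}" for i in range(30)]})

# Label-based slices: both the start and the end label are included.
east_division = df.loc[9:17]
north_division = df.loc[18:25]

print(len(east_division))   # prints 9 (labels 9 through 17)
print(len(north_division))  # prints 8 (labels 18 through 25)
```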

Solution 2:

Pass the URL directly to pandas.read_html, which returns a list of DataFrames:

import pandas as pd

df_list = pd.read_html('https://www.hockey-reference.com/leagues/NHL_2021.html')

Solution 3:

The tables are in fact in the HTML, inside comments. Use BeautifulSoup to pull out the comments and parse those tables as well. The code below collects all tables, both commented and uncommented, into a list. After that it is just a matter of selecting the tables you want by index, in this case indices 1 and 2.

from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup, Comment

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36'}

url = "https://www.hockey-reference.com/leagues/NHL_2021.html"

# Get all uncommented tables
tables = pd.read_html(url, header=1)

# Get the HTML source
response = requests.get(url, headers=headers)

# Create a soup object from the HTML
soup = BeautifulSoup(response.content, 'html.parser')

# Get the comments in the HTML
comments = soup.find_all(string=lambda text: isinstance(text, Comment))

# Iterate through each comment, parse the table if one is found,
# and append it to the tables list
for each in comments:
    if 'table' in str(each):
        try:
            tables.append(pd.read_html(StringIO(str(each)), header=1)[0])
        except Exception:
            # Skip comments that mention "table" but do not parse as one
            continue

Output:

for table in tables[1:3]:
    print(table)
      Rk             Unnamed: 1  AvAge  GP   W  ...    S    S%    SA    SV%  SO
0    1.0     New York Islanders   29.0  28  18  ...  841   9.8   767  0.920   5
1    2.0    Tampa Bay Lightning   28.3  26  19  ...  798  12.2   725  0.919   3
2    3.0       Florida Panthers   28.1  27  18  ...  918  10.0   840  0.910   0
3    4.0    Toronto Maple Leafs   28.9  29  19  ...  883  11.2   828  0.909   2
4    5.0    Carolina Hurricanes   27.2  26  19  ...  816  10.9   759  0.912   3
5    6.0    Washington Capitals   30.4  27  17  ...  768  12.0   808  0.895   0
6    7.0   Vegas Golden Knights   29.1  25  18  ...  752  11.0   691  0.920   4
7    8.0        Edmonton Oilers   28.4  30  18  ...  945  10.6   938  0.907   2
8    9.0          Winnipeg Jets   28.0  27  17  ...  795  11.4   856  0.910   1
9   10.0    Pittsburgh Penguins   28.1  27  17  ...  779  11.0   784  0.899   1
10  11.0     Chicago Blackhawks   27.2  29  14  ...  863  10.1   997  0.910   2
11  12.0         Minnesota Wild   28.8  25  16  ...  764  10.3   723  0.913   2
12  13.0        St. Louis Blues   28.2  28  14  ...  836  10.4   835  0.892   0
13  14.0          Boston Bruins   28.8  25  14  ...  772   8.8   665  0.913   2
14  15.0     Colorado Avalanche   26.8  25  15  ...  846   8.7   622  0.905   4
15  16.0     Montreal Canadiens   28.8  27  12  ...  890   9.7   782  0.909   0
16  17.0    Philadelphia Flyers   27.5  25  13  ...  699  11.7   753  0.892   3
17  18.0         Calgary Flames   28.0  28  13  ...  838   8.9   845  0.904   3
18  19.0      Los Angeles Kings   27.7  26  11  ...  748  10.3   814  0.910   2
19  20.0      Vancouver Canucks   27.7  31  13  ...  951   8.8  1035  0.903   1
20  21.0  Columbus Blue Jackets   27.0  29  11  ...  839   9.3   902  0.895   1
21  22.0        Arizona Coyotes   28.5  27  12  ...  689   9.7   851  0.907   1
22  23.0        San Jose Sharks   29.3  25  11  ...  749   9.5   800  0.890   1
23  24.0       New York Rangers   25.7  26  11  ...  773   9.2   746  0.906   2
24  25.0    Nashville Predators   28.9  28  11  ...  880   7.4   837  0.885   1
25  26.0          Anaheim Ducks   28.4  29   8  ...  804   7.7   852  0.891   3
26  27.0           Dallas Stars   28.3  23   8  ...  657  10.2   626  0.904   3
27  28.0      Detroit Red Wings   29.4  28   8  ...  785   8.0   870  0.891   0
28  29.0        Ottawa Senators   26.4  30   9  ...  942   8.2   960  0.874   0
29  30.0      New Jersey Devils   26.2  24   8  ...  708   8.5   741  0.896   2
30  31.0         Buffalo Sabres   27.4  26   6  ...  728   7.7   804  0.893   0
31   NaN         League Average   28.1  27  13  ...  808   9.8   808  0.902   2

[32 rows x 32 columns]
    Rk             Unnamed: 1    S%    SV%  ...  HDGF  HDC%  HDGA  HDCO%
0    1     New York Islanders   8.3  0.931  ...    11  12.2    11   11.8
1    2    Tampa Bay Lightning   8.7  0.933  ...    11  14.9     6    6.3
2    3       Florida Panthers   7.9  0.926  ...    15  14.4    12   17.6
3    4    Toronto Maple Leafs   8.8  0.933  ...    16  13.4     8   11.1
4    5    Carolina Hurricanes   7.5  0.932  ...    12  12.8     7    9.3
5    6    Washington Capitals   9.8  0.919  ...    10  10.9     5    7.8
6    7   Vegas Golden Knights   9.3  0.927  ...    20  15.9    11   14.5
7    8        Edmonton Oilers   8.2  0.920  ...     9  11.3    13    9.8
8    9          Winnipeg Jets   8.5  0.926  ...    15  15.0     8    7.8
9   10    Pittsburgh Penguins   8.8  0.922  ...    10  14.5    15   13.5
10  11     Chicago Blackhawks   7.3  0.925  ...    10  10.5    14   15.1
11  12         Minnesota Wild   9.9  0.930  ...    16  14.2     8   11.9
12  13        St. Louis Blues   8.4  0.914  ...    15  18.1    15   15.8
13  14          Boston Bruins   6.6  0.922  ...     5   7.4    11   12.2
14  15     Colorado Avalanche   6.7  0.916  ...     8   8.1     8   13.3
15  16     Montreal Canadiens   7.8  0.935  ...    15  12.0     8   11.3
16  17    Philadelphia Flyers  10.1  0.907  ...    18  15.9     9   12.9
17  18         Calgary Flames   7.6  0.929  ...     6   6.9     8    9.2
18  19      Los Angeles Kings   7.5  0.925  ...    11  13.1     8    9.8
19  20      Vancouver Canucks   7.3  0.919  ...    17  13.2    20   17.4
20  21  Columbus Blue Jackets   8.1  0.918  ...     5   9.6    15   13.6
21  22        Arizona Coyotes   7.7  0.924  ...    11  14.7    14   12.8
22  23        San Jose Sharks   8.1  0.909  ...    12  14.6    16   14.0
23  24       New York Rangers   7.8  0.921  ...    17  14.0     8   12.7
24  25    Nashville Predators   5.7  0.918  ...     5  10.6    11   13.4
25  26          Anaheim Ducks   7.4  0.909  ...    12  13.3    25   16.8
26  27           Dallas Stars   7.4  0.929  ...    11  13.3     5   12.8
27  28      Detroit Red Wings   7.5  0.923  ...    13  15.3    12   16.7
28  29        Ottawa Senators   7.1  0.894  ...     7   8.6    20   14.3
29  30      New Jersey Devils   7.2  0.923  ...    10  14.3    12   13.2
30  31         Buffalo Sabres   5.8  0.911  ...     6   8.2    16   14.0
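
The printed tables still carry the placeholder column name Unnamed: 1. Below is a minimal sketch of one way to tidy a parsed table, applied to each DataFrame individually rather than to the tables list. The sample data and the clean_table helper are hypothetical, and the sketch assumes the table has an Rk column.

```python
import pandas as pd

# Hypothetical miniature of a parsed table: one repeated header row
# and an unnamed team column.
raw = pd.DataFrame({
    "Rk": ["1", "2", "Rk", "3"],
    "Unnamed: 1": ["New York Islanders", "Tampa Bay Lightning", "", "Florida Panthers"],
    "GP": ["28", "26", "GP", "27"],
})

def clean_table(table):
    """Drop repeated header rows and give the team column a readable name."""
    cleaned = table[table["Rk"].ne("Rk")]
    cleaned = cleaned.rename(columns={"Unnamed: 1": "Team"})
    return cleaned.reset_index(drop=True)

cleaned = clean_table(raw)
print(cleaned)
```

To apply it to the scraped list while leaving tables without an Rk column untouched, something like tables = [clean_table(t) if "Rk" in t.columns else t for t in tables] should work.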
