Not possible to use #12

Open
daniperaleda opened this issue Aug 13, 2023 · 2 comments
Comments

@daniperaleda

Hi,

I have tried to use the code, but I don't know whether it is still working for you.

I am having problems getting started with it.

The main error I get is "TypeError: WebDriver.__init__() got an unexpected keyword argument 'executable_path'".

It seems to be a problem in the code itself, but after searching Google I have not been able to fix it.

Thanks in advance.

@meraline

Yes, it raises an error for me too:

(TorchEnv) PS C:\Users\Анатолий\Documents\GitHub> & C:/Users/Анатолий/source/repos/PyTorchtest/PyTorchtest/TorchEnv/Scripts/python.exe c:/Users/Анатолий/Documents/GitHub/scrapeOP/FinalScraper.py
C:\Users\Анатолий\Documents\GitHub
Data will be saved in the following directory: C:\Users\Анатолий\Documents\GitHub
Please indicate the format of tournament (3 sets or 5 sets) :

Please indicate the surface :

We start to scrape the following tournament : charleston-challenger-men
Traceback (most recent call last):
  File "c:\Users\Анатолий\Documents\GitHub\scrapeOP\FinalScraper.py", line 14, in <module>
    scrape_oddsportal_current_season(sport = 'tennis', country = 'usa', league = 'charleston-challenger-men', season = '2023', max_page = 25)
  File "c:\Users\Анатолий\Documents\GitHub\scrapeOP\functions.py", line 1363, in scrape_oddsportal_current_season
    df = scrape_current_tournament_typeB(Surface = surface, bestof = bestof, tournament = league,
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\Анатолий\Documents\GitHub\scrapeOP\functions.py", line 556, in scrape_current_tournament_typeB
    driver = webdriver.Chrome(executable_path = DRIVER_LOCATION)
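
For context on the failing line: the `executable_path` keyword was deprecated in Selenium 4.0 and removed in 4.10, which is why `webdriver.Chrome(executable_path = DRIVER_LOCATION)` now raises this TypeError. Below is a minimal sketch of one possible replacement, assuming Selenium 4.x is installed. The helper names `accepts_executable_path` and `make_chrome_driver` are hypothetical and not part of scrapeOP; `DRIVER_LOCATION` is the variable from the traceback.

```python
def accepts_executable_path(selenium_version: str) -> bool:
    """Return True if this Selenium version still accepts
    webdriver.Chrome(executable_path=...), i.e. it is older than 4.10."""
    major, minor = (int(part) for part in selenium_version.split(".")[:2])
    return (major, minor) < (4, 10)


def make_chrome_driver(driver_location=None):
    """Start Chrome in a way that works on Selenium 4.x (hypothetical helper)."""
    # Imported here so the version check above can be used
    # even on a machine where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    if driver_location:
        # The driver path is now passed through a Service object.
        return webdriver.Chrome(service=Service(executable_path=driver_location))
    # With no path given, Selenium 4.6+ tries to locate a matching
    # driver on its own through Selenium Manager.
    return webdriver.Chrome()
```

With a helper like this, the failing line could become `driver = make_chrome_driver(DRIVER_LOCATION)`, or simply `driver = webdriver.Chrome()` if Selenium is allowed to find the driver itself.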

@draccc

draccc commented Sep 30, 2023

I edited this part of the code, and after that the error stopped.

def scrape_current_tournament_typeC(sport, tournament, country, SEASON, max_page = 25):
    global driver
    ############### NOW WE SEEK TO SCRAPE THE ODDS AND MATCH INFO################################
    DATA_ALL = []
    for page in range(1, max_page + 1):
        print('We start to scrape the page n°{}'.format(page))
        try:
            driver.quit() # close all windows
        except:
            pass
        driver = webdriver.Chrome()
        data = scrape_page_typeC(page, sport, country, tournament, SEASON)
        DATA_ALL = DATA_ALL + [y for y in data if y != None]
        driver.close()
    data_df = pd.DataFrame(DATA_ALL)
    try:
        data_df.columns = ['TeamsRaw', 'Bookmaker', 'OddHome','OddDraw', 'OddAway', 'DateRaw' ,'ScoreRaw']
    except:
        print('Function crashed, probable reason : no games scraped (empty season)')
        return(1)
    ##################### FINALLY WE CLEAN THE DATA AND SAVE IT ##########################
    '''Now we simply need to split team names, transform date, split score'''
    # (0) Filter out None rows
    data_df = data_df[~data_df['Bookmaker'].isnull()].dropna().reset_index()
    data_df["TO_KEEP"] = 1
    for i in range(len(data_df["TO_KEEP"])):
        if len(re.split(':',data_df["ScoreRaw"][i]))<2 :
            data_df["TO_KEEP"].iloc[i] = 0
    data_df = data_df[data_df["TO_KEEP"] == 1]
    # (a) Split team names
    data_df["Home_id"] = [re.split(' - ',y)[0] for y in data_df["TeamsRaw"]]
    data_df["Away_id"] = [re.split(' - ',y)[1] for y in data_df["TeamsRaw"]]
    # (b) Transform date
    data_df["Date"] = [re.split(', ',y)[1] for y in data_df["DateRaw"]]
    # (c) Split score
    data_df["Score_home"] = [re.split(':',y)[0][-2:] for y in data_df["ScoreRaw"]]
    data_df["Score_away"] = [re.split(':',y)[1][:2] for y in data_df["ScoreRaw"]]
    # (e) Set season column
    data_df["Season"] = SEASON
    # Finally we save results
    if not os.path.exists('./{}_FULL'.format(tournament)):
        os.makedirs('./{}_FULL'.format(tournament))
    if not os.path.exists('./{}'.format(tournament)):
        os.makedirs('./{}'.format(tournament))
    data_df.to_csv('./{}_FULL/{}_{}_FULL.csv'.format(tournament,tournament, SEASON), sep=';', encoding='utf-8', index=False)
    data_df[['Home_id', 'Away_id', 'Bookmaker', 'OddHome','OddDraw', 'OddAway', 'Date', 'Score_home', 'Score_away','Season']].to_csv(
        './{}/{}_{}.csv'.format(tournament,tournament, SEASON), sep=';', encoding='utf-8', index=False)
    return(data_df)
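
To make the cleaning step in the function above easier to follow, here is a small sketch of the same splitting logic applied to a single row, using only the standard library. The function name `parse_match` and the sample strings are hypothetical; the real format of `TeamsRaw`, `DateRaw` and `ScoreRaw` depends on what the scraper returns. Unlike the function above, this sketch also strips the whitespace left over by the two-character score slices.

```python
import re


def parse_match(teams_raw, date_raw, score_raw):
    """Split one scraped row the way the cleaning step does.

    Returns None when the score contains no ':' separator, which is
    the case the TO_KEEP column is used to filter out.
    """
    score_parts = re.split(':', score_raw)
    if len(score_parts) < 2:
        return None
    home, away = re.split(' - ', teams_raw)[:2]
    return {
        "Home_id": home,
        "Away_id": away,
        # Keep only the part after the first ', ' (assumed to be the date)
        "Date": re.split(', ', date_raw)[1],
        # Last two characters before ':' and first two after it
        "Score_home": score_parts[0][-2:].strip(),
        "Score_away": score_parts[1][:2].strip(),
    }


# Hypothetical sample row, only to illustrate the shape of the result
row = parse_match("Player A - Player B",
                  "Yesterday, 12 Aug 2023",
                  "Player A - Player B 2:1")
print(row)
```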

I also got an error in this part, so I just commented it out.
#def reject_ads(switch_to_decimal = True):
#    # Reject ads
#    ffi2('//*[@id="onetrust-reject-all-handler"]')
#    if switch_to_decimal:
#        # Change odds to decimal format
#        driver.find_element("xpath", '//*[@id="user-header-oddsformat-expander"]').click()
#        driver.find_element("xpath", '//*[@id="user-header-oddsformat"]/li[1]/a/span').click()

reject_ads is also called in some of the other functions, and I commented out those calls as well.
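
As an alternative to commenting everything out, a possible approach is to make the clicks tolerant of missing elements, so the calls can stay in place. This is only a sketch: `click_if_present` is a hypothetical helper, the `driver` is passed in as a parameter instead of being a global, and the XPaths are the ones from the snippet above, which may no longer match the current page.

```python
def click_if_present(driver, xpath):
    """Click the element at `xpath` if it can be found.

    Returns True if the click happened, False otherwise.
    """
    try:
        driver.find_element("xpath", xpath).click()
        return True
    except Exception:
        # Deliberately broad for this sketch: a missing or unclickable
        # element should not stop the scrape.
        return False


def reject_ads(driver, switch_to_decimal=True):
    # Reject the cookie banner if it is shown
    click_if_present(driver, '//*[@id="onetrust-reject-all-handler"]')
    if switch_to_decimal:
        # Open the odds-format menu, then pick the first entry
        # (assumed to be the decimal format, as in the original code)
        if click_if_present(driver, '//*[@id="user-header-oddsformat-expander"]'):
            click_if_present(driver, '//*[@id="user-header-oddsformat"]/li[1]/a/span')
```

The trade-off is that a failed switch to decimal odds goes unnoticed, so it may be worth checking the return value or logging it.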

Let me know if it helps you :)
