Parse wikimedia URLs as input param #271

Closed
divanvb opened this issue Apr 18, 2019 · 11 comments

Comments

@divanvb

divanvb commented Apr 18, 2019

I'm unable to fetch and parse from the following wiki: https://dota2.gamepedia.com/Abaddon

wtfWikipedia.fetch('https://dota2.gamepedia.com/Abaddon', (err, doc) => {
    if (err) console.log(err);
    console.log(doc);
});

From what I read in the other issues, MediaWiki sites should be supported, or am I mistaken?
@niebert
Contributor

niebert commented Apr 18, 2019

wtf_wikipedia was meant to fetch from Wikipedia, Wikiversity, and so on; see https://github.com/spencermountain/wtf_wikipedia/wiki. To fetch from another wiki, fork the library and modify the cross-fetch call in the /src/_fetch/ directory.

@spencermountain
Owner

hi @divanvb yeah, it looks like that wiki has an api
so it should work

it may just be some custom url path to it, or something

here's the url we want:
https://dota2.gamepedia.com/api.php?action=query&prop=revisions&rvprop=content&maxlag=5&rvslots=main&format=json&origin=*&redirects=true&titles=Abaddon
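
For reference, the query string above can be assembled programmatically. This is a minimal sketch using the standard `URL`-era global `URLSearchParams`; the helper name `buildApiUrl` is hypothetical and not part of wtf_wikipedia, and the parameter values are simply copied from the URL above.

```javascript
// Hypothetical helper (not part of wtf_wikipedia): builds the MediaWiki
// API request URL for a page title, using the parameters from the URL above.
function buildApiUrl(apiBase, title) {
  const params = new URLSearchParams({
    action: 'query',
    prop: 'revisions',
    rvprop: 'content',
    maxlag: '5',
    rvslots: 'main',
    format: 'json',
    origin: '*',
    redirects: 'true',
    titles: title
  });
  // Keys are serialized in insertion order
  return `${apiBase}?${params.toString()}`;
}

const apiUrl = buildApiUrl('https://dota2.gamepedia.com/api.php', 'Abaddon');
console.log(apiUrl);
```

Opening the resulting URL in a browser is a quick way to check whether a given wiki exposes the API at that path.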

@spencermountain
Owner

spencermountain commented Apr 18, 2019

oh, this works:

wtf.fetch('Abaddon', 'dota2', {
  wikiUrl: 'https://dota2.gamepedia.com/api.php'
}).then(function(doc) {
  console.log(doc.json())
})

i like that idea of passing-in the url though
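
As a rough illustration of what passing in the URL could look like, the sketch below splits a page URL into a title and an API endpoint. It is not how wtf_wikipedia implements this; `parseWikiUrl` and its `apiPath` parameter are hypothetical, and it assumes the API sits at a fixed path on the same host (true for the gamepedia example, whereas Wikipedia serves it from `/w/api.php`).

```javascript
// Hypothetical helper, not part of wtf_wikipedia.
// Assumptions: the API lives at `apiPath` on the same host as the page,
// and the title is the URL path with an optional leading /wiki/ removed.
function parseWikiUrl(pageUrl, apiPath = '/api.php') {
  const url = new URL(pageUrl);
  const rawTitle = url.pathname.replace(/^\/(wiki\/)?/, '');
  return {
    title: decodeURIComponent(rawTitle),
    wikiUrl: url.origin + apiPath
  };
}

const parsed = parseWikiUrl('https://dota2.gamepedia.com/Abaddon');
console.log(parsed);
```

The two fields could then be handed to the call shown above, for example `wtf.fetch(parsed.title, 'dota2', { wikiUrl: parsed.wikiUrl })`.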

@divanvb
Author

divanvb commented Apr 21, 2019

Indeed, passing the url through gives a bit more flexibility towards specifying the exact wiki etc. I'll have a look at the code above and let you know how well it works with what I'm trying to achieve.


@spencermountain spencermountain changed the title Can't parse gamepedia wiki's (mediawiki) Parse wikimedia URLs as input param Apr 22, 2019
@divanvb
Author

divanvb commented Apr 23, 2019

I've been playing around with the above-mentioned solution, and it seems to be working quite well. Could you please point me to a link that lists the options that can be passed as the third argument? I can't find anything in the documentation on what options are available; it just states [options] with no details.

@spencermountain spencermountain added this to the v8 milestone Apr 23, 2019
@spencermountain
Owner

yeah sorry, these were added in prs and need to be cleaned-up.

i think these are the supported params right now:

  • userAgent
  • wikiUrl
  • follow_redirects
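
A sketch of how those three options might be combined in one call. The option names come from the list above; the value types and example values are assumptions, since the thread does not document them. The library is passed in as an argument so the snippet does not depend on how it is imported.

```javascript
// Option names are from the list above; the values shown are assumed.
const options = {
  wikiUrl: 'https://dota2.gamepedia.com/api.php', // API endpoint of the target wiki
  userAgent: 'my-scraper/1.0 (contact@example.com)', // assumed: a string identifying the client
  follow_redirects: true // assumed: a boolean
};

// `wtf` is the imported wtf_wikipedia module, supplied by the caller.
function fetchAbaddon(wtf) {
  // Same (title, lang, options) call shape as the working example earlier in the thread
  return wtf.fetch('Abaddon', 'dota2', options);
}
```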

@niebert
Contributor

niebert commented May 1, 2019

Add extension of wtf_wikipedia to the GitHub Wiki

@divanvb
Author

divanvb commented Sep 16, 2019

This is breaking with the latest version for me. Unable to fetch anymore.

import wtf from 'wtf_wikipedia';

TypeError: Cannot read property 'fetch' of undefined
at AppController.getHello (/Users/divan.van.biljon/Workspace/Personal/Cover Me/scraper/dist/app.controller.js:20:33)
at /Users/divan.van.biljon/Workspace/Personal/Cover Me/scraper/node_modules/@nestjs/core/router/router-execution-context.js:37:29
at InterceptorsConsumer.intercept (/Users/divan.van.biljon/Workspace/Personal/Cover Me/scraper/node_modules/@nestjs/core/interceptors/interceptors-consumer.js:10:20)
at /Users/divan.van.biljon/Workspace/Personal/Cover Me/scraper/node_modules/@nestjs/core/router/router-execution-context.js:45:60
at /Users/divan.van.biljon/Workspace/Personal/Cover Me/scraper/node_modules/@nestjs/core/router/router-proxy.js:8:23
at Layer.handle [as handle_request] (/Users/divan.van.biljon/Workspace/Personal/Cover Me/scraper/node_modules/express/lib/router/layer.js:95:5)
at next (/Users/divan.van.biljon/Workspace/Personal/Cover Me/scraper/node_modules/express/lib/router/route.js:137:13)
at Route.dispatch (/Users/divan.van.biljon/Workspace/Personal/Cover Me/scraper/node_modules/express/lib/router/route.js:112:3)
at Layer.handle [as handle_request] (/Users/divan.van.biljon/Workspace/Personal/Cover Me/scraper/node_modules/express/lib/router/layer.js:95:5)
at /Users/divan.van.biljon/Workspace/Personal/Cover Me/scraper/node_modules/express/lib/router/index.js:281:22

@spencermountain
Owner

hey @divanvb thanks, I'll need to know more information about your setup. it seems to work for me, and in the tests.

import wtf from 'wtf_wikipedia'
wtf.fetch('Toronto').then(doc => {
  console.log(doc.categories())
})

node --experimental-modules index.mjs
cheers
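
The thread does not confirm the cause of the `TypeError`. One plausible explanation, given that the stack trace points at a compiled `dist/app.controller.js`, is that a default import of a CommonJS module was compiled to a `.default` lookup without interop enabled. The sketch below illustrates that mechanism with a stand-in object; `fakeCjsModule` is hypothetical, and `importDefault` mirrors the shape of the helper TypeScript emits when `esModuleInterop` is on.

```javascript
// Stand-in for a CommonJS module that exports its API object directly
// (hypothetical; this is not the real wtf_wikipedia module).
const fakeCjsModule = {
  fetch: (title) => Promise.resolve({ title })
};

// Without interop, a compiled default import reads `module.default`,
// which does not exist on a plain CommonJS export.
const withoutInterop = fakeCjsModule.default;
console.log(withoutInterop); // undefined, so `.fetch` on it would throw

// Same shape as the interop helper: wrap non-ES modules in { default: mod }.
function importDefault(mod) {
  return mod && mod.__esModule ? mod : { default: mod };
}

const withInterop = importDefault(fakeCjsModule).default;
console.log(typeof withInterop.fetch); // "function"
```

If this is what happened, enabling `esModuleInterop` in `tsconfig.json` or using `import * as wtf from 'wtf_wikipedia'` would be the usual things to try.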

@spencermountain
Owner

this works now, in v8.0.0
check on the docs here
thanks
