httpClient.Transport #9
Hi Dmitry, I don't think adding a public option for Transport is the way to go; a better route may be to allow the HTTP client itself to be specified through the Options. That could give some flexibility for following redirects without necessarily overriding the Fetch implementation, while still allowing custom Transports. I'll give it a look next year! :) Martin
On 2012-12-27, at 6:16 AM, Dmitry Bondarenko notifications@github.com wrote:
|
Just to let you know, that's next on my "give-some-love-to-gocrawl" todo list. |
That's great, thanks! |
I'm thinking about simply making It's a dead simple change in the code, it doesn't add bloat to an already crowded |
Personally, I like setting this via Options a bit more, because that way I can set different clients for different crawlers. Whether that is ever needed is debatable, but I feel it could be. For example, I may want different transports for crawling different web sites (e.g. different connection deadlines or dial timeouts). So I vote for making the client settable via Options. In that case, though, some kind of 'default client constructor' would be useful, so that I can build clients based on the default one (construct a default, then make the needed changes). |
I took a closer look at the impact of doing this, and I will opt for the public variable. I think you make a good point that using What it doesn't do is allow for various clients per crawler, but if this is absolutely required (which I would assume is not often), then it is already possible by providing a custom |
Well, I understand your decision. It's not that critical, and you make a good point about breaking changes; maybe this degree of flexibility is not worth it. A public variable is totally okay: if I want a crawler with its own custom client, I can create a new client based on yours (copying CheckRedirect from your HttpClient) and then rewrite Fetch to use it. Thanks! |
Currently I'm having a little problem with changing the HTTP client's transport. It is caused by two facts:
In my case I needed to change the standard transport to add timeouts:
As things stand, I ended up copy-pasting your default 'Fetch' code (as-is) and the 'CheckRedirect' part of your httpClient (as-is). I also ended up copy-pasting your 'isRobotsTxt' func :)
But I think it would be better to somehow allow changing just parts of this logic, without rewriting all of it.
For example, if we talk about Transport, it could be OK to add a Transport field to Options and to use it in the instantiation code (probably then it should be moved from package-level vars to DefaultExtender):
I'm not sure whether this is already in your package reorganization plans; I thought I should submit it just in case.