Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rectify control of easy-handle callbacks #155

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

StefanKarpinski
Copy link
Member

@StefanKarpinski StefanKarpinski commented Oct 27, 2021

This PR changes the complex easy handle callback flow to a more straightforward version where we register actual callbacks (or IO objects to read from or write to) with the Easy objects, which are then directly used in the easy handle callbacks. The steps:

  • rectify control of progress callback
  • rectify control of write callback
  • rectify control of read callback
  • rename easy.{ready => done}

There are a few issues with this though:

  1. When an easy-handle callback blocks, it now blocks any progress on anything the multi-handle is working on, even for other downloads.

  2. If the Easy type isn't parametric then its input, output and progress fields are abstract, which may cause performance issues. We can fix this by making Easy parametric, but that may lead to excess compilation and specialization.

The first point wasn't an issue with the old, convoluted version of the callbacks because the easy handle callbacks were designed to never block. The second point wasn't an issue because the callbacks always did the same thing and the variation in types was isolated to the request function, which would be specialized on different argument types, which would, of course cause specialization, but at least that was contained to one function.

So, overall, it's unclear that this is an improvement. The main issue with the status quo is that the output and progress channels can grow unbounded, but in practice that seems unlikely to be a real issue.

@codecov
Copy link

codecov bot commented Oct 27, 2021

Codecov Report

Merging #155 (f448eb0) into master (dbb0625) will increase coverage by 0.30%.
The diff coverage is 95.00%.

❗ Current head f448eb0 differs from pull request most recent head 2af290f. Consider uploading reports for the commit 2af290f to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #155      +/-   ##
==========================================
+ Coverage   91.15%   91.46%   +0.30%     
==========================================
  Files           5        5              
  Lines         475      445      -30     
==========================================
- Hits          433      407      -26     
+ Misses         42       38       -4     
Impacted Files Coverage Δ
src/Curl/Easy.jl 95.00% <93.75%> (+0.48%) ⬆️
src/Curl/Multi.jl 93.33% <100.00%> (-0.19%) ⬇️
src/Downloads.jl 83.51% <100.00%> (+0.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dbb0625...2af290f. Read the comment docs.

@StefanKarpinski StefanKarpinski marked this pull request as draft October 27, 2021 14:40
n = min(size * count, length(buf))
ccall(:memcpy, Ptr{Cvoid}, (Ptr{Cvoid}, Ptr{Cvoid}, Csize_t), data, buf, n)
deleteat!(buf, 1:n)
eof(easy.input) && return 0
Copy link
Member

@vtjnash vtjnash Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can block, so we'll eventually want to add flow-control with something like:

n = UInt(bytesavailable(easy.input))
if n == 0
    errormonitor(@async begin
        eof(easy.input) ? #= notify curl of EOF =# : #= notify curl of more data =#
    end)
    return CURL_READFUNC_PAUSE
end
n = min(n, UInt(size * count))
unsafe_read(easy.input, data, n)
return n

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, most of these implementations can block... that's one of the main drawbacks noted in the PR notes.

src/Curl/Easy.jl Outdated
Comment on lines 342 to 343
buf = unsafe_wrap(Vector{Cchar}, data, size*count)
write(easy.output, buf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
buf = unsafe_wrap(Vector{Cchar}, data, size*count)
write(easy.output, buf)
unsafe_write(easy.output, data, UInt(size*count))

@@ -372,7 +352,7 @@ function progress_callback(
ul_now :: curl_off_t,
)::Cint
easy = unsafe_pointer_to_objref(easy_p)::Easy
put!(easy.progress, (dl_total, dl_now, ul_total, ul_now))
easy.progress(dl_total, dl_now, ul_total, ul_now)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid the callback, I'd suggest here doing:

easy.dl_now = dl_now
easy.dl_total = dl_total
easy.ul_total = ul_total
easy.ul_now = ul_now
notify(easy.progress::Event)

@vtjnash
Copy link
Member

vtjnash commented Oct 27, 2021

The second point wasn't an issue because the callbacks always did the same thing and the variation in types was isolated to the request function, which would be specialized on different argument types, which would, of course cause specialization, but at least that was contained to one function.

I think I've measured this before to be false: a generic function dispatch is expected to be much faster than a task switch, and not much slower than a non-generic function dispatch (though it may allocate slightly more memory and doesn't inline).

src/Curl/Easy.jl Outdated
@@ -1,26 +1,29 @@
mutable struct Easy
mutable struct Easy{Input<:IO, Output<:IO, Progress<:Function}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mutable struct Easy{Input<:IO, Output<:IO, Progress<:Function}
mutable struct Easy{Input<:IO, Output<:IO}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I suspect there's no performance advantage to giving any of these explicit types, and it could simply be mutable struct Easy as it was before

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight additional question: do we need to inform the user that, since we are using IO directly now instead of Channel from a pinned Task, that those objects must now be thread-safe?

src/Curl/Easy.jl Outdated
output :: Channel{Vector{UInt8}}
progress :: Channel{NTuple{4,Int}}
output :: Output
progress :: Progress
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
progress :: Progress
progress :: Any

@StefanKarpinski StefanKarpinski changed the title simplify easy-handle callbacks rectify control of easy-handle callbacks Oct 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants