Skip to content

Conversation

@JoseKling
Copy link
Owner

  • src/hawkes/hawkes_process.jl

    • HawkesProcess type
    • Methods: rand, fit, time_change, ground_intensity, integrated_ground_intensity, logdensityof
  • tests/hawkes.jl

    • Some unit tests for Hawkes processes
  • history.jl

    • added 'event_times(h, tmin, tmax)'
    • added 'event_marks(h, tmin, tmax)'
  • simulation.jl

    • added 'simulate_poisson' - for other methods to use

- src/hawkes/hawkes_process.jl
    - Methods: rand, fit, time_change, ground_intensity,
               integrated_ground_intensity, logdensityof

- tests/hawkes.jl
    - Some unit tests for Hawkes processes

- history.jl
    - added 'event_times(h, tmin, tmax)'
    - added 'event_marks(h, tmin, tmax)'

- simulation.jl
    - added 'simulate_poisson' - for other methods to use
@codecov
Copy link

codecov bot commented Jul 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Collaborator

@jucheval jucheval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not check every line, but there are already lots of changes requested and questions.

test/hawkes.jl Outdated
@test isa(h_sim, History{Nothing,Float64})
# Estimation
hp_est = fit(HawkesProcess, h_sim)
@test isa(hp_est, HawkesProcess)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a test to check that the estimated parameters are close to the real ones ?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am no sure how to do this, to be honest.

One way would be to simulate many processes with known parameters and then get the mean/median of the estimations, but how would I define a bound for how far off this can be from the true parameters?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to give us a warning in case of a harmful change in a future update.
You could chose a history and a rng seed such that the estimation is "close" to the true parameters. Then, add a unit test to check that this distance to the true parameters do not deteriorate in the next updates.

Copy link
Owner Author

@JoseKling JoseKling Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is the best approach, but, in principle, this would show us if something changes the algorithm too much.

Could make it simpler by just fixing a seed and checking if it is close to the median, if this is too much.

Also, I saved the result of a simulation in a text file, because this is recommended in the Random.jl documentation, as they might change the algorithm and, therefore, change the results in the future.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Julien is right, checking proximity between estimated and true parameters is an excellent way to guard against mistakes or breaking changes.
You can use StableRNGs.jl to get reproducible random seeds across versions and platforms, which makes text files unnecessary.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Room for improvement is good ; )

About your question, the problem is just a poor explanation in the comments.
In reality this t90 is the inverse of the time it takes for a 90% decay. The only reason for sampling the inverse of t90 instead of t90 is that this ensures that the distribution of $\omega$ will be uniform. The thing is that, because t90 is chosen from the interval (1/2, 2), both t90 and its inverse are inside this interval.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, got it. I assumed it was something like this but wanted to make sure there wasn't a bug. I'll move forward on optimizing a bit!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I optimized to significantly improve performance and increase memory efficiency. The main change replaces the previous O(n²) double loop used to compute descendant probabilities with an O(n) recurrence formulation based on running sums (A and S). This eliminates redundant computations and scales efficiently with the number of events. I also introduced a few small buffers to avoid unnecessary array allocations and removed redundant exponential recalculations within the EM loop. This caused the fit time to go from 55s for ~8k events to 10.53 ms without changing the final estimates. All tests still pass as well

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rsenne!
Completely missed that! It is even in the paper, actually. Anyway, that's the good thing about having extra pair of eyes going through it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happens to the best of us, glad I could help :)

- Fixed `logdensityof`
- HawkesProcess is now parametric and has as check in the inner constructor
- Added explanation for `generate_descendants` and choice of initial guess in `fit`
- Fixed logdensityof
- HawkesProcess is now parametric and has as check in the inner constructor
- Added explanation for generate_descendants and choice of initial guess in fit

Fixed formatting
- Changed parameter names of `HawkesProcess`
- Added test for estimation
- Changed parameter names of `HawkesProcess`
- Added test for estimation

Forgot to run the formatter
- Changed parameter names of `HawkesProcess`
- Added test for estimation

Forgot to run the formatter
Removed the docstring from `simulate_poisson_times` causing docs deployment problem.
Copy link
Collaborator

@gdalle gdalle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome that you're getting started with new processes!
I left a few comments to try and guide you towards idiomatic Julia code, let me know what you think!

Expectation-Maximization algorithm from [E. Lewis, G. Mohler (2011)](https://api.semanticscholar.org/CorpusID:2105771)).
The relevant calculations are in page 4, equations 6-13.

Let (t_1 < ... < t_N) be the event times over the interval [0, T). We use the immigrant-descendant representation,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, one should refrain from using LaTeX in docstrings because it renders badly in the REPL. If you have to include formulas or symbols, it's better to put them inside backticks so that they display as monospace code, and then add a page to the Documenter.jl documentation for proper LaTeX equations

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can visualize the docstring by typing

?fit

in a Julia REPL

T = duration(h)
norm_ts = h.times .* (N / T) # Average inter-event time equal to 1. Numerical stability
n_iters = 0
step = step_tol + 1.0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whenever you use floating point numbers like 1.0, you're forcing the entire computation to happen in Float64, even if the user provided Float32 everywhere. This could be replaced by one(T) for the right choice of T.

What is the right choice of T? Probably the result of the promotion between the time type of the History and the parameter type of the HawkesProcess

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the types returned by rand and time_change too. The rule I used is the following:

time_change(hp::HawkesProcess{HP}, h::{M,H})

  • Returns a History{M,H}, same type as argument

fit(::Type{HawkesProcess{HP}}, h::History{M,H})

  • Returns a HawkesProcess{HP}, the exact type passed as an argument
  • If argument passed is HawkesProcess (without a type parmeter), return HawkesProcess{T}, where T == promote_type(Float64, H).

rand(hp::HawkesProcess{HP}, tmin, tmax)

  • Returns a History{M,H} where H == promote_type(typeof(tmin), typeof(tmax))

Comment on lines 147 to 152
# Internal function for simulating Hawkes processes
# The first generation, gen_0, is the `immigrants`, which is a set of event times.
# For each t_g ∈ gen_n, simulate an inhomogeneous Poisson process over the interval [t_g, T]
# with intensity λ(t) = α exp(-ω(t - t_g)) with the inverse method.
# gen_{n+1} is the set of all events simulated from all events in gen_n.
# The algorithm stops when the simulation from one generation results in no further events.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use block comments like so:

#=
First line
second line
=#

function generate_descendants(rng::AbstractRNG, immigrants::Vector, T, α, ω)
descendants = eltype(immigrants)[]
next_gen = immigrants
while ~isempty(next_gen)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while ~isempty(next_gen)
while !isempty(next_gen)

src/history.jl Outdated
Return the sorted vector of event times between `tmin` and `tmax` for `h`.
"""
function event_times(h::History, tmin, tmax)
h.times[searchsortedfirst(h.times, tmin):(searchsortedfirst(h.times, tmax) - 1)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be a @view?

test/hawkes.jl Outdated
@test isa(h_sim, History{Nothing,Float64})
# Estimation
hp_est = fit(HawkesProcess, h_sim)
@test isa(hp_est, HawkesProcess)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Julien is right, checking proximity between estimated and true parameters is an excellent way to guard against mistakes or breaking changes.
You can use StableRNGs.jl to get reproducible random seeds across versions and platforms, which makes text files unnecessary.

@@ -0,0 +1 @@
[0.01017337415246089, 0.013640560097159748, 0.013925428236956083, 0.014608995173263944, 0.015735173436628665, 0.017731455643465788, 0.018173140414903033, 0.019398584582704777, 0.02169310038049055, 0.026539311564590558, 0.03535296046523939, 0.035385747642310054, 0.04888791827373706, 0.049245437622213895, 0.05207317680771159, 0.05417399567567511, 0.06282683568521101, 0.07081244234996387, 0.0791015159737285, 0.0861333024505229, 0.087551141093, 0.08772509630067638, 0.09411335483093886, 0.09724491891582732, 0.0988625084503767, 0.10010837332313449, 0.10135018373031557, 0.10363364868768966, 0.10592656977963455, 0.10715180725875367, 0.11028563316661419, 0.1134331444864608, 0.12272955724493595, 0.12726221430956405, 0.12836304084244704, 0.12844801940952075, 0.13597112047166005, 0.15072073156516308, 0.15114633616168527, 0.15451628188316555, 0.1615175726493574, 0.17404961232856206, 0.19135505688083287, 0.22076206529908327, 0.23982012455994361, 0.24316977218179758, 0.24474350243009585, 0.24644188741540635, 0.24758741892662547, 0.26335250468309224, 0.2657968651067172, 0.26846031955131877, 0.2714547249648518, 0.2752925737618863, 0.27565342198838594, 0.27609484665183454, 0.2793520184225446, 0.28043167139411607, 0.28266772707753685, 0.2892355684930643, 0.28935629867942364, 0.2910561718681186, 0.29157899048672486, 0.30523440749876696, 0.31427715680728796, 0.3160799465423002, 0.3186409783533404, 0.31928256511258085, 0.3561905127846101, 0.35726422828011145, 0.37542826109696614, 0.3761737738905845, 0.383211879818265, 0.38412250885804855, 0.3892202718510143, 0.38973687161088244, 0.3903222364578105, 0.3921460092118373, 0.3927565640326585, 0.3929931412548428, 0.39550199927854035, 0.4004287509915203, 0.4012584947658139, 0.4188933113083467, 0.4222797931631154, 0.42547015568968916, 0.43096758007962505, 0.4318550285906202, 0.44109809732147987, 0.4411113062102109, 0.4414891960155095, 0.4419159595553975, 0.4455996720525952, 0.4496160036372285, 0.4498795159779206, 0.45469339795091446, 0.45799213961322816, 0.4582125417978449, 0.4593459923561455, 0.460330625273055, 0.46151099501089304, 0.4663379688495272, 0.4668223709854475, 0.4689158086068933, 0.47513195296839694, 0.48261581524557473, 0.5018848417890917, 0.5082613332032163, 0.5098025975578675, 0.5103574120239697, 0.5105994460715356, 0.5251305262758121, 0.5394744828867348, 0.5398929198384044, 0.5451644369828262, 0.5456458484033474, 0.5461471626817158, 0.5562494823653819, 0.5577837518050861, 0.5629099224574312, 0.5639313234396448, 0.5652484663007217, 0.56579448745316, 0.569761662149659, 0.5772534932071732, 0.57747271973829, 0.581312103742883, 0.5897117080989583, 0.6042594925582441, 0.6062770267185059, 0.6146506742853325, 0.619164121825413, 0.6233280853442464, 0.6562295767400433, 0.657191314194358, 0.6683396665157701, 0.6934098253153014, 0.7098168085084123, 0.7142551495263125, 0.7149292250425998, 0.7222898663646184, 0.7230225303240918, 0.7258703766070407, 0.7329257086975677, 0.734965964575541, 0.7367052753163055, 0.7369917675462134, 0.7377952054108926, 0.7431668020456873, 0.7442335599332027, 0.7494356056840505, 0.7592464069425915, 0.76513378821372, 0.7778400193541839, 0.783570871666886, 0.7892533149681009, 0.8151907848154715, 0.8163624371624438, 0.8203512192351479, 0.8231907075189483, 0.8254450315531231, 0.8327541619684787, 0.8340446391585701, 0.8365483397626655, 0.8380870238704485, 0.8396006152087313, 0.8424109556863868, 0.8449279007257848, 0.8493882720082039, 0.8499247624654398, 0.8570201197254786, 0.8708262595580382, 0.8757516578617132, 0.8851984472410247, 0.8943124667367437, 0.9099513775965703, 0.9158323560687229, 0.916124781335173, 0.9186099384299968, 0.9207850430116301, 0.9261702419474741, 0.9266763718478324, 0.927525059408, 0.9326181870421033, 0.9332024579347786, 0.9369755618858636, 0.9371907441403706, 0.9389797249122629, 0.939953948070523, 0.9420001470511369, 0.943798275663379, 0.9466861545841257, 0.947173907438536, 0.94880346770712, 0.9500849345227314, 0.953356815578679, 0.9535829930299445, 0.9596111063743575, 0.9686141854288621, 0.9810644372158931] No newline at end of file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above, this is not needed

- Added docstring to `HawkesProcess`
- Avoided LaTeX in docstrings
- Remove all `@inbounds`
@JoseKling JoseKling mentioned this pull request Jul 21, 2025
3 tasks
- Reference parameters inside `hawkes_times.txt`
  file, instead of hard coded in tests.
- Fixed return types for `rand`, `fit`, and `time_change`
  and added tests.
@JoseKling
Copy link
Owner Author

Hello.
Any problems left here? Had some issues lately but I am back.
Cheers

@rsenne
Copy link
Collaborator

rsenne commented Oct 24, 2025

Anything else we want to do for this PR?

@JoseKling
Copy link
Owner Author

I think all issues were resolved...

@rsenne
Copy link
Collaborator

rsenne commented Oct 27, 2025

I think all issues were resolved...

Are we good to merge then? What would be good to work on next?

@JoseKling
Copy link
Owner Author

Up to you guys to approve ; )

My suggestion would be work on issue #58 and #57. I already have some modifications for the latter, just small changes. And after that I could resume PR #60, with these possible API modifications already in place.

@rsenne rsenne self-requested a review October 28, 2025 13:32
@rsenne
Copy link
Collaborator

rsenne commented Oct 28, 2025

I requested myself for a review and went ahead and approved the changes now that I've seen the code. It seems @gdalle and @jucheval were more involved in this than I was so I don't know if you want to wait for their approval?

Copy link
Collaborator

@jucheval jucheval left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quickly looked over the changes and they seem fine to me.

@JoseKling
Copy link
Owner Author

Since this is a big PR and @gdalle was quite involved, I just want to make sure. Is this good to go or do you have anything to add?

@gdalle
Copy link
Collaborator

gdalle commented Nov 6, 2025

This is your project now, fire away :)
As a rule of thumb, and I know other package authors disagree, I'd rather have an implementation than none at all, even if some aspects can be improved later on. So I hope you feel encouraged to add stuff here and make it your own. There aren't many users right now, but once it's a bit more thorough, maybe you can solicit external feedback and that will be much more useful to guide you than my old PhD memories

@JoseKling JoseKling merged commit 6af56b1 into main Nov 6, 2025
5 checks passed
@JoseKling JoseKling deleted the hawkes branch December 19, 2025 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants