@schochastics schochastics commented Jan 19, 2025

This PR refactors single-bracket assignment on a graph (g[1:3, 4:6] <- 2) (#1465).

The only real change is the use of get_edge_ids() instead of the old [.igraph to get edge ids, which makes the function more readable, slightly faster, and gives it a lower memory footprint.

Fixes an unintended behaviour (fix #1662).
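
For illustration, the edge-pair construction that appears in the diff under review can be sketched in base R. This is only a sketch: the index vectors `i` and `j` are example values matching g[1:3, 4:6], and the variable name `vp` is an assumption, not a name taken from the PR.

```r
# Index vectors as in the example g[1:3, 4:6] <- 2 (illustrative values).
i <- 1:3
j <- 4:6

# All (from, to) combinations; the first column varies fastest.
edge_pairs <- expand.grid(i, j)

# Interleave the two columns into from1, to1, from2, to2, ...,
# the flat layout that is passed to get_edge_ids() in the diff.
vp <- c(rbind(edge_pairs[, 1], edge_pairs[, 2]))
vp
#> [1] 1 4 2 4 3 4 1 5 2 5 3 5 1 6 2 6 3 6
```

Every pair of consecutive elements in `vp` describes one candidate edge, so nine pairs are looked up for a 3 x 3 index block.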

aviator-app bot commented Jan 19, 2025

This PR was merged manually (without Aviator). Merging manually can negatively impact the performance of the queue. Consider using Aviator next time.



@krlmlr krlmlr left a comment

Thanks, I like the size and intention of this PR.

} else {
todel <- unlist(x[[i, j, ..., edges = TRUE]])
edge_pairs <- expand.grid(i, j)
edge_ids <- get_edge_ids(x, c(rbind(edge_pairs[, 1], edge_pairs[, 2])))

Contributor

Is this covered by tests?

Suggested change
- edge_ids <- get_edge_ids(x, c(rbind(edge_pairs[, 1], edge_pairs[, 2])))
+ edge_ids <- get_edge_ids(x, as.vector(t(edge_pairs)))

The interface of get_edge_ids() is interesting. Should we extend that to accept two-column data frames?

Contributor Author

> Is this covered by tests?

Going through the existing tests, I realize there are some gaps. I will add a set of tests for this functionality.

Contributor Author

> The interface of get_edge_ids() is interesting. Should we extend that to accept two-column data frames?

It has been bothering me mildly for years as a user that edges need to be supplied as a vector (same with add_edges() and delete_edges() and probably more). However, that's required by the C core. It might be too much of a fundamental change at this point?

Contributor

The c(rbind(...)) pattern is probably fine; I had forgotten the semantics for vectors:

c(rbind(1:3, 4:6))
#> [1] 1 4 2 5 3 6
c(t(data.frame(1:3, 4:6)))
#> [1] 1 4 2 5 3 6
as.vector(t(data.frame(1:3, 4:6)))
#> [1] 1 4 2 5 3 6

Created on 2025-01-19 with reprex v2.1.1

There are two layers here: the C core and the R interface. We should provide an idiomatic R user interface that translates to what the C core needs.
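
A minimal sketch of what such a translation layer could look like, in base R only. The helper name `as_edge_vector()` is hypothetical and not part of igraph; how #1663 actually implements this is not shown in this conversation, so treat the code as an illustration of the idea rather than the package's implementation.

```r
# Hypothetical helper (not an igraph function): normalise edge input to
# the flat from1, to1, from2, to2, ... vector described above.
as_edge_vector <- function(vp) {
  if (is.data.frame(vp)) {
    stopifnot(ncol(vp) == 2)
    # Column access avoids the as.matrix() conversion that t() performs
    # on a data frame.
    return(c(rbind(vp[[1]], vp[[2]])))
  }
  if (is.matrix(vp)) {
    stopifnot(ncol(vp) == 2)
    # Transposing puts each (from, to) pair into one column, so c()
    # reads the pairs in order.
    return(c(t(vp)))
  }
  # Assumption: anything else is already a flat vector of even length.
  stopifnot(length(vp) %% 2 == 0)
  vp
}

as_edge_vector(data.frame(from = 1:3, to = 4:6))
#> [1] 1 4 2 5 3 6
as_edge_vector(cbind(1:3, 4:6))
#> [1] 1 4 2 5 3 6
as_edge_vector(c(1, 4, 2, 5, 3, 6))
#> [1] 1 4 2 5 3 6
```

With a helper along these lines, the user-facing function can accept whichever shape is convenient while still handing a flat vector to the lower layer.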

Contributor

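c(rbind(...)) is about 2.4 to 3.4 times faster than c(t(...)) for data frames in median time, and allocates roughly two orders of magnitude less memory, but it is not faster for matrices: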

df <- as.data.frame(cbind(1:30, 4:33))

bench::mark(
  c(t(df)),
  c(rbind(df[, 1], df[, 2])),
  c(rbind(df[[1]], df[[2]]))
)
#> # A tibble: 3 × 6
#>   expression                      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                 <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 c(t(df))                    12.96µs  14.51µs    57926.    74.4KB     34.8
#> 2 c(rbind(df[, 1], df[, 2]))   5.33µs   5.99µs   157537.      576B     47.3
#> 3 c(rbind(df[[1]], df[[2]]))   3.65µs   4.26µs   219083.      576B     43.8

m <- cbind(1:30, 4:33)

bench::mark(
  c(t(m)),
  c(rbind(m[, 1], m[, 2]))
)
#> # A tibble: 2 × 6
#>   expression                    min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>               <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 c(t(m))                     820ns 983.94ns   964028.      576B     96.4
#> 2 c(rbind(m[, 1], m[, 2]))    984ns   1.15µs   791706.      576B      0

Created on 2025-01-20 with reprex v2.1.1

Draft PR for new UI in #1663.
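
As a complement to the timings above, the following base R sketch checks that the benchmarked expressions all produce the same flat vector, so the choice between them is purely a performance question. The names `reference`, `results` and `all_equal` are introduced here for illustration only.

```r
# Same inputs as in the benchmark above.
df <- as.data.frame(cbind(1:30, 4:33))
m <- cbind(1:30, 4:33)

# Use the matrix-based rbind() variant as the reference result.
reference <- c(rbind(m[, 1], m[, 2]))

# Every expression that was timed, for data frames and for matrices.
results <- list(
  c(t(df)),
  c(rbind(df[, 1], df[, 2])),
  c(rbind(df[[1]], df[[2]])),
  c(t(m))
)

# TRUE only if each result is identical to the reference vector.
all_equal <- all(vapply(results, identical, logical(1), reference))
all_equal
#> [1] TRUE
```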

@schochastics schochastics marked this pull request as draft January 19, 2025 11:40
@schochastics
Copy link
Contributor Author

should probably be blocked until #1662 is resolved

schochastics commented Jan 20, 2025

waiting for #1663 to be merged (then rebase and adapt)

@krlmlr krlmlr changed the title from "refactor: single bracket manipulating of a graph (#1465)" to "fix!: Subset assignment of a graph avoids addition of double edges and ignores loops unless the new `loops` argument is set to `TRUE`" on Jan 20, 2025
krlmlr commented Jan 20, 2025

Do you want to add more tests here?

@schochastics
Copy link
Contributor Author

> Do you want to add more tests here?

Yes!

@krlmlr krlmlr left a comment

Looks good to me after #1663. Please auto-squash or squash when good on your end.

@schochastics schochastics marked this pull request as ready for review January 23, 2025 20:07
@schochastics schochastics merged commit cee57e1 into igraph:main Jan 23, 2025
22 checks passed
schochastics added a commit to schochastics/rigraph that referenced this pull request Jan 27, 2025
…d ignores loops unless the new `loops` argument is set to `TRUE` (igraph#1661)
krlmlr added a commit that referenced this pull request Oct 13, 2025
igraph 2.2.0

Update C core to version 0.10.17. See <https://github.com/igraph/rigraph/blob/20552ef94aed6ae4b23465ae8c7e4d3b0e558c71/src/vendor/cigraph/CHANGELOG.md> for a complete changelog, in particular the section "Breaking changes".

- Generate almost all R implementations (#2047).

- Expose `align_layout()` and add to `layout_nicely()` to align layout with axis automatically (#1907, #1957, #1958).

- Expose `simple_cycles()` which lists all simple cycles (#1573, #1580).

- Expose `is_complete()`, `is_clique()` and `is_ivs()` (#1316, #1388, #1581).

- Expose `find_cycle()` (#1471, #1571).

- Expose `feedback_vertex_set()` to find a minimum feedback vertex set in a graph (#1446, #1447, #1560).

- Add `weights` parameter to `local_scan()` (#1082, #1448, #1982).

- Add more layouts to `tkplot()` (#160, #1967).

- Add `plot(mark.lwd = )` to change line width of mark.groups (#306, #1898).

- Add `plot(vertex.label.angle = , vertex.label.adj = )` arguments to rotate vertex labels (#106, #1899).

- Add relative size scaling to vertices in `plot()` (@gvegayon, #172).

- Split `sample_bipartite()` into two functions for the G(n, m) and G(n, p) case (#630, #1692).

- Implement multi attribute assignment (#55, #1916) and adding attributes via data frames (#1373, #1669, #1716). Support factors in `graph_from_data_frame()` (#34, #1829).

- All `_hrg()` functions check their argument (#1074, #1699).

- HRG printing with `type = "auto"` uses `"plain"` for large trees (#1879).

- `get_edge_ids()` accepts data frames and matrices (#1663).

- `igraph_version()` returns version of C core in an attribute (#1208, #1781).

- Breaking change: change arguments default and order for `graph_from_lcf()` (#1858, #1872).

- Breaking change: Subset assignment of a graph avoids addition of double edges and ignores loops unless the new `loops` argument is set to `TRUE` (#1662, #1661).

- Breaking change: remove deprecated `neimode` parameter from `bfs()` and `dfs()` (#1105, #1526).

- Breaking change: stricter deprecation of non-functional parameters of `layout_with_kk()` and `layout_with_fr()` (#1108, #1628).

- `NA` attribute values are replaced with default values in `plot()` (#293, #1707).

- `NA` checking only in from/to columns of edge data frame (#1906).

- Keep vertex attribute type for `disjoint_union()` (#1640, #1909).

- Error in bipartite projection if `type` is not a vertex attribute (#898, #1889).

- Do not try to destroy non-initialized SIR objects upon error (#1888).

- Added proper `NA` handling for matrix inputs (#917, #918, #1828).

- Remove string matrix support from functions operating on biadjacency matrices (#1540, #1542, #1803).

- Integer vectors are validated before transferring them to the C library (#1434, #1582).

- Changed base location for `graph_from_graphdb()` and added tests (#1712, #1732).

- Recycling of logical vectors when indexing into edge/vertex selectors now throws an error (#848, #1731).

- Use `function()` instead of `(x)` in `arrow.mode` (#1722).

- Temporarily disable generating an interface for `igraph_simple_cycles_callback()` as the framework for handling callback functions is not yet present.

- Adjust loop position to vertex size in `plot()` (#1980).

- Don't rescale plot coordinates to `[-1,1] x [-1,1]` by default (#1492, #1956, #1962).

- Fail if `"layout"` attribute doesn't match the number of vertices (#1880).

- Automatically arrange loops in `plot()` (#407, #556, #1881).

- Vectorized drawing of arrows in `plot()` (#257, #1904).

- Allow more than one edge label font family in `plot()` (#37, #1896).

- Pie shapes now work as intended (#1882, #1883).

- Loops not plotted on canvas (#1799, #1800).

- Replace `NA` values in `label` attributes in `plot()` with default values (#1796, #1797).

- Removed duplicated plotting of arrow heads (#640, #1709).

- Correct mapping of edge label properties in plots when loops are present (#157, #1706).

- Welcome Maëlle Salmon and David Schoch as authors (#1733), add author links (#1821).

- Remove demos (#2008).

- Add 2023 preprint (#1240, #1984).

- Update allcontributors info (#1975).

- Link to replacements of deprecated functions (#1823).

- Add documentation of all file formats to `read_graph()` and `write_graph()` (#777, #1969). Recommend `saveRDS()` and `readRDS()` for saving and loading graphs (#1242, #1700).

- Document return value of `make_clusters()` (#1794).

- Clarify that `girth()` returns `Inf` for acyclic graphs (@eqmooring, #1831).

- Clarify the use of weights in `layout_with_kk()`.

- Refer to current latest version of R in troubleshooting page.

- Fix typos in `laplacian_matrix()` documentation.

- Document ellipsis in `cohesion()` (#971, #1985).

- Correct the description of the `weights` parameter of `hits_scores()`.

- Better describe output of `all_shortest_paths()` (#1029, #1778).

- `make_graph()` now supports `"Groetzsch"` as an alias of `"Grotzsch"`. This change was implemented in the C core.

- Update description of `order` parameter of `ego()` and related functions (#1746).

- Added lifecycle table (#1525).

- Add more about igraph.r2cdocs in the contributing guide (#1686, #1697).

- Accelerate check if an index sequence corresponds to the entire list of vertices (#1427, #1818).

- Faster single bracket querying of a graph (#1465, #1658).
Successfully merging this pull request may close these issues.

Intended behaviour of [<-.igraph
