Skip to content

discussion: docs: improve documentation of prudence argument#589

Draft
maelle wants to merge 5 commits intomainfrom
prudence-rmd
Draft

discussion: docs: improve documentation of prudence argument#589
maelle wants to merge 5 commits intomainfrom
prudence-rmd

Conversation

@maelle
Copy link
Copy Markdown
Collaborator

@maelle maelle commented Feb 6, 2025

Fix #508
And some other suggested changes for readability.

@maelle maelle requested a review from krlmlr February 6, 2025 09:55
Comment thread vignettes/prudence.Rmd
Sys.setenv(DUCKPLYR_FALLBACK_COLLECT = 0)
```

Unlike traditional data frames, duckplyr defers computation until absolutely necessary, allowing DuckDB to optimize execution.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is explained better in one of the sections.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but is this sentence harmful here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be discouraging to have a first sentence that's "complicated".

Comment thread vignettes/prudence.Rmd Outdated
Thrifty duckplyr frames are a compromise between lavish and frugal, discussed further below.
Thrifty duckplyr frames are a compromise between lavish and frugal, discussed below.

### Thrift
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the previous location of this section a bit jarring.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to give the readers time to digest "lavish" and "frugal".

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happened to me when reading was that I saw it would be discussed later and thought, ok then. But when the "Thrift" section appeared, it felt misplaced.

Comment thread vignettes/prudence.Rmd Outdated


### Enforcing DuckDB operation
### Side effect: Enforcing DuckDB operation
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is an important section but also does not go with the flow of the rest of the vignette.

Copy link
Copy Markdown
Member

@krlmlr krlmlr Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, moving to "fallback".

Later.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the PR where I edit the fallback vignette, I added a section about this but that refers to this vignette.

Comment thread vignettes/prudence.Rmd
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Feb 6, 2025

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 00b4546 is merged into main:

  • ✔️001_tpch_01: 24.2ms -> 24.3ms [-3.94%, +5.03%]
  • ✔️001_tpch_02: 66ms -> 64.9ms [-4.92%, +1.72%]
  • ✔️001_tpch_03: 39ms -> 38.9ms [-1.3%, +0.92%]
  • ✔️001_tpch_04: 27.6ms -> 27.7ms [-1.03%, +1.79%]
  • ✔️001_tpch_05: 57.8ms -> 57.5ms [-2.09%, +1.05%]
  • ✔️001_tpch_06: 11.6ms -> 11.5ms [-3.46%, +3.17%]
  • ❗🐌001_tpch_07: 70.5ms -> 71.3ms [+0.06%, +2.31%]
  • ✔️001_tpch_08: 98.1ms -> 97.8ms [-1.66%, +0.95%]
  • ✔️001_tpch_09: 70.4ms -> 70.9ms [-1.24%, +2.6%]
  • ✔️001_tpch_10: 47.2ms -> 47.4ms [-0.54%, +1.17%]
  • ✔️001_tpch_11: 37.7ms -> 37.6ms [-1.89%, +1.42%]
  • ✔️001_tpch_12: 21.1ms -> 21ms [-2.12%, +1.67%]
  • ✔️001_tpch_13: 18.2ms -> 18.5ms [-1.14%, +4.89%]
  • ✔️001_tpch_14: 16.7ms -> 16.7ms [-1.52%, +1.67%]
  • ✔️001_tpch_15: 36.1ms -> 36.2ms [-1.35%, +2.29%]
  • ✔️001_tpch_16: 34.2ms -> 34.4ms [-0.43%, +1.7%]
  • ✔️001_tpch_17: 30.8ms -> 30.6ms [-2.18%, +0.95%]
  • ✔️001_tpch_18: 27.8ms -> 27.7ms [-2.02%, +1.68%]
  • ✔️001_tpch_19: 38ms -> 38ms [-0.54%, +0.95%]
  • ✔️001_tpch_20: 49.8ms -> 49.9ms [-0.63%, +1.35%]
  • ✔️001_tpch_21: 78.7ms -> 78.7ms [-1.46%, +1.51%]
  • ✔️001_tpch_22: 45.9ms -> 45.9ms [-0.65%, +0.75%]
  • ✔️010_tpch_01: 84.4ms -> 85.3ms [-2.31%, +4.37%]
  • ✔️010_tpch_02: 70.9ms -> 70.4ms [-2.49%, +0.85%]
  • ✔️010_tpch_03: 57.4ms -> 58.7ms [-2.05%, +6.49%]
  • ✔️010_tpch_04: 42.3ms -> 43.4ms [-2.71%, +7.55%]
  • ✔️010_tpch_05: 89.1ms -> 87.8ms [-3.6%, +0.61%]
  • ✔️010_tpch_06: 29.5ms -> 29.2ms [-9.24%, +7.28%]
  • ✔️010_tpch_07: 101ms -> 101ms [-2.41%, +1.11%]
  • ✔️010_tpch_08: 126ms -> 127ms [-2.34%, +2.88%]
  • ✔️010_tpch_09: 117ms -> 115ms [-3.73%, +0.2%]
  • ✔️010_tpch_10: 75.3ms -> 74.7ms [-2.21%, +0.58%]
  • ✔️010_tpch_11: 39.4ms -> 39.7ms [-2.71%, +4.44%]
  • ✔️010_tpch_12: 50.7ms -> 50.8ms [-1.02%, +1.45%]
  • ✔️010_tpch_13: 50.8ms -> 51.4ms [-1.75%, +4.3%]
  • ✔️010_tpch_14: 35ms -> 36.1ms [-5.35%, +12.01%]
  • ✔️010_tpch_15: 54.1ms -> 54.2ms [-5.98%, +6.18%]
  • ✔️010_tpch_16: 41.6ms -> 40.1ms [-8.73%, +1.63%]
  • ✔️010_tpch_17: 56.9ms -> 54.9ms [-10.65%, +3.75%]
  • ✔️010_tpch_18: 56.2ms -> 55.3ms [-7.75%, +4.57%]
  • ✔️010_tpch_19: 87.1ms -> 85.7ms [-3.46%, +0.29%]
  • ✔️010_tpch_20: 65.2ms -> 65.4ms [-2.38%, +2.9%]
  • ✔️010_tpch_21: 253ms -> 252ms [-4.82%, +3.6%]
  • ✔️010_tpch_22: 54.4ms -> 55.2ms [-1.22%, +4.26%]
  • ✔️100_tpch_01: 351ms -> 342ms [-19.66%, +14.69%]
  • ✔️100_tpch_02: 123ms -> 128ms [-3.47%, +11.63%]
  • ✔️100_tpch_03: 174ms -> 169ms [-8.62%, +3.02%]
  • 🚀100_tpch_04: 150ms -> 146ms [-4.2%, -1.17%]
  • ✔️100_tpch_05: 268ms -> 267ms [-10.73%, +9.83%]
  • ✔️100_tpch_06: 99.3ms -> 93.3ms [-23.4%, +11.32%]
  • ✔️100_tpch_07: 224ms -> 231ms [-9.51%, +15.79%]
  • ✔️100_tpch_08: 262ms -> 253ms [-16.63%, +9.17%]
  • ✔️100_tpch_09: 329ms -> 329ms [-18.19%, +17.68%]
  • ✔️100_tpch_10: 218ms -> 210ms [-9.5%, +1.89%]
  • ✔️100_tpch_11: 84ms -> 93.9ms [-32.49%, +56.16%]
  • ✔️100_tpch_12: 173ms -> 178ms [-7.56%, +13.88%]
  • ✔️100_tpch_13: 313ms -> 327ms [-0.04%, +9.06%]
  • ✔️100_tpch_14: 110ms -> 111ms [-11.12%, +13.29%]
  • ✔️100_tpch_15: 207ms -> 203ms [-10.16%, +6.08%]
  • ✔️100_tpch_16: 120ms -> 121ms [-6.39%, +9.05%]
  • ✔️100_tpch_17: 176ms -> 176ms [-10.92%, +10.95%]
  • ✔️100_tpch_18: 194ms -> 189ms [-14.23%, +8.99%]
  • ✔️100_tpch_19: 269ms -> 270ms [-17%, +17.6%]
  • ✔️100_tpch_20: 175ms -> 182ms [-6.73%, +14.48%]
  • ✔️100_tpch_21: 1.37s -> 1.33s [-5.17%, +0.55%]
  • ✔️100_tpch_22: 156ms -> 151ms [-13.54%, +7.83%]

Further explanation regarding interpretation and methodology can be found in the documentation.

@krlmlr
Copy link
Copy Markdown
Member

krlmlr commented Feb 7, 2025

This would be easier to review in two PRs. Will split and review individually.

@krlmlr
Copy link
Copy Markdown
Member

krlmlr commented Feb 7, 2025

Other commit now in #593.

@krlmlr krlmlr marked this pull request as draft February 7, 2025 03:23
@krlmlr krlmlr mentioned this pull request Feb 7, 2025
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Feb 7, 2025

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 3385047 is merged into main:

  • ✔️001_tpch_01: 23.9ms -> 23.8ms [-5.94%, +5.23%]
  • ✔️001_tpch_02: 62.9ms -> 63.1ms [-0.45%, +1.1%]
  • ✔️001_tpch_03: 38.8ms -> 39.2ms [-0.69%, +2.85%]
  • ✔️001_tpch_04: 28.3ms -> 28.1ms [-4.24%, +2.75%]
  • ✔️001_tpch_05: 56.2ms -> 56.2ms [-0.74%, +0.74%]
  • ✔️001_tpch_06: 11.4ms -> 11.3ms [-4.78%, +1.75%]
  • ✔️001_tpch_07: 70.4ms -> 70.1ms [-1.28%, +0.37%]
  • ✔️001_tpch_08: 96.9ms -> 96.4ms [-1.9%, +0.87%]
  • ✔️001_tpch_09: 67.1ms -> 67.4ms [-0.53%, +1.55%]
  • ✔️001_tpch_10: 46.8ms -> 47ms [-0.8%, +1.48%]
  • ✔️001_tpch_11: 37.8ms -> 37.7ms [-1.35%, +0.54%]
  • ✔️001_tpch_12: 20.9ms -> 20.9ms [-3.64%, +3.32%]
  • ✔️001_tpch_13: 18.3ms -> 17.8ms [-5.64%, +0.77%]
  • ✔️001_tpch_14: 16.5ms -> 16.5ms [-2.29%, +2.44%]
  • ✔️001_tpch_15: 36.2ms -> 36.1ms [-2.11%, +1.48%]
  • ✔️001_tpch_16: 34.2ms -> 34.3ms [-1.22%, +2.04%]
  • ✔️001_tpch_17: 30.4ms -> 30.8ms [-0.47%, +3.29%]
  • ✔️001_tpch_18: 28.2ms -> 27.9ms [-2.85%, +0.65%]
  • ✔️001_tpch_19: 37.5ms -> 37.8ms [-0.18%, +2.23%]
  • ✔️001_tpch_20: 49.4ms -> 49.2ms [-1.53%, +0.9%]
  • ✔️001_tpch_21: 78.4ms -> 78.5ms [-0.92%, +1.18%]
  • ✔️001_tpch_22: 45.5ms -> 45.5ms [-0.88%, +1.1%]
  • ✔️010_tpch_01: 86.4ms -> 83.4ms [-9.63%, +2.8%]
  • ✔️010_tpch_02: 69.9ms -> 69.9ms [-1.36%, +1.39%]
  • ✔️010_tpch_03: 58.2ms -> 56.4ms [-7.09%, +0.87%]
  • ✔️010_tpch_04: 41.6ms -> 42.9ms [-1.45%, +7.5%]
  • ✔️010_tpch_05: 87.8ms -> 87.7ms [-1.31%, +1.06%]
  • ✔️010_tpch_06: 29.2ms -> 28.4ms [-12.99%, +7.23%]
  • ✔️010_tpch_07: 101ms -> 101ms [-5.01%, +3.63%]
  • ✔️010_tpch_08: 124ms -> 127ms [-1.02%, +4.87%]
  • ✔️010_tpch_09: 114ms -> 116ms [-0.52%, +2.61%]
  • ✔️010_tpch_10: 73.4ms -> 73.3ms [-1.66%, +1.39%]
  • ✔️010_tpch_11: 37.8ms -> 38.5ms [-2.27%, +5.96%]
  • ✔️010_tpch_12: 49.1ms -> 50.5ms [-0.42%, +6.11%]
  • ✔️010_tpch_13: 49.3ms -> 49.3ms [-1.43%, +1.82%]
  • ✔️010_tpch_14: 33.2ms -> 33.4ms [-2.15%, +3.04%]
  • ❗🐌010_tpch_15: 49.9ms -> 54.1ms [+1.98%, +15.17%]
  • ✔️010_tpch_16: 40.4ms -> 38.9ms [-8.55%, +1.17%]
  • ✔️010_tpch_17: 53.4ms -> 53.1ms [-5.23%, +4.2%]
  • ✔️010_tpch_18: 56ms -> 52.6ms [-12.56%, +0.34%]
  • ✔️010_tpch_19: 82.9ms -> 82.5ms [-1.23%, +0.34%]
  • ✔️010_tpch_20: 62.9ms -> 62.9ms [-2.7%, +2.64%]
  • ✔️010_tpch_21: 242ms -> 243ms [-1.24%, +2.13%]
  • ✔️010_tpch_22: 52.4ms -> 53.1ms [-0.9%, +3.58%]
  • ✔️100_tpch_01: 352ms -> 350ms [-20.1%, +19.22%]
  • ✔️100_tpch_02: 124ms -> 125ms [-2.06%, +3.31%]
  • ✔️100_tpch_03: 165ms -> 166ms [-8.99%, +10.19%]
  • ✔️100_tpch_04: 146ms -> 143ms [-12.28%, +8.46%]
  • ✔️100_tpch_05: 242ms -> 260ms [-4.52%, +19.43%]
  • ✔️100_tpch_06: 97ms -> 88.8ms [-18.76%, +1.83%]
  • ✔️100_tpch_07: 220ms -> 221ms [-7.56%, +8.83%]
  • ✔️100_tpch_08: 257ms -> 260ms [-5.82%, +7.71%]
  • ✔️100_tpch_09: 343ms -> 337ms [-15.8%, +12.64%]
  • ✔️100_tpch_10: 208ms -> 208ms [-5.73%, +5.27%]
  • ✔️100_tpch_11: 81.7ms -> 84.7ms [-4.21%, +11.33%]
  • ✔️100_tpch_12: 170ms -> 177ms [-5.09%, +13.27%]
  • ✔️100_tpch_13: 301ms -> 295ms [-8.79%, +4.75%]
  • ✔️100_tpch_14: 117ms -> 119ms [-11.6%, +14.32%]
  • ✔️100_tpch_15: 204ms -> 206ms [-18.03%, +19.56%]
  • ✔️100_tpch_16: 117ms -> 119ms [-8.67%, +11.46%]
  • ✔️100_tpch_17: 175ms -> 177ms [-16.06%, +18.55%]
  • ✔️100_tpch_18: 205ms -> 216ms [-7.19%, +17.45%]
  • ✔️100_tpch_19: 267ms -> 265ms [-10.6%, +8.76%]
  • ✔️100_tpch_20: 166ms -> 168ms [-8.92%, +11.86%]
  • ✔️100_tpch_21: 1.36s -> 1.34s [-11.14%, +8.7%]
  • ✔️100_tpch_22: 157ms -> 156ms [-16.42%, +15.37%]

Further explanation regarding interpretation and methodology can be found in the documentation.

@krlmlr krlmlr force-pushed the prudence-rmd branch 2 times, most recently from f404f17 to 3f29cba Compare February 7, 2025 04:11
Copy link
Copy Markdown
Member

@krlmlr krlmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping open for discussion.

Comment thread vignettes/prudence.Rmd
Sys.setenv(DUCKPLYR_FALLBACK_COLLECT = 0)
```

Unlike traditional data frames, duckplyr defers computation until absolutely necessary, allowing DuckDB to optimize execution.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but is this sentence harmful here?

Comment thread vignettes/prudence.Rmd Outdated
The term "prudence" is introduced here to set a clear distinction from the concept of "laziness", and because "control of automatic materialization" is a mouthful.

## Eager and lazy computation
## DuckDB optimization: lazy evaluation
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"lazy evaluation" is easy to confuse with how R handles arguments.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, this should be "lazy computation", sorry.

Comment thread vignettes/prudence.Rmd Outdated
Thrifty duckplyr frames are a compromise between lavish and frugal, discussed further below.
Thrifty duckplyr frames are a compromise between lavish and frugal, discussed below.

### Thrift
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to give the readers time to digest "lavish" and "frugal".

Comment thread vignettes/prudence.Rmd Outdated


### Enforcing DuckDB operation
### Side effect: Enforcing DuckDB operation
Copy link
Copy Markdown
Member

@krlmlr krlmlr Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, moving to "fallback".

Later.

@krlmlr
Copy link
Copy Markdown
Member

krlmlr commented Nov 26, 2025

@copilot: Resolve conflicts.

Copy link
Copy Markdown
Contributor

Copilot AI commented Nov 26, 2025

@krlmlr I've opened a new pull request, #817, to work on those changes. Once the pull request is ready, I'll request review from you.

@krlmlr
Copy link
Copy Markdown
Member

krlmlr commented Nov 26, 2025

Completely borked by Copilot. This is a "discussion-only" PR, not intended to be merged.

@krlmlr krlmlr changed the title docs: improve documentation of prudence argument discussion: docs: improve documentation of prudence argument Nov 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add "memory protection" somewhere near definition of funneling

3 participants