lib/python/pygrass: fix Python 3.14 pickling error in grid tests#7257
lib/python/pygrass: fix Python 3.14 pickling error in grid tests#7257SyedAhad01 wants to merge 37 commits intoOSGeo:mainfrom
Conversation
|
Read our contribution guide, you'll see that most of the back and forth of precious shared CI time can be avoided by using precommit (or prek) to have most simple formatting checks already made. And I assume you're using an AI or agent or something like that, and I saw you were in VS Code in another PR, so you can reread and stage each line to make sure you're only committing what is relevant to the PR. Touching unrelated lines only makes reviewing harder: the last PR together, it started with many many changes, whilst it ended up being less than 10 lines, two parts. Way easier when concise like that. |
|
Thank you for the feedback and the guidance. I will read the contribution guide and set up pre-commit before pushing further changes. I will also make sure to carefully review each line and only commit what is directly relevant to the fix |
|
Don't worry, your general approach is fine. There's already some examples in the repo of the pattern needed, that I've fixed as a proof of concept a while ago. And it looked a bit like that. |
|
You don’t have to keep your branch (and draft) updated with main, after each of our commits. We can’t have all hundreds of PRs redo all CI after each commit, it would be exponential |
|
Unless you’re pushing new changes and it will run again, then ya, it’s a good time to make it run on updated code and linters |
|
Thank you for clarifying. I understand now I should only update the branch when I am pushing my own new changes, not after every commit to main. I will keep that in mind for future PRs. Sorry for the unnecessary CI runs. |
|
Hi @echoix just a gentle heads-up that the PR is ready for review whenever you have a moment. All 26 checks are passing. I appreciate your time and continued guidance throughout this process. |
echoix
left a comment
There was a problem hiding this comment.
I'm not sure it is working as you'd think yet. If you look at pytest test in macOS or Windows (Linux should work fine), let's take Windows, the tests of pygrass grid tests should not end up with expected failure anymore. At one point in your development, assuming everything went right, the tests should've been fixed and passing, and they would fail because of an unexpected success. That would have been because the custom attribute @xfail_mp_spawn that wasn't removed.
I see that that attribute's error has changed, but the tests that should've been fixed doesn't have that attribute removed. So the tests still fail, but we ignore it if the default start method is spawn, like Windows.
So, there's still a bit of work to do.
echoix
left a comment
There was a problem hiding this comment.
To help you out, I identified the lines where the attributes should be removed, plus other second reading comments
| ), | ||
| check=multiprocessing.get_start_method() == "spawn" | ||
| or surface != "non_exist_surface", |
There was a problem hiding this comment.
That logic here is unexpected to me, could you explain a bit?
There was a problem hiding this comment.
I added this condition to handle the case where the surface does not exist. On Windows/macOS (spawn), the subprocess always fails, so we let the error propagate. On Linux (fork), we skip the error so the test can still check if mapsets were cleaned. Let me know sir if there is a simpler way to do this
|
After removing xfail_mp_spawn, all tests now fail with RuntimeError: Subprocess failed with exit code 1 on both macOS and Windows. It seems the issue is inside GridModule itself where the internal worker functions are not picklable under spawn. Could you point me to the example in the repo that shows the correct pattern for fixing this? I want to make sure I apply the right fix |
|
I investigated the failures further and it looks like the issue is not in the test wrapper but inside GridModule itself, likely due to multiprocessing spawn requiring top-level worker functions. I’ll explore the GridModule implementation to identify non-picklable functions and propose a fix. Please let me know if there is an existing pattern in the repo I should follow. |
|
For the tests, it’s about right. For the other part, I’m not sure there’s a pattern yet, and that is the core of the work, to investigate what’s happening under the hood. Debugging subprocesses is maybe a bit harder, but you probably did a little bit. It will be easier for you if you manage to do that locally. That’s not the kind of task that you can do by iterating in a PR. |
|
Thank you for the feedback. I understand now I should debug locally using spawn mode instead of iterating in the PR. I will reproduce the issue locally on Windows with multiprocessing.set_start_method('spawn', force=True), find the root cause, and then submit a clean fix. I will convert back to "Ready for review" once the local debugging is done. |
|
I just had an idea. In some python code recently-ish (last two years), we ended up needing to pass around env/session around in order to be able to properly initialize c-based tools. Is it possible that one of the functions here need to have that too? Like the failures would’ve been because the environment variables are not defined in the new subrosse |
|
Thank you for the suggestion! I implemented the fix by passing the environment variables explicitly through the work tuple to |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Edouard Choinière <27212526+echoix@users.noreply.github.com>
Co-authored-by: Edouard Choinière <27212526+echoix@users.noreply.github.com>
Co-authored-by: Edouard Choinière <27212526+echoix@users.noreply.github.com>
…spawn compatibility
Pass env explicitly to copy_groups so subprocess workers inherit the correct GRASS session environment in spawn mode (macOS/Windows). Fixes RuntimeError: Subprocess failed with exit code 1 in grid tests.
Pass env explicitly to copy_groups so subprocess workers inherit the correct GRASS session environment in spawn mode (macOS/Windows). Fixes RuntimeError: Subprocess failed with exit code 1 in grid tests.
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
ee756ec to
05cf371
Compare
|
What about actually adding Python 3.14 to pytest test workflow, temporarily so you can confirm it actually runs ok with 3.14, the topic of your PR. You could revert it afterwards. Did you manage to get any part working correctly on 3.14 locally? I see you are still working on it and pushing changes |
|
I haven't been able to test locally on Python 3.14 yet, which is why I have been iterating in the PR. I will add Python 3.14 to the pytest workflow temporarily as you suggested so CI can confirm whether the fix works. I'll revert it afterwards once verified. The changes I've made so far: I'll push the workflow update shortly. |
Fixes PicklingError failures when running tests with Python 3.14.
Python 3.14 cannot pickle local functions defined inside other functions
when using multiprocessing. This fix moves all run_grid_module inner
functions to module-level helper functions and uses functools.partial
to pass parameters.
Related to #6104