feat: expose root waker#518
Conversation
11431b7 to
04347ef
Compare
|
Hey @Patryk27 thank you for this, and apologies for being slow to comment. There's nothing wrong with this implementation per se, and we could ship it as is, but, after a lot of discussion with @charypar, we're struggling to understand what legitimate use cases it solves for. The example you've given (using async IO directly from inside the update function) is one that we're trying to avoid. Instead we want to preserve the testing story for Crux, where tests are easy, deterministic and fast because all they do is inspect the effect tree, resolving where needed. As soon as you bring real IO into the update function it now has to deal with an external executor, potential timing problems and race conditions, interleaving between executors, mocks of e.g. databases, and all the problems associated with testing in non-Crux apps. We do, however, understand the need to be able to run async IO from inside the core (but outside the update function), and so initially introduced middleware layers to intercept effects on their way to the shell and handle them locally in Rust. It's very likely that this wasn't the most ergonomic way to achieve this, and so we recently raised an RFC #514 (it has a sample implementation) for an Effect Router that routes effects to one of several lanes, either (a) serialised over the bridge to the shell, (b) directly on a custom FFI, or (c) locally to a Rust library, or SDK, on an external async runtime. By keeping real work out of the managed effects system, we keep the tests easy, and easy to reason about, fast and portable. A Crux core can be tested, in seconds, anywhere that We want the Crux API to guide users into a pit of success, and I'm worried that if we were to allow an external waker, it would be used to run external IO inside Crux, diluting its strengths, and causing problems down the line. Did you have any other use cases for this that we might have overlooked? Would be super interested if you know of any situations where you'd want to advance the internal executor without calling update/resolve. TLDR; instead of bringing async IO into Crux we want users to something like this instead:
|
I know you want to avoid it, but what we saw was a hanging app. From that perspective, this PR looks like a bug fix: returning a You can still warn against this pattern, but having a working app (that might go against the recommended pattern) seems better than one that hangs mysteriously at runtime. Rust is famous for "if it compiles, it works" and this what we're leaning into here. I think you're basically running into a limitation of what you can express in the type system and this limitation is dangerous when it silently lets people write apps that don't fully work at runtime. |
Yes, this is definitely problematic, but in the opposite way – it was never supposed to work, the fact that it "half works" (but hangs) is an issue. But making it "work" by breaking out of the managed effect system completely is a fix in the opposite direction from what we'd want to do. We want to lead people into the managed effect system, not out of it, because without it, the utility of Crux (the portability and testability) disappears. In other words, if avoiding the managed effects is genuinely the right trade-off for an app, it probably shouldn't be built with Crux? What we need to find instead is a way to stop this use compiling, or at least raise a runtime error. We need a way to clearly signal that this is intentionally not a supported use of Crux. |
I'm not sure I follow the argument: after all, that's the case with traits as well! On their own they don't require any additional support and they don't introduce any non-determinism. Algebraic effects - at least as implemented in Crux - are completely isomorphic to trait methods, they don't provide anything more.
Sure, but it's an opt-in mechanism - it might even be
It's about integrating with Rust code that doesn't use Crux - an example use case could be an external function such as: pub async fn send_email(db: &dyn email::Database, body: &str);It can be integrated with Crux by doing: impl email::Database for crux_core::CommandContext</* ... */> {
/* ... */
}... but that requires implementing a translation layer, i.e. taking: pub trait Database {
async fn foo(&self, /* ... */) -> /* ... */;
async fn bar(&self, /* ... */) -> /* ... */;
async fn zar(&self, /* ... */) -> /* ... */;
}... representing it with effects: pub enum Effect {
Foo(/* ... */),
Bar(/* ... */),
Zar(/* ... */),
}
pub enum Output {
Foo(/* ... */),
Bar(/* ... */),
Zar(/* ... */),
}... and then: impl email::Database for crux_core::CommandContext</* ... */> {
async fn foo(&self, /* ... */) -> /* ... */ {
let Output::Foo(output) = self.request_from_shell(Effect::Foo(/* ... */)).await {
unreachable!()
};
output
}
}... and then implementing the inverse of that for the middleware - that's a lot of "no-op" code! And pray you don't have to represent anything more complex than request->response - say: pub trait Database {
async fn tx(&self, f: impl FnOnce(&mut dyn Transaction)) -> Result<()>;
/* ... */
}
pub trait Transaction {
/*... */
}Representing all of that with effects brings way more problems than it promises to solve, IMO. Edit: in a hindsight, in this specific case you could represent pub async fn send_email(db: &dyn email::Database, net: &dyn Network, body: &str);... where |
|
Not to make it all black and white, but I think this is a straight choice – either you find the managed effect abstraction helpful for your use, in which case the added plumbing is worth your while, or you don't, in which case Crux is probably not the right fit. The problem is that as soon as you opt out even slightly, all the guarantees are out of the window 😔 Not only is this difficult for the user (where it's at least self-inflicted), but it's also difficult for us to maintain, having to consider this as a possibility at every turn. We'd be giving up the guarantees.
Totally. It's the price of the testability and easy portability. Maybe, at some point in the future, we'll find a way to derive all that plumbing mechanically from a user-defined trait. But the data orientation of the interface is very intentional. Two questions, just to understand the problems you're facing better:
pub async fn send_email(db: &dyn email::Database, net: &dyn Network, body: &str);if both
Can you spell them out for me, please? 🙏🏻
What is the scenario for this need? I would've thought this is another case where the extra effects to be emitted, depending on the resolution outcome, is again part of the app side of the capability? On this:
We do say that as guidance, exactly because it allows the kind of orchestration across different types of effects that we're alluding to with the examples above – lower level effects mean more of the orchestration logic is in the core, pure and very easly testable, and the total set of them tends to be smaller as you get further down to the system interface. However, it's more of a rule of thumb, which applies especially when building things from scratch. If there are existing, well tested SDKs which just need integrating, I don't see a problem with more coarse-grained effects which are effectively just bindings to more complicated transactions outside of the app logic. More of a side note, but:
They are indeed more or less isomorphic, but the tradeoffs are different – absolutely not the case that they don't provide anything more. The difference is in how you can inspect their usage. In order to test a use of a trait, you have to provide a test-only implementation that records the method calls and allows you to somehow set the behaviour for the test. It's doable, but quite fiddly, and as far as I know, you get no help to make it more convenient. That is just standard faking/mocking. Each fake tends to be a one-off for the test, otherwise it grows complex enough to almost require its own tests. This is where there's a sudden step-change in testing effectful code. For one, you need to introduce traits to even do it (with capabilities, you effectively do that by default, especially when using existing ones), and then you still need to do the mocking work... With the data-oriented approach, you use the same exact mechanic for testing as you do for implementing the effects in the first place, and we can give you a convenient API to write expectations (see #523 for the latest example). So yes, isomorphic, but that doesn't quite mean the same or just as ergonomic. IMHO all API design is about finding the nicest arrangement of the basic mechanic of what's going on, that is the most convenient to use (and test). |
Why are the guarantees (easy testing) impacting your ability to maintain the code in Basically: you can have "non-idiomatic" code in a corner of your own code base without compromising your ability to maintain said codebase. These ~50 lines of new code are not hard to maintain, they are not invasive. Are you saying that the The opt-in code @Patryk27 provided (an "escape hatch" if you will) allows We can put it behind a Cargo feature: that way you can have the guarantee that your own products which use Crux can maintain the properties you like. Others can opt-in (via the feature) to correct interop with other async runtimes. |
I guess we'll have to agree to disagree - in my eyes this isn't allowing the two to "play nice" with each other, it alows opting out of the effect system and its promises altogether, in a way which stops crux from making safe assumptions about control flow and execution sequencing. It also begs the question why one would want to do that, but still use Crux. Wouldn't it be easier to go directly to uniffi or facet_generate at that point?
No, I am saying that crux as a framework relies on the property of the effect system being an isolated async runtime driven by the core, not an external, active runtime. The two can coexist, but not mix. I agree it's not ideal that this restriction isn't encoded in the type system, it would be better if it was, and as I said, I'd love to find a way to do that. Re. maintenance – it's not about maintaining the escape hatch itself, it's maintaining and developing the rest of Crux with respect to the uses it would allow. If this exists as part of the API, then we have to take it into considerations as something people will do, when building other features and subsystems, to make sure it can actually be used, and things work as expected. Given it's going against the intent of the design, that is not a trivial ask. |
But this is orthogonal to other features, no? I don't see how enabling correct interop with other async runtimes can be detrimental to your framework? Your features cannot be reliant on the absense of this correctness, right? I 100% get that you don't like this idea of people bridging to other frameworks — that is is against how you would like people to use Crux. However, your arguments for not letting them do it are not that... it's that it's somehow a burden which doesn't follow from what you write. |
|
Portability and test-ability are the two most important properties of Crux. They're actually 2 sides of the same coin. Portability gives us test-ability (you can run a Crux app just as easily in a test as you can in a shell, or anywhere else for that matter). |
It's not that we don't like it, out of some kind of spite (we're in fact in the process of building a second iteration of supporting coexisiting with another runtime safely, with all the other guarantees kept intact). It's that we can't see the consequences of what supporting it in this specific way will have. I'm honestly not even sure I'm picturing the intended use correctly, let alone the other possible uses. My best attempt is that this is intended to enable hosting the Crux core inside an async task on another runtime, and allow a future created inside the Crux app to propagate its wake signal out to the hosting task, in order for that task to be polled by the external runtime, and effectively "poll" the core, to in turn poll the future created from inside the app to continue. Except.. that hosting future will now be woken every time anything is ready to advance inside the Crux executor. That feels like it could easily cause an inifinite poll loop for example. And that's just one use. How would it interact with being hosted in a Swift actor? Can it? I'm not even sure 🤷🏻♂️ What other consequences of notifying an arbitrary provided waker on every wake in the effect system are there? All I can say is that @StuartHarris and I spent half of today going back and forth thinking through the nuances and consequences of supporting this – possible race conditions it might introduce, deadlocks and other traps, what assumptions are no longer safe – and we're still not entirely sure we can fully envision them. Maybe we're being slow, but it certainly doesn't bode well for our ability to correctly reason about the effect system with this change in it, as we maintain it the future. Concurrent, and especially parallel programs are hard to think about, hard to test and hard to get right. It's the entire premise of Crux – it exists to insulate the decision making in the app from that noisy complexity, so it is predictable. Also, I'd appreciate it if you could refrain from framing the design choices Crux makes as an "absence of correctness", where said correctness is based solely on your opinion. It's frankly pretty manipulative. |
|
@Patryk27 and @mgeisler thank you for your contributions, and @StuartHarris @charypar thanks for your deep consideration. One thing that I am still confused about is the core element that is not working, that you would like to be working. And I notice there isn't an issue. And per our contributing guidelines in the repo, we want there to be issues for associated changes. I myself violated this rule just now when I introduced the assertion changes, so not a huge deal, but I think it would help us avoid confusion and conflation of the desired outcome with this implementation. I am writing a terminal app myself with That's a long way of saying: would you mind creating an issue? And then perhaps we can discuss in our community meeting on May 20, if not before then? |
This merge request provides
Core::with_waker()that allows to register a waker attached to the root command.Having such waker comes handy for injecting async dependencies that you can't (or don't bother to) model via effects - with the code here you can do:
... with an example integration being: