A stream consumer can resume an explicit subscription whose original subscribe call threw because notifying a producer of the new consumer failed. The resumed subscription will miss events.
When a consumer subscribes to a stream, the PubSubRendezvousGrain persists the new subscription before notifying any producers of the new subscriber.
If adding the subscriber to a producer fails for a reason other than a select few (GrainExtensionNotInstalled, ClientNotAvailable, OrleansMessageRejection on a system target), then we end up with:
- A PubSubGrainState that has the new subscription, and the producer that failed to add the subscriber
- The producer without knowledge of the subscriber
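The divergence described above can be sketched with a toy in-memory model. This is not Orleans code: `Producer`, `Rendezvous`, `register_consumer` and `add_subscriber` are hypothetical names chosen for illustration, and the model only assumes the persist-then-notify ordering described here.

```python
class Producer:
    """Toy stand-in for a stream producer; tracks the subscriptions it delivers to."""

    def __init__(self, fail_next_add=False):
        self.subscribers = set()
        self.fail_next_add = fail_next_add

    def add_subscriber(self, subscription_id):
        if self.fail_next_add:
            self.fail_next_add = False
            raise RuntimeError("simulated AddSubscriber failure")
        self.subscribers.add(subscription_id)


class Rendezvous:
    """Toy stand-in for the pub-sub state (hypothetical, not the Orleans implementation)."""

    def __init__(self, producers):
        self.producers = list(producers)
        self.persisted_subscriptions = set()

    def register_consumer(self, subscription_id):
        # Step 1: persist the subscription first.
        self.persisted_subscriptions.add(subscription_id)
        # Step 2: notify producers. An exception here leaves step 1 in place.
        for producer in self.producers:
            producer.add_subscriber(subscription_id)


producer = Producer(fail_next_add=True)
rendezvous = Rendezvous([producer])

try:
    rendezvous.register_consumer("sub-1")
    subscribe_threw = False
except RuntimeError:
    subscribe_threw = True

print(subscribe_threw)                                # True
print("sub-1" in rendezvous.persisted_subscriptions)  # True
print("sub-1" in producer.subscribers)                # False
```

In this model the subscribe call throws, yet the subscription stays persisted while the producer never learns about it.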
Although the consumer's subscribe operation will throw, the consumer subscription remains in the state. Therefore:
- If the consumer requests all of its subscriptions from the stream, it will get a handle for the subscription the producer doesn't know about.
- If the consumer then resumes the subscription, it will expect to get all future events published by the producer, but the producer will not deliver events to this consumer.
This assumes the consumer follows the recommended Orleans behavior to resume existing explicit subscriptions, rather than to subscribe to new ones, to avoid being subscribed multiple times.
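As a rough illustration of why the resume-first pattern is affected, here is a toy model of a consumer that resumes existing handles and only subscribes when it finds none. `ToyStream` and `activate` are hypothetical names; the model assumes, as the behavior described above suggests, that resuming a handle rebinds the consumer's callback without re-registering the subscription with the producer.

```python
class ToyStream:
    """Hypothetical in-memory stream.

    `persisted_handles` is what the pub-sub state holds;
    `producer_known` is what the producer will actually deliver to.
    """

    def __init__(self, persisted_handles=(), producer_known=()):
        self.persisted_handles = list(persisted_handles)
        self.producer_known = set(producer_known)
        self.callbacks = {}

    def get_all_subscription_handles(self):
        return list(self.persisted_handles)

    def subscribe(self, callback):
        handle = f"handle-{len(self.persisted_handles)}"
        self.persisted_handles.append(handle)
        self.producer_known.add(handle)
        self.callbacks[handle] = callback
        return handle

    def resume(self, handle, callback):
        # Assumption of this model: resuming only rebinds the callback.
        self.callbacks[handle] = callback

    def publish(self, event):
        for handle in sorted(self.producer_known):
            self.callbacks[handle](event)


def activate(stream, callback):
    """Resume existing handles if any exist, otherwise subscribe."""
    handles = stream.get_all_subscription_handles()
    if handles:
        for handle in handles:
            stream.resume(handle, callback)
        return "resumed"
    stream.subscribe(callback)
    return "subscribed"


# A handle left behind by a failed subscribe: persisted, unknown to the producer.
received = []
stream = ToyStream(persisted_handles=["handle-0"])
outcome = activate(stream, received.append)
stream.publish("event 2")
print(outcome, received)  # resumed []

# For contrast, a stream with no leftover handle.
healthy_received = []
healthy = ToyStream()
healthy_outcome = activate(healthy, healthy_received.append)
healthy.publish("event 2")
print(healthy_outcome, healthy_received)  # subscribed ['event 2']
```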
This project contains two grain types and a grain filter to recreate the condition described above:
- ProducerGrain - Grain that publishes events when called directly.
- ConsumerGrain - Grain that subscribes to a stream, or resumes an existing handle if one exists.
- FailAddSubscriberGrainCallFilter - Grain call filter that throws an exception on the first call to IStreamProducerExtension.AddSubscriber, where the producer is a ProducerGrain.
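The fail-once behavior of the filter can be sketched independently of Orleans. The class below is a hypothetical toy, not the project's FailAddSubscriberGrainCallFilter; it only shows the idea of failing the first matching call and passing every other call through.

```python
class FailFirstCallFilter:
    """Toy call filter: fails the first matching call, then passes everything through."""

    def __init__(self, target_type, method_name):
        self.target_type = target_type
        self.method_name = method_name
        self.already_failed = False

    def invoke(self, target_type, method_name, call):
        matches = (target_type, method_name) == (self.target_type, self.method_name)
        if matches and not self.already_failed:
            self.already_failed = True
            raise RuntimeError(f"simulated failure in {method_name}")
        return call()


call_filter = FailFirstCallFilter("ProducerGrain", "AddSubscriber")

# Calls that do not match the target are never failed.
other = call_filter.invoke("ConsumerGrain", "AddSubscriber", lambda: "ok")

results = []
for _ in range(3):
    try:
        results.append(call_filter.invoke("ProducerGrain", "AddSubscriber", lambda: "ok"))
    except RuntimeError:
        results.append("failed")

print(other)    # ok
print(results)  # ['failed', 'ok', 'ok']
```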
We see the following order of events:
1. The ProducerGrain publishes an event. This ensures it is registered as a producer on the PubSubGrainState.
2. The ConsumerGrain is told to subscribe to the stream. The subscribe operation fails due to the filter.
3. The ProducerGrain publishes an event. We see that the consumer does not receive this event.
4. The ConsumerGrain is told to subscribe again. This time, it finds it already has an existing handle, and resumes that handle successfully.
5. The ProducerGrain publishes more events. The consumer does not receive these events either, despite having successfully resumed its subscription handle.
[Producer] Publishing event 0...
[Producer] Successfully published event 0...
[Consumer-6c1c] Activating subscription to producer.
[Consumer-6c1c] Found no existing subscription to producer stream. Subscribing.
[Consumer-6c1c] ERROR - Failed to subscribe to producer stream.
[Producer] Publishing event 1...
[Producer] Successfully published event 1...
[Consumer-6c1c] Activating subscription to producer.
[Consumer-6c1c] Found existing subscription to producer stream. Resuming.
[Consumer-6c1c] Successfully resumed subscription to producer stream.
[Producer] Publishing event 2...
[Producer] Successfully published event 2...
[Producer] Publishing event 3...
[Producer] Successfully published event 3...
This is problematic because the consumer believes it has successfully resumed a handle while the producer doesn't know about it, so the consumer misses events. Perhaps the consumer should not get back any subscription handles in step 4 above, since the subscribe operation in step 2 failed. The output would then look like this:
[Producer] Publishing event 0...
[Producer] Successfully published event 0...
[Consumer-6c1c] Activating subscription to producer.
[Consumer-6c1c] Found no existing subscription to producer stream. Subscribing.
[Consumer-6c1c] ERROR - Failed to subscribe to producer stream.
[Producer] Publishing event 1...
[Producer] Successfully published event 1...
[Consumer-6c1c] Activating subscription to producer.
[Consumer-6c1c] Found no existing subscription to producer stream. Subscribing.
[Consumer-6c1c] Successfully subscribed to producer stream.
[Producer] Publishing event 2...
[Consumer-6c1c] Received event 2.
[Producer] Successfully published event 2...
[Producer] Publishing event 3...
[Consumer-6c1c] Received event 3.
[Producer] Successfully published event 3...
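One conceivable way to get the behavior shown in this expected output, sketched on a toy model, is to undo the persisted subscription when notifying a producer fails. This is only an illustration under stated assumptions, not what Orleans does and not a proposed patch: `Producer`, `Rendezvous` and their methods are hypothetical, and a real distributed rollback could itself fail.

```python
class Producer:
    """Toy producer that fails a configurable number of add_subscriber calls."""

    def __init__(self, failures_remaining=0):
        self.subscribers = set()
        self.failures_remaining = failures_remaining

    def add_subscriber(self, subscription_id):
        if self.failures_remaining > 0:
            self.failures_remaining -= 1
            raise RuntimeError("simulated AddSubscriber failure")
        self.subscribers.add(subscription_id)

    def remove_subscriber(self, subscription_id):
        self.subscribers.discard(subscription_id)


class Rendezvous:
    """Toy pub-sub state that rolls back a subscription if notification fails."""

    def __init__(self, producers):
        self.producers = list(producers)
        self.persisted_subscriptions = set()

    def register_consumer(self, subscription_id):
        self.persisted_subscriptions.add(subscription_id)
        notified = []
        try:
            for producer in self.producers:
                producer.add_subscriber(subscription_id)
                notified.append(producer)
        except Exception:
            # Undo partial work so no orphaned handle remains, then re-raise.
            for producer in notified:
                producer.remove_subscriber(subscription_id)
            self.persisted_subscriptions.discard(subscription_id)
            raise


producer = Producer(failures_remaining=1)
rendezvous = Rendezvous([producer])

try:
    rendezvous.register_consumer("sub-1")
    first_attempt_threw = False
except RuntimeError:
    first_attempt_threw = True

handles_after_failure = sorted(rendezvous.persisted_subscriptions)

# The consumer finds no handle, so it subscribes afresh and succeeds.
rendezvous.register_consumer("sub-2")

print(first_attempt_threw)                         # True
print(handles_after_failure)                       # []
print(sorted(rendezvous.persisted_subscriptions))  # ['sub-2']
print(sorted(producer.subscribers))                # ['sub-2']
```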
The solution has a commented-out second consumer, which can be added to the repro to demonstrate that a successfully subscribed consumer prints a message when it handles an event.