[17:29] <toad_> okay, where was i?
[17:30] <toad_> if A sends a subscribe request to B
[17:30] <toad_> and B forwards it
[17:30] <toad_> and B gets several more subscribers
[17:31] <toad_> then from their point of view, B really ought to send a SubscribeRestarted...
[17:31] <toad_> but if it does, it could eat its tail
[17:31] <toad_> ... except that it won't, because it's preserving the ID
[17:32] <toad_> pub/sub is weird
[17:32] <Sugadude> Sounds like Indian mythology to me. ;)
[17:33] <toad_> so.... what.... we ensure we can't bite our own tail, by preserving the UID, and when we change it, we send CoalesceNotify back along the subscribe request chain - only to the nodes actually involved in the chain
[17:33] <toad_> the ORIGINAL chain
[17:34] <toad_> so we can end up subscribing to or through one of our dependants who coalesced with us
[17:34] <toad_> just not to one which is already on the chain......
[17:34] <toad_> hrrrrrrrrrm
[17:34] <toad_> Sugadude: ;)
[17:35] <toad_> now, does THAT work?
[17:36] <toad_> are loops even a problem? well, if they lead us down a suboptimal route and we end up not going to the real root, then yes, they are...
[17:37] * Sugadude shakes his magic 8 ball. "Outlook uncertain"
[17:37] <toad_> yeah
[17:37] <toad_> if we send nodes a CoalesceNotify when they join our previously existing request, we will end up propagating it...
[17:37] <toad_> across the tree, eventually
[17:37] <toad_> that is bad
[17:38] <toad_> if we make the joiners' state distinct from the original's state, that's the way forward... maybe
[17:39] <toad_> but then we can still deadlock
[17:39] <toad_> A joins B, B joins C, C joins A
[17:39] <toad_> bang
[17:39] <toad_> deadlock
[17:40] <Sugadude> "A joins B, B joins C, C joins A". A,B,C start a support group and invite D to join. ;P
[17:40] <toad_> we can have each request go all the way individually... but that doesn't scale
[17:41] <toad_> we can have each node reject (terminally, exponential backoff on client) subscribe requests while it is itself subscribing...
[17:41] <toad_> that was ian's suggestion
[17:41] <toad_> and it looks the simplest thing
[17:42] <toad_> okay so how would restarts work in such a scenario?
[17:42] <toad_> same as planned really...
[17:43] <toad_> i don't think there is a big distinction between subscribe and resubscribe... just by a few nodes joining a subscribe, it becomes effectively a resubscribe...
[17:50] <linyos> hmm, a malicious node could break the broadcast-subgraph in two, couldn't it?
[17:51] <toad_> a malicious node could do a lot in the current pub/sub architecture, which is basically a tree
[17:51] <toad_> if we don't use the tree to reduce bandwidth use, we can reduce the vulnerability
[17:51] <toad_> i.e. if we relay messages to our parent and our dependants equally
[17:52] <toad_> rather than going up the tree so that the root can decide on collisions
[17:52] <linyos> tricky business.
[17:52] <toad_> there are two advantages to using the tree that way - one is that we have an authoritative decision maker for collisions. the other is that we reduce bandwidth usage significantly if the graph isn't very treeish
[17:53] <toad_> but then it should be quite treeish
[17:53] <toad_> so it may not be an issue
[17:53] <toad_> likewise, as long as we only relay a given ID once, we don't really need a collision dispute resolution mechanism
[17:54] <toad_> although that does mean that you can't easily detect collisions from the client end...
[17:54] <toad_> s/easily/reliably
[17:55] <toad_> i suppose we can just say that it is not guaranteed to be reliable in the presence of multiple writers, and client authors must take this into account
[17:55] <linyos> why would you want multiple writers?
[17:55] <linyos> just use one stream for each writer.
[17:55] <linyos> subscribe to them all.
[17:55] <linyos> aggregate at client end.
[17:56] <toad_> well
[17:56] <Sugadude> One stream to bring them, One stream to control them, One stream to bind them.... Oh wait, wrong movie? ;)
[17:56] <toad_> the original reason i thought about that was that clients may not know the latest sequence number
[17:56] <toad_> that's not a problem if the sender subscribes well before he sends
[17:57] <toad_> the idea was that frost etc might benefit from one stream per channel, if you know the key you can post...
[17:57] <toad_> if you have separate streams for each client, you'll have a LOT of streams, and that means each one must be fairly low bandwidth
[17:58] <toad_> but yeah, lets turn this upside down
[18:00] <linyos> for message boards you really want one stream per client anyway.
[18:01] <linyos> for security reasons---you can kick out malicious/compromised clients.
[18:01] <linyos> imho, better to just make sure the system scales to thousands of streams.
[18:01] <toad_> yeah, any shared bus is vulnerable
[18:02] <toad_> well
[18:02] <toad_> we have to keep some packets
[18:02] <toad_> in order to deal with the inevitable breakages
[18:02] <toad_> we have to cache the last N packets for some N
[18:02] <toad_> packets will probably be around 1kB
[18:03] <linyos> write them to the disk if breakages are infrequent.
[18:03] <toad_> so if we have a limit of 4096 streams, and cache 8 packets for each stream, we get 32MB of data to cache
[18:03] <toad_> which sucks!
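The 32MB figure is just streams x packets x packet size. A minimal sketch of that arithmetic follows; the function name cache_size_bytes is made up for illustration, and the 1kB packet size, 4096-stream limit and 8-packet depth are the numbers quoted in the discussion, not measured values.

```python
# Back-of-the-envelope sizing for the per-stream packet cache.
# All figures come from the discussion; the function name is hypothetical.
PACKET_SIZE_BYTES = 1024  # "packets will probably be around 1kB"


def cache_size_bytes(max_streams, packets_per_stream,
                     packet_size=PACKET_SIZE_BYTES):
    """Worst-case bytes needed to keep the last N packets of every stream."""
    return max_streams * packets_per_stream * packet_size


if __name__ == "__main__":
    for depth in (8, 32):
        size = cache_size_bytes(4096, depth)
        print(depth, "packets/stream ->", size // (1024 * 1024), "MB")
```

The same function shows why the cache depth matters: going from 8 to 32 packets per stream quadruples the worst case.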
[18:04] <linyos> yeah, that is pretty harsh when you really want to scale.
[18:05] <toad_> well, lets say we cache it on disk
[18:05] <linyos> can't you just forget them after a while?
[18:06] <toad_> streams, or packets?
[18:06] <linyos> i mean, if the tree is broken and you're receiving yet more incoming packets, you can't just queue them indefinitely.
[18:06] <toad_> i don't think LRU on streams is going to work well
[18:07] <toad_> well, if we are using this for non-real-time apps like RSS, it'd be very nice if we could cache them
[18:07] <toad_> so if i turn my node off for 2 hours to play quake, then reconnect, i still get the updates i've missed
[18:07] <toad_> anyway, i'm not totally averse to a disk-based cache
[18:07] <linyos> when do you stop?
[18:08] <toad_> hmm?
[18:08] <toad_> i don't see why we shouldn't use a disk based cache
[18:08] <linyos> i mean, the publisher must know how long his messages will be cached, right?
[18:09] <linyos> ie, "up to ten messages", or "for up to two hours"
[18:09] <toad_> i don't see that setting an arbitrary time period after which stuff is dropped would predictably reduce space usage
[18:09] <toad_> we need it to *PREDICTABLY* reduce space usage for it to be useful, don't we?
[18:10] <linyos> i'm out of my depth here.
[18:10] <toad_> well
[18:11] <toad_> the proposal is we keep the last 32 packets
[18:11] <toad_> or 8 packets
[18:11] <toad_> or N packets
[18:11] <toad_> for each stream
[18:11] <toad_> for as long as we are involved in that stream
[18:11] <linyos> so that?
[18:11] <toad_> we unsubscribe if nobody is subscribed to the stream
[18:11] <toad_> including local subscriber clients
[18:12] <linyos> so people can still access recent messages that they missed because they rebooted or something?
[18:12] <toad_> so that if nodes/people disconnect, we can bring them up to date
[18:12] <linyos> ok.
[18:12] <toad_> right
[18:13] <linyos> why not insert each message as a normal file then?
[18:13] <toad_> hmm?
[18:13] <linyos> then they can stay around as long as people keep downloading them.
[18:13] <toad_> too big, for a start
[18:13] <toad_> well lets explore it
[18:13] <toad_> the practical issue is that for a lot of apps 32kB is ridiculously huge
[18:14] <toad_> but say we changed the block size to 4kB
[18:14] <toad_> we still couldn't do IRC with it... but suppose we don't care about IRC
[18:15] <linyos> or just make them a special case and insert them variable-length?
[18:15] <toad_> well, say we have 1kB SSK's
[18:15] <toad_> that is, the signature stuff, plus 1kB of data
[18:15] <toad_> kept in a separate store to CHKs
[18:15] <linyos> yeah.
[18:15] <toad_> then we have what amounts to the combination of passive requests and TUKs
[18:16] <toad_> i.e. SSKs are versioned, and you can do a passive request which does not expire when it returns data
[18:17] <toad_> passive requests are coalesced, to the extent that if a node sends a passive request, and it already has one, it doesn't need to forward it further
[18:19] <linyos> in principle you are doing two things in pub/sub: one, you are _notifying_ all the subscribers that another message is available. two, you are actually pushing it to them.
[18:20] <toad_> right
[18:20] <toad_> in freenet, we generally don't want to do the first without the second
[18:20] <toad_> as a matter of principle
[18:21] <linyos> fair enough. anyway, i think LRU is exactly what you want for this purpose, ie for keeping around recent messages for catching-up
[18:22] <toad_> okay
[18:22] <toad_> suppose we do that...
[18:22] <toad_> what about the actual subscriptions? the passive requests?
[18:24] <linyos> the publisher simply publishes each message as he does now, except that he can also concurrently insert it normally under the same key.
[18:24] <toad_> it's the same thing
[18:24] <linyos> if he so chooses in order that his subscribers can catch-up
[18:24] <toad_> the publisher inserts the message under a versioned SSK
[18:24] <toad_> okay, here's a problem...
[18:25] <toad_> oh
[18:25] <toad_> i see
[18:25] <linyos> sure, i'm just saying the two systems are logically separate
[18:25] <linyos> pub/sub and block cache
[18:25] <toad_> we CAN do LRU...
[18:25] <toad_> a request for "everything since revision 98713" will only promote blocks since that point
[18:28] <toad_> ok
[18:28] <toad_> brb
[18:28] <linyos> it is conceivable that some stream publishers would not want their messages cached for security reasons
[18:29] <linyos> and that others would have more efficient, application-specific ways of catching up.
[18:31] <toad_> ok, where was i?
[18:32] <toad_> we can eliminate the TUK scalability problem (which is "we don't want everyone going all the way on every TUK request") by not forwarding a TUK request if we are already subscribed to that TUK
[18:32] <toad_> because if we are, we already have the current version
[18:33] <toad_> well we might have a small probability of forwarding it in the name of housekeeping
[18:33] <toad_> we definitely do not want LRU on the actual subscriptions
[18:34] <toad_> on the actual subs, we'd have a maximum number of keys subscribed to per node, and we'd obviously stay subbed as long as at least one node is subbed to the key in question
[18:35] <toad_> now, how do we do the actual subscribe?
[18:35] <toad_> we send a regular request out for the key, and it is routed until HTL runs out
[18:35] <toad_> whether or not it finds the data, because of the flags set, it sets up a passive request chain
[18:36] <toad_> so far so good... now for the hard part - coalescing
[18:37] <toad_> if the request finds a node with an existing subscription to the key, with sufficient HTL on it, it stops on that node
[18:38] <toad_> if the request finds a node already running a similar request, it does nothing... it just continues the request
[18:38] <toad_> this is inefficient, and we should find some way to avoid sending the actual data blocks more than once
[18:38] <toad_> but it does prevent all the various nightmares w.r.t. loops
[18:45] <linyos> the thing is that it's got to scale like crazy. since people are going to have tons of streams all over the place.
[18:45] <toad_> right...
[18:46] <linyos> ideally it's just a matter of keeping a little record in your stream state table
[18:46] * toad_ is trying to write up a proposal... please read it when i've posted it...
[18:48] <toad_> well
[18:49] <toad_> the basic problem with scalability is that we don't want most requests to go right to the end node
[18:49] <toad_> right?
[18:49] <toad_> popular streams should be cached nearer to the source, and subscribed to nearer to the source
[18:50] <toad_> 2. If any request for a TUK, even if not passive, reaches a node which
[18:50] <toad_> already has a valid subscription at a higher or equal HTL for the key,
[18:50] <toad_> then the request ends at that point, and returns the data that node has.
[18:50] <toad_> If the passive-request flag is enabled, then passive request tags are
[18:50] <toad_> added up to that node, and that node adds one for the node connecting to
[18:50] <toad_> it if necessary.
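A rough sketch of rule 2 as seen from the node that already holds the subscription. The class and method names (Node, Subscription, handle_tuk_request) are invented for illustration and are not taken from any real code; the sketch only covers this one node's decision, not the tags added along the rest of the chain.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Subscription:
    htl: int                       # HTL the subscription was set up with
    latest_data: Optional[bytes]   # most recent version held, if any
    passive_tags: set = field(default_factory=set)  # peers to notify later


@dataclass
class Node:
    subscriptions: dict = field(default_factory=dict)  # key -> Subscription

    def handle_tuk_request(self, key, htl, from_peer, passive):
        """Return ("answered", data) if the request ends here,
        otherwise ("forward", None)."""
        sub = self.subscriptions.get(key)
        # The request ends here only if our subscription was made at a
        # higher or equal HTL than the incoming request carries.
        if sub is not None and sub.htl >= htl:
            if passive:
                # Remember the requester so later updates are pushed to it.
                sub.passive_tags.add(from_peer)
            return ("answered", sub.latest_data)
        return ("forward", None)
```

Note that "answered" may carry None if the node has dropped the latest data, which is exactly the open question raised next.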
[18:50] <toad_> what if the most recent data has been dropped?
[18:51] <linyos> yeah, that is another problem. i was thinking about scalability as regards the cost of maintaining idle streams.
[18:51] <toad_> should we have a small probability of sending the request onwards?
[18:51] <linyos> dealing with them once they start blasting tons of messages is also hard...
[18:51] <toad_> partly for the reason that the network may have reconfigured itself...
[18:52] <toad_> well
[18:52] <toad_> do we want to a) have an expiry date/time on each passive request, and force the clients to resubscribe every so often, or b) have the network occasionally resubscribe?
[18:53] <linyos> toad_: i'll have to study your mail before we're back on the same page.
[18:53] <toad_> ok
[18:53] <toad_> will send it soon
[18:54] <toad_> Subject: [Tech] Pub/sub = passive requests + TUKs
[18:54] <toad_> please read :)
[18:55] <toad_> the hard bit probably is what to do about looping and coalescing
[18:57] <toad_> linyos: you know what TUKs and passive requests are, right?
[18:57] <toad_> that email may not be very comprehensible otherwise
[18:59] <toad_> so the basic remaining problems are: coalescing, loop prevention, and expiry/renewal/resubscription (when something changes, or routinely)
[19:00] <toad_> linyos: ping
[19:01] <-- Romster has left this server. (Connection reset by peer)
[19:01] <-- Sugadude has left this server. (Remote closed the connection)
[19:01] <toad_> loop prevention is not a problem unless we have coalescing
[19:01] --> Romster has joined this channel. (n=Romster@tor/session/x-25dc686d1fb1a531)
[19:01] <toad_> if we do coalescing on requests as planned, then:
[19:02] <toad_> we can't run into our own tail, because our tail knows all our ID's
[19:02] <linyos> i know what a passive request is, but no idea about TUKs
[19:02] <toad_> on the other hand, that could be very restricting...
[19:02] <toad_> linyos: TUKs == updatable keys
[19:03] <linyos> ok, that's what i guessed.
[19:03] <toad_> an SSK with a version number or datestamp
[19:03] <toad_> you can fetch the most recent version
[19:03] <toad_> this is one of the most requested features
[19:03] <toad_> and it gels with pub/sub
[19:03] <linyos> when the publisher inserts a new version, how do we know that it actually reaches the passive request graph?
[19:04] <toad_> we don't
[19:04] <toad_> in the proposed topology
[19:04] <-- TheBishop_ has left this server. (Read error: 113 (No route to host))
[19:04] <toad_> but we don't know that for sure in classic pub/sub either, as there may be no subscribers, or there may be catastrophic network fragmentation
[19:05] <toad_> okay, if the network isn't completely degenerate, then running into our own tail won't be catastrophic
[19:05] <toad_> in fact, we could exploit it to give the graph more redundancy :)
[19:06] <toad_> if we run into our own tail, we give it a special kind of non-refcounted subscription, meaning essentially that if you get a packet, send us it, but don't let this prevent you from dropping the subscription
[19:06] <toad_> (since if it was refcounted, it would create a dependency loop)
[19:07] <toad_> (which would be BAD!)
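One way to picture the non-refcounted subscription is a subscription record with two sets of peers: ordinary dependants that keep the subscription alive, and "tail" links that receive packets but hold no reference. This is only a sketch under that assumption; StreamSubscription and its methods are hypothetical names.

```python
class StreamSubscription:
    """Tracks who receives packets for one stream on one node."""

    def __init__(self):
        self.refcounted = set()  # dependants that keep us subscribed
        self.weak = set()        # tail links: get packets, hold no reference

    def add(self, peer, refcounted=True):
        (self.refcounted if refcounted else self.weak).add(peer)

    def remove(self, peer):
        self.refcounted.discard(peer)
        self.weak.discard(peer)

    def recipients(self):
        # Every packet goes to both kinds of subscriber.
        return self.refcounted | self.weak

    def should_stay_subscribed(self):
        # Only refcounted dependants prevent us from dropping the stream,
        # so two nodes pointing at each other cannot keep each other alive.
        return bool(self.refcounted)
```

With this split, a node whose only remaining subscriber is a tail link is free to unsubscribe, which is what breaks the dependency loop.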
[19:07] --> Sugadude has joined this channel. (n=Sugadude@tor/session/x-fe3d50601157f088)
[19:07] <linyos> so essentially the idea is to cast this big net that catches inserts.
[19:07] <toad_> or requests
[19:07] <toad_> but yes
[19:08] <toad_> it's a conglomeration of request chains; they all point in the same direction, towards the key
[19:08] <toad_> so it should be fairly treeish, and it should be connected
[19:09] <toad_> and unlike structures with a root, there should be enough redundancy to prevent the obvious attacks
[19:10] <toad_> so in summary as regards coalescing... we do it exactly the same way we do it on ordinary requests; with CoalesceNotify messages to prevent us biting our own tail
[19:11] <toad_> somehow these need to be passed out to all the requestors
[19:11] <linyos> in a TUK, all the updates are routed to the same place?
[19:11] <toad_> but that's going to get done anyway
[19:11] <toad_> linyos: all the updates have the same routing key
[19:11] <linyos> how do the updates happen.
[19:11] <linyos> isn't that bad? not scalable?
[19:11] <toad_> linyos: hrrm?
[19:11] <toad_> what/
[19:11] <toad_> what not scalable?
[19:11] <toad_> all the updates having the same key not scalable? why?
[19:12] <linyos> i mean, suppose that somebody inserts updates at a huge rate
[19:12] <linyos> for some high-bandwidth application
[19:12] <toad_> as a flood?
[19:12] <linyos> and they all hammer one part of the keyspace
[19:12] <linyos> no, just because their application uses tons of data
[19:13] <toad_> well, they may hit flood defences
[19:13] <toad_> if they don't, and nobody reads their data, it will eventually drop out
[19:13] <toad_> what's the problem here?
[19:13] <linyos> i'm just talking about the bandwidth
[19:13] <toad_> well, if nobody is listening it will just take up the insert bandwidth
[19:13] <linyos> the nodes at the end of the request routing path would have to carry the full bandwidth of the stream
[19:14] <linyos> which could be enormous for some applications
[19:14] <toad_> well yeah, so don't code such applications :)
[19:14] <toad_> the links between the nodes won't bear it; most of them are limited to 20kB/sec or less
[19:14] <toad_> so he'll get RejectedOverload
[19:14] <toad_> for many of his inserts
[19:14] <toad_> which is fatal
[19:14] <linyos> right.
[19:15] <toad_> which is a subtle hint to SLOW THE FUDGE DOWN
[19:15] <linyos> but if the inserts went to different parts of the keyspace that would not be a problem.
[19:15] <toad_> well maybe
[19:15] <linyos> he might have 50 neighbor nodes in the darknet and they could handle the 100MB/s
[19:15] <toad_> but he'd still need a big link himself
[19:15] <toad_> meaning he's not very anonymous
[19:15] <linyos> but not if you aim them all down the same request path.
[19:15] <toad_> anyway he could just stripe across 10 streams
[19:15] <linyos> i guess he could split it up.
[19:16] <linyos> my point is only that aiming all the updates down the same request path creates another kind of scalability problem.
[19:16] <toad_> FEC encode it, then split it up into 18 streams where you can reconstruct it from any 10 :)
[19:16] <toad_> "it" being his illegal video stream :)
[19:17] <linyos> hmm, what if one of the nodes in the request path was really slow? that could break even modestly sized streams.
[19:17] <linyos> i do not like that...
[19:18] <toad_> that will break lots of things
[19:18] <toad_> e.g. requests
[19:18] <linyos> not really, since it never becomes a bottleneck.
[19:20] <toad_> you think?
[19:20] <toad_> anyway if you want to shove a LOT of data, you use the 1kB SSKs as redirects
[19:20] <toad_> to your real data which is in 32kB CHKs which are scattered across the network
[19:21] <linyos> i guess, though that does not help if a node in the chain is really overloaded and drops packets like crazy.
[19:22] <linyos> or is malicious, even.
[19:22] <toad_> which chain? the chain to the stream?
[19:22] <linyos> the insertion chain.
[19:22] <toad_> well if it's malicious, maybe we can detect it and kill it
[19:22] <toad_> the insertion chain for the SSK stream
[19:22] <linyos> all the inserts go through the same 10 nodes.
[19:22] <linyos> yeah.
[19:23] <toad_> so what you're saying is that it is vulnerable to selective dropping
[19:23] <toad_> if the cancer node happens to be on the path
[19:23] <toad_> well, so are ordinary inserts
[19:23] <toad_> it's more viable with multiple blocks on the same key, of course...
[19:24] <toad_> i don't see how TUKs, updatable keys or any sort of stream could work any other way though
[19:24] <toad_> you have to be able to route to them
[19:24] <linyos> my main worry is just that the insertion path will often happen to include some really slow dog of a node. and then you will not be able to stream much data at all.
[19:24] <toad_> so parallelize
[19:24] <toad_> if you really need to stream a lot of data
[19:24] <toad_> which usually you don't
[19:25] <linyos> audio/video links?
[19:25] <linyos> seem like a common thing to do.
[19:25] <toad_> video would definitely have to be parallelized
[19:25] <toad_> audio too probably
[19:25] --> Eol has joined this channel. (n=Eol@tor/session/x-94f512933bd62f63)
[19:25] <toad_> but we are talking multicast streams here
[19:25] <toad_> in 0.8 we will use i2p to do 1:1 streams
[19:26] <toad_> well hopefully
[19:26] <toad_> ian would say we will use tor to do 1:1 streams, because he knows the guy at tor better than he knows jrandom; i say vice versa; IMHO there are significant technical advantages to i2p, but we'll see
[19:27] <toad_> anyway
[19:27] <toad_> there simply is no other option; for anything like this to work, there HAS to be rendezvous at a key
[19:27] <toad_> that key may change from time to time
[19:28] <linyos> now that i think about it, i don't like the idea of pushing streams through chains of nodes in the first place, since you are limited by the weakest link. better to use the chain to signal "next message available", which requires negligible bandwidth, and then to insert and request each message through the network at large.
[19:28] <toad_> but it has to stay at one key for a while, unless you want major jumping around overhead/latency/etc
[19:28] <toad_> linyos: well in that case...
[19:28] <toad_> audio streams don't require a new-data-available indicator at all
[19:28] <toad_> unless they're doing half duplex
[19:28] <toad_> what does is things like RSS
[19:29] <toad_> frost messages
[19:29] <toad_> etc
[19:29] --> FallingBuzzard has joined this channel. (n=FallingB@c-24-12-230-255.hsd1.il.comcast.net)
[19:29] <toad_> also 1kB is intentionally small
[19:29] <toad_> much smaller and the signature starts to become a major overhead
[19:29] <toad_> so we're arguing over usage here
[19:29] <linyos> yeah, constant-message-rate applications would be best done through SSKs
[19:30] <toad_> well
[19:30] <toad_> we are talking about SSKs here
[19:30] <toad_> all we do:
[19:30] <toad_> we stuff the 1kB with CHKs
[19:30] <toad_> (URLs of CHKs)
[19:30] <toad_> we overlap them
[19:30] <toad_> so if you miss a packet, you pick them up in the next one
[19:30] <toad_> lets say we have a 96kbps stream
[19:31] <toad_> that's loads for voice and arguably enough for music
[19:31] <toad_> that's 12kB/sec
[19:31] <toad_> we divide it up into blocks of 128kB
[19:31] <toad_> each block goes into 6 CHKs of 32kB each (4 data + 2 check)
[19:31] <linyos> ooh, i have a big reason why you want to use streams for signalling and not data transmission.
[19:31] <toad_> a CHK URI is maybe 100 bytes
[19:31] <toad_> so we can put 10 of them in each 1kB block
[19:32] <linyos> signalling is so cheap, you can have lots of redundant paths.
[19:32] <linyos> and hence more reliability when a node falls off a cliff.
[19:32] <toad_> linyos: that's a nice one
[19:32] <toad_> well i think we can do them in.. hmmm, yeah, it's going to be maybe 67 bytes
[19:32] <toad_> so
[19:32] --> TheBishop_ has joined this channel. (n=bishop@port-212-202-175-197.dynamic.qsc.de)
[19:32] <toad_> 15 in a 1kB block
[19:33] <toad_> that's 2.5 groups
[19:33] <toad_> so we carry signaling, including redirects
[19:33] <toad_> in the SSK stream
[19:33] <toad_> and then fetch the actual data from CHKs, which are distributed
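The numbers in this exchange can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above (96kbps, 128kB blocks, 32kB CHKs with 4 data + 2 check, 1kB SSK payload, 100 or 67 bytes per CHK URI); the variable and function names are illustrative, and it ignores any control bytes in the SSK.

```python
# Stream rate: 96 kilobits/sec expressed in bytes/sec.
STREAM_KBPS = 96
bytes_per_sec = STREAM_KBPS * 1000 // 8    # 12000, i.e. about 12kB/sec

# Each 128kB block is split into 32kB CHKs, plus 2 check blocks.
BLOCK_KB = 128
CHK_KB = 32
data_chks = BLOCK_KB // CHK_KB             # 4 data CHKs
check_chks = 2
chks_per_group = data_chks + check_chks    # 6 CHKs per group

SSK_PAYLOAD_BYTES = 1024                   # the 1kB SSK data area


def uris_per_ssk(uri_bytes):
    """How many CHK URIs of the given size fit in one 1kB SSK."""
    return SSK_PAYLOAD_BYTES // uri_bytes


def groups_per_ssk(uri_bytes):
    """How many 6-CHK groups one SSK can describe."""
    return uris_per_ssk(uri_bytes) / chks_per_group


if __name__ == "__main__":
    print(bytes_per_sec, chks_per_group)
    print(uris_per_ssk(100), uris_per_ssk(67), groups_per_ssk(67))
```

Because one SSK can hold more than one group's worth of URIs, consecutive SSKs can overlap, which is what lets a subscriber recover from a missed packet.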
[19:34] <linyos> a CHK is 100 bytes???
[19:34] <toad_> frost messages will sometimes fit into a 1kB SSK, and sometimes will have a redirect
[19:34] <linyos> you only need 100 bits...
[19:34] <toad_> linyos: 32 bytes routing key, 32 bytes decryption key
[19:34] <toad_> maybe 3 bytes for everything else
[19:34] <toad_> 32 bytes = 256 bits
[19:34] <toad_> anything less would be grossly irresponsible IMHO
[19:36] <linyos> oh, that includes encryption.
[19:36] <toad_> yes
[19:36] <toad_> and yes you can cheat
[19:36] <linyos> really you want to do that on the client side.
[19:36] <toad_> but we haven't really formalized that into a client API yet :)
[19:36] <toad_> if you cheat, it can be a fixed decrypt key, and fixed extra bits
[19:36] <toad_> so 32 bytes
[19:37] <toad_> => you can fit 32 of them in 1024 bytes (just)
[19:37] <toad_> 31 in practice, you'd want SOME control bytes
[19:38] <linyos> that's fair enough. 32 bytes is a big hash, but who am i to argue hash-security.
[19:38] <toad_> well, SHA1 is looking very dodgy
[19:39] <toad_> okay
[19:39] <toad_> usage is important
[19:39] <toad_> but we also have to figure out how exactly it's going to work
[19:39] <toad_> two options for renewal/resubscription
[19:40] <toad_> one is we let the client do it periodically
[19:40] <toad_> the other is we do it ourselves
[19:40] <toad_> actually a third option would be not to do it at all
[19:40] <toad_> which may sound crazy initially, but if we have the client-node automatically switch to a new stream every so often...
[19:40] <toad_> ooooh
[19:41] <toad_> that could radically simplify things
[19:41] <toad_> ... or could it?
[19:41] <toad_> hmmm, maybe not
[19:41] <toad_> if a subscribed node loses its parent, it's still screwed
[19:41] <toad_> it has to resubscribe
[19:41] <toad_> it may have dependants who might send it the data
[19:42] <toad_> but in general, it is quite possible that that was a critical link...
[19:42] <toad_> we could mirror it across two or more streams, but that would suck... lots of bandwidth usage
[19:43] <toad_> (since there will be many streams)
[19:45] <linyos> tricky business indeed.

- Send SubscribeRestarted *only if upstream has sent us one*. Relay it to all dependants on receipt, and send one to new nodes when they connect, after Accepted.
- Use CoalesceNotify.
-- Send it when we coalesce two subscribe requests.
-- When we receive one, arrange to reject requests with the coalesced ID, and forward it backwards along the chain.
- Let through pending requests if we receive(d) a SubscribeRestarted with a RESTART_ID equal to their UID. Create a separate SubscribeSender for them, and a separate driver object.
- SubscriptionHandler.subscribeSucceeded should verify that the root is acceptable
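A minimal sketch of the CoalesceNotify rule above: on receipt, remember the coalesced ID so that later requests carrying it are rejected, then pass the notification backwards along the chain. ChainNode, on_coalesce_notify and accepts are hypothetical names, and "backwards" is assumed here to mean towards the node that sent us the subscribe request.

```python
class ChainNode:
    """One node on a subscribe request chain."""

    def __init__(self, name, predecessor=None):
        self.name = name
        self.predecessor = predecessor  # node that sent us the request
        self.rejected_ids = set()       # IDs we must now treat as loops

    def on_coalesce_notify(self, coalesced_id):
        # Arrange to reject requests with the coalesced ID...
        self.rejected_ids.add(coalesced_id)
        # ...and forward the notification backwards along the chain.
        if self.predecessor is not None:
            self.predecessor.on_coalesce_notify(coalesced_id)

    def accepts(self, request_id):
        return request_id not in self.rejected_ids
```

Only nodes on the original chain learn the ID, so dependants that merely joined the request are still free to route it.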




Handle FNPUnsubscribe's.





Implement SubscriptionHandler.handleResubscribeRequest.


What's the diff. between must beat location and nearest location on a resub request?? Is there any?
- On a sub request, we will not subscribe through a node unless the root is closer to the target than the nearest location.
- likewise on a resub. request
???

Resub. req. can come from:
- node which is dependant on us (relaying it)
- our parent (relaying or originating)
- node which is not subbed (as far as we know) (relaying or originating)

Success handling different in each case.



*****************************************************************************
We only forward a resub. request if our parent (or ultimate parent) has sent us a SubscribeRestarted, and the resub. req. has a similar ID
- Do we want to not have global handling of resub. req.s then? Perhaps it would be better to wait for them after we receive a SubscribeRestarted? It would certainly be simpler, and would get the properties we want...
-- how to implement this? create another thread to handle Resub Req's when we get a Restarted??
-- neatly solves the multiple simultaneous resub's problem too - there is only one happening at once.
-- maybe some sort of alternative callback interface with MessageFilter and USM... some object we can turn on and off in the main SubscribeSender loop, which will create a thread only if it receives a ResubReq???
*****************************************************************************



So:
Architecture:
Lose a connection, complete a swap request (with a nonlocal partner; time delay if could be local), get a request that indicates there may be a better root out there somewhere -> maybeRestart -> maximum of 1 resubscribe request every X time period (say 5 minutes) -> create a resubscribe request, and send it
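The "maximum of 1 resubscribe request every X time period" step could look roughly like this. RestartLimiter and maybe_restart are illustrative names only (the notes mention maybeRestart but give no signature), and the 300-second default is the "say 5 minutes" from the text.

```python
class RestartLimiter:
    """Allows at most one resubscribe request per interval."""

    def __init__(self, min_interval_secs=300.0):
        self.min_interval = min_interval_secs
        self.last_sent = None  # time of the last resubscribe we sent

    def maybe_restart(self, now):
        """Return True if a resubscribe request may be sent at time `now`
        (seconds); records the send time when it returns True."""
        if self.last_sent is not None and now - self.last_sent < self.min_interval:
            return False
        self.last_sent = now
        return True
```

Every trigger (lost connection, completed swap, hint of a better root) would call maybe_restart, and only the calls that return True create and send a resubscribe request.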


While in RESTARTING phase on a subscription, we can accept resubscribe requests. Get one -> handle it...




What happens if we are resubscribing while a subscription fails and we start a new ordinary subscription?
- SubscriptionHandler




ResubscribeRequest vs SubscribeRequest: Sender side:

We lose our upstream connection
Our upstream connection sends us a failure message
Other fatal error

->

We tell our dependants that we are resubscribing
We send a resubscribe request



Non-fatal errors:
- We see that there may be a better node to subscribe through somewhere else.
- A swap completes successfully.

->

Same thing. Only diff. is that we may still receive data from upstream. This is irrelevant as it goes down a different pathway and is in any case verifiable.



Receiver side:

We get a subscribe request -> if our root is compatible, we accept it. If we are not subscribed, we forward it. If we are already subscribing, we wait. Etc. In any case we send the client a SubscribeRestarting. (with an ID...)

We get a resubscribe request (which we are expecting) -> we forward it even if we are already subscribed.


The difference here is purely whether we were expecting the request.


So:
- If we receive a subscribe request:
-- If we are already subscribed, and our root is compatible, we accept, and send a SubscribeSucceeded
-- If we are restarting, we accept, and send a SubscribeRestarted. If the subscribe fails, this request may be the next one to try.
-- If we were not subscribing, we normally start subscribing, and forward the request.


- If we receive a valid SubscribeRestarting from our parent (in SubscribeSender), we check our current pending list for its ID. If a pending request matches the ID, we let it through in parallel. Likewise, if a request comes in and has ID equal to the restarting ID, we let that through immediately.








Which means architecturally:


- No distinction between SubscribeRequest and ResubscribeRequest. Really, there isn't; there may be coalesced clients behind a SubscribeRequest.
- We have only one parent node at a time.
- But sometimes we will be receiving data packets from more than one node.


Scenario:

We send a subscribe request out.
Several nodes ask us to subscribe; we tell them to wait for our sub req to complete.
We receive a SubscribeRestarted from our current prospective parent - the node which we are currently talking with.
The ID matches one of the waiting nodes' subscriptions, so we let that pass through. We now have two subscribe senders running; the first one is waiting in RESTARTING, and the second one is SEARCHING.
The second one also restarts.
We have to pass through a third subscribe request...

Etc.

Eventually the third subscribe request runs out of hops. A downstream node declares root. The third subscribe request goes from RESTARTING to SUCCEEDED. We pass this on to the node which was trying to subscribe through us.








A: 0.5
B: 0.6
C: 0.7

Subscribe request:

A -> B -> C -> A -> B -> C -> A -> B -> C

(No ID's because of coalescing)

None are initially subscribed.

This is degenerate; we need some form of loop protection.

We have it.
Once we know a request is a resub request (i.e. when we get the corresponding subscribe restarted message), we let it through with its original ID. Which is loop-protected. That is the point of resubscribe requests.
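A toy illustration of the difference the ID makes on the A -> B -> C ring. Without an ID the request keeps circling until it runs out of hops; with an ID, each node remembers it and rejects on the second visit. The route function and its parameters are made up for this sketch, and unlike the real scheme it simply stops at the rejection rather than trying another peer.

```python
def route(ring, start, request_id=None, max_hops=9):
    """Walk a request around `ring`, returning the list of hops taken."""
    seen = {node: set() for node in ring}  # IDs each node has handled
    path = []
    i = ring.index(start)
    for _ in range(max_hops):
        node = ring[i % len(ring)]
        if request_id is not None and request_id in seen[node]:
            # The node recognises its own earlier request: loop detected.
            path.append(node + " rejects: loop")
            break
        if request_id is not None:
            seen[node].add(request_id)
        path.append(node)
        i += 1
    return path


if __name__ == "__main__":
    print(route(["A", "B", "C"], "A"))                # no ID: degenerate
    print(route(["A", "B", "C"], "A", request_id=1))  # ID: loop caught
```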




So:

Searching for 0.796
1: A -> B
X, Y, Z -> A. A says wait, restarting, ID=1. => A's chain can route through X,Y,Z.
1: B -> C
1: C -> A
A rejects: loop
1: C -> X
1: X -> A
A rejects: loop
X becomes root
success: X -> C -> B -> A. root is X, loc 0.79
X goes down.
C restarts: C -> B -> A -> Y,Z: restarting, ID=2
2: C -> Y
2: Y -> A
A becomes root (0.8)
A -> Y -> C -> B -> A: success at 0.8
A is root; when it receives a success message with an identical value, it unsubscribes from B.
B subscribes through C through Y through A.





If we get a request:
- If we are successfully subscribed, we accept it and send success
- If we are waiting for an upstream restart, whose ID matches the request's, we start a subscribesender to route the request
- If we are waiting for a different upstream restart, we accept it and send restarting (i.e. wait)
- If we are not subscribing, we subscribe (we start a subscribe sender).
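
The four rules above can be sketched as a single dispatch function. This is a rough Python illustration only: the State enum, the on_request() name and the returned action strings are hypothetical, not names from the codebase.

```python
from enum import Enum


class State(Enum):
    NOT_SUBSCRIBING = 0
    RESTARTING = 1   # waiting for an upstream restart
    SUBSCRIBED = 2


def on_request(state, upstream_restart_id, request_id):
    """Return the action for an incoming request, following the four rules."""
    if state is State.SUBSCRIBED:
        return "accept, send success"
    if state is State.RESTARTING:
        if request_id == upstream_restart_id:
            # The request belongs to the restart we are waiting on: route it.
            return "start SubscribeSender to route it"
        # A different restart: make the requester wait.
        return "accept, send restarting"
    # Not subscribing at all: start subscribing ourselves.
    return "start SubscribeSender (subscribe)"
```

For example, on_request(State.RESTARTING, 0x12356, 0x12356) routes the request, while the same call with a different request ID tells the requester to wait.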




Therefore:

We have one parent. (or no parent).

We have one SubscribeHandler per incoming SubscribeRequest.

We have any number of SubscribeSenders.


We send a subscription request out. We find a node. That node is already restarting. So we subscribe to it, and relay the SubscribeRestarted - to us and all our clients. That ID then routes through us. We relay it, since it matches our upstream restarting ID. That chain then finds another node which is also already restarting. That then routes back to us. And so on. And so on.

Coalescing causes these problems.

Can we avoid it?

Separate ResubscribeRequest.

Routed exactly as SubscribeRequest, but not coalesced with pending [Re]SubscribeRequests.

























SubscriptionHandler.resubscribe()






What happens when:
- upstream restarts??
- lose upstream? - when we lose the request originator, we cancel the resub
- resub succeeds -> we subscribe through the new node, unsub from the old, kill the current subscribe handler (it's not fatal to be temporarily subscribed to both)
- resub fails -> pass through to predecessor



All senders need to deal with the precursor node having been lost. Their handlers need to detect this.


SubscribeSender:
- Sometimes when the status changes, we will need to abort the resubscribe*



SubscribeSender: counter is misused. It must be incremented only when we actually receive a message.











How to implement resubscription?
- SubscriptionManager locks ID and then feeds to SubscriptionHandler
- SubscriptionHandler does most of the work
- We can have a resubscribe request going on while a subscribe request is occurring. Need to deal with all the possible contingencies here.
- What about multiple simultaneous resubscribe requests? Ordering issues? => error, right?



SubscriptionHandler: where do we unlock the ID?
- Where should we unlock it? If we succeed, then restart, do we reuse the existing restart ID?
- ID most directly belongs to SubscribeHandler... when we reject it, we can unlock...
- We need to have some means to detect ID leaks




Do we register a new ID when we create a new ID?


Look into >= / > in decisions on whether to become root.



SubscriptionHandler.subscribeRestarted()
- Relationship between our restarting and upstream restarting
- Don't we have to propagate upstream's restarting ID etc?
- Probably...
- But what about our own?
- Implies that we don't necessarily want RESTARTING to be the status we send when we are actually SEARCHING i.e. when we have sent a SubscribeRequest rather than a ResubscribeRequest
- But what else can we do? Two options:
-- 1. No difference between SubscribeRequest and ResubscribeRequest. -- not sure this would help?? 
-- 2. Send a different status message.

Node connects.
We send restarting: ID = 0x23456
We connect to an upstream.
It sends restarting: ID = 0x12356
We relay this to our connected peers.
So they receive restarting: ID = 0x12356





SubscriptionHandler.becomeRoot()

SubscriptionHandler.maybeRestart()
Resubscribe*


Make sure clients are kept up to date (see SubscriptionCallback interface).



Various handler.*** calls before setStatus(*).
Fix this.




Does FNPSubscribeRestarted require a sequence number? Surely yes?





Search phase:
Run out of HTL -> RNF?!

Do we ever relay an RNF in the search phase?
Yes, if we can't find anything closer.

Surely not.


SubscribeHandler.setStatus().
- If we return RNF status:
-- If we are closer to the target than the best-seen-before-arriving, success
-- If not, RNF (or DNF/similar?)

Do we want to overload the RNF message to also carry terminal failures?



Do we need SubscribeSucceededNewRoot? If so, need to handle it in SubscribeSender.







RejectedOverload doesn't have an ORDERED_MESSAGE_NUM!

All messages involved in subscription or restart must have an ORDERED_MESSAGE_NUM.

So we need:
FNPSubscribeRejectedLoop
FNPSubscribeRejectedOverload
FNPSubscribeRouteNotFound

Or do we?

Probably...

When we first send the request, we can get:
- SubscribeSucceeded (must have a counter)
- SubscribeRestarted (must have a counter)
- RejectedLoop (doesn't need a counter as terminates our contact with the node)
- RejectedOverload (likewise)

Do we need a subscription ID on any of these messages?
- Yes, if we reuse the ID'd messages, they must have an ID
- This can be the restart ID in the case of restarting, and the subscribe ID when searching.

What about in SUCCEEDED phase?
- Ideally yes for reasons of style and consistency



SubscribeSucceeded includes the root node's location.

SubscribeHandler.subscribeSucceeded(...).

SubscribeSender.runPhase2Restarting().




If we are subscribed:
- If we get a SubscribeRequest with nearest-seen-so-far further away from target than our root, we accept it.
- If we get a SubscribeRequest with nearest-seen-so-far closer to the target than our root, we reject it with an RNF.

If we are restarting:
- Either way, we accept and queue.
- When we succeed, we process each one individually.
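
A rough Python sketch of the accept/reject/queue decision above. Assumptions, none of which are settled in these notes: locations live on a circular keyspace [0, 1), ties count as "accept" (the >= / > question is still open), and the function names are hypothetical.

```python
def distance(a, b):
    """Circular distance between two locations in [0, 1) (an assumption)."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def decide(target, root_loc, nearest_seen, restarting, queue):
    """Decide what to do with an incoming SubscribeRequest."""
    if restarting:
        # Either way, accept and queue; processed individually on success.
        queue.append(nearest_seen)
        return "queued"
    if distance(nearest_seen, target) < distance(root_loc, target):
        # The request has already seen a node closer than our root.
        return "RNF"
    # Our root is at least as close as anything the request has seen.
    return "accepted"
```

With target 0.796 and our root at 0.8, a request whose nearest-seen location is 0.7 is accepted, and one that has already seen 0.797 gets an RNF.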


SubscriptionHandler.statusChange(...).





If we become root, and then we get a request with a higher HTL, we probably should still forward it...???

Or if we become subscribed at low HTL..?



SubscribeSender.

SubscribeHandler.
- Needs to send initial status
- Needs to send status updates later on too.


Should RNF include nearest location?
- Need input from Ian.

Splitfiles:
- Can use onion's FECFile.
- But we still have to wait until we have a whole segment before we can decode. Need to decode one segment while downloading the next.

Publish/subscribe:
- Implement rest of subscribe.
- Resubscribe (after restart, don't know seqnum), and multiple posters support.

- SubscriptionHandler: When a subscription request fails, run the next one on the queue.
- SubscriptionHandler: When a subscription request succeeds, send success to each subscriber node.
- Once we have subscribed, we no longer need the ID and HTL for each node, because ResubscribeRequest is sent at full HTL with its own new, random ID.


Changes:
- SubscriptionHandler:
-- We have a maximum of one SubscribeSender at a time. We do not have a SubscribeHandler, because SubscriptionHandler takes this function.
-- We have a queue of pending subscriptions. These are nodes who have asked to subscribe, with different IDs and HTLs, after the first one arrived. If we succeed, we can drop the queue; if we were already connected or if we reset the HTL, we don't have to queue in the first place. If a SubscribeSender fails, we try again with the next element on the queue.
-- Objects: SubscribeSender (analogous to RequestSender; has status, can be waited on, may have a callback for SubscriptionHandler), QueuedSubscribeRequest (a SubscribeRequest received from a node, with its ID, HTL, etc, is currently suspended - we have told the node we are restarting).
-- When one fails, we can send another SubscribeRestarted. We should probably impose a maximum time limit on the client end on these.
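
A minimal Python sketch of the queueing behaviour described in these changes, assuming at most one SubscribeSender at a time. The class and field names mirror the notes, but the method names (on_request, on_sender_finished) and the returned strings are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class QueuedSubscribeRequest:
    node: str
    request_id: int
    htl: int


class SubscriptionHandler:
    def __init__(self):
        self.current = None   # request the single SubscribeSender is running
        self.queue = deque()  # requests that arrived after the first one
        self.subscribed = False

    def on_request(self, req):
        if self.subscribed:
            return "succeeded"      # already connected: no need to queue
        if self.current is None:
            self.current = req
            return "start sender"
        self.queue.append(req)      # we have told this node we are restarting
        return "queued"

    def on_sender_finished(self, success):
        """Called when the SubscribeSender completes; returns the next request to run."""
        if success:
            self.subscribed = True
            self.current = None
            self.queue.clear()      # success: drop the queue
            return None
        # Failure: try again with the next element on the queue, if any.
        self.current = self.queue.popleft() if self.queue else None
        return self.current
```

The maximum time limit on the client end is not modelled here.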


SubscriptionHandler status:
- Our SubscribeSender.
SubscribeSender
- Are we subscribed (including root)?
If we are not restarting, we are subscribed. There is no failure mode.
- Are we root?
- Is upstream restarting, and if so, what is its restart ID?
- Are we restarting, and if so, what is our restart ID?



Race condition:
We connect.
We are restarting, so we send an FNPSubscribeRestarted.
We connect. We send an FNPSubscribeSucceeded.
They get reordered.
Oops!

Solution:
- After an FNPSubscribeRestarted, the FNPSubscribeSucceeded must have the same ID as the FNPSubscribeRestarted did.
- After a subscription, the FNPSubscribeRestarted or FNPSubscribeSucceeded, or FNPSubscribeSucceededNewRoot, must have the same ID as the subscribe request.

Conclusion:
- SubscribeSucceeded, SubscribeRestarted, SubscribeSucceededNewRoot need an ID.

Except that it shouldn't be the same ID as the resubscribe itself...

One ID for the resubscribe (the ignore-ID), and one for the success/failure (the completion-ID). ??


First we tell everyone we are restarting, with UID X.
Then we send a ResubscribeRequest out, with UID X.
This is routed normally, pretending they were not already subscribed.
When it succeeds, the tree topology can change a bit - the root for the node which was forwarding it becomes the node which it has just subscribed through, and likewise back along the chain. The rest of the tree is still off the restarting node. So it is grafted from the top of the tree down to somewhere below that...
- Is this inefficient? Will need to look into this later...





We definitely want a sequence number on SubscribeRestarted.
Sometimes we will go from watching somebody else's SR to SR'ing ourselves. We will then need to change the ID so that their SR doesn't go through our subnodes; if it does, we will ignore it.



When we first run the subscribe request, an RNF can be a fatal state, from which the SubscribeHandler must recover by starting a new SubscribeSender, probably with new ID, HTL, etc, or by becoming root. But an incoming RNF normally means move on to the next node. That's what we propose to do either way. We have a set of nodes we have routed to; we can try the next one we haven't. If THAT doesn't work, then we do fail, and let the parent decide whether to retry.




SubscribeSender:
- Must have callbacks to SubscriptionHandler for everything
- Specifically whenever we call setStatus()
- Then delete setStatus()'s callback, it's no longer necessary. Keep setStatus() though.









- Implement QueuedSubscribeRequest




Reordering artefacts:
- A subscribed node must have an accurate, up to date knowledge of the current status (subscribed or restarting) of its parent.
- But packets may be reordered in transit.
- Solution?

The obvious solution is to introduce an explicit order...

Another option is to have the data packets carry the current status.

A third would be to ack the status.


So:
- A sends a SubscribeRequest to B
- B sends Restarted to A
- B sends Succeeded to A
- Succeeded arrives before Restarted


Solution:
- A sends SubscribeRequest to B, including ID and initial seq#
- B sends Restarted to A (seq#+1)
- B sends Succeeded to A (seq#+2)
- Succeeded arrives before Restarted
- A ignores it until it has processed Restarted
- If a packet is completely missed out, then in all likelihood something is seriously wrong given the low-level retransmission code. We disconnect.
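
The seq# scheme above amounts to holding back any status message that arrives early. A small Python sketch, assuming messages are numbered consecutively from the seq# in the SubscribeRequest; the OrderedReceiver class is hypothetical, and the disconnect-on-missing-packet case is not modelled.

```python
class OrderedReceiver:
    """Processes status messages strictly in seq# order, holding early ones."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq + 1  # first reply is expected at seq#+1
        self.held = {}                   # early messages, keyed by seq#
        self.processed = []              # messages in the order A acts on them

    def receive(self, seq, message):
        self.held[seq] = message
        # Release every message that is now in sequence.
        while self.next_seq in self.held:
            self.processed.append(self.held.pop(self.next_seq))
            self.next_seq += 1


rx = OrderedReceiver(initial_seq=10)
rx.receive(12, "Succeeded")   # arrives first, but is held back
rx.receive(11, "Restarted")   # unblocks both
print(rx.processed)
```

A sees Restarted before Succeeded even though they arrived in the opposite order.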

Division of message processing:

SubscribeSender: (searches for and handles parent)
- Send SubscribeRequest with seq#
- Wait for response with right seq#
- If loop, RNF, timeout, etc, move to next node
- If SubscribeSucceeded, wait for next message (could wait forever)
- Wait for either next message in sequence, or SubscribeData, from parent.

SubscriptionHandler:
- SubscribeRequest -> 
-- Track outgoing seq#
-- If we are already subscribed, send a SubscribeSucceeded
-- If we are restarting, send a SubscribeRestarting
-- If we successfully subscribe, send a SubscribeSucceeded
-- If we restart, send a SubscribeRestarting



SubscribeHandler.


Node sends a subscribe request.
Dispatched to SubscriptionHandler.
SubscriptionHandler creates a SubscribeHandler.



SubscribeRequest, SubscribeRestarted, SubscribeSucceeded have a seq#









- Implement SubscribeSender.
-- It will finish(), which will call a callback on SubscriptionHandler, which will decide whether to create a new one.
-- Do we need to have a thread running continually tracking a live subscription, after it has succeeded?











- Implement SubscriptionHandler.statusChange(...).









Hmmm.
The queued requests may have more stringent requirements for their link into the tree.

No, they don't. They are subscribe requests, NOT resubscribe requests. Subscribe requests will succeed if we connect to root, or become root, even if they were closer. This is to avoid local optima causing problems, and to reduce overhead. It may need to be reviewed in future.


Should we have a must-beat parameter on ordinary SubscribeRequests?

Probably not a problem - if we do end up with a loop, there will be a restarted from upstream, and then it can send a ResubscribeRequest. The closest-seen-so-far parameter is just for HTL purposes on a SubscribeRequest. No it isn't. If we can't find somebody closer, we fail the request; the CLOSEST node gets to be root, not the LAST node.

So:
- If we get a SubscribeSucceededNewRoot, it will include the location of the new root. So we know which of our requests have succeeded, and which to fail.
- If we get a SubscribeSucceeded, 












What is the role of HTL in SubscribeRequests?
- If we cannot find a node to subscribe to which is closer to the key than we are within HTL, we are the root.
- Therefore, we must decrement the HTL as normal.
- If we are subscribed, and we receive a request with a higher HTL than we have so far seen, we may need to pass it on???


How about every time we find a closer node, we reset the HTL? In which case, the new HTL scheme is rather strange...

Wait a minute.

If we have an HTL of 1, that means we can backtrack once before we run out of HTL.

If we think we might be the root, because of routing a request, we fire off another request at max HTL????



So:
We route a SubscribeRequest.
It finds a few nodes which are closer to K than the original. The closest is N. It was passed on from N at HTL 5.
N gets a DataNotFound - there are no closer nodes than N within 5 backtrack hops.
N then sends the SubscribeRequest again at full HTL (10).
If it still gets a DNF, N is the root. If it gets an FNPSubscribeSucceededNewRoot, the new root is downstream.

It may well be possible to DoS this; the whole pubsub scheme has a few weaknesses. In particular you can take root rather easily. But with an ordinary request you can just return DNF quite easily.


When we create a SubscriptionHandler, set status to restarting.
Then immediately start searching, with the HTL provided.


Another corner case:
We have 1 link.
The SubscribeRequest gets routed to us.
We don't have anywhere to send it to.
Does this make us the root? No, it must be an RNF.
The difference being? DNF == you can find nodes but not close enough nodes. RNF == you can't find ANY nodes.


So:
- We get a SubscribeRequest
- We create a SubscriptionHandler, because we don't have one already. The status is set to RESTARTING, so if another request comes in, it increments. We immediately add the node which sent the SubscribeRequest to the list of node-subscribers, so that the SubscriptionHandler is not removed. We create a thread, which checks HTL, forwards the request etc. When it completes, it unsets restarting. We track the best-found-so-far values from our subscribers (we have to in order to forward the subscribe request).

-- What if we have multiple SubscribeRequests going through us, coalesced, with some having different best-seen-node-location values? What if we cannot find a better node to subscribe to? Can we subscribe through the nodes which sent us the requests?

What is the alternative?

We can route the first subscribe request more or less normally.
But if it fails, and we are not root (he is upstream), and we already have a second subscriber, we are in trouble. Can we just fail them? Hopefully they will get routed to the right place? What if we are the gateway, and we routed back out to the wrong side? Do we need some sort of reversal protocol?
-- I suppose we should just fail them
-- What if the first one had lower HTL than the second?




Every time we get closer, we reset HTL to max.
When we don't, we decrement the HTL
If htl=0, we return DNF
If, on a subscription, you receive a DNF and you are the closest node, you are the root.
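
A short Python sketch of this reset-on-progress HTL rule. The route() function and its hop-by-hop list of distances are hypothetical; the default max_htl=10 follows the "full HTL (10)" mentioned earlier in these notes, and the maximum is still an open question.

```python
def route(distances, max_htl=10):
    """Walk a request over hops at the given distances from the target.

    Returns (status, htl, best_distance).
    """
    htl = max_htl
    best = float("inf")
    for d in distances:
        if d < best:
            best = d
            htl = max_htl   # got closer: reset HTL to max
        else:
            htl -= 1        # no progress: decrement
        if htl == 0:
            return ("DNF", htl, best)
    return ("in flight", htl, best)


print(route([0.3, 0.2, 0.25, 0.26, 0.27, 0.1], max_htl=3))
```

In this run the request gives up after three hops without progress, so the closer node at distance 0.1 is never reached; on a subscription, the node at distance 0.2 would receive the DNF and become root.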


Max HTL of 5??
Poss. separate max HTL for subscriptions?


Use this for requests as well

Can climb out of dead ends.



If a request comes in while we are running one, we queue it. If we succeed, we ditch the queue. If not, we run the second request.










How to implement?
- SubscriptionHandler.forceRestart()
- Need to keep upstreamRestarting up to date
- SubscriptionHandler has a "restarting" flag. This needs to be set while we are restarting.
- SubscribeHandler needs to be told when its parent is lost; set up a callback
- SubscribeSender
- SubscribeHandler
- SubscribeManager

