The MPLS WG Archive

Check MPLS WG Consensus (on soft preemption)

  • From: Curtis Villamizar <curtis@fictitious.org>
  • Date: Sat, 05 Apr 2003 10:52:30 -0500
  • cc: curtis@fictitious.org, mpls@UU.NET


In message <3E8CB445.F8CCF8E7@alcatel.be>, Dimitri.Papadimitriou@alcatel.be writes:
> curtis,
> 
> see in-line...
> 
> > Dimitri,
> > 
> > Please note that at this point the primary discussion is about
> > whether this should become a WG document.  
> 
> well george asked for further comments (i should probably
> change the title of this e-mail in this case) nonetheless i 
> think that this topic is relevant to be further progressed 
> by the community

Just checking whether you thought the work was worthwhile.

> > I haven't seen anything to
> > indicate that it should not be, just a discussion of RRO vs Path-Err
> > feedback and followup discussion of details.
> 
> see in-line, because i believe some refinements might be
> considered in the next wg version...

Thanks.

> > Further comments on the details of the draft below inline.
> 
> [..]
> 
> > > curtis,
> > >
> > > thanks for the clarification, and trying to summarize
> > > the discussion (which was "how soft the preemption is"
> > > at the end) it seems we're focusing on the second case
> > > where consolidated feedback is expected, there might be
> > > thus some words to ponder within the actual version of
> > > the document (which tends to imply that timing is a
> > > critical issue) such as "This indicates to the HE of
> > > this LSP that it must be re-routed *as soon as possible*
> > > using a make before break." and "The preempting node MUST
> > > *immediately* send a Resv message with the 'Preemption
> > > pending' RRO flag set for each soft preempted TE LSP."
> > 
> > The RRO should be sent immediately, but the reroute should be slightly
> > less than immediate.  This is true in any case where packets are not
> > being pitched into the bit bucket in large numbers.  The reason is
> > that multiple ingress can all try to make reservations for the same
> > resources.  If some form of pacing is applied then some feedback from
> > the midpoints can influence further setups to go elsewhere.
> > 
> > If this is the case where reroute is somewhat less than immediate then
> > the number of LSPs soft-preempted by multiple congested links (the
> > prime example of this occurs in any overlapping rings topology) can be
> > substantially reduced if all LSR on the path know which ones are
> > already preempted.
> 
> well this is probably the turning point here, if the provided
> mechanism does delay - how long ? - and at which point in the
> process at generation of the Resv ? and/or only at the edge  
> for the make-before-break decision ? the latter goes in your
> direction, and the former as well but with slight refinement 
> "what immediately means here?" is it a global lsr or a per-lsp 
> decision process that we want? or equivalently do we make 
> dependent or independent decisions? 

RFC2702 gave us the term resilience.  The delay is configured using
the resilience timer, a feature that most implementations have had for
a very long time.  The resilience timer sometimes has other names like
setup pacing.  In any case it's an existing feature in most
implementations.
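
To make the pacing idea concrete, here is a minimal sketch in Python.
It is only an illustration, not taken from the draft or from any
implementation: the names PendingReroute and pace_reroutes and the 0.5
second interval are hypothetical.  It assumes the RSVP-TE convention
that a numerically lower priority value means a more preferred LSP.

```python
from dataclasses import dataclass


@dataclass
class PendingReroute:
    """Hypothetical record of an LSP waiting to be re-signaled."""
    lsp_name: str
    priority: int  # assumption: lower value = more preferred LSP


def pace_reroutes(pending, interval_s, start_s=0.0):
    """Schedule one setup attempt per pacing interval.

    Returns a list of (time_in_seconds, lsp_name) pairs with the most
    preferred LSPs first, so that feedback from midpoints can arrive
    between consecutive setups.
    """
    ordered = sorted(pending, key=lambda p: (p.priority, p.lsp_name))
    return [(start_s + i * interval_s, p.lsp_name)
            for i, p in enumerate(ordered)]


# Three LSPs need rerouting; the 0.5 s interval is an arbitrary example.
schedule = pace_reroutes(
    [PendingReroute("lsp-c", 5),
     PendingReroute("lsp-a", 2),
     PendingReroute("lsp-b", 5)],
    interval_s=0.5,
)
for when, name in schedule:
    print(f"t={when:.1f}s  signal {name}")
```

Setting the interval to zero gives the "as fast as possible" behaviour;
anything larger leaves room for RRO feedback between setups.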

> also if nodes wait for consolidating the information received 
> through signalling (a time sufficient for edge node to have
> the full rro consolidated feedback and smaller than the 
> soft-preemption elapsing timer) do they have to wait for
> an lsa update or not in addition to the (optional) operation  
> of updating the current running copy of the te lsdb with the
> per-lsp signalled info; 

The CSPF is often run on slightly inaccurate (as little as
milliseconds too old) information.  The set of preemptions is a solid
indication that it can't use specific links so it could be considered
prior to receiving IGP flooding of changes.  That's up to the
implementation.
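
As a rough illustration of that option, the Python sketch below prunes
the hops flagged "Preemption pending" in a received RRO before running
a simple CSPF, without waiting for the IGP to flood updated bandwidth.
Everything here is an assumption made for the example: the function
names cspf and preempted_links, the tuple layout of the RRO hops, and
the four-node topology.  A real implementation would work on its own TE
database structures.

```python
import heapq


def cspf(links, src, dst, bandwidth, excluded=frozenset()):
    """Shortest path by metric over links with enough unreserved bandwidth.

    links:    dict mapping directed (u, v) -> (metric, unreserved_bw)
    excluded: set of (u, v) links to prune before the path computation
    Returns the path as a list of nodes, or None if no path exists.
    """
    adj = {}
    for (u, v), (metric, avail) in links.items():
        if avail >= bandwidth and (u, v) not in excluded:
            adj.setdefault(u, []).append((v, metric))
    heap = [(0, src, [src])]
    visited = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt, metric in adj.get(node, []):
            if nxt not in visited:
                heapq.heappush(heap, (cost + metric, nxt, path + [nxt]))
    return None


def preempted_links(rro):
    """rro: list of (upstream, downstream, preemption_pending) hops
    (a hypothetical, simplified view of the RRO contents)."""
    return {(u, v) for u, v, pending in rro if pending}


# Stale TE database: every link still advertises 100 units unreserved.
links = {
    ("A", "B"): (1, 100), ("B", "C"): (1, 100),
    ("A", "D"): (2, 100), ("D", "C"): (2, 100),
}
# RRO for the existing LSP A-B-C: hop B-C has soft-preempted it.
rro = [("A", "B", False), ("B", "C", True)]

stale_path = cspf(links, "A", "C", 10)
new_path = cspf(links, "A", "C", 10, preempted_links(rro))
print(stale_path, new_path)
```

With only the stale database the computation picks A-B-C again; with
the flagged hop pruned it moves to A-D-C.  Hops whose flag is not set
stay available, which is where the new LSP can share bandwidth with the
old one under make-before-break.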

> thus the operation mode should probably distinguish between 
> consolidated feedback from signalling - by default of rro 
> usage and the feedback from routing (optional); the issue i 
> see here is that the bandwidth adjustment will be known after
> the update... what happens is probably that the working group 
> has to decide either we strictly focus on the signalling part 
> or do we enter into "ways in which this can work" imho we are 
> probably in the middle of the bridge here as it stands in the 
> current version of the i-d (we are opening the doors and i
> don't know how far we should go in consolidating each of them)

The draft strictly focuses on the signaling part as all RFCs do.
Anything which is an implementation specific optimization that falls
within the specified behaviour can be omitted.

> > > you have mentioned "The RRO with "Preemption pending"
> > > set can be sent in both the PATH and RESV to insure this."
> > > would you clarify what do you mean in the former one?
> > > i don't see this in the current i-d version (i see only
> > > Resv RRO mentioned w/o further details)
> > 
> > An ERO is sent in the path and an RRO is sent in both the PATH and
> > RESV, initially with only the ingress in the PATH RRO.  The midpoint
> > can update the RRO that it sends in either direction.
> >
> > I've discussed this with Mathew but you are correct that it is not in
> > the current draft.  I can't be sure it will be but this brings the
> > discussion on list.
> 
> other issue to look at, is that it is considered as a 
> trigger message so details concerning maintaining the 
> "preemption state pending" using refresh should also 
> be discussed in there (this in order to have clear 
> description)

Good point.  The draft should mention that this is a trigger message
when using refresh reduction.

> > > note also that i am not sure on how far we can go in
> > > soft-preemption of soft-preempted lsp, when you say:
> > > "allows further soft-preemptions to act on already
> > > soft-preempted LSPs." wouldn't we then propagate the
> > > problems? imho it might be wise (and i think this is
> > > what the current doc says in section 6) to limit it
> > > *by default* to external events -
> > 
> > Soft preemption essentially means that the resources at a node are
> > overbooked beyond the normal connection admission and an LSP has been
> > selected to be removed but, to be nice to the ingress, the removal
> > has been deferred for some non-zero time.
> 
> when i said "external events" is avoid that for soft-
> preemption reasons, other lsp's get themselves soft-
> preempted (so that cascading wouldn't be possible during 
> the process of make-before-break during this operation 
> itself, we may be just delaying the process and then the 
> soft preemption elapsing timer would simply drop the lsp) 
> was it allowed within current version of the i-d (i think 
> in section 6 it refers only to point of occurrence of the 
> event) in order to avoid it's timer should be reset then
> otherwise lower priority lsp will be penalized more than
> once, another way is to allow for more than one value of 
> this timer (per priority) well just some thoughts here in 
> order to progress.

Section 6 isn't very clearly written.  The above paragraph isn't very
clearly written either.  I couldn't make any sense of it.  It is
sufficiently unclear that you're going to have to try again.

> > Consider overlapping large rings which overlap at A-B-C-D.  If one
> > ring goes down consider reroutes from that ring that would go in the
> > A-B-C-D direction.  Either FRR or standby LSP may be in use (standby
> > has obvious advantages in this topology) on the preferred LSPs.  If so
> > rerouting the preferred LSP may occur gradually (less than immediate,
> > but not slowly).  Even if FRR or standby LSP are not in use there are
> > good arguments to try to set up LSPs exactly immediately.  At some
> > point a link on A-B-C-D will be full and one lower preference LSP will
> > be soft-preempted.  The ingress reroute of the less preferred LSP is
> > also less than immediate.  As additional more preferred LSPs are added
> > to A-B-C-D other links will become overloaded.  These can effectively
> > "credit" the already soft-preempted LSPs as gone, knowing that this
> > will minimize disruption.
> >
> > This is in effect an optimization of soft-preempt designed to minimize
> > disruption of the network.
> > 
> > In restoration doing some things "as fast as possible" is best but
> > doing everything "as fast as possible" isn't always the best approach.
> > With FRR or standby LSP on more preferred LSPs, cutover to the
> > presignaled backups as fast as possible is desirable but rerouting
> > the primaries with some pacing is desirable.  Without soft-preempt,
> > the less preferred LSPs have to be rerouted "as fast as possible"
> > because often less preferred LSPs aren't backed up by FRR or standby
> > LSPs and all traffic for these is pitched when hard preempted.  With
> > soft-preempt, and TCP dominated traffic, pacing of the reroute of the
> > less preferred LSPs is desirable.
> > 
> > This of course doesn't prevent an ISP from configuring their LSR to do
> > everything "as fast as possible" even in the above scenarios and some
> > will, might even be most of them, that I don't know.  For the rest,
> > opinions will vary regarding optimal values of "less than immediate".
> > I'd guess that opinions on the range on the optimal values of "less
> > than immediate" will vary from all LSP being rerouted in 100s of msec
> > (which differs from immediate but not by much) to a few 10s of
> > seconds.  My point was that the latter, being very much on the long
> > side given the stress placed on sub second reroute capability was
> > actually not a problem from a user perception for some types of
> > service (non-SLA, mostly TCP, on a less preferred LSP).
> > 
> > > thanks,
> > > - dimitri.
> > >
> > > Curtis Villamizar wrote:
> > > >
> > > > In message <3E89335B.E1691D6E@alcatel.be>,
> > > > Dimitri.Papadimitriou@alcatel.be writes:
> > > > > hi, to address the following comment exchange:
> > > > >
> > > > > -----
> > > > >
> > > > > > > > > The preference for the RRO flag is that like the protect-inuse,
> > > > > > > > > the ingress knows which hops it does not have resources on.
> > > > > > > > > Consider the path A-B-C-...Z. If hops D-E and G-H have preempted,
> > > > > > > > > but all of the hops are near 100% utilized, the ingress knows it
> > > > > > > > > can share bandwidth with its prior LSP on all hops for which the
> > > > > > > > > RRO flag bit is not set. It's harder to do that with a collection
> > > > > > > > > of path-err messages.
> > > > >
> > > > > > > > That's perfectly correct and one of the reasons why we ended up
> > > > > > > > with this scheme. Otherwise, the HE would have had to wait for some
> > > > > > > > unknown period of time (to make sure it has received all the PERR
> > > > > > > > from the set of preempting nodes) before triggering a new CSPF on
> > > > > > > > the modified topology
> > > > >
> > > > > > OK. But I don't see how using Resv lets you know when all of the
> > > > > > > preemption is complete. Since in your example preemption of D-E is
> > > > > > likely to happen first it will trigger a Resv reporting just one
> > > > > > hop as preempted. Later there will be another Resv that indicates
> > > > > > D-E and G-H as preempted. Sometime later there might be another
> > > > > > Resv indicating further preemption down near Z. The only advantage
> > > > > > seems to be that the Resv gives you a list of preemptions that have
> > > > > > happened (saving the HE from having to maintain that list itself).
> > > > > > It does not remove the "unknown period of time" issue.
> > > > >
> > > > > -----
> > > > >
> > > > > we may consider two modes, a fast one using PathErr messages (or
> > > > > > even Notify messages) to the sender (optimizing the time performance),
> > > > > and a trace mode using the RRO but with a prior notification to
> > > > > the receiver so that the complete trace is available through the
> > > > > RRO at the sender side before making a decision (optimizing the
> > > > > resource performance), i have got the impression that the current
> > > > > solution tries to optimize both at the same time but as mentioned
> > > > > by adrian it doesn't seem to be feasible
> > > > >
> > > > > thanks,
> > > > > - dimitri.
> > > >
> > > > Dimitri,
> > > >
> > > > The vast majority of traffic is IP and of that some 95% or more is
> > > > TCP.  In the last major ISP traffic sampling I've seen (available
> > > > through CAIDA - look around) there was enough relatively high speed
> > > > and long duration TCP flows to make traffic easily compressible by
> > > > 30-40% and possibly by 50% with very low loss.  If this were to occur,
> > > > customers with high speed access would notice a degradation in
> > > > performance of bulk transfers, but it would otherwise be nearly
> > > > imperceptible.  If the degradation were for a brief period, customers
> > > > with high speed access that were not making explicit measurements
> > > > would not notice either (or barely notice).
> > > >
> > > > This means that soft preemption can provide many seconds or even
> > > > 10s of seconds of "grace period" before hard preemption.  The
> > > > performance loss for "less preferred" IP traffic over temporarily
> > > > overloaded links for seconds or a few 10s of seconds would be
> > > > imperceptible unless doing bulk transfer and measuring the throughput.
> > > > Rerouting by multiple ingress need not be rushed to the point that
> > > > poor layout results (ie: the ingress reroutes can be paced such that
> > > > feedback from the midpoints is effective).
> > > >
> > > > The reroute can also be configured to go "as fast as possible" if that
> > > > is what the ISP would prefer.
> > > >
> > > > If a large number of LSPs is soft preempted, and preemption occurs at
> > > > multiple hops, then the RRO method consolidates the feedback and most
> > > > important, allows further soft-preemptions to act on already
> > > > soft-preempted LSPs.  The RRO with "Preemption pending" set can be
> > > > sent in both the PATH and RESV to insure this.
> > > >
> > > > The case where a large number of LSPs is soft preempted is likely to
> > > > be caused by a failure at some other link that causes higher
> > > > preference LSPs to be rerouted.  If these use either FRR or standby
> > > > LSPs, then the primary LSP need not be rerouted "as fast as possible"
> > > > and the effective result will be a pacing of LSP setups and therefore
> > > > of soft-preemptions.  Knowing which lower preference LSPs are already
> > > > soft-preempted is more important than fast notification of the
> > > > ingress.
> > > >
> > > > Curtis
> > > >
> > > > > George Swallow wrote:
> > > > > >
> > > > > > In San Francisco the workgroup showed support for making
> > > > > >
> > > > > >   MPLS Traffic Engineering Soft preemption
> > > > > >     draft-meyer-mpls-soft-preemption-00.txt
> > > > > >
> > > > > > > an MPLS WG Document.  This message is to solicit any further comments
> > > > > > > prior to making a final determination.
> > > > > >
> > > > > > Please reply by 4/7 24:00 GMT.
> > > > > >
> > > > > > ...George
> > >
> > > --
> > > Papadimitriou Dimitri
> > > E-mail : dimitri.papadimitriou@alcatel.be
> > > Private: http://www.rc.bel.alcatel.be/~papadimd/index.html
> > > E-mail : dpapadimitriou@psg.com
> > > Public : http://psg.com/~dpapadimitriou/
> > > Address: Fr. Wellesplein 1, B-2018 Antwerpen, Belgium
> > > Phone  : +32 3 240-8491
> > >