Discussion:
Microsoft.com PMTUD black hole?
Nathan Anderson/FSR
2008-05-06 19:07:05 UTC
Hello,

Has anyone else here seen problems with microsoft/msn/hotmail/live.com
sites not performing PMTUD correctly? We have, for a while now, had
people on our network complain of poor microsoft.com reachability, and
discovered we can work around the issue by changing MSS on all TCP SYN
as they go out of our network.

I recently watched the whole conversation between msn.com and a host on
our network (with the MSS rewrite disabled), and if I'm reading it
right, we are following PMTUD protocol correctly by sending back ICMP
type 3 code 4, but all Microsoft hosts seem to ignore this and continue
to send packets back to our host that are too large for the path MTU.

I hope I'm wrong and that it is we who are doing something stupid, but
after cruising Google for a while, I found a multitude of other
complaints from people connected to other ISPs specifically about not
being able to reach Microsoft web sites. It seems crazy that MS could
have PMTUD broken for so long with nobody ever raising a complaint to
them directly, though, which makes me wonder if there is another answer
here that I'm missing.

I sent the following message to a couple of addresses that I gleaned
from ARIN WHOIS for the IP block in question and threw hostmaster in
there just in case it went somewhere, but ***@microsoft.com appears to
be defunct. I have yet to receive acknowledgment of receipt from the
other address.

Are there any microsoft.com admins that hang out here that can comment
on this or get in touch with me, or is there perhaps someone on here
with connections to the Microsoft NOC?

(BTW, I stripped the referenced libpcap attachment off of this message
to the list just so that I wouldn't accidentally incur the wrath of
NANOG...if y'all want to see it, I'm happy to post it.)

Thanks,
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com

-------- Original Message --------
Subject: Microsoft/MSN/Live!/Hotmail behind blackhole router?
Date: Thu, 01 May 2008 19:00:46 -0700
From: Nathan Anderson/FSR <***@fsr.com>
To: ***@microsoft.com, ***@microsoft.com, ***@microsoft.com

To microsoft.com NOC admins:

I work for a regional ISP in the inland pacific northwest. Many of our
customers' connections have MTUs of less than 1500, and we get routine
complaints from them that they have trouble reaching web sites that are
under your administration.

Usually we can fix the problem by "mangling" the TCP SYNs originating
from our customers and headed to the world to reflect a lower MSS value;
however, we would rather not have to do that. The fact that we are
REQUIRED to do this in order for your sites to be reachable by our
customers strongly suggests that either the servers that respond to HTTP
requests sent to www.microsoft/msn/hotmail/live.com are behind routers
that are blocking ALL ICMP traffic sent their way -- even ICMP type 3
code 4 (packet too large, DF set), which is necessary in order for Path
MTU Discovery to work -- or the servers themselves are not listening to
the ICMP messages that we are sending their way when our routers are
forced to drop a packet sent by you which is too large to be forwarded
to a customer of ours.

I set up a test connection "on the bench" so to speak, and had our
router capture a copy of the conversation between our test client and
www.msnbc.msn.com and forward that conversation encapsulated in TZSP to
the same test client over a different interface. The capture clearly
shows our test client establishing the TCP connection with MSNBC
(SYN/SYN+ACK/ACK), and then goes on to show MSNBC send ethernet
MTU-sized packets our way that an intermediate router of ours drops and
responds with "packet too big, DF set." Despite this, MSNBC continues
to retransmit the original packet with the same payload and the same size
back to us. We continue to respond "packet too big, DF set," but the
MSNBC server never seems to get the message (literally).

We see the same behavior with all sites across the board contained
within the 207.46.0.0/16 space, regardless of actual hostname/FQDN.

We also find this ironic considering that Microsoft published a Technet
article a few years back on black hole routers and the problems they
pose, found at http://technet.microsoft.com/en-us/library/bb878081.aspx
(which we can't read/access unless we are mangling the MSS).

We would appreciate it if Microsoft NOC admins would please look into
the matter and take the appropriate corrective action: allowing ICMP
type 3 code 4 messages through your routers/firewalls, and making sure
that your servers respond to them appropriately as defined in RFC 1191.
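
For reference, the ICMP message in question is small and easy to recognize in a capture. What follows is only a minimal Python sketch, assuming the 8-byte ICMP header layout that RFC 1191 describes (type, code, checksum, 16 unused bits, 16-bit next-hop MTU). The function name `next_hop_mtu` is made up for illustration and is not part of any tool mentioned in this message.

```python
import struct
from typing import Optional


def next_hop_mtu(icmp_message: bytes) -> Optional[int]:
    """Return the next-hop MTU from an ICMP type 3 code 4 message.

    Returns None if the message is too short or is some other ICMP
    type/code. Routers that predate RFC 1191 leave the MTU field at
    zero, so a return value of 0 means "too big, size not stated".
    """
    if len(icmp_message) < 8:
        return None
    icmp_type, code, _checksum, _unused, mtu = struct.unpack_from(
        "!BBHHH", icmp_message
    )
    if icmp_type != 3 or code != 4:
        return None
    return mtu


# Hypothetical message: "fragmentation needed, DF set", next hop MTU 1492.
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1492)
print(next_hop_mtu(sample))  # 1492
```

A sender that honors PMTUD would lower its packet size for that destination to the reported value and retransmit.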

I have attached the capture we made of the conversation to this e-mail
message in libpcap format for your analysis. The test client itself had
a 1500 MTU to a desktop router, which in turn had an MTU of 1492 on its
uplink to us.
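
The arithmetic behind those two MTU figures can be sketched as follows. This assumes plain IPv4 and TCP headers of 20 bytes each with no options, which is an assumption on my part rather than something read from the capture; `mss_for_mtu` is a hypothetical helper name.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


client_mss = mss_for_mtu(1500)  # what the test client advertises
path_mss = mss_for_mtu(1492)    # what the uplink can actually carry
print(client_mss, path_mss, client_mss - path_mss)  # 1460 1452 8
```

Under those assumptions, a full-sized segment from the server is 8 bytes too large for the uplink, which is why the router in the middle has to drop it and send the ICMP message.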

I am available to answer any additional clarifying questions you may have.

Thank you for your time and attention to this matter.

Regards,
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Brandon Butterworth
2008-05-06 19:58:38 UTC
Post by Nathan Anderson/FSR
Has anyone else here seen problems with microsoft/msn/hotmail/live.com
sites not performing PMTUD correctly?
I used to see it a lot when hosting on windows was popular and people
realised they needed a firewall or decided to add a load balancer
but broke PMTUD by leaving it enabled on the servers.

I've not heard of it for some time so those people got
a clue or moved to something else (or everyone worked around them)

brandon
Iljitsch van Beijnum
2008-05-06 20:26:46 UTC
Post by Brandon Butterworth
Post by Nathan Anderson/FSR
Has anyone else here seen problems with microsoft/msn/hotmail/live.com
sites not performing PMTUD correctly?
I used to see it a lot when hosting on windows was popular and people
realised they needed a firewall or decided to add a load balancer
but broke PMTUD by leaving it enabled on the servers.
I've not heard of it for some time so those people got
a clue or moved to something else (or everyone worked around them)
Many years ago I had occasion to terminate dial-up service over L2TP
from modem pools operated by a service provider who shall remain
nameless to protect the guilty. This service had the unfortunate
tendency to drop all packets larger than 576 bytes. So we needed to
negotiate a 576-byte MTU over PPP.

We then got many complaints from users who dialed in using ISDN
routers (yes this was a while ago) because of broken path MTU
discovery. The behavior that Microsoft exhibits was EXTREMELY common
in those days, and I have no reason to assume it's any less common
today. (I also see it regularly with IPv6.) What I did was clear the
DF bit on packets going out to the L2TP virtual interfaces so the
packets could be fragmented.

A more common approach is to rewrite the MSS option in all TCP SYNs
with a smaller value so there won't be TCP segments large enough to
trigger the problem. AFAIK, all boxes that do PPPoE do this.

All of this even went so far that the IETF came up with RFC 4821,
which will do path MTU discovery by correlating lost packets with
packet sizes to determine the path MTU rather than depend on ICMP
messages.
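
The general idea of finding the path MTU from which probe sizes get through, without relying on ICMP, can be sketched roughly like this. Binary search is my own choice for illustration and is not claimed to be the strategy RFC 4821 prescribes; `probe_ok` is a stand-in for "send a probe of this size and see whether it is acknowledged", and the 576/1500 bounds are assumptions.

```python
from typing import Callable


def search_path_mtu(probe_ok: Callable[[int], bool], low: int, high: int) -> int:
    """Return the largest size in [low, high] for which probe_ok is True.

    Assumes probe_ok(low) is True and that success is monotonic:
    if a size gets through, every smaller size does too.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if probe_ok(mid):
            low = mid        # mid got through; the limit is at least mid
        else:
            high = mid - 1   # mid was lost; the limit is below mid
    return low


# Simulated path that silently drops anything larger than 1492 bytes.
path_limit = 1492
found = search_path_mtu(lambda size: size <= path_limit, low=576, high=1500)
print(found)  # 1492
```

In a real stack a lost probe is indistinguishable from ordinary congestion loss, so an implementation would have to retry and be far more conservative than this sketch.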
Nathan Anderson/FSR
2008-05-06 21:29:03 UTC
Post by Iljitsch van Beijnum
A more common approach is to rewrite the MSS option in all TCP SYNs
[snip]

Yeah, we do this now, but the software that we have been using for PPPoE
termination as well as for a huge portion of our clients (MikroTik
RouterOS) doesn't do it correctly in my estimation when you flip on the
automatic "change-tcp-mss" option...it rewrites the MSS in ALL SYNs
passing through it, either coming OR going. This has the effect of
breaking communication with other hosts that actually have a SMALLER MSS
than our PPPoE customers since our client will get a SYN+ACK from the
remote host that we have rewritten to reflect a larger MSS than the
remote host is capable of dealing with. Because MikroTik rewrote both
the SYNs generated by us as well as received by us, our customer's host
is now under the impression that the lowest MSS between the two hosts
matches its own.

At least that's the best theory I've come up with. We can write (and
have written) custom IP manglers on the MikroTik boxes that only touch
SYNs generated by our clients, and only when the MSS is larger than a
certain value (in order to honor MSSes even lower than that allowed by
their PPPoE gateway). But it's a PITA to deal with. I'd just rather
everyone follow protocol. :-P Although we can't always expect everyone
to do it by the book, I don't think it is too much to ask that those who
operate sizable networks that nearly everyone is required to interact
with on a daily basis (read: Microsoft) act responsibly.
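
The "only lower it, never raise it" rule described above can be sketched in Python as a walk over the TCP options of a SYN. This is a hedged illustration, not MikroTik's implementation: `clamp_mss_option` is a hypothetical name, the caller is assumed to pass only options from client-originated SYNs, and a real rewriter would also have to fix up the TCP checksum afterwards.

```python
import struct


def clamp_mss_option(options: bytes, clamp_to: int) -> bytes:
    """Lower the MSS option to clamp_to if it is larger; never raise it."""
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == 0:            # End of Option List
            break
        if kind == 1:            # No-Operation, a single padding byte
            i += 1
            continue
        if i + 1 >= len(out):
            break                # truncated option, leave untouched
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break                # malformed option, leave untouched
        if kind == 2 and length == 4:  # Maximum Segment Size
            (mss,) = struct.unpack_from("!H", out, i + 2)
            if mss > clamp_to:
                struct.pack_into("!H", out, i + 2, clamp_to)
        i += length
    return bytes(out)


# MSS 1460, then NOP, then a window scale option (kind 3, length 3).
syn_options = bytes([2, 4, 0x05, 0xB4, 1, 3, 3, 7])
print(clamp_mss_option(syn_options, 1452).hex())  # 020405ac01030307
```

A host that already advertises something smaller, such as 536, passes through unchanged, which is the behavior the automatic option apparently gets wrong.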
Post by Iljitsch van Beijnum
All of this even went so far that the IETF came up with RFC 4821,
which will do path MTU discovery by correlating lost packets with
packet sizes to determine the path MTU rather than depend on ICMP
messages.
What's funny is that I ran my tests from a Windows XP host with the
recently-released Service Pack 3 installed, which is supposed to
activate Microsoft's "PMTUD Black Hole Router Detection" by default
(available pre-SP3 but apparently not turned on without a registry
change). I haven't read up on exactly how it's supposed to work, but I
think the basic idea is that if the TCP connection is negotiated
properly but it doesn't get a response beyond that, it will try lower
and lower MSSes until it does.

However it works (or doesn't as the case may be), it didn't make a lick
of difference. I waited and waited for content to be delivered to me
until eventually Microsoft's end sent me a TCP RST.

While I was poking at this, though, I had a thought...most IP stacks I
believe keep a path MTU cache of some sort. I know Windows does: if I
send an ICMP packet with DF set that is larger than the PPPoE gateway
can handle, I get something similar to the following:

C:\Documents and Settings\nathana>ping 64.126.160.1 -f -l 1472

Pinging 64.126.160.1 with 1472 bytes of data:

Reply from 64.126.142.249: Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
[...]

Next time that I try the same thing, Windows doesn't even bother trying
to send the packet. It looks at its PMTU table for that IP, and already
KNOWS it is too big:

C:\Documents and Settings\nathana>ping 64.126.160.1 -f -l 1472

Pinging 64.126.160.1 with 1472 bytes of data:

Packet needs to be fragmented but DF set.
Packet needs to be fragmented but DF set.
[...]
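
A per-destination path MTU cache of the kind being described can be sketched in a few lines of Python. This is not a description of how Windows stores its entries; the class name `PmtuCache` and the 600-second lifetime are assumptions, the latter loosely based on the ten-minute figure suggested in RFC 1191.

```python
import time
from typing import Dict, Optional, Tuple


class PmtuCache:
    """Remember the smallest path MTU learned per destination, with aging."""

    def __init__(self, max_age: float = 600.0) -> None:
        self.max_age = max_age
        self._entries: Dict[str, Tuple[int, float]] = {}

    def learn(self, dst: str, mtu: int, now: Optional[float] = None) -> None:
        """Record an MTU reported by ICMP; only ever lowers a live entry."""
        now = time.monotonic() if now is None else now
        current = self.lookup(dst, now)
        if current is None or mtu < current:
            self._entries[dst] = (mtu, now)

    def lookup(self, dst: str, now: Optional[float] = None) -> Optional[int]:
        """Return the cached MTU, or None if unknown or expired."""
        now = time.monotonic() if now is None else now
        entry = self._entries.get(dst)
        if entry is None:
            return None
        mtu, learned_at = entry
        if now - learned_at > self.max_age:
            del self._entries[dst]  # aged out, so the path gets re-probed
            return None
        return mtu


cache = PmtuCache()
cache.learn("64.126.160.1", 1492, now=0.0)
print(cache.lookup("64.126.160.1", now=10.0))   # 1492
print(cache.lookup("64.126.160.1", now=700.0))  # None
```

The aging step matters because the path MTU can go up as well as down when routes change.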

However, even when trying this with www.msnbc.msn.com, and with the
MSNBC entry in its PMTU cache (and its IP set statically in my 'hosts'
file so that Akamai/MS round-robin DNS doesn't screw with me during the
test), when I tried to build a TCP connection to MSNBC from this same
host, Windows told the remote host it had a 1460 MSS.

Now, although that makes sense, in order to avoid issues like the one we
are facing with Microsoft, would it not make _more_ sense for the stack
to look at the PMTU cache first, and then adjust its own MSS just for
connections to that one host? Maybe even send out an MTU - 40 ICMP
packet to the host that we want to build a TCP connection with FIRST to
get an ICMP type 3 code 4 response from the router in-between with the
smaller MTU?

That would put the burden of PMTUD on the host requesting the TCP
session rather than on the one responding, but if hosts were "smarter"
like this it seems to me it might smooth out some of these issues. The
remote end could be "broken" with respect to PMTUD but it wouldn't matter.

Thoughts?
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Nathan Anderson/FSR
2008-05-06 21:32:48 UTC
Nathan Anderson/FSR wrote:

[...]
Post by Nathan Anderson/FSR
connections to that one host? Maybe even send out an MTU - 40 ICMP
:s/40/sized. Brain fart.
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Iljitsch van Beijnum
2008-05-07 05:29:41 UTC
Post by Nathan Anderson/FSR
Now, although that makes sense, in order to avoid issues like the one we
are facing with Microsoft, would it not make _more_ sense for the stack
to look at the PMTU cache first, and then adjust its own MSS just for
connections to that one host? Maybe even send out an MTU - 40 ICMP
packet to the host that we want to build a TCP connection with FIRST to
get an ICMP type 3 code 4 response from the router in-between with the
smaller MTU?
No. This would add significant delay because you'd have to give the
other side enough time to respond to the large packet (also sending a
large packet on something like GPRS/EDGE is a waste of bandwidth and
battery power) while if there is ICMP filtering, there won't be a
response, which is exactly the reason why we're in this bind in the
first place (along with the stupid idea that DF should be set for ALL
packets rather than just once in a while).

And adjusting the MSS based on ephemeral information is the wrong
thing to do in the first place. The path MTU can vary. Once you've
advertised a small MSS you can never increase it.

It is incredibly unprofessional that people enable PMTUD, then break
it and require the rest of the world to implement workarounds. Either
use PMTUD properly by accepting the ICMP messages or turn PMTUD off.
Nathan Anderson/FSR
2008-05-07 18:50:34 UTC
Post by Iljitsch van Beijnum
No. This would add significant delay because you'd have to give the
other side enough time to respond to the large packet (also sending a
large packet on something like GPRS/EDGE is a waste of bandwidth and
battery power) while if there is ICMP filtering, there won't be a
response, which is exactly the reason why we're in this bind in the
first place
I admit the idea needs tweaking (at best), and it was just a stray
thought :-), but 1) even if there is ICMP filtering happening way at the
other end, I (the TCP initiator) will still get a response from the
router in the middle (RITM) that is reducing the total path MTU if I try
to send a packet through it larger than the actual path MTU, and 2) if I
don't get a response to my single large packet (either from a RITM or
the other end) in a timely fashion (less than a second?), then the
client/initiator may just assume that path MTU == local MTU and will set
its MSS accordingly (which is no different than what is happening now),
until it has a reason to think differently.

Also, if there is already something in the local PMTU cache for a single
host address, I'm not sure I follow why it would be a bad idea for the
TCP initiator to consult that cache when preparing the SYN. Although,
on second thought, I suppose it is possible (and, in more than a few
cases, likely) that in instances of route path asymmetry, the PMTU of
the path from the initiator to the server may be different than the PMTU
of the path back from the server to the client. Hmmm.

Okay, scratch that idea then. :-P
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Nathan Anderson/FSR
2008-05-07 19:38:32 UTC
The usual case where you get screwed over is when the router trying to toss
the ICMP FRAG NEEDED is *behind* the ICMP-munching firewall. And in case (2),
you still can't assume that path MTU == local MTU, because your local MTU is
likely 1500, and the fragging router is often trying to stuff your 1500 byte
packet down a PPPoE tunnel that's got an MTU of 1492....
Yes, but my point was precisely that one OR the other side (server OR
client) is going to NOT have the ICMP-munching firewall in between
itself and the "RITM" as I have affectionately been calling it (although
it is definitely possible that there are two ICMP-munchers on either
side of the RITM).

And case #2 is exactly what is occurring right now _anyway_: hosts
assume that path MTU == local MTU even if there is already an active
PMTU cache entry from a recent earlier communication with the remote
host. So I don't see how making that assumption _after_ making an
honest attempt at actively determining whether or not it is actually the
case is any more broken than the way things are already being done.

The problem is that, as I realized at the end of the message you quoted,
there are potentially multiple paths between the same two hosts, and the
path that the packet takes in one direction is not guaranteed to be the
same path that the packet takes in the opposite direction.
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Tomas L. Byrnes
2008-05-07 19:43:49 UTC
I'm not sure what the issue is here.

Just about every modern firewall I've used has an option to enable PMTU
on interfaces, while blocking all other ICMP.

Is MS not running something manufactured in the last 10 years at their
perimeter?
_______________________________________________
NANOG mailing list
http://mailman.nanog.org/mailman/listinfo/nanog
Nathan Anderson/FSR
2008-05-07 21:00:27 UTC
Post by Tomas L. Byrnes
I'm not sure what the issue is here.
Just about every modern firewall I've used has an option to enable PMTU
on interfaces, while blocking all other ICMP.
Is MS not running something manufactured in the last 10 years at their
perimeter?
Not sure, but you actually entered in here to a subthread of the
original conversation, this one about other possible ways of dealing
with black hole "ICMP-munchers" in a pre-emptive fashion. I had a
brainstorm that I thought would be workable, which is what we were
discussing here. Apparently, it turns out my idea was no good. ;-)

The original discussion about MS blocking ICMP to their own servers,
which is the discussion it sounds like you are looking for, is over
that-a-way... *points*
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Matthew Petach
2008-05-12 16:19:06 UTC
Post by Tomas L. Byrnes
I'm not sure what the issue is here.
Just about every modern firewall I've used has an option to enable PMTU
on interfaces, while blocking all other ICMP.
Is MS not running something manufactured in the last 10 years at their
perimeter?
Unless things have changed drastically since we parted ways, it's a simple
ACL applied on all edge interfaces. It should be possible for them to modify
it to allow the list of ICMP subtypes listed at
http://www.cymru.com/Documents/icmp-messages.html

It would *certainly* make troubleshooting easier for the poor folks at
Microsoft, since one side effect of the edge filter being set that way
meant we couldn't traceroute outside the network; the port unreachable
messages never made it back, so everything outside the edge routers
was all just stars.
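
To make the difference between "block all ICMP" and "block most ICMP" concrete, here is a small Python sketch of the decision such a filter would make. The permitted set is my own illustrative guess at a minimum (fragmentation needed, port unreachable, time exceeded in transit) and is not the list from the Cymru page; `permit_icmp` is a hypothetical name.

```python
from typing import FrozenSet, Tuple

# (type, code) pairs allowed inbound; everything else is dropped.
PERMITTED: FrozenSet[Tuple[int, int]] = frozenset({
    (3, 4),   # destination unreachable: fragmentation needed and DF set (PMTUD)
    (3, 3),   # destination unreachable: port unreachable (end of a traceroute)
    (11, 0),  # time exceeded in transit (traceroute hops)
})


def permit_icmp(icmp_type: int, code: int) -> bool:
    """Return True if an inbound ICMP message should pass the edge filter."""
    return (icmp_type, code) in PERMITTED


print(permit_icmp(3, 4))  # True
print(permit_icmp(8, 0))  # False
```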

Of course, that was in a former lifetime, so it's entirely possible and
probable things have changed considerably since then. ^_^;;

Matt
(speaking only for myself, not for my current employer, and most
certainly not for my previous employer who I'm still somewhat bitter
at, not having gotten any of my hardware back yet...)
Iljitsch van Beijnum
2008-05-07 05:22:21 UTC
Post by Iljitsch van Beijnum
A more common approach is to rewrite the MSS option in all TCP SYNs
with a smaller value so there won't be TCP segments large enough to
trigger the problem. AFAIK, all boxes that do PPPoE do this.
Very few people out there use an MTU significantly below 1500 bytes. A
1500-byte MTU will give you an _average_ packet size of ~1000 on
long-lived TCP flows because there is one tiny ACK for every two full size
data segments.
Right. Why is that noteworthy?
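
The ~1000 figure quoted above is easy to reproduce. A quick Python sketch, assuming a bare 40-byte ACK (20 bytes IPv4 plus 20 bytes TCP, no options) and exactly one ACK per two full-size segments; the function name is hypothetical:

```python
def average_packet_size(mtu: int = 1500, ack_size: int = 40,
                        segments_per_ack: int = 2) -> float:
    """Mean size over both directions of a bulk TCP transfer."""
    total_bytes = segments_per_ack * mtu + ack_size
    total_packets = segments_per_ack + 1
    return total_bytes / total_packets


print(round(average_packet_size()))  # 1013
```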

I have a lot more to say about MTU issues in this draft about
negotiating MTUs between two hosts/routers on a subnet so jumboframes
can be deployed without manual configuration:

http://www.ietf.org/internet-drafts/draft-van-beijnum-multi-mtu-02.txt
It is generally desirable to avoid local fragmentation and to
choose EMTU_S low enough to avoid fragmentation in any gateway
along the path. In the absence of actual knowledge of the
minimum MTU along the path, the IP layer SHOULD use
EMTU_S <= 576 whenever the destination address is not on a
connected network, and otherwise use the connected network's
MTU.
Tell it to Microsoft and their ICMP-filtering friends...
Bjørn Mork
2008-05-07 08:10:35 UTC
Post by Iljitsch van Beijnum
Many years ago I had occasion to terminate dial-up service over L2TP
from modem pools operated by a service provider who shall remain
nameless to protect the guilty. This service had the unfortunate
tendency to drop all packets larger than 576 bytes. So we needed to
negotiate a 576-byte MTU over PPP.
We then got many complaints from users who dialed in using ISDN
routers (yes this was a while ago) because of broken path MTU
discovery. The behavior that Microsoft exhibits was EXTREMELY common
in those days, and I have no reason to assume it's any less common
today. (I also see it regularly with IPv6.) What I did was clear the
DF bit on packets going out to the L2TP virtual interfaces so the
packets could be fragmented.
Right. I once stumbled across a SOHO-router doing just that. I never
understood why, but now you've given at least one explanation how it
could appear to be a good idea.

I can also provide the reason why we found it to be an extremely bad
idea at the time: Some (most? all?) systems won't set both the DF flag
and the identification field at the same time. If you clear the DF flag
without changing the identification field, you might end up with
fragmented packets that are impossible to reassemble. Which was why I
stumbled across the DF-clearing SOHO-router in the first place. The
random problems it generated were extremely difficult to debug, and when
we started we truly believed that we had a problem with a layer 4 load
balancing switch.

Note: There are solutions that will both clear the DF flag and generate
a new id. E.g. http://www.openbsd.org/faq/pf/scrub.html

This is the proper way to clear DF, if you must. Never just clear it.
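
What "clear DF and generate a new id" means at the header level can be sketched in Python as follows. This is an illustration of the three fields involved and is not how pf implements it; the function names are hypothetical, and choosing a suitably unique `new_id` is left to the caller.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """Ones' complement sum over the header's 16-bit words."""
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def clear_df_with_new_id(header: bytes, new_id: int) -> bytes:
    """Clear the DF bit, install a fresh identification, fix the checksum."""
    hdr = bytearray(header)
    struct.pack_into("!H", hdr, 4, new_id & 0xFFFF)        # identification
    (flags_frag,) = struct.unpack_from("!H", hdr, 6)
    struct.pack_into("!H", hdr, 6, flags_frag & 0xBFFF)    # drop DF (0x4000)
    struct.pack_into("!H", hdr, 10, 0)                     # zero old checksum
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)


# Hypothetical 20-byte header: DF set, identification 0, TTL 64, protocol TCP.
original = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0x4000, 64, 6, 0,
                       bytes([192, 0, 2, 1]), bytes([198, 51, 100, 1]))
fixed = clear_df_with_new_id(original, 0x1234)
print(ipv4_checksum(fixed))  # 0 means the rewritten header verifies
```

Without the new identification, fragments of different packets from the same source can carry the same id and be reassembled incorrectly or not at all, which is the failure described above.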



Bjørn
Nathan Anderson/FSR
2008-05-06 20:57:28 UTC
Post by Brandon Butterworth
I used to see it a lot when hosting on windows was popular and people
realised they needed a firewall or decided to add a load balancer
but broke PMTUD by leaving it enabled on the servers.
Yeah, but this is Microsoft's OWN server farm we are talking about here,
not some small podunk IIS-based hosting provider.

...well, you may be right. I am probably giving MS too much credit here.

On another note, someone pointed out to me off-list that I apparently
tyop'd "hostmaster" when I sent the e-mail to MS. I have since re-sent
it to the properly-spelled address and again promptly received a "User
unknown" bounceback.
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Robert Bonomi
2008-05-06 22:53:58 UTC
Date: Tue, 06 May 2008 14:29:03 -0700
Subject: Re: [NANOG] Microsoft.com PMTUD black hole?
Now, although that makes sense, in order to avoid issues like the one we
are facing with Microsoft, would it not make _more_ sense for the stack
to look at the PMTU cache first, and then adjust its own MSS just for
connections to that one host?
This _is_ Microsoft we're talking about, remember. 'sense' and 'Microsoft'
are, at a =minimum= orthogonal to each other -- and may not even inhabit
the same address-space. <wry grin>

As for standards, it is official Microsoft policy to "embrace and extend",
not to implement in a way compatible with the rest of the world. *sigh*

I -don't- believe the rumor that "PMTUD/Vista Ultimate" sends incrementally
increasing-size packets, and uses the first one that -doesn't- get through
as the size limit. <giggle>
Marshall Eubanks
2008-05-07 01:06:18 UTC
Interestingly, Windows XP SP3, released today, describes changes in
PMTUD behavior.
http://download.microsoft.com/download/6/8/7/687484ed-8174-496d-8db9-f02b40c12982/Overview%20of%20Windows%20XP%20Service%20Pack%203.pdf
or

http://tinyurl.com/323xb

Regards
Nathan Anderson/FSR
2008-05-07 01:12:42 UTC
All,

A member of Microsoft's GNS network escalations team saw my postings on
NANOG about this issue and took offense at my use of this forum to raise
this issue with them, and criticized me as being unprofessional and
lacking in business acumen.

Therefore, I would like to publicly apologize for my actions here. It
was not my intention to "humiliate" Microsoft into compliance but rather
to find a means of effective contact with them since none was to be
found before today. However, I recognize that I did step over the line,
especially with regards to one comment I made in an earlier post about
"giving Microsoft too much credit." I apologize for this and retract
this, and ask their forgiveness.

As I promised, I will not be posting any more to this list regarding
this issue unless it is to report the final verdict that I receive from
my now-open ticket with Microsoft (thanks to this list, I found an
effective contact), or to discuss the mechanics of PMTUD in general.

Regards,
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Randy Bush
2008-05-07 05:44:06 UTC
Post by Nathan Anderson/FSR
A member of Microsoft's GNS network escalations team saw my postings on
NANOG about this issue and took offense at my use of this forum to raise
this issue with them, and criticized me as being unprofessional and
lacking in business acumen.
they try that intimidation every time a vulnerability or bug is
revealed. laugh and post their overly-aggressive message on a public
web site.

randy
Glen Turner
2008-05-07 07:12:45 UTC
Post by Nathan Anderson/FSR
A member of Microsoft's GNS network escalations team saw my postings on
NANOG about this issue and took offense at my use of this forum to raise
this issue with them, and criticized me as being unprofessional and
lacking in business acumen.
Hang on a tick. Aren't you one of their customers...<looking through
mail spool>...
Post by Nathan Anderson/FSR
As I pointed out in my post earlier today timestamped at 2:29PM, I was
using an XP SP3 host to perform my tests with...
...why yes, you are.

I can't think of any other supplier that would be so unprofessional and
so lacking in business acumen as to say that their customer was UALIBI.

Amazing. A fine case study of a person in customer contact undoing the
work of millions of dollars in PR. Whatever you say about Steve Ballmer
he's a great sales person at heart. He must despair at some of his staff.
--
Glen Turner
Mark Newton
2008-05-07 08:05:59 UTC
Post by Glen Turner
Amazing. A fine case study of a person in customer contact undoing the
work of millions of dollars in PR.
I wouldn't worry too much about it, Glen. My observation is that the
millions of dollars in PR isn't working very well either :-)

- mark


--
Mark Newton Email: ***@internode.com.au
(W)
Network Engineer Email:
***@atdot.dotat.org (H)
Internode Systems Pty Ltd Desk: +61-8-82282999
"Network Man" - Anagram of "Mark Newton" Mobile: +61-416-202-223
Patrick Giagnocavo
2008-05-07 13:49:15 UTC
Post by Glen Turner
Amazing. A fine case study of a person in customer contact undoing the
work of millions of dollars in PR. Whatever you say about Steve Ballmer
he's a great sales person at heart. He must despair at some of his staff.
The rest of us however, despair at having to support their crap.

Patrick Giagnocavo
***@zill.net
Rich Kulawiec
2008-05-07 13:45:07 UTC
Post by Nathan Anderson/FSR
A member of Microsoft's GNS network escalations team saw my postings on
NANOG about this issue and took offense at my use of this forum to raise
this issue with them, and criticized me as being unprofessional and
lacking in business acumen.
This is a typical Microsoft reaction: blame the messenger for their
own incompetence, laziness, stupidity, and greed. I think you should
post their asinine message so that it can receive the public ridicule
it surely deserves.

---Rsk
Nathan Anderson/FSR
2008-05-07 19:24:57 UTC
Here is a brief update on the situation:

I have been in contact with someone at Microsoft's service operations
center, who has confirmed for me that MS does in fact block _all_ ICMP
at the edge of their network, that they are aware that this will in fact
break PMTUD, and that they have no current plans to change this practice
which they have implemented in the interest of security.

Nevertheless, the person I have been in contact with is naturally not
the final decision-maker on this issue and is going to continue to pass
the issue on up the chain of command for me. So although this issue is
not over and I do not have a final verdict from MS yet, I felt that,
given that I don't know how much time to expect to pass between now and
when that final verdict is rendered, it would be appropriate to let
everybody here know what I have learned thus far. Hopefully public
dissemination of this information will prevent others in a
position similar to mine from having to helplessly beat their heads into
their keyboards.

I, naturally, voiced my strong objection over this security policy, and
attempted to make a reasoned argument with the contact I have over
there. We will see what comes of this.

Some have asked me to post copies of my private communication with my
Microsoft contact here. I don't think it is appropriate for me to post
copies of private communication without the other party's consent, so I
will have to decline unless he first gives me said consent.

Others have asked for valid contact information for the Microsoft NOC,
since the ARIN records for their 207.46.0.0/16 do not appear to be up to
date. I eventually found a working e-mail address from somebody
off-list who pointed to the WHOIS lookup from TUCOWS for
microsoft.comosoft.com (which I'm still not clear on what exactly this
is...). The e-mail address that was gleaned from this lookup was
***@microsoft.com, which goes to the Microsoft Corporate Domains
Team. They, in turn, forwarded my message on to
***@microsoft.com, which generated a ticket # for me and is, as I
understand it, the e-mail address I was looking for in the first place
(leads to their network/system people).

I hope this is helpful to others.

Regards,
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Michael Sinatra
2008-05-07 19:46:06 UTC
Permalink
Post by Nathan Anderson/FSR
I have been in contact with someone at Microsoft's service operations
center, who has confirmed for me that MS does in fact block _all_ ICMP
at the edge of their network, that they are aware that this will in fact
break PMTUD, and that they have no current plans to change this practice
which they have implemented in the interest of security.
Although the need for your previous apology has already been questioned
in this forum, the confirmation that they block not only certain ICMP
types, but all ICMP, further vacates the need for any apology for
criticizing this behavior in a public forum. It is disheartening for
those of us who use and support MSFT's products to learn that their
understanding of security lacks even the basic nuance to know not to
block an entire--critical--portion of the Internet Protocol. Perhaps
they should also block _all_ TCP and UDP as well, and then we can move on.

I agree with Iljitsch that it happens frequently, but I think I am
justified in expecting more than that from Microsoft. Anything less
would be unprofessional.

*Speaking for myself only, of course!*

michael
Iljitsch van Beijnum
2008-05-07 20:35:14 UTC
Permalink
Post by Michael Sinatra
Post by Nathan Anderson/FSR
MS does in fact block _all_ ICMP
at the edge of their network, that they are aware that this will in fact
break PMTUD, and that they have no current plans to change this practice
which they have implemented in the interest of security.
Perhaps
they should also block _all_ TCP and UDP as well, and then we can move on.
I agree with Iljitsch that it happens frequently, but I think I am
justified in expecting more than that from Microsoft. Anything less
would be unprofessional.
Right.

Now Microsoft is also the company that built the OS that could be
crashed by a maliciously crafted fragmented IP packet, so maybe
there's something to this security policy. (One hopes that this bug
and others like it are now fixed.)

However, in that case the only workable course of action would be TO
DISABLE PATH MTU DISCOVERY!

You can't have your cake and eat it too.
Tomas L. Byrnes
2008-05-07 20:42:15 UTC
Permalink
The remedy you have below is NOT the only one, and is, in fact, a
non-sequitur in this case.

PMTUD uses the DF (Don't Fragment) bit, and works by getting an ICMP
Fragmentation Needed response from the hop on the path where the packet
is too large, not a fragment-and-forward, so the intersection of PMTUD
packets and fragmented ones is empty.

The network-level solution to ping of death is to BLOCK fragmented
packets, and the way to ensure this doesn't self-deny-service is to
perform PMTUD and Black-Hole Router discovery.
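
To make the mechanism concrete, here is a minimal Python sketch of a
sender running PMTUD against a list of hop MTUs. The function names, the
hop list, and the retry limit are all made up for illustration; a real
stack does this inside TCP/IP, not in user code. The only thing it models
is the point argued in this thread: if the ICMP type 3 code 4 message
never reaches the sender, the sender keeps retransmitting at the same
size and nothing gets through.

```python
from typing import List, Optional, Tuple


def first_blocking_mtu(size: int, hop_mtus: List[int]) -> Optional[int]:
    """Return the MTU of the first hop that cannot forward a DF packet
    of `size` bytes, or None if the packet fits every hop."""
    for mtu in hop_mtus:
        if size > mtu:
            return mtu
    return None


def discover_path_mtu(
    local_mtu: int,
    hop_mtus: List[int],
    icmp_filtered: bool,
    max_tries: int = 8,  # hypothetical retry limit
) -> Tuple[Optional[int], int]:
    """Simulate a sender that always sets DF.

    Returns (working packet size or None, number of transmissions)."""
    size = local_mtu
    for attempt in range(1, max_tries + 1):
        blocking = first_blocking_mtu(size, hop_mtus)
        if blocking is None:
            return size, attempt
        if icmp_filtered:
            # The "fragmentation needed" message is dropped before it
            # reaches the sender, so the sender retries at the same size.
            continue
        # ICMP type 3 code 4 carries the next-hop MTU; shrink to fit.
        size = blocking
    return None, max_tries


if __name__ == "__main__":
    path = [1500, 1492, 1500]  # assumed example path with one PPPoE-sized hop
    print(discover_path_mtu(1500, path, icmp_filtered=False))
    print(discover_path_mtu(1500, path, icmp_filtered=True))
```

With ICMP delivered, the sender converges after one retry per smaller
hop. With ICMP filtered, it never converges, which is the black hole.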
_______________________________________________
NANOG mailing list
http://mailman.nanog.org/mailman/listinfo/nanog
Nathan Anderson/FSR
2008-05-07 21:08:22 UTC
Permalink
Post by Tomas L. Byrnes
The remedy you have below is NOT the only one, and is, in fact, a
non-sequitur in this case.
How so? Iljitsch is suggesting that ICMP blockers originate packets
without DF set if they are going to block the ICMP messages that PMTUD
needs in order to work in the first place. That's what (I think) he
means by "disabling path MTU discovery."
Post by Tomas L. Byrnes
The network-level solution to ping of death is to BLOCK fragmented
packets, and the way to ensure this doesn't self-deny-service is to
perform PMTUD and Black-Hole Router discovery.
Which end are you talking about here, the servers or the client? If the
servers, how do you expect them to do PMTUD if they _can't hear the ICMP
messages_?

Also, as I pointed out before, XP black hole router discovery doesn't
seem to be working for me, for whatever reason. Does
anybody have any clue why that might be the case?
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Iljitsch van Beijnum
2008-05-07 21:19:59 UTC
Permalink
Post by Nathan Anderson/FSR
How so? Iljitsch is suggesting that ICMP blockers originate packets
without DF set if they are going to block the ICMP messages that PMTUD
needs in order to work in the first place. That's what (I think) he
means by "disabling path MTU discovery."
Yes.
Post by Nathan Anderson/FSR
Also, as I pointed out before, XP black hole router discovery doesn't
seem to be working for me, for whatever reason. Does
anybody have any clue why that might be the case?
The problem is in the direction from M$ to you, so you can't fix that
from your end. I wonder if they've installed SP3 on their servers...
Nathan Anderson/FSR
2008-05-07 21:47:13 UTC
Permalink
Post by Iljitsch van Beijnum
The problem is in the direction from M$ to you, so you can't fix that
from your end. I wonder if they've installed SP3 on their servers...
Ah, you are right. I re-read the section on black-hole detection in
http://technet.microsoft.com/en-us/library/bb878081.aspx
more closely this time, and found that, yes, it only helps if the host
trying to send the large packets has the feature enabled:

"When PMTU black hole router detection is enabled, TCP tries to send
segments with the DF flag set to 0 after several retransmissions of a
segment are not acknowledged. If a segment with the DF flag set to 0 is
acknowledged, the MSS is decreased and the DF flag is set to 1 in
subsequent segments on the connection. Enabling PMTU black hole
detection increases the maximum number of retransmissions that are
performed for a given segment, and therefore has an effect on overall
performance."

I for some reason interpreted the advertisement of the black hole
detection feature as being a help to clients impacted by the inability
of the server to perform PMTUD.
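
The behavior in the quoted TechNet paragraph can be sketched roughly as
follows. This is a toy model in Python, not Microsoft's implementation:
the function and parameter names are hypothetical, the retry counts are
arbitrary, and the 536-byte fallback MSS is an assumption chosen for the
example. It also assumes that routers on the path will fragment a DF=0
packet rather than drop it.

```python
from dataclasses import dataclass


@dataclass
class Result:
    delivered: bool
    final_mss: int
    transmissions: int
    probed_without_df: bool


def segment_arrives(mss: int, df: bool, path_mss: int) -> bool:
    """A DF=0 segment can be fragmented in transit and still arrive.
    A DF=1 segment arrives only if it fits the narrowest hop."""
    return (not df) or mss <= path_mss


def send_with_black_hole_detection(
    mss: int,
    path_mss: int,
    retries_before_probe: int = 2,  # hypothetical threshold
    max_transmissions: int = 5,     # hypothetical limit
    fallback_mss: int = 536,        # assumed fallback value
) -> Result:
    """Sender-side logic: no ICMP is ever heard (black hole)."""
    df = True
    for n in range(1, max_transmissions + 1):
        if segment_arrives(mss, df, path_mss):
            if not df:
                # The DF=0 probe was acknowledged: shrink the MSS and
                # go back to DF=1 for subsequent segments.
                return Result(True, min(mss, fallback_mss), n, True)
            return Result(True, mss, n, False)
        if n >= retries_before_probe:
            # Several retransmissions went unacknowledged: clear DF.
            df = False
    return Result(False, mss, max_transmissions, False)


if __name__ == "__main__":
    print(send_with_black_hole_detection(mss=1460, path_mss=1452))
```

Note that all of this runs on the host sending the large segments, which
is why enabling it on the client does nothing when the oversized packets
come from the server.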
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Tomas L. Byrnes
2008-05-07 21:20:55 UTC
Permalink
I was responding to his post that blocking or disabling PMTUD was the
way to avoid the ping of death, which is False, nothing more, nothing
less.

As far as who Iljitsch is, everyone misspeaks from time to time. Even
those of us who have been at this for nearly 3 decades.
Iljitsch van Beijnum
2008-05-07 21:40:02 UTC
Permalink
Post by Tomas L. Byrnes
I was responding to his post that blocking or disabling PMTUD was the
way to avoid the ping of death, which is False, nothing more, nothing
less.
I never said that disabling PMTUD will get rid of the ping of death,
what I said was that if your system is susceptible to a ping of death
you may be tempted to filter ICMP but if you do that then you need to
disable PMTUD because PMTUD + ICMP filtering = breakage.
Post by Tomas L. Byrnes
As far as who Iljitsch is, everyone misspeaks from time to time. Even
those of us who have been at this for nearly 3 decades.
After making the jump to academia I often feel a bit long in the tooth
between all these students. But considering that (apparently) some
people have been posting flames on NANOG for 30 years makes me feel
young in comparison. :-)
Tomas L. Byrnes
2008-05-07 21:45:51 UTC
Permalink
Sorry if I misunderstood what you were saying. I thought you said that
they couldn't have their cake and eat it too, as in protect against Ping
of death, AND do PMTUD.

As for those flames, well, that was a long time ago, in a valley not too
far from here ;-).

Can we have the 'net before the endless September back?
Nathan Anderson/FSR
2008-05-07 21:50:12 UTC
Permalink
Post by Tomas L. Byrnes
As far as who Iljitsch is, everyone misspeaks from time to time. Even
those of us who have been at this for nearly 3 decades.
I was simply LOLing at the fact that you found it necessary to give him
a link to the NetHeaven article is all. ;-)
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Tomas L. Byrnes
2008-05-07 20:43:35 UTC
Permalink
Some Edumacation on the topic is here:

http://www.netheaven.com/pmtu.html
Nathan Anderson/FSR
2008-05-07 21:16:35 UTC
Permalink
You do know who it is that you are responding to, right? :)

http://www.oreillynet.com/pub/au/970
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Bjørn Mork
2008-05-08 07:00:19 UTC
Permalink
Post by Iljitsch van Beijnum
Now Microsoft is also the company that built the OS that could be
crashed by a maliciously crafted fragmented IP packet, so maybe
there's something to this security policy. (One hopes that this bug
and others like it are now fixed.)
Although the fact that Microsoft block all icmp makes me wonder which
unfixed icmp related security holes they know about...

I am not saying that there are any such holes in current Windows
versions, but I will certainly not use a Windows server in an
environment where I could receive icmp after learning that Microsoft
themselves don't trust Windows' icmp handling.

After all, Microsoft must have a reason to block all icmp. Or?
Post by Iljitsch van Beijnum
However, in that case the only workable course of action would be TO
DISABLE PATH MTU DISCOVERY!
You can't have your cake and eat it too.
But maybe the death of icmp is worth some sort of ceremony? Cake or
not.

Bjørn
Joel Jaeggli
2008-05-08 07:53:23 UTC
Permalink
<snip>
Post by Bjørn Mork
After all, Microsoft must have a reason to block all icmp. Or?
Post by Iljitsch van Beijnum
However, in that case the only workable course of action would be TO
DISABLE PATH MTU DISCOVERY!
You can't have your cake and eat it too.
But maybe the death of icmp is worth some sort of ceremony? Cake or
not.
Oddly enough, a draft on the subject of ICMP filtering
recommendations is making the rounds.

http://tools.ietf.org/wg/opsec/draft-gont-opsec-icmp-filtering-00.txt

The opsec working group (***@ietf.org) and the authors would
appreciate feedback from operators on the subject.

thanks
joelja
Iljitsch van Beijnum
2008-05-08 09:24:17 UTC
Permalink
Post by Joel Jaeggli
Oddly enough, a draft on the subject of ICMP filtering
recommendations is making the rounds.
http://tools.ietf.org/wg/opsec/draft-gont-opsec-icmp-filtering-00.txt
appreciate feedback from operators on the subject.
Speaking as someone who isn't interested in reading an explanation of
what happens when the message is filtered for every ICMP message known
to man, I find this a completely useless document: I can't find the
recommendations. Either they're there but impossible to find by
looking at the table of contents or searching for "recommend", or
they're not there in which case the title is EXTREMELY misleading.

Also:

2.1.1.5.4. Operational/interoperability impact if blocked Filtering
this error message breaks the Path-MTU Discovery mechansim described
in [RFC1191].

This is completely insufficient because it doesn't mention that 99% of
all TCP traffic on today's internet uses PMTUD and filtering these
messages leads to broken connectivity towards destinations that have
an MTU lower than the source (lower than 1500 in practice).
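
For anyone following along, the arithmetic behind "lower than 1500 in
practice" is short enough to write down. The sketch below is Python with
made-up helper names; it assumes IPv4 with a 20-byte IP header and a
20-byte TCP header with no options. The clamp function mirrors the
MSS-rewrite workaround mentioned at the start of this thread, in the
simplest possible form.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def depends_on_pmtud(sender_mtu: int, path_mtu: int) -> bool:
    """True when full-sized segments from the sender will not fit the
    path, so the sender must learn the smaller size somehow."""
    return mss_for_mtu(sender_mtu) > mss_for_mtu(path_mtu)


def clamp_mss(advertised_mss: int, egress_mtu: int) -> int:
    """Rewrite the MSS option in a SYN so the far end never sends
    segments larger than the egress link can carry."""
    return min(advertised_mss, mss_for_mtu(egress_mtu))


if __name__ == "__main__":
    print(mss_for_mtu(1500))             # Ethernet-sized MTU
    print(mss_for_mtu(1492))             # typical PPPoE MTU
    print(depends_on_pmtud(1500, 1492))
    print(clamp_mss(1460, 1492))
```

A server on a 1500-byte link talking to a client behind a 1492-byte hop
sends 1460-byte segments unless something tells it to use 1452, and that
something is either a working ICMP path or a clamped MSS in the SYN.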

Please spell check. Also, five levels of numbering is considered bad style.
Smith, Donald
2008-05-08 17:19:41 UTC
Permalink
A few comments on your comments below.


RM=for(1)
{manage_risk(identify_risk(product[i++]) &&
(identify_threat[product[i++]))}
-----Original Message-----
On Behalf Of Iljitsch van Beijnum
Sent: Thursday, May 08, 2008 3:24 AM
To: Joel Jaeggli
Subject: Re: [OPSEC] [NANOG] Microsoft.com PMTUD black hole?
Post by Joel Jaeggli
Oddly enough, a draft on the subject of ICMP filtering
recommendations is making the rounds.
http://tools.ietf.org/wg/opsec/draft-gont-opsec-icmp-filtering-00.txt
Post by Joel Jaeggli
appreciate feedback from operators on the subject.
Speaking as someone who isn't interested in reading an
explanation of
what happens when the message is filtered for every ICMP
message known
to man, I find this a completely useless document: I can't find the
recommendations. Either they're there but impossible to find by
looking at the table of contents or searching for "recommend", or
they're not there in which case the title is EXTREMELY misleading.
I believe a table of what to filter where was recommended.
I hope that table includes filtering and ratelimiting from, through, and
to.

However, blindly accepting recommendations without understanding the
possible ramifications such filtering can have on your network is not
wise.
2.1.1.5.4. Operational/interoperability impact if blocked Filtering
this error message breaks the Path-MTU Discovery mechansim described
in [RFC1191].
This is completely insufficient because it doesn't mention
that 99% of
all TCP traffic on today's internet uses PMTUD and filtering these
messages leads to broken connectivity towards destinations that have
an MTU lower than the source (lower than 1500 in practice).
I suspect your statistics. I don't believe the number is anywhere near
99% but haven't seen a study that would support any actual % numbers of
traffic that relies on PMTUD. If you're aware of such a study/research I
would be interested in reviewing the results.

Again, filtering THROUGH a device is probably not advisable; filtering TO
your device might be advisable.
Please spell check and five levels of numbering is considered
bad style.
_______________________________________________
OPSEC mailing list
https://www.ietf.org/mailman/listinfo/opsec
Hank Nussbacher
2008-05-08 21:10:12 UTC
Permalink
Post by Michael Sinatra
Post by Nathan Anderson/FSR
I have been in contact with someone at Microsoft's service operations
center, who has confirmed for me that MS does in fact block _all_ ICMP
at the edge of their network, that they are aware that this will in fact
break PMTUD, and that they have no current plans to change this practice
which they have implemented in the interest of security.
Although the need for your previous apology has already been questioned
in this forum, the confirmation that they block not only certain ICMP
types, but all ICMP, further vacates the need for any apology for
criticizing this behavior in a public forum. It is disheartening for
those of us who use and support MSFT's products to learn that their
understanding of security lacks even the basic nuance to know not to
block an entire--critical--portion of the Internet Protocol. Perhaps
they should also block _all_ TCP and UDP as well, and then we can move on.
I agree with Iljitsch that it happens frequently, but I think I am
justified in expecting more than that from Microsoft. Anything less
would be unprofessional.
I wonder if MS knows about:
ICMP Packet Filtering v1.2 from 2003:
http://www.cymru.com/Documents/icmp-messages.html
Only been around 5 years or so. Hopefully MS people reading this email
will take note, read the entire page and implement what everyone else has
been doing for a number of years.
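
As a rough illustration of "filter selectively instead of blocking
everything", here is a small Python sketch of a per-type ICMP decision.
The policy shown is a hypothetical example written for this thread, not
the list from the page above and not anyone's production ACL. The type
and code numbers themselves are the standard ICMPv4 ones.

```python
from enum import Enum

ECHO_REPLY = 0
DEST_UNREACHABLE = 3
ECHO_REQUEST = 8
TIME_EXCEEDED = 11
FRAG_NEEDED_CODE = 4  # "fragmentation needed and DF set"


class Action(Enum):
    PERMIT = "permit"
    RATE_LIMIT = "rate-limit"
    DROP = "drop"


def classify_icmp(icmp_type: int, code: int) -> Action:
    """Hypothetical edge policy: keep the error messages that TCP
    depends on, rate-limit ping, and drop everything else."""
    if icmp_type == DEST_UNREACHABLE and code == FRAG_NEEDED_CODE:
        # Required for PMTUD; dropping this creates the black hole.
        return Action.PERMIT
    if icmp_type in (DEST_UNREACHABLE, TIME_EXCEEDED):
        return Action.PERMIT
    if icmp_type in (ECHO_REQUEST, ECHO_REPLY):
        return Action.RATE_LIMIT
    return Action.DROP


if __name__ == "__main__":
    for t, c in [(3, 4), (11, 0), (8, 0), (5, 1)]:
        print(t, c, classify_icmp(t, c).value)
```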

-Hank
Deepak Jain
2008-05-07 22:07:06 UTC
Permalink
Post by Nathan Anderson/FSR
Nevertheless, the person I have been in contact with is naturally not
the final decision-maker on this issue and is going to continue to pass
the issue on up the chain of command for me. So although this issue is
not over and I do not have a final verdict from MS yet, I felt that,
given that I don't know how much time to expect to pass between now and
when that final verdict is rendered, it would be appropriate to let
everybody here know what I have learned thus far. Hopefully public
dissemination of this information will prevent others in a
position similar to mine from having to helplessly beat their heads into
their keyboards.
Let's also not ignore the generally overworked IT administrator at any
small or medium sized enterprise. He/she may be (as many folks I've
run into are) under the mistaken impression that ICMP *is* bad and leaves
you vulnerable to all sorts of things like SMURF. There are even tools
out there that "test" your vulnerability by "pinging" you and do other
investigations.

I know of a tool that a major financial institution uses when certifying
your network's security -- that scrapes the version number from your
ESMTP banner to decide whether you comply or not (and other banners).
(Rather than actually testing for a specific vulnerability). Simply
blocking all of these packets from their test host gives you a high
passing score; possibly a perfect one. [Irony and humor aside...]

Many non-SP IT folks think they understand TCP, grudgingly accept UDP
for DNS from external sources and think everything else is bollocks.
Many *might* have a fit if they saw Microsoft accepting ICMPs because
that seems inconsistent with their knowledge of turn-the-knob network
security. To their view, their Linksys/Netgear/whathaveyou COTS
firewalls block everything too.

I don't think I'm exaggerating here.

Just a thought, not saying it's a good one or whose fault it is...

Deepak Jain
AiNET
SML
2008-05-07 22:18:51 UTC
Permalink
Post by Deepak Jain
Many non-SP IT folks think they understand TCP, grudgingly accept
UDP for DNS from external sources and think everything else is
bollocks. Many *might* have a fit if they saw Microsoft accepting
ICMPs because that seems inconsistent with their knowledge of turn-
the-knob network security. To their view, their Linksys/Netgear/
whathaveyou COTS firewalls block everything too.
I don't think I'm exaggerating here.
No, you are not. I have seen the same from "firewall engineers" at
large companies, people who, supposedly, have done "network security"
for years. Even after showing them numerous Web sites detailing
current best practices, especially Rob Thomas's fine site, these folks
would not change their practices.

Some days it is hard to not give in to the "I give up" feelings.
Tony Finch
2008-05-08 12:54:41 UTC
Permalink
Post by Deepak Jain
I know of a tool that a major financial institution uses when certifying
your network's security -- that scrapes the version number from your
ESMTP banner to decide whether you comply or not (and other banners).
(Rather than actually testing for a specific vulnerability). Simply
blocking all of these packets from their test host gives you a high
passing score; possibly a perfect one. [Irony and humor aside...]
Cisco PIX/ASA firewalls in SMTP fuxup mode are so incredibly broken.
Possibly the worst SMTP implementation ever.

Tony.
--
f.anthony.n.finch <***@dotat.at> http://dotat.at/
FISHER GERMAN BIGHT: VARIABLE 3, BUT EASTERLY 4 OR 5 IN SOUTH GERMAN BIGHT.
SLIGHT. FOG PATCHES. MODERATE OR GOOD, OCCASIONALLY VERY POOR.
Blaine Christian
2008-05-08 16:27:49 UTC
Permalink
First of all I would like to thank everyone for their support and concern.
We certainly have a lot of things to "fix" at Microsoft. In fact, I can
tell you that we have several brand new positions open (working on my team
and for teams near mine) and could use more hands at the tiller.

My apologies to the moderators for posting a help wanted but I figure since
folks are expressing concerns with Microsoft networking they should have the
opportunity to come over and help.

We have an INCREDIBLE amount of interesting work ahead of us. I can't
really speak to it directly but I can tell you that this is a really good
place to be if you think big. We have a very interesting playbook for the
next few years. Keep in mind that this is one of the few spots where
thinking big can actually result in action.

The current positions...

Principal Network Engineer 204091
Senior Network Engineer 185014
SR PM 220793
IT/Ops PM 2 220797
Network Engineer 3 226032
Group Manager, Core Engineering 227621
Senior Network Engineer 229347
BOSD Network Engineer 231413
BOSD Senior Network Engineer 231414

If you think you have what it takes go check out these positions at
http://www.microsoft.com/careers and apply (the numbers to the right are the
job codes). Microsoft is an incredible place to work if you truly enjoy
what you do.

To be honest I don't read as much NANOG as I did a number of years ago. I
had a couple friends point this thread out to me. I hope you are all doing
well.

Regards,

Blaine
Mark Smith
2008-05-17 00:50:06 UTC
Permalink
On Wed, 07 May 2008 12:24:57 -0700
<snip>
Post by Nathan Anderson/FSR
Team. They, in turn, forwarded my message on to
understand it, the e-mail address I was looking for in the first place
(leads to their network/system people).
Doesn't look like it unfortunately. I've just tried to use this email
address to advise them of some routing issues we're having with new
APNIC IP ranges (114/8 specifically), and I've just got:

--
This message was created automatically by mail delivery software.

A message that you sent could not be delivered to one or more of its
recipients. This is a permanent error. The following address(es) failed:

***@microsoft.com
SMTP error from remote mail server after RCPT TO:<***@microsoft.com>:
host mailb.microsoft.com [131.107.115.215]: 550 5.1.1 User unknown
--

Regards,
Mark.
--
"Sheep are slow and tasty, and therefore must remain constantly
alert."
- Bruce Schneier, "Beyond Fear"
Mark Smith
2008-05-17 01:12:05 UTC
Permalink
Hi,

On Sat, 17 May 2008 10:20:06 +0930
Post by Mark Smith
On Wed, 07 May 2008 12:24:57 -0700
<snip>
Post by Mark Smith
--
This message was created automatically by mail delivery software.
A message that you sent could not be delivered to one or more of its
host mailb.microsoft.com [131.107.115.215]: 550 5.1.1 User unknown
--
Somebody from Microsoft kindly contacted me off list, the email address
is ***@microsoft.com - note the removed "s".

Regards,
Mark.
--
"Sheep are slow and tasty, and therefore must remain constantly
alert."
- Bruce Schneier, "Beyond Fear"
Stephen Sprunk
2008-05-07 16:32:23 UTC
Permalink
Post by Nathan Anderson/FSR
A member of Microsoft's GNS network escalations team saw my
postings on NANOG about this issue and took offense at my use
of this forum to raise this issue with them, and criticized me as
being unprofessional and lacking in business acumen.
First, it's "unprofessional and lacking in business acumen" for someone to
criticize their customers to their face. As one manager taught me, "The
customer may not always be right, but they're never wrong."

Second, it's their own damn fault for not maintaining their contact
information properly in public databases. If the only option they leave you
is to post to NANOG, because they don't respond to (or even accept) direct
requests to the listed contacts, then that's what you have to do.

Many companies are guilty of the latter, and we all get the benefit of
seeing the state of their customer service for reference when making future
buying decisions. Very few are arrogant enough to do the former, though.

S

Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking
Nathan Anderson/FSR
2008-05-07 01:18:08 UTC
Permalink
Interestingly, Windows XP, Sp3, released today, describes changes in
PMTUD behavior.
As I pointed out in my post earlier today timestamped at 2:29PM, I was
using an XP SP3 host to perform my tests with, and it made no
difference. I also used BBR's DrTCP application to make sure that black
hole router detection was, in fact, enabled on my XP box before
commencing my packet captures.

I cannot explain why it made no difference, but at the same time I don't
know enough about how WinNT's black hole router detection works to begin
speculating at this point. I do plan on looking into it, however.
--
Nathan Anderson
First Step Internet, LLC
***@fsr.com
Tomas L. Byrnes
2008-05-07 00:51:44 UTC
Permalink
Interestingly, Windows XP, Sp3, released today, describes changes in
PMTUD behavior.

Black Hole Router detection is now on by default:

http://download.microsoft.com/download/6/8/7/687484ed-8174-496d-8db9-f02
b40c12982/Overview%20of%20Windows%20XP%20Service%20Pack%203.pdf
-----Original Message-----
Sent: Tuesday, May 06, 2008 3:54 PM
Subject: Re: [NANOG] Microsoft.com PMTUD black hole?
Date: Tue, 06 May 2008 14:29:03 -0700
Subject: Re: [NANOG] Microsoft.com PMTUD black hole?
Now, although that makes sense, in order to avoid issues
like the one
we are facing with Microsoft, would it not make _more_
sense for the
stack to look at the PMTU cache first, and then adjust its own MSS
just for connections to that one host?
This _is_ Microsoft we're talking about, remember. 'sense'
and 'Microsoft'
are, at a =minimum= orthogonal to each other -- and may not
even inhabit the same address-space. <wry grin>
As for standards, it is official Microsoft policy to "embrace
and extend",
not to implement in a way compatible with the rest of the
world. *sigh*
I -don't- believe the rumor that "PMTUD/Vista Ultimate" sends
incrementally increasing-size packets, and uses the first one
that -doesn't- get through
as the size limit. <giggle>
Michael Sinatra
2008-05-07 21:16:50 UTC
Permalink
Post by Michael Sinatra
I agree with Iljitsch that it happens frequently, but I think I am
justified in expecting more than that from Microsoft. Anything less
would be unprofessional.
And you would consider an organization that threatens someone who
complains publicly about its obvious incompetence "professional"?
Absolutely not. That was actually the point of my statement, although I
admit that it wasn't clear.
Many of Microsoft's people are highly professional, but the corporation,
as a whole, has been found to be large scale law breakers on two
continents and frequently incapable of even the most basic of technical
operations. I'm afraid that I don't see them as at all "professional". I
quit expecting any such behavior from them over a decade ago, probably
closer to two. And mentioning security and Microsoft is inviting bad
jokes and shudders.
* Speaking only for myself *
Agreed.

michael
Scott Weeks
2008-05-08 18:21:57 UTC
Permalink
---------- ***@blaines.net wrote: ---------------

First of all I would like to thank everyone for their support and concern.
We certainly have a lot of things to "fix" at Microsoft. In fact, I can
tell you that we have several brand new positions open (working on my team
and for teams near mine) and could use more hands at the tiller.

My apologies to the moderators for posting a help wanted but I figure since
folks are expressing concerns with Microsoft networking they should have the
opportunity to come over and help.
---------------------------------------------------

Not support. Concern about M$ on soooo many levels. Your apologies should go to the other 10,000 folks that had to endure your improper email:

http://www.nanog.org/listfaq.html#nocjobs




--------------------------------------------
BOSD Network Engineer 231413
BOSD Senior Network Engineer 231414
--------------------------------------------

heh, you mean BSOD engineer? ;-)

scott

----------------------------
Janet Sullivan
2008-05-08 21:35:03 UTC
Permalink
I thought I'd post a few constructive comments on this thread. (Full
disclosure: I am an ex-Microsoft employee. I do not speak for the
company, I'm just trying to help out the network community.)

1) Yes, Microsoft blocks ICMP for the most part, which will break Path
MTU Discovery. This is a known issue. If you run into it, it's most
likely because the servers you are trying to talk to in MS-land don't
have black hole router detection turned on.

2) Instead of trying to get all the various ACLs and firewalls in
Microsoft fixed to allow PMTUD, you are more likely to experience joy if
you can contact the server owners. Ask if they have black hole router
detection turned on, and if not, if they can do so.

3) So how do you get in contact with the server owners or MSN's
networking people? ***@microsoft.com is your best bet. That's the
email address monitored by the basic Tier 1 "Service Operations Center".
They cut tickets, follow scripts, and do very basic front line work.
They probably won't be able to fix the problem for you, but they CAN get
you in touch with the right people.

4) FINDING the right people can be a challenge, even internally.
Microsoft is a very big company, and it's far from centralized. Be
specific in what URLs and IPs you are having trouble with, and be
prepared to bounce around a bit. The people who run microsoft.com's
servers aren't the same group that does hotmail, etc. Have patience,
and try to get ticket numbers for tracking as much as possible.

5) Try to give a realistic estimate of how many users are being impacted
by the problem. Your problem will be triaged as it moves through
various groups, and yes, the response time may not be what you want.
Your problem is one fire among many, and there aren't enough firefighters.

6) Be nice. Seriously. People love to hate Microsoft, and sometimes
take it out on the poor overworked geeks who are trying to actually make
things better. Every vulnerability, BSOD, or Vista delay is not the
fault of the network or systems engineer you get in touch with. ;-)
Niels Bakker
2008-05-08 22:44:34 UTC
Permalink
Post by Janet Sullivan
1) Yes, Microsoft blocks ICMP for the most part, which will break Path
MTU Discovery. This is a known issue. If you run into it, it's most
likely because the servers you are trying to talk to in MS-land don't
have black hole router detection turned on.
I find it hilarious that one part of the company had to come up with a
hack to work around the inability of another part of the company to
understand how TCP/IP works.


-- Niels.
