Network Policy Server discarded the request for a user, yet again
Another variant on the neverending "Network Policy Server discarded the request for a user" problems, but this one's a bit more tricky. Auth-type is MSCHAPv2 over PEAP from two clients, X and Y authenticating to NPS on Server 2019 with all updates applied. X authenticates successfully. Y, even though it's been configured to send exactly the same data as X, doesn't. In addition both X and Y authenticate successfully to FreeRADIUS, but only X authenticates to NPS.
Log Name: Security
Source: Microsoft-Windows-Security-Auditing
Date: 3/30/2023 5:42:14 PM
Event ID: 6274
Task Category: Network Policy Server
Level: Information
Keywords: Audit Failure
User: N/A
Computer: radius.bt.local
Description:
Network Policy Server discarded the request for a user.
Contact the Network Policy Server administrator for more information.
User:
Security ID: bt\test321
Account Name: test321
Account Domain: BT
Fully Qualified Account Name: BT\test321
Client Machine:
Security ID: NULL SID
Account Name: -
Fully Qualified Account Name: -
Called Station Identifier: -
Calling Station Identifier: -
NAS:
NAS IPv4 Address: XXXXXXXXXXXX
NAS IPv6 Address: -
NAS Port-Type: -
NAS Identifier: -
NAS Port: -
RADIUS Client:
Client Friendly Name: client
Client IP Address: XXXXXXXXXXXX
Authentication Details:
Connection Request Policy Name: CRP - process EAP requests locally
Network Policy Name: -
Authentication Provider: Windows
Authentication Server: radius.bt.local
Authentication Type: EAP
EAP Type: -
Account Session Identifier: -
Reason Code: 1
Reason: An internal error occurred. Check the system event log for additional information.
No logs on the DC provided anything useful, enabling further logging on the NPS server with "netsh ras set tracing * enabled" and looking in %windir%\tracing produced an IASSAM.LOG which is empty (well, a 2-byte BOM). IAS.LOG is empty, as is IASACCT.LOG, IASHLPR.LOG, IASNAP.LOG, IASRAD.LOG, IASRECST.LOG, IASSDO.LOG, and IASSVCS.LOG. Again, to emphasise the first point, client X and client Y are sending identical data, things like RADIUS and EAP TLVs, with identical entries in the server RADIUS logs apart from the datestamps, but after sending the MSCHAP response inside the tunnel client X gets back a success/failure and client Y gets no response. Both clients authenticate successfully to FreeRADIUS, so it's not a problem with Y's MSCHAP implementation, and client X authenticates successfully to NPS so it's not one of the usual problems leading to the discarded-request issue.
Windows Server 2019
-
Gary Nebbett 5,851 Reputation points
2023-04-21T07:27:11.7633333+00:00 Hello Dave,
Have you tried the command
netsh nps set tracing *=verbose
? This produces a .etl (Event Tracing for Windows) trace file (in %SystemRoot%\System32\LogFiles\NPS\IAS.etl, I think).The trace file contents are difficult to interpret.
Gary
-
Dave Gzorple 5 Reputation points
2023-04-21T07:45:15.48+00:00 I don't actually run the server so have to do things by proxy, but the admin has turned on tracing and sent me various .etl files but they seem to be intended for an entirely different purpose, they contain CPU type, memory used, DLLs loaded, and a bunch of other stuff that might be useful if I was trying to track down a segfault somewhere but I couldn't find anything in there that related to this problem.
-
Gary Nebbett 5,851 Reputation points
2023-04-21T07:55:57.2733333+00:00 Hello Dave,
Without knowing what the admin understands by "turn on tracing" and which .etl files are selected for you, it is difficult to say much. Below is an example of the type of information on internal IAS/NPS processing that might be available.
Gary
-
Dave Gzorple 5 Reputation points
2023-04-29T05:15:35.87+00:00 Ah, OK. It wasn't at that level of verbosity, I'll give it a go but if that's the output then I think diagnosing this particular problem from it will be as likely as figuring out why a web server is returning a 403 error by looking at strace output.
-
Gary Nebbett 5,851 Reputation points
2023-04-29T07:41:15.6433333+00:00 Hello Dave,
The .etl file contains binary data - it takes quite a bit of processing to get it into a readable format.
Once in an understandable format, it is a great help in analysing IAS/NPS problems. I can only remember one time when I failed to understand the root cause of a problem.
Gary
-
Dave Gzorple 5 Reputation points
2023-05-10T05:17:50.2866667+00:00 Thanks for that, your recollection was correct :-). So I've now got a 20MB ETL log, but despite seeing useful-looking info in there with a binary file viewer, a ton of strings starting "NPSSDO..." including things like "NPSSDONPS - authenticate EAP (PEAP)", I can't seem to do anything useful with it. Using netsh to try and convert it to a human-readable format:
netsh trace convert input=IAS.etl output=IAS.txt dump=txt report=yes
produces:
Generating data collection and report ... failed (error=0x8007007B)
Conversion failed (error=0x8007007b).
The filename, directory name, or volume label syntax is incorrect.
Trying tracerpt:
tracerpt IAS.etl
produces nothing useful as output (meaning that the text data that's present in the ETL isn't present in the output file), Perfview indicates it has 342,000 events but shows:
I've read assorted Perfview tutorials and watched some of the videos but they all just show windows populated with data, not a single line of NaN and nothing else.
Any ideas on how to get anything useful out of the ETL beyond using a hex viewer? The data's obviously present there, I just can't get anything to display it.
-
Gary Nebbett 5,851 Reputation points
2023-05-10T09:14:50.9166667+00:00 Hello Dave,
To easily interpret the trace data, one needs access to the symbol files for the deployed NPS DLLs that contain the TMF annotations; these annotations can be included in public symbol files but in the NPS case they are not (they are only included in the private symbol files, which Microsoft do not make available). I use some control and data flow analysis of the DLLs to generate "helpful" representations of the events.
From the limited amount of information about your problem, it is not clear that the existing trace data will be useful - it is a detailed trace of the RADIUS operation but not of the authentication process (you mention that an MSCHAP message inside an established TLS tunnel seems to trigger the problem - different trace providers are needed to analyse that activity). The one NPS problem that I did not solve was also a remote diagnosis of a PAP authentication problem (the RADIUS functionality otherwise worked perfectly).
That said, the existing trace data is still a good starting point. If BT domain name in the event log is a reference to the entity formerly known as British Telecom then there may still be people there who can vouch for my bona fides (I have been retired for approaching seven years now); you could share the trace data with me.
An alternative would be to perform a new trace on the NPS server using only manifest and MOF ETW trace providers (which can be viewed by readily available tools). I would recommend a trace combining:
- Microsoft-Windows-EapHost
- Microsoft-Windows-EapMethods-RasChap
- Microsoft-Windows-EapMethods-RasTls
- "NTLM Security Protocol"
Gary
-
Dave Gzorple 5 Reputation points
2023-05-20T06:57:32.2466667+00:00 The problem turned out to be a one-bit difference in a low-level internal field, sent inside the tunnel where you can't normally see it, that FreeRADIUS treats as a don't-care but NPS treats as a do-care. Setting it to the value that NPS wants, which FreeRADIUS ignores, fixes the problem.
-
HPCPK-4455 1 Reputation point
2023-05-22T07:54:53.55+00:00 Hello Dave,
could you provide some details about it? We have a similar issue and would appreciate any hint.
Best regards
Patrick
-
Gary Nebbett 5,851 Reputation points
2023-05-22T08:32:03.0933333+00:00 Hello Patrick,
I helped Dave get to the bottom of the problem, which was a protocol misunderstanding in a custom EAP authentication client.
I would be happy to help you too, if you are prepared to share details.
Gary
-
MrMJFisher 46 Reputation points
2023-07-05T23:16:04.5366667+00:00 Good afternoon Gary,
I'm having the same log error message. I have a set of Extreme Networks VDX 6740 running 7.4.1f. RADIUS is using 1812 and PEAP-MSCHAP for authentication. On the NPS I've deleted the password template, client, connection request, and network policy and rebuilt them by hand. Sending vender specific (ID 1588 & conforms) attribute 1, string, admin. Judging from the logs; it never appears to complete EAP to PEAP negotiation. From the packet capture it appears to at least begin the TLS handshake. Same NPS server is working for other requests like 802.1x (PEAP-MSCHAPv2 / PEAP-TLS), VPN (PEAP-TLS), and Ruckus ICX Authentication (PAP).
Extreme Networks VDX 6740 output:
show run aaa aaa authentication login radius local-auth-fallback aaa accounting exec default start-stop none aaa accounting commands default start-stop none aaa authorization command none show running-config radius-server radius-server host 10.2.2.6 use-vrf default-vrf protocol peap-mschap key "ueHwGsh2N1oNJHkJpBnw6O+Q key Hash cgLEaUqN1+ZGR9OzA+Ssw\n" encryption-level 7 show vrf detail Total number of VRFs configured: 2 VRF-Name: default-vrf, VRF-Id: 1 IP Router-Id: 10.255.3.2 Interfaces: Ve 2, Ve 10, Ve 11, Ve 255 Address-family IPV4 unicast Max routes:- Route count:13 No import route-maps No export route-maps VRF-Name: mgmt-vrf, VRF-Id: 0 IP Router-Id: 0.0.0.0 Interfaces: mgmt 1, Null0 Address-family IPV4 unicast Max routes:- Route count:0 No import route-maps No export route-maps
Microsoft Network Policy Server event:6274, Reason Code:1
Log Name: Security Source: Microsoft-Windows-Security-Auditing Date: 7/5/2023 10:43:23 PM Event ID: 6274 Task Category: Network Policy Server Level: Information Keywords: Audit Failure User: N/A Computer: NPSServerName.mydomain.local Description: Network Policy Server discarded the request for a user. Contact the Network Policy Server administrator for more information. User: Security ID: MyDOMAIN\MyUserName Account Name: MyUserName Account Domain: MyDOMAIN Fully Qualified Account Name: MyDOMAIN\MyUserName Client Machine: Security ID: NULL SID Account Name: - Fully Qualified Account Name: - Called Station Identifier: - Calling Station Identifier: - NAS: NAS IPv4 Address: 10.2.2.6 NAS IPv6 Address: - NAS Identifier: - NAS Port-Type: - NAS Port: 0 RADIUS Client: Client Friendly Name: Extreme VDX6740 Client IP Address: 10.255.3.2 Authentication Details: Connection Request Policy Name: CRP-PEAP-Authentication Network Policy Name: NP-PEAP-Authentication Authentication Provider: Windows Authentication Server: NPSServerName.mydomain.local Authentication Type: EAP EAP Type: - Account Session Identifier: - Reason Code: 1 Reason: An internal error occurred. Check the system event log for additional information.
Event Xml: <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event"> <System> <Provider Name="Microsoft-Windows-Security-Auditing" Guid="{54849625-5478-4994-a5ba-3e3b0328c30d}" /> <EventID>6274</EventID> <Version>1</Version> <Level>0</Level> <Task>12552</Task> <Opcode>0</Opcode> <Keywords>0x8010000000000000</Keywords> <TimeCreated SystemTime="2023-07-05T22:43:23.620664500Z" /> <EventRecordID>977994</EventRecordID> <Correlation ActivityID="{cefcaf8c-cae6-0001-4dfc-6a676ca8d901}" /> <Execution ProcessID="716" ThreadID="6712" /> <Channel>Security</Channel> <Computer>NPSServerName.mydomain.local</Computer> <Security /> </System> <EventData> <Data Name="SubjectUserSid">S-1-5-21-1078081533-651377827-725345543-1588110</Data> <Data Name="SubjectUserName">MyUserName</Data> <Data Name="SubjectDomainName">MyDOMAIN</Data> <Data Name="FullyQualifiedSubjectUserName">MyDOMAIN\MyUserName</Data> <Data Name="SubjectMachineSID">S-1-0-0</Data> <Data Name="SubjectMachineName">-</Data> <Data Name="FullyQualifiedSubjectMachineName">-</Data> <Data Name="CalledStationID">-</Data> <Data Name="CallingStationID">-</Data> <Data Name="NASIPv4Address">10.2.2.6</Data> <Data Name="NASIPv6Address">-</Data> <Data Name="NASIdentifier">-</Data> <Data Name="NASPortType">-</Data> <Data Name="NASPort">0</Data> <Data Name="ClientName">Extreme VDX6740</Data> <Data Name="ClientIPAddress">10.255.3.2</Data> <Data Name="ProxyPolicyName">CRP-PEAP-Authentication</Data> <Data Name="NetworkPolicyName">NP-PEAP-Authentication</Data> <Data Name="AuthenticationProvider">Windows</Data> <Data Name="AuthenticationServer">NPSServerName.mydomain.local</Data> <Data Name="AuthenticationType">EAP</Data> <Data Name="EAPType">-</Data> <Data Name="AccountSessionIdentifier">-</Data> <Data Name="ReasonCode">1</Data> <Data Name="Reason">An internal error occurred. Check the system event log for additional information.</Data> </EventData> </Event>
Extreme Networks support has been unable to help me.
-
Gary Nebbett 5,851 Reputation points
2023-07-06T05:50:02.0266667+00:00 Hello MrMJFisher,
My suggestion would be to make a trace of the Microsoft-Windows-EapHost ETW (Event Tracing for Windows) provider on the NPS system while reproducing the problem.
There are many commands that can control ETW providers (wpr, pktmon, logman, netsh trace, PowerShell cmdlets, ...); for the initial trace, I would propose the command
logman start gary -ets -p Microsoft-Windows-EapHost -o why.etl
(to start the trace) andlogman stop gary -ets
(to stop the trace).Depending on what the trace data shows, further traces may well be necessary.
The trace data (why.etl) can be shared with me by any means that you wish, including posting a link to OneDrive, Google Drive, etc. (possibly of the type requiring a "request access" message) or e-mailing it to me at gary.nebbett at outlook.com.
If the time of your posting in my time zone (01:36 Central European Summer Time, UTC+2) warranted the greeting "good afternoon", then there might not be much overlap in the times of day that we are on-line.
Gary
-
MrMJFisher 46 Reputation points
2023-07-06T14:49:48.95+00:00 edit: In a direct message to me, Gary explained how my assumptions here were incorrect.
I ran the trace and clearly there is something throwing errors/warring.Like Dave, I appear to be missing the symbol files.(which matches Gary's trace command).logman query providers
shows that "{6EB8DB94-FE96-443F-A366-5FE0CEE7FB1C}" is Microsoft-Windows-EapHost -
Dave Gzorple 5 Reputation points
2023-11-03T10:47:41.5566667+00:00 A very delayed reply to @HPCPK-4455 (sorry, had notifications turned off), in this case it was related to a low-level internal detail of how PEAP encodes messages, what Microsoft did in PEAP was omit the EAP header that's supposed to be at the start of each message (because they're Microsoft) but still encode the data as if some of the EAP header was present. In other words the length field buried at offset 4 inside the PEAP message is preceded by an EAP header that isn't there and the first byte of the message is counted as being part of the nonexistant EAP header rather than the PEAP message itself, so the length field is one byte less than the actual length. Once you set the length to this incorrect-to-everyone-but-Microsoft value, things work. FreeRADIUS ignores it since it's incorrect, but NPS verifies that it is indeed incorrect and rejects the message if not.
-
Dave Gzorple 5 Reputation points
2023-11-03T10:52:02.58+00:00 A recommendation for @MrMJFisher , try running eapol_test as your client to see if that works. This provides two benefits, firstly eapol_test has workarounds for a great many of the bugs in NPS so you can at least see the workflow as it could be in theory, and secondly it provides a very detailed trace of how far it got and what it sent and received. Finally, since you've got the source code you can, worst-case, modify it to dump extra diagnostics to try and discover problems, which is what I ended up doing.
-
MrMJFisher 46 Reputation points
2023-11-03T18:45:15.4966667+00:00 @Dave Gzorple I'd be happy to do that the "Linux" environment on the Extreme Networks' VDX switch does not contain the "make" command so I cannot compile the "eapol_test".
-
Dave Gzorple 5 Reputation points
2023-11-04T10:09:53.9466667+00:00 Oh, I didn't mean on the switch, I meant take anything random running Linux and use that to run eapol_test with the same credentials as the switch. That'll give you either a known-good run or, failing that, some sort of diagnostics around the point of failure.
Alternatively, use the switch to auth against FreeRADIUS to see if that works, and if not, check the FreeRADIUS log for details. I can provide more notes on how to set it up if you're not familiar with it.
-
Gary Nebbett 5,851 Reputation points
2023-11-05T09:51:30.61+00:00 Hello @MrMJFisher
The offer to analyse your trace data (if you are prepared to share it) still stands, as an complementary approach to that described by Dave.
Gary
-
MrMJFisher 46 Reputation points
2023-11-07T18:01:49.4233333+00:00 @Dave Gzorple ; I see and understand now what you were suggesting. I don't know if I'll be able to make that a priority as we are looking to replace those devices in the "near future".
@Gary Nebbett ; You and I did exchange a few emails where you graciously shared your expertise and time. I'll review our email chain, but if I remember correctly you had found there was an issue with how the VDX calculated the size of the inner portion of the EAP packet and I was never able to get Extreme Networks to acknowledge that.
-
Gary Nebbett 5,851 Reputation points
2023-11-07T18:48:56.6133333+00:00 Hello @MrMJFisher
Thanks for that reminder! When I first started to read it, it caused a bit of panic about its implications for my memory (I am of an age where that is an increasing concern), but I have calmed down now :-) (if that is the right smiley for "relief").
I think (but can't remember exactly :-( ) that I had enough information to argue persuasively with Extreme Networks if I was able to talk directly to their technical experts. @Dave Gzorple would have been a good "sounding board" for the technical accuracy of my views. Dave and I both would probably both be willing to help further, but that would be unlikely to result in any improvement to the situation in a reasonable time frame.
Gary
Sign in to comment