ITS Alert - Resolved: PASS CIFS Gateway Group Lookup Issue

Resolved: PASS CIFS Gateway Group Lookup Issue

Last updated on November 24, 2009 at 5:34PM

Update on November 24, 2009 at 5:34PM

By noon today, all PASS CIFS gateway servers had been switched to use the new LDAP Directory Servers. Changes in access due to new account creations or group membership changes will once again be reflected on this service.

ITS will continue to monitor and tune the service as needed; however, the new Directory Servers are expected to greatly outperform the old version. The new version uses a much more robust, multithreaded replication model, and tests have shown that, with proper tuning, the new servers perform orders of magnitude better. This is expected to resolve the user and group account information lookup problem with the CIFS service.

Update on November 15, 2009 at 10:42PM

After the conclusion of <http://alerts.its.psu.edu/alert-1279>, the CIFS gateway servers will continue to use the dedicated directory servers until tomorrow to ensure a successful transition of load. Note that group changes made after the upgrade may not take effect on access control via CIFS until after the transition to the new replicas is complete.

Update on October 28, 2009 at 10:36AM

At 10:06 a.m., both backend servers reached maximum load. By 10:33 a.m., some of the gateways had been forced to use failover servers, and the load on all servers had returned to normal. User experience should have improved.

Update on October 28, 2009 at 10:01AM

At 9:53 a.m., one of the backend directory servers reached maximum load, which affects half of the CIFS gateways. Some users may experience delays or errors during login to the CIFS gateway servers. ITS expects the load to subside in a few minutes.

Update on October 7, 2009 at 1:29PM

The load had subsided by 1:18 p.m.

Update on October 7, 2009 at 12:14PM

At 12:07 p.m., the load placed on one LDAP server by the CIFS gateways exceeded the normal operating threshold, and requests are being queued. Some users may experience delays or errors during login to the CIFS gateway servers. ITS expects the load to subside in a few minutes.

Update on October 2, 2009 at 10:38AM

From 9:59 a.m. to 10:19 a.m., LDAP requests from some of the CIFS gateways overloaded one of the LDAP servers, temporarily placing further requests into a backlog queue, which affected about half of the CIFS gateway servers. Some users may have experienced delays or errors while using CIFS during that time. The service has recovered. ITS continues to monitor the service and work on improvements.

Update on September 23, 2009 at 12:36PM

The CIFS gateways have recovered. ITS has collected information on this event and continues to tune the servers.

Update on September 23, 2009 at 12:22PM

Peak load has returned. Network drives mapped to the CIFS gateways may be slow or may generate errors during login and access. An update will be posted when the issue is resolved.

Update on September 16, 2009 at 11:02PM

Today, the load passed from the CIFS gateway servers to the directory servers appears to have exceeded previous levels and overwhelmed the earlier safeguards. The problem has occurred only during the peak load hours of 8:50 a.m. to 4 p.m. on days when classes were in session; it rarely persisted for more than a few hours and often subsided after a few minutes. Some users may have experienced delays or errors at login, while users who were already logged in could continue to work without problems. Some may have experienced problems with group authorization while individual authorization controls remained operational.

Users who encounter the problem may be able to work around it by retrying a few minutes later, logging in before 8:30 a.m. if possible, or using another access method (see http://its.psu.edu/PASS/connect.html). No problems related to this issue have been detected outside of peak hours or with other access methods.
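
For anyone who scripts access to a PASS network drive, the advice to retry a few minutes later can be sketched in code. This is a minimal Python sketch, not an ITS-provided tool: the function name retry_later, the three-attempt limit, and the three-minute wait are illustrative assumptions, and the sketch assumes the transient failure surfaces as an OSError from the file operation.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_later(
    action: Callable[[], T],
    attempts: int = 3,             # assumed limit, not an ITS recommendation
    wait_seconds: float = 180.0,   # "a few minutes"; the exact value is an assumption
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run action(); if it raises OSError, wait and try again.

    Gives up and re-raises the last error after `attempts` tries.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    last_error: Optional[OSError] = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError as err:
            last_error = err
            # Only wait if another attempt is still to come.
            if attempt < attempts:
                sleep(wait_seconds)

    assert last_error is not None
    raise last_error
```

For example, `retry_later(lambda: os.listdir(share_path))` would retry a directory listing, where `share_path` is a hypothetical path to a mapped PASS drive. If the problem persists past the last attempt, the other access methods linked above remain the better fallback.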

Tonight, the directory servers were tuned with 50% more responders, which is expected to handle the higher load seen today. Additional options will be tested tomorrow.

Update on September 16, 2009 at 12:13PM

The problem returned this morning. Further investigation is underway.

Update on September 12, 2009 at 12:22PM

By Thursday afternoon, September 3, ITS had configured and finished tuning two dedicated directory servers to support the PASS CIFS gateway servers. These servers were closely monitored the following week with no further issues detected.

Update on September 1, 2009 at 10:15AM

Since midday yesterday, ITS staff have been testing a new backup directory server with the PASS CIFS servers. The new server has newer hardware than the servers currently in production, and the test has been successful so far. Should the test remain successful, the server will be rolled into the production cluster to receive updates. We hope to have this complete by tomorrow.

Update on August 31, 2009 at 12:15PM

By 11:30am, group information had failed again on one directory server; the affected CIFS gateways were moved to the backup directory server by 12:10pm.

Update on August 31, 2009 at 9:08AM

Some CIFS gateways were switched back to the backup directory server by 8:54am. Some LDAP-based applications may have seen delays in response over the past 10 minutes.

Update on August 28, 2009 at 7:49PM

By 5:42pm, the CIFS gateway servers had been reverted to use the production directory servers.

Update on August 28, 2009 at 12:06PM

LDAP and LDAP-based services were affected shortly after the last update.

Affected services may include:
- NFS
- blogs.psu.edu
- explorer.pass.psu.edu
- php.scripts.psu.edu
- www.work.psu.edu

Availability of affected services should be restored shortly. CIFS access to PASS is currently operating in degraded mode: any accounts and groups added or changed within the last 24 hours may not be current.

Update on August 28, 2009 at 11:42AM

ITS further adjusted the performance tuning of the production servers, and the CIFS gateway servers were switched back to use production account and group information by 11:35am. Additional tuning may be forthcoming.

Update on August 28, 2009 at 9:03AM

At 8:55am, the CIFS gateways were switched to a backup copy of the directory to alleviate stress on the primary directory servers (LDAP) during peak hours for updates. Group information should still function correctly during this time; however, account and group changes made over the past 2 days may not be seen via CIFS access to PASS. This will be restored shortly after the peak time has finished and is expected to be resolved by 10 am. More information will come as this progresses.

Update on August 27, 2009 at 6:38PM

Group-based access on the affected CIFS servers was fully restored by 6:00pm today. Additional performance adjustments have been made. ITS staff will continue to monitor the service.

Update on August 27, 2009 at 3:23PM

Although the group information problem had been corrected by 9:50pm last night, a recurrence was detected by 10:50am today. ITS staff continue to investigate the cause of the problem.

Update on August 26, 2009 at 11:01PM

ITS staff were able to trace the issue to a transient error and a limitation in the directory information client program on the CIFS servers. The directory client limitation caused a performance problem on other services, as well as the service interruption in the PASS filesystem noted in http://alerts.its.psu.edu/alert-1211.

At this point, the group resolution issue has been resolved, and ITS staff will take measures to compensate for the known limitation. Other services had stabilized by 6pm.

Original Alert

This morning, access to PASS files via the SMB/CIFS gateway (win.pass.psu.edu, smb.pass.psu.edu, etc.) is intermittently failing to honor group membership. Access to user-owned files does not appear to be affected. ITS staff are actively investigating this issue.

For more information, please contact ITS Help Desk (helpdesk@psu.edu).


Impact Information

  • Incident Type:
    Service Degradation
  • Services affected:
    PASS File Service
  • Locations affected:
    All locations
  • Began on:
    August 26, 2009 at 11:00AM
  • Issue Resolved:
    November 24, 2009 at 12:00PM