Best Practices for Cold Failover and Disaster Recovery with PI Analysis and Notification Services

In industrial environments where high availability is critical, well-configured cold failover and disaster recovery (DR) strategies for PI Analysis and PI Notification Services help protect data integrity and business continuity. This post covers key considerations and best practices for managing failover between primary and backup clusters, especially in multi-datacenter scenarios.

Key Concepts and Challenges

1. Shared Execution File for Analyses

The PI Analysis Service keeps a shared file (often called the "execution status" or "last executions" file) that records the state and progress of active analyses. This file is:

Essential for determining where to resume analysis tasks after service restarts, failovers, or unexpected shutdowns.
Typically hosted on a shared file location, especially in Windows Failover Cluster (WFC) configurations.
Documented in detail within the Windows Failover Clustering section of the PI System Administration guides.

Best Practice: Ensure the shared execution file is reliably available from both your primary and backup clusters or data centers. If this isn't possible, plan for an auto-backfill or "mini-backfill" to recalculate analysis results for the downtime period.
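
As a rough illustration of the availability check described above, the following sketch reports whether a shared file can be read from the current node and how recently it was written. The function name, the status labels, and the 15-minute threshold are assumptions made for this example, not PI System settings. Substitute the execution file location documented for your installation.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Optional


def execution_file_status(path: Path, max_age: timedelta,
                          now: Optional[datetime] = None) -> str:
    """Return 'ok', 'stale', or 'missing' for the shared execution file.

    'missing' also covers an unreachable share, because both cases mean
    the backup node cannot rely on the file to resume analyses.
    """
    now = now or datetime.now(timezone.utc)
    try:
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    except OSError:
        return "missing"
    # A file that has not been written recently may not reflect the
    # last executions before the outage.
    return "ok" if now - modified <= max_age else "stale"
```

Running this from both the primary and the backup site, for example with max_age=timedelta(minutes=15), gives an early warning that a failover would need a mini-backfill instead of a clean resume.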

2. AF Configuration Database Considerations

The Asset Framework (AF) configuration contains critical data about which server (or cluster node) is running which services.

During a failover, it may be necessary to update AF configuration to point to the newly active server or cluster hostname.
Manual edits to AF configuration files should be avoided. PI System safeguards can detect tampering and may ignore or even delete altered files.

Best Practice: Use supported configuration tools or management interfaces to update AF configuration. Document and automate these steps in your DR procedures, if possible.

3. Mini-Backfill for Data Gaps

If the shared execution file cannot be accessed or is not retained between sites:

The backup PI Analysis service will not know precisely where to resume processing.
A mini-backfill (reprocessing a targeted historical data window) is recommended to ensure no analysis results are lost for the downtime period.

Tip: Automate mini-backfills as part of DR runbooks to reduce manual intervention and minimize errors.
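
One piece of a mini-backfill that is easy to automate is choosing the time window to reprocess. The sketch below is a minimal example under stated assumptions: the function name, the 10-minute safety margin, and the 24-hour lookback cap are illustrative choices, not PI System defaults. It only computes the window. Queuing the recalculation itself would be done with whatever supported tool your site uses.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def mini_backfill_window(last_known_good: Optional[datetime],
                         services_restored: datetime,
                         margin: timedelta = timedelta(minutes=10),
                         max_lookback: timedelta = timedelta(hours=24),
                         ) -> Tuple[datetime, datetime]:
    """Return (start, end) of the period to recalculate after a failover.

    last_known_good is the last time analyses are known to have run on
    the failed site, or None if the execution file was lost.
    """
    earliest = services_restored - max_lookback
    if last_known_good is None:
        # Nothing is known about the last execution: use the full lookback.
        start = earliest
    else:
        # Start slightly before the last known execution so that the
        # window overlaps it, but never exceed the lookback cap.
        start = max(last_known_good - margin, earliest)
    if start >= services_restored:
        raise ValueError("backfill window is empty; check the timestamps")
    return start, services_restored


if __name__ == "__main__":
    failed = datetime(2025, 5, 8, 10, 0, tzinfo=timezone.utc)
    restored = datetime(2025, 5, 8, 11, 30, tzinfo=timezone.utc)
    print(mini_backfill_window(failed, restored))
```

The margin deliberately overlaps the last known execution, trading a little redundant recalculation for a lower chance of leaving a gap.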

4. Rethinking Redundancy Levels

Clustered setups with multiple levels of redundancy (primary cluster, backup cluster, active-passive nodes, etc.) are common, but may introduce unnecessary complexity or cost.

Where possible, handle failures at the level of the individual analytics service or machine rather than failing over an entire cluster.
If the cluster infrastructure fails, consider manually starting analytics services on a standalone node and updating the AF configuration accordingly.

5. Service Activation in Active/Passive Scenarios

For simpler configurations with an "active" and a "passive" analytics server:

In most cases, you only need to start the analytics and notification services on the passive server if the active server fails.
Afterwards, confirm that the AF configuration or cluster references point to the newly active server.
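
The service start step can be scripted. The sketch below builds Windows sc.exe start commands for a passive server and, by default, only prints them. The service names PIAnalysisManager and PINotificationsService are assumptions about how the services are registered. Confirm the actual names on your servers (for example with sc.exe query) before using anything like this. The host name in the usage line is hypothetical.

```python
import subprocess
from typing import List, Sequence

# Assumed Windows service names; verify them on your own servers.
DEFAULT_SERVICES = ("PIAnalysisManager", "PINotificationsService")


def build_start_commands(host: str,
                         services: Sequence[str] = DEFAULT_SERVICES,
                         ) -> List[List[str]]:
    """Build one 'sc.exe \\\\host start <service>' command per service."""
    return [["sc.exe", f"\\\\{host}", "start", name] for name in services]


def start_services(host: str,
                   services: Sequence[str] = DEFAULT_SERVICES,
                   dry_run: bool = True) -> List[List[str]]:
    """Start the services on the given host, or just print the commands."""
    commands = build_start_commands(host, services)
    for command in commands:
        if dry_run:
            print("DRY RUN:", " ".join(command))
        else:
            # check=True raises CalledProcessError if sc.exe reports failure.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    start_services("PASSIVE01")  # hypothetical host name; dry run only
```

Keeping dry_run as the default makes the script safe to exercise during DR drills without touching the passive server.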

6. Buffered Data Ownership

For buffered data writes (handled by PI Buffer Subsystem or PI Interface Buffering), point ownership transfers automatically during failover, which reduces the risk of buffer locks and data loss.

7. File Share Accessibility

If both the primary and backup data centers can access a common file share, the backup site can recover the execution status and auto-backfill without manual steps. If they cannot, make sure your DR procedures include steps for recalculating analyses over the outage period.

8. Documentation and References

The OSIsoft (now AVEVA) manuals contain specific recommendations for configuring shared folders for PI Notification and Analysis servers (commonly under ProgramData\OSIsoft\). If needed, contact AVEVA support or check the online documentation for the latest best practices and folder paths.

Final Tips for DR Procedures

Document all DR procedural steps, including how to bring services online, update configuration, and trigger recalculations.
Test your failover and DR scenarios regularly to catch issues before a real incident.
Review and update your DR documentation as your PI System and infrastructure evolve.
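
One way to keep documented steps and tested steps from drifting apart is to hold the runbook as an ordered list that a script can walk through. The following is a minimal sketch of that idea, not a PI System tool. The Step class, the run_runbook function, and the step names in the usage section are all hypothetical, and the lambdas stand in for real checks or actions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], bool]  # returns True on success


def run_runbook(steps: List[Step]) -> List[str]:
    """Run steps in order and stop at the first failure.

    Returns one log line per executed step so that each drill leaves a
    record that can be compared against the written procedure.
    """
    log = []
    for number, step in enumerate(steps, start=1):
        ok = step.action()
        log.append(f"{number}. {step.name}: {'OK' if ok else 'FAILED'}")
        if not ok:
            break
    return log


if __name__ == "__main__":
    drill = [
        Step("Start services on backup node", lambda: True),
        Step("Update AF configuration", lambda: True),
        Step("Trigger mini-backfill", lambda: True),
    ]
    for line in run_runbook(drill):
        print(line)
```

Stopping at the first failed step mirrors how a manual procedure is usually followed: later steps often assume the earlier ones succeeded.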

By understanding the role of the shared execution file, managing AF configuration properly, and planning for both routine failover and full DR scenarios, you can improve the reliability and recoverability of PI Analysis and Notification services across clusters and data centers.

Further Reading:

PI System High Availability Documentation (AVEVA/OSIsoft)
PI AF and Analytics Administrator's Guide

Categories: High Availability, Disaster Recovery, PI Server, PI AF
Tags: PI Analysis, PI Notifications, Failover, DR, AF Configuration, Windows Clustering, Mini-Backfill

About Roshan Soni
