To investigate further and avoid recurrence, please provide the following :
From AIS Server :
- The server.log and eh.log. These show when the most TPEs occurred, so the investigation can concentrate on those time periods.
- The slow_transactions_log.csv and server_counters.log.csv, which should also be reviewed.
- When running multiple AIS instances, check whether the TPEs occurred at around the same time on each instance, which can indicate that the problem is DB-related.
Note : You can find these files in <AIS_SERVER>/Instances/ais/logs/
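
The checks above (finding the periods with the most TPEs, and comparing instances) can be sketched in Python. This is only a minimal illustration: it assumes each log line starts with a timestamp in the form "YYYY-MM-DD HH:MM:SS" and that TPE lines contain the text "TPE". Neither assumption comes from the AIS documentation, so adjust the pattern and marker to match the real server.log and eh.log format. The function names are hypothetical.

```python
import re
from collections import Counter

# ASSUMPTION: log lines start with "YYYY-MM-DD HH:MM:SS"; adjust to the real format.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}")


def count_tpes_per_hour(lines, marker="TPE"):
    """Count lines containing `marker`, grouped by hour ("YYYY-MM-DD HH")."""
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def busiest_hours(counts, top=3):
    """Return the `top` hours with the most TPE lines, busiest first."""
    return counts.most_common(top)


def shared_hours(counts_a, counts_b):
    """Hours in which two AIS instances both logged at least one TPE."""
    return sorted(set(counts_a) & set(counts_b))
```

To use it, open each instance's log file, pass the file object to count_tpes_per_hour, and compare the results with shared_hours. Hours that appear for every instance are the ones to check against DB activity.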
From DB Server :
- SQL_FOR_BATCH_LOG_ARCHIVE and SQL_FOR_BATCH_LOG. These tables log the various activities that are performed by the solution.
- Find out if maintenance was running when the TPEs occurred.
- If maintenance was running at the time, find out which step/SQL is involved.
- Disk Usage Summary for the period when the most TPEs occurred.
- Query Statistics History for the period when the most TPEs occurred.
- Server Activity History for the period when the most TPEs occurred.
Send all of the data above to Q2 Support ([email protected]) for further investigation.
During the investigation, make sure that :
- Network latency between the servers (AIS - DB, AIS - RCM, and DB - RCM) is <1ms
- Disk latency for the database server is <10ms
- Performance activity is monitored on each server (AIS, DB, RCM).
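
As a rough way to spot-check the network latency target, the sketch below times TCP connections from one server to another using only the Python standard library. It is an approximation, not a replacement for proper network monitoring: TCP connect time includes handshake and operating-system overhead, so it tends to overstate the raw round-trip time. The host and port are placeholders you must supply, and the function names are hypothetical.

```python
import socket
import statistics
import time


def tcp_connect_latency_ms(host, port, samples=5, timeout=2.0):
    """Median time in milliseconds to open a TCP connection to host:port."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # Open and immediately close the connection; only the connect is timed.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def within_threshold(latency_ms, limit_ms=1.0):
    """True if the measured latency is below the limit (1 ms by default)."""
    return latency_ms < limit_ms
```

Run it from the AIS server against the DB server's listening port (and likewise for the other pairs), then pass the result to within_threshold. A result at or above 1 ms is worth confirming with a dedicated network tool before drawing conclusions.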