ServiceNow - "service mapping recomputation" jobs causing ServiceNow performance issues

Review the Service Mapping Recomputation jobs and, as described in KB0824377, lower the job count to 1 for immediate relief.

https://support.servicenow.com/kb?id=kb_article_view&sysparm_article=KB0824377

https://support.servicenow.com/kb?id=kb_article_view&sysparm_article=KB0869419

See also the notes further down on the sa.service.max_ci_service_population system property.

Memory degradation caused by high number of "Service Mapping Recomputation" jobs

https://support.servicenow.com/kb?id=kb_article_view&sysparm_article=KB0824377

Note: in my experience, moving this property between instances via an update set DOES NOT WORK. It needs to be created manually on each instance (on the San Diego release, at least).

1) Go to the "System Properties" list (https://<instance>.service-now.com/sys_properties_list.do)

2) Click New

a) Fill out:
Name: glide.service_mapping.recomputation.job_count
Type: Integer
Value: 1

b) Save

You can actually stop these jobs entirely by setting this property to 0.
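If you have a lot of instances, the manual steps above could probably be scripted against the REST Table API instead. This is only a sketch: it assumes the Table API is reachable and that the account is allowed to write to sys_properties, and I have not checked whether a property created this way behaves the same as one created in the UI. The instance name and credentials are placeholders, and `build_property_request` is a made-up helper name.

```python
import base64
import json
import urllib.request

PROPERTY_NAME = "glide.service_mapping.recomputation.job_count"


def build_property_request(instance, user, password, value=1):
    """Build (but do not send) a POST that creates the system property.

    Assumption: the standard Table API endpoint /api/now/table/sys_properties
    is enabled and the user may insert into sys_properties.
    """
    url = f"https://{instance}.service-now.com/api/now/table/sys_properties"
    body = json.dumps(
        {"name": PROPERTY_NAME, "type": "integer", "value": str(value)}
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# Nothing is sent until the request is opened, for example:
#   req = build_property_request("myinstance", "admin", "secret", value=1)
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status)
```

Building the request separately from sending it makes it easy to check the URL and payload before touching a real instance.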

==============================

Issue:
Dev running slow

Investigation Summary:
I have reviewed the instance and can see memory contention on several of its nodes.

servlet    started              heapspace        metaspace          rss      virt   cpu    resp    tpm  sess  errors
---------  -------------------  ---------------  -----------------  -------  -----  -----  ------  ---  ----  ------
xxxdev015  2022-04-20 10:34:22  1.8 G / 1.9 G    147.0 M / 304.0 M  2.6 G    2.8 G  214.6  12.81   1.4  4     201
xxxdev021  2022-04-13 10:54:21  1.8 G / 1.9 G    201.0 M / 304.0 M  2.9 G    3.0 G  215.3  158.01  1.4  13    224975
xxxdev022  2022-04-20 12:11:58  311.0 M / 1.9 G  106.0 M / 304.0 M  681.0 M  2.7 G  875.0  0.00    0.0  0     0
xxxdev023  2022-04-13 11:01:16  1.6 G / 1.9 G    186.0 M / 304.0 M  2.8 G    3.0 G  184.6  5.72    0.2  11    223210
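To make the memory pressure in the node listing easier to see, the heapspace column can be turned into a used/total ratio. A rough sketch, assuming the "M" and "G" suffixes are binary units (the listing does not say); the function names are my own.

```python
import re

# Assumption: M and G are binary multiples; the listing does not specify.
UNIT_BYTES = {"M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(text):
    """Convert a value such as '311.0 M' or '1.9 G' to bytes."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([MG])\s*", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * UNIT_BYTES[match.group(2)]


def heap_usage(heapspace):
    """Return used/total for a heapspace cell like '1.8 G / 1.9 G'."""
    used, total = heapspace.split("/")
    return to_bytes(used) / to_bytes(total)


# Heapspace values copied from the node listing.
NODES = {
    "xxxdev015": "1.8 G / 1.9 G",
    "xxxdev021": "1.8 G / 1.9 G",
    "xxxdev022": "311.0 M / 1.9 G",
    "xxxdev023": "1.6 G / 1.9 G",
}

for node, heap in NODES.items():
    print(f"{node}: {heap_usage(heap):.0%} of heap in use")
```

Three of the four nodes come out well above 80% heap usage; the one low value belongs to the node with the most recent start time.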

job           thread                    item                                      started              age
------------ ------------------------ ---------------------------------------- ------------------- -------
xxxdev021 glide.scheduler.worker.6 Service Mapping Recomputation 2           2022-04-20 11:54:33 0:18:39
xxxdev021 glide.scheduler.worker.5 Service Mapping Recomputation 1           2022-04-20 11:56:36 0:16:35
xxxdev015 glide.scheduler.worker.2 Autoclose Incidents                       2022-04-20 12:00:12 0:13:00
xxxdev015 glide.scheduler.worker.3 Service Mapping Recomputation 1           2022-04-20 12:04:02 0:09:10
xxxdev015 glide.scheduler.worker.0 Service Mapping Recomputation 2           2022-04-20 12:04:02 0:09:10
xxxdev023 glide.scheduler.worker.2 Service Mapping Recomputation 1           2022-04-20 12:09:38 0:03:36
xxxdev021 glide.scheduler.worker.1 ASYNC: Affected ci notifications          2022-04-20 12:09:45 0:03:26
xxxdev023 glide.scheduler.worker.4 Service Mapping Recomputation 2           2022-04-20 12:11:27 0:01:47
xxxdev021 glide.scheduler.worker.3 Run Instance-side Probes                  2022-04-20 12:11:24 0:01:47
xxxdev023 glide.scheduler.worker.6 ASYNC: Affected ci notifications          2022-04-20 12:11:39 0:01:35
xxxdev023 glide.scheduler.worker.0 ASYNC: Affected ci notifications          2022-04-20 12:11:39 0:01:35
xxxdev015 glide.scheduler.worker.5 Event Management - Impact Calculator fo 2022-04-20 12:11:39 0:01:33
xxxdev023 glide.scheduler.worker.1 Event Management - Impact Calculator fo 2022-04-20 12:12:00 0:01:14
xxxdev021 glide.scheduler.worker.7 Update Business Service Status            2022-04-20 12:12:06 0:01:05
xxxdev015 glide.scheduler.worker.4 UsageAnalytics App Persistor              2022-04-20 12:12:42 0:00:29
xxxdev021 glide.scheduler.worker.4 ASYNC: Discovery - Sensors get (https:// 2022-04-20 12:12:43 0:00:28
xxxdev023 glide.scheduler.worker.7 GCF Download Definition Collections       2022-04-20 12:12:55 0:00:19
xxxdev023 glide.scheduler.worker.5 GCF Download Blacklist and Whitelist      2022-04-20 12:12:55 0:00:19
xxxdev021 glide.scheduler.worker.2 ASYNC: Discovery - Sensors get (https:// 2022-04-20 12:12:55 0:00:17
xxxdev022 glide.scheduler.worker.5 Init UI Metadata                          2022-04-20 12:13:07 0:00:04
xxxdev022 glide.scheduler.worker.3 Init Service Designer Form                2022-04-20 12:13:07 0:00:04
xxxdev022 glide.scheduler.worker.0 Register Instance                         2022-04-20 12:13:07 0:00:04
xxxdev022 glide.scheduler.worker.2 Init Service Portal SCSS                  2022-04-20 12:13:07 0:00:04
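A quick way to see how many recomputation jobs each node is running is to count them from a listing like the one above. This is a sketch that assumes the column layout shown here (node, worker thread, item, start time, age); the regular expression and `count_jobs` are my own and only a few sample rows are included.

```python
import re
from collections import Counter

# Assumed layout: node, worker thread, item, "YYYY-MM-DD HH:MM:SS", age.
JOB_LINE = re.compile(
    r"^(?P<node>\S+)\s+(?P<thread>glide\.scheduler\.worker\.\d+)\s+"
    r"(?P<item>.+?)\s+(?P<started>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<age>\d+:\d{2}:\d{2})$"
)


def count_jobs(listing, prefix="Service Mapping Recomputation"):
    """Count running jobs per node whose item name starts with prefix."""
    counts = Counter()
    for line in listing.splitlines():
        match = JOB_LINE.match(line.strip())
        if match and match.group("item").startswith(prefix):
            counts[match.group("node")] += 1
    return counts


# A few rows copied from the job listing.
SAMPLE = """\
xxxdev021 glide.scheduler.worker.6 Service Mapping Recomputation 2           2022-04-20 11:54:33 0:18:39
xxxdev021 glide.scheduler.worker.5 Service Mapping Recomputation 1           2022-04-20 11:56:36 0:16:35
xxxdev015 glide.scheduler.worker.2 Autoclose Incidents                       2022-04-20 12:00:12 0:13:00
xxxdev015 glide.scheduler.worker.3 Service Mapping Recomputation 1           2022-04-20 12:04:02 0:09:10
xxxdev015 glide.scheduler.worker.0 Service Mapping Recomputation 2           2022-04-20 12:04:02 0:09:10
xxxdev023 glide.scheduler.worker.2 Service Mapping Recomputation 1           2022-04-20 12:09:38 0:03:36
xxxdev023 glide.scheduler.worker.4 Service Mapping Recomputation 2           2022-04-20 12:11:27 0:01:47
"""

for node, count in sorted(count_jobs(SAMPLE).items()):
    print(node, count)
```

In the listing, each of the three busy nodes is running both "Service Mapping Recomputation 1" and "Service Mapping Recomputation 2" at the same time.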

I have analysed the heap dumps for nodes xxxdev021 and xxxdev022. Multiple 'Service Mapping Recomputation' job threads running on these nodes occupy most of the heap space and are the main cause of the slowness.
On node xxxdev021, around 67% of the heap space was used by the 'Service Mapping Recomputation 1' job.

The thread com.glide.schedule_v2.SchedulerWorkerThread @ 0x9b531838 glide.scheduler.worker.0 keeps local variables with total size 1.196.819.952 (67,21%) bytes.

The memory is accumulated in one instance of com.glide.schedule_v2.SchedulerWorkerThread, loaded by com.snc.orbit.container.tomcat8.Tomcat8$OrbitTomcat8ClassLoader @ 0x91c4fc78, which occupies 1.196.819.952 (67,21%) bytes.
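The heap report prints its numbers with "." as the thousands separator and "," as the decimal mark, which is easy to misread. A small sketch of parsing that format and working out the heap total the percentage implies. It assumes the percentage is relative to the total heap captured in the dump; the function names are my own.

```python
import re


def parse_retained(text):
    """Extract (bytes, percent) from text like '1.196.819.952 (67,21%) bytes'."""
    match = re.search(r"([\d.]+) \((\d+),(\d+)%\)", text)
    if match is None:
        raise ValueError("no size/percentage pair found")
    size = int(match.group(1).replace(".", ""))  # drop thousands separators
    percent = float(f"{match.group(2)}.{match.group(3)}")  # comma -> point
    return size, percent


def implied_total(size, percent):
    """Total heap implied by one object's size and its share of the heap."""
    return size / (percent / 100)


REPORT = (
    "The thread com.glide.schedule_v2.SchedulerWorkerThread @ 0x9b531838 "
    "glide.scheduler.worker.0 keeps local variables with total size "
    "1.196.819.952 (67,21%) bytes."
)

size, percent = parse_retained(REPORT)
print(f"{size:,} bytes = {percent}% of roughly {implied_total(size, percent):,.0f} bytes")
```

In other words, a single scheduler worker thread is holding about 1.2 billion bytes, which under the assumption above is about two thirds of everything in the dump.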

2022-04-20 02:56:36 (969) worker.5 worker.5 txid=e5d1a3901bcf Name: Service Mapping Recomputation 1
Job Context:
#Wed Apr 20 02:56:33 PDT 2022
fcScriptName=in the schedule record

Script:
SNC.ServiceMappingFactory.recompute();
2022-04-20 02:56:36 (979) worker.5 worker.5 txid=e5d1a3901bcf service_mapping.coordinator                     : Recomputing environment '7216c62d1b3249106248a822b24bcb92'
2022-04-20 02:56:37 (054) worker.5 worker.5 txid=e5d1a3901bcf service_mapping.coordinator                     : Pre processing environment '7216c62d1b3249106248a822b24bcb92'
2022-04-20 02:56:37 (054) worker.5 worker.5 txid=e5d1a3901bcf service_mapping.service_populator               : Populating service 3e16c62d1b3249106248a822b24bcb4d via Script Populator
2022-04-20 02:56:37 (055) worker.5 worker.5 txid=e5d1a3901bcf service_mapping.service_populator               : About to acquire lock SMServicePopulatorLock (service=3e16c62d1b3249106248a822b24bcb4d, mode=RECOMPUTATION)
2022-04-20 02:56:37 (661) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.165] id: xxxdev_1[glide.12 (connpid=756727)] for: DBQuery#loadResultSet[cmdb_rel_ci: parentIN717907df1bde85906248a822b24bcb5a,a280ede31bde499049c38732f54bcb71,2e209ed61b464990d4802f0a2d4bcb49,1d7bfb0f1bdac1906248a822b24bcb9f,94c944521bb2411049c38732f54bcbae,8a8d9f3f1b2ac9946248a822b24bcbb9,b87903df1bde85906248a822b24bcb31,651b06261b3d09946248a822b24bcbfd,050e2be61beec55449c38732f54bcb6f,1b20ded61b464990d4802f0a2d4bcb5f,e3101ad61b464990d4802f0a2d4bcb86,098b7f4f1bdac1906248a822b24bcb29,7c7903df1bde85906248a822b24bcb33,1cf2341a1bb2811049c38732f54bcbe9,e97b3f0f1bdac1906248a822b24bcb00,3fc99f2e1b0e45506248a822b24bcbbb,31fa06e21b3d09946248a822b24bcbc1,864be7371b220d94987d1fc3b24bcb95,e47767a61b4e01d0987d1fc3b24bcbd3,8ace7f271b754d14987d1fc3b24bcb61,070825831b0a49d0d4802f0a2d4bcb2e,1050d25a1b464990d4802f0a2d4bcb0c,e57bfb0f1bdac1906248a822b24bcba7,f47903df1bde85906248a822b24bcb2f^child.sys_class_pathNOT LIKE/!!/#L%^child.sys_class_pathNOT LIKE/!!/!D%^child.sys_class_pathNOT LIKE/!!/!(%^child.sys...
2022-04-20 02:56:37 (661) worker.5 worker.5 txid=e5d1a3901bcf Time: 0:00:00.198 id: xxxdev_1[glide.12] (connpid=756727) for: SELECT cmdb_rel_ci0.`parent`, cmdb_rel_ci0.`child`, cmdb_rel_ci0.`type`, cmdb_rel_ci0.`sys_id`, cmdb_rel_ci0.`sys_updated_on`, cmdb1.`sys_class_name` AS parent_sys_class_name, cmdb1.`sys_domain` AS parent_sys_domain_sys_id, cmdb1.`name` AS parent_name, cmdb2.`sys_class_name` AS child_sys_class_name, cmdb2.`sys_domain` AS child_sys_domain_sys_id, cmdb2.`name` AS child_name, cmdb1.`sys_id` AS parent_sys_id, cmdb2.`sys_id` AS child_sys_id FROM ((cmdb_rel_ci cmdb_rel_ci0 LEFT JOIN cmdb cmdb1 ON cmdb_rel_ci0.`parent` = cmdb1.`sys_id` ) LEFT JOIN cmdb cmdb2 ON cmdb_rel_ci0.`child` = cmdb2.`sys_id` ) WHERE cmdb_rel_ci0.`parent` IN ('717907df1bde85906248a822b24bcb5a' , 'a280ede31bde499049c38732f54bcb71' , '2e209ed61b464990d4802f0a2d4bcb49' , '1d7bfb0f1bdac1906248a822b24bcb9f' , '94c944521bb2411049c38732f54bcbae' , '8a8d9f3f1b2ac9946248a822b24bcbb9' , 'b87903df1bde85906248a822b24bcb31' , '651b06261b3d09946248a822b24bcbfd' , '050e2be61beec55449c38732f54bcb6f' , '1b20ded61b464990d4802f0a2d4bcb5f' , 'e3101ad61b464990d4802f0a2d4bcb86' , '098b7f4f1bdac1906248a822b24bcb29' , '7c7903df1bde85906248a822b24bcb33' , '1cf2341a1bb2811049c38732f54bcbe9' , 'e97b3f0f1bdac1906248a822b24bcb00' , '3fc99f2e1b0e45506248a822b24bcbbb' , '31fa06e21b3d09946248a822b24bcbc1' , '864be7371b220d94987d1fc3b24bcb95' , 'e47767a61b4e01d0987d1fc3b24bcbd3' , '8ace7f271b754d14987d1fc3b24bcb61' , '070825831b0a49d0d4802f0a2d4bcb2e' , '1050d25a1b464990d4802f0a2d4bcb0c' , 'e57bfb0f1bdac1906248a822b24bcba7' , 'f47903df1bde85906248a822b24bcb2f') AND (cmdb2.`sys_class_path` NOT LIKE '/!!/#L%' AND cmdb2.`sys_class_path` NOT LIKE '/!!/!D%' AND cmdb2.`sys_class_path` NOT LIKE '/!!/!(%' AND cmdb2.`sys_class_path` NOT LIKE '/!!/!M%' AND cmdb2.`sys_class_path` NOT LIKE '/!!/#3%' AND cmdb_rel_ci0.`type` != '11ee47317f723100ed1c3b19befa91f9') /* xxxdev021, gs:glide.scheduler.worker.5, 
tx:e5d1a3901bcfc5906248a822b24bcb13 */
2022-04-20 02:56:37 (678) worker.5 worker.5 txid=e5d1a3901bcf WARNING *** WARNING *** service_mapping.batch_manual_service_populator : CI count in populate action exceeded the maximum allowed (1000). You can change the default by adding sys_property sa.service.max_ci_service_population
2022-04-20 02:56:37 (678) worker.5 worker.5 txid=e5d1a3901bcf WARNING *** WARNING *** service_mapping.cmdb_walker : Relation count 1,000 exceeded its defined limit (1,000). There will be no more relations added to the result
2022-04-20 02:56:39 (618) worker.5 worker.5 txid=e5d1a3901bcf identification_engine : logId:[4ad127901bcf] Encountered an insert during delay locking, restarting processing under lock
2022-04-20 02:56:40 (656) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.011] Compacting large row block (file.write: cmdb_rel_ci 10000 rows 160000 saveSize)
2022-04-20 02:56:40 (673) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.011] Compacting large row block (file.write: cmdb_rel_ci 10000 rows 160000 saveSize)
2022-04-20 02:56:40 (694) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.012] Compacting large row block (file.write: cmdb_rel_ci 10000 rows 160000 saveSize)
2022-04-20 02:56:40 (711) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.011] Compacting large row block (file.write: cmdb_rel_ci 10000 rows 160000 saveSize)
2022-04-20 02:56:40 (727) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.009] Compacting large row block (file.write: cmdb_rel_ci 10000 rows 160000 saveSize)
2022-04-20 02:56:40 (751) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.013] Compacting large row block (file.write: cmdb_rel_ci 10000 rows 160000 saveSize)
2022-04-20 02:56:40 (769) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.011] Compacting large row block (file.write: cmdb_rel_ci 10000 rows 160000 saveSize)
2022-04-20 02:56:40 (790) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.012] Compacting large row block (file.write: cmdb_rel_ci 10000 rows 160000 saveSize)
2022-04-20 02:56:40 (811) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.013] Compacting large row block (file.write: cmdb_rel_ci 10000 rows 160000 saveSize)
2022-04-20 02:56:40 (830) worker.5 worker.5 txid=e5d1a3901bcf [0:00:00.012] Compacting large row block (file.write: cmdb_rel_ci 10000 rows 160000 saveSize)
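The two WARNING lines in the log above are the interesting part: they show the service population hitting its CI and relation limits. A sketch of pulling such lines out of a node log, assuming the log format shown here; the pattern and `find_limit_warnings` are my own, not a ServiceNow tool.

```python
import re

# Matches the two limit warnings seen in the log excerpt.
LIMIT_WARNING = re.compile(
    r"txid=(?P<txid>\w+) WARNING \*\*\* WARNING \*\*\* (?P<source>[\w.]+) : "
    r".*(?:exceeded the maximum allowed|exceeded its defined limit) "
    r"\((?P<limit>[\d,]+)\)"
)


def find_limit_warnings(log_text):
    """Return (source, limit) for every limit-exceeded warning in the log."""
    found = []
    for line in log_text.splitlines():
        match = LIMIT_WARNING.search(line)
        if match:
            limit = int(match.group("limit").replace(",", ""))
            found.append((match.group("source"), limit))
    return found


# Lines copied from the log excerpt (first one shortened to the relevant part).
SAMPLE_LOG = """\
2022-04-20 02:56:37 (678) worker.5 worker.5 txid=e5d1a3901bcf WARNING *** WARNING *** service_mapping.batch_manual_service_populator : CI count in populate action exceeded the maximum allowed (1000). You can change the default by adding sys_property sa.service.max_ci_service_population
2022-04-20 02:56:37 (678) worker.5 worker.5 txid=e5d1a3901bcf WARNING *** WARNING *** service_mapping.cmdb_walker : Relation count 1,000 exceeded its defined limit (1,000). There will be no more relations added to the result
2022-04-20 02:56:39 (618) worker.5 worker.5 txid=e5d1a3901bcf identification_engine : logId:[4ad127901bcf] Encountered an insert during delay locking, restarting processing under lock
"""

for source, limit in find_limit_warnings(SAMPLE_LOG):
    print(f"{source}: limit {limit} reached")
```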

Next Steps:
Please review the Service Mapping Recomputation jobs and, as described in KB0824377, lower the job count to 1 for immediate relief.

(https://support.servicenow.com/kb?id=kb_article_view&sysparm_article=KB0824377)

Please update the case if further assistance is required.


===

Thank you for your continued patience with this case.
We have identified that the memory issues are caused by excessively large service maps that pull in large amounts of data during recomputation.

There isn't a lot we can do to reduce the impact of this, but one option is to lower the 'sa.service.max_ci_service_population' property below its out-of-box (OOB) value of 1000.
This property limits the number of CIs in a dynamic service: once the limit is reached, the population logic stops, which could impact the functionality of larger maps.

Setting it to a lower value stops the population phase at an earlier stage, so it will not reach levels that contain thousands of CIs.
Note that this property is global and therefore affects all Dynamic Services.

We have also engaged development for their assistance, but they require SNC access to investigate further.
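Before changing sa.service.max_ci_service_population it is worth checking whether the property already exists on the instance. The log warning talks about "adding" it, so an empty result may simply mean the default of 1000 applies; that reading is my assumption. A sketch of the Table API lookup URL, with a placeholder instance name and a made-up helper name:

```python
from urllib.parse import urlencode


def property_lookup_url(instance, name="sa.service.max_ci_service_population"):
    """Build a Table API URL that looks up one system property by name.

    Assumption: the standard /api/now/table/sys_properties endpoint is
    available and the caller is allowed to read sys_properties.
    """
    query = urlencode(
        {
            "sysparm_query": f"name={name}",
            "sysparm_fields": "name,value,type",
            "sysparm_limit": "1",
        }
    )
    return f"https://{instance}.service-now.com/api/now/table/sys_properties?{query}"


print(property_lookup_url("myinstance"))
```

Because the property is global, it is probably safest to note the current value (or its absence) before lowering it, so the change can be reverted if larger maps stop populating fully.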
