Maintaining consistency of records when scraping public sources

July 25, 2025

Pulling data from public company house registries, or from third parties that supply the same information, is a common way to load known data faster. It gives the team a quick means of scraping data to build the first layer of information on counterparties. This can, however, have severe repercussions if the scraped data is not cleansed and centralised.

The CSR module in KYCP allows users to search for companies directly from within the system and then scrape all the information into KYCP at the click of a button. This can include fields (data points), documents, and the related parties and their involvement within the structure.
 
Until now, when importing and scraping such entities, KYCP would always create the entity as a new party within the structure. At times this meant a party was added even though it already existed elsewhere in KYCP, for example as a party in a different entity format and involved in a completely separate application. This feature now checks in real time for data that might already exist in your instance of KYCP, reducing the possibility of creating duplicate data.
 
When importing one or more entities through the CSR module, KYCP can now identify potential matches. This is done using "Identification" fields that are set against the entities: KYCP checks whether a mapped field configured as an "Identification" field (for example, Entity Name) has a potential match across existing entities in the solution. If a potential match is found, the user is prompted to go through the list of existing entities and select an action: either use the existing entity in KYCP or update an existing entity in KYCP with the data received from CSR.
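
To make the matching idea concrete, here is a minimal sketch in Python of how a check on "Identification" fields could work in principle. It is not KYCP's implementation: the `Entity` class, the function names, the action labels and the case-insensitive comparison rule are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    """Hypothetical stand-in for an entity record already held in the system."""
    entity_id: str
    fields: Dict[str, str] = field(default_factory=dict)


def normalise(value: str) -> str:
    # Assumption: compare values ignoring case and extra whitespace.
    return " ".join(value.lower().split())


def find_potential_matches(
    incoming: Dict[str, str],
    existing: List[Entity],
    identification_fields: List[str],
) -> List[Entity]:
    """Return existing entities that share a value on any identification field."""
    matches = []
    for entity in existing:
        for name in identification_fields:
            new_value = incoming.get(name)
            old_value = entity.fields.get(name)
            if new_value and old_value and normalise(new_value) == normalise(old_value):
                matches.append(entity)
                break  # one matching identification field is enough
    return matches


def resolve(incoming: Dict[str, str], match: Entity, action: str) -> Entity:
    """Apply the user's choice: keep the existing record as-is, or update it."""
    if action == "use_existing":
        return match
    if action == "update_existing":
        match.fields.update(incoming)  # overwrite with the newly imported data
        return match
    raise ValueError(f"Unknown action: {action}")
```

In this sketch, an imported record whose Entity Name equals that of an existing entity (ignoring case and spacing) is returned as a potential match, and the user's selected action decides whether the existing record is reused unchanged or refreshed with the imported values.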
 
 
This feature was developed to ensure better data management and consistency of newly sourced data across the application. It alerts users to potential duplicates and gives them the ability to link to the central record.
 
For more information, contact us directly at sales@kycportal.com or schedule your live demo with us today.
 
If you are an existing client and you would like more information about this feature, please contact our CRM Team.
 
Feature Specifications

Targeted For: Compliance Teams
Status: LIVE
Keywords: company house data, scrape public sources, quicker onboarding
Direct Benefits: Maintaining consistency of records when scraping public sources