Segmentation is a fundamental component of the anti-money laundering (AML) process, and is concerned with the groupings of entities based on similar business attributes and transactional behavior. Segmentation, when done well, enables AML typologies to focus on unusual behavior for specific groups of entities, using thresholds that allow precise detection of bad actors while minimizing the number of false positive alerts.
In a large, geographically distributed bank that provides correspondent banking services, the non-bank customers involved in transactions appear as third-party correspondents, also known as pseudo-customers. Unlike a bank’s own customers, for whom KYC (know your customer) information is available, very little is known about these pseudo-customers. Because no definitive identification of the party exists, it is particularly difficult to monitor and categorize such customers.
Most institutions employ a hierarchical approach to segmentation. This approach requires business attributes for the top-down analysis and transaction summaries for the bottom-up analysis.
This doesn’t work for pseudo-customers. Due to the absence of KYC information, the business attributes for non-bank customers are very limited, meaning the top-down analysis is not practical. Moreover, the bottom-up analysis is only focused on the rough summary of transactional behaviors, such as total transaction volume and/or dollar amount. The result is large, uneven segments that lack defining characteristics.
This is particularly problematic from a model acceptance perspective. The inadequacy of explanatory features precludes the business user (or internal model review board) from approving the model.
Even if the segmentation could be implemented, it would ultimately be static: the segmentation model cannot adapt to the inevitable changes we see in behavior.
An inaccurate segmentation model with large, uneven groups leads to high alert volumes with a high level of noise (false positives), which translates to higher investigator FTE requirements.
To capture the non-bank customers’ transactional behavior, an exhaustive list of features needs to be created to uncover hidden behavior and reflect the AML risks. Some sophisticated institutions have turned to machine learning algorithms such as K-Means clustering to solve this. K-Means, however, requires the number of clusters (the K) to be chosen in advance and makes strong implicit assumptions about the shape of the underlying data. It also tends to perform poorly on high-dimensional problems, which is exactly what defines the pseudo-customer problem.
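To make the point about feature lists concrete, here is a minimal sketch of why rough summaries hide behavior. It is not Ayasdi's feature set: the record fields (`entity`, `amount`, `type`, `currency`, `counterparty`), the sample data and both function names are hypothetical, chosen only for illustration. Two pseudo-customers with identical transaction counts and dollar totals become distinguishable once transaction type, currency and counterparty breadth are added.

```python
from collections import defaultdict

# Hypothetical transaction records; every field name and value is illustrative.
TRANSACTIONS = [
    {"entity": "P1", "amount": 250.0, "type": "wire", "currency": "USD", "counterparty": "A"},
    {"entity": "P1", "amount": 250.0, "type": "wire", "currency": "USD", "counterparty": "A"},
    {"entity": "P1", "amount": 250.0, "type": "wire", "currency": "USD", "counterparty": "A"},
    {"entity": "P1", "amount": 250.0, "type": "wire", "currency": "USD", "counterparty": "A"},
    {"entity": "P2", "amount": 100.0, "type": "wire", "currency": "USD", "counterparty": "B"},
    {"entity": "P2", "amount": 400.0, "type": "cash", "currency": "EUR", "counterparty": "C"},
    {"entity": "P2", "amount": 300.0, "type": "wire", "currency": "USD", "counterparty": "D"},
    {"entity": "P2", "amount": 200.0, "type": "cash", "currency": "EUR", "counterparty": "E"},
]


def coarse_features(transactions):
    """Rough summary only: transaction count and total amount per entity."""
    feats = defaultdict(lambda: {"txn_count": 0, "total_amount": 0.0})
    for t in transactions:
        f = feats[t["entity"]]
        f["txn_count"] += 1
        f["total_amount"] += t["amount"]
    return dict(feats)


def expanded_features(transactions):
    """Adds per-type counts, per-currency counts and counterparty breadth."""
    feats = defaultdict(lambda: defaultdict(float))
    counterparties = defaultdict(set)
    for t in transactions:
        f = feats[t["entity"]]
        f["txn_count"] += 1
        f["total_amount"] += t["amount"]
        f["count_type_" + t["type"]] += 1
        f["count_ccy_" + t["currency"]] += 1
        counterparties[t["entity"]].add(t["counterparty"])
    for entity, seen in counterparties.items():
        feats[entity]["distinct_counterparties"] = len(seen)
    return {entity: dict(f) for entity, f in feats.items()}


coarse = coarse_features(TRANSACTIONS)
expanded = expanded_features(TRANSACTIONS)

# The rough summary cannot tell the two entities apart; the expanded one can.
print(coarse["P1"] == coarse["P2"])
print(expanded["P1"] == expanded["P2"])
```

In this toy data the coarse view treats P1 and P2 as the same customer, while the expanded view separates a single-counterparty wire sender from an entity spreading cash and wires across four counterparties and two currencies.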
Ayasdi attacks the problem of intelligent segmentation with a solution built from the following components:
- The list of features is expanded vertically and horizontally to reflect as many AML risks as possible. Vertically, by expanding past transaction volume and dollar amount to include transaction type, currency and trend data over various intervals. Horizontally, by looking at counterparties’ information in great detail. In one such case, Ayasdi was able to increase the number of considered features by 12x, driving performance improvements of over 50% (reduction in false positives).
- The champion segmentation model is more granular than the old one and yields more evenly distributed segments, which benefits the downstream process.
The key differentiators of segments (the detailed explanations) are available immediately after the segmentation is refined and finalized.
- Once the segmentation reveals the ground truth of customer behavior, a multi-class classification process can be set up to build a generalized decision tree model for the dynamic segmentation engine. This dynamic segmentation engine automatically reclassifies non-bank customers on a periodic basis based on changes in their behavior.
The decision tree is fine-tuned to reach the best performance as evaluated by various metrics, such as the P-R curve, AUC and F1 score.
- Alert Effectiveness Improvement
- By segmenting more effectively, the bank can redefine the peer groups of non-bank customers, localizing suspicious activities to a specific area of the distribution. The alert generation process benefits from this solution because the threshold can be set at a high level without missing any suspicious activities.
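The effect of peer groups on alert thresholds can be illustrated with a small sketch. Everything here is an assumption made for illustration: the segment labels, the sample amounts, the function names and the "three times the segment median" rule are hypothetical and are not the thresholding logic of any particular AML system.

```python
from statistics import median

# Hypothetical (entity, segment, monthly amount) observations; all illustrative.
OBSERVATIONS = [
    ("lv_1", "low_volume", 900.0),
    ("lv_2", "low_volume", 1000.0),
    ("lv_3", "low_volume", 1100.0),
    ("lv_4", "low_volume", 1000.0),
    ("lv_5", "low_volume", 5000.0),   # unusual relative to its low-volume peers
    ("hv_1", "high_volume", 40000.0),
    ("hv_2", "high_volume", 50000.0),
    ("hv_3", "high_volume", 60000.0),
    ("hv_4", "high_volume", 55000.0),
    ("hv_5", "high_volume", 45000.0),
]


def segment_thresholds(observations, multiple=3.0):
    """Illustrative rule: threshold = multiple x median amount of the segment."""
    amounts_by_segment = {}
    for _, segment, amount in observations:
        amounts_by_segment.setdefault(segment, []).append(amount)
    return {seg: multiple * median(vals) for seg, vals in amounts_by_segment.items()}


def alerts_by_segment(observations, multiple=3.0):
    """Flag entities whose amount exceeds the threshold of their own peer group."""
    thresholds = segment_thresholds(observations, multiple)
    return [entity for entity, seg, amount in observations if amount > thresholds[seg]]


def alerts_global(observations, threshold):
    """Flag entities against one threshold shared by the whole population."""
    return [entity for entity, _, amount in observations if amount > threshold]


print(alerts_by_segment(OBSERVATIONS))
print(alerts_global(OBSERVATIONS, 4000.0))
print(alerts_global(OBSERVATIONS, 67500.0))
```

With these toy numbers, a single global threshold either flags every high-volume entity (when set low enough to catch `lv_5`) or flags nothing at all (when set at three times the population median, 67,500). Thresholds set relative to each peer group flag only the entity that is unusual for its own segment.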
Most practitioners who read this will appreciate that what Ayasdi has done is to inject intelligence into the most critical spot in the AML process – resulting in exceptional performance improvements and complete explainability for regulators without having to make significant changes to existing systems.
In coming posts I will go into additional detail on the importance of the feature expansion, the group comparison capabilities and how the dynamic segmentation works (using Envision-built applications).