This project presents an approach to detecting car insurance fraud that uses an optimized genetic algorithm for data clustering. Traditional clustering methods such as K-means often struggle with two problems: choosing the optimal number of clusters and allocating data points to them. Our approach uses a genetic algorithm, a search method inspired by natural evolution, to improve the accuracy and efficiency of clustering in the challenging setting of insurance fraud detection.
- Optimized Clustering: The project uses a genetic algorithm to search for the best clustering structure (the number of clusters and their centers), which improved fraud detection accuracy over traditional methods on the datasets tested.
- Enhanced Accuracy: Across the three insurance datasets tested, F1 score improved by 1% to 12% and accuracy by 1% to 10%.
- Scalable Solution: The approach is designed to handle large datasets and complex clustering problems, making it suitable for real-world insurance fraud detection scenarios.
- Genetic Algorithm for Clustering
  - Chromosome Formation: Each chromosome represents a candidate solution that encodes the number of clusters and their centers. The chromosome length is the number of clusters multiplied by the number of features in the dataset.
  - Crossover, Mutation, and Survival: Several new operators are used for the crossover, mutation, and survival steps to improve clustering performance.
  - Evaluation Criterion: As in K-means, a clustering criterion is chosen as the objective to optimize. Because the genetic algorithm explores a broader solution space, it can reach more accurate clusterings.
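The chromosome encoding and evolutionary loop described above can be illustrated with a minimal sketch. It is not the project's implementation. It assumes a fixed number of clusters `k`, the within-cluster sum of squared errors as the evaluation criterion, one-point crossover on center boundaries, Gaussian mutation, and elitist survival. All function names and parameter values here are hypothetical and chosen only for illustration.

```python
import random

def decode(chromosome, n_features):
    """Split a flat chromosome into a list of cluster centers."""
    if len(chromosome) % n_features != 0:
        raise ValueError("chromosome length must be a multiple of n_features")
    return [chromosome[i:i + n_features]
            for i in range(0, len(chromosome), n_features)]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]

def sse(points, chromosome, n_features):
    """Assumed criterion: within-cluster sum of squared errors (lower is better)."""
    centers = decode(chromosome, n_features)
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def random_chromosome(points, k, rng):
    """Seed a chromosome with k distinct data points as centers."""
    return [v for p in rng.sample(points, k) for v in p]

def crossover(a, b, n_features, rng):
    """One-point crossover, cutting on a center boundary (same k assumed)."""
    k = len(a) // n_features
    cut = rng.randrange(1, k) * n_features if k > 1 else 0
    return a[:cut] + b[cut:]

def mutate(chromosome, rate, scale, rng):
    """Add Gaussian noise to each gene with probability `rate`."""
    return [g + rng.gauss(0.0, scale) if rng.random() < rate else g
            for g in chromosome]

def evolve(points, k, generations=50, pop_size=20, seed=0):
    """Run the sketch GA and return the best chromosome found."""
    rng = random.Random(seed)
    n_features = len(points[0])

    def fitness(chromosome):
        return sse(points, chromosome, n_features)

    pop = [random_chromosome(points, k, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[:pop_size // 2]  # elitist survival: keep the best half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = crossover(a, b, n_features, rng)
            children.append(mutate(child, rate=0.1, scale=0.1, rng=rng))
        pop = survivors + children
    return min(pop, key=fitness)
```

With two features and `k = 2`, `evolve(points, k=2)` returns a chromosome of length 4, and `decode(best, 2)` recovers the two centers. Because the best half of the population always survives, the best criterion value never gets worse from one generation to the next.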
The method was tested on three insurance datasets specifically selected for fraud detection. The results were promising:
- Dataset 1: 12% improvement in F1 score and a 10% increase in accuracy.
- Dataset 2: 1% improvement in F1 score and a 1% increase in accuracy.
- Dataset 3: 1% improvement in F1 score and a 2% increase in accuracy.

These results demonstrate the effectiveness of the genetic algorithm in outperforming traditional clustering methods like K-means.
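For readers who want to reproduce this kind of comparison, the following sketch shows one way F1 score and accuracy could be computed from cluster assignments. It assumes binary labels (1 for fraud, 0 for legitimate) and maps each cluster to its majority label before scoring. That mapping is an assumption made for illustration, not necessarily the evaluation protocol used in this project, and the function names are hypothetical.

```python
from collections import Counter

def majority_labels(cluster_ids, y_true):
    """Predict, for every point, the majority true label of its cluster."""
    votes = {}
    for c, y in zip(cluster_ids, y_true):
        votes.setdefault(c, Counter())[y] += 1
    return [votes[c].most_common(1)[0][0] for c in cluster_ids]

def f1_and_accuracy(y_true, y_pred):
    """F1 score for the positive (fraud) class and overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return f1, accuracy

# Toy example: six claims split into two clusters.
cluster_ids = [0, 0, 0, 1, 1, 1]
y_true = [1, 1, 0, 0, 0, 1]
y_pred = majority_labels(cluster_ids, y_true)
f1, accuracy = f1_and_accuracy(y_true, y_pred)
```

Running the same scoring on the K-means assignments and on the genetic algorithm's assignments gives directly comparable F1 and accuracy figures.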