TABLE OF CONTENTS

Approval
Abstract
Acknowledgements
List of Tables
List of Figures
1 Introduction
1.1 Background to the Study
1.2 Problem Statement
1.3 Aim of the Study
1.4 Significance of the Study
1.5 Scope of the Work
1.6 Structure of the Report
2 Literature Review
2.1 Frauds and Fraud detection
2.2 Rule-based fraud detection
2.3 Statistical fraud detection
2.4 Machine Learning Fraud detection
2.5 Deep Learning
3 Research Design & Implementation
3.1 Major concerns in Fraud detection/prevention
3.2 The Proposed Methodology
3.2.1 Big Data platform
3.2.2 Deep learning model
3.2.3 Anomaly detection
3.3 Implementation of the Proposed system
3.3.1 Apache Spark Framework
3.3.2 H2O Workflow Framework
3.3.3 Data sets and design use
3.4 Summary
4 Results & Analysis
4.1 Introduction
4.2 Performance measures
4.3 Experiment I – Results & Analysis
4.3.1 The Activation function
4.3.2 Between Deeper network & Epoch cycles
4.3.3 Multi-layer Feedforward Fraud predictor model
4.4 Experiment II – Results & Analysis
4.5 Summary
5 Summary, Conclusions & Recommendations
5.1 Summary
5.2 Conclusions
5.3 Recommendations
Bibliography

 

 

CHAPTER ONE

 

Introduction
1.1 Background to the Study
Fraud refers to the intentional, illegal exploitation of a system that results in injury to an unsuspecting entity. Financial fraud involves the exploitation of financial systems and results in the loss of financial resources, most prominently money, although other damages such as loss of integrity are possible. Fraud, waste, and abuse in many financial systems are estimated to result in significant annual losses running into billions of US dollars.
Furthermore, the proliferation of the internet has exposed financial systems to diverse fraudsters who use different mechanisms to exploit them. This has led to an explosion in attack patterns, which has rendered the once-effective case-based fraud detection solutions ineffective, because their computational complexity increases with each newly detected fraud. More seriously, there is a higher tendency for first-time frauds to go undetected. Case-based detection methods are also slow: a successful exploit can multiply while its countermeasure is still being integrated into the system. This problem can only be addressed with an online (on-the-fly), adaptive (able to detect new frauds) solution.
Also of concern to financial fraud detection solutions is prediction strength, which indicates a fraud detector’s ability to correctly identify both known and novel frauds. This is usually a direct function of how many fraud samples are available to model a solution. The emergence of Big Data and its analytics has provided financial fraud detection experts with vast amounts of data that can enhance detection models. Solutions modeled on Big Data are therefore more comprehensive.
A complete fraud detection model must therefore have the following properties:
1. Adaptive: This refers to the following abilities:
   - Ability to detect fraudulent activities within a short period of time. This is also referred to as its alertness.
   - Ability to detect first-time fraudulent activities with high accuracy.
2. Predictive: This refers to the following ability:
   - Ability to detect all new instances of fraudulent activities that have happened in the past. This is very difficult to achieve if there is no data with a considerable description of previous transactions.
Over the years, many solutions to financial fraud have been proposed. Most of the models proposed to address fraud detection model property 1 have been statistical models that try to detect outliers in the data set (see [27], [21] and [8]). This follows from the assumption that fraudulent transactions behave abnormally compared with legitimate transactions. An abnormal pattern of behavior (i.e. an outlier) is flagged “suspicious.” More recently, Machine Learning methods have been used to develop more effective models (see [4], [11], [7] and [9]).
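As a rough illustration of the outlier-based idea described above, the following sketch flags transaction amounts that deviate strongly from the typical value. It is not the method of the cited works: the modified z-score rule (median and median absolute deviation, with the commonly used 0.6745 scaling and 3.5 cut-off), the function name `flag_outliers`, and the sample amounts are all assumptions chosen for illustration.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds threshold.

    Hypothetical helper: uses the median and the median absolute deviation
    (MAD) so that a single extreme value does not mask itself.
    """
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        # All values are (nearly) identical; nothing can be called abnormal.
        return []
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Made-up transaction amounts: seven ordinary purchases and one extreme one.
amounts = [25.0, 30.5, 27.0, 31.0, 26.5, 29.0, 28.0, 950.0]
suspicious = flag_outliers(amounts)
print(suspicious)
```

A real detector would score many features per transaction rather than a single amount; the sketch only shows how "abnormal relative to legitimate behavior" can be turned into a concrete rule.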
The emergence of Big Data analytic tools provided a means to address fraud detection model property 2. Such technology allows the integration of data from various sources to model and predict financial fraud. For example, a fraudster’s location data, social-media activity and credit card information can be reconciled to trace a fraudulent transaction to that person. However, Big Data analytics brings with it challenges that limit the application of the techniques used to address fraud detection model property 1. Such challenges are enumerated in [24]; some of them are:
1. High-dimensionality and data reduction,
2. Data quality and validation,
3. Data cleansing,
4. Feature engineering,
5. Data representations and distributed data sources,
6. Data sampling
Much research has gone into addressing some of the above issues so that existing models that work for “small” data can scale up to work with Big Data. For example, [16] and [26] proposed improvements that address high dimensionality; others are [13] and [29]. However, these attempts might not scale well, as the underlying models were not originally designed to handle Big Data complexity. Deep learning is one technique that has the capability to handle such complex abstractions. It is good at analyzing and organizing large amounts of unsupervised data. Most raw data in Big Data analytics is unlabeled and uncategorised, which makes it well suited to Deep learning algorithms.
1.2 Problem Statement
Deep Learning algorithms model high-level abstractions through a hierarchical architecture. Higher levels learn more complex models from previous levels. This allows such algorithms to address, on the fly, some of the challenges that arise from the “volume” and “variety” of Big Data. Deep Learning algorithms are also ideal for Big Data analytics. Can a deep-learning model be built on a Big Data platform in order to detect and predict financial frauds? This research looks into that question and assesses how effective such solutions can be.
Figure 1.1: A typical deep learning model with 3 layers. Each layer applies a non-linear transformation to its input to produce its output. The figure also shows the two stages of Deep learning and a path from training data to prediction. Adapted from [5].
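The layered structure described in the caption can be sketched as a forward pass in which each layer applies a non-linear transformation to the output of the previous one. This is a minimal sketch under stated assumptions, not the model built in this work: the layer sizes, the tanh and sigmoid activations, the random untrained weights, and the names `dense`, `init_layer` and `forward` are all hypothetical.

```python
import math
import random


def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One layer: a weighted sum per unit followed by a non-linear activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(features, layers):
    """Pass features through every layer; the last layer yields one score."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = dense(activations, weights, biases, math.tanh)
    weights, biases = layers[-1]
    return dense(activations, weights, biases, sigmoid)[0]


rng = random.Random(0)
# Three layers: 4 input features -> 5 units -> 3 units -> 1 output score.
layers = [init_layer(4, 5, rng), init_layer(5, 3, rng), init_layer(3, 1, rng)]
score = forward([0.2, -1.3, 0.7, 0.05], layers)
print(score)
```

Because the weights are untrained, the printed score carries no meaning; training would adjust the weights so that the output approaches 1 for fraudulent inputs and 0 for legitimate ones.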
1.3 Aim of the Study
The aim of this work is to design a base Deep Learning model that will provide predictive and adaptive fraud detection on a Big Data platform. The model is positioned to naturally reconcile some of the well-known Big Data challenges. Thus, the work will also outline the challenges that were simplified by the model. Finally, it will demonstrate the model using existing tools and technologies.
1.4 Significance of the Study
The work proposes a novel solution to financial fraud detection. The proposal will provide evidence to support the deployment of Deep Learning methods in fraud detection systems, an approach posited to evolve with the evolution of Big Data.
The results of the work will provide further insights into why Deep Learning methods work well with Big Data analytics problems and address their inherent challenges. This can be useful information to the Big Data research and application communities.
1.5 Scope of the Work
The scope of the work includes the following:
1. Design of a Deep Learning model for financial fraud detection on a Big Data platform
2. Implementation of the proposal in item 1 above using existing tools and technologies
3. An outline of the Big Data analytics challenges solved by Deep learning
1.6 Structure of the Report
The first chapter provides an introduction to the thesis work. The second chapter reviews relevant literature, detailing the underlying concepts of the work as well as recent contributions. The third chapter presents the design, outlines the Big Data analytics challenges solved by deep learning, and describes the implementation. Chapter four describes the results obtained and their analysis. Finally, the fifth chapter presents conclusions and recommendations.
