Package 'RWgraph'

Title: Random Walks on Graphs Representing a Transactional Network
Description: Random walk functions to extract new variables based on clients' transactional behaviour. For more details, see Eddin et al. (2021) <arXiv:2112.07508v3> and Oliveira et al. (2021) <arXiv:2102.05373v2>.
Authors: Mafalda Sá Ferreira [aut, cre], Regina Bispo [ctb], FCT, I.P. [fnd] (under the scope of the projects UIDB/00297/2020 and UIDP/00297/2020 (NovaMath))
Maintainer: Mafalda Sá Ferreira <[email protected]>
License: GPL (>= 2)
Version: 1.0.0
Built: 2024-11-15 03:51:05 UTC
Source: https://github.com/cran/RWgraph

Clients' information for a small example

Description

A dataset containing information about 20 clients of a certain bank.

Usage

data(clients_small_example)

Format

A data frame with 20 rows and 9 variables

Details

  • age, numeric. Age of the client in years.

  • antiquity_age, numeric. Age of the account in years.

  • gender, boolean. Gender of the client.

  • occupation, numeric. Occupation of the client.

  • nationality, character. Country of birth of the client (two-letter ISO 3166-1 alpha-2 code).

  • residence, character. Country of residence of the client (two-letter ISO 3166-1 alpha-2 code).

  • pep_flag, boolean. Indicator of whether the client is involved in political activities (1) or not (0).

  • sar_flag, boolean. Indicator of whether the client was involved in a reported transaction (1) or not (0).

  • customer_id, character. ID of the client's account.
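
The functions in this package expect a client table with a "customer_id" column and a 0/1 "sar_flag" column. The following is a minimal sketch of how such a table could be checked before use. The table 'toy_clients' and the helper 'check_clients' are hypothetical and not part of RWgraph; the uniqueness check on "customer_id" is an assumption, since the documentation does not state it.

```r
# Hypothetical client table mirroring the nine documented columns.
toy_clients <- data.frame(
  age           = c(34, 51, 27),
  antiquity_age = c(5, 12, 1),
  gender        = c(TRUE, FALSE, TRUE),
  occupation    = c(3, 7, 1),
  nationality   = c("PT", "ES", "PT"),
  residence     = c("PT", "PT", "FR"),
  pep_flag      = c(0, 1, 0),
  sar_flag      = c(0, 0, 1),
  customer_id   = c("C001", "C002", "C003"),
  stringsAsFactors = FALSE
)

# check_clients() is an illustrative helper, not an RWgraph function.
check_clients <- function(data) {
  required <- c("customer_id", "sar_flag")
  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    stop("missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # The flag must be coded as 0/1 (logical values also pass this test).
  if (!all(data$sar_flag %in% c(0, 1))) {
    stop("sar_flag must be coded as 0/1 (or FALSE/TRUE)")
  }
  # Assumption: each client appears once in the table.
  if (anyDuplicated(data$customer_id) > 0) {
    stop("customer_id must be unique")
  }
  invisible(TRUE)
}

check_clients(toy_clients)
```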


Random walk metrics for each client

Description

Computes random walk metrics for every client in the data frame by applying the function 'mean_rw_client' to each client.

Usage

info_client(g, data)

Arguments

g

The input graph: a transactional graph whose edges carry the transaction amount (in monetary units) as an attribute. The vertices must be the client IDs.

data

Data frame with client information. It should include a column of client IDs named "customer_id" and an alert label named "sar_flag", which must be a boolean variable.

Value

A data frame with the client IDs and the computed metrics (minimum, mean and maximum of both the number of steps and the total transacted amount) for the random walks starting at each client.

References

Eddin, A. N., Bono, J., Aparício, D., Polido, D., Ascensao, J. T., Bizarro, P., and Ribeiro, P. (2021). Anti-money laundering alert optimization using machine learning with graphs. arXiv preprint arXiv:2112.07508.

Examples

g <- igraph::graph_from_data_frame(d = transactions_small_example[, 1:3], directed = TRUE)
info_client(g, data = clients_small_example)
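
The pattern described above (one summary vector per client, stacked into a data frame) can be sketched in base R as follows. This is not the package's implementation: 'toy_metrics' is a deterministic stand-in for 'mean_rw_client', and 'collect_metrics' and the metric names are hypothetical.

```r
# Stand-in for a per-client summary; returns six made-up metrics
# scaled by the length of the ID so that rows differ.
toy_metrics <- function(id) {
  c(min_steps = 1, mean_steps = 2, max_steps = 3,
    min_amount = 10, mean_amount = 20, max_amount = 30) * nchar(id)
}

# Apply a per-client function to every ID and bind the results row-wise.
collect_metrics <- function(ids, metric_fun) {
  rows <- lapply(ids, metric_fun)
  data.frame(customer_id = ids, do.call(rbind, rows),
             stringsAsFactors = FALSE)
}

result <- collect_metrics(c("A", "BB", "CCC"), toy_metrics)
print(result)
```

The result has one row per client, with the ID in the first column and one column per metric.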

Metrics for multiple random walks

Description

Generates 50 random walks from a given vertex using the function 'rw_client' and computes summary metrics over them.

Usage

mean_rw_client(v, g, data)

Arguments

v

The initial vertex of the input graph.

g

The input graph. It should be a transactional graph with the transaction amount as an attribute of each edge. The vertices must be the client IDs.

data

Data frame with client information. It should include a column of client IDs named "customer_id" and an alert label named "sar_flag", which must be a boolean variable.

Value

A vector with the minimum, mean and maximum of both the number of steps and the total transacted amount over the generated random walks.

References

Eddin, A. N., Bono, J., Aparício, D., Polido, D., Ascensao, J. T., Bizarro, P., and Ribeiro, P. (2021). Anti-money laundering alert optimization using machine learning with graphs. arXiv preprint arXiv:2112.07508.

Examples

g <- igraph::graph_from_data_frame(d = transactions_small_example[, 1:3], directed = TRUE)
v <- transactions_small_example[1, 1]
mean_rw_client(v, g, data = clients_small_example)
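
As a rough illustration of the aggregation step, the sketch below repeats a single-walk function 50 times and reduces the results to minimum, mean and maximum. The function 'one_walk' is a hypothetical stand-in that draws random values instead of walking a graph, and the output names are assumptions; the order and naming used by 'mean_rw_client' may differ.

```r
# Hypothetical stand-in for a single walk: returns c(steps, amount).
one_walk <- function() {
  steps <- sample(1:5, 1)
  c(steps = steps, amount = sum(runif(steps, min = 10, max = 100)))
}

# Repeat a walk function n_walks times and summarise both quantities.
summarise_walks <- function(walk_fun, n_walks = 50) {
  # replicate() returns a 2 x n_walks matrix; transpose to one row per walk.
  walks <- t(replicate(n_walks, walk_fun()))
  c(
    min_steps   = min(walks[, "steps"]),
    mean_steps  = mean(walks[, "steps"]),
    max_steps   = max(walks[, "steps"]),
    min_amount  = min(walks[, "amount"]),
    mean_amount = mean(walks[, "amount"]),
    max_amount  = max(walks[, "amount"])
  )
}

set.seed(42)
walk_summary <- summarise_walks(one_walk)
print(walk_summary)
```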

Clients' information

Description

A dataset containing information about 3973 clients of a certain bank.

Usage

data(profiles)

Format

A data frame with 3973 rows and 9 variables

Details

  • age, numeric. Age of the client in years.

  • antiquity_age, numeric. Age of the account in years.

  • gender, boolean. Gender of the client.

  • occupation, numeric. Occupation of the client.

  • nationality, character. Country of birth of the client (two-letter ISO 3166-1 alpha-2 code).

  • residence, character. Country of residence of the client (two-letter ISO 3166-1 alpha-2 code).

  • pep_flag, boolean. Indicator of whether the client is involved in political activities (1) or not (0).

  • sar_flag, boolean. Indicator of whether the client was involved in a reported transaction (1) or not (0).

  • customer_id, character. ID of the client's account.


Random walk simulation

Description

Simulates a single random walk starting at a given client and records its length and the total amount transacted along it.

Usage

rw_client(v, g, data)

Arguments

v

The initial vertex of the input graph.

g

The input graph. It should be a transactional graph with the transaction amount as an attribute of each edge. The vertices must be the client IDs.

data

Data frame with client information. It should include a column of client IDs named "customer_id" and an alert label named "sar_flag", which must be a boolean variable.

Value

A vector with the number of steps taken in the random walk and the total amount transacted along it.

References

Eddin, A. N., Bono, J., Aparício, D., Polido, D., Ascensao, J. T., Bizarro, P., and Ribeiro, P. (2021). Anti-money laundering alert optimization using machine learning with graphs. arXiv preprint arXiv:2112.07508.

Examples

g <- igraph::graph_from_data_frame(d = transactions_small_example[, 1:3], directed = TRUE)
v <- transactions_small_example[1, 1]
rw_client(v, g, data = clients_small_example)
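
The documentation does not spell out how a walk proceeds or when it stops, so the following is only a sketch under stated assumptions, not the algorithm used by 'rw_client'. It assumes that the walk follows a uniformly chosen outgoing edge at each step and stops when it reaches a client with sar_flag equal to 1, when it reaches a vertex without outgoing edges, or after a hypothetical cap 'max_steps'. The toy data and the function 'sketch_walk' are illustrative names.

```r
library(igraph)

# Hypothetical toy inputs (not the package data).
toy_tx <- data.frame(
  nameOrig = c("A", "A", "B", "C"),
  nameDest = c("B", "C", "D", "D"),
  amount   = c(100, 50, 30, 20),
  stringsAsFactors = FALSE
)
toy_clients <- data.frame(
  customer_id = c("A", "B", "C", "D"),
  sar_flag    = c(0, 0, 0, 1),
  stringsAsFactors = FALSE
)
# The third column becomes the edge attribute "amount".
toy_g <- graph_from_data_frame(toy_tx, directed = TRUE)

# Assumed rule: move along a random outgoing edge until a flagged client,
# a dead end, or max_steps (a hypothetical cap) is reached.
sketch_walk <- function(v, g, data, max_steps = 100) {
  flagged <- data$customer_id[data$sar_flag == 1]
  steps <- 0
  total <- 0
  current <- v
  while (steps < max_steps && !(current %in% flagged)) {
    out_edges <- incident(g, current, mode = "out")
    if (length(out_edges) == 0) break          # dead end
    e <- out_edges[sample.int(length(out_edges), 1)]
    total <- total + e$amount                  # accumulate edge amount
    current <- head_of(g, e)$name              # move to the edge's target
    steps <- steps + 1
  }
  c(steps = steps, amount = total)
}

set.seed(1)
walk_result <- sketch_walk("A", toy_g, toy_clients)
print(walk_result)
```

In this toy graph every path from "A" reaches the flagged client "D" in two steps, so only the accumulated amount varies between runs.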

Transactions' information

Description

A dataset containing information about 15379 transactions of a certain bank.

Usage

data(transactions)

Format

A data frame with 15379 rows and 5 variables

Details

  • nameOrig, character. ID of the client that initiated the transaction.

  • nameDest, character. ID of the client that received the transaction.

  • amount, numeric. Amount of money involved in the transaction in euros (€).

  • isFraud, boolean. Indicator of whether the transaction was reported (1) or not (0).

  • transactionDate, character. Date of the transaction.


Transactions' information for a small example

Description

A dataset containing information about 10 transactions of a certain bank.

Usage

data(transactions_small_example)

Format

A data frame with 10 rows and 5 variables

Details

  • nameOrig, character. ID of the client that initiated the transaction.

  • nameDest, character. ID of the client that received the transaction.

  • amount, numeric. Amount of money involved in the transaction in euros (€).

  • isFraud, boolean. Indicator of whether the transaction was reported (1) or not (0).

  • transactionDate, character. Date of the transaction.
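
A table with the columns listed above can be turned into the directed graph that the random walk functions take as input. The sketch below uses a hypothetical three-row table, 'toy_tx', with the same layout; igraph's 'graph_from_data_frame' reads the first two columns as the edge endpoints and stores the remaining selected columns as edge attributes. The date format shown is an assumption.

```r
library(igraph)

# Hypothetical transactions table with the five documented columns.
toy_tx <- data.frame(
  nameOrig        = c("C001", "C001", "C002"),
  nameDest        = c("C002", "C003", "C003"),
  amount          = c(250, 80, 1200),
  isFraud         = c(0, 0, 1),
  transactionDate = c("2021-01-05", "2021-01-07", "2021-02-11"),
  stringsAsFactors = FALSE
)

# Keep origin, destination and amount; amount becomes an edge attribute.
toy_g <- graph_from_data_frame(d = toy_tx[, 1:3], directed = TRUE)

vcount(toy_g)        # number of distinct clients in the table
ecount(toy_g)        # one edge per transaction
E(toy_g)$amount      # amounts stored on the edges
V(toy_g)$name        # vertex names are the client IDs
```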