Title: | Random Walks on Graphs Representing a Transactional Network |
---|---|
Description: | Random walk functions to extract new variables based on clients' transactional behaviour. For more details, see Eddin et al. (2021) <arXiv:2112.07508v3> and Oliveira et al. (2021) <arXiv:2102.05373v2>. |
Authors: | Mafalda Sá Ferreira [aut, cre], Regina Bispo [ctb], FCT, I.P. [fnd] (under the scope of the projects UIDB/00297/2020 and UIDP/00297/2020 (NovaMath)) |
Maintainer: | Mafalda Sá Ferreira <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.0 |
Built: | 2024-11-15 03:51:05 UTC |
Source: | https://github.com/cran/RWgraph |
A dataset containing information about 20 clients of a certain bank.
data(clients_small_example)
A data frame with 20 rows and 9 variables
age, numeric. Age of the client in years.
antiquity_age, numeric. Age of the account in years.
gender, boolean. Gender of the client.
occupation, numeric. Occupation of the client.
nationality, character. Country of birth of the client (two-letter ISO country code).
residence, character. Country of residence of the client (two-letter ISO country code).
pep_flag, boolean. Indicates whether the client is involved in political activities (1) or not (0).
sar_flag, boolean. Indicates whether the client was involved in a reported transaction (1) or not (0).
customer_id, character. ID of the client's account.
Computes the metrics of the generated random walks for every client in the dataframe using the function 'mean_rw_client'.
info_client(g, data)
g |
The input graph: a transactional graph whose edges carry the transaction amount (in monetary units) as an attribute. The vertices must be the client IDs. |
data |
Data frame with information about the clients. It should include a column named "customer_id" with the client IDs and a column named "sar_flag" with the alert label, which must be a boolean variable. |
A data frame with the client IDs and the computed metrics (minimum, mean and maximum of both the number of steps and the total transacted amount) for the random walks starting at each client.
Eddin, A. N., Bono, J., Aparício, D., Polido, D., Ascensao, J. T., Bizarro, P., and Ribeiro, P. (2021). Anti-money laundering alert optimization using machine learning with graphs. arXiv preprint arXiv:2112.07508.
g <- igraph::graph_from_data_frame(d = transactions_small_example[, 1:3], directed = TRUE)
info_client(g, data = clients_small_example)
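The description above says that a per-client computation is repeated for every client and collected into a data frame keyed by client ID. A minimal sketch of that pattern follows. It does not reproduce the package's code: `toy_metric()` and `per_client()` are hypothetical helpers, the toy data are made up, and the metric (number and total amount of outgoing transactions) only stands in for the random walk metrics.

```r
# Toy inputs with the documented column names (all values are made up).
tx <- data.frame(
  nameOrig = c("C1", "C2"),
  nameDest = c("C2", "C3"),
  amount = c(10, 20),
  stringsAsFactors = FALSE
)
clients <- data.frame(
  customer_id = c("C1", "C2", "C3"),
  sar_flag = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
g <- igraph::graph_from_data_frame(d = tx, directed = TRUE)

# Hypothetical stand-in for a per-client computation: counts the outgoing
# transactions of client `v` and sums their amounts.
toy_metric <- function(v, g) {
  edges <- igraph::as_data_frame(g, what = "edges")  # columns: from, to, amount
  out <- edges$amount[edges$from == v]
  c(n_out = length(out), amount_out = sum(out))
}

# Hypothetical helper: applies `f` to every client ID and binds the named
# vectors it returns into one row per client.
per_client <- function(g, data, f) {
  ids <- as.character(data$customer_id)
  rows <- lapply(ids, function(v) f(v, g))
  cbind(
    data.frame(customer_id = ids, stringsAsFactors = FALSE),
    as.data.frame(do.call(rbind, rows))
  )
}

result <- per_client(g, clients, toy_metric)
print(result)
```

The result has one row per client and one column per element of the vector returned by the per-client function, which mirrors the shape of the value described above.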
Computes metrics for 50 generated random walks using the function 'rw_client'.
mean_rw_client(v, g, data)
v |
The initial vertex of the input graph. |
g |
The input graph. It should be a transactional graph whose edges carry the transaction amount as an attribute. The vertices must be the client IDs. |
data |
Data frame with information about the clients. It should include a column named "customer_id" with the client IDs and a column named "sar_flag" with the alert label, which must be a boolean variable. |
A vector with the minimum, mean and maximum of both the number of steps and the total transacted amount over the generated random walks.
Eddin, A. N., Bono, J., Aparício, D., Polido, D., Ascensao, J. T., Bizarro, P., and Ribeiro, P. (2021). Anti-money laundering alert optimization using machine learning with graphs. arXiv preprint arXiv:2112.07508.
g <- igraph::graph_from_data_frame(d = transactions_small_example[, 1:3], directed = TRUE)
v <- transactions_small_example[1, 1]
mean_rw_client(v, g, data = clients_small_example)
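The returned vector condenses many walks into six numbers. The sketch below shows that aggregation on its own, assuming each walk is recorded as a pair (number of steps, total amount). `summarise_walks()` and the element names are hypothetical and chosen for illustration, not taken from the package; three made-up walks are used instead of 50 so the arithmetic stays visible.

```r
# Hypothetical helper, not part of RWgraph.
# `walks` is a numeric matrix with one column per walk and two rows
# named "steps" and "amount".
summarise_walks <- function(walks) {
  c(
    min_steps = min(walks["steps", ]),
    mean_steps = mean(walks["steps", ]),
    max_steps = max(walks["steps", ]),
    min_amount = min(walks["amount", ]),
    mean_amount = mean(walks["amount", ]),
    max_amount = max(walks["amount", ])
  )
}

# Three made-up walks: (1 step, 10), (3 steps, 60), (2 steps, 20).
walks <- cbind(
  c(steps = 1, amount = 10),
  c(steps = 3, amount = 60),
  c(steps = 2, amount = 20)
)

walk_summary <- summarise_walks(walks)
print(walk_summary)
# steps:  min 1, mean 2, max 3
# amount: min 10, mean 30, max 60
```

If a single-walk function returns a named vector of length two, `replicate(50, ...)` applied to it yields a matrix of the same shape, with one column per walk.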
A dataset containing information about 3973 clients of a certain bank.
data(profiles)
A data frame with 3973 rows and 9 variables
age, numeric. Age of the client in years.
antiquity_age, numeric. Age of the account in years.
gender, boolean. Gender of the client.
occupation, numeric. Occupation of the client.
nationality, character. Country of birth of the client (two-letter ISO country code).
residence, character. Country of residence of the client (two-letter ISO country code).
pep_flag, boolean. Indicates whether the client is involved in political activities (1) or not (0).
sar_flag, boolean. Indicates whether the client was involved in a reported transaction (1) or not (0).
customer_id, character. ID of the client's account.
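The functions in this manual expect a client table with a "customer_id" column and a boolean "sar_flag" column. A quick check of a table against those requirements could look like the sketch below. `check_clients()` is a hypothetical helper, not part of the package. Accepting 0/1 values alongside logical ones and requiring unique IDs are both assumptions, the first made because the flags are described above as coded 1/0.

```r
# Hypothetical helper, not part of RWgraph.
check_clients <- function(data) {
  stopifnot(
    is.data.frame(data),
    all(c("customer_id", "sar_flag") %in% names(data)),
    # Assumption: each client appears only once in the table.
    anyDuplicated(data$customer_id) == 0,
    # Assumption: a 0/1 coding is accepted as well as TRUE/FALSE.
    is.logical(data$sar_flag) || all(data$sar_flag %in% c(0, 1))
  )
  invisible(TRUE)
}

# Toy client table (made-up values) with the two required columns.
clients <- data.frame(
  customer_id = c("C1", "C2", "C3"),
  sar_flag = c(0, 1, 0),
  stringsAsFactors = FALSE
)

check_clients(clients)  # passes silently

# A table with a non-boolean label is rejected with an error.
bad <- clients
bad$sar_flag <- c("yes", "no", "no")
rejected <- inherits(try(check_clients(bad), silent = TRUE), "try-error")
```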
Computes a random walk path for a given client.
rw_client(v, g, data)
v |
The initial vertex of the input graph. |
g |
The input graph. It should be a transactional graph whose edges carry the transaction amount as an attribute. The vertices must be the client IDs. |
data |
Data frame with information about the clients. It should include a column named "customer_id" with the client IDs and a column named "sar_flag" with the alert label, which must be a boolean variable. |
A vector with the number of steps taken in the random walk and the total amount transacted along it.
Eddin, A. N., Bono, J., Aparício, D., Polido, D., Ascensao, J. T., Bizarro, P., and Ribeiro, P. (2021). Anti-money laundering alert optimization using machine learning with graphs. arXiv preprint arXiv:2112.07508.
g <- igraph::graph_from_data_frame(d = transactions_small_example[, 1:3], directed = TRUE)
v <- transactions_small_example[1, 1]
rw_client(v, g, data = clients_small_example)
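To make the returned pair (number of steps, total amount) concrete, here is a minimal sketch of one random walk on a transactional graph. It is not the package's implementation, and this manual does not state the stopping rule. The sketch assumes the walk follows a uniformly chosen outgoing transaction and stops at a client with `sar_flag` set, at a client with no outgoing transactions, or after `max_steps` steps. `toy_walk()`, `max_steps` and the toy data are hypothetical.

```r
# Hypothetical illustration, not the RWgraph implementation.
toy_walk <- function(v, g, data, max_steps = 10) {
  edges <- igraph::as_data_frame(g, what = "edges")  # columns: from, to, amount
  flagged <- data$customer_id[as.logical(data$sar_flag)]
  current <- v
  steps <- 0
  total <- 0
  # Assumed stopping rule: flagged client, dead end, or max_steps reached.
  while (steps < max_steps && !(current %in% flagged)) {
    out <- which(edges$from == current)
    if (length(out) == 0) break
    # Index with sample.int() so a single candidate edge is handled correctly.
    pick <- out[sample.int(length(out), 1)]
    total <- total + edges$amount[pick]
    current <- edges$to[pick]
    steps <- steps + 1
  }
  c(steps = steps, amount = total)
}

# Toy chain C1 -> C2 -> C3 (made-up values); C3 carries the alert label.
tx <- data.frame(
  nameOrig = c("C1", "C2"),
  nameDest = c("C2", "C3"),
  amount = c(100, 50),
  stringsAsFactors = FALSE
)
clients <- data.frame(
  customer_id = c("C1", "C2", "C3"),
  sar_flag = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)
g <- igraph::graph_from_data_frame(d = tx, directed = TRUE)

walk <- toy_walk("C1", g, clients)
print(walk)
# The chain has a single path, so the walk takes 2 steps and moves 150 in total.
```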
A dataset containing information about 15379 transactions of a certain bank.
data(transactions)
A data frame with 15379 rows and 5 variables
nameOrig, character. ID of the client that initiated the transaction.
nameDest, character. ID of the client that received the transaction.
amount, numeric. Amount of money involved in the transaction in euros (€).
isFraud, boolean. Indicates whether the transaction was reported (1) or not (0).
transactionDate, character. Date of the transaction.
A dataset containing information about 10 transactions of a certain bank.
data(transactions_small_example)
A data frame with 10 rows and 5 variables
nameOrig, character. ID of the client that initiated the transaction.
nameDest, character. ID of the client that received the transaction.
amount, numeric. Amount of money involved in the transaction in euros (€).
isFraud, boolean. Indicates whether the transaction was reported (1) or not (0).
transactionDate, character. Date of the transaction.
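The examples in this manual build the input graph from the first three columns of a transactions table. The sketch below shows what that call produces, using a made-up table with the documented column layout rather than the packaged data. `igraph::graph_from_data_frame()` reads the first two columns as the edge list and keeps any further columns as edge attributes, so "amount" ends up attached to each edge.

```r
# Toy transactions with the documented columns (all values are made up).
tx <- data.frame(
  nameOrig = c("C1", "C1", "C2", "C3"),
  nameDest = c("C2", "C3", "C3", "C4"),
  amount = c(100, 250, 40, 75),
  isFraud = c(FALSE, FALSE, TRUE, FALSE),
  transactionDate = c("2021-01-04", "2021-01-05", "2021-01-05", "2021-01-07"),
  stringsAsFactors = FALSE
)

# Columns 1 and 2 define the directed edges (sender -> receiver);
# column 3 becomes an edge attribute named "amount".
g <- igraph::graph_from_data_frame(d = tx[, 1:3], directed = TRUE)

igraph::vcount(g)               # number of distinct client IDs
igraph::ecount(g)               # number of transactions
igraph::V(g)$name               # vertex names are the client IDs
igraph::E(g)$amount             # amounts, in the row order of tx
igraph::edge_attr_names(g)      # "amount"
```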