Machine learning data stack for real-time fraud detection using Feast on GCP

  • September 8, 2021
  • Jay Parthasarthy and Jules S. Damji

Every time you make a purchase with your credit card, a machine learning (ML) model decides whether your transaction is blocked or approved. Fraud detection is a canonical use case for real-time ML: a prediction is made quickly on each request, while you wait at the point of sale for payment approval.

Even though this is a common problem with ML, companies often build custom tooling to tackle these predictions. Like most ML problems, the hard part of fraud prediction is in the data. The fundamental data challenges are the following:

  1. Some data needed for prediction is available as part of the transaction request. This data is easy to pass to the model.
  2. Other data (for example, a user’s historical purchases) provides a high signal for predictions, but it isn’t available as part of the transaction request. This data takes time to look up: it’s stored in a batch system like a data warehouse. This data is challenging to fetch since it requires a system to handle many queries per second (QPS).
  3. Together, they comprise ML features as signals to the model for predicting whether the requested transaction is fraudulent.

Feast is an open-source feature store that helps teams use batch data for real-time ML applications. It’s used in fraud prediction and other high-volume transaction systems, helping prevent fraud across billions of dollars’ worth of transactions at companies like Gojek and Postmates. In this blog, we discuss how to use Feast to build a stack for fraud prediction. You can also follow along on Google Cloud Platform (GCP) by running this Colab tutorial notebook.

Generic data stack for fraud detection

Here’s what a generic stack for fraud prediction looks like:

Let’s break down these stages and explain what’s happening at each stage.

1. Generating batch features from data sources

The first step in deploying an ML model is to generate features from raw data stored in an offline system, such as a data warehouse (DWH) or a modern data lake. After that, we use these features in our ML model for training and inference. But before we get into the specifics of fraud detection related to our example below, let’s quickly understand some high-level concepts.

Data sources: This data repository records all historical transaction data for a user, account information, and any indication of the user’s fraud history. Usually, it’s a data warehouse (DWH) with the respective tables. The diagram above shows that features are generated from these data sources and put into another offline store (or the same store). Using transformation queries, such as SQL, this data can be joined from multiple tables and stored as another table in the DWH, refined and computed as features.

Features used: In the fraud use case, one set of the raw data is a record of historical transactions. This record includes data about the transaction: 

  • Amount of transaction
  • Timestamp when the event occurred
  • User account information
  • Number of transactions in the last seven days
  • Label (what we want to predict)

Desired outcome: Feeding these features into our linear regression model (trained below), we predict in real time whether a transaction requested by a particular user is fraudulent.

An example of our computed features looks as follows:

Transformation query: For example, we might want the model to use the number of transactions a user has made in the last week, since a high number may indicate that fraud is occurring. Using a SQL query against our offline store, we can generate this feature as follows:

SELECT
   src_account AS user_id,
   COUNT(*) AS transaction_count_7d,
   timestamp AS feature_timestamp
-- The FROM, WHERE (7-day window), and GROUP BY clauses are not shown here

This SQL snippet will generate feature values. When executed, the results can be stored in a table and look as follows:

Before we can use the transaction_count_7d feature for training and serving, we need to do two things:

  1. Determine what feature values were in the past so that we can backfill the feature. 
    1. For simplicity, you can just run this query as of some point in the past. As things scale, we can eventually implement a separate backfill pipeline that computes all historical values at once using window functions.
  2. Keep this table up to date as new transactions arrive, so we’ll schedule this query to run regularly.
    1. For simplicity, you can just schedule this query to run every day. However, we recommend that this query be eventually integrated into a batch scheduling system, like Airflow or dbt.
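
To make the backfill idea concrete, here is a minimal pure-Python sketch of the same 7-day count evaluated "as of" an arbitrary point in time. The function name `transaction_count_7d_as_of` and the sample rows are made up for illustration; in practice this logic would stay in SQL in the warehouse.

```python
from collections import Counter
from datetime import datetime, timedelta


def transaction_count_7d_as_of(transactions, as_of):
    """Count each user's transactions in the 7 days ending at `as_of`.

    `transactions` is an iterable of (user_id, timestamp) pairs.
    """
    window_start = as_of - timedelta(days=7)
    counts = Counter(
        user_id
        for user_id, ts in transactions
        if window_start < ts <= as_of
    )
    return dict(counts)


# Made-up rows, for illustration only
transactions = [
    ("user_a", datetime(2021, 6, 1, 9, 0)),
    ("user_a", datetime(2021, 6, 5, 14, 30)),
    ("user_b", datetime(2021, 6, 6, 8, 15)),
    ("user_a", datetime(2021, 6, 12, 11, 0)),
]

# Backfill: evaluate the same feature as of two different points in time
snapshot_june_7 = transaction_count_7d_as_of(transactions, datetime(2021, 6, 7))
snapshot_june_13 = transaction_count_7d_as_of(transactions, datetime(2021, 6, 13))
```

Running the function once per day in the past gives one feature snapshot per day, which is what the simple backfill described above amounts to.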

2. Training a linear regression fraud detection model

Next, we can train a model using our features. We select features by performing the following steps:

  1. First, we fetch past transactions from our transactions table (including the label indicating whether the transaction is fraudulent).
  2. Second, we join in historical feature values for each transaction. 
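
Done by hand, the second step is a point-in-time join: each transaction gets the feature value that was current when the transaction happened, never a later one. The sketch below is a simplified pure-Python illustration of that idea, not Feast's implementation; the function name and sample rows are hypothetical, and details such as feature expiry (TTL) are left out.

```python
from bisect import bisect_right
from datetime import datetime


def point_in_time_join(entity_rows, feature_rows):
    """Attach to each entity row the latest feature value at or before its timestamp.

    entity_rows:  list of dicts with "user_id", "timestamp", "is_fraud"
    feature_rows: list of dicts with "user_id", "feature_timestamp",
                  "transaction_count_7d"
    """
    # Group the feature history per user, sorted by feature timestamp
    history = {}
    for row in sorted(feature_rows, key=lambda r: r["feature_timestamp"]):
        history.setdefault(row["user_id"], []).append(row)

    joined = []
    for entity in entity_rows:
        rows = history.get(entity["user_id"], [])
        timestamps = [r["feature_timestamp"] for r in rows]
        # Index of the last feature row whose timestamp is <= the event time
        idx = bisect_right(timestamps, entity["timestamp"]) - 1
        value = rows[idx]["transaction_count_7d"] if idx >= 0 else None
        joined.append({**entity, "transaction_count_7d": value})
    return joined


# Made-up rows, for illustration only
features = [
    {"user_id": "user_a", "feature_timestamp": datetime(2021, 6, 1), "transaction_count_7d": 2},
    {"user_id": "user_a", "feature_timestamp": datetime(2021, 6, 2), "transaction_count_7d": 5},
]
entities = [
    {"user_id": "user_a", "timestamp": datetime(2021, 6, 1, 12), "is_fraud": 0},
    {"user_id": "user_a", "timestamp": datetime(2021, 6, 3), "is_fraud": 1},
    {"user_id": "user_b", "timestamp": datetime(2021, 6, 3), "is_fraud": 0},
]
training_rows = point_in_time_join(entities, features)
```

Using the value as of the event time keeps information from the future out of the training set, which is why a plain join on user_id is not enough.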

Feast makes these two vital steps easy. In Feast, the table of past transactions is called an entity DataFrame (since it contains the entities for which we are fetching feature values). To create a training DataFrame, we just pass a query for this entity DataFrame alongside feature references, and Feast assembles the training DataFrame:

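from datetime import datetime, timedelta
from feast import FeatureStore
# Initialize a FeatureStore with our current repository's configurations
store = FeatureStore(repo_path=".")
# Get training data
now = datetime.now()
two_days_ago = now - timedelta(days=2)
# The table name and the feature view name below are placeholders
training_data = store.get_historical_features(
    entity_df=f"""
    select
        src_account as user_id,
        timestamp,
        is_fraud
    from your_project.your_dataset.transactions
    where
        timestamp between timestamp('{two_days_ago.isoformat()}')
        and timestamp('{now.isoformat()}')""",
    features=["user_transaction_count_7d:transaction_count_7d"],
).to_df()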

The result from this code looks like the following:

We can now train our scikit-learn linear regression model using this DataFrame, by selecting the feature columns and labels:

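from sklearn.linear_model import LinearRegression
# Drop stray nulls
training_data = training_data.dropna()
# Select training matrices (the column name is a placeholder for the full feature list)
X = training_data[["transaction_count_7d"]]
y = training_data["is_fraud"]
# Train a simple linear regression model
model = LinearRegression().fit(X, y)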

This trained model can then be deployed to whichever model hosting service you use for real-time inference.
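
A linear regression model outputs a continuous score rather than a yes/no answer, so the score has to be turned into a decision somewhere. One common approach, assumed here, is to compare the score against a threshold. The sketch below fits a one-feature least-squares line by hand on made-up data; the helper names and the 0.5 threshold are illustrative assumptions, not part of the tutorial.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def flag_as_fraud(transaction_count_7d, slope, intercept, threshold=0.5):
    """Turn the continuous regression score into a yes/no decision."""
    score = slope * transaction_count_7d + intercept
    return score >= threshold


# Made-up training data: weekly transaction counts and fraud labels
counts = [1, 2, 3, 20, 25, 30]
labels = [0, 0, 0, 1, 1, 1]

slope, intercept = fit_simple_linear_regression(counts, labels)
low_activity_flagged = flag_as_fraud(2, slope, intercept)
high_activity_flagged = flag_as_fraud(28, slope, intercept)
```

Note that the scores are not probabilities and can fall outside the 0 to 1 range, so the threshold should be tuned on held-out data rather than taken as given.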

3. Materialize features to low-latency online stores

We have a model that’s ready for real-time inference. However, we won’t be able to make predictions in real-time if we need to fetch or compute data out of the data warehouse on each request because it’s slow.

Feast allows you to make real-time predictions based on warehouse data by materializing it into an online store. Using the Feast CLI, you can materialize your data incrementally, loading everything from the end of the previous materialization up to the current time:

 feast materialize-incremental $(date -u +"%Y-%m-%dT%H:%M:%S")

With our feature values loaded into the online store, a low-latency key-value store, as shown in the diagram above, we can retrieve fresh feature values when a new transaction request arrives in our system.

Note that the feast materialize-incremental command needs to be run regularly so that the online store continues to contain fresh feature values. We suggest that you integrate this command into your company’s scheduler (e.g., Airflow).
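
As a mental model of why the command has to run regularly, here is a toy in-memory version of incremental materialization. It is a conceptual sketch only, not how Feast is implemented; the `ToyOnlineStore` class and the sample rows are hypothetical.

```python
from datetime import datetime


class ToyOnlineStore:
    """In-memory stand-in for a key-value online store."""

    def __init__(self):
        self.values = {}               # user_id -> (feature_timestamp, value)
        self.last_materialized = None  # end of the previous materialization window

    def materialize_incremental(self, offline_rows, end_time):
        """Load rows with last_materialized < feature_timestamp <= end_time."""
        for user_id, feature_ts, value in offline_rows:
            if self.last_materialized is not None and feature_ts <= self.last_materialized:
                continue  # already covered by an earlier run
            if feature_ts > end_time:
                continue  # belongs to a later window
            current = self.values.get(user_id)
            # Keep only the most recent value per key
            if current is None or feature_ts >= current[0]:
                self.values[user_id] = (feature_ts, value)
        self.last_materialized = end_time

    def get(self, user_id):
        entry = self.values.get(user_id)
        return None if entry is None else entry[1]


# Made-up offline rows: (user_id, feature_timestamp, transaction_count_7d)
offline_rows = [
    ("user_a", datetime(2021, 6, 1), 2),
    ("user_a", datetime(2021, 6, 2), 5),
    ("user_b", datetime(2021, 6, 2), 1),
]

online_store = ToyOnlineStore()
online_store.materialize_incremental(offline_rows, end_time=datetime(2021, 6, 1, 12))
value_after_first_run = online_store.get("user_a")
online_store.materialize_incremental(offline_rows, end_time=datetime(2021, 6, 3))
value_after_second_run = online_store.get("user_a")
```

Until the second run, the online store keeps serving the older value for user_a, which is exactly the staleness a regular schedule avoids.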

4. Real-time inference using a low-latency online store

Our system is now ready to run real-time predictions on an incoming transaction. The prediction can be made in your company’s scalable backend model serving system. For an incoming transaction request, here are the logical steps:

  1. Since each transaction request includes a user_id, we’ll pass this user_id to Feast’s get_online_features method to fetch up-to-date (or fresh) features for this user.
  2. Then, we’ll pass these features to our model serving endpoint to generate a prediction.
  3. Finally, we can use this prediction to approve or deny the transaction.

Request data: Although it is not part of this example, a general fraud detection stack will have a mechanism for passing additional data from the request itself (shown in the diagram above) to the model for prediction. This request data, keyed by user_id, could include the following information:

  • User’s geolocation
  • Time of transaction
  • Type of transaction: debit, credit, point of sale, etc.
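
One way such request data could be combined with features fetched from the online store is sketched below. This is an assumption about how a serving layer might do it, not something the tutorial implements; `build_model_input` and the field names are hypothetical.

```python
def build_model_input(request, online_features, feature_order):
    """Merge request-time fields with stored features into one ordered row."""
    # Request values win if the same name appears in both sources
    merged = {**online_features, **request}
    missing = [name for name in feature_order if name not in merged]
    if missing:
        raise KeyError(f"missing features: {missing}")
    # The order must match the column order the model was trained on
    return [merged[name] for name in feature_order]


# Made-up request and stored features, for illustration only
request = {"user_id": "v5zlw0", "amount": 42.5, "transaction_type": "credit"}
online_features = {"transaction_count_7d": 7}

model_input = build_model_input(
    request,
    online_features,
    feature_order=["amount", "transaction_count_7d"],
)
```

Failing loudly on a missing feature is a deliberate choice here: silently substituting a default could change the fraud decision without anyone noticing.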

Let’s look at what the Feast code would look like in our Python predict function:

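def predict(entity_rows: list):
    # entity_rows is a list of dicts, one per entity, e.g. [{"user_id": "v5zlw0"}]
    feature_vector = store.get_online_features(
        # The feature view name is a placeholder for the full feature list
        features=["user_transaction_count_7d:transaction_count_7d"],
        entity_rows=entity_rows,
    ).to_dict()
    # Delete entity keys
    del feature_vector["user_id"]
    # Flatten response: one list of feature values per entity row
    instances = [
        [feature_values[i] for feature_values in feature_vector.values()]
        for i in range(len(entity_rows))
    ]
    # Get response from model service
    prediction = model_service.predict(instances)
    return prediction
# Send in the user_id to make the prediction
predict([{"user_id": "v5zlw0"}])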
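
The flattening step is easier to follow on a concrete value. The sketch below applies the same logic to a hand-written dictionary shaped like the response (feature name mapped to a list with one value per entity row); the second user, the `credit_score` feature, and all values are made up for illustration.

```python
def flatten_feature_vector(feature_vector, entity_keys, num_rows):
    """Turn {feature_name: [value per row]} into one list of values per row."""
    columns = {
        name: values
        for name, values in feature_vector.items()
        if name not in entity_keys
    }
    return [
        [values[i] for values in columns.values()]
        for i in range(num_rows)
    ]


# Hand-written stand-in for the online store response
feature_vector = {
    "user_id": ["v5zlw0", "k9xq2b"],
    "transaction_count_7d": [7, 1],
    "credit_score": [480, 710],
}

instances = flatten_feature_vector(
    feature_vector, entity_keys={"user_id"}, num_rows=2
)
# instances == [[7, 480], [1, 710]]
```

The values in each row follow the dictionary's key order, so that order needs to match the column order used when training the model.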


In summary, we outlined a general data stack for real-time fraud prediction use cases. We implemented an end-to-end fraud prediction system using Feast on GCP as part of our tutorial.

We’d love to hear how your organization’s setup differs. This setup roughly corresponds to the most common patterns we’ve seen from our users, but things are usually more complicated as teams introduce feature logging, streaming features, and operational databases.

You can bootstrap the simple stack illustrated in this blog by running our tutorial notebook on GCP. From there, you can integrate your prediction service into your production application and start making predictions in real time. We can’t wait to see what you build with Feast, so please share your implementation with the Feast community on Slack.

Finally, we have included a new section, Tutorials, in our Feast documentation. You can launch these tutorials directly into Google Colab or view them on GitHub.