Feast 0.13 adds on-demand transforms, feature servers, and feature views without entities

  • October 2, 2021
  • Danny Chiao, Tsotne Tabidze, Achal Shah, and Felix Wang

We are delighted to announce the release of Feast 0.13, which introduces:

  • [Experimental] On demand feature views, which allow for consistently applied transformations in both training and online paths. This also introduces the concept of request data, which is data only available at the time of the prediction request, as potential inputs into these transformations.
  • [Experimental] Python feature servers, which allow you to quickly deploy a local HTTP server to serve online features. Serverless deployments and Java feature servers are coming soon!
  • Feature views without entities, which allow you to specify features that should only be joined on event timestamps. You do not need lists of entities / entity values when defining and retrieving features from these feature views.

Experimental features are subject to API changes in the near future as we collect feedback. If you have thoughts, please don’t hesitate to reach out to the Feast team!

[Experimental] On demand feature views

On demand feature views allow users to transform existing features and request data into new features. Users define Python transformation logic, which is executed in both the historical retrieval and online retrieval paths. This unlocks many use cases, including fraud detection and recommender systems, and reduces training / serving skew because the same transformations are applied in both paths. Example features may include:

  • Transactional features such as transaction_amount_greater_than_7d_average, where the inputs to features are part of the transaction, booking, or order event
  • Features requiring the current location or time, such as user_account_age or distance_driver_customer
  • Feature crosses where the keyspace is too large to precompute such as movie_category_x_movie_rating or lat_bucket_x_lon_bucket

Currently, these transformations are executed locally. Future milestones include building a feature transformation server for executing transformations at higher scale.
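To make the idea of reducing training / serving skew concrete, here is a minimal, Feast-independent sketch in plain Python. The function name and the field names (transaction_amount, avg_transaction_amount_7d) are hypothetical and chosen only to mirror the transaction_amount_greater_than_7d_average example above. The point is that one function is reused for both historical rows and a live request:

```python
from typing import Dict, List


def amount_greater_than_7d_average(row: Dict[str, float]) -> bool:
    """Single source of truth for the transformation (hypothetical feature)."""
    return row["transaction_amount"] > row["avg_transaction_amount_7d"]


# Training path: the transformation is applied to historical rows.
historical_rows: List[Dict[str, float]] = [
    {"transaction_amount": 120.0, "avg_transaction_amount_7d": 80.0},
    {"transaction_amount": 15.0, "avg_transaction_amount_7d": 40.0},
]
training_features = [amount_greater_than_7d_average(r) for r in historical_rows]

# Serving path: the same function is applied to one request. Here
# transaction_amount plays the role of request data, while the 7-day
# average stands in for a precomputed feature.
online_row = {"transaction_amount": 95.0, "avg_transaction_amount_7d": 80.0}
online_feature = amount_greater_than_7d_average(online_row)

print(training_features, online_feature)
```

Because both paths call the same function, the logic cannot drift between training and serving, which is the property on demand feature views aim to provide.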

First, we define the transformations:

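import pandas as pd

# Import paths as of Feast 0.13; they may differ in other releases.
from feast import Feature, ValueType
from feast.data_source import RequestDataSource
from feast.on_demand_feature_view import on_demand_feature_view

# driver_hourly_stats_view is assumed to be a FeatureView defined elsewhere
# in the feature repository.

# Define a request data source which encodes features / information only 
# available at request time (e.g. part of the user initiated HTTP request)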
input_request = RequestDataSource(
    name="vals_to_add",
    schema={
        "val_to_add": ValueType.INT64,
        "val_to_add_2": ValueType.INT64
    }
)

# Define an on demand feature view which can generate new features based on 
# existing feature views and RequestDataSource features
@on_demand_feature_view(
    inputs={
        "driver_hourly_stats": driver_hourly_stats_view,
        "vals_to_add": input_request,
    },
    features=[
        Feature(name="conv_rate_plus_val1", dtype=ValueType.DOUBLE),
        Feature(name="conv_rate_plus_val2", dtype=ValueType.DOUBLE),
    ],
)
def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:
    df = pd.DataFrame()
    df['conv_rate_plus_val1'] = (inputs['conv_rate'] + inputs['val_to_add'])
    df['conv_rate_plus_val2'] = (inputs['conv_rate'] + inputs['val_to_add_2'])
    return df

Now these new features are available for retrieval at training or serving time:

training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "driver_hourly_stats:string_feature",
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "driver_hourly_stats:avg_daily_trips",
        "transformed_conv_rate:conv_rate_plus_val1",
        "transformed_conv_rate:conv_rate_plus_val2",
        "driver_age:driver_age"
    ],
).to_df()
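For the serving path, a rough sketch of the equivalent online call might look like the following. It assumes that request data values (val_to_add, val_to_add_2) are passed alongside the entity key in entity_rows; the wrapper function get_transformed_features_online is hypothetical, and store is expected to be a FeatureStore for a repository where the definitions above have been applied and materialized:

```python
from typing import Any, Dict, List


def get_transformed_features_online(
    store: Any, driver_id: int, val_to_add: int, val_to_add_2: int
) -> Dict[str, List[Any]]:
    """Fetch stored and on demand features for one driver (illustrative wrapper)."""
    response = store.get_online_features(
        features=[
            "driver_hourly_stats:conv_rate",
            "transformed_conv_rate:conv_rate_plus_val1",
            "transformed_conv_rate:conv_rate_plus_val2",
        ],
        # Assumption: request data is supplied in the same row as the entity key.
        entity_rows=[
            {
                "driver_id": driver_id,
                "val_to_add": val_to_add,
                "val_to_add_2": val_to_add_2,
            }
        ],
    )
    return response.to_dict()
```

A caller would pass in something like FeatureStore(repo_path=".") and a driver_id present in the online store. Since the feature is experimental, check the on demand feature view documentation for the exact behavior in your version.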

See On demand feature view for detailed info on how to use this functionality.

[Experimental] Python feature server

The Python feature server provides an HTTP endpoint that serves features from the feature store. This enables users to retrieve features from Feast using any programming language that can make HTTP requests. For now, the server can only be run locally; a remote serverless feature server and a low-latency Java feature server are both in development.

$ feast init feature_repo
Creating a new Feast repository in /home/tsotne/feast/feature_repo.

$ cd feature_repo

$ feast apply
Registered entity driver_id
Registered feature view driver_hourly_stats
Deploying infrastructure for driver_hourly_stats

$ feast materialize-incremental $(date +%Y-%m-%d)
Materializing 1 feature views to 2021-09-09 17:00:00-07:00 into the sqlite online store.

driver_hourly_stats from 2021-09-09 16:51:08-07:00 to 2021-09-09 17:00:00-07:00:
100%|████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 295.24it/s]

$ feast serve
# This is an experimental feature. It's intended for early testing and feedback,
# and could change without warnings in future releases.
INFO:     Started server process [8889]
09/10/2021 10:42:11 AM INFO:Started server process [8889]
INFO:     Waiting for application startup.
09/10/2021 10:42:11 AM INFO:Waiting for application startup.
INFO:     Application startup complete.
09/10/2021 10:42:11 AM INFO:Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:6566 (Press CTRL+C to quit)
09/10/2021 10:42:11 AM INFO:Uvicorn running on http://127.0.0.1:6566 (Press CTRL+C to quit)

See Feature server for detailed info on how to use this functionality.
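As a rough illustration of calling the server from a client, the sketch below builds a JSON POST request using only the Python standard library. The address comes from the log output above, but the endpoint path (/get-online-features) and the request body shape (features plus entity_rows) are assumptions made for this example and should be checked against the Feature server documentation for your Feast version. The helper build_feature_request is hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List

SERVER_URL = "http://127.0.0.1:6566"    # address printed by `feast serve` above
ENDPOINT_PATH = "/get-online-features"  # ASSUMPTION: verify against the docs


def build_feature_request(
    features: List[str],
    entity_rows: List[Dict[str, Any]],
    server_url: str = SERVER_URL,
    path: str = ENDPOINT_PATH,
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for online features."""
    # ASSUMPTION: the body schema below is illustrative, not confirmed.
    body = json.dumps({"features": features, "entity_rows": entity_rows})
    return urllib.request.Request(
        url=server_url + path,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_feature_request(
    features=["driver_hourly_stats:conv_rate"],
    entity_rows=[{"driver_id": 1001}],
)
print(request.get_method(), request.full_url)

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp))
```

Nothing here is specific to Python: any language with an HTTP client could issue the same request once the real endpoint and schema are confirmed.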

Feature views without entities

Feature views can now be defined without entities. Feature views without entities are joined on the event_timestamp column and can only take on a single value at any point in time. This can greatly simplify defining and retrieving features (e.g. global features like total_num_users):

global_stats_fv = FeatureView(
    name="global_stats",
    entities=[],
    features=[
        Feature(name="total_trips_today_by_all_drivers", dtype=ValueType.INT64),
    ],
    batch_source=BigQuerySource(
        table_ref="feast-oss.demo_data.global_stats"
    )
)
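To illustrate what joining on the event timestamp alone means, here is a small conceptual sketch in plain Python. It is not how Feast implements the join, and it ignores details such as TTLs; the sample data and the helper latest_global_value are made up for the example. Each row receives the most recent global value at or before its own timestamp, regardless of which driver it belongs to:

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional

# Hypothetical rows of the global_stats source, sorted by event timestamp.
global_stats = [
    (datetime(2021, 9, 1), 100),
    (datetime(2021, 9, 2), 150),
    (datetime(2021, 9, 3), 175),
]
stat_timestamps = [ts for ts, _ in global_stats]


def latest_global_value(event_timestamp: datetime) -> Optional[int]:
    """Return the latest global value at or before event_timestamp, if any."""
    idx = bisect_right(stat_timestamps, event_timestamp)
    return global_stats[idx - 1][1] if idx > 0 else None


# The rows may carry entity keys, but those keys play no part in this join.
entity_rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2021, 9, 2, 12)},
    {"driver_id": 1002, "event_timestamp": datetime(2021, 9, 3, 8)},
    {"driver_id": 1003, "event_timestamp": datetime(2021, 8, 31)},
]
joined = [
    {
        **row,
        "total_trips_today_by_all_drivers": latest_global_value(row["event_timestamp"]),
    }
    for row in entity_rows
]
for row in joined:
    print(row["driver_id"], row["total_trips_today_by_all_drivers"])
```

The third row gets no value because its timestamp precedes every recorded global value, which reflects the single-value-at-any-point-in-time semantics described above.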

What’s next

We are collaborating with the community on supporting streaming sources, low-latency serving, serverless deployments of feature servers, improved support for Kubernetes deployments, and more.

In addition, there is active community work on building Hive, Snowflake, Azure, Astra, Presto, and Alibaba Cloud connectors. If you have thoughts on what to build next in Feast or what should belong in a feature store, please fill out this form.

Download Feast 0.13 today from PyPI (or pip install feast) and try it out! Let us know what you think on our Slack channel.

Credits

We want to extend our gratitude and acknowledgement to all Feast community contributors, @achals, @adchia, @baineng, @codyjlin, @DvirDukhan, @felixwang9817, @GregKuhlmann, @guykhazma, @hamzakpt, @judahrand, @jdamji, @LarsKlingen, @MattDelac, @mavysavydav, @mmurdoch, @qooba, @tedhtchang, @samuel100, @tsotnet, and @WingCode, who helped contribute to this release.

To see all the features, enhancements, and bug fixes from the Feast community contributors, check the changelog for this release.