Extract Text Features

Description

The Extract Text Features custom step enables SAS Studio users to extract additional features from text fields. The number of features this step creates depends on the options selected within the step: it generates at least eight features and can generate hundreds. This custom step covers topics in NLP (Natural Language Processing).

The generated features range from simple features based on the occurrence of exclamation points in the text to extracting sentiment and text topics from the text.
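To make the "simple features" end of that range concrete, here is a minimal Python sketch of surface-level features computed from a single text value. It is an illustration only, not the step's implementation: the function name `simple_text_features` and the feature names are hypothetical, and the step's actual features may differ.

```python
import re


def simple_text_features(text: str) -> dict:
    """Return a few surface-level features for one text value.

    Hypothetical illustration: names and feature set are assumptions,
    not the features produced by the custom step itself.
    """
    tokens = text.split()  # whitespace-separated tokens, punctuation kept
    return {
        "char_count": len(text),
        "word_count": len(tokens),
        "exclamation_count": text.count("!"),
        "question_count": text.count("?"),
        # Tokens whose letters are all uppercase, e.g. "WOW,"
        "uppercase_word_count": sum(1 for t in tokens if t.isupper()),
        # Rough URL match: http:// or https:// followed by non-whitespace
        "url_count": len(re.findall(r"https?://\S+", text)),
    }


features = simple_text_features("WOW, this is great!!! See https://example.com")
for name, value in features.items():
    print(f"{name}: {value}")
```

Each feature here is a plain count, so the result can be attached to the source row as additional numeric columns.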

For more information about the Advanced Text Analytics features, refer to the SAS Documentation.

A full walkthrough of this custom step can be found in this YouTube video.

User Interface

  • Base Metadata
  • Custom RegEx Pattern
  • Link Data
  • Text Analytics Start
  • Text Analytics Topic Creation
  • Text Analytics Bool Rule Creation
  • About

Requirements

SAS Viya 2022.10 or later.

To gather link metadata, the Compute Context needs a Python runtime with at least the following three packages installed: pandas, requests, and bs4.
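One way to confirm that a Python runtime provides these packages is a short availability check. The sketch below uses only the standard library; the helper name `missing_packages` is hypothetical and not part of the custom step. Note that `bs4` is the import name of the Beautiful Soup package.

```python
from importlib.util import find_spec

# Packages named in the requirements above (import names).
REQUIRED_PACKAGES = ["pandas", "requests", "bs4"]


def missing_packages(names):
    """Return the names that cannot be located by the import system."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED_PACKAGES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Running this with the same Python interpreter that the Compute Context is configured to use shows which packages, if any, still need to be installed.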

For the Advanced Text Analytics features you need to have a SAS Visual Text Analytics license.

For a demonstration of this step, see this YouTube video. The example flow is included here.

The video uses the train.csv data from this Kaggle Challenge. Downloading the data requires a Kaggle account; for a quick demonstration of how to do this, see this part of the YouTube video.

Change Log

  • Version 1.0 (27NOV2022)
    • Initial version