messageanalyzer.extract_keywords
Functions
extract_keywords(messages, num_keywords=5): Extracts the top keywords from a list of text messages using TF-IDF (Term Frequency-Inverse Document Frequency).
Module Contents
- messageanalyzer.extract_keywords.extract_keywords(messages: List[str], num_keywords: int = 5) → List[List[str]]
Extracts the top keywords from a list of text messages using TF-IDF (Term Frequency-Inverse Document Frequency).
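This function treats the list of messages as the corpus and scores every word in each message with TF-IDF: a word scores higher the more often it appears in that message and the fewer other messages contain it. The highest-scoring words in each message are returned as its keywords. Stop words are removed automatically.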
- Parameters:
messages (List[str]) – A list of text messages from which to extract keywords.
num_keywords (int, default = 5) – The number of top keywords to extract from each message.
- Raises:
TypeError – If messages is not a list or contains non-string elements.
- Returns:
A list with one sublist per input message, in the same order as the input. Each sublist contains the top keywords extracted from the corresponding message.
- Return type:
List[List[str]]
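The two documented TypeError cases can be illustrated with a small check. This is only a sketch: validate_messages is a hypothetical helper name, and the error messages are assumptions rather than the library's actual wording.

```python
from typing import Any


def validate_messages(messages: Any) -> None:
    """Hypothetical helper: raise TypeError for the two documented failure cases."""
    # Case 1: the argument itself is not a list.
    if not isinstance(messages, list):
        raise TypeError("messages must be a list of strings")
    # Case 2: the list contains at least one non-string element.
    for index, message in enumerate(messages):
        if not isinstance(message, str):
            raise TypeError(
                f"messages[{index}] must be a str, got {type(message).__name__}"
            )
```

Under this sketch a bare string such as "hello" is rejected even though it is iterable, which matches the documented requirement that messages be a list.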
Examples
>>> messages = ["Learning Data Science at MDS is amazing!", "I prefer to work with Python than R"]
>>> extract_keywords(messages, num_keywords=3)
[['data', 'science', 'amazing'], ['python', 'prefer', 'work']]
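To make the TF-IDF ranking concrete, here is a minimal standard-library sketch of per-message keyword extraction. It is not the package's implementation: the function name tfidf_keywords_sketch, the tiny STOP_WORDS set, the tokenizer, the smoothed IDF formula and the alphabetical tie-breaking are all assumptions chosen for illustration, so its output may differ from extract_keywords on the same input.

```python
import math
import re
from collections import Counter
from typing import List

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"an", "and", "at", "is", "than", "the", "to", "with"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text, keep alphabetic tokens of 2+ letters, drop stop words."""
    tokens = re.findall(r"[a-z]{2,}", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_keywords_sketch(messages: List[str], num_keywords: int = 5) -> List[List[str]]:
    """Return up to num_keywords highest-scoring TF-IDF terms per message."""
    docs = [tokenize(m) for m in messages]
    n_docs = len(docs)
    # Document frequency: number of messages that contain each term.
    df = Counter(term for doc in docs for term in set(doc))

    results = []
    for doc in docs:
        tf = Counter(doc)
        # Term frequency times a smoothed inverse document frequency.
        scores = {
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:num_keywords])
    return results


print(tfidf_keywords_sketch(["python python code", "code review"], num_keywords=1))
# prints [['python'], ['review']]
```

In the second message "review" outranks "code" because "code" also appears in the first message, which lowers its inverse document frequency. A message with no tokens left after filtering yields an empty sublist.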