Classifying a patent using an AI at subclass level (600+ labels) - Multilabel classification

vishvapalsinh/Patent-classification

Patent classification 


This project aims to classify patents at the subclass level of the IPC and CPC systems.
There are more than 600 subclasses, and a patent can belong to several of them, so this is a multilabel classification task.
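
For orientation, a subclass is the four-character prefix of an IPC or CPC symbol: a section letter, a two-digit class, and a subclass letter (for example `H01M` in `H01M10/0525`). The snippet below is a minimal sketch of reducing full symbols to subclasses. The helper names `to_subclass` and `subclasses` are hypothetical and are not taken from the project notebooks.

```python
import re

# Section letter (A-H, or Y in CPC), two-digit class, subclass letter.
_SUBCLASS = re.compile(r"^[A-HY]\d{2}[A-Z]")


def to_subclass(symbol: str) -> str:
    """Return the four-character subclass prefix of a CPC/IPC symbol."""
    match = _SUBCLASS.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a classification symbol: {symbol!r}")
    return match.group(0)


def subclasses(symbols):
    """Map symbols to subclasses, dropping duplicates but keeping order."""
    return list(dict.fromkeys(to_subclass(s) for s in symbols))


labels = subclasses(["H01M10/0525", "H01M4/13", "Y02E60/10"])
```

Several full symbols on one patent often collapse to the same subclass, which is why the sketch deduplicates.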

The project pipeline is as follows:

  1. Extract the dataset
  2. Exploratory data analysis (EDA) of the dataset
  3. Train a model

A Jupyter notebook is shared for each of the tasks above.

The dataset for the classification task is generated with Google BigQuery and stored as CSV files, one file per year from 2009 to 2019. The dataset is publicly available for experimentation. The columns of these CSV files are shown in the table below:

| ID | Date | Title | Claim | cpc_subclass |
|---|---|---|---|---|
| 8844051 | 2014-09-23 | Lithium-ion secondary battery | A lithium-ion secondary battery comprising ... | H01M,Y02E,Y02T |
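
Because `cpc_subclass` holds several comma-separated labels per patent, it has to be turned into a multi-hot target before a multilabel model can be trained. The following is a minimal sketch using only the standard library. It assumes the CSV files have a header row with the column names shown above and that the multi-label field is quoted; the in-memory `SAMPLE` string and the helper names are illustrative stand-ins, not the project's own code.

```python
import csv
import io

# Assumed file layout: header row, multi-label field quoted.
SAMPLE = '''ID,Date,Title,Claim,cpc_subclass
8844051,2014-09-23,Lithium-ion secondary battery,A lithium-ion secondary battery comprising ...,"H01M,Y02E,Y02T"
'''


def read_rows(handle):
    """Read CSV rows and split cpc_subclass into a list of labels."""
    rows = []
    for row in csv.DictReader(handle):
        row["labels"] = [c.strip() for c in row["cpc_subclass"].split(",") if c.strip()]
        rows.append(row)
    return rows


def build_vocab(rows):
    """Collect every subclass seen in the data, in sorted order."""
    return sorted({label for row in rows for label in row["labels"]})


def multi_hot(labels, vocab):
    """Encode a label list as a 0/1 vector aligned with vocab."""
    wanted = set(labels)
    return [1 if label in wanted else 0 for label in vocab]


rows = read_rows(io.StringIO(SAMPLE))
vocab = build_vocab(rows)
targets = [multi_hot(row["labels"], vocab) for row in rows]
```

With a real yearly file, `io.StringIO(SAMPLE)` would be replaced by `open(path, newline="", encoding="utf-8")`, and the vocabulary would be built once over all years so that the vector positions stay consistent.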

The link to download this dataset by year is provided below.
