Installing fidescls from PyPI lets you import both the Content and Context classification systems into your own projects:
    from fidescls.cls import content, context
Classification Methods
Content Classification
Calling content.classify() with a string, list, or dictionary returns potential PII classifications for the given input.
    content_test_string = "example@email.com"
    content_test_list = ["email@example.com", "example@email.com"]
    content_test_dict = {
        'email': content_test_list,
        'name': ["John Smith", "Jane Smith"]
    }

    ## Classify a string, list, and dictionary
    content_cls_string = content.classify(content_test_string)
    content_cls_list = content.classify(content_test_list)
    content_cls_dict = content.classify(content_test_dict)
The results list the suggested classification labels for each input, along with a confidence score and the start and end positions of each match:
Results
    ## String Results
    [ClassifyOutput(input='example@email.com',
                    labels=[MethodOutput(label='EMAIL_ADDRESS', score=1.0, position_start=0, position_end=17)])]

    ## List Results
    [ClassifyOutput(input='email@example.com',
                    labels=[MethodOutput(label='EMAIL_ADDRESS', score=1.0, position_start=0, position_end=17)]),
     ClassifyOutput(input='example@email.com',
                    labels=[MethodOutput(label='EMAIL_ADDRESS', score=1.0, position_start=0, position_end=17)])]

    ## Dictionary Results
    {'email': [ClassifyOutput(input='email@example.com',
                              labels=[MethodOutput(label='EMAIL_ADDRESS', score=1.0, position_start=0, position_end=17)]),
               ClassifyOutput(input='test@email.com',
                              labels=[MethodOutput(label='EMAIL_ADDRESS', score=1.0, position_start=0, position_end=14)])],
     'name': [ClassifyOutput(input='John Smith',
                             labels=[MethodOutput(label='PERSON', score=0.85, position_start=0, position_end=10)]),
              ClassifyOutput(input='Jane Smith',
                             labels=[MethodOutput(label='PERSON', score=0.85, position_start=0, position_end=10)])]}
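The printed results suggest that each ClassifyOutput carries the original input plus a list of MethodOutput labels with scores. As a minimal sketch of how such results might be filtered by score, the example below uses stand-in dataclasses that only mirror the printed fields; they are not imported from fidescls, and the labels_above helper and the 0.9 threshold are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Stand-in types mirroring the fields shown in the printed results.
# Assumption: illustrative only, not the classes fidescls itself defines.
@dataclass
class MethodOutput:
    label: str
    score: float
    position_start: int
    position_end: int


@dataclass
class ClassifyOutput:
    input: str
    labels: List[MethodOutput] = field(default_factory=list)


def labels_above(results: List[ClassifyOutput], threshold: float) -> Dict[str, List[str]]:
    """Map each input to the labels whose score meets the threshold (hypothetical helper)."""
    return {
        r.input: [m.label for m in r.labels if m.score >= threshold]
        for r in results
    }


results = [
    ClassifyOutput("example@email.com", [MethodOutput("EMAIL_ADDRESS", 1.0, 0, 17)]),
    ClassifyOutput("John Smith", [MethodOutput("PERSON", 0.85, 0, 10)]),
]

# Keep only labels scored at 0.9 or higher.
confident = labels_above(results, threshold=0.9)
print(confident)
```

With this threshold the email label is kept and the lower-scoring PERSON label is dropped, which is one way to separate near-certain matches from suggestions that may need review.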
Context Classification
Classifying by context requires a set of data categories to compare your input against. The fideslang taxonomy lets you classify your systems by common privacy definitions and standards, and is imported by default for use with fidescls' classification systems.
Calling context.classify(column_name) with a column name returns possible classification labels for the column's contents. You can also pass an integer top_n to set the number of candidate categories returned for each input.
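To illustrate what a top_n cutoff does, here is a small self-contained sketch that ranks candidate categories for a column name and keeps the best few. It does not call fidescls: the rank_categories function, the Jaccard token-overlap similarity, and the category names are all assumptions chosen for illustration, and the library's own matching method may differ.

```python
import re
from typing import List, Set, Tuple

# Illustrative category names; assumption: not necessarily fideslang's actual keys.
CATEGORIES = ["user.contact.email", "user.contact.phone_number", "user.name"]


def tokens(name: str) -> Set[str]:
    """Split a name on dots, underscores, and other non-alphanumeric characters."""
    return {t for t in re.split(r"[^a-z0-9]+", name.lower()) if t}


def rank_categories(column_name: str, top_n: int = 1) -> List[Tuple[str, float]]:
    """Return the top_n categories by Jaccard token overlap (hypothetical stand-in)."""
    col = tokens(column_name)
    scored = []
    for category in CATEGORIES:
        cat = tokens(category)
        union = col | cat
        score = len(col & cat) / len(union) if union else 0.0
        scored.append((category, score))
    # Highest score first; sorted() is stable, so ties keep CATEGORIES order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


suggestions = rank_categories("email_address", top_n=2)
print(suggestions)
```

Raising top_n returns more, lower-ranked candidates for the same column, which can help when the best match is ambiguous.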
To take advantage of fidescls' data aggregation methods, the aggregation module is available as an import for your own projects:
    from fidescls.cls import aggregation
Content Aggregation
Providing content.classify() with an aggregation_method alongside the data to be classified (represented here as content_data) will aggregate the results into a higher-level classification recommendation.
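This section does not list the accepted aggregation_method values, so the sketch below only illustrates the general idea of rolling per-value labels up into one column-level recommendation. It averages each label's score across all values in a column; the aggregate_mean function and the choice of a mean are assumptions, not necessarily a method fidescls provides.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def aggregate_mean(per_value_labels: List[List[Tuple[str, float]]]) -> Dict[str, float]:
    """Average each label's score over all values in a column.

    A value that lacks a given label contributes 0 to that label's average.
    Hypothetical helper, not part of fidescls.
    """
    totals: Dict[str, float] = defaultdict(float)
    for labels in per_value_labels:
        for label, score in labels:
            totals[label] += score
    n = len(per_value_labels)
    return {label: total / n for label, total in totals.items()} if n else {}


# One entry per value in the column: a list of (label, score) pairs.
column = [
    [("EMAIL_ADDRESS", 1.0)],
    [("EMAIL_ADDRESS", 1.0)],
    [("PERSON", 0.85)],
    [],  # a value with no detected labels
]

summary = aggregate_mean(column)
best = max(summary, key=summary.get)
print(summary, best)
```

Averaging over every value, including those with no label, keeps a single stray match from dominating the column-level recommendation.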
Calling aggregation.aggregate_system() with both content and context classification results will aggregate their suggestions into a single, compiled result.