Demo notebook NE display using HTML #56
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

| Coverage Diff | main | #56 | +/- |
|---|---|---|---|
| Coverage | 93.47% | 93.90% | +0.43% |
| Files | 4 | 4 | |
| Lines | 383 | 394 | +11 |
| Hits | 358 | 370 | +12 |
| Misses | 25 | 24 | -1 |
Quality Gate passed
This still addresses the demonstration notebook (#40), which lacked a way to visualize the results. The visualization can be improved further: this method highlights every occurrence of an ORG, LOC or MISC entity, whereas the pseudonymization process replaces only one occurrence at a time. As a result, parts of the text may be highlighted even though they were not replaced. For PER, the highlighting is accurate, because the pseudonymization class also replaces every occurrence of each name.
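To make the mismatch concrete, here is a minimal sketch of HTML highlighting that can mark either every occurrence or only one occurrence per entity. It is not the code from this PR: the function name `highlight_entities`, the `first_only` parameter, the colour table, and the choice of the *first* occurrence as the one that gets replaced are all assumptions made for illustration.

```python
import html
import re

# Hypothetical colours per entity label; not taken from the PR.
COLOURS = {"PER": "#ffd54f", "ORG": "#81d4fa", "LOC": "#a5d6a7", "MISC": "#ce93d8"}


def highlight_entities(text, entities, first_only=("ORG", "LOC", "MISC")):
    """Return an HTML string with entity mentions wrapped in <mark> tags.

    `entities` maps a label ("PER", "ORG", ...) to a list of surface strings.
    Labels in `first_only` are highlighted once per entity (assumption: the
    first occurrence is the one that was replaced); other labels everywhere.
    """
    spans = []
    for label, names in entities.items():
        for name in names:
            if not name:
                continue  # an empty string would match at every position
            for i, match in enumerate(re.finditer(re.escape(name), text)):
                if label in first_only and i > 0:
                    break
                spans.append((match.start(), match.end(), label))

    spans.sort()
    parts, pos = [], 0
    for start, end, label in spans:
        if start < pos:
            continue  # skip spans overlapping one that was already emitted
        parts.append(html.escape(text[pos:start]))
        colour = COLOURS.get(label, "#eeeeee")
        mention = html.escape(text[start:end])
        parts.append(f'<mark style="background:{colour}">{mention}</mark>')
        pos = end
    parts.append(html.escape(text[pos:]))
    return "".join(parts)
```

In a notebook, the returned string could be rendered with `IPython.display.HTML`. Passing `first_only=()` reproduces the behaviour described above, where every occurrence is highlighted regardless of label.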
Note that this crashes when parsing too many emails, which is something to keep in mind for further testing: in two separate tests, on VS Code and using jupyter-notebook, either nothing was displayed anymore or the kernel died.
- Implemented function to highlight NEs in the performance notebook using HTML
- `per_list`, `org_list`, `loc_list` and `misc_list` attributes on the pseudonymization class, which can be accessed for the currently processed email
- `pseudonymize_per` method, which now uses the class attribute `per_list` instead of taking the list as a parameter
- `notebook/demo.ipynb`
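The attribute-based design in the list above could look roughly like the following. This is only a sketch under assumptions: the class name `Pseudonymizer`, the `reset` helper, and the placeholder `pseudo_names` are hypothetical, and the real class in this repository may differ. Only the attribute names and `pseudonymize_per` come from the PR description.

```python
class Pseudonymizer:
    """Hypothetical stand-in for the pseudonymization class."""

    def __init__(self, pseudo_names=None):
        # Placeholder replacement names; an assumption for this sketch.
        self.pseudo_names = pseudo_names or ["Jane Doe", "John Roe"]
        self.reset()

    def reset(self):
        # Entity lists for the email currently being processed.
        self.per_list = []
        self.org_list = []
        self.loc_list = []
        self.misc_list = []

    def pseudonymize_per(self, text):
        # Reads self.per_list instead of receiving the list as a parameter,
        # and replaces every occurrence of each name.
        for i, name in enumerate(self.per_list):
            pseudonym = self.pseudo_names[i % len(self.pseudo_names)]
            text = text.replace(name, pseudonym)
        return text
```

Keeping the lists on the instance means a notebook cell can read `per_list`, `org_list`, `loc_list` and `misc_list` after processing an email and pass them to a highlighting function, without changing any method signatures.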