davcurse/smart_document_analyzer

Smart Document Analyzer

EC530 Final Project

David Li

An easy-to-use command-line program for storing, viewing, managing, and summarizing documents, with secure user authentication and history backtracking.

Features

  • User registration and login functionality
  • File upload and management
  • Word count for uploaded files (.txt, .doc, .docx, .pdf)
  • Logging of user actions and application events
  • Database management and file deletion options
  • Document summarization with suggestions for related links
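
As a rough illustration of the word count feature, the sketch below counts whitespace-separated tokens in text that has already been extracted from a file. The function name count_words is hypothetical and not taken from the project; extracting text from .doc, .docx, and .pdf files would need the libraries listed under Requirements and is not shown here.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens in already-extracted text.

    Hypothetical helper: the project's real word-count logic may
    tokenize differently (for example, by stripping punctuation).
    """
    # str.split() with no arguments splits on any run of whitespace
    # and ignores leading and trailing whitespace.
    return len(text.split())
```

For a .txt file, the text could be read with pathlib.Path(path).read_text() and passed straight to this function.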

Requirements

  • Python 3.x
  • SQLite3
  • PyPDF
  • python-docx
  • logging (part of the Python standard library)
  • nltk

Installation

  1. Clone the repository
  2. Run the application with python3.11 smart_doc.py (or another Python 3.x interpreter of your choice), followed by one of the command-line arguments listed in step 4
  3. Once logged in, you can:
      • View uploaded files
      • Upload new files
      • Delete files
      • Summarize uploaded files
      • Log out
  4. The application supports the following command-line arguments:
      • -a: Run the program
      • -c: Clean the database and delete all uploaded files
      • -g: Generate a new database for testing
      • -r: Clean and generate a new database
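
The four flags above could be parsed along the following lines. This is a minimal sketch using argparse, not the code in smart_doc.py: the function names and the returned action labels are hypothetical, and treating the flags as mutually exclusive with exactly one required is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the -a, -c, -g, and -r flags."""
    parser = argparse.ArgumentParser(
        prog="smart_doc.py", description="Smart Document Analyzer"
    )
    # Assumption: exactly one flag must be given per run.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-a", action="store_true", help="run the program")
    group.add_argument(
        "-c", action="store_true",
        help="clean the database and delete all uploaded files",
    )
    group.add_argument(
        "-g", action="store_true", help="generate a new database for testing"
    )
    group.add_argument(
        "-r", action="store_true", help="clean and generate a new database"
    )
    return parser


def selected_action(argv: list[str]) -> str:
    """Map the parsed flag to a hypothetical action label."""
    args = build_parser().parse_args(argv)
    if args.a:
        return "run"
    if args.c:
        return "clean"
    if args.g:
        return "generate"
    return "clean_and_generate"  # only -r remains
```

With this layout, calling selected_action(sys.argv[1:]) with no flag makes argparse print a usage message and exit, which matches the requirement that the program be started with an argument.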

File Structure

  • smart_doc.py: Main executable for program
  • test_main.py: Backend testing implementation
  • file.py: Contains functions for user registration, login, and file management
  • database.py: Handles database initialization and creation of tables
  • fullclean_db.py: Provides functions for clearing the database and deleting files
  • Database/database.db: SQLite database file for storing user information and uploaded files
  • uploaded_files/: Directory for storing all uploaded files

Logging

The application logs various events and user actions to the app.log file. The log file includes timestamps, log levels, and log messages for tracking and debugging purposes.
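
A log file with timestamps, levels, and messages can be produced with the standard logging module roughly as follows. This is a sketch under assumptions: the logger name, the format string, and the function name configure_logger are illustrative and may differ from what the project uses.

```python
import logging

# Assumed layout: timestamp, level, message.
LOG_FORMAT = "%(asctime)s - %(levelname)s - %(message)s"


def configure_logger(log_path: str = "app.log",
                     name: str = "smart_doc") -> logging.Logger:
    """Return a logger that appends formatted records to log_path."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Avoid attaching a second handler if this is called twice.
    if not logger.handlers:
        handler = logging.FileHandler(log_path)
        handler.setFormatter(logging.Formatter(LOG_FORMAT))
        logger.addHandler(handler)
    return logger
```

A call such as configure_logger().info("file uploaded") would then append one timestamped INFO line to app.log.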

Demo

demo.mp4

Example Use Screenshots

The smart_doc application must be run with a command-line argument.

Cleaning and generating a new database.

Registration prompt. All registered users are stored securely and locally only; see database.py and the Database folder.

Successful login and file management options. Uploaded files can be viewed and manipulated only by the logged-in user; other users have no access to them.
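
Per-user file visibility of this kind is typically enforced by filtering every query on the owner's ID. The sketch below uses sqlite3 with a hypothetical two-table schema; the table and column names are assumptions and are not taken from database.py.

```python
import sqlite3

# Hypothetical schema: each file row records the user who uploaded it.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY,
    username TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS files (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    filename TEXT NOT NULL,
    word_count INTEGER NOT NULL
);
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the tables if they are missing."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def files_for_user(conn: sqlite3.Connection, user_id: int) -> list[tuple]:
    """Return (filename, word_count) rows owned by user_id only."""
    cursor = conn.execute(
        "SELECT filename, word_count FROM files "
        "WHERE user_id = ? ORDER BY id",
        (user_id,),  # parameterized to avoid SQL injection
    )
    return cursor.fetchall()
```

Because the user ID comes from the login session rather than from user input, one user cannot list another user's files through this query.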

Viewing files. (Textbooks have large word counts!)

Deleting a file clears it from the database for the logged-in user and removes it from the storage directory.

Viewing summary of the file. (Chapter on Palestine History)

Viewing the summary of a file shows keywords and a summary, along with related links on the topic.
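
To give a sense of how keywords and a summary can be derived from text, here is a small frequency-based extractive sketch. It uses only the standard library rather than nltk, and the stopword list, function names, and scoring rule are all assumptions; the project's actual summarizer may work quite differently. Suggesting related links is not covered.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it", "on"}


def _tokens(text: str) -> list[str]:
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOPWORDS]


def keywords(text: str, top_n: int = 5) -> list[str]:
    """Return the top_n most frequent non-stopword tokens."""
    return [word for word, _ in Counter(_tokens(text)).most_common(top_n)]


def summarize(text: str, max_sentences: int = 2) -> str:
    """Keep the highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)
                 if s.strip()]
    freq = Counter(_tokens(text))
    # A sentence's score is the summed document frequency of its words.
    scores = [sum(freq[w] for w in _tokens(s)) for s in sentences]
    best = sorted(range(len(sentences)), key=lambda i: scores[i],
                  reverse=True)[:max_sentences]
    return " ".join(sentences[i] for i in sorted(best))
```

Scoring by summed frequency favors long sentences; dividing by sentence length is a common refinement if that bias becomes a problem.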

Dockerfile implementation. Requirements.txt can be found in the source code.

License

This project is licensed under the MIT License.
