B.J. Dweck 9d33c2ee10 Pinned python package versions 2023-09-21 15:21:17 +03:00
.vscode overhauled the project to get away from files (a little) 2023-09-11 19:10:42 +03:00
.gitignore overhauled the project to get away from files (a little) 2023-09-11 19:10:42 +03:00
README.md Pinned python package versions 2023-09-21 15:21:17 +03:00
ankiai.py decoupled 2023-09-11 20:35:55 +03:00
constants.py decoupled 2023-09-11 20:35:55 +03:00
deck_creation.py revised prompt to generate more cards 2023-09-21 14:52:06 +03:00
image_processing.py BUGFIX: image processing only handles filenames with jpg 2023-09-21 14:42:47 +03:00
logging_config.py extracted contants and added logging 2023-09-11 20:02:17 +03:00
requirements.txt Pinned python package versions 2023-09-21 15:21:17 +03:00
server.py revised README.md 2023-09-11 21:04:51 +03:00

README.md

AnkiAI - Automated Anki Deck Creator

AnkiAI is a tool that uses OCR (Optical Character Recognition) and GPT-3 to automatically generate Anki decks from images containing text.

Overview

  • AnkiAI is designed to streamline the process of creating Anki decks from images.
  • The core idea is to use OCR to extract text from images and then use GPT-3 to transform this text into a structured Anki deck format.
  • Users can make a POST request to a Flask server endpoint with their images to receive the Anki deck (.apkg file).

Directory Structure

  • .vscode/: Contains configuration for VSCode debugger for Flask applications.
  • ankiai.py: The main script that drives the creation of Anki decks from images.
  • constants.py: Contains constant variables used across the project.
  • deck_creation.py: Contains logic for communicating with OpenAI's API and deck creation using genanki.
  • image_processing.py: Processes images, converting them for OCR and then performing OCR to extract text.
  • logging_config.py: Logging configuration for the entire project.
  • server.py: Flask server that provides an API endpoint to upload images and get back an Anki deck.

Requirements

ImageMagick

ImageMagick is a software suite for creating, editing, and composing bitmap images. It reads, converts, and writes more than 100 formats, including DPX, EXR, GIF, JPEG, JPEG-2000, PDF, PhotoCD, PNG, PostScript, SVG, and TIFF. AnkiAI uses it to preprocess images so that OCR performs better.

sudo apt-get update
sudo apt-get install imagemagick

Tesseract

You need Tesseract for the OCR functionality:

sudo apt-get install tesseract-ocr

Python Dependencies

Install the Python dependencies with pip from the provided requirements.txt file, which pins each package to a known compatible version:

pip install -r requirements.txt

How to Run

  1. Environment Variables: Make sure to set the OPENAI_API_KEY environment variable to your OpenAI API key.

    export OPENAI_API_KEY=sk-myapikey
    
  2. Run the Flask server:

    python server.py
    

    This will start the Flask server. You can then make a POST request to http://localhost:5000/deck-from-images with your images to get an Anki deck.

  3. Run Directly:

    If you prefer not to use the Flask server, you can also run ankiai.py directly:

    python ankiai.py <directory_path_containing_images>
    

Example curl commands to interact with the service:

You can make POST requests to the server using curl. The following command uploads three images in one request and saves the returned deck as deck.apkg:

curl -X POST -o deck.apkg \
-F "image=@/home/ubuntu/Pictures/image1.png" \
-F "image=@/home/ubuntu/Pictures/image2.png" \
-F "image=@/home/ubuntu/Pictures/image3.png" \
http://localhost:5000/deck-from-images
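
The same upload can be sketched in Python using only the standard library. This is a minimal sketch and not part of the project: the helper names build_multipart and request_deck are hypothetical, and it assumes only what the curl command shows, namely that the endpoint accepts repeated image form fields and returns the .apkg bytes in the response body.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

URL = "http://localhost:5000/deck-from-images"  # endpoint from the README


def build_multipart(files, field="image"):
    """Encode (filename, bytes) pairs as multipart/form-data.

    Returns (body, content_type). Every file is sent under the same
    field name, mirroring the repeated -F "image=@..." curl flags.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for filename, data in files:
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def request_deck(image_paths, out_path="deck.apkg", url=URL):
    """POST the images and save the response body (assumed to be the .apkg)."""
    files = [(Path(p).name, Path(p).read_bytes()) for p in image_paths]
    body, content_type = build_multipart(files)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        Path(out_path).write_bytes(resp.read())
    return out_path
```

With the server running, calling request_deck(["image1.png", "image2.png"]) should write deck.apkg to the current directory, provided the assumption about the response body holds.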

Batch processing of images (Bash), sending one request per file and saving one deck per image:

for file in /home/ubuntu/Pictures/*; do 
    if [[ -f "$file" ]]; then 
        basefile=$(basename "$file");
        curl -X POST -o "deck-${basefile}.apkg" -F "image=@${file}" http://localhost:5000/deck-from-images;
    fi;
done
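
The file selection and naming rule of the loop above can also be expressed in Python, which may help when the batch is driven from a script rather than a shell. The function name plan_batch is hypothetical; the sketch only computes which files would be sent and what each deck would be called, and it sends nothing.

```python
from pathlib import Path


def plan_batch(directory):
    """Pair each regular file in `directory` with its output deck name.

    Mirrors the shell loop: anything that is not a regular file is
    skipped, and the output is named deck-<basename>.apkg, so the
    original extension stays in the name (deck-image1.png.apkg).
    """
    jobs = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            jobs.append((path, f"deck-{path.name}.apkg"))
    return jobs
```

Each returned pair holds the image path to upload and the filename to save the response under.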

How to Debug (VSCode Users)

  • Open the project in VSCode.
  • Set up your breakpoints.
  • Use the VSCode debugger and select "Python: Flask" to start debugging the Flask server.

Important Notes

  • API Key: For the project to work, it is essential to have the OPENAI_API_KEY environment variable set.
  • Image Types: Currently, the image processing module supports PNG, JPG, and JPEG formats.
  • Output: The output .apkg file (Anki package file) will be named out.apkg.
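
The first two notes can be checked before any images are submitted. The following is a minimal sketch, not code from the project: the names SUPPORTED_EXTENSIONS, supported_images, and require_api_key are hypothetical, and matching extensions case-insensitively is an assumption rather than documented behaviour of image_processing.py.

```python
import os
from pathlib import Path

# Formats listed in the notes above; case-insensitive matching is an assumption.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def supported_images(directory):
    """Return the files in `directory` that have a supported extension."""
    return sorted(
        p
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def require_api_key(env=os.environ):
    """Return the API key, or raise a clear error if it is missing or empty."""
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```

Running both checks up front gives an immediate, readable error instead of a failure partway through OCR or the OpenAI request.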

Acknowledgements

This project relies heavily on the openai library for language processing and the genanki library for deck generation.

Contributions

Contributions are always welcome. Please create a new issue or a pull request for any bug fixes or feature requests.