AnkiAI - Automated Anki Deck Creator
AnkiAI is a tool that leverages OCR (Optical Character Recognition) and GPT-3's powerful natural language processing capabilities to automatically generate Anki decks from images containing text.
Overview
- AnkiAI is designed to streamline the process of creating Anki decks from images.
- The core idea is to use OCR to extract text from images and then use GPT-3 to transform this text into a structured Anki deck format.
- Users can make a POST request to a Flask server endpoint with their images to receive the Anki deck (.apkg file).
Directory Structure
- .vscode/: Contains the VSCode debugger configuration for Flask applications.
- ankiai.py: The main script that drives the creation of Anki decks from images.
- constants.py: Contains constant variables used across the project.
- deck_creation.py: Contains the logic for communicating with OpenAI's API and for deck creation using genanki.
- image_processing.py: Processes images, converting them for OCR and then performing OCR to extract text.
- logging_config.py: Logging configuration for the entire project.
- server.py: Flask server that provides an API endpoint to upload images and get back an Anki deck.
Requirements
ImageMagick
ImageMagick is a software suite for creating, editing, and composing bitmap images. It can read, convert, and write images in over 100 formats, including DPX, EXR, GIF, JPEG, JPEG-2000, PDF, PhotoCD, PNG, PostScript, SVG, and TIFF. In the AnkiAI project, it is used to preprocess images to improve OCR performance.
sudo apt-get update
sudo apt-get install imagemagick
Tesseract
You need Tesseract for the OCR functionality:
sudo apt-get install tesseract-ocr
Python Dependencies
To ensure consistent functionality, use the provided requirements.txt file, which pins dependencies to known compatible versions.
Install the Python dependencies with pip:
pip install -r requirements.txt
How to Run
- Environment Variables: Make sure to set the OPENAI_API_KEY environment variable to your OpenAI API key.
  export OPENAI_API_KEY=sk-myapikey
- Run the Flask server:
  python server.py
  This will start the Flask server. You can then make a POST request to http://localhost:5000/deck-from-images with your images to get an Anki deck.
- Run Directly: If you prefer not to use the Flask server, you can also run ankiai.py directly:
  python ankiai.py <directory_path_containing_images>
Example curl commands to interact with the service:
You can make POST requests to the server using curl. The following command uploads three images in a single request and saves the returned deck as deck.apkg:
curl -X POST -o deck.apkg \
-F "image=@/home/ubuntu/Pictures/image1.png" \
-F "image=@/home/ubuntu/Pictures/image2.png" \
-F "image=@/home/ubuntu/Pictures/image3.png" \
http://localhost:5000/deck-from-images
Batch processing of images:
for file in /home/ubuntu/Pictures/*; do
  if [[ -f "$file" ]]; then
    basefile=$(basename "$file")
    curl -X POST -o "deck-${basefile}.apkg" -F "image=@${file}" http://localhost:5000/deck-from-images
  fi
done
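The same upload can be scripted from Python. The sketch below uses only the standard library and hand-builds the multipart/form-data body that curl's -F flag produces, with one "image" part per file. The endpoint and field name come from the curl examples above; the helper names build_multipart and request_deck are hypothetical, and the sketch assumes the server returns the .apkg bytes directly in the response body.

```python
import mimetypes
import uuid
from pathlib import Path
from urllib import request

# Endpoint taken from the curl examples; adjust if the server runs elsewhere.
DECK_URL = "http://localhost:5000/deck-from-images"


def build_multipart(image_paths, field_name="image", boundary=None):
    """Encode files as a multipart/form-data body, one part per image.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for path in map(Path, image_paths):
        # Fall back to a generic type when the extension is unknown.
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field_name}"; '
            f'filename="{path.name}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + path.read_bytes() + b"\r\n")
    # Closing delimiter marks the end of the form data.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def request_deck(image_paths, out_path="deck.apkg", url=DECK_URL):
    """POST the images and save the response body (assumed to be the .apkg)."""
    body, content_type = build_multipart(image_paths)
    req = request.Request(
        url, data=body, method="POST", headers={"Content-Type": content_type}
    )
    with request.urlopen(req) as resp:
        Path(out_path).write_bytes(resp.read())
    return out_path
```

With the server running, request_deck(["image1.png", "image2.png"]) would mirror the multi-image curl command, and calling it once per file in a loop would mirror the batch example.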
How to Debug (VSCode Users)
- Open the project in VSCode.
- Set up your breakpoints.
- Use the VSCode debugger and select "Python: Flask" to start debugging the Flask server.
Important Notes
- API Key: For the project to work, the OPENAI_API_KEY environment variable must be set.
- Image Types: Currently, the image processing module supports PNG, JPG, and JPEG formats.
- Output: The output .apkg file (Anki package file) will be named out.apkg.
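The first two notes can be checked up front before any images are processed. The following is a minimal sketch, not the project's actual code: the helper names supported_images and require_api_key are hypothetical, and matching extensions case-insensitively is an assumption about how the image processing module behaves.

```python
import os
from pathlib import Path

# Extensions listed in the "Image Types" note above.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def supported_images(directory):
    """Return the directory's regular files with a supported extension, sorted."""
    return sorted(
        p
        for p in Path(directory).iterdir()
        # Assumption: extensions are compared case-insensitively.
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def require_api_key(env=os.environ):
    """Return the OpenAI API key, or raise a clear error if it is unset or empty."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running AnkiAI."
        )
    return key
```

Calling require_api_key() at startup fails fast with a readable message instead of an error deep inside an API call, and supported_images(path) gives the list of files worth sending to OCR.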
Acknowledgements
This project relies heavily on the openai library for processing and the genanki library for deck generation.
Contributions
Contributions are always welcome. Please create a new issue or a pull request for any bug fixes or feature requests.