csv2ankicards

A comprehensive toolkit that offers:

  • Conversion of CSV files into Anki deck packages (.apkg files).
  • Conversion of image files in a directory to a text file using Optical Character Recognition (OCR).
  • Generation of CSV format question-answer pairs from textual content using OpenAI's GPT-3 model.

Features

  • Converts a CSV file with questions and answers into an Anki deck package.
  • Converts image files from a specified directory to a single text file using OCR.
  • Generates CSV formatted question-answer pairs based on a given text content, ideal for studying or summarization.
  • Reads input CSV files as exactly two columns, split at the first comma on each line.
  • Expects a header row naming the columns "Front" (questions) and "Back" (answers).

Installation

  1. Clone this repository:

    git clone https://git.rudefox.io/bj/anki-csv2ankicards.git
    cd anki-csv2ankicards
    
  2. Set up a virtual environment and activate it:

    python3 -m venv venv
    source venv/bin/activate
    
  3. Install the required packages:

    pip install -r requirements.txt
    

Configuration

Before using the text2csvdeck.py script, ensure that you have set the OPENAI_API_KEY environment variable:

export OPENAI_API_KEY=your_openai_api_key_here

Remember to replace your_openai_api_key_here with your actual OpenAI API key.
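
If you want to fail early with a clear message when the variable is missing, a check along these lines may help. This is a minimal sketch, not code from this repository: the function name require_openai_api_key and the error text are assumptions made for illustration.

```python
import os


def require_openai_api_key(env=None):
    """Return the value of OPENAI_API_KEY, or raise if it is unset or blank.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to exercise without touching real settings.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running text2csvdeck.py"
        )
    return key
```

Calling require_openai_api_key() before any API request turns a missing key into an immediate, readable error.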

Usage

Pipeline Usage

To convert a directory of images directly to an Anki deck package:

python pipeline.py /path/to/your/image_directory/

This runs OCR on the images, converts the extracted text into question-answer pairs in CSV format, and writes an output.apkg file ready for import into Anki.
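
The same three stages can also be run one script at a time. The sketch below only builds the command lines described in the sections of this README; it is not how pipeline.py is implemented. The function name build_stage_commands is hypothetical, and csv_path is left to the caller because the exact name of the generated CSV file depends on the input.

```python
import sys


def build_stage_commands(image_dir, csv_path, text_path="final.txt",
                         apkg_path="output.apkg", python=sys.executable):
    """Return one argument list per stage, in the order they should run.

    Assumptions: the scripts are in the current directory, images2text.py
    writes `text_path`, and `csv_path` is whatever text2csvdeck.py produced.
    """
    return [
        [python, "images2text.py", image_dir],              # images -> text
        [python, "text2csvdeck.py", text_path],             # text -> CSV deck
        [python, "csv2ankicards.py", csv_path, apkg_path],  # CSV -> .apkg
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True), so that a failing stage stops the run before the next one starts.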

Image to Text Conversion

To convert images from a directory to a single text file using OCR:

python images2text.py /path/to/your/image_directory/

This will produce a final.txt file which contains the text extracted from the images.

Supported Image Formats

Currently supported formats for the images are: .png, .jpg, and .jpeg.
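
To check in advance which files in a directory would qualify, an extension filter like the following can be used. It is a sketch under assumptions: the helper names are hypothetical, and the case-insensitive matching and sorted order are choices made here, not confirmed behavior of images2text.py.

```python
from pathlib import Path

# Extensions listed in this README.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_supported_image(filename):
    """True if the file name ends in a supported extension (any letter case)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def list_supported_images(directory):
    """Return the supported image files directly inside `directory`, sorted."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and is_supported_image(p.name)
    )
```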

Text to CSV Deck Generation

To generate a CSV deck of question-answer pairs from a given text file:

python text2csvdeck.py /path/to/your/textfile.txt

This will analyze the content of the given text file and generate a corresponding _deck.csv file with questions and answers that capture the main points and themes of the text.

Note: This script uses the OpenAI GPT-3 model. Ensure that your API key is set (see Configuration) and that the OpenAI Python client is installed.

CSV to Anki Conversion

To convert a CSV file into an Anki deck package:

python csv2ankicards.py /path/to/your/csvfile.csv output.apkg

This will produce an output.apkg file which can then be imported into Anki.

CSV Format

The CSV file should follow this format:

Front,Back
Your question here,Your answer here
Another question,list of: answer1, answer2, answer3
...

Note: Only the first comma on each line separates the question from the answer, so any further commas are kept as part of the answer. A question therefore cannot contain a comma.
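
The first-comma rule can be illustrated with a short parser. This is a minimal sketch of the rule as described here, not the code in csv2ankicards.py: the function names are hypothetical, and trimming whitespace and skipping blank lines are assumptions.

```python
def split_card_line(line):
    """Split one CSV line into (front, back) at the first comma only."""
    front, sep, back = line.rstrip("\n").partition(",")
    if not sep:
        raise ValueError(f"no comma found in line: {line!r}")
    return front.strip(), back.strip()


def parse_deck(text):
    """Parse CSV text into (front, back) pairs, dropping the header row."""
    rows = [split_card_line(ln) for ln in text.splitlines() if ln.strip()]
    # Skip the header if the first row is exactly Front,Back.
    if rows and rows[0] == ("Front", "Back"):
        rows = rows[1:]
    return rows
```

With this rule, the line "Another question,list of: answer1, answer2, answer3" yields the front "Another question" and the back "list of: answer1, answer2, answer3".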