# csv2ankicards
A comprehensive toolkit that offers:
- Conversion of CSV files into Anki deck packages (.apkg files).
- Conversion of image files in a directory to a text file using Optical Character Recognition (OCR).
- Generation of CSV format question-answer pairs from textual content using OpenAI's GPT-3 model.
- **RESTful API endpoint to upload and convert multiple images directly into an Anki deck package.**
## Features
- Converts a CSV file with questions and answers into an Anki deck package.
- Converts image files from a specified directory to a single text file using OCR.
- Generates CSV formatted question-answer pairs based on a given text content, ideal for studying or summarization.
- CSV input has exactly two columns; each line is split at the first comma encountered.
- CSV files should have a "Front" column for questions and a "Back" column for answers.
- **API endpoint that accepts multiple image uploads, processes them through the pipeline, and returns an Anki deck package.**
## Installation
1. Clone this repository:
```bash
git clone https://git.rudefox.io/bj/anki-csv2ankicards.git
cd anki-csv2ankicards
```
2. Set up a virtual environment and activate it:
```bash
python3 -m venv venv
source venv/bin/activate
```
3. Install the required packages:
```bash
pip install -r requirements.txt
```
## Configuration
Before using the `text2csvdeck.py` script, ensure that you have set the `OPENAI_API_KEY` environment variable:
```bash
export OPENAI_API_KEY=your_openai_api_key_here
```
Remember to replace `your_openai_api_key_here` with your actual OpenAI API key.
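A quick way to confirm the variable is visible to Python before running the script is sketched below. The helper name `require_openai_key` is hypothetical and not part of this repository; it only illustrates the check.

```python
import os


def require_openai_key(env=os.environ):
    """Return the API key, or raise a readable error if it is missing.

    Hypothetical helper for illustration; not part of this repository.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running text2csvdeck.py"
        )
    return key
```

Calling `require_openai_key()` at the top of a script fails early with a clear message instead of partway through a run.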
## Usage
### **REST API Usage**
To start the server:
```bash
python server.py
```
#### Endpoint: `/deck-from-images`
- **Method**: POST
- **Description**: Accepts multiple image uploads, processes them through the pipeline, and returns an Anki deck package.
- **Body**: form-data with key `image` and multiple image files as values.
- **Response**: An `output.apkg` file ready for import into Anki.
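As a rough illustration of a client for this endpoint, the sketch below builds a multipart/form-data body that repeats the `image` key once per file, using only the Python standard library. The host and port in `SERVER_URL` are assumptions (this README does not state where `server.py` listens), and the function names are hypothetical.

```python
import uuid
import urllib.request
from pathlib import Path

# Assumption: the server listens on localhost:5000; adjust to your setup.
SERVER_URL = "http://localhost:5000/deck-from-images"


def build_multipart(files, field="image"):
    """Encode (filename, bytes) pairs as multipart/form-data under one repeated key."""
    boundary = uuid.uuid4().hex
    parts = []
    for filename, data in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + data + b"\r\n")
    # Closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def request_deck(image_paths, url=SERVER_URL, out_path="output.apkg"):
    """POST the images and save the returned deck package to out_path."""
    files = [(Path(p).name, Path(p).read_bytes()) for p in image_paths]
    body, content_type = build_multipart(files)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        Path(out_path).write_bytes(resp.read())
    return out_path
```

With the server running, `request_deck(["page1.png", "page2.jpg"])` would write the response to `output.apkg` in the current directory.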
### Pipeline Usage
To convert a directory of images directly to an Anki deck package:
```bash
python pipeline.py /path/to/your/image_directory/
```
This will process the images, extract text, convert text to a set of questions and answers in CSV format, and then produce an `output.apkg` file ready for import into Anki.
### Image to Text Conversion
To convert images from a directory to a single text file using OCR:
```bash
python images2text.py /path/to/your/image_directory/
```
This will produce a `final.txt` file which contains the text extracted from the images.
#### Supported Image Formats
Currently supported formats for the images are: `.png`, `.jpg`, and `.jpeg`.
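To check which files in a directory would qualify, a minimal sketch of an extension filter follows. The function name is hypothetical, and matching extensions case-insensitively is an assumption; `images2text.py` may behave differently.

```python
from pathlib import Path

# Extensions listed in this README.
SUPPORTED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def list_supported_images(directory):
    """Return supported image files in `directory`, sorted by path.

    Assumption: extensions are compared case-insensitively.
    """
    return sorted(
        p
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```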
### Text to CSV Deck Generation
To generate a CSV deck of question-answer pairs from a given text file:
```bash
python text2csvdeck.py /path/to/your/textfile.txt
```
This will analyze the content of the given text file and generate a corresponding `_deck.csv` file with questions and answers that capture the main points and themes of the text.
**Note:** This script uses the OpenAI GPT-3 model. Make sure the `OPENAI_API_KEY` environment variable is set (see Configuration) and the OpenAI Python client is installed.
### CSV to Anki Conversion
To convert a CSV file into an Anki deck package:
```bash
python csv2ankicards.py /path/to/your/csvfile.csv output.apkg
```
This will produce an `output.apkg` file which can then be imported into Anki.
#### CSV Format
The CSV file should follow this format:
```
Front,Back
Your question here,Your answer here
Another question,list of: answer1, answer2, answer3
...
```
**Note:** If your answers contain commas, they will be considered as part of the answer. Only the first comma is used to separate the question from the answer.
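The first-comma rule can be sketched in a few lines of Python. This is an illustration of the splitting behaviour described above, not the code in `csv2ankicards.py`; the function name is hypothetical, and skipping the first line as a `Front,Back` header is an assumption.

```python
def parse_deck_csv(text):
    """Split each line at its first comma into a (front, back) pair."""
    cards = []
    lines = [line for line in text.splitlines() if line.strip()]
    for line in lines[1:]:  # assumption: the first line is the "Front,Back" header
        front, sep, back = line.partition(",")
        if not sep:
            continue  # no comma, so this is not a question-answer row
        cards.append((front.strip(), back.strip()))
    return cards


sample = (
    "Front,Back\n"
    "Your question here,Your answer here\n"
    "Another question,list of: answer1, answer2, answer3\n"
)

for front, back in parse_deck_csv(sample):
    print(f"{front!r} -> {back!r}")
```

Because `str.partition` splits only at the first occurrence, the commas in the second answer stay inside the answer text.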