Customer Information Extractor

This project uses OpenAI's GPT-4 Vision model to extract customer information from images and compile it into an Excel spreadsheet. It's particularly useful for digitizing customer information from business cards, forms, or any documents containing customer details.

Features

Extracts customer information from images including:
- Name
- Phone Number
- Mobile Number
- Email
- Complete Address (Street, City, ZIP, State, Country)
- Geolocation (Latitude, Longitude)
Processes multiple images in batch
Outputs data to an organized Excel spreadsheet
Supports multiple image formats (PNG, JPG, JPEG)

Prerequisites

Python 3.7+
OpenAI API key
Required Python packages:
- openai
- Pillow (PIL)
- pandas
- openpyxl

Installation

Clone the repository:

git clone https://github.com/yourusername/customer-information-extractor.git
cd customer-information-extractor

Install required packages:

pip install -r requirements.txt

Set up your configuration:
- Copy config.template.py to config.py
- Add your OpenAI API key to config.py

OPENAI_API_KEY = "your-api-key-here"

Usage

Place your images in the images folder
Run the script:

python extract.py

Find the extracted data in customer_info.xlsx

Sample Output

Excel File Structure (customer_info.xlsx)

The script generates an Excel file with the following columns:

Name	Phone Number	Mobile Number	Email	Street	Street Number	City	ZIP Code	State	Country	Latitude	Longitude
John Smith	+1-555-0123	+1-555-4567	john.smith@example.com	Oak Avenue	123	Springfield	12345	IL	USA	39.78373	-89.65014
Sarah Johnson	+1-555-8901	+1-555-2345	sarah.j@example.com	Maple Street	456	Chicago	60601	IL	USA	41.87819	-87.62979
Michael Brown	+1-555-6789	+1-555-9012	m.brown@example.com	Pine Road	789	Boston	02108	MA	USA	42.35843	-71.05977

JSON Output Format

For each processed image, the script extracts information in this JSON structure:

{
    "Name": "John Smith",
    "Phone Number": "+1-555-0123",
    "Mobile Number": "+1-555-4567",
    "Email": "john.smith@example.com",
    "Street": "Oak Avenue",
    "Street Number": "123",
    "City": "Springfield",
    "ZIP Code": "12345",
    "State": "IL",
    "Country": "USA",
    "Latitude": "39.78373",
    "Longitude": "-89.65014"
}

File Structure

customer-information-extractor/
├── extract.py           # Main script for processing images
├── config.py           # Configuration file with API key (not tracked)
├── config.template.py  # Template for configuration
├── requirements.txt    # Python dependencies
├── README.md          # Project documentation
├── images/            # Input images folder
│   ├── card1.jpg
│   ├── card2.png
│   └── ...
└── customer_info.xlsx # Generated output file

How It Works

Image Processing:
- Images are loaded from the images folder
- Each image is converted to base64 format
- Supported formats: JPG, JPEG, PNG
API Processing:
- Images are sent to OpenAI's GPT-4 Vision model
- The model analyzes the image content
- Information is extracted in a structured JSON format
Data Compilation:
- JSON responses are parsed and validated
- Data is compiled into a pandas DataFrame
- Final output is saved as an Excel spreadsheet

Error Handling

The script includes comprehensive error handling for:

Image Processing Errors

try:
    with Image.open(image_path) as img:
        # Image processing
except Exception as e:
    print(f"Error processing image {image_path}: {str(e)}")

JSON Parsing Errors

try:
    customer_info = json.loads(json_str)
except json.JSONDecodeError as e:
    print(f"JSON parsing error: {str(e)}")

Excel File Creation Errors

try:
    df.to_excel('customer_info.xlsx', index=False)
except Exception as e:
    print(f"Excel file creation error: {str(e)}")

Common Issues and Solutions

API Key Error

Error: OpenAI API key not found
Solution: Ensure your API key is correctly set in config.py

Image Format Error

Error: Unable to process image
Solution: Ensure images are in JPG, JPEG, or PNG format

Excel File Access Error

Error: Permission denied when creating Excel file
Solution: Close customer_info.xlsx if it's open in another program

Contributing

Fork the repository
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Security Notes

Never commit your config.py file containing your API key
Keep your API key secure and rotate it periodically
Monitor your API usage to prevent unexpected charges

Version History

v1.0.0 (Initial Release)
- Basic image processing functionality
- Excel output generation
- Multi-image batch processing

Support

If you encounter any issues or have questions, please:

Check the Common Issues section above
Open an issue in the GitHub repository
Provide sample images (if possible) when reporting issues

Acknowledgments

OpenAI for providing the GPT-4 Vision API
Contributors and maintainers of the dependent Python packages


The updated README now includes:
- More detailed sample output section with both Excel and JSON formats
- Code examples for error handling
- Expanded troubleshooting section
- Better organized file structure
- Support section
- More comprehensive documentation of the workflow

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Customer Information Extractor

Features

Prerequisites

Installation

Usage

Sample Output

Excel File Structure (customer_info.xlsx)

JSON Output Format

File Structure

How It Works

Error Handling

Image Processing Errors

JSON Parsing Errors

Excel File Creation Errors

Common Issues and Solutions

Contributing

License

Security Notes

Version History

Support

Acknowledgments

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
images		images
.gitignore		.gitignore
README.md		README.md
customer_info.xlsx		customer_info.xlsx
extract.py		extract.py
requirements.txt		requirements.txt

pythonicshariful/info_extractor

Folders and files

Latest commit

History

Repository files navigation

Customer Information Extractor

Features

Prerequisites

Installation

Usage

Sample Output

Excel File Structure (customer_info.xlsx)

JSON Output Format

File Structure

How It Works

Error Handling

Image Processing Errors

JSON Parsing Errors

Excel File Creation Errors

Common Issues and Solutions

Contributing

License

Security Notes

Version History

Support

Acknowledgments

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages